
Analyzing Luck in the League's Recent History

· 12 min read
Scottie Enriquez
League Webmaster
Callen Trail
League Commissioner

Background​

The Winner Is a Tryhard, affectionately known as TWIATH, began in 2014 and has been going strong ever since. We started our journey on ESPN before migrating to Sleeper in 2020. A few years ago, ESPN quietly deleted many leagues' histories without warning, including ours. We had previously stored some historical data in an EBS snapshot in AWS, but sadly, we haven't been able to recover the full dataset. Since Sleeper provides an API for accessing data (and most of us are nerds who work as technology professionals), we opted to use it to build a data lake in AWS so that this can't happen again. More importantly, we wanted to use this data to start answering some important questions, such as: who is the luckiest person in league history?

Building an Initial Data Lake Using Python​

Before answering any questions, we must first build our dataset. Thankfully, Sleeper offers an API to pull data programmatically. Numerous Python wrapper packages are available to accelerate development. With object storage like Amazon S3 to store the data, it only takes a bit of Python code to get started. With pandas, pyarrow, and sleeper-py installed via pip, the following code establishes the base dataset as partitioned Parquet files uploaded to S3:

twiath-datalake-prep/historical-sleeper-load.ipynb
import pandas as pd
# module name assumed from the sleeper-py package; adjust if your wrapper differs
from sleeperpy import Leagues

# update with your S3 bucket name (writing to s3:// paths from pandas also requires s3fs)
base_s3_path = 's3://your-bucket-name/sleeper'
# get the league IDs from your Sleeper league's URL
# there is one ID per season
past_league_ids = ['541384381865115648', '706690303247065088', '784536701455429632', '961113588070985728']
default_regular_season_weeks = 14
# the NFL extended the season in 2021, so 2020 had one fewer regular season week
year_playoff_map_exceptions = {2020: 13}
for index, league_id in enumerate(past_league_ids):
    # as noted above, we migrated to Sleeper in 2020
    # update this value with the first year in Sleeper
    year = index + 2020
    # fetch the league members
    rosters = pd.DataFrame(Leagues.get_rosters(league_id))
    # there should only be 18 weeks, but range(1, 20) covers weeks 1 through 19 in case more are added in the future
    for week in range(1, 20):
        # fetch the matchups from the Sleeper API
        matchups = pd.DataFrame(Leagues.get_matchups(league_id, week))
        if not matchups.empty:
            # join rosters and matchups
            week_result = pd.merge(rosters, matchups, on='roster_id', how='inner')
            # set week and year values
            week_result['week'] = week
            week_result['year'] = year
            # determine which weeks are regular season and which are playoffs
            # these API endpoints do not have this metadata
            playoff_week = year_playoff_map_exceptions.get(year, default_regular_season_weeks)
            week_result['type'] = 'regular' if week <= playoff_week else 'playoff'
            players_points_list = []
            # format the player points to support Parquet
            for players_points_week in week_result['players_points'].to_list():
                players_points_list.append([{'player_id': player_id, 'points': points} for player_id, points in players_points_week.items()])
            week_result['players_points'] = players_points_list
            # drop columns that cause issues with Parquet
            week_result = week_result.drop(columns=['settings', 'metadata'])
            # write partitioned file to S3
            week_result.to_parquet(f'{base_s3_path}/matchups/{year}/{week}/result.parquet', engine='pyarrow')

For a complete Jupyter Notebook example, see this GitHub repository. Note that this does not follow data lake best practices like a medallion architecture. Still, it's enough to start analyzing the league's entire Sleeper history. In a later post, we'll cover our scheduled jobs to load the data weekly during the season.

Scoring​

To provide additional context for the numbers shown for those outside the league, here's our configuration:

  • 12 teams
  • Half-point per reception
  • One-point receiving first down
  • Quarter-point loss per point for defenses
  • Half-point sacks
  • No kickers (sadly)
  • Two WR/RB/TE flex spots

A Naive Attempt at Defining Luck and Skill​

In the simplest terms, you might define luck as the fewest points against (PA) since you have no control over your opponent's lineup. By the same logic, you could define skill as points for (PF) since you chose the players on your roster. Let's start with regular season PF in the past four seasons:

Next, let's convert these to weekly averages:

So, does regular season PF correlate with championships? For the most part, yes. The past four champions were Mark (2023), Mark (2022), Matt (2021), and Travis (2020). Despite consistently leading the league in PF, Scottie and Callen have yet to win a championship (although Scottie did lose in the finals twice during these four seasons). Mark and Travis are in the top four scorers, and Matt's championship season looks like an outlier compared with his average performance. With this in mind, does regular season PA correlate better with championships?

First, we need to compute points against for each week in the Pandas DataFrame since this is not available in the API:

def calculate_points_against(row):
    # find the other team in the same year, week, and matchup
    points_against = matchups.loc[matchups['year'] == row['year']] \
        .loc[matchups['week'] == row['week']] \
        .loc[matchups['matchup_id'] == row['matchup_id']] \
        .loc[matchups['owner_id'] != row['owner_id']]['points'].values
    # exactly one opponent is expected; otherwise default to zero
    if len(points_against) == 1:
        return points_against[0]
    else:
        return 0.0

matchups['points_against'] = matchups.apply(calculate_points_against, axis=1)
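The row-wise lookup above rescans the whole DataFrame for every row. As a rough alternative, here is a minimal sketch that computes the same column with a grouped transform. It assumes the same `year`, `week`, `matchup_id`, and `points` columns, and the helper name `add_points_against` is hypothetical rather than part of the original notebook:

```python
import pandas as pd


def add_points_against(matchups: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of matchups with a points_against column (hypothetical helper)."""
    grouped = matchups.groupby(['year', 'week', 'matchup_id'])['points']
    # total points scored in each pairing, broadcast back to every row
    pair_total = grouped.transform('sum')
    # number of teams in each pairing, broadcast back to every row
    pair_size = grouped.transform('count')
    result = matchups.copy()
    # in a two-team pairing, the opponent's score is the pair total minus your own
    # rows that are not part of a two-team pairing get 0.0, like the original function
    result['points_against'] = (pair_total - result['points']).where(pair_size == 2, 0.0)
    return result


# toy data: two head-to-head matchups plus one team without an opponent
example = pd.DataFrame({
    'year': [2023] * 5,
    'week': [1] * 5,
    'matchup_id': [1, 1, 2, 2, 3],
    'points': [100.0, 90.0, 80.0, 120.0, 70.0],
})
print(add_points_against(example)[['matchup_id', 'points', 'points_against']])
```

On four seasons of a 12-team league either approach runs quickly, so the choice is mostly about readability.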

It appears that PA may not correlate as well as PF. First of all, weekly PF averages span a wider range (89.2 to 103.2, a 14.0-point spread) than weekly PA averages (95.0 to 100.7, a 5.7-point spread). While we see Mark and Scottie in the bottom three (i.e., luckiest), Matt has the highest PA (average and total). In any case, these aggregates omit much of the story. To paint a clearer picture, let's introduce a new metric.

Measuring Wins Against All Opponents​

The core aspect of luck in fantasy football is scheduling (i.e., PA for a given week). For example, you could be the second-highest-scoring team and still lose the week. Likewise, you could put up the second-worst performance and win the week. Aggregating the totals for PA and PF does not account for this. However, we can measure how many teams a player would have beaten in any given week with the following formula:

$$w_o = \sum_{p \in P(r)} I(p < r.\text{points})$$

Where:

  • $P(r)$ is the set of points from all other players in the same year and week as $r$, defined as $\{p \mid (p.wy = r.wy) \land (p.u \neq r.u)\}$ where $wy$ represents week and year and $u$ represents username
  • $I(\text{condition})$ is the indicator function, defined as 1 if the condition is true and 0 if it is false
  • $r.\text{points}$ represents the points from the input row

Or expressed in Python with the Pandas DataFrame:

def calculate_weekly_wins_against_all_opponents(row):
    # points scored by every other player in the same year and week
    other_player_points = list(matchups.loc[matchups['year'] == row['year']] \
        .loc[matchups['week'] == row['week']] \
        .loc[matchups['username'] != row['username']]['points'])
    wins_against_all_opponents = 0.0
    # count the opponents this player strictly outscored
    for other_player_point in other_player_points:
        if other_player_point < row['points']:
            wins_against_all_opponents += 1
    return wins_against_all_opponents

matchups['wins_against_all_opponents'] = matchups.apply(calculate_weekly_wins_against_all_opponents, axis=1)
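The same count can also be computed without a row-wise `apply`. The sketch below relies on pandas' `rank(method='min')`, which assigns each score one plus the number of strictly lower scores in its group. It assumes one row per team per week, and the helper name `add_wins_against_all_opponents` is hypothetical:

```python
import pandas as pd


def add_wins_against_all_opponents(matchups: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of matchups with a wins_against_all_opponents column (hypothetical helper)."""
    result = matchups.copy()
    # rank within each year and week; tied scores share the lowest rank
    ranks = result.groupby(['year', 'week'])['points'].rank(method='min')
    # subtracting one leaves the number of teams strictly outscored
    result['wins_against_all_opponents'] = ranks - 1
    return result


# toy week with four teams, two of them tied
example = pd.DataFrame({
    'year': [2023] * 4,
    'week': [1] * 4,
    'username': ['a', 'b', 'c', 'd'],
    'points': [100.0, 90.0, 90.0, 80.0],
})
print(add_wins_against_all_opponents(example))
```

Ties are handled the same way as in the loop above: a tied opponent does not count as a win for either team.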

This function produces an integer between 0 and 11 for any given week that corresponds to the number of teams a player would have beaten (irrespective of who they played against). When looking at the regular season averages over the past four years, we see the order shift slightly in the middle.

Over time, this metric largely correlates with PF. At the top, Scottie leads this metric. At the bottom, Logan's low PF and mediocre PA hurt him again. How about during a smaller time window (e.g., a single season)? Can we start to quantify luck?

Quantifying the Luckiest Seasons

Using Pandas, we can total the actual wins and the wins against all opponents ($w_o$) for each player and season.

matchups.loc[matchups['type'] == 'regular'] \
    .groupby(['username', 'year']) \
    .agg({'actual_win_loss': 'sum', 'wins_against_all_opponents': 'sum'})
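The `actual_win_loss` column summed above is not derived anywhere in this post. One plausible way to build it from `points` and `points_against` is sketched below. It assumes a tie counts as a non-win and that a `points_against` of 0.0 means no opponent was found. The helper name `add_actual_win_loss` is hypothetical:

```python
import pandas as pd


def add_actual_win_loss(matchups: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of matchups with a 0/1 actual_win_loss column (hypothetical helper)."""
    result = matchups.copy()
    # assumption: points_against of 0.0 marks a row with no real opponent
    has_opponent = result['points_against'] > 0
    # assumption: a win requires strictly more points than the opponent
    outscored = result['points'] > result['points_against']
    result['actual_win_loss'] = (has_opponent & outscored).astype(int)
    return result


# toy rows: a win, a loss, and a team with no opponent
example = pd.DataFrame({
    'points': [100.0, 90.0, 95.0],
    'points_against': [90.0, 100.0, 0.0],
})
print(add_actual_win_loss(example))
```

Storing wins as 1 and losses as 0 is what lets the `sum` aggregation double as a win count.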

We can identify anomalous seasons by comparing the actual wins to the number of teams the player would have beaten. The following formula expresses the gap as the difference between the actual win rate and the win rate against all opponents, $\Delta w$, where $w_a$ is actual wins, $w_o$ is wins over all opponents, $t$ is the number of possible wins in the regular season, and $11t$ is the number of possible wins over all opponents:

$$\Delta w = \frac{w_a}{t} - \frac{w_o}{11t}$$

For 2021 and on, $t = 14$. For 2020 and prior, $t = 13$. Based on this metric, let's look at the top five luckiest seasons:

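| Name | Season | $w_a$ | $w_o$ | $\Delta w$ |
|---|---|---|---|---|
| Andrew | 2022 | 10 | 61 | 32% |
| Mark | 2023 | 10 | 67 | 28% |
| Carl | 2020 | 10 | 83 | 19% |
| Carl | 2023 | 9 | 73 | 17% |
| Logan | 2023 | 7 | 52 | 16% |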

And the five unluckiest seasons:

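| Name | Season | $w_a$ | $w_o$ | $\Delta w$ |
|---|---|---|---|---|
| Carl | 2022 | 5 | 73 | -12% |
| Callen | 2020 | 7 | 95 | -13% |
| John | 2023 | 5 | 77 | -14% |
| Caleb | 2022 | 5 | 79 | -16% |
| Trond | 2023 | 3 | 59 | -17% |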

Caleb's 2022 squad outscored 79 opponents, while Andrew's 2022 team outscored 61. Andrew ended up with ten wins to Caleb's five, thus marking the luckiest season in the league's history.
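To make the arithmetic concrete, here is a small sketch that applies the luck formula to the two 2022 seasons compared above. The function name `luck_delta` and its parameter names are illustrative, not taken from the original notebook:

```python
def luck_delta(actual_wins, wins_over_all, regular_season_weeks, opponents_per_week=11):
    """Actual win rate minus win rate against all opponents (illustrative helper)."""
    actual_rate = actual_wins / regular_season_weeks
    # each week offers one possible win against every other team in the league
    all_opponent_rate = wins_over_all / (opponents_per_week * regular_season_weeks)
    return actual_rate - all_opponent_rate


# Andrew 2022: 10 actual wins, 61 wins over all opponents, 14 regular season weeks
andrew_2022 = luck_delta(10, 61, 14)
# Caleb 2022: 5 actual wins, 79 wins over all opponents, 14 regular season weeks
caleb_2022 = luck_delta(5, 79, 14)

print(f'Andrew 2022: {andrew_2022:+.0%}')  # Andrew 2022: +32%
print(f'Caleb 2022: {caleb_2022:+.0%}')    # Caleb 2022: -16%
```

For a 2020 season, pass `regular_season_weeks=13` instead, since that year had one fewer regular season week.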

The Anatomy of a Lucky Season​

First, let's graph Andrew's $w_o$ (y-axis) over each regular season week (x-axis) with purple indicating an actual loss:

While the two weeks in which he outscored only one team were actual losses, he benefited from five wins in weeks when he outscored four or fewer teams. Andrew secured a playoff bye that season, but ultimately, this lucky run didn't matter. Mark's team exposed him as fraudulent in the second round of the playoffs with a commanding 111.52 to 88.87 win. Speaking of Mark, let's look at his $w_o$ for his 2023 campaign to examine a season that ended in victory.

Sure enough, Mark's team benefited in the same way as Andrew's: four wins in weeks when he outscored four or fewer teams. We've finally quantified what luck looks like in the regular season.

The Luckiest and Unluckiest Individual Weeks​

Since $w_o$ values range from 0 to 11, the unluckiest outcome is to score higher than ten other teams and still lose. By the same token, the luckiest would be to outscore only one team and still win. These have happened several times over the past four years. First, the $w_o = 10$ losses:

matchups.loc[matchups['type'] == 'regular'] \
    .loc[matchups['wins_against_all_opponents'] == 10] \
    .loc[matchups['actual_win_loss'] == 0] \
    .groupby(['username', 'year', 'week']) \
    .sum()

| Name | Year | Week | PF | PA |
|---|---|---|---|---|
| Travis | 2023 | 9 | 119.48 | 166.01 |
| John | 2020 | 8 | 109.42 | 114.39 |
| Logan | 2020 | 1 | 115.15 | 127.59 |
| Matt | 2023 | 4 | 129.45 | 154.08 |
| Caleb | 2021 | 14 | 130.06 | 130.74 |
| Caleb | 2022 | 7 | 136.90 | 141.45 |

Next, the $w_o = 1$ wins:

matchups.loc[matchups['type'] == 'regular'] \
    .loc[matchups['wins_against_all_opponents'] == 1] \
    .loc[matchups['actual_win_loss'] == 1] \
    .groupby(['username', 'year', 'week']) \
    .sum()

| Name | Year | Week | PF | PA |
|---|---|---|---|---|
| Callen | 2021 | 12 | 64.07 | 56.10 |
| Callen | 2022 | 3 | 78.12 | 74.83 |
| Caleb | 2021 | 1 | 74.64 | 58.41 |

Conclusion​

Over time, luck regresses to the mean. We can spot it in an individual season or week, but luck-based metrics like PA tend to balance out within a few points on average. Skill-based metrics like PF and $w_o$ have wider ranges and identify performance outliers, such as Logan at the bottom of both. However, luck is clearly still required to win the championship, as evidenced by Scottie and Callen, who lead the skill metrics and have yet to win.