Building Smarter Fantasy Football Projections with Machine Learning

Jun 2, 2016 | 4 min read | Software Engineering, AI, Draft Punk, Fantasy Football, Algorithms

Fantasy football has evolved from a casual hobby into a data-driven pursuit where success often hinges on having access to the best projections and analytics. As a software engineer and long-time fantasy football fan, I’ve been developing advanced machine learning techniques to create far more robust projections for my Android app, Draft Punk.

Why fantasy football rankings are flawed

Most cheatsheets or “Top 200” lists collapse uncertainty into a single number, ignore your league’s scoring/roster quirks, mishandle positional scarcity and replacement level, assume linear value across steep tier cliffs, treat noisy inputs as precise, lag news, overlook schedule/playoff timing, miss correlation/stacking and roster-construction effects, echo market bias, and so much more.

In short, unless you know the exact inputs an analyst used for their rankings, those rankings aren’t a good fit for your league. Instead, you want accurate player projections, from which you can calculate your own rankings using your league’s scoring settings and the tendencies of your fellow managers.

So the question becomes: which projections should you use?

Why accurate projections are hard

Player projections differ because sources and analysts make different modeling choices: player usage assumptions, injury priors, depth chart volatility, game environment, too much reliance on patterns (like running back cliffs at age 28), or, frankly, uncertainty about health and past injuries. Some sources might also be great for QBs but shaky for RBs, while others nail median outcomes but miss tails.

I’m building a sophisticated data pipeline that addresses this fundamental challenge in fantasy football: creating more accurate and reliable projections than what’s currently available.

Instead of relying on single-source projections or simple averages, I’m employing advanced statistical methods and machine learning to synthesize data from multiple sources into superior predictions.

My Approach

1. Multi-Source Data Integration

First, I’m normalizing player info, stat definitions, team, and positions.

Then, I’m aggregating projections from many different sources (CBS, ESPN, FantasySharks, FFToday, NumberFire, NFL.com, Walterfootball) to create a comprehensive dataset. Aggregators (eg FantasyPros, FantasyFootballNerd) are either excluded or deduped so I don’t double-count the same analysts.

2. Ensemble learning with dynamic source weighting

For each position and stat family (passing/rushing/receiving), I’ll evaluate recent historical accuracy by source (eg rolling window MAE, rank correlation).
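As a rough illustration of the rolling-window accuracy check, here is a minimal Python sketch. The record layout, the source names, and the numbers are all made up for the example, and it assumes one record per source per week; a real pipeline would group by position and stat family first.

```python
from collections import defaultdict

def rolling_mae(records, window=4):
    """Mean absolute error per source over its most recent `window` records.

    `records` is an iterable of (source, week, projected, actual) tuples.
    This layout is an assumption for the sketch, not a real provider format.
    """
    by_source = defaultdict(list)
    for source, week, projected, actual in records:
        by_source[source].append((week, abs(projected - actual)))

    result = {}
    for source, errors in by_source.items():
        errors.sort(key=lambda pair: pair[0])  # oldest week first
        recent = [err for _, err in errors[-window:]]
        result[source] = sum(recent) / len(recent)
    return result

# Hypothetical passing-yard projections vs. actuals (invented numbers).
records = [
    ("source_a", 1, 250.0, 240.0),
    ("source_a", 2, 260.0, 300.0),
    ("source_a", 3, 270.0, 265.0),
    ("source_b", 1, 230.0, 240.0),
    ("source_b", 2, 310.0, 300.0),
    ("source_b", 3, 255.0, 265.0),
]
mae = rolling_mae(records, window=2)
```

With a window of 2, only weeks 2 and 3 count, so source_a’s big miss in week 2 still hurts it while its week 1 result is ignored.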

I’ll convert those error profiles into weights with a softmax over negative errors, with:

Recency boosts (what’s accurate lately counts more),
Flooring to avoid zeroing out minority views,
Context splits (home/away, spread/total buckets, dome/weather, usage volatility).

And finally, combine with model-level ensembles (GBMs, RFs, neural nets) for robust medians.
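The weighting step can be sketched in a few lines of Python. This is one possible reading of it, not the production code: the exponential `decay`, the `temperature`, and the `floor` value are all assumptions I picked for the example, and context splits are left out.

```python
import math

def recency_weighted_error(errors, decay=0.5):
    """Average a list of errors (oldest first), weighting recent ones more."""
    n = len(errors)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * e for w, e in zip(weights, errors)) / sum(weights)

def source_weights(error_by_source, temperature=5.0, floor=0.05):
    """Softmax over negative errors, floored, then renormalized to sum to 1."""
    scores = {s: -e / temperature for s, e in error_by_source.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {s: math.exp(v - top) for s, v in scores.items()}
    total = sum(exps.values())
    # The floor is applied before renormalizing, so a floored source can
    # end up slightly below `floor` in the final weights.
    floored = {s: max(v / total, floor) for s, v in exps.items()}
    norm = sum(floored.values())
    return {s: w / norm for s, w in floored.items()}

def blend(projections, weights):
    """Weighted average of per-source projections for one player and stat."""
    return sum(weights[s] * projections[s] for s in projections)

# Hypothetical weekly errors per source, oldest first (invented numbers).
history = {
    "source_a": [30.0, 20.0, 10.0],  # improving lately
    "source_b": [10.0, 20.0, 30.0],  # getting worse lately
    "source_c": [60.0, 60.0, 60.0],  # consistently off
}
errors = {s: recency_weighted_error(e) for s, e in history.items()}
weights = source_weights(errors)
blended = blend({"source_a": 100.0, "source_b": 110.0, "source_c": 150.0}, weights)
```

Even though source_a and source_b have the same plain average error, the recency boost favors source_a, and the floor keeps source_c from being zeroed out entirely.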

3. Advanced Statistical Methods

Once I have those projections, I generate derived data such as final rankings and tiers. That step often happens in the app itself, in real time, so Draft Punk users always have the most accurate and up-to-date information.

I’m generating things like:

Value Over Replacement (VOR) computed from those projections and league settings
Uncertainty quantification using Monte Carlo simulations to produce calibrated prediction intervals and risk metrics
Tier-based clustering using statistical distance metrics
Custom scoring optimization for any league format
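To make the custom scoring and VOR items concrete, here is a minimal Python sketch. The stat names, scoring values, players, and the definition of replacement level (the best player at a position beyond the league-wide starter count) are assumptions for illustration; leagues and analysts define replacement level in different ways.

```python
def fantasy_points(stats, scoring):
    """Score a projected stat line with a league's scoring settings."""
    return sum(scoring.get(stat, 0.0) * value for stat, value in stats.items())

def value_over_replacement(players, scoring, starters_per_position):
    """Return {name: points above the replacement player at that position}.

    `starters_per_position` is the league-wide number of starters, so the
    replacement player is the next one on the list after those starters.
    """
    points = {p["name"]: fantasy_points(p["stats"], scoring) for p in players}
    by_position = {}
    for p in players:
        by_position.setdefault(p["position"], []).append(points[p["name"]])

    replacement = {}
    for position, pts in by_position.items():
        pts.sort(reverse=True)
        index = min(starters_per_position[position], len(pts) - 1)
        replacement[position] = pts[index]

    return {p["name"]: points[p["name"]] - replacement[p["position"]] for p in players}

# Hypothetical PPR-style settings and projections (invented numbers).
scoring = {"rush_yds": 0.1, "rec_yds": 0.1, "rec": 1.0, "td": 6.0}
players = [
    {"name": "RB A", "position": "RB", "stats": {"rush_yds": 1200, "td": 10}},
    {"name": "RB B", "position": "RB", "stats": {"rush_yds": 1000, "td": 8}},
    {"name": "RB C", "position": "RB", "stats": {"rush_yds": 800, "td": 5}},
    {"name": "WR X", "position": "WR", "stats": {"rec_yds": 1300, "rec": 90, "td": 9}},
    {"name": "WR Y", "position": "WR", "stats": {"rec_yds": 900, "rec": 60, "td": 5}},
]
vor = value_over_replacement(players, scoring, {"RB": 2, "WR": 1})
ranking = sorted(vor, key=vor.get, reverse=True)
```

Sorting by VOR instead of raw points is what lets players at different positions share one ranking, and changing the `scoring` dict is all it takes to re-rank for a different league format.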

Key Technical Innovations

Dynamic Source Weighting: I’m implementing machine learning algorithms that automatically adjust source weights based on their historical accuracy, rather than using static weights.

Monte Carlo Uncertainty: Using Monte Carlo simulations to generate prediction intervals around projections, giving users a better understanding of risk.
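Here is a small Python sketch of the idea. The distributional choices are assumptions made for the example: weekly points are drawn from a normal distribution clipped at zero, each game is missed independently with probability `miss_prob`, and the season is 16 games. Nothing here is calibrated against real outcomes.

```python
import random

def simulate_season(mean_ppg, sd_ppg, games=16, miss_prob=0.1, n_sims=5000, seed=42):
    """Return a sorted list of simulated season point totals."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    totals = []
    for _ in range(n_sims):
        total = 0.0
        for _ in range(games):
            if rng.random() < miss_prob:
                continue  # player misses this game (injury, rest, ...)
            total += max(0.0, rng.gauss(mean_ppg, sd_ppg))
        totals.append(total)
    totals.sort()
    return totals

def percentile(sorted_values, q):
    """Nearest-rank style percentile for q in [0, 1]."""
    return sorted_values[int(q * (len(sorted_values) - 1))]

# Hypothetical player projected for 15 points per game, sd of 6.
totals = simulate_season(mean_ppg=15.0, sd_ppg=6.0)
low, median, high = (percentile(totals, q) for q in (0.10, 0.50, 0.90))
```

The 10th and 90th percentiles give an 80% prediction interval, so two players with the same median can be told apart by how wide their range of outcomes is.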

Ensemble Learning: Combining multiple prediction models (random forests, gradient boosting, neural networks) to create more robust projections.

Real-time Model Updates: Continuously updating models as new data becomes available throughout the preseason.

Challenges and Solutions

Multi-source consistency. Every provider defines stats a little differently. For example, some sites compute fantasy points with fractional scoring, team abbreviations can change between seasons after relocations/rebrands, and return yards are tracked separately and only added to fantasy points if a league scores them.
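One way to handle the naming side of this is to map every provider’s identifiers onto a canonical key before merging. The Python sketch below is a minimal example; the alias table and suffix list are illustrative assumptions, not the actual mappings any provider needs.

```python
import re

# Illustrative aliases only; a real table would be built per provider and season.
TEAM_ALIASES = {"JAC": "JAX", "STL": "LA", "GNB": "GB", "KAN": "KC", "NWE": "NE"}
NAME_SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}

def normalize_team(abbr):
    """Map a provider's team abbreviation onto one canonical abbreviation."""
    key = abbr.strip().upper()
    return TEAM_ALIASES.get(key, key)

def normalize_name(name):
    """Lowercase, drop punctuation, and remove generational suffixes."""
    cleaned = re.sub(r"[^a-z\s]", "", name.lower())
    tokens = [t for t in cleaned.split() if t not in NAME_SUFFIXES]
    return " ".join(tokens)

def player_key(name, team, position):
    """Canonical (name, team, position) key used to join rows across sources."""
    return (normalize_name(name), normalize_team(team), position.strip().upper())

key_a = player_key("Odell Beckham Jr.", "nyg", "wr")
key_b = player_key("odell beckham", "NYG ", "WR")
```

Both keys come out identical, so the two rows would merge into one player. A stable player ID per source is more reliable than name matching when a provider exposes one.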

Real-time updates. News moves fast, but projections shouldn’t whiplash too much. I version each ingest, track freshness by source, and apply change caps and rolling windows so new information flows through quickly without blowing up week-to-week stability in the preseason, when a lot of player movement happens.
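A change cap can be as simple as clamping how far a projection may move per ingest. The Python sketch below assumes a cap expressed as a fraction of the previous value; the 10% default is an arbitrary number for the example.

```python
def capped_update(previous, incoming, max_change=0.10):
    """Move toward `incoming`, but by at most `max_change` of `previous` per ingest."""
    if previous == 0:
        return incoming  # nothing to anchor a relative cap to
    limit = abs(previous) * max_change
    delta = incoming - previous
    delta = max(-limit, min(limit, delta))  # clamp the step to [-limit, limit]
    return previous + delta

# Hypothetical season projection dropping from 200 to 150 points after news.
value = 200.0
steps = []
for _ in range(3):
    value = capped_update(value, 150.0)
    steps.append(value)
```

The projection still reaches the new value, just over three ingests instead of one, which trades a little latency for stability.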

Source reliability. Not all sources shine in the same places. I’m starting to benchmark historical accuracy by position and context (eg dome vs. weather games), then convert those error profiles into dynamic weights. The ensemble automatically leans on the most trustworthy voices for the current situation.

Future Development for Draft Punk

I’m looking to continue exploring ways of improving the accuracy of my projections with things like:

Deep Learning Models: Neural networks for capturing complex non-linear relationships in player performance
Real-time Model Updates: Live projection updates during games using in-game statistics
Advanced Risk Assessment: Portfolio optimization techniques adapted for fantasy football
Personalized Recommendations: User-specific optimization based on league settings and draft history

Impact on Fantasy Football

My work on Draft Punk is hopefully pushing the boundaries of what’s possible in fantasy football analytics!

I have a passion for combining machine learning with practical apps. In this case, it’s an app that I originally built for myself back in 2010 to help draft a team.

By developing sophisticated ensemble models and uncertainty quantification methods, I’m creating projections that are not just accurate, but also more informative about the risks and opportunities in fantasy football.

The future of fantasy football isn’t just about having access to projections - it’s about having access to the right projections, with the right context, at the right time. That’s exactly what I’m building with Draft Punk.
