How to Model Your FIRE Plan With Real Historical Data

TL;DR

Average-return calculators (a flat 5% real return, for example) give a single answer. Historical sequence testing, which runs your plan through every starting point since 1871, gives a distribution of outcomes, and that is a more honest picture.

What "modelling" actually means

Most FIRE calculators on the internet are average-return calculators. You enter your savings rate, starting capital, and expected return. The calculator compounds your savings at a smooth assumed real return (5%, say) and reports a single answer, for example "21 years to FI."
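As a rough illustration of what such a calculator does under the hood, here is a minimal sketch. It assumes income is normalized to 1, savings are added at the end of each year, and the FI target is 25 times annual expenses (a 4% withdrawal rate). The function name and the 25x multiple are assumptions made for illustration, not taken from any particular calculator.

```python
import math

def smooth_years_to_fi(savings_rate, real_return=0.05, fi_multiple=25.0):
    """Years to FI from zero capital under a constant real return.

    Assumptions (illustrative only): income = 1, savings added at year end,
    FI target = fi_multiple * annual expenses.
    """
    expenses = 1.0 - savings_rate
    target = fi_multiple * expenses
    # Year-end contributions grow to s * ((1 + r)^n - 1) / r after n years.
    # Setting that equal to the target and solving for n gives:
    growth_needed = 1.0 + real_return * target / savings_rate
    return math.log(growth_needed) / math.log(1.0 + real_return)

print(round(smooth_years_to_fi(0.50), 1))
```

Under these assumptions a 50% savings rate at a smooth 5% real return comes out at roughly 16.6 years: one number, with no indication of how far real outcomes can spread around it.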

This is mathematically correct but conceptually inadequate. Real markets don't deliver smooth 5% returns. They deliver -15%, then +28%, then -3%, then +6%, then +1%, then -22%, then +34%. Real-world FIRE timelines depend on the sequence of returns you happen to live through, not on the long-run average.
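To see why the order matters, the sketch below runs the seven example returns above in their listed order and in reverse. It assumes one unit is contributed at the start of each year; `final_balance` is a hypothetical helper written for this illustration.

```python
def final_balance(returns, start=0.0, contribution=1.0):
    """Ending balance when `contribution` is added at the start of each year."""
    balance = start
    for r in returns:
        balance = (balance + contribution) * (1.0 + r)
    return balance

returns = [-0.15, 0.28, -0.03, 0.06, 0.01, -0.22, 0.34]
reversed_returns = list(reversed(returns))

# Regular saver: same returns, same average, different order.
saver_as_listed = final_balance(returns)
saver_reversed = final_balance(reversed_returns)

# Lump sum with no further contributions: order makes no difference.
lump_as_listed = final_balance(returns, start=100.0, contribution=0.0)
lump_reversed = final_balance(reversed_returns, start=100.0, contribution=0.0)

print("saver, listed order:  ", round(saver_as_listed, 2))
print("saver, reversed order:", round(saver_reversed, 2))
print("lump sum, either order:", round(lump_as_listed, 2), round(lump_reversed, 2))
```

The saver ends with about 8.2 units in the listed order and about 7.3 in reverse, because the +34% year lands on a larger balance when it comes last. The lump sum ends at the same value either way, which is why sequence risk only shows up once money is flowing in or out.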

Historical sequence testing is the alternative methodology: instead of assuming one smooth return path, test your plan against every actual return sequence that's ever happened. The output is a distribution of FI dates, not a single number.

How it works

The Shiller dataset contains monthly S&P 500 returns and CPI data going back to 1871. That's 155 years of history, which means roughly 100+ overlapping "starting cohorts" for a typical 50-year retirement plan.

For each starting year, the simulator:

Takes your input parameters (savings rate, expenses, allocation, starting capital).
Applies the actual returns that happened starting from that year, month by month.
Tracks when your portfolio crosses the FIRE threshold.
Records the FI date for that cohort.

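After running every cohort, you get a distribution of FI dates. Percentiles here rank outcomes from worst to best, so the 10th percentile is a slow path to FI and the 90th a fast one. The lower percentiles (10th, 25th), the median, and the upper percentiles (75th, 90th) tell you the realistic range of outcomes for your plan, not the theoretical average.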
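The loop described above can be sketched as follows. This is a simplified illustration, not the simulator's actual code: it steps annually rather than monthly, normalizes income to 1, assumes a 25x-expenses FI target, and uses randomly generated placeholder returns in place of the Shiller data. All function names are hypothetical.

```python
import random
import statistics

def years_to_fi_for_cohort(real_returns, start_index, savings_rate,
                           fi_multiple=25.0, start_capital=0.0):
    """Years until the portfolio reaches the FI target, or None if data runs out."""
    target = fi_multiple * (1.0 - savings_rate)  # income normalized to 1
    balance = start_capital
    for years, r in enumerate(real_returns[start_index:], start=1):
        balance = (balance + savings_rate) * (1.0 + r)  # save, then apply the year's return
        if balance >= target:
            return years
    return None  # cohort started too late to finish within the data

def fi_distribution(real_returns, savings_rate):
    """Map each starting index to its years-to-FI, skipping unfinished cohorts."""
    results = {}
    for start_index in range(len(real_returns)):
        years = years_to_fi_for_cohort(real_returns, start_index, savings_rate)
        if years is not None:
            results[start_index] = years
    return results

# PLACEHOLDER DATA: random draws, NOT Shiller returns. Swap in real annual
# real returns for your allocation before reading anything into the output.
rng = random.Random(0)
placeholder_returns = [rng.gauss(0.05, 0.17) for _ in range(155)]

distribution = fi_distribution(placeholder_returns, savings_rate=0.50)
years = sorted(distribution.values())
if len(years) >= 2:
    deciles = statistics.quantiles(years, n=10)  # 9 cut points: 10%, 20%, ..., 90%
    print("fastest decile cut (years):", deciles[0])
    print("median (years):", statistics.median(years))
    print("slowest decile cut (years):", deciles[-1])
```

Note the direction of the ranking: measured in years to FI, the slowest decile is the one with the largest values. Dropping cohorts that run out of data also removes the slowest paths among the most recent start years, so a fuller version would need to handle that censoring explicitly.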

For a 50% savings rate starting from zero with a 75/25 allocation:

10th percentile (worst sequences, like starting in 1929 or 1966): ~22 years
Median: ~16.5 years
90th percentile (best sequences, like starting in 1982): ~12 years

The Mr. Money Mustache (MMM) table's 17 years lands between the median and the 25th percentile. That is honest but incomplete: it doesn't show the spread.

Why historical beats Monte Carlo

Some FIRE calculators use Monte Carlo simulation: they generate random return sequences from a normal distribution with an assumed mean and standard deviation. This sounds more sophisticated than historical testing because you can run 10,000 paths instead of roughly 100.

The problem: in this basic form, Monte Carlo assumes returns are independent and identically distributed (IID). Real returns aren't.

Momentum: returns are positively autocorrelated over 3–12 months. Up markets tend to keep going up; down markets tend to keep going down.
Mean reversion: returns are negatively autocorrelated over 3–7 years. After several great years, mediocre years tend to follow.
Regime shifts: 1970s stagflation, 2000s lost decade, 2010s tech boom — these aren't random; they're persistent macro patterns.

Pure Monte Carlo can generate sequences that never actually happen, particularly very benign sequences that mislead you into optimism. Historical sequence testing constrains the simulation to realistic patterns. For deeper coverage, see our Monte Carlo vs historical article.
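A small sketch makes the IID point concrete. It builds a toy series with deliberately exaggerated persistence, then redraws every period independently, which is what an IID simulation effectively does. The series is synthetic and the persistence value of 0.6 is an assumption chosen for visibility, not an estimate of real market behavior.

```python
import random

def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum((series[i] - mean) * (series[i + lag] - mean)
                     for i in range(n - lag))
    return covariance / variance

def iid_resample(series, rng):
    """Draw each period independently with replacement, as an IID simulation would."""
    return [rng.choice(series) for _ in series]

# TOY SERIES, not market data: an AR(1) process whose persistence (0.6) is
# deliberately exaggerated so the effect is easy to see.
rng = random.Random(42)
persistent = [0.05]
for _ in range(1999):
    previous_gap = persistent[-1] - 0.05
    persistent.append(0.05 + 0.6 * previous_gap + rng.gauss(0.0, 0.10))

resampled = iid_resample(persistent, rng)

print("lag-1 autocorrelation, original order:", round(autocorrelation(persistent), 2))
print("lag-1 autocorrelation, IID resample:  ", round(autocorrelation(resampled), 2))
```

The resampled series keeps the same individual values but loses the link between one period and the next, so its autocorrelation falls to roughly zero. Whatever serial structure real returns carry is discarded in the same way by an IID draw.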

What you should look at

When testing your FIRE plan against historical data, focus on three things:

The worst cohort, not the median. A plan that survives only the median cohort isn't really a plan: roughly half of all historical sequences would have broken it. A plan that survives the worst cohort (or, nearly as defensibly, the 5th percentile) is robust. The worst US cohorts are 1929, 1966, 1969, and 2000.
The shape of failure. When a cohort fails, look at how — sharp early drawdown, slow grinding stagflation, late-cycle inflation. Different failure modes call for different mitigations (lower withdrawal rate, cash buffer, factor tilt, etc.).
The relationship between savings rate and worst-case timeline. Higher savings rates compress the entire distribution. At 70% savings rate, the worst cohort still hits FI in ~11 years, only 2.5 years worse than the median. That compression is one of the deepest reasons FIRE adherents fixate on savings rate.

What to do with the output

Run your specific plan through the simulator and look at the survival curve. The most useful exercises:

Identify the cohorts where your plan fails. Are they all stagflation cohorts? Depression cohorts? That tells you what kind of risk dominates.
Compare mitigations. Adding spending flexibility, factor tilts, or a cash buffer should visibly shift the survival curve to the right. If they don't, they're not adding what you think.
Set conservative targets. Plan for the 25th percentile FI date, not the median. The resulting buffer of about 4 years is the cost of insurance.
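The first of these exercises, finding the cohorts where a plan fails, can be sketched as below. This is a simplified illustration under stated assumptions: annual steps, a starting portfolio of 1, a constant real withdrawal taken at the start of each year, and a tiny made-up return series. The function names are hypothetical and the data is a placeholder, not historical.

```python
def survives(real_returns, withdrawal_rate, horizon_years):
    """True if a portfolio of 1.0 funds every withdrawal over the horizon.

    Expects at least `horizon_years` returns; withdrawals are constant in
    real terms and equal to withdrawal_rate * the initial portfolio.
    """
    balance = 1.0
    for r in real_returns[:horizon_years]:
        balance -= withdrawal_rate  # withdraw at the start of the year
        if balance < 0:
            return False
        balance *= 1.0 + r  # then apply that year's real return
    return True

def failing_cohorts(real_returns, first_year, withdrawal_rate, horizon_years):
    """Labels of the starting cohorts whose full window ends in failure."""
    failures = []
    last_start = len(real_returns) - horizon_years
    for i in range(last_start + 1):
        window = real_returns[i:i + horizon_years]
        if not survives(window, withdrawal_rate, horizon_years):
            failures.append(first_year + i)
    return failures

# PLACEHOLDER DATA: one bad first year followed by flat returns, purely illustrative.
placeholder_returns = [-0.5] + [0.0] * 10
horizon = 8
failed = failing_cohorts(placeholder_returns, first_year=0,
                         withdrawal_rate=0.10, horizon_years=horizon)
cohort_count = len(placeholder_returns) - horizon + 1

print("failing cohort start indices:", failed)
print("survival rate:", 1 - len(failed) / cohort_count)
```

With real data, the list of failing start years is what you would inspect for a common pattern, and rerunning with a mitigation applied shows whether that list actually shrinks.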

The deeper methodology behind our simulator is covered in how FIRE Wealth OS simulates 155 years. The underlying data is freely available — we use Shiller for US equity history and Ken French for factor returns.

The honest baseline

Historical sequence testing isn't perfect. It assumes the future will be drawn from the same distribution as the past, which may not hold. US equity history is unusually good by global standards (no world wars on home soil, no hyperinflation, no political revolutions during the era of public markets). The actual future might be worse than the historical worst case.

The defensible response: use historical testing to identify the worst plausible scenarios, then add 10–15% buffer on top for the futures the data doesn't cover. That's more honest than either smooth-return calculators or pure Monte Carlo.

Open our simulator and run your real numbers. Compare the distribution to what a simple 5% real calculator would tell you. The differences will probably surprise you — and the worst-case cohorts will tell you what to actually plan against.

Frequently asked questions

Why not Monte Carlo?

Monte Carlo assumes independent returns, but historical returns aren't independent: they have momentum, mean reversion, and regime shifts that Monte Carlo can miss. See our [historical vs MC article](/blog/monte-carlo-vs-historical).

How many starting points should I test?

All of them. There are roughly 100 overlapping 50-year windows in our dataset. Testing the worst 5 cohorts honestly is the point: that's where plans actually break.

What if the future is worse than any historical cohort?

Add a buffer. Plan for the worst observed cohort and add 10–15% margin. That covers the possibility of an unprecedented future without becoming impossibly conservative.

Stress-test your own FIRE plan

FIRE Wealth OS runs your savings rate and expenses against every historical market starting point since 1871. Free to use, no card required.
