
How Cross Country Ratings Work

Understanding the place-based rating algorithm


📊 Overview

Iowa Cross Country Ratings uses the Elo-MMR rating system, a Bayesian algorithm designed for massively multiplayer competitions. Unlike time-based systems, it focuses on where you finish rather than how fast you run.

Key Principle: Beating quality runners in head-to-head competition matters more than running fast times on easy courses. The algorithm is mathematically proven to be incentive-compatible - your rating can only improve by performing better, never by sandbagging.

This approach creates ratings that:

  • ✅ Work across all courses (hilly, flat, fast, slow)
  • ✅ Reward competitive racing (placing well against strong fields)
  • ✅ Minimize course bias (no "pad your stats on easy courses")
  • ✅ Update after every race (fresh, current data)

🎯 What is a Rating?

Your rating is a number that represents your competitive strength relative to other runners in Iowa. Think of it like a golf handicap or a chess Elo rating.

Rating Scale
  • 2450+ - Top 1%
  • 2150-2449 - Top 5%
  • 1980-2149 - Top 10%
  • 1790-1979 - Top 20%
  • 1445-1789 - Top 50%
  • Below 1445 - Bottom 50%

Based on 2025 season data (5,127 boys, 3,634 girls). New runners start at 1500.
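
For quick reference, the scale above translates to a simple lookup. A minimal sketch, with the cutoffs taken directly from the table:

```python
def percentile_band(rating):
    """Map a rating to the 2025-season percentile bands listed above."""
    bands = [(2450, "Top 1%"), (2150, "Top 5%"), (1980, "Top 10%"),
             (1790, "Top 20%"), (1445, "Top 50%")]
    for cutoff, label in bands:
        if rating >= cutoff:
            return label
    return "Bottom 50%"
```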

What Affects Your Rating
  • ✅ Place - Where you finish
  • ✅ Field Strength - Who you race against
  • ✅ Race Combination - Multiple races (JV/Varsity) at same meet are combined by time
  • ✅ Recent Performance - Latest races weighted more
  • ❌ NOT Time - Course difficulty varies too much

🧮 How Your Rating is Calculated

The algorithm uses a two-phase process to separate your day-to-day performance from your underlying skill level:

Phase 1: Estimate Your Performance

For each race, the algorithm estimates how well you ran that specific day (your "performance") by looking at who finished ahead of and behind you:

  • Runners you beat: Performance increases based on their skill ratings
  • Runners who beat you: Performance decreases based on their skill ratings
  • Your finish position: Provides additional context for the overall field strength

Think of "performance" as your rating on that specific day - you might overperform or underperform your true skill.
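
To make Phase 1 concrete, here is a minimal Python sketch. It is not the production Elo-MMR code - the logistic win model and its scale constant are illustrative assumptions - but it captures the core idea: search for the rating that best explains who you beat and who beat you.

```python
def win_prob(rating_a, rating_b, scale=400):
    """Chance that a runner rated rating_a beats one rated rating_b
    (classic Elo-style logistic curve; the scale constant is an assumption)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))

def estimate_performance(opponent_ratings, actual_wins, lo=0.0, hi=4000.0):
    """Phase 1 (simplified): binary-search for the rating whose expected
    win count against this field matches the actual win count."""
    for _ in range(60):
        mid = (lo + hi) / 2.0
        expected_wins = sum(win_prob(mid, r) for r in opponent_ratings)
        if expected_wins < actual_wins:
            lo = mid  # this rating under-explains the result; search higher
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Beating higher-rated opponents raises the balance point; losing to lower-rated opponents lowers it, exactly as the bullets above describe.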

Phase 2: Update Your Skill Rating

Your overall skill rating is updated by combining your past rating with your new performance evidence:

  • Prior belief: What the algorithm knew about your skill before this race
  • New evidence: Your performance in this race
  • Bayesian update: The algorithm combines these using probability theory to estimate your current skill

Your skill rating is an estimate of your underlying competitive level based on all your races.
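
Phase 2 can be pictured, in the same sketch style, as a precision-weighted average of prior belief and new evidence. The sigma (uncertainty) values are illustrative assumptions and the real update is more sophisticated, but the intuition is the same:

```python
import math

def bayesian_update(prior, prior_sigma, perf, perf_sigma):
    """Phase 2 (simplified): blend the prior rating with the new
    performance, each weighted by its certainty (1 / sigma^2)."""
    w_prior = 1.0 / prior_sigma ** 2
    w_perf = 1.0 / perf_sigma ** 2
    new_rating = (w_prior * prior + w_perf * perf) / (w_prior + w_perf)
    new_sigma = math.sqrt(1.0 / (w_prior + w_perf))  # certainty improves
    return new_rating, new_sigma
```

A well-established rating (small prior_sigma) barely moves after one race; an uncertain one moves a long way toward the new performance. Case 5 below shows the extreme version of this.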

๐Ÿ“ Case Studies: Real Examples from 2025

Let's examine six real examples from the 2025 season that illustrate how the rating system works in different scenarios. Each case study shows the complete data: prior rating, race result, head-to-head matchups, and rating update.
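
A quick note on reading the "Head-to-Head Analysis" numbers: they can be thought of as simple counting against each opponent's prior rating. A minimal sketch, assuming "expected" just means the opponent was rated on the predictable side of you (the real algorithm reasons with probabilities rather than hard cutoffs):

```python
def head_to_head_summary(my_rating, beaten, lost_to):
    """beaten / lost_to: lists of opponents' prior ratings.
    A win over a lower-rated runner counts as expected; a win over
    a higher-rated runner counts as unexpected (and vice versa)."""
    return {
        "expected_wins": sum(1 for r in beaten if r <= my_rating),
        "unexpected_wins": sum(1 for r in beaten if r > my_rating),
        "expected_losses": sum(1 for r in lost_to if r >= my_rating),
        "unexpected_losses": sum(1 for r in lost_to if r < my_rating),
    }
```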

Case 1: Winning the State Championship as the Favorite

Pre-Race
  • Prior Rating: 2884 (top 1%)
  • Days Since Last Race: 9
  • Field Size: 558 runners
  • Median Field Rating: 2055
  • Highest Opponent: 2833 (51-point gap)
Race Result
  • Actual Place: 1st
  • Expected Place: 1st (as predicted)
Head-to-Head Analysis
  • Expected Wins: 557
  • Unexpected Wins: 0
  • Expected Losses: 0
  • Unexpected Losses: 0
Rating Update
  • Performance Score: 2905
  • Updated Rating: 2896
  • Change: +12
💡 What This Shows: Even winning the state championship produces a modest rating gain (+12) when the result is expected. This runner entered as the favorite and delivered exactly what the ratings predicted. With 557 expected wins and zero unexpected results, the performance score (2905) was only slightly above the incoming rating (2884). Expected results = small changes.

Case 2: A Breakthrough Race

Pre-Race
  • Prior Rating: 2006 (top 10%)
  • Days Since Last Race: 8
  • Field Size: 558 runners
  • Median Field Rating: 2055
  • Highest Opponent: 2884 (878-point gap!)
Race Result
  • Actual Place: 69th
  • Expected Place: 326th (257 places better!)
Head-to-Head Analysis
  • Expected Wins: 232
  • Unexpected Wins: 257 🔥
  • Expected Losses: 68
  • Unexpected Losses: 0
Rating Update
  • Performance Score: 2188
  • Updated Rating: 2139
  • Change: +133
💡 What This Shows: This is what a breakthrough race looks like! Beating 257 higher-rated runners produced a massive +133 rating gain—one of the largest possible single-race improvements. The performance score (2188) was 182 points above the incoming rating (2006), reflecting these exceptional unexpected wins. The field included the state champion (rated 2884), creating an 878-point gap, but this runner finished 69th—257 places better than their expected 326th place finish—by beating runners far above their rating level. Unexpected wins = major gains.

Case 3: A Bad Race

Pre-Race
  • Prior Rating: 2066 (top 10%)
  • Days Since Last Race: 4
  • Field Size: 517 runners
  • Median Field Rating: 1652
  • Highest Opponent: 2811 (745-point gap)
Race Result
  • Actual Place: 117th
  • Expected Place: 71st (46 places worse!)
Head-to-Head Analysis
  • Expected Wins: 397
  • Unexpected Wins: 3
  • Expected Losses: 67
  • Unexpected Losses: 49 😬
Rating Update
  • Performance Score: 1933
  • Updated Rating: 2006
  • Change: -60
💡 What This Shows: This is what a bad race looks like. Coming in rated 2066 (top 10%), this runner was expected to finish around 71st but finished 117th—46 places worse than expected. The algorithm detected 49 unexpected losses to runners rated below 2066. The performance score (1933) was 133 points below their incoming rating, producing a -60 point penalty. In a large field (517 runners), having nearly 50 unexpected losses signals significant underperformance. The rating system is unforgiving when you lose to runners you were expected to beat—it's not about your absolute time or place, it's about your results relative to the competition's ratings.

Case 4: Rating Stability

Pre-Race
  • Prior Rating: 1696 (~50th percentile)
  • Field Size: 96 runners
  • Median Field Rating: 1677
  • Highest Opponent: 2315 (619-point gap)
Race Result
  • Actual Place: 46th (48th percentile)
  • Expected Place: 44th (close to prediction)
Head-to-Head Analysis
  • Expected Wins: 48
  • Unexpected Wins: 2
  • Expected Losses: 43
  • Unexpected Losses: 2
Rating Update
  • Performance Score: 1694
  • Updated Rating: 1694
  • Change: -2
💡 What This Shows: This is what rating stability looks like. A mid-pack runner finishes in the middle of the pack with almost perfectly balanced results: 2 unexpected wins, 2 unexpected losses, and mostly expected outcomes in both directions. The performance score (1694) nearly matched the incoming rating (1696), resulting in just a -2 point change. The algorithm recognized this as confirmation of their current ability level—neither improving nor declining. Small changes (±10 points) indicate you're racing at your current level.

Case 5: A New Runner's First Race

Pre-Race
  • Prior Rating: 1500 (default for new runners)
  • Days Since Last Race: N/A (first race)
  • Field Size: 78 runners
  • Median Field Rating: 1500 (many new runners)
  • Highest Opponent: 2267 (767-point gap)
Race Result
  • Actual Place: 9th
  • Expected Place: 31st (22 places better!)
Head-to-Head Analysis
  • Expected Wins: 47
  • Unexpected Wins: 22
  • Expected Losses: 8
  • Unexpected Losses: 0
Rating Update
  • Performance Score: 2014
  • Updated Rating: 2008
  • Change: +508
💡 What This Shows: New runners start at 1500 (average). This runner's first race shattered that assumption—they finished 9th out of 78, beating 22 runners rated higher than 1500. This strong debut produced a performance score of 2014, resulting in a massive +508 jump to 2008 (top 10-15%). Because this is their first race, there's high uncertainty about their true ability, so the Bayesian update heavily weighted the new evidence. First races create large rating swings because the system has no prior data. Subsequent races will fine-tune this estimate with smaller adjustments.
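
To see the first-race effect in numbers, here is a toy comparison reusing the bayesian_update() sketch from Phase 2 above. All sigma values are invented for illustration, so the output won't match this case exactly:

```python
# New runner: very uncertain prior, so the performance evidence dominates
newcomer = bayesian_update(prior=1500, prior_sigma=350, perf=2014, perf_sigma=200)

# Established elite (compare Case 1): confident prior, so similar-quality
# evidence barely moves the rating
veteran = bayesian_update(prior=2884, prior_sigma=80, perf=2905, perf_sigma=200)

# newcomer lands most of the way toward 2014; veteran moves only a few points
```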

Case 6: The Rating Ceiling

Pre-Race
  • Prior Rating: 2492 (top 1%)
  • Days Since Last Race: 5
  • Field Size: 54 runners (small field)
  • Median Field Rating: 1500 (weak field)
  • Highest Opponent: 2316 (176-point gap)
Race Result
  • Actual Place: 1st
  • Expected Place: 1st (as predicted)
Head-to-Head Analysis
  • Expected Wins: 53
  • Unexpected Wins: 0
  • Expected Losses: 0
  • Unexpected Losses: 0
Rating Update
  • Performance Score: 2495
  • Updated Rating: 2494
  • Change: +2
💡 What This Shows: This is the "rating ceiling" effect in action. An elite runner wins convincingly (probably ran a fast time too), but gains almost nothing (+2 points) because the competition wasn't strong enough. The second-place finisher was rated 2316—a full 176-point gap. From the algorithm's perspective, this runner beat 53 opponents they were expected to beat. Zero unexpected wins means zero evidence of improvement. The performance score (2495) was barely above their incoming rating (2492). To move up from 2492, this runner needs to race against opponents rated 2400+ where unexpected wins can actually move the needle. Elite runners need elite competition.
🎯 Key Takeaways from These Cases
  • Expected results = small changes (Cases 1 & 6) - Dominating weak competition or winning as the favorite produces minimal rating gains
  • Unexpected wins = major gains (Case 2) - Beating higher-rated runners is how you make big jumps in the rankings
  • Unexpected losses = rating drops (Case 3) - Losing to lower-rated runners signals underperformance
  • Balanced results = stability (Case 4) - Roughly equal unexpected wins and losses means your rating accurately reflects your current level
  • First races = large swings (Case 5) - High uncertainty about new runners allows dramatic rating changes from initial evidence
  • Field strength matters more than placement (All cases) - 69th place with 257 unexpected wins (+133) beats 1st place with zero unexpected wins (+2)

🤔 Why Place-Based Instead of Time-Based?

You might wonder: "Why not just rank runners by their fastest time?" Great question! Here's why place-based ratings are more accurate:

โŒ Problem with Time-Based Rankings
  • Course bias: Fast, flat courses produce faster times than hilly courses
  • Gaming the system: Runners could chase fast times on easy courses
  • Weather impact: Wind, rain, heat drastically affect times
  • Inaccurate comparisons: Can't fairly compare a 17:00 on Course A vs 17:30 on Course B
✅ Benefits of Place-Based
  • Course neutral: Works on any course (flat, hilly, grass, dirt)
  • Rewards competition: Beating strong runners matters more than running fast times alone
  • Weather neutral: If wind slows everyone, relative places stay the same
  • Accurate comparisons: Head-to-head results are objective
Bottom line: Your place in the race tells us more about your ability than your time. A 17:30 while beating top-ranked runners is more impressive than a 16:45 against weak competition.

โ›ฐ๏ธ What About Course Difficulty?

One of the most common questions I get: "Doesn't the algorithm account for course difficulty?"

Short Answer: No, and here's why that's mathematically sound!

The algorithm doesn't need to adjust for course difficulty because relative rankings are course-independent. Here's the key insight:

  • Everyone faces the same course: If a hilly course slows you by 30 seconds, it slows everyone by roughly 30 seconds
  • Places reveal true skill: The runners who finish ahead of you on a hard course would likely also finish ahead on an easy course
  • Rating updates depend on opponents: You gain/lose points based on who you beat, not how fast you run
  • Course effects cancel out: If everyone's performance drops equally, the relative comparisons (and thus ratings) remain valid

This is a fundamental property of place-based systems: as long as all competitors face identical conditions, course difficulty doesn't bias the results.
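
A three-line check of that claim: shifting every runner's time by the same amount never changes the finish order (runner names and times here are invented):

```python
easy = {"A": 17 * 60, "B": 17 * 60 + 15, "C": 17 * 60 + 40}  # seconds
hilly = {name: t + 30 for name, t in easy.items()}           # everyone +30s

finish_order = lambda results: sorted(results, key=results.get)
assert finish_order(easy) == finish_order(hilly)  # places are identical
```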

Example:

Scenario: You run 17:00 on an easy, flat course with 300 runners and place 10th. Your competitor runs 17:45 on a difficult, hilly course with only 30 runners and also places 10th.

Result: You earn a much higher rating because:

  • Field size: 10th out of 300 is far more impressive than 10th out of 30
  • Course difficulty: Reflected in times, but doesn't affect ratings
  • Field strength: If your 300-runner race includes state champions, your 10th place performance score would be even higher. If their 30-runner race is a small local meet, their performance score would be lower.

Key insight: Your rating reflects who you beat, not how fast you ran. 10th place means very different things depending on field size and field strength.
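
Using the estimate_performance() sketch from Phase 1 earlier, the same 10th-place finish against these two hypothetical fields yields very different performance scores (all field ratings below are invented for illustration):

```python
# 10th out of 300 in a deep field vs 10th out of 30 in a shallow one
deep_field = [1400 + i for i in range(299)]         # 299 opponents
shallow_field = [1400 + 10 * i for i in range(29)]  # 29 opponents

p_deep = estimate_performance(deep_field, actual_wins=290)
p_shallow = estimate_performance(shallow_field, actual_wins=20)
# p_deep comes out several hundred points higher than p_shallow
```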

💭 What Coaches & Athletes Are Asking

In my November 2025 survey, I heard from over 130 Iowa coaches, athletes, and fans. Here are some common concerns - and how the place-based system addresses them:

Concern #1: Course Difficulty

"We had instances when a team we beat head to head one week (with the same lineup / top 7) would be rated ahead of us the next after they ran on a fast / flat course. We shouldn't be downgraded for running challenging / difficult courses up until the end of the season while other teams run on flat and sometimes short courses."

Iowa High School Coach

"Course Difficulty - I felt my runners sometimes didn't get bump due to a time on a tough course. I'm considering changing my schedule to run much easier courses and meets to help get a better rank score."

Iowa High School Coach
✅ How place-based ratings solve this:

The algorithm never uses times - only places. Course difficulty doesn't matter because everyone races the same course. Your rating depends on who you beat, not how fast you run. See the Course Difficulty section above for the full explanation.

Concern #2: Time-Based Expectations

"Rankings are difficult when based on time. Courses vary too much. There are often guys who are ranked higher than another runner, even though the second runner beat them head to head. Head to head should somehow be included in the rankings."

Iowa High School Coach

"People with faster season pr's being ranked lower than others."

Iowa High School Athlete
✅ Great news - this system IS head-to-head!

Ratings are calculated entirely from head-to-head results, not times. If you have a faster PR but a lower rating, it means you haven't beaten as many highly-rated runners in competition. The place-based approach rewards competitive performance over fast times on easy courses.

Concern #3: Understanding the Numbers

"I was confused on the meaning of the rating and performance numbers."

Iowa High School Athlete

"I feel more education or knowledge needs to be given to coaches in particular for what goes into the rankings. How do each individual's rankings on a team go into what the team ranking is. Also, is it the last performance or a comprehensive rating that includes results from previous seasons."

Iowa High School Coach
✅ That's exactly why I created this guide!

The case studies above walk through real races, showing the following:

  • Performance score: Your rating on that specific day (can be higher or lower than your skill rating)
  • Rating (Skill): Your overall competitive level based on your recent races
  • Rating changes: How the Bayesian update combines performance + prior rating

How previous seasons factor in: Returning runners start each season with their final rating from the previous season. However, because of the large time gap during the off-season, ratings can change dramatically after the first race if performance differs significantly. Your rating primarily reflects your most recent race and your incoming rating - it's NOT an average of your last several races.
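
In the spirit of the earlier sketches, the off-season effect can be pictured as uncertainty growing while a runner isn't racing. The growth law and drift rate here are invented purely for illustration:

```python
import math

def inflate_sigma(sigma, days_inactive, drift_per_day=0.5):
    """Sketch: uncertainty grows with time away from racing, so the
    next Bayesian update leans harder on the new performance."""
    return math.sqrt(sigma ** 2 + (drift_per_day ** 2) * days_inactive)
```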

Team rankings: Teams are ranked by simulating a mock race where all runners are placed in order by their current ratings, then scored using traditional 5-runner team scoring (just like an actual meet).
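
A sketch of that mock-race scoring is below. The data shape is assumed, and real meets also let a team's 6th and 7th finishers displace opposing scorers, which this toy version skips:

```python
def team_rankings(runners):
    """runners: list of (name, team, rating) tuples. Place everyone
    in one mock race by rating, then sum each team's top-5 places.
    Lowest score wins, as at an actual meet."""
    placed = sorted(runners, key=lambda r: r[2], reverse=True)
    places = {}
    for place, (_, team, _) in enumerate(placed, start=1):
        places.setdefault(team, []).append(place)
    # Only teams with at least five finishers receive a team score
    scores = {team: sum(sorted(p)[:5])
              for team, p in places.items() if len(p) >= 5}
    return sorted(scores.items(), key=lambda item: item[1])
```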

💬 Common Questions

Why did my rating drop after a race where I ran well?

This most commonly happens when racing against much stronger competition than usual. For example: you normally finish top 10 at smaller meets, but at a large competitive invitational with 200 runners, you finish 50th. You ran a good race and gave great effort, but the field was significantly tougher than your typical competition.

The math: Your rating is based on who you beat, not just how you felt or how hard you tried. At the big meet, you beat fewer highly-rated runners than usual, so your performance score reflects the tougher competition.

Don't worry about small drops: A 20-30 point decrease after racing up in competition is normal and doesn't significantly impact your overall rating. These races are valuable for experience and development, even if they don't boost your rating.

How can I improve my rating?

  1. Race against strong competition: Beating highly-rated runners earns the most points
  2. Finish as high as possible: Top finishes create larger rating gains (similar to a bonus) because rating gaps are bigger at the top of the distribution than in the middle
  3. Race frequently: More races = more opportunities to prove your ability
  4. Improve your fitness: The better you race, the better your places!

How accurate are the ratings?

Ratings become more predictive as the season progresses and runners race more frequently. Early season predictions (August-September) are less reliable than late season predictions (October-November) because there's simply less data to work with.

Meet Forecast tool validation: The ratings-based Meet Forecast tool consistently predicts team placements and individual results with high reliability. State meet results typically match rating-based predictions within 10-15%, which is remarkably strong given the inherent unpredictability of human performance in sports.

Coach confidence: My 2025 survey showed 83% of coaches believe the ratings accurately reflect runner ability and competitive level.

Research backing: The Elo-MMR algorithm has been proven in academic research (published at the Web Conference 2021) to outperform other rating systems in both predictive accuracy and computational efficiency across multiple sports and competitions.

How are JV and Varsity races at the same meet handled?

At many meets, there are separate JV and Varsity races for each gender. Since all races are the same distance on the same course, I combine same-gender races by finish time for rating calculations.

Realistic example: Suppose the JV girls race is run first, followed by the Varsity girls race:

  • JV 1st place: 22:30
  • Varsity 20th place: 22:25 (faster - just 5 seconds ahead!)
  • JV 2nd place: 22:45

For ratings, all girls at the meet are grouped together by finish time, then performance scores are calculated. The JV 1st place runner (22:30) is compared against all runners who finished faster than 22:30, including the top 20 Varsity runners.
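
In code, the merge is just a single sort of all same-gender results by time. A sketch with an assumed data shape, using the times from the example above:

```python
def merge_races(races):
    """races: dict mapping race name ("JV", "Varsity") to a list of
    (runner, seconds) results. Returns the combined field ordered by
    finish time; these merged places feed the rating calculation."""
    combined = [result for results in races.values() for result in results]
    combined.sort(key=lambda r: r[1])
    return [(place, runner, secs)
            for place, (runner, secs) in enumerate(combined, start=1)]

meet = {"JV": [("JV 1st", 22 * 60 + 30), ("JV 2nd", 22 * 60 + 45)],
        "Varsity": [("Varsity 20th", 22 * 60 + 25)]}
print(merge_races(meet))  # the 22:25 Varsity runner slots in first
```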

Coaching tip: All runners should race hard all the way to the finish, even when far ahead of (or behind) nearby competitors. You're competing by time against runners in other races at that meet, not just those in your own race. An easy cruise to the finish because "no one is around you" might cost you places to runners finishing at nearly the same time in a different race!

Why combine races? This ensures ratings reflect true competitive level across all competition at the meet, not just within artificial race divisions.

Do big meets change ratings more than small meets?

Large competitive meets create breakthrough opportunities for elite runners (who need strong competition to prove their level) but provide similar outcomes for most runners. For the majority of athletes, big meets offer more precise rating refinement rather than dramatically different rating changes.

The system is incentive-compatible (hard to game):

  • Win a small, weak meet → Low rating gain (you only beat weak runners)
  • Lose at a big, strong meet → Small rating loss (it was expected)
  • Elite runner at small meet → Rating ceiling (can't prove elite status without elite competition)
  • Average runner at any meet โ†’ Similar percentile placement = similar rating change

Balance competition with development: Small local meets are valuable for early-season racing, building confidence, and giving younger runners experience. Racing on different types of courses (flat, hilly, grass, dirt) is an important part of developing a complete cross country athlete - not just fun variety, but essential training.

Scheduling philosophy: The key is to challenge yourself against the best competition available throughout the season, while prioritizing athlete development over ratings optimization. The best schedules balance high-stakes invitationals (for elite runners to break through rating ceilings and for all runners to gain big-meet experience) with local meets (for community and course variety).

Are out-of-state runners included?

Currently, only Iowa runners are included in the rating calculation. Out-of-state races ARE included (if Iowa runners compete in them), but out-of-state runners are not rated. This means the competitive field appears smaller than reality - if you race against 200 runners but only 50 are from Iowa, your rating calculation only considers those 50 Iowa competitors.

Future plans: I'm working on cross-state integration where users from other states could upload their results and get rated here as well. This would create more complete ratings for Iowa runners who race out-of-state!
