With the college basketball season ending and the NFL season not starting up for another four months, I’ve been spending some time working on a tennis model, which has led me to read a lot about Elo Ratings. It’s pronounced as a word, after Arpad Elo, not spelled out E-L-O. There’s quite a bit of info out there, but not much I could find that really simplified and explained Elo Ratings in a clear, concise and detailed manner. Fivethirtyeight.com has used them extensively for most major sports, including the NFL, NBA, and tennis, to create rating systems for teams over time. I can see why.

The **advantages** include

1) They are easy to calculate

2) You don’t need much input data, just wins and losses

3) They’re pretty good. A basic Elo Rating system for any sport will usually correlate very highly with subjective ratings systems.

There are plenty of mostly minor **disadvantages** too, including

1) They don’t account for matchups. Some teams’ or individuals’ styles match up poorly against each other. Peyton usually loses to Brady. A quick-paced basketball team may be more likely to beat a slow-paced one. Some righty pitchers do particularly poorly vs lefty hitters, etc.

2) The most basic Elo system doesn’t account for margin of victory. Winning by 20 is not the same as winning by 1, but basic Elo doesn’t know that.

3) Ratings inflation/deflation. Elo ratings now and those from 10 years ago are not apples to apples comparisons due to possible ratings inflation or deflation. Good players usually retire with a rating above the 1500 they started with, and bad players may leave with a rating below 1500. Once those excess points leave the system they’re gone forever.

4) Not every win is the same. A playoff victory should probably impact a team’s rating more than a week 1 victory. A win or a loss after a long hiatus should probably impact a player’s rating more than a team’s 100th game in 100 days.

5) They don’t account for injuries/personnel changes. The Ravens with Jimmy Clausen at QB are a very different team than the Ravens with a healthy Joe Flacco.
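To make disadvantage 3 a bit more concrete, here’s a minimal sketch of how points leave the system. It assumes the standard Elo update that the overview below describes; the `update` helper, the three-player pool and the player names are all made up for illustration.

```python
def update(winner, loser, k=20):
    """Standard Elo update; returns (new_winner_rating, new_loser_rating)."""
    expected_win = 1.0 / (1.0 + 10 ** ((loser - winner) / 400.0))
    gain = k * (1 - expected_win)
    return winner + gain, loser - gain

# Hypothetical three-player pool, everyone starting at 1500.
pool = {"star": 1500.0, "b": 1500.0, "c": 1500.0}
for winner, loser in [("star", "b"), ("star", "c"), ("star", "b")]:
    pool[winner], pool[loser] = update(pool[winner], pool[loser])

# Every game is zero-sum, so the pool total is unchanged (up to float rounding).
total_before = sum(pool.values())

# The star retires and takes the surplus points out of the pool.
pool.pop("star")
average_after = sum(pool.values()) / len(pool)

print(round(total_before, 6), average_after < 1500)  # 4500.0 True
```

The players left behind now average less than 1500 even though nobody got worse, which is the deflation case. Inflation is the mirror image, when weak players leave with below-1500 ratings.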

For a **simple overview of Elo**, everyone with no record is assigned a default rating of 1500. You can think of a rating as a poker bankroll. When two players/teams play each other, each puts a portion of their rating into a pot whose total size is the K value, and the winner takes the pot. For example, if two teams rated 1500 play each other with a K value of 20, each stakes 10 points, so the winner ends up with a 1510 rating and the loser with a 1490 rating. The trick of Elo comes when two teams of drastically different ratings play each other, because the favorite has to stake more of the pot than the underdog. When 1600 beats 1400 with a K value of 20, the new ratings are only about 1605 and 1395. Not much changed, only a shift of about 5 points, because the favorite won and was expected to win. However, if 1400 beats 1600, the new ratings would be about 1415 and 1585. That’s a much bigger change of about 15 points: because a big upset occurred, it’s likely that the original 1400 and 1600 ratings were incorrect.
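The bankroll analogy can be sketched in a few lines. This assumes the standard Elo expected-score formula, and the `expected_score` and `stakes` names are mine, purely for illustration: each side stakes K times its own win probability, so the pot always totals K.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score (win probability) for A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def stakes(rating_a, rating_b, k=20):
    """Split a pot of size k: A stakes k times its win probability, B the rest."""
    stake_a = k * expected_score(rating_a, rating_b)
    return stake_a, k - stake_a

# Evenly matched teams each put up half the pot.
print(stakes(1500, 1500))  # (10.0, 10.0)

# A 200-point favorite puts up roughly three quarters of the pot.
favorite, underdog = stakes(1600, 1400)
print(round(favorite, 1), round(underdog, 1))  # 15.2 4.8
```

Whoever wins collects the other side’s stake, so a favorite’s win is worth only the underdog’s small stake, while an upset is worth the favorite’s large one.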

There are plenty of Elo libraries out there, but most of them are a few hundred lines of code. All you really need is **14 lines of Python code for the basic system**:

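    def rate_1vs1(p1, p2, k=20, drawn=False):
        # Divide by 400.0: p1 / 400 would truncate for int ratings on Python 2
        rp1 = 10 ** (p1 / 400.0)
        rp2 = 10 ** (p2 / 400.0)
        # Expected scores (win probabilities); they sum to 1
        exp_p1 = rp1 / (rp1 + rp2)
        exp_p2 = rp2 / (rp1 + rp2)
        if drawn:
            s1 = 0.5
            s2 = 0.5
        else:
            # p1 is taken to be the winner
            s1 = 1
            s2 = 0
        new_p1 = p1 + k * (s1 - exp_p1)
        new_p2 = p2 + k * (s2 - exp_p2)
        return new_p1, new_p2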

It could actually be done in fewer lines, but this is easier to follow and understand. You may also need a win probability function:

    def win_probability(p1, p2):
        # Probability that p1 beats p2, based only on the rating difference
        diff = p1 - p2
        p = 1 - 1 / (1 + 10 ** (diff / 400.0))
        return p
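To get a feel for what the 400 in that formula does, here’s a quick sketch that tabulates win probability against rating gap. The gaps are arbitrary picks, and the function is repeated so the snippet runs on its own.

```python
def win_probability(p1, p2):
    """Probability that p1 beats p2, based only on the rating difference."""
    return 1 - 1 / (1 + 10 ** ((p1 - p2) / 400.0))

# Rating gap of the favorite over a 1500-rated opponent.
for gap in (0, 100, 200, 400):
    print(gap, round(win_probability(1500 + gap, 1500), 3))
# 0 0.5
# 100 0.64
# 200 0.76
# 400 0.909
```

A 400-point edge works out to 10-to-1 odds, and only the gap matters: 1900 vs 1500 gives the same number as 1600 vs 1200.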

Usage is pretty simple

    >>> [round(r, 2) for r in rate_1vs1(1600, 1400)]
    [1604.81, 1395.19]
    >>> [round(r, 2) for r in rate_1vs1(1400, 1600)]
    [1415.19, 1584.81]
    >>> win_probability(1600, 1400)
    0.759746926647958
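In practice you’d run the update over a whole schedule. Here’s a rough sketch, assuming results arrive as (winner, loser) pairs in date order. The names and results are invented, and `rate_1vs1` is restated in a compact no-draw form so the snippet is self-contained.

```python
from collections import defaultdict

def rate_1vs1(p1, p2, k=20):
    """New ratings after p1 beats p2 (compact, no-draw form of the update)."""
    exp_p1 = 1.0 / (1.0 + 10 ** ((p2 - p1) / 400.0))
    change = k * (1 - exp_p1)
    return p1 + change, p2 - change

# Hypothetical results as (winner, loser) pairs, oldest first.
results = [("Ann", "Bob"), ("Ann", "Cal"), ("Bob", "Cal"), ("Cal", "Ann")]

# Anyone without a record starts at the default 1500.
ratings = defaultdict(lambda: 1500.0)
for winner, loser in results:
    ratings[winner], ratings[loser] = rate_1vs1(ratings[winner], ratings[loser])

# Print the table, best rating first.
for name, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(name, round(rating, 1))
```

Order matters here: each game uses the ratings as they stood going in, so the results need to be processed chronologically.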
