Autocorrelation is the tendency of a time series to be correlated with its past and future values. Let me put this into football terms. Imagine I have the Dallas Cowboys rated at 1550 before a game against the Philadelphia Eagles. Their rating will go up if they win and go down if they lose. But it should be 1550 after the game, on average. That’s important, because it means that I’ve accounted for all the information you’ve given me efficiently. If I expected the Cowboys’ rating to rise to 1575 on average after the game, I should have rated them more highly to begin with.

It’s true that if I have the Cowboys favored against the Eagles, they should win more often than they lose. But the way I was originally designed, I can compensate by subtracting more points for a loss than I give them for a win. Everything balances out rather elegantly.
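The balancing act above can be checked with a few lines of Python. This is only a sketch: it assumes the standard Elo expected-score formula with a 400-point scale and a K of 20, and the helper names `expected_score` and `expected_rating_after` are made up for illustration.

```python
def expected_score(r1, r2):
    """Standard Elo expected score (win probability) of r1 against r2."""
    return 1.0 / (1.0 + 10 ** ((r2 - r1) / 400.0))


def expected_rating_after(r1, r2, true_p, k=20):
    """Average post-game rating of r1 if r1 really wins with probability true_p."""
    e = expected_score(r1, r2)
    after_win = r1 + k * (1 - e)   # a favorite gains little for a win
    after_loss = r1 + k * (0 - e)  # and loses more for a loss
    return true_p * after_win + (1 - true_p) * after_loss


model_p = expected_score(1550, 1450)
# Model probability is right: the average rating stays where it started
print(round(expected_rating_after(1550, 1450, model_p), 6))
# Team is really better than rated: the average rating drifts upward
print(round(expected_rating_after(1550, 1450, 0.70), 2))
```

The average change works out to `k * (true_p - model_p)` points per game, so any drift is a sign that the pre-game win probability was off.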

However, it did get me curious, so I ran some sims and did find a slight autocorrelation. If you run a 1550 vs 1450 matchup 1000 times, you should expect the 1550 to win ~640 times and the 1450 to win ~360 times, and their ratings should end up right back where they started. However, I found that the 1550 tends to end up slightly higher, a bit below 1552 on average. I’m not 100% sure why this is, but in an ideal world this doesn’t happen and the autocorrelation is bad, unless you believe in the streak theory that wins are more likely after wins and losses are more likely after losses. My theory is that it is due to an imperfect win probability function. I’ve tried a different win probability function which I found posted somewhere

    import math

    def alt_win_probability(p1, p2):
        # Logistic curve fitted to the rating difference
        diff = p1 - p2
        p = 1 - 1 / (1 + math.exp(0.00583 * diff - 0.0505))
        return p

which is highly correlated to the original

and this one tends to leave the 1550 slightly lower on average overall. A perfect win probability for this matchup is actually around 0.637, but we’re splitting hairs at this point. No model is going to be perfect, and as the famous quote goes, “All models are wrong, but some are useful.”
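To see how close the two curves are, here is a small side-by-side sketch. Both functions are copied from this series; the list of rating gaps is an arbitrary choice for illustration.

```python
import math


def win_probability(p1, p2):
    """Original Elo win probability (base 10, 400-point scale)."""
    diff = p1 - p2
    return 1 - 1 / (1 + 10 ** (diff / 400.0))


def alt_win_probability(p1, p2):
    """Alternative logistic win probability quoted above."""
    diff = p1 - p2
    return 1 - 1 / (1 + math.exp(0.00583 * diff - 0.0505))


# Compare the two curves at a few rating gaps
for gap in (0, 50, 100, 200):
    original = win_probability(1500 + gap, 1500)
    alternative = alt_win_probability(1500 + gap, 1500)
    print(gap, round(original, 3), round(alternative, 3))
```

For the 100-point gap the original gives about 0.640 and the alternative about 0.630, which puts them on either side of the 0.637 figure. Note that the alternative is not symmetric: it gives slightly less than 50% when the ratings are equal.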

*1) They don’t account for matchups. Some teams’/individuals’ styles match up poorly vs each other.*

There are a few ways you can account for more detailed stats. For example, let’s look at adjusting for home/away teams. 538’s NFL ratings adjust for this by adding 65 points to the home team. Using the Python code from part 1 for two evenly rated teams, if team A won at home we would have

    >>> [round(r, 2) for r in rate_1vs1(1565, 1500, 20)]
    [1573.15, 1491.85]
    >>> win_probability(1565, 1500)
    0.5924662305843318

You would then need to subtract 65 from team A’s new rating to get their actual new rating, so it would be about 1508.15. Notice this is slightly less than what team A would gain by winning on a neutral field, since the home team was already the favorite.

    >>> rate_1vs1(1500, 1500, 20)
    (1510.0, 1490.0)

According to my calculations, a 59% win probability for the home team is too high. The NFL home team historically has won only 57% of the time, so an Elo home field adjustment of 50 may be more appropriate. This is a good example of why you should always verify someone’s calcs on your own. I believe the reason this doesn’t match up is that they are also making adjustments for margin of victory, which I will go into below.

    >>> win_probability(1550, 1500)
    0.5714631174083814

You could make adjustments for bye weeks, QB injuries, etc., all in a similar manner. You would need to figure out how many Elo points each is worth. You could also try using alternative methods like the Rateform method, but I will not go into that here.
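One way to fold the add-then-subtract step into a single call is sketched below. It assumes the standard Elo formula and the 65-point bonus mentioned above; `rate_with_home_field` and its `hfa` parameter are hypothetical names, not part of the code from part 1.

```python
def rate_with_home_field(home, away, home_won, hfa=65, k=20):
    """Elo update where the home bonus only affects the expected result.

    The bonus (hfa) is added when computing the win expectation, but the
    returned ratings never include it, so nothing has to be subtracted later.
    """
    exp_home = 1.0 / (1.0 + 10 ** ((away - (home + hfa)) / 400.0))
    score_home = 1.0 if home_won else 0.0
    delta = k * (score_home - exp_home)
    return home + delta, away - delta


# Two 1500-rated teams, home team wins
print(rate_with_home_field(1500, 1500, True))
# Same matchup with a smaller 50-point home field adjustment
print(rate_with_home_field(1500, 1500, True, hfa=50))
```

The same pattern would work for bye weeks or injuries: adjust the rating used for the expectation, then apply the resulting change to the unadjusted rating.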

*2) The most basic Elo system doesn’t account for margin of victory.*

Elo ratings can be adjusted to account for margin of victory. Unlike in #1, where the adjustment is made **before** the match is played, margin of victory adjustments are made **after** a game is played. Let’s say we think teams that win by 7 or more points should get double the amount of points added to their rating. We could write a new rating function which adds a new term, the k_multiplier, which is calculated based on the margin of victory.

    def rate_1vs1(p1, p2, mov=1, k=20, drawn=False):
        # Double the update when the margin of victory is 7 or more points
        k_multiplier = 1.0
        if mov >= 7:
            k_multiplier = 2.0
        # Divide by 400.0 so Python 2 does not truncate the exponent
        rp1 = 10 ** (p1 / 400.0)
        rp2 = 10 ** (p2 / 400.0)
        exp_p1 = rp1 / float(rp1 + rp2)
        exp_p2 = rp2 / float(rp1 + rp2)
        if drawn:
            s1 = 0.5
            s2 = 0.5
        else:
            s1 = 1
            s2 = 0
        new_p1 = p1 + k_multiplier * k * (s1 - exp_p1)
        new_p2 = p2 + k_multiplier * k * (s2 - exp_p2)
        return new_p1, new_p2

Our new function will provide the following output

    >>> rate_1vs1(1500, 1500, 1, 20)
    (1510.0, 1490.0)
    >>> rate_1vs1(1500, 1500, 3, 20)
    (1510.0, 1490.0)
    >>> rate_1vs1(1500, 1500, 7, 20)
    (1520.0, 1480.0)
    >>> rate_1vs1(1500, 1500, 20, 20)
    (1520.0, 1480.0)

All good right? Almost. There are two flaws with this system.

1) It could be improved by also adding a multiplier for other margins of victory. Should a 7 point victory really be twice as valuable as a 6 point victory? You could either hard code these multipliers with

    if mov == 1:
        k_multiplier = 0.7
    elif mov == 2:
        k_multiplier = 1.1
    # ...etc

Or you could use a function. 538 uses the function ln(abs(mov) + 1) seen below

You can see that as the margin of victory gets larger and larger, the multiplier also gets larger and larger, but at a lower rate. I assume they found this equation by guessing which multipliers worked best for a few sample margins of victory and then graphing a line of best fit.
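To get a feel for the curve, here is a small sketch that tabulates ln(abs(mov) + 1) for a few margins. The function name `mov_multiplier` is an assumption for illustration, and this is only the logarithmic term quoted above.

```python
import math


def mov_multiplier(mov):
    """Logarithmic margin-of-victory multiplier: ln(|mov| + 1)."""
    return math.log(abs(mov) + 1)


# The multiplier keeps growing with the margin, but more and more slowly
for mov in (1, 3, 7, 14, 28):
    print(mov, round(mov_multiplier(mov), 2))
```

Going from a 7 point win to a 14 point win adds far less to the multiplier than going from 1 to 7, which avoids the hard jump between a 6 point and a 7 point victory.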

2) The second problem is autocorrelation. Let’s say instead of two evenly matched teams playing each other on a neutral field, we have two mismatched teams playing each other. We would have four possible outcomes

1)favorite wins small

2)favorite wins big

3)underdog wins small

4)underdog wins big

    >>> [round(r, 2) for r in rate_1vs1(1550, 1450, 1, 20)]
    [1557.2, 1442.8]
    >>> [round(r, 2) for r in rate_1vs1(1550, 1450, 14, 20)]
    [1564.4, 1435.6]
    >>> [round(r, 2) for r in rate_1vs1(1450, 1550, 1, 20)]
    [1462.8, 1537.2]
    >>> [round(r, 2) for r in rate_1vs1(1450, 1550, 14, 20)]
    [1475.6, 1524.4]

If these outcomes were all equally likely to occur, as they are in the example where it’s two evenly matched teams playing each other, there would be no problem. However, a 1550 vs 1450 matchup has a 64% favorite win probability and a 36% underdog win probability. You can use historical averages to see that a 64% win probability converts to about a 4 point favorite. How often do 4 point favorites lose or win by 7+ points? 4 point favorites win by 7+ points about 40% of the time and lose by 7+ points about 17% of the time. We get the following outcome probabilities

1)favorite wins small 24%

2)favorite wins big 40%

3)underdog wins small 19%

4)underdog wins big 17%

So what is the favorite’s expected rating after this game is played?

1557.2 * .24 + 1564.4 * .40 + 1537.2 * .19 + 1524.4 * .17 = 1550.7

But this is wrong. The favorite’s expected rating should actually be 1550, because that is what the team started out rated at. To account for this you need to use a larger k multiplier when the underdog wins big and a smaller k multiplier when the favorite wins big. 538 uses the equation (2.2/((ELOW-ELOL)*.001+2.2)) graphed below as what we’ll call the auto correlation adjustment multiplier or corr_m.

Plugging in our values, when the favorite wins we have 2.2/(100*.001+2.2) = 0.956 and when the underdog wins we have 2.2/(-100*.001+2.2) = 1.048. Our rewritten elo rating function will be

    def rate_1vs1(p1, p2, mov=1, k=20, drawn=False):
        k_multiplier = 1.0
        corr_m = 1.0
        if mov >= 7:
            k_multiplier = 2.0
            # p1 is the winner, so p1 - p2 is ELOW - ELOL
            corr_m = 2.2 / ((p1 - p2) * .001 + 2.2)
        # Divide by 400.0 so Python 2 does not truncate the exponent
        rp1 = 10 ** (p1 / 400.0)
        rp2 = 10 ** (p2 / 400.0)
        exp_p1 = rp1 / float(rp1 + rp2)
        exp_p2 = rp2 / float(rp1 + rp2)
        if drawn:
            s1 = 0.5
            s2 = 0.5
        else:
            s1 = 1
            s2 = 0
        new_p1 = p1 + k_multiplier * corr_m * k * (s1 - exp_p1)
        new_p2 = p2 + k_multiplier * corr_m * k * (s2 - exp_p2)
        return new_p1, new_p2

Our new ratings

    >>> [round(r, 2) for r in rate_1vs1(1550, 1450, 1, 20)]
    [1557.2, 1442.8]
    >>> [round(r, 2) for r in rate_1vs1(1550, 1450, 14, 20)]
    [1563.77, 1436.23]
    >>> [round(r, 2) for r in rate_1vs1(1450, 1550, 1, 20)]
    [1462.8, 1537.2]
    >>> [round(r, 2) for r in rate_1vs1(1450, 1550, 14, 20)]
    [1476.82, 1523.18]

And our new expected rating for the favorite is 1557.2 * .24 + 1563.77 * .40 + 1537.2 * .19 + 1523.18 * .17 = 1550.2

It’s not perfect, but it’s closer to what its true value should be. You could further adjust the corr_m equation or adjust the k_multiplier to optimize your own custom Elo rating system, but this is left as an exercise for the reader.
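For anyone attempting that exercise, the expected-rating check can be automated. The sketch below assumes the four outcome probabilities quoted above and a winner-first variant of the rating function; `expected_favorite_rating` and the `use_corr` flag are hypothetical names added for illustration.

```python
def rate_1vs1(winner, loser, mov=1, k=20, use_corr=True):
    """Winner-first Elo update with the 2x bonus for 7+ point wins."""
    k_multiplier, corr_m = 1.0, 1.0
    if mov >= 7:
        k_multiplier = 2.0
        if use_corr:
            # Autocorrelation adjustment: shrinks favorite blowouts, boosts upsets
            corr_m = 2.2 / ((winner - loser) * 0.001 + 2.2)
    exp_winner = 1.0 / (1.0 + 10 ** ((loser - winner) / 400.0))
    delta = k_multiplier * corr_m * k * (1.0 - exp_winner)
    return winner + delta, loser - delta


def expected_favorite_rating(fav, dog, use_corr):
    """Probability-weighted rating of the favorite after one game."""
    # (probability, favorite_won, margin) for the four outcomes in the text
    outcomes = [(0.24, True, 1), (0.40, True, 14),
                (0.19, False, 1), (0.17, False, 14)]
    total = 0.0
    for prob, fav_won, mov in outcomes:
        if fav_won:
            new_fav, _ = rate_1vs1(fav, dog, mov, use_corr=use_corr)
        else:
            _, new_fav = rate_1vs1(dog, fav, mov, use_corr=use_corr)
        total += prob * new_fav
    return total


print(round(expected_favorite_rating(1550, 1450, use_corr=False), 2))
print(round(expected_favorite_rating(1550, 1450, use_corr=True), 2))
```

The closer the printed value is to the starting 1550, the less autocorrelation the system has for that matchup, so this makes a handy objective when tuning the multipliers.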

*3) Ratings inflation/deflation.*

This is actually not a problem if you’re trying to make predictions in real time. It only matters if you’re a journalist writing an opinion piece on “who’s the greatest of all time?” It is also noteworthy that ratings inflation or deflation will only occur in sports where teams join and leave the league frequently. As long as the # of total points in the league / # of teams in the league ~= 1500 there is no inflation or deflation. In team sports it’s usually not a problem. Expansion teams will come into the league with 1500 points, and defunct teams are usually relocated rather than discontinued, so their points can follow them to their new city. For individual sports like boxing or tennis, however, where new people join and old people retire all the time, it can be a factor to consider when making historical comparisons.

*4) Not every win is the same.*

This can easily be accounted for by changing k values for games. In chess, when established players play each other the match will have a low k value, whereas when newcomers play they will have a large k value. If you’re using a previous season’s NFL ratings to predict a new season’s games, you should probably have a higher k value for week 1 games than week 14 because it’s a new season and the ratings should change more quickly based on a single early game than on a game late in the season, when you can be more confident in how good a team really is. For example, if we wanted to use a 50% larger k value for a game, we would just pass in a k value of 30 instead of 20.

    >>> rate_1vs1(1500, 1500, 1, 30)
    (1515.0, 1485.0)

*5) They don’t account for injuries/personnel changes.*

This is very similar to the adjustments made in #1. If a team has an injury to its QB you should lower their team rating going into a game, whereas if their starting QB is coming back from injury you should raise their rating.

The **advantages** include

1) They are easy to calculate

2) You don’t need much input data, just wins and losses

3) They’re pretty good. A basic Elo Rating system for any sport will have a very high correlation to any subjective ratings system.

There’s plenty of mostly minor **disadvantages** too including

1) They don’t account for matchups. Some teams/individuals styles match up poorly vs each other. Peyton usually loses to Brady. A quick paced basketball team may be more likely to beat a slow paced one. Some righty pitchers do particularly poorly vs lefty hitters etc.

2) The most basic Elo system doesn’t account for margin of victory. Winning by 20 is not the same as winning by 1, but basic Elo doesn’t know that.

3) Ratings inflation/deflation. Elo ratings now and those 10 years ago are not apples to apples comparisons due to possible ratings inflation or deflation. Good players usually retire with a rating greater than the 1500 they started with, and bad players may leave with a rating below 1500. Once those excess points leave the system they’re gone forever.

4) Not every win is the same. A playoff victory should probably impact a team’s rating more than a week 1 victory. A win or a loss after a long hiatus should probably impact a player’s rating more than a team’s 100th game in 100 days.

5) They don’t account for injuries/personnel changes. Jimmy Clausen at QB for the Ravens is a very different team than when Joe Flacco is healthy.

For a **simple overview of Elo**, everyone with no record is assigned a default rating of 1500. This could be compared to a poker bankroll. When two players/teams play each other, they each put up a portion of their rating or bankroll (known as the K value) into the pot and the winner gets the pot added to their rating. So for example, if two teams rated at 1500 play each other with a K value of 20, the winner would end up with a 1510 rating and the loser would end up with a 1490 rating. The trick of Elo comes when two teams of drastically different ratings play each other. For example, when 1600 beats 1400 with a K value of 20, the new ratings are only about 1605 and 1395. Not much changed, only a difference of about 5, because the favorite won and was expected to win. However, if 1400 beats 1600, the new ratings would be about 1415 and 1585. That is a much bigger change of about 15: because a huge upset occurred, it’s likely that the original 1400 and 1600 ratings were incorrect.

There are plenty of Elo libraries out there, but most of them are a few hundred lines of code. All you really need is **14 lines of Python code for the basic system**

    def rate_1vs1(p1, p2, k=20, drawn=False):
        rp1 = 10 ** (p1 / 400.0)  # 400.0 avoids integer division in Python 2
        rp2 = 10 ** (p2 / 400.0)
        exp_p1 = rp1 / float(rp1 + rp2)
        exp_p2 = rp2 / float(rp1 + rp2)
        if drawn == True:
            s1 = 0.5
            s2 = 0.5
        else:
            s1 = 1
            s2 = 0
        new_p1 = p1 + k * (s1 - exp_p1)
        new_p2 = p2 + k * (s2 - exp_p2)
        return new_p1, new_p2

It could actually be done in fewer, but this is easier to follow and understand. You may also need a win probability function

    def win_probability(p1, p2):
        diff = p1 - p2
        p = 1 - 1 / (1 + 10 ** (diff / 400.0))
        return p

Usage is pretty simple

    >>> [round(r, 2) for r in rate_1vs1(1600, 1400)]
    [1604.81, 1395.19]
    >>> [round(r, 2) for r in rate_1vs1(1400, 1600)]
    [1415.19, 1584.81]
    >>> win_probability(1600, 1400)
    0.759746926647958

Let’s suppose you want to make a living sports betting. **How many bets will you need to make to earn $100k per year?** How large a bankroll do you need? How much should you bet? Which book should you bet with? If you can manage a 10% ROI (unlikely) you would only have to bet $1M/year ($100k / 0.10 = $1M). A more realistic goal is a 2-5% ROI, which means you’ll need to wager between $2M and $5M. In order to bet $3M/year you need to make

- 300 bets of $10k apiece, or

- 600 bets of $5k apiece, or

- 3000 bets of $1k apiece

There’s just no way you’re going to find 3000 +EV bets per year. That’s nearly 10 per day. So making it up with a high volume of small bets is not going to work. Even if you gain a big edge in a small market like esports, 300 bets of $100 apiece at 10% ROI would earn you $3,000. Yeah it’s decent, but the time and effort required to achieve this would probably be better spent working a plain old job with much less risk. Sports betting is a hobby for most people and should be treated as such by 99% of them.
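The volume arithmetic above is easy to rerun for your own targets. Here is a minimal sketch; the function names `required_volume` and `bets_needed` are made up for illustration.

```python
import math


def required_volume(target_profit, roi):
    """Total amount you must wager per year to earn target_profit at a given ROI."""
    return target_profit / roi


def bets_needed(volume, stake):
    """Number of equal-sized bets needed to reach a yearly wagering volume."""
    return int(math.ceil(volume / float(stake)))


# Yearly volume needed for $100k at different ROIs
for roi in (0.10, 0.05, 0.02):
    print(roi, round(required_volume(100000, roi)))

# Number of bets needed to wager $3M at different stake sizes
for stake in (10000, 5000, 1000):
    print(stake, bets_needed(3000000, stake))
```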

Now you understand why there are so few people making a living betting sports. In order to make it full time you’ll need to bet a large amount(4 figures) on hundreds of games per year **with a positive expectation**. Obviously the amount you bet will not matter if you have no edge. Increasing your wager size will only increase the amount of money you will lose per bet if you have no edge. This is very important and obvious, but worth repeating. **Knowing you have a winning strategy is your #1 priority before risking ANY amount of money wagering.**

There are very few people who have the bankroll, risk tolerance and skills to make a living betting sports. If they do, they’ll probably make more with less risk as a consultant or tout. Don’t expect to do this full time anytime soon. In order to make $100k/year you are going to have to risk something like 5% of your expected yearly earnings per bet. That’s incredibly risky. If you do manage to find a big edge in a sport, it’s unlikely you’ll be able to bet very large because the limits will be low. For the major sports, the limits are higher but potential edges are lower. For a sport like the NFL it’s easy to bet $10k+ per game before close, but there are only 256 games per year. There are thousands of college basketball games per year, but limits are much lower. If your goal is to maximize $ earned, I would recommend focusing on a sport that balances the amount of $ you can bet with the potential amount of edge you can gain. More realistically though, your goal should be to maximize your enjoyment and minimize your losses with your likely losing hobby, so I’d say focus on a sport you like and forget about everything else!

So say we’re shooting for a 3% ROI. A realistic goal. **What kind of winrate do we need to have a 3% ROI?** Using the ROI Calculator and typing in 1.9091 for odds (-110 is typical for a 50/50 wager) and 3% ROI and clicking calculate we get that we need to win about 54% of our bets. So we need to build a system that can go 54-46 on average for every 100 bets.
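If you would rather not rely on a web calculator, the same number falls out of a couple of lines of Python. This is a sketch using the usual definition ROI = winrate * decimal odds - 1; the function names are assumptions for illustration.

```python
def american_to_decimal(american):
    """Convert American odds (e.g. -110 or +170) to decimal odds."""
    if american > 0:
        return 1 + american / 100.0
    return 1 + 100.0 / abs(american)


def required_winrate(decimal_odds, roi):
    """Win rate needed to hit a target ROI at the given decimal odds."""
    return (1 + roi) / decimal_odds


odds = american_to_decimal(-110)
print(round(odds, 4))                          # decimal odds for -110
print(round(required_winrate(odds, 0.00), 4))  # break-even win rate
print(round(required_winrate(odds, 0.03), 4))  # win rate for a 3% ROI
```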

Let’s suppose we download a database of our sport of choosing, get all the historical closing lines and try to find an angle. Let’s say after much digging we find that on night games where there’s a full moon the home team covers the spread 40 times and loses to the spread 30 times. Great! Winning strategy, let’s bet it right?

Hold on a second. **First, the angle has to make sense.** You should be able to explain why the full moon favors the home team. Remember if you watch the roulette wheel long enough in the casino you may start to think you’re seeing winning patterns there too. **Second, you need to be sure the results are significant.** Is going 40-30 good enough to bet? Try typing it into the T-test calculator. This will perform a one sided t-test to see if the results are >52.38% (enough to cover the vig on a typical -110 odds bet). You will find that is less than one standard deviation away from the mean and not at all significant. What about a tout saying they are 12-5 or 100-70? Which is more significant? Type them in and find out for yourself
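A rough version of the significance check can also be done by hand. Note that this sketch swaps in a one-sided z-test (normal approximation) rather than the t-test the calculator uses, so the p-values will differ slightly; `one_sided_z_test` is a made-up name.

```python
import math


def one_sided_z_test(wins, losses, null_rate=0.5238):
    """Test whether the true win rate exceeds null_rate (normal approximation).

    Returns (z, p_value). A small p_value suggests the record is unlikely
    to be explained by luck alone.
    """
    n = wins + losses
    observed = wins / float(n)
    std_err = math.sqrt(null_rate * (1 - null_rate) / n)
    z = (observed - null_rate) / std_err
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability
    return z, p_value


# Compare the records mentioned above against the -110 break-even rate
for record in ((40, 30), (12, 5), (100, 70)):
    z, p = one_sided_z_test(*record)
    print(record, round(z, 2), round(p, 3))
```

The `z` value is the number of standard deviations the record sits above break-even, which is the figure referred to above for the 40-30 angle.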

A more realistic example: say we find that large home underdog moneylines in MLB are underpriced.

**Does this angle make sense?** This makes intuitive sense because the public likes betting favorites so the bookies may shade their line towards underdogs. So say for example we think that betting home dogs of +170 are +EV.

**What kind of winrate do we need?** Using the Odds Converter and typing in +170 in the US Odds and clicking calculate we find that we only need to win 37% of the time or more to make a profit on these wagers.

**Are our findings significant?** In the past season let’s say we see that they actually went 127-173. Using the t-test calculator once again and using 127 for wins, 173 for losses and a population mean of 37% we find that the results are a winrate of 42% with a p-value of 0.03 which may be significant.

**What’s our expected ROI?** Using the odds converter we find that +170 equates to 2.7 decimal odds. Using the ROI calculator with 2.7 decimal odds, 42% winrate and solving for ROI we get an expected ROI of 13.4%.
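The break-even and ROI steps for this example reduce to two one-line formulas. Here is a sketch, with hypothetical function names, that reproduces the numbers above.

```python
def american_to_decimal(american):
    """Convert American odds (e.g. +170 or -110) to decimal odds."""
    if american > 0:
        return 1 + american / 100.0
    return 1 + 100.0 / abs(american)


def breakeven_winrate(decimal_odds):
    """Win rate at which a bet at these odds has zero expected profit."""
    return 1.0 / decimal_odds


def expected_roi(winrate, decimal_odds):
    """Expected profit per unit staked."""
    return winrate * decimal_odds - 1


odds = american_to_decimal(170)
print(round(odds, 2))                      # 2.7
print(round(breakeven_winrate(odds), 3))   # 0.37
print(round(expected_roi(0.42, odds), 3))  # 0.134
```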

**How much should we bet?** In general I’d say set aside a sports betting bankroll you are willing to lose and bet 1-2% of it per bet. Expected ROIs are just guesses. Just because an angle did well in the past does not mean it will do well in the future, or even that its ROI will not go down significantly. If you’d like to use the Kelly criterion calculator, I think 1/4 Kelly is somewhat reasonable. So for our example, the Kelly multiple is 0.25, odds are 2.7, winrate is 42% and we get a 1.97% recommended betsize.
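The Kelly number is simple enough to compute directly. This sketch uses the standard Kelly formula f = (b*p - q) / b, where b is the net decimal odds, p the win rate and q = 1 - p; the function name and the `multiple` parameter are assumptions for illustration.

```python
def kelly_fraction(decimal_odds, winrate, multiple=1.0):
    """Fraction of bankroll to stake under (fractional) Kelly."""
    b = decimal_odds - 1.0  # net profit per unit staked on a win
    full_kelly = (b * winrate - (1.0 - winrate)) / b
    # Never recommend a bet when the edge is negative
    return max(0.0, multiple * full_kelly)


print(round(kelly_fraction(2.7, 0.42), 4))                 # full Kelly
print(round(kelly_fraction(2.7, 0.42, multiple=0.25), 4))  # quarter Kelly
```

Because the recommended stake scales with the assumed edge, an overestimated winrate leads directly to overbetting, which is one reason to prefer a fractional multiple.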

Finally, let’s say after the first month of the season we’ve gone 18-42 getting an average of 2.7 odds on our bets. This is pretty bad, but **is it bad enough to quit?** Try out the Confidence Interval Calculator. We expected to win 42% and need to win 37% to profit, remember? 18-42 is a 30% winrate and the 95% confidence interval is 18.2% – 41.8%. So there’s still a decent chance we have a winning bet strategy, but it’s unlikely we are winning at the 42% rate we expected. It may be time to quit or re-evaluate, or at bare minimum lower your betsize.
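A quick approximation of that interval is sketched below. It assumes a simple normal-approximation (Wald) interval, which comes out slightly narrower than the calculator’s figures; `winrate_confidence_interval` is a made-up name.

```python
import math


def winrate_confidence_interval(wins, losses, z=1.96):
    """Approximate 95% confidence interval for the true win rate."""
    n = wins + losses
    rate = wins / float(n)
    margin = z * math.sqrt(rate * (1 - rate) / n)
    return rate - margin, rate + margin


low, high = winrate_confidence_interval(18, 42)
print(round(low, 3), round(high, 3))
# Compare the interval against the break-even rate and the expected rate
print(low <= 0.37 <= high, low <= 0.42 <= high)
```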

One last note… **Always line shop at multiple books and always bet the larger number.** For American odds +111 is better than +110, +101 is better than -101, and -101 is better than -103. For decimal odds 2.101 is better than 2.09, and 1.9 is better than 1.87. If you shop long enough you may even find two lines that when combined have negative vig on the Vig Free Calculator. A few tips for when/if you do.

1. Make sure the book dealing the bad line will not scam you

2. Always bet the bad line before you hedge with the widely available line

3. Locking in the profit ie Hedging is -EV. In general you’ll make more just betting the rogue line but if you bet more than you’re comfortable with letting ride it’s a smart play to hedge
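To check a pair of lines yourself, add up the implied probabilities of the two sides. The sketch below uses made-up odds (2.10 at one book, 2.05 on the other side elsewhere) and hypothetical function names.

```python
def implied_probability(decimal_odds):
    """Probability implied by decimal odds, vig included."""
    return 1.0 / decimal_odds


def combined_vig(odds_a, odds_b):
    """Implied probabilities of both sides minus 1; negative means negative vig."""
    return implied_probability(odds_a) + implied_probability(odds_b) - 1.0


def hedge_stake(rogue_stake, rogue_odds, hedge_odds):
    """Stake on the other side that pays out the same as the rogue-line bet."""
    return rogue_stake * rogue_odds / hedge_odds


# Hypothetical lines: 2.10 at the rogue book, 2.05 on the other side elsewhere
vig = combined_vig(2.10, 2.05)
stake_b = hedge_stake(100.0, 2.10, 2.05)
locked_profit = 100.0 * 2.10 - (100.0 + stake_b)
print(round(vig, 4), round(stake_b, 2), round(locked_profit, 2))
```

With a full hedge the payout is the same whichever side wins, so the profit is locked in but smaller than the expected value of simply betting the rogue line.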

Maybe. It depends where you live. A couple of countries and states have explicitly outlawed it. I’d recommend talking to a lawyer or doing your own research if you want more clarity, but the article “Is sports betting legal in the United States” may provide some basic background. The one sentence summary is that no actual bettors have ever been prosecuted, but plenty of bookies and payment processors have. It is very similar to poker in that regard.

Make winning bets on sports obv. It’s a little more complicated than that though. Just like if you wanted to become a professional poker player you would be essentially saying *“I think I can play in poker games where I make better betting decisions than the majority of my opponents”* If you wanted to become a professional sports bettor you would essentially be saying *“I think I can find and bet sports book lines that are improperly priced”* Your job is to sniff out market inefficiencies and exploit them.

So now that we’ve covered the legalities and job description, let’s examine some people who make a living doing this. First off, who have been some of the best?

The story of Billy Walters is the story of how a lot of great fortunes have been made. He was in the right place at the right time and met the right people. He was pretty much a degenerate gambler who moved to Vegas at the right time and was hired by members of the infamous Computer Group of the 1980s to place bets for them. The Computer Group was one of the first groups to use computers to handicap sports and bet on inefficient lines. The group broke up around 1985, and Billy went out on his own and hired his own handicappers after that. You can see him on 60 Minutes below.

Bob was born in 1975. He was well aware of his predecessors like Billy Walters, but he started off making a living being a purely subjective bettor. He started betting NBA games in the late 1990s and made a fortune betting halftime totals back when bookies were dumb enough to set the halftime game totals at exactly 1/2 the game total (teams score more points in the second half down the stretch). In 1999 and 2000 he made huge all in bets for the Lakers to win the championship with his $80,000 bankroll. After that he had $1M to his name and went about hiring an ex-Wall Street quant to help him simulate and handicap NBA games. He has been quoted as having spent in excess of $3 million on his statistical NBA handicapping model.

Excellent interview write-up

2+2 Well

Blog

Twitter

His house and girlfriend at the time

In 2009-2010 he quit betting for awhile – nine months or so to pursue a job with an NBA team. This was his pretty hilarious response to someone who says they’ll never hire a gambler like him.

You couldn’t be more wrong. If I wanted to work for a team right now I could be working for a team. The problem isn’t finding a job, the problem is these jobs don’t pay anything.

GM salaries aren’t normally reported, but I know two assistant GMs who are making less than I pay 2 guys on my team, and I’d bet almost all but one or two GMs make less than my quant does.

A conference at which he was a featured guest. He’s on the far left.

There’s plenty of other less successful bettors out there. If you really wanted to make a career out of this most likely you will end up broke or like one of these guys. Betting sports for a living is incredibly difficult to do.

Making a living as a bookie is way, way easier and they make way, way more money than the bettors. A sports bettor has to pick and choose his spots very carefully and work very hard to find a 5% advantage. A bookie already has a 2-7% advantage on 99% of the bets they offer due to the vigorish. How to be a successful bookie goes something like

1. Copy and paste lines off Pinnacle Sports

2. Pad the vig a bit to increase your bottom line

3. Optional: One minute before the line closes hedge any lopsided action at a larger book to decrease your risk

4. Profit

The problem of course is it’s probably illegal and you run the risk of going to jail.

Please don’t pay for picks. I’d say 99% are scams, but that’s probably an underestimate. Good handicappers don’t sell their picks. If you want to see some of these scammers at work, check out CNBC’s show Money Talks. The tout takes 50% of his clients’ winnings, which makes it impossible for them to be long term winners. His clients need to win over 69% of their bets to break even. Even the best sports bettors in the world only win around 60%. He is making a good living off scamming – I mean sports betting though.
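The break-even figure for the tout’s clients can be reproduced with a short calculation. This is a sketch that assumes standard -110 bets and a fee charged only on winning bets; the function name is made up.

```python
def breakeven_winrate_with_fee(decimal_odds, fee_on_winnings):
    """Break-even win rate when a share of each win's profit goes to a tout."""
    net_win = (decimal_odds - 1) * (1 - fee_on_winnings)  # profit kept per unit staked
    # Break even when winrate * net_win equals (1 - winrate) * 1 unit lost
    return 1.0 / (1.0 + net_win)


odds = 1 + 100.0 / 110  # decimal odds for -110
print(round(breakeven_winrate_with_fee(odds, 0.0), 4))  # no fee
print(round(breakeven_winrate_with_fee(odds, 0.5), 4))  # tout keeps 50% of winnings
```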

The one reputable sports tout I will mention is Right Angle Sports or RAS. They are proven winners who will charge something like $1,700 for a season’s worth of picks. Sounds great, right? The problem is explained on their Beating Line Movement page: sports books subscribe to their picks too. Lines move within 5-30 seconds of when the pick is released. The only way you are going to make money using their picks is if you can bet a lot of money faster than anyone else who is also receiving their picks. I would not recommend trying.

There are plenty of handicappers who post their picks for free on twitter or public forums after they make them. If you find a winning one you’re welcome to tail them, but you’ll never get as good of a price on lines as them because the original handicapper’s bet will often move the line so even if he’s a proven winner his followers will have a lower ROI.

Winning at sports betting essentially boils down to

1. Find an inefficient betting line

2. Bet on the +EV side

3. Collect your winnings

Sounds simple right? Well there’s a lot more to it than that. In order to find an inefficient line you are going to need to know what the “correct” line is. In general, the smaller the book, the more likely it is that they

1. Have an inefficient line

2. Will scam you and not pay you your winnings

You will need to weigh the risks of these when betting with any small book. The ultimate tiny book is a “friendly” wager you might make with a buddy of yours. Maybe you ask your unwitting friend who knows nothing about sports betting if he wants to bet on the underdog at even money tonight? A step up from this might be your local bookie you met at the local casino or bar or whatever who will take your bets and doesn’t have good lines or updates them infrequently. You compare the lines he gives you to Pinnacle’s and bet when you find one that’s off by enough. A step up from that might be depositing on and trying to exploit one of the thousands of tiny illegal sports books online that don’t update their lines as quick as they should. A step up from that is trying to beat lines put out by the big boys like Pinnacle or Vegas Casinos. This is an order of magnitude more difficult, but the rewards are much greater and the risk much lower. There’s almost no chance of you being scammed out of your winnings and your max wager allowed will be much higher than any other books.

So if your plan is to try and beat the lines from one of the major books you’re going to need to start coming up with lines yourself. This is known as Handicapping and the biggest books are VERY good at it. Even when they make mistakes and put out bad opening lines, they are constantly adjusting the lines based on how much is bet on either side. An opening line will typically be different than a closing line. Even if you’re better at handicapping than the sportsbook, you need to figure out WHEN to place your bet. Depending on what data you are using to handicap, some models may be best suited for betting on opening lines and others may be best suited for betting closing lines when the game is about to start.

There are many different ways to gain an edge vs the book. I’ll start with the least savory.

Unless you’re Marsellus Wallace or actually play a sport at a high level this option is not open to you, and it’s obviously frowned upon (illegal?), but it’s important to be aware that it does happen. It’s also the main reason bookmaking is largely illegal in the US. Major pro leagues like the NFL, MLB and NHL are terrified that if sports betting were legal it would ruin the integrity of the game and games would be fixed all the time. However, some people are changing their minds, like the NBA’s ex-commissioner David Stern, who says it’s time for legalized sports gambling.

The most infamous case of game fixing is probably the Black Sox Scandal of the 1919 World Series

Tim Donaghy was a high profile case of an NBA referee who bet on games and changed the spread on games by allegedly miscalling the end of games he officiated

As recently as today two lower ranked tennis players were suspended for match fixing

Insider knowledge scares the crap out of sports books. It’s hard to obtain and if you have it there’s a decent chance you’re not allowed to bet on the games or don’t care about betting on games. I’m going to assume you’re not dating Tom Brady so we’ll largely skip over this edge.

The main advantage a bettor has over the book is that they don’t have to set lines for every game. They can pick and choose which games to focus on. I would recommend specializing in one particular niche where the competition is small. The downside to this is the max wager limits will be smaller, but there is more opportunity to find inefficient lines. The reason is that the number of people handicapping, betting on and watching major league sports like the NFL is orders of magnitude larger than for a less popular sport like golf.

This goes along with specializing. It’s going to be nearly impossible for you to gather more data than the sports books or other handicappers for a major game like the Super Bowl, but for a small game hardly anyone is paying attention to, like Mississippi Valley State vs Alabama State (both members of the SWAC), you may have a chance.

You may also find and use some data that no one else would think of using to handicap. Some hedge funds trade stocks based on your tweets. New data is always being released. If you can find a source of data that predicts the outcomes of games that no one else is using, you may have found yourself a nice edge.

There’s plenty of data out there, but people draw different conclusions based on the same data. Maybe you see something in the data that no one else does that will help you predict the outcome of a game. Maybe you write a better machine learning algorithm than anyone else. Maybe you adjust your model better than anyone else based on a major change in the game such as a lockout or a rule change. Week one of any sport is going to be difficult to predict because it’s a brand new team with different players and potentially different rules. There’s a big difference between predicting what kind of stats 30 year old veteran LeBron James will have in his 50th game of the season and a college freshman’s first collegiate game ever. A college freshman doesn’t have much to go by in terms of career stats and his play is likely going to improve drastically as time goes on. Maybe you figure out how to handicap young players better than anyone else.

It takes time for information to spread. If you can be one of the first to find out about something and bet based on that data before anyone else does, you can gain an edge. Last season during the opening drive of a preseason game Sam Bradford tore his ACL. If you were at the game and saw it happen, or you were the team doctor (insider) who x-rayed him, you could use that information to bet against the Rams in the upcoming games.

Sometimes a player’s previous game data is a poor indicator of a player’s expected performance. After Brett Favre’s father died on a Sunday night, he completely shredded the Raiders in a Monday night game. Look at the Packers’ 2003 game by game passing stats. Favre had by far his best game of the season and clearly had extra motivation to play well that day. Betting Green Bay to cover the spread was probably a good bet.

Take a look at what Steve Smith did when he played his former team the Carolina Panthers last season. He had two touchdowns and 139 yards. By far his best game of the season. Betting Baltimore to cover was probably a good bet.

In my next post I’ll go into the math and statistics involved with sports betting.

]]>I’m certainly not advocating anyone bet on the games, but if you are going to Nate Silver has proven to be a fairly good handicapper. He thinks Villanova is pretty undervalued by the bettors in general especially their final 4 odds and Arizona is slightly undervalued to make it to the final 4.

Update: I took a look at the 1st round numbers too. The most interesting matchup is BYU vs Mississippi. The betting markets have Mississippi a 41% underdog, but Nate Silver’s model has them a 56% favorite.

March 19 update: Wrote some code to automate these updates. You can view it here: https://github.com/andr3w321/march-madness-betting and I’ll periodically update the data folder, which shows the bets.

March 25 update: Model has yielded pretty terrible results. I’m not convinced Nate Silver would beat the spread.

]]>https://webdocs.cs.ualberta.ca/~bowling/papers/15science.pdf

You can query the strategy of the GTO bot, named Cepheus, here http://poker.srv.ualberta.ca/strategy

But it’s slow and not really displayed in a way that’s easy for humans to understand and emulate. It’s not easy to figure out if it tends to check back certain boards or what its cbet % is, for example. I was curious if you could download the code and run it yourself and how much it would cost. The good news is you can download and run it yourself; the bad news is it’s going to cost you a lot to get an exact Cepheus replica ($500k by my estimate).

Looking at http://aws.amazon.com/ec2/pricing/

A large compute optimized on-demand instance with 32 2.6-GHz Intel Xeon cores, 60 GB of RAM, and 320-GB of local disk is probably the most similar node to the ones used by the Alberta researchers. They cost $1.68 per hour. To run 200 of them for 68.5 days would cost you 200*1.68*24*68.5=**$552,384**
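Here is that arithmetic as a quick Python sketch so you can plug in your own numbers. The function name `cluster_cost` is just something I made up for illustration, and it assumes a flat on-demand hourly rate with every node running 24 hours a day for the whole job.

```python
def cluster_cost(nodes, hourly_rate, days):
    """Total on-demand cost: every node billed for every hour of the run."""
    return nodes * hourly_rate * 24 * days

# Figures from the estimate above: 200 nodes at $1.68/hour for 68.5 days.
cost = cluster_cost(nodes=200, hourly_rate=1.68, days=68.5)
print(f"${cost:,.0f}")  # $552,384
```

Reserved or spot pricing would change the hourly rate, so treat this as an upper-end ballpark rather than a quote.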

Of course computing cost decreases over time, as shown by the graph below. In five years this computation cost should be one tenth of the current estimate, and we’ll have 10TB hard drives for $50 (the current cost of a 1TB hard drive http://www.amazon.com/Seagate-Desktop-3-5-Inch-Internal-ST1000DM003/dp/B005T3GRNW) and be able to download and save the whole strategy easily on our home computers.

Source: http://hblok.net/storage_data/storage_memory_prices-2013-02.png

There were a few interesting stats given in the paper that I don’t feel were talked about much by the media because it wasn’t explained all that well and put in normal “poker speak” terms. They referred to a “hand” as a “game” and measured edges in “milli-big-blinds/game” instead of a more typical “bb/100”.

The **maximum achievable winrate playing vs Cepheus** is listed in the paper as 0.986 milli-big-blinds per game (by game they mean hand). In poker terms this equates to (0.986 milli big blinds / hand) * (1 big blind / 1000 milli big blinds) * (100 hands) = 0.986 * 0.1 = **0.0986 bb/100 maximum winrate**. So for example if you were playing true GTO headsup $200/$400 limit hold’em vs Cepheus you could expect to win about $200 * 0.0986 = $19.72 per 100 hands, which is roughly per hour per table if you play about 100 hands an hour.

Some major troll nits were complaining that if it’s possible to have a positive winrate vs Cepheus then heads up limit hold’em isn’t solved. What I would say to these people is

a) 0.1 bb/100 achievable winrate is tiny and

b) if they really want to get this number lower they can do so by simply turning their algorithm back on and letting it run more iterations.

It lists the button’s winrate vs itself as “between 87.7 and 89.7 mbb/g for the dealer”. This means the **GTO edge for the button** vs the big blind is 88 * 0.1 = **8.8 bb/100**. So in other words, when someone hits and runs one hand with their button vs you, they’re stealing 0.088bb or $17.60 at $200/$400 in EV from you.
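If you want to convert the paper’s units yourself, here is a minimal Python sketch. The function names are my own, and the $200 big blind is simply the number used in the $200/$400 examples above.

```python
def mbb_per_game_to_bb_per_100(mbb_per_game):
    """Convert milli-big-blinds per game (hand) to big blinds per 100 hands."""
    return mbb_per_game / 1000 * 100

def to_dollars(big_blinds, big_blind_size):
    """Convert an amount in big blinds to dollars."""
    return big_blinds * big_blind_size

BIG_BLIND = 200  # dollars, as in the $200/$400 examples above

# Maximum winrate vs Cepheus: 0.986 mbb/g
max_winrate = mbb_per_game_to_bb_per_100(0.986)
print(round(max_winrate, 4), round(to_dollars(max_winrate, BIG_BLIND), 2))

# Button's edge: about 88 mbb/g (midpoint of the 87.7 to 89.7 range)
button_edge = mbb_per_game_to_bb_per_100(88)
print(round(button_edge, 1), round(to_dollars(button_edge / 100, BIG_BLIND), 2))
```

The first line should print the 0.0986 bb/100 and $19.72 per 100 hands figures, and the second the 8.8 bb/100 and $17.60 per hand figures.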

There are a number of ways you could create a much cheaper “good enough” GTO bot yourself. The way that they created Cepheus is by running this code http://webdocs.cs.ualberta.ca/~games/poker/software/CFR_plus.tar.bz2 on a 200 node cluster of computers for 68 days. It is described in the paper as

Our CFR+ implementation was executed on a cluster of 200 computation nodes each with 24 2.1-GHz AMD cores, 32GB of RAM, and a 1-TB local disk. We divided the game into 110,565 subgames (partitioned according to preflop betting, flop cards, and flop betting). The subgames were split among 199 worker nodes, with one parent node responsible for the initial portion of the game tree. The worker nodes performed their updates in parallel, passing values back to the parent node for it to perform its update, taking 61 minutes on average to complete one iteration. The computation was then run for 1579 iterations, taking 68.5 days, and using a total of 900 core-years of computation (43) and 10.9 TB of disk space, including filesystem overhead from the large number of files.

If you simply reduced the number of iterations run, you could create a not-quite-as-good bot for a fraction of the cost. See the figure below from the scientific paper. Since they ran their sim for 900 core-years, or 1579 iterations, they achieved a maximum exploitability of ~0.1 bb/100 (~$500k computation cost). Interpolating from this graph, in 90 core-years of computation you could create a bot with 1 bb/100 of exploitability (~$50k computation cost). After 27 core-years you could create a bot with 10 bb/100 of exploitability (~$17k computation cost). After 9 core-years you could create a bot with 30 bb/100 of exploitability (~$5k computation cost).
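The dollar figures above come from scaling the full-run cost down in proportion to core-years. Here is a rough Python sketch of that scaling. It assumes cost is linear in core-years, uses my $552,384 estimate for the full 900 core-year run, and takes the exploitability numbers as my own eyeball readings off the paper’s graph, not anything computed here.

```python
FULL_RUN_COST = 552_384      # dollars, my AWS estimate for the full run
FULL_RUN_CORE_YEARS = 900    # from the paper

def estimated_cost(core_years):
    """Scale the full-run cost linearly by the fraction of core-years used."""
    return FULL_RUN_COST * core_years / FULL_RUN_CORE_YEARS

# (core-years, approximate exploitability in bb/100 read off the graph)
scenarios = [(900, 0.1), (90, 1), (27, 10), (9, 30)]

for core_years, exploitability in scenarios:
    cost = estimated_cost(core_years)
    print(f"{core_years:>4} core-years  ~{exploitability} bb/100  ~${cost:,.0f}")
```

The outputs land close to the rounded ~$500k, ~$50k, ~$17k and ~$5k figures quoted above.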

You could further reduce computation cost by reducing the number of subgames. They solved for 110,565 subgames and their preflop strategy is very easy to view and download here http://poker.srv.ualberta.ca/preflop. You could hard code this in and reduce your computation cost drastically. Unfortunately at this time I haven’t worked out the math on how they arrive at 110,565 to calculate exactly what order of magnitude of computation this may save. If someone could help me out with that it would be greatly appreciated. There are only 169 different preflop hand combinations and 1755 different flops, each with 47 different turn cards and 46 different river cards.
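The 169 and 1755 counts can be checked by brute force. This is a small Python sketch, with names of my own choosing, that treats two sets of cards as the same if one can be turned into the other by relabeling suits. It counts hole cards and flops separately, so it does not explain the 110,565 subgame figure, which also depends on the betting.

```python
from itertools import combinations, permutations

RANKS = range(13)
SUITS = range(4)
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]
SUIT_PERMS = list(permutations(SUITS))  # all 24 ways to relabel the suits

def canonical(cards):
    """Smallest sorted form of the cards over every suit relabeling."""
    return min(
        tuple(sorted((rank, perm[suit]) for rank, suit in cards))
        for perm in SUIT_PERMS
    )

def count_classes(n_cards):
    """Number of distinct n-card sets once suit relabelings are merged."""
    return len({canonical(combo) for combo in combinations(DECK, n_cards)})

print(count_classes(2))  # 169 starting hands
print(count_classes(3))  # 1755 flops
```

It takes the minimum over all 24 suit permutations rather than doing anything clever, which is plenty fast for two and three card sets.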

Their algorithm could easily be adapted to other limit games. Headsup limit Omaha 8 or better should be considered solved as well at this point. If someone wants to give me $1 million I’ll prove it. Same goes for headsup 2-7 triple draw. The stud games have a much higher number of game states so they may not be cheaply solvable at this point. Razz on the other hand ignores suits and only has 13 unique cards so a “good enough” headsup GTO bot (one with say 1bb/100 exploitability) could probably be created for $50k or less in computation cost at this point. It’s all a matter of time before all the games are dead and solved.

Some people will point out that the game state size of a no limit game is huge and may never be solved in our lifetime.

http://poker.cs.ualberta.ca/publications/2013-techreport-nl-size.pdf

Sure, it’s true that we may not see a Headsup No Limit Hold’em bot with maximum exploitability of less than 0.1bb/100 in our lifetime. That does not mean a “good enough” no limit bot with artificial pot size bet constraints (can only bet pot, 1/2 pot, 1/4 pot for instance) with less than 1 bb/100 of maximum exploitability could not be created TODAY for a couple hundred thousand.

It’s just a race to see who can figure it out first. The nosebleed guys hire programmers to figure things out for them. They have the most money, resources and incentive to do so. It’s no secret at this point. I’m not optimistic about online poker’s future. Nobody plays online chess for money. Then again they do play online blackjack.

A more plain english summary of the paper is available here

http://spectrum.ieee.org/tech-talk/computing/software/computers-conquer-texas-holdem-poker-for-first-time

You can play against Cepheus here

http://poker-play.srv.ualberta.ca/

For the past couple of months I’ve been playing **bitcoin poker** almost every day, railing and analyzing the nosebleed games, watching strategy vids and discussing hands with friends. I haven’t done a great job of working out, but I’m really going to try to make that a priority over the next two months as it plays a big role in the marathon that is the WSOP. I plan on playing 25-30 events hopefully and have rented a house with a ton of poker players this time. There are eleven of us in a five bedroom house but not everyone’s there the whole time.

In other news I **bought a condo**! I don’t close for another twenty five days but I see no reason why the sale shouldn’t go through so I’m pretty excited. My wife and I ended up spending a bit more than we were initially planning, but it’s a nice place and I’m sure we’ll be happy there. It’s a 2 BR/1 BA in Belltown in a nice building with all the amenities: hot tub, gym, rooftop deck etc. The negotiations went well. The place was listed for $475k, we offered $450k, they countered at $465k. At this point our realtor urged us to take the deal. The place had been on the market less than a week and good places sell very quickly in Seattle. A couple of other condos we had been contemplating making offers on in the past went under contract a day later, before we made up our minds. I decided to ignore the realtor’s advice however and gamboool and offer $460k, which they ended up accepting. It was risky because they had an extra 24 hours to shop that offer around to other potential buyers to try and get a higher offer, but it ended up paying off so we’re happy.

But that’s enough boring life updates, I’m sure you’re just here for the poker strat! I downloaded **Pokersnowie** today as I’d heard about it and thought I’d check it out (no they didn’t pay me to write a review, although if they read this and would like to retroactively I will happily accept donations :)). I played about 200 hands of headsup NL vs it and apparently it didn’t think I played too badly:

Unfortunately I can’t say the same for it. In reality I’m by no means a world class HU NL player and I’m pretty confident the best ones would be able to beat this bot for a healthy winrate. Having said that it’s pretty fun to play and I think it could improve your game quite a lot – certainly well enough to be able to beat midstakes. Below are two hands it says I played badly and I think it played badly and don’t think it’s even close.

The hand went I minraised button, and the bot check/called every street. Evaluating the bot’s play, there’s no way that check/calling on the river is a better play than going all in. If you notice, it shows that the EV of calling is 31.93 bets and raising is only 11.93 bets. It’s really hard to make backdoor flushes, especially out of position. You might say yeah I mean he has the best hand 90% of the time but when he’s called he’s behind 70% of the time, but vs an actual human player there’s no way this is the case. Most humans are going to have a very hard time folding 2pair+ to the raise, and if they are folding 2pair+ to the raise every time well… it’s going to be pretty easy exploiting them by just check raising them every single time we get to the river with a hand that doesn’t beat AT, since the in position guy will have a strong Ace thru top set much more often than he will have a flush.

Evaluating my play it says my bet with the AT is a mistake. I disagree. Most human players are going to be check raising their 2pair+ sometime before the river, so I basically lose to A9,J2s,J3s(24 hands), A2,A4,A5,A6,A7,A8,24,25,26,27,28,29,2T,2Q,2K of clubs though some of these will be 3bet preflop sometimes(15 hands) and the occasional random backdoor flush or slowplay. I beat A4,A5,A6,A7,A8 both offsuit and suited non clubs(75 hands). I chop with AT and most people will 3bet AJ-AK. Most people fold J2o, J3o preflop. That’s 39 hands I’m beat by and 75 hands I beat. Even if I add in all A2 and A3 combos that only adds about 30 hand combos making it still a value bet. The only way this is not a value bet is if the oop guy is never check raising this board on the flop or turn. This may be optimal, but in practice most humans I think will have a flop or turn check raising range.
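To make the combo counting concrete, here is a tiny Python sketch using the rule of thumb that a river bet is for value if it wins more than half the time when called. The function name is mine, and it ignores chops and the times I get raised off my hand, so it’s only a rough check on the numbers above.

```python
def share_ahead_when_called(combos_beaten, combos_losing_to):
    """Fraction of the calling range that the bettor's hand beats."""
    return combos_beaten / (combos_beaten + combos_losing_to)

# My estimate above: I beat 75 combos and lose to 39.
base = share_ahead_when_called(75, 39)

# Pessimistic case: add roughly 30 more A2/A3 combos that beat me.
pessimistic = share_ahead_when_called(75, 39 + 30)

print(f"{base:.1%}")         # 65.8%
print(f"{pessimistic:.1%}")  # 52.1%
```

Both come out above 50%, which is why I still call it a value bet even with the extra combos thrown in.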

The second hand I disagree with pokersnowie on is a paired board where again it went I minraise button and the bot check/calls three streets with a very strong hand instead of check raising the river.

This is such a strong hand and a very easy river check raise. I think worse hands that bet/call in my spot are any 7, so Q7,T7,97,87,76,74 (96 hands). Better hands that bet/call are all boats, so K7,J7,75,KK,JJ,55 (66 hands). 55 probably checks the flop or turn and A7 chops obviously. Interestingly, Q7 is actually a losing river check raise. It beats 80 hands vs 98 hands it loses to. Of course this is all based on these bet/call ranges, which may or may not be correct, but for humans I think they’re pretty close. You could add in 73s,72s,73o for button opening ranges and some people may hero call AK or KJ, which would probably push Q7 to a check raise, but that’s very villain and read dependent.

Evaluating my play it thinks the K8 river bet is a mistake. I’m going to be barreling most gutshots AQ,AT,QT,T9,89 and plenty of diamond flush draws on the turn and bluffing many of these on the river. I need to be value betting thinly to balance for this. I’m not going to go through all the exact combos for this one, but very similar to the previous hand if the oop player has any sort of flop or turn check raising range which 90% of players do this is a very easy value bet.

My other big complaint is the level of accuracy it gives. It should be giving a range of EV where it’s 90% confident or something, not exact EV down to the decimal point for each bet. When reviewing my “errors” it would bring up all these spots where calling might be -0.03 bets and folding was 0 bets and say I made a mistake by calling. There’s no way it can provably say, for example, that I am losing money by calling a 3x raise with 96o from the bb like it believes. Also when I made a bet size it didn’t like, it would tell me I was making an error by betting 2/3 instead of 1/2 pot in a single raised pot. There’s very little chance that is a provable mistake. The errors should be organized from largest to smallest, and ones that are statistically insignificant or too small to be proven shouldn’t come up as errors IMO.

I’m sure they will get there eventually, but for now I actually feel pretty good about man’s chances vs the machines at heads up NL.

And some funny vines for you

http://www.vineroulette.com/v/This-is-what-really-happened-MMrvlTd53TD

http://www.vineroulette.com/v/Escalator-Moonin-Fly-Blakie-Blake-MMt26ddzgbK

]]>2. **Wake up and go to bed at regular times.** I’ve been self employed for over six years now and waking up early and regularly is something I’ve struggled with as long as I can remember. Waking up late prevents me from having a very productive day and achieving my goals quicker than I otherwise would if I woke up early. Over the years I’ve learned a couple tricks to help me like practicing waking up, going to bed early the night before, and most of all doing it regularly since it takes about a week for your sleep schedule to adjust. Hopefully I can change that in 2014.

3. **Read less 2+2/reddit.** I’ve grown into a schedule of browsing the internet for an hour or two when I first wake up and right before I go to bed. Sometimes I read and learn interesting things, but usually I just end up reading random semi entertaining threads like “If a player has a prosthetic arm that he used for holding onto the football and the arm got detached during the play, would that be considered a fumble or would the prosthetic limb still have possession for the player?” In the morning it stops me from being productive, and in the evening it tends to keep me up later than I want to stay up and prevents me from being productive the next day. I don’t think I’ll ever kick the habit for good, but I’d like to decrease the amount of time I spend randomly browsing to focus on more productive or entertaining things.

4. **Read more books.** I’m somewhat embarrassed to say I only read 1.5 books last year. This is the least number I’ve read in a year since I was probably 12. I read American Sniper: The Autobiography of the Most Lethal Sniper in U.S. Military History and half of What Every BODY is Saying: An Ex-FBI Agent’s Guide to Speed-Reading People. Reading novels has been shown to boost brain function for days. It increases your vocabulary and provides much deeper explanation on topics that you just can’t get from tweets and short internet articles, which make up the majority of young people’s reading sources these days. Plus at 1.5 books/year I’m getting dangerously close to becoming like Kanye West:

5. **Exercise more.** Pretty standard resolution here. Three times a week is what I’m shooting for. Maybe I will finally run that marathon this year I’ve been wanting to for awhile now, but we’ll see.

6. **Blog at least once a month.** I started tweeting a bit in 2013 and it kind of replaced my urge to blog, but there’s just some things you can’t get out in 140 characters. I don’t think 12 blogs per year is asking too much of my time.

7. **Watch less TV.** Spend more time reading books and working out instead.

8. **Grow my businesses. Buy a house.** These are pretty vague goals which are generally bad for goal making, but I’ll take these more on a month to month basis. In January I’d like to focus on BitcoinRichList.com and then on poker for Australia and the Aussie millions. In February I’d like to focus more on ATMs and condo buying.

Happy new year and best of luck to you all in 2014!

]]>