
Writer, analyst, podcaster, Spurs fan. Three out of four is not bad. If there is a data angle, I will find it.
Can AI Beat The Bookmakers?
Premier League betting is a global phenomenon, with punters everywhere testing their knowledge each week and taking on football betting sites. Despite the unpredictability, many rely on their insights, but what if AI could enhance those predictions? At OLBG, we're exploring how two predictive models could forecast Premier League match outcomes and scores.
With those predictions, we are placing a £1 stake on both the outcome and the correct score of each of the 380 Premier League matches.
That is 760 £1 bets per model across the two markets and, with two models, an eventual total outlay of £1,520. Can we end the season with a profit to show for it?
The Super Models
Doing their best to get the better of the bookmakers are two AI models.
In the red corner is the offering from OpenAI's ChatGPT.
In the blue corner is the offering from X's Grok.
Each has been asked the same question - can you predict the upcoming set of Premier League fixtures?
To make things slightly different, the level of input given to each model differs, which lets us see which of the two performs better.
With the season well underway, we can look at how successful the two models have been to date.
Results In Business
At the time of writing, 309 Premier League matches have taken place in the 2024/25 season. If we look purely at how the two models have performed in predicting results, their scorecard is as follows:
Model | Matches | Correct | Incorrect | Correct % | Incorrect % |
---|---|---|---|---|---|
ChatGPT | 309 | 152 | 157 | 49.19% | 50.81% |
Grok | 309 | 143 | 166 | 46.28% | 53.72% |
With 309 games in the record books, we can already see a difference between the two models. It is ChatGPT that has come out on top with 152 correct outcomes thus far - 9 more than its Grok rival on 143.
This means that ChatGPT is 49.19% correct in terms of outcomes and Grok sits at 46.28%. The consensus is that, because of the odds involved, you need around 55% of all Premier League predictions to be correct across the season to be profitable in flat stakes betting, so it is fair to say that the models need to improve if they are not to be stuck in a relegation battle.
Then again, strike rate is only half the story, as both models also need to turn a profit. It is no good if a model simply predicts Manchester City and Liverpool to win each week. These two AI-based boffins need to get the closer games correct to have any chance of turning this into a profitable venture.
This is where the second table comes into play, showing what £309 staked on each model would have returned at this stage of the season:
Model | Stake (£) | Return (£) | P/L (£) | ROI % |
---|---|---|---|---|
ChatGPT | 309 | 277.85 | -31.15 | -10.08% |
Grok | 309 | 280.03 | -28.97 | -9.38% |
In terms of ChatGPT's offering, an outlay of £309 has returned £277.85 - a loss of £31.15, giving us a negative ROI of 10.08%. By comparison, the £309 spent on Grok has returned slightly more at £280.03 - a loss of £28.97, giving us a negative ROI of 9.38%.
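For anyone wanting to check the sums, the profit/loss and ROI figures can be reproduced in a few lines of Python. This is only a minimal sketch: the function name `flat_stake_summary` is our own invention for illustration, and the inputs are simply the stake and return figures quoted above.

```python
def flat_stake_summary(stake, returned):
    """Return (profit_or_loss, roi_percent) for a total stake and total return."""
    profit = returned - stake          # negative means a loss
    roi_pct = profit / stake * 100     # ROI is measured against the money staked
    return round(profit, 2), round(roi_pct, 2)

# Outcome-market figures quoted above: 309 bets at £1 each per model.
for model, returned in {"ChatGPT": 277.85, "Grok": 280.03}.items():
    profit, roi = flat_stake_summary(309, returned)
    print(f"{model}: P/L {profit:+.2f}, ROI {roi:+.2f}%")
```

Because every bet is a flat £1, the total stake is just the number of matches played, which keeps the comparison between the two models like for like.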
With more than 80% of the season played, neither model is showing a profit, despite both having done so earlier in the campaign. But what happens if we spend the same outlay on correct score betting?
What's The Score
Here the same rules apply, with the same flat stake of £1 for each game. This time we are not betting on the outcome that each model has provided but on the correct score that it has served up.
The most important thing to remember here is that we do not need the same strike rate as for the outcome predictions, because of the longer odds on offer for correct score bets.
Therefore, if we look at the data that comes from correct score bets, it is as follows:
Model | Stake (£) | Correct Score Return (£) | P/L (£) | ROI % | Correct Scores | Correct Score % |
---|---|---|---|---|---|---|
ChatGPT | 309 | 268.00 | -41.00 | -13.27% | 33 | 10.68% |
Grok | 309 | 294.50 | -14.50 | -4.69% | 30 | 9.71% |
Correct scores could be where the money is, but not yet for ChatGPT. Even with a strike rate of 10.68%, or roughly one in nine, it is coming up short in terms of profitability.
The models have currently made 33 and 30 successful predictions respectively (not always on the same matches) and the latest results are:
ChatGPT: £41.00 Loss, -13.27% ROI
Grok: £14.50 Loss, -4.69% ROI
As you can see, this is where Grok starts to show its own talents. It may be lagging on outcomes, but it has found its niche in landing correct scores at considerably longer odds. That is even more impressive when you consider that its correct score strike rate currently sits at 9.71%, or just under 1 in 10.
A number of long shots leave Grok just £14.50 short of breakeven - if it can get its strike rate to 10% and above, then profitability may soon be found.
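How many hits would each model actually need? A rough way to estimate it is to take the average decimal odds of the winning bets so far (total return divided by number of winners, since every stake is £1) and work out the strike rate at which returns would equal stakes. The sketch below does exactly that. It assumes, and this is very much an assumption, that future winners would pay out at the same average price as those already landed; the function name `break_even` is ours, for illustration only.

```python
import math

def break_even(bets, winners, total_return):
    """Estimate average winning odds, break-even strike rate (%) and winners needed.

    Assumes £1 flat stakes and that all winners pay the same average decimal odds.
    """
    avg_odds = total_return / winners      # average decimal odds of the winning bets
    strike_rate = 100 / avg_odds           # hit rate (%) at which return equals stake
    needed = math.ceil(bets / avg_odds)    # whole winners required to at least break even
    return round(avg_odds, 2), round(strike_rate, 2), needed

# Correct score figures from the table above: (winners, total return in £).
for model, (winners, total_return) in {"ChatGPT": (33, 268.00), "Grok": (30, 294.50)}.items():
    odds, rate, needed = break_even(309, winners, total_return)
    print(f"{model}: avg winning odds {odds}, break-even {rate}%, needs {needed} winners")
```

On these numbers Grok's winners have paid out at longer average odds than ChatGPT's, which is why it sits closer to breakeven despite landing fewer correct scores.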
Overall Results
To wrap up the analysis, we need to look at the total spend of both models for outcomes and correct score bets. When combining the data together, are the two models turning a profit or not?
Model | Stake (£) | Combined Return (£) | P/L (£) | ROI % |
---|---|---|---|---|
ChatGPT | 618 | 545.85 | -72.15 | -11.67% |
Grok | 618 | 574.53 | -43.47 | -7.03% |
Grand Total | 1236 | 1120.38 | -115.62 | -9.35% |
When we combine the total spends of each of the two models, we can see the following:
ChatGPT's model has spent £618 and returned £545.85 - a negative ROI of 11.67% (£72.15 down)
Grok's model has spent £618 and returned £574.53 - a negative ROI of 7.03% (£43.47 down)
And if we also account for overall total spend across the two models, the result is as follows:
Total Spend £1,236, return £1,120.38 - a negative ROI of 9.35% (£115.62 down)
Both models are currently in the red, which means a total loss of 9.35% at present. The two individual ROI figures cannot simply be added together, as each is measured against its own £618 stake. That leaves us just under one tenth down on the total outlay, a trend that needs to be rectified in the business end of the season.
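A quick note on how a combined ROI is worked out: it should come from the pooled stake and pooled return, not from adding the two models' percentages together. The short sketch below, using the stake and return figures from the table and a helper name (`roi_pct`) of our own choosing, shows the difference between the two approaches.

```python
# (stake, return) per model in £, outcome and correct score bets combined.
totals = {"ChatGPT": (618, 545.85), "Grok": (618, 574.53)}

def roi_pct(stake, returned):
    """ROI as a percentage of the amount staked."""
    return (returned - stake) / stake * 100

# Pool the money first, then compute a single ROI on the whole outlay.
pooled_stake = sum(stake for stake, _ in totals.values())
pooled_return = sum(returned for _, returned in totals.values())
pooled_roi = roi_pct(pooled_stake, pooled_return)

# Adding the per-model percentages overstates the loss, because each
# percentage is measured against only half of the total outlay.
summed_roi = sum(roi_pct(stake, returned) for stake, returned in totals.values())

print(f"Pooled ROI: {pooled_roi:.2f}%")
print(f"Sum of individual ROIs: {summed_roi:.2f}%")
```

With equal stakes on each model, the pooled figure is simply the average of the two individual ROIs.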
As mentioned above, it was the correct score predictions that helped deliver a decent profit on each model at the start of the season, but the grind of predicting outcomes is now becoming tougher. If we were to re-ask the question of whether AI can predict Premier League results and beat the bookmakers, we would have to say that it is not doing a bad job right now, but human instinct may be a better commodity across the course of a full campaign.
Can there be a great escape in these final few weeks of the season?