Like most Kiwis, right now my mind is on the Rugby World Cup (and little else). Unfortunately in Norway no one knows what rugby is, so I’ve been spending my time educating them. A common response, apart from “why do the big guys keep making the turtle thing” – they mean the scrum (I think) – has been “you’re a numbers dude, why don’t you make some match predictions?”
So I have.
Using results from all matches between the 20 teams involved in this year’s RWC, I put together a pretty simple model that calculates team ratings. The difference between the ratings for any two teams is the model’s prediction for the expected score difference should they play each other. The idea behind the model is that it tries to minimise the prediction error (well, the square of the error) over all matches. I’ve also added in a “home advantage” rating for each team – realising that playing at home of course generally (but interestingly, not always…) makes you perform better.
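For the curious, here is a minimal sketch of how a ratings model like this could be fitted. It is not my actual spreadsheet, just one way of doing it under some assumptions: ordinary least squares via NumPy, one home advantage number per team, and three hypothetical teams (A, B, C) with made-up results chosen so the answer is easy to check by hand. Ratings are only defined relative to each other, so the sketch centres them on zero.

```python
import numpy as np

# Made-up results for three hypothetical teams (not real match data):
# (home team, away team, home score minus away score)
matches = [
    ("A", "B", 15),
    ("B", "A", -10),
    ("A", "C", 25),
    ("C", "A", -15),
    ("B", "C", 10),
    ("C", "B", -5),
]

teams = sorted({team for home, away, _ in matches for team in (home, away)})
idx = {team: i for i, team in enumerate(teams)}
n = len(teams)

# One row per match. Columns 0..n-1 are team ratings,
# columns n..2n-1 are each team's home advantage.
# Modelled margin = rating[home] - rating[away] + home_adv[home]
X = np.zeros((len(matches), 2 * n))
y = np.zeros(len(matches))
for row, (home, away, margin) in enumerate(matches):
    X[row, idx[home]] = 1.0
    X[row, idx[away]] = -1.0
    X[row, n + idx[home]] = 1.0
    y[row] = margin

# Least squares: minimise the sum of squared prediction errors
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Only rating differences matter, so centre the ratings on zero
centred = coef[:n] - coef[:n].mean()
ratings = dict(zip(teams, centred))
home_adv = dict(zip(teams, coef[n:]))

for team in teams:
    print(f"{team}: rating {ratings[team]:6.1f}, home advantage {home_adv[team]:5.1f}")
```

With this toy data the fit recovers ratings of 10, 0 and -10 for A, B and C, and home advantages of 5, 0 and 5. Real results are noisy, of course, so a real fit never lines up this neatly.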
So, all in all not a super sophisticated model, but neither should it be really. We only have past data to go on here – each of the teams playing is potentially quite different to previous incarnations of themselves – different players, coaches, tactics, hair gel, the lot. So the past is only a so-so guide. Any model is only as good as the information used to build it, and it is a waste of time building a sophisticated model on a foundation of sand.
Anyway, so what are the ratings? I used two data sets – the first including only matches since 2014, and the second all matches since 2012. The thought was that comparing the two would give us some indication of ratings stability and also recent form. So, here they are:
With the ratings, you can see things like the value of Twickenham for England (5 points), that Wales performs as well away as at home, and that Ireland and Australia most definitely outperform on home soil. And of course the ABs are best. As if there was any doubt.
Note that the Home Advantage ratings are in multiples of 5 only – there is a limit to accuracy here and it’s no good pretending otherwise. This also means of course that two teams with 1, 2 or 3 points between them in ratings are essentially equal – or at least the model has difficulty in separating them anyway.
Rugby World Cup – Round 1 match predictions
So, let’s use these to predict the results of the first round of pool play, which starts tonight (yes!). Note that, as well as giving England full home advantage, I’ve given the other “home” nations half their Home Advantage ratings, reflecting that they are – nearly – playing in front of home crowds. Anyway, the predictions:
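To make the arithmetic concrete, here is a small sketch of how a predicted margin could be worked out from the ratings. The team names, ratings and home advantage values are placeholders invented for illustration, and the function `predict_margin` and the `home_weight` table are my own hypothetical names. The sketch assumes each team's share of home advantage is simply added to its rating before taking the difference.

```python
# Placeholder ratings and home advantages (invented, not the real table)
ratings = {"Host": 10.0, "Neighbour": 5.0, "Visitor": 0.0}
home_adv = {"Host": 5.0, "Neighbour": 10.0, "Visitor": 5.0}

# Share of home advantage each team gets at this tournament:
# the host gets all of it, a near-home team gets half, everyone else none
home_weight = {"Host": 1.0, "Neighbour": 0.5}


def predict_margin(team_a, team_b):
    """Predicted points margin for team_a over team_b (negative means team_b wins)."""
    strength_a = ratings[team_a] + home_weight.get(team_a, 0.0) * home_adv[team_a]
    strength_b = ratings[team_b] + home_weight.get(team_b, 0.0) * home_adv[team_b]
    return strength_a - strength_b


print(predict_margin("Host", "Visitor"))       # (10 + 5) - 0 = 15.0
print(predict_margin("Neighbour", "Visitor"))  # (5 + 0.5 * 10) - 0 = 10.0
print(predict_margin("Host", "Neighbour"))     # (10 + 5) - (5 + 5) = 5.0
```

Given the limits on accuracy mentioned above, a predicted margin of only a few points either way is best read as too close to call.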
No great surprises here – the closest games then look like being Samoa v. USA, Tonga v. Georgia and NZ v. Argentina. We’ll come back next week and see how the model has done.
(Note: I inadvertently transposed Australia and Argentina in my original post, so have fixed this – sorry for any confusion this may have caused!)