All Things Techie With Huge, Unstructured, Intuitive Leaps
Showing posts with label rugby. Show all posts

Performance Analytics Guide To Aviva Premiership Rugby Standings


Every professional sport has its analytics and statistics department, because player and team analysis translates into money. Curiously, professional rugby has resisted applying data science to performance, despite my efforts and those of some others. Owners and coaches simply do not believe that deep analytics has a part to play; in their view, rugby is essentially a stochastic game that cannot be predicted on a macro level. In the olden days, the coach used to take a seat, sometimes in the stands, once the game began. Of course, they are wrong if they believe that performance cannot be measured and predicted at every level.

Even the simplest metrics and analyses can benefit a team, or someone betting on a team. I have developed a software package called RugbyMetrics that digitizes a game from video so that analytics and data mining can be run on the game record. Here is a screenshot of the capture mode:


The degree of granularity goes right down to individual players and events. One of the biggest advantages of analytics is that it can be used to set compensation according to a player's ability relative to his peers. But the current crop of owners and coaches is leery of deep-dive analytics.

The field of analytics and its mathematical underpinnings have evolved greatly in the past twenty years. As an example, by way of a mini-tutorial and a prediction of the final standings for the 2015-2016 campaign, I will demonstrate the Pythagorean Analytics Won Loss Formula for evaluating team performance and projecting where teams will stand at the end of the season.

It is early in the Aviva Premiership Rugby season, and only 8 matches have been played. The top team, the Saracens, have 36 points and the bottom team, the London Irish, have 4 points. While this seems like a lot, each win garners 4 points, so the chasm between the bottom and the top doesn't seem that drastic to the casual observer. Let me quote the league rules on how points are amassed:

During Aviva Premiership Rugby points will be awarded as follows:
• 4 points will be awarded for a win
• 2 points will be awarded for a draw
• 1 point will be awarded to a team that loses a match by 7 points or less
• 1 point will be awarded to a team scoring 4 tries or more in a match
In the case of equality at any stage of the Season, positions at that stage of the season shall be determined firstly by the number of wins achieved and then on the basis of match points differential. A Club with a larger number of wins shall be placed higher than a Club with the same number of league points but fewer wins.

If Clubs have equal league points and equal number of wins then a Club with a larger difference between match points "for" and match points "against" shall be placed higher in the Premiership League than a Club with a smaller difference between match points "for" and match points "against".

OK, so you see how a team amasses league points after the game is played. During the game, the chief method of scoring is the try, where the ball is grounded behind the goal line. That gives you five points and, as in football, you can then kick a conversion, which is worth 2 extra points. You can also kick a penalty goal or a drop goal, and those are worth three points each. In the past, the points awarded for these scoring methods have varied.
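The two scoring layers described above, match points during a game and league points after it, can be sketched in a few lines of Python. This is only an illustration of the rules as quoted: the function names are hypothetical, and it assumes the try bonus applies whether a team wins, draws or loses, which is how the quoted rule reads.

```python
def match_score(tries, conversions, penalties, drop_goals):
    """Match points from scoring events, using the values quoted in the post."""
    return 5 * tries + 2 * conversions + 3 * (penalties + drop_goals)


def league_points(points_for, points_against, tries_for):
    """League points one team earns from a single match."""
    if points_for > points_against:
        points = 4                      # win
    elif points_for == points_against:
        points = 2                      # draw
    else:
        points = 0                      # loss
        if points_against - points_for <= 7:
            points += 1                 # losing bonus: lost by 7 or less
    if tries_for >= 4:
        points += 1                     # try bonus: 4 tries or more
    return points


# Invented example match, not taken from the league table.
home = match_score(tries=4, conversions=3, penalties=1, drop_goals=0)
away = match_score(tries=2, conversions=2, penalties=3, drop_goals=0)
print(home, away)                               # 29 23
print(league_points(home, away, tries_for=4))   # 5: win plus try bonus
print(league_points(away, home, tries_for=2))   # 1: lost by 6
```

A win with four tries is therefore worth 5 league points, which is why 8 wins can yield more than 32 points.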

The current league table looks like this:

As you can see, each team has played 8 games. The Saracens have won them all and the London Irish have won only one. The Sarries, in first place, have amassed a total of 232 points across the 8 games and have yielded 81, for a difference of 151. In terms of tries, the major scoring method, they have scored 24 and conceded 5. Impressive. The Chiefs have scored more tries but are in second place. This would indicate that their defensive capabilities don't match their offense, and the 12 tries scored against them (compared to the 5 against the Sarries) bear that out.

So one would look at the table and say that a few teams still have a shot at winning or getting into the top 4. For example, I admire the Saracens, but I like the Tigers very much, and RugbyMetrics was developed with Bath in mind. So what are their chances? How will it finish when it is all said and done? That's where the predictive power of the Pythagorean Analytics Won Loss Formula comes in.

So what does this Pythago-thing-a-majig do? I am going to talk technical for a minute. If your eyes glaze over, skip this paragraph. Mathematically, the points that a team scores and the points scored against it are assumed to be drawn from independent translated Weibull distributions. In statistics, the Weibull distribution is a continuous probability distribution, named after the Swedish mathematician Waloddi Weibull. What is a probability distribution? It assigns a probability to each measurable subset of the possible outcomes of a random experiment or a procedure of statistical inference. And we are going to infer the win ratio from how much the team scores and how much is scored against it. That is the end goal of what we are doing. Here is an infographic of the proof, in which goals (tries) for and goals against are modelled as Weibull distributions. Through my math trials I have found a proprietary exponent to use in the Pythagorean Won Loss formula.
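For readers who want to try the idea themselves, here is a minimal sketch of the general Pythagorean form: expected win fraction = for^k / (for^k + against^k). The function name and the default exponent of 2.0 (the classic baseball value) are assumptions for illustration; the exponent actually used in this post is proprietary and is not given here. As an observation only, an exponent of about 1.8 happens to reproduce the try-based figure quoted for the Saracens (24 tries for, 5 against).

```python
def pythagorean_expectation(scored, conceded, exponent=2.0):
    """Expected win fraction from totals scored and conceded.

    `exponent` is a free parameter; 2.0 is only a conventional default.
    """
    if scored < 0 or conceded < 0:
        raise ValueError("totals must be non-negative")
    if scored == 0 and conceded == 0:
        return 0.5  # no information either way
    s = scored ** exponent
    c = conceded ** exponent
    return s / (s + c)


# Saracens' try record from the post: 24 for, 5 against.
print(pythagorean_expectation(24, 5))                 # conventional exponent
print(pythagorean_expectation(24, 5, exponent=1.8))   # about 0.944
```

A team that scores exactly as much as it concedes comes out at 0.5 whatever the exponent, so the exponent only controls how strongly a scoring surplus is rewarded.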


OK, so all of the theory aside, what do the results look like?  Who will rise to the top and who will sink to the bottom when it is all said and done?

Currently the standings, early in the season look like this:

1 Saracens
2 Exeter Chiefs
3 Harlequins
4 Leicester Tigers
5 Northampton Saints
6 Gloucester Rugby
7 Sale Sharks
8 Bath Rugby
9 Wasps
10 Worcester Warriors
11 Newcastle Falcons
12 London Irish

When we extrapolate winning performance from points scored for and against using the Pythagorean Won Loss Formula, the predicted order changes. The decimal number beside each team is its expected winning fraction. Using their scoring record, the Sarries are expected to win 86.9 times out of a hundred against their Premiership rivals.

1 Saracens 0.869224
2 Exeter Chiefs 0.756876
3 Northampton Saints 0.609814
4 Harlequins 0.60633
5 Leicester Tigers 0.563826
6 Bath Rugby 0.517414
7 Wasps 0.505625
8 Gloucester Rugby 0.492264
9 Sale Sharks 0.399586
10 Worcester Warriors 0.357453
11 Newcastle Falcons 0.206142
12 London Irish 0.193901

Note that the Northampton Saints jump from 5th place to 3rd, dropping the Harlequins by one; the Tigers drop to 5th, while Bath rises to 6th. The top two and the bottom three teams are where they should be. Poor old Gloucester drops to 8th place.
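The position changes can be checked mechanically. The sketch below takes the current standings and the win fractions listed in this post, sorts by win fraction, and reports who moves. The variable and function names are only illustrative.

```python
# Current league order, as listed in the post.
current = [
    "Saracens", "Exeter Chiefs", "Harlequins", "Leicester Tigers",
    "Northampton Saints", "Gloucester Rugby", "Sale Sharks", "Bath Rugby",
    "Wasps", "Worcester Warriors", "Newcastle Falcons", "London Irish",
]

# Points-based Pythagorean win fractions, as listed in the post.
expectation = {
    "Saracens": 0.869224, "Exeter Chiefs": 0.756876,
    "Northampton Saints": 0.609814, "Harlequins": 0.60633,
    "Leicester Tigers": 0.563826, "Bath Rugby": 0.517414,
    "Wasps": 0.505625, "Gloucester Rugby": 0.492264,
    "Sale Sharks": 0.399586, "Worcester Warriors": 0.357453,
    "Newcastle Falcons": 0.206142, "London Irish": 0.193901,
}


def position_changes(current_order, win_fraction):
    """Map each team to (current position, predicted position)."""
    predicted = sorted(win_fraction, key=win_fraction.get, reverse=True)
    return {
        team: (current_order.index(team) + 1, predicted.index(team) + 1)
        for team in current_order
    }


moves = position_changes(current, expectation)
for team, (now, later) in moves.items():
    if now != later:
        print(f"{team}: {now} -> {later}")
```

Besides the moves mentioned above, the same comparison shows Wasps and Sale trading 9th and 7th.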

Let's take this analysis a little deeper, though. The above Pythagorean Won Loss formula is based on total points. A team's main offensive ability shows up in tries, worth either 5 or 7 points depending on whether the conversion is made. Luckily we have the stats for tries for and against, which lets us eliminate the points from drop goals and penalties. This reflects the upper bound of where a team could end up based on try-scoring ability. Calculating the Pythagorean Won Loss formula using the try data alone again gives hope to certain fans. The number in front of each team is its current league position.

1 Saracens 0.943933
2 Exeter Chiefs 0.811483
5 Northampton Saints 0.639505
8 Bath Rugby 0.636054
3 Harlequins 0.585137
4 Leicester Tigers 0.535957
9 Wasps 0.5
6 Gloucester Rugby 0.418684
10 Worcester Warriors 0.349619
7 Sale Sharks 0.32523
12 London Irish 0.212348
11 Newcastle Falcons 0.1612

The top three remain the same as in the total points analysis. Based on tries alone, Bath Rugby, currently in 8th, has the ability to finish in 4th place. Worcester climbs one place over Sale, and the bottom two are the same two clubs, although London Irish and Newcastle swap places.

So if you are a betting man, you have a decent chance of taking this list to Ladbrokes or some other wagering shop and making a couple of quid on the good old Pythagorean Won Loss formula.

Most of the rest of RugbyMetrics is player-centric, for the benefit of measuring individual performance. I predict that the first team that adopts this and brings home the silverware will open the floodgates for performance analytics in professional rugby.

RugbyMetrics Queries

I have been getting some queries via comment postings about RugbyMetrics. Some people have even been trying to find a trial download. I will be posting some sample results and white papers here shortly. In the meantime, if you have any queries, please drop me a line at:

rugbymetrics-at-gmx.com (substitute "@" for "-at-").

Line Formation Elasticity -- RugbyMetrics

A lot of objective information is falling out of the results from my software tool, RugbyMetrics. While I was doing extensive statistical data mining on actual professional rugby games in the Aviva Premiership, an incredible statistic fell out of the exercise: line formation elasticity.

Rugby is a game where the defense lines up across the field to defend against a similar line of offence. When a player carrying the egg finds that his forward progress is blocked, he passes left or right down the line to his team mates. If there is a hole in the line somewhere on either the defense or offence, then there is a problem.

So I decided to measure line elasticity -- how quickly the line forms or reforms after it is distorted from a play. This analysis fell out of another analysis where I did a ratio of jersey counts between attackers and defenders at the time of tackle, which had a very interesting result.

What the line elasticity measure showed was that the more efficient the line was at reforming, the more successful the play (in both offence and defense). This is especially evident when the team with possession grinds away for a long time with very little field gained. The opposing defensive line is very elastic and very efficient at reforming.

What frame-by-frame video also showed was the laggards who were late in assuming their positions, thus leaving holes in the line. It was very interesting.

From there, when we saw that we could identify the defensive laggards, we saw that we could assign a numeric coefficient of line efficiency, both at a team level and at a player level.
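To make the idea of a line efficiency coefficient concrete, here is one hypothetical way it could be computed. It is not the RugbyMetrics definition, which is not described in this post. The sketch assumes that, for each phase of play, we have the number of seconds each defender took to get back into the line, and it maps the mean lag to a coefficient between 0 and 1. The timings and jersey numbers are invented.

```python
from statistics import mean


def player_lags(phases):
    """Mean seconds each player takes to rejoin the line.

    `phases` is a list of dicts mapping jersey number -> seconds to reform.
    """
    lags = {}
    for phase in phases:
        for jersey, seconds in phase.items():
            lags.setdefault(jersey, []).append(seconds)
    return {jersey: mean(times) for jersey, times in lags.items()}


def efficiency(mean_lag):
    """Map a mean lag in seconds to a coefficient in (0, 1]; 1.0 is instant."""
    return 1.0 / (1.0 + mean_lag)


def laggards(lags, tolerance=1.5):
    """Players whose mean lag exceeds `tolerance` times the team mean."""
    team_mean = mean(list(lags.values()))
    return sorted(j for j, lag in lags.items() if lag > tolerance * team_mean)


# Invented timings for two phases of play.
phases = [
    {6: 2.0, 7: 2.5, 12: 6.0},
    {6: 3.0, 7: 2.5, 12: 8.0},
]
lags = player_lags(phases)
team_coefficient = efficiency(mean(list(lags.values())))
print(lags)               # {6: 2.5, 7: 2.5, 12: 7.0}
print(team_coefficient)   # 0.2
print(laggards(lags))     # [12]
```

The same `efficiency` function serves at both levels: feed it a single player's mean lag for a player coefficient, or the team mean for a team coefficient.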

From there, it was a short step to rating the roster of a team and letting the results settle into a hierarchy of the best players. There are many developed measures of a player's worth coming out of RugbyMetrics. The thought struck me that if a player is negotiating a raise in his contract, one of the bargaining chips could be a RugbyMetrics analysis to show that he is in the company of the best of the breed in the Premiership. Conversely, a team could use RugbyMetrics to prove that a player asking for a raise tends more to a journeyman than a star.

It's all fascinating stuff, and the doors to it are opened by data mining and performance analysis.

Toby Flood Reduced To An Equation

Toby Flood is a fly-half for the Leicester Tigers, and a rugby star in the Aviva Premiership. This is his photograph from Wikipedia:


It's almost sad, but true, that Toby's running game whilst playing rugby can be reduced to a mathematical equation. If you had to describe Toby's running game performance mathematically, you would do it this way:

Obviously I am not going to tell you what x and y stand for, because they came from digitizing and sifting through mounds of data to come up with the mathematical model using predictive analytics and linear regression.

However, if you wanted to choose a player with Toby's prowess, this formula would be incredibly helpful. It was derived using my software package called RugbyMetrics which adds objective knowledge of the game through data-mining and sifting through mounds of statistics.

Click on the video below to watch Toby kick a conversion after a Tiger try. The fly-half is really good!!

Regress to Success -- RugbyMetrics

Let's suppose that you run a rugby team in the Aviva Premiership, or any other professional rugby club. You haven't qualified for the Heineken Cup, your team is full of journeymen players, and you consistently sit in the cellar of the standings table. And let's suppose that you don't have a Daddy Warbucks owner who can buy you a Dan Carter, and you want to create a competitive team.

So what are you going to do? You have to find young untried players who will eventually turn into Thomas Waldrom, Schalk Brits, Chris Ashton or Tom Wood. How are you going to identify them when they haven't had a chance to prove themselves and amass some statistics to show that they have the stuff of the egg-chasing gods?

You turn to the geeks, that's how you do it. How so? You regress your way to success. You would use my RugbyMetrics tool (click on this LINK to see all of the articles on RugbyMetrics). Then you would take game film of your targeted acquisition and, using the tool, digitize that player's performance. From there you would use advanced statistics to create a mathematical model (using regression and Bayesian inference) to determine if your player has the right stuff.
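For anyone unfamiliar with regression, the simplest version is fitting a straight line through paired observations by least squares. The sketch below is only that textbook technique, not the RugbyMetrics model, and it leaves out the Bayesian side entirely. The carries and metres figures are invented for illustration, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented per-game figures for one player: carries and metres gained.
carries = [8, 10, 12, 14]
metres = [30, 38, 46, 54]

slope, intercept = fit_line(carries, metres)
print(slope, intercept)   # 4.0 -2.0
```

With real digitized data the points would not sit exactly on a line, and the fitted slope and intercept become the regression parameters that describe the player.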

How does it work? The seeds of athletic greatness are sown early. However, they may not become manifest because the player is not on a team that enhances his skillset, or he is blindside oriented on a team that is predominantly openside oriented. There are many, many reasons; however, that player will demonstrate the subtle qualities that show that he has the key performance indicators that tend to greatness.

So what are these KPIs, or key performance indicators? They are a new set of statistics that are gleaned from data mining every aspect of the game. These are proprietary knowledge to the users of the system. But as a trivial example, one finds that an Olly Barkley will average x carries, gaining y yards, in a certain ratio to the opposition yards gained. This is objective, scientific knowledge of the game of rugby that comes from the field of predictive analytics.

So once you have the three mathematical formulas gleaned from going through mountains of statistics, you can eliminate the pretenders and give yourself a roster of possible stars. This is not meant to replace the years of coaching and scouting, but rather it is meant to give the teams a scientific, valid starting point when scouting for new team members.

The interesting aspect is that the front eight will have different formulas than the back seven, and each position will have different regression parameters in the models. Style of play comes into it as well. If you like a Tom Wood style of play, you would determine the mathematical model by analyzing his performance and looking for players who have similar numbers to him. It sure beats the shot-in-the-dark method of picking a player who "looks good".
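"Looking for players who have similar numbers" can be illustrated with a plain nearest-neighbour comparison. This is a swapped-in, simpler technique than the regression models described above: each player is a list of KPI values, each KPI is standardised so that no single one dominates, and candidates are ranked by distance from the benchmark player. The KPI values and player labels below are invented.

```python
from math import sqrt
from statistics import mean, pstdev


def standardise(profiles):
    """Convert each KPI column to z-scores across all players."""
    names = list(profiles)
    columns = list(zip(*(profiles[name] for name in names)))
    scaled = []
    for column in columns:
        mu, sigma = mean(column), pstdev(column)
        scaled.append([(v - mu) / sigma if sigma else 0.0 for v in column])
    return {name: [col[i] for col in scaled] for i, name in enumerate(names)}


def most_similar(profiles, target):
    """Other players ordered from most to least similar to `target`."""
    z = standardise(profiles)

    def distance(name):
        return sqrt(sum((a - b) ** 2 for a, b in zip(z[name], z[target])))

    return sorted((n for n in profiles if n != target), key=distance)


# Invented KPI vectors: [carries per game, metres per game, tackles per game].
profiles = {
    "benchmark": [12, 60, 9],
    "prospect A": [11, 58, 9],
    "prospect B": [5, 20, 14],
    "prospect C": [13, 40, 6],
}
print(most_similar(profiles, "benchmark"))
# ['prospect A', 'prospect C', 'prospect B']
```

In practice you would build separate profiles per position, since the numbers that matter for a prop are not the ones that matter for a wing.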

If you have any questions, please leave a comment and I will answer them.