Robert's Rankings
What is this?
Robert's Rankings is a computer
model designed for ranking sports teams. It's not horribly sophisticated,
and by no means does
it take into account all the details that a human observer
might. However, it does have one significant advantage over humans: it
can produce a relative ranking of all the teams in the league simultaneously.
I also designed it so that it isn't sport-specific and can
easily be applied to a number of different forms of competition. The
basic reason I pursued this project is that I want to have some idea of where
the teams I am interested in really fall in the mix.
How does it work?
Well, the details are complicated (and I'm still fiddling with them some),
but the basic principle is that every team starts with a "quality factor" of 1, and
after each game the winner takes some fraction of the quality points away from the
loser. Also, scores matter. Blowouts benefit the winner more than close games. In fact,
the loser can even move up if it keeps the game close against a much
higher-ranked opponent. Of course, it's not quite that simple because the ranks are
computed simultaneously and not in a chronological sequence.
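To make the basic principle concrete, here is a minimal sketch in Python. It is not the actual model: the function name play_game, the rate parameter, and both margin formulas are assumptions made up for illustration, and it handles one game at a time rather than computing all ranks simultaneously. It only shows how quality points could move from loser to winner, how a blowout could move more points than a close game, and how a loser could gain by doing better than expected.

```python
def play_game(quality, winner, loser, winner_score, loser_score, rate=0.2):
    """Return updated quality factors after one game (illustrative only).

    quality maps team name -> quality factor (all assumed positive).
    rate is a hypothetical tuning knob, not a value from the real model.
    """
    q_w, q_l = quality[winner], quality[loser]

    # Actual margin of victory as a share of total points scored (0 to 1).
    actual = (winner_score - loser_score) / (winner_score + loser_score)

    # Assumed "expected" margin: positive when the winner was the stronger team.
    expected = (q_w - q_l) / (q_w + q_l)

    # Points moved from loser to winner. This goes negative when a favorite
    # wins by less than expected, so the loser moves up.
    moved = rate * q_l * (actual - expected)

    updated = dict(quality)
    updated[winner] = q_w + moved
    updated[loser] = q_l - moved
    return updated


# Two evenly matched teams, one blowout: the winner gains, the loser drops.
even = play_game({"A": 1.0, "B": 1.0}, "A", "B", 30, 10)

# A heavy favorite barely wins: the loser's quality factor goes up.
narrow = play_game({"A": 1.5, "B": 0.5}, "A", "B", 21, 20)
```

Because points only move between the two teams, the total (and so the average) quality factor stays the same after every game in this sketch.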
Who gets ranked?
Right now I am ranking basketball
(college and pro), football (college and pro), pro baseball and pro hockey. As
this process becomes more automated, I might expand to include other sports. This method is generally
applicable to any sport where individual contests involve pairs of teams (or players) and
are decided on the basis of a numeric score.
How often are the ranks updated?
As
of right now, I have it automated so that the rankings are updated automatically for
games played during a sport's regular season. Updates occur between 1 and 3 AM PT.
Bowl games, championships, etc., have
to be run through by hand, so updates after the
close of the regular season will generally be more erratic.
What do the various items in each ranking set mean?
- Rank - This is the ordering that the computer
spits out for the various teams. It is based directly on the quality factor.
- Name - This is the
team's name, or an abbreviation thereof.
- Record - The number of wins and losses
that a team has experienced this season. While all games played are
incorporated in these numbers, only games played against other teams on
the list are counted when determining a team's rank. For instance, college
games played out of division would be ignored.
- Strength of Schedule - This is a rank based
on the quality of a team's opponents. Teams with harder schedules will be
ranked nearer to 1. There is no distinction between wins and losses when
calculating this number.
- Quality Factor - This
is the number that determines rank. Every team is assigned a quality
factor by the computer, and these numbers are then sorted to produce the
ranking. Teams with closely matched quality factors should be of similar
quality. The average quality factor is set to be 1.
- Previous - Gives the
team's rank in the previous set of rankings, with arrows to indicate whether the
team has moved up or down. More arrows indicate bigger jumps.
- Upsets - An upset is a game in which a currently higher-ranked
team has been beaten by a lower-ranked team. By
looking at the number of upsets in the ranking scheme, one can get a
sense of how amenable (or not) the given sport is to being ranked in this manner.
A perfect ranking scheme would have 0 upsets, whereas a worthless ranking
would show 50% upsets. It's not necessarily the ranking's fault; many
sports just aren't designed to easily provide a linear ordering from best
to worst.
- Grouping Factor - This
represents the minimum
distance by which two teams have to be separated before the difference can be
considered statistically significant. As you will notice, there
are often several teams within plus or minus one grouping factor, and all
of these teams should be considered to be of basically the same quality.
This serves to illustrate another difficulty in ranking schemes, i.e. the
differences between teams often aren't pronounced enough to definitively say
which of two teams is better.
- Probability of Future Upsets - Based on a few plausible
assumptions about the form of the upset distribution function, it's possible to estimate the
likelihood of an upset in terms of the difference in
quality factors between the two teams who are playing. While it's always
true that we would expect the higher ranked team to win, sports are partly
based on chance and teams can have good days and bad. This measure
provides a means of guessing the likelihood of an upset given how far
apart the two combatants are in the ranking scheme.
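As a rough illustration of how a few of these items relate to one another, the sketch below rescales a set of quality factors so they average 1, sorts them into ranks, and counts the fraction of games that were upsets under those ranks. The function names (normalize, rank_teams, upset_fraction) and the sample teams and results are hypothetical, invented only for this example; the real model's calculations are not shown here.

```python
def normalize(quality):
    """Rescale quality factors so that their average is 1."""
    mean = sum(quality.values()) / len(quality)
    return {team: q / mean for team, q in quality.items()}


def rank_teams(quality):
    """Sort teams by quality factor; the highest quality factor gets rank 1."""
    ordered = sorted(quality, key=quality.get, reverse=True)
    return {team: position for position, team in enumerate(ordered, start=1)}


def upset_fraction(games, ranks):
    """Fraction of games won by the team that currently ranks lower.

    games is a list of (winner, loser) pairs. A larger rank number
    means a lower-ranked team.
    """
    upsets = sum(1 for winner, loser in games if ranks[winner] > ranks[loser])
    return upsets / len(games)


# Made-up quality factors and results for three teams.
quality = normalize({"A": 2.0, "B": 1.0, "C": 0.6})
ranks = rank_teams(quality)
games = [("A", "B"), ("C", "A"), ("B", "C"), ("A", "C")]
fraction = upset_fraction(games, ranks)  # one upset (C over A) in four games
```

A fraction near 0 would mean the ordering explains almost every result, while a fraction near 0.5 would mean the ordering says little about who wins.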
Can I use this system to predict the outcome of a particular
game?
Yes and no. It's certainly true that in its essence the process of ranking teams is about predicting who's better, but
it's also true that a general rank doesn't tell you all that much
about the outcomes of individual games. The grouping factor and probabilities of
upset help give a sense of how big a role luck and uncertainty may
play in the outcome of any game. Fundamentally, sports aren't designed so that
teams line up linearly on the basis of a single factor. If one truly wants
to know the outcome of a particular contest, then it makes sense to study many
factors of both sides. In football, for instance, you'd probably want to look
at things like rushing offense and defense, and passing offense and defense, in
addition to who has won how many games. Linear ranking gives a rough
estimate of who is the better team, but in the complex field of sports it's almost
always possible to improve one's knowledge, especially when researching
individual games.
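For a feel of how an upset probability might depend on the gap in quality factors, here is a small Python sketch. The logistic shape and the scale parameter are assumptions chosen purely for illustration; the actual form of the upset distribution function used by the rankings is not specified here, so treat the numbers this produces as placeholders.

```python
import math


def upset_probability(quality_gap, scale=0.25):
    """Estimated chance that the lower-rated team wins (assumed logistic form).

    quality_gap is the difference in quality factors between the two teams.
    scale is a hypothetical parameter controlling how quickly the chance of
    an upset falls off as the gap grows.
    """
    return 1.0 / (1.0 + math.exp(abs(quality_gap) / scale))


# Evenly matched teams are a coin flip; a wider gap makes an upset less likely.
even_match = upset_probability(0.0)
small_gap = upset_probability(0.1)
large_gap = upset_probability(0.5)
```

Under this assumed form the upset chance never reaches zero, which fits the point that even a clearly better team can have a bad day.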