A Look at the RPI
The RPI is one way to compare the performances of teams that play different schedules and different proportions of home and away
games.
- For example, as of 2/27/06, George Washington has a record of 24-1 and Georgetown has a record of 19-7. But Georgetown belongs
to the powerful Big East conference, and has played a rather tough schedule (including a nonconference game against mighty Duke),
while George Washington has played what seems to be a much easier schedule. Some might argue that Georgetown is the better team
despite GW's impressive winning percentage. But it's impossible to say what would have happened had GW played Georgetown's schedule,
and Georgetown played GW's schedule, and so we are left with uncertainty as to which team is really better.
- Looking at another D.C. area team, Maryland is 16-11 (as of 2/27/06). Some might argue that Maryland plays in the
tough ACC and that their record might be better if they played in a league without as many strong teams. But a fair criticism of Maryland is that they
play many more home games than away games, and that their winning percentage might not be as good if they had to play more games away
from their home court. If we average their home winning percentage (0.867) and their away winning percentage (0.125), the result is
0.496. Some might argue that their winning record is a bit misleading, and that Maryland's overall record has benefited from the fact that
they play nearly twice as many home games as away games. (NC State, Pittsburgh, Louisville, Ohio State, Illinois, Wisconsin, Florida,
Arkansas, and some other teams from major conferences similarly benefit from playing many more home games than away games.) So in
evaluating Maryland, it seems fair to give them credit for having played a strong schedule (by one ranking, the 8th toughest schedule
overall), but at the same time they should perhaps be penalized for playing only a relatively small percentage of their games on the
road. It's a difficult task to determine how to weight all of the facts when assessing Maryland's strength.
The RPI attempts to make adjustments for strength of schedule and different percentages of home and away games. In a sense it tries
to determine how strong teams would appear to be if they all played "on a level playing field."
- Instead of using "straight" winning percentages, an adjustment is made by weighting each away win and home loss more than each
home win and away loss: the weight given to an away win is 1.4, while the weight given to each home win is only 0.6; and the weight
for a home loss is 1.4, while the weight for a road loss is only 0.6.
- Then the RPI for a team is determined using not only its adjusted winning percentage, but also the adjusted winning percentages
of the team's opponents, and the opponents' opponents, in an attempt to adjust for differing strengths of schedules.
In using various winning percentages to arrive at final RPI values for teams, it's not completely clear that the 25%-50%-25% weighting
(of a team's own winning percentage, its opponents' winning percentage, and its opponents' opponents' winning percentage, respectively)
is ideal, or even that a weighted average should be used at all; maybe a good argument could be made for multiplying various
winning percentages. However, rather than look into that aspect of the RPI, here the focus is on the formula that modifies
a team's winning percentage using weights, in an attempt to adjust for the fact that some teams load their schedules with many more
home games than away games, while other teams play much more balanced schedules.
The Effect of Different Weights for Home Wins and Road Wins
Consider a team that wins 70% of its home games, but only 30% of its away games. If it played all of its games at home its winning
percentage would be 0.700, if it played all of its games on the road its winning percentage would be 0.300, and if it played an equal
number of home and away games its winning percentage would be 0.500. Clearly, the proportion of the team's games which are home
games will have a large effect on its winning percentage.
Letting f denote the proportion of the team's games that are home games, assuming no neutral-court games so that the
proportion of away games is 1 - f, and letting n denote the total number of games the team plays,
the number of home games is nf and
the number of away games is n(1 - f).
If 70% of the home games are wins, and 30% of the away games are wins, then
the number of home wins is 0.7nf,
the number of home losses is 0.3nf,
the number of away wins is 0.3n(1 - f),
and the number of away losses is 0.7n(1 - f).
Thus the unadjusted winning percentage is
[0.7nf + 0.3n(1 - f)] / [0.7nf + 0.3n(1 - f) + 0.3nf + 0.7n(1 - f)],
which is equal to
0.3 + 0.4f.
But if each home win and road loss is given weight 0.6, and
each road win and home loss is given weight 1.4,
the adjusted winning percentage is
[0.7nf(0.6) + 0.3n(1 - f)(1.4)] / [0.7nf(0.6) + 0.3n(1 - f)(1.4) + 0.3nf(1.4) + 0.7n(1 - f)(0.6)],
which is equal to 0.5 (no matter what the value of f is), since the numerator reduces to 0.42n and the denominator to 0.84n.
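The calculation above can be checked numerically. What follows is a minimal Python sketch, not the NCAA's own code: the function names and parameters (`adjusted_pct_from_counts`, `adjusted_pct_from_rates`, `unadjusted_pct`, `light`, `heavy`) are invented here for illustration, and neutral-court games are ignored, as in the text.

```python
def adjusted_pct_from_counts(home_w, home_l, away_w, away_l,
                             light=0.6, heavy=1.4):
    """Weighted winning percentage from win/loss counts.

    Home wins and away losses get the light weight; away wins and
    home losses get the heavy weight (0.6 and 1.4 in the text).
    """
    weighted_wins = light * home_w + heavy * away_w
    weighted_losses = heavy * home_l + light * away_l
    return weighted_wins / (weighted_wins + weighted_losses)


def adjusted_pct_from_rates(p_home, p_away, f, light=0.6, heavy=1.4):
    """Same quantity for a team that wins p_home of its home games and
    p_away of its away games, with a fraction f of games at home.
    The total number of games n cancels, so it is left out."""
    return adjusted_pct_from_counts(
        p_home * f, (1 - p_home) * f,
        p_away * (1 - f), (1 - p_away) * (1 - f),
        light, heavy,
    )


def unadjusted_pct(p_home, p_away, f):
    """Plain winning percentage: p_home*f + p_away*(1 - f)."""
    return p_home * f + p_away * (1 - f)


# The 70%/30% team from the text at a few home shares f.
for f in (0.4, 0.5, 0.7):
    print(f, round(unadjusted_pct(0.7, 0.3, f), 3),
          round(adjusted_pct_from_rates(0.7, 0.3, f), 3))
```

For this team the unadjusted value moves with f while the adjusted value stays at 0.5 (up to floating-point rounding).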
As the table below shows, although the unadjusted winning percentage depends on the value of f, the adjusted winning percentage
does not. This means that if two teams perform identically in the sense that each wins 70% of its home games and 30% of its away
games, the adjusted winning percentage would rate the two teams equally even if one loaded up on home games and had a winning record
while the other played a schedule balanced in home and away games and had a 0.500 record, or even played more road games than home
games and had a losing record. Thus, with regard to such a team's adjusted winning percentage, there is no incentive to load up on
home games and play a smaller number of away games. (Of course there are other incentives for favoring home games, and since the
adjusted winning percentage does not really penalize the team for choosing to play more games at home than on the road, one might say
that while the adjusted winning percentage used in the RPI makes comparisons fairer, it does not provide an incentive for teams to
play a balanced schedule. So, if one wanted to urge all teams to play balanced schedules, the home and road weights used in the RPI
may not be different enough to penalize teams that like to load up their schedules with home games.)
proportion of home games (f) | unadjusted winning pct | adjusted winning pct
0.7 | 0.580 | 0.500
0.65 | 0.560 | 0.500
0.6 | 0.540 | 0.500
0.55 | 0.520 | 0.500
0.5 | 0.500 | 0.500
0.45 | 0.480 | 0.500
0.4 | 0.460 | 0.500
While the adjusted winning percentage is invariant to the proportion of home games, f, when a team wins 70% of
its home games and 30% of its away games, that invariance does not hold in general.
If a team wins 75% of its home games and 25% of its away games, then the unadjusted winning
percentage is
0.75f + 0.25(1 - f) = 0.25 + 0.5f,
and the adjusted winning percentage is
[0.75nf(0.6) + 0.25n(1 - f)(1.4)] / [0.75nf(0.6) + 0.25n(1 - f)(1.4) + 0.25nf(1.4) + 0.75n(1 - f)(0.6)],
which simplifies to
0.4375 + 0.125f.
So, for such a team, its adjusted winning percentage will increase as the proportion of home games increases,
although as the table below shows, the adjusted winning percentage does not vary with f nearly as much as the
unadjusted winning percentage does. Still, for such a team, the weights of 0.6 and 1.4 used to determine the
adjusted winning percentage aren't different enough to remove the incentive to load its schedule with
more home games than away games. For such a team, to make the adjusted winning percentage not depend on f,
the weights would have to be 0.5 and 1.5 instead of 0.6 and 1.4.
proportion of home games (f) | unadjusted winning pct | adjusted winning pct
0.7 | 0.600 | 0.525
0.65 | 0.575 | 0.519
0.6 | 0.550 | 0.513
0.55 | 0.525 | 0.506
0.5 | 0.500 | 0.500
0.45 | 0.475 | 0.494
0.4 | 0.450 | 0.488
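Rows like those in the table above can be regenerated for any pair of home and away winning rates. The sketch below is illustrative only; `adjusted_pct` and `sweep` are names made up here, and the default weights are the 0.6 and 1.4 quoted in the text. It also checks each row against the two closed forms given for the 75%/25% team.

```python
def adjusted_pct(p_home, p_away, f, light=0.6, heavy=1.4):
    """Adjusted winning percentage for home share f (no neutral games)."""
    wins = light * p_home * f + heavy * p_away * (1 - f)
    losses = heavy * (1 - p_home) * f + light * (1 - p_away) * (1 - f)
    return wins / (wins + losses)


def sweep(p_home, p_away, shares=(0.7, 0.65, 0.6, 0.55, 0.5, 0.45, 0.4)):
    """Return (f, unadjusted, adjusted) rows, one per home share f."""
    rows = []
    for f in shares:
        unadjusted = p_home * f + p_away * (1 - f)
        rows.append((f, unadjusted, adjusted_pct(p_home, p_away, f)))
    return rows


for f, unadj, adj in sweep(0.75, 0.25):
    # Closed forms from the text for the 75%/25% team.
    assert abs(unadj - (0.25 + 0.5 * f)) < 1e-12
    assert abs(adj - (0.4375 + 0.125 * f)) < 1e-12
    print(f"{f:.2f} | {unadj:.3f} | {adj:.3f}")
```

A value that sits exactly on a rounding boundary, such as 0.5125 at f = 0.6, may print with a last digit that differs from the table because of floating-point representation.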
If we consider a team that wins 90% of its home games and 50% of its away games, then things get a bit screwier.
The unadjusted winning percentage is
0.9f + 0.5(1 - f) = 0.5 + 0.4f,
and the adjusted winning percentage is
(0.7 - 0.16f)/(1 - 0.32f),
which increases as f increases, meaning that the team should try to schedule as many home games as possible in
order to increase its RPI. The table below shows how both the unadjusted and the adjusted winning percentages
increase as f increases. The adjusted winning percentage is much less sensitive to the proportion of home games:
over the range shown it changes by about 0.03, compared with 0.12 for the unadjusted winning percentage.
proportion of home games (f) | unadjusted winning pct | adjusted winning pct
0.7 | 0.780 | 0.758
0.65 | 0.760 | 0.753
0.6 | 0.740 | 0.748
0.55 | 0.720 | 0.743
0.5 | 0.700 | 0.738
0.45 | 0.680 | 0.734
0.4 | 0.660 | 0.729
Another thing to note is that the adjusted winning percentage is 0.738, not 0.700, when
f = 0.5, even though this team's actual winning percentage would be 0.700 if it played an equal number
of home and away games.
In order to make the adjusted winning percentage not depend on f, the weights would have to be 0.5 and 1.5
instead of 0.6 and 1.4. With weights of 0.5 and 1.5 the adjusted winning percentage for a team that wins 90% of its
home games and 50% of its away games equals 0.750 no matter what proportion of the games are home games. (Note that
with weights of 0.5 and 1.5, although the adjusted winning percentage doesn't depend on f, it's not equal to
0.700 as one might guess it should be.)
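The claim about weights of 0.5 and 1.5 is easy to confirm numerically. This is a small sketch under the same assumptions as the text (no neutral-court games); the function name `adjusted_pct` and the list names are invented for the example.

```python
def adjusted_pct(p_home, p_away, f, light, heavy):
    """Adjusted winning percentage for a home share f (no neutral games)."""
    wins = light * p_home * f + heavy * p_away * (1 - f)
    losses = heavy * (1 - p_home) * f + light * (1 - p_away) * (1 - f)
    return wins / (wins + losses)


shares = [0.4, 0.5, 0.6, 0.7]

# Team that wins 90% at home and 50% away, under both weightings.
current = [adjusted_pct(0.9, 0.5, f, 0.6, 1.4) for f in shares]
alternative = [adjusted_pct(0.9, 0.5, f, 0.5, 1.5) for f in shares]

for f, cur, alt in zip(shares, current, alternative):
    print(f"f={f:.2f}  weights 0.6/1.4: {cur:.3f}  weights 0.5/1.5: {alt:.3f}")
```

With weights 0.6 and 1.4 the value rises with f; with 0.5 and 1.5 the numerator is 0.75n(1 - 0.4f) and the denominator is n(1 - 0.4f), so the ratio is 0.750 for every f.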
If a team wins 90% of its home games and 60% of its away games, or a team wins 90% of its home games and 30% of its
away games, the behavior of the adjusted winning percentage is similar: in both cases it increases as the
proportion of home games increases, and so in both cases such teams should try to maximize their number of home games
in order to maximize their adjusted winning percentage.
Summary and Conclusions: The Effect of Different Weights for Home Wins and Road Wins
Here I focused on the adjusted winning percentage, which is just one component of the RPI. (Note: What I refer to as
the adjusted winning percentage may go by another name that I am not aware of.) It can be seen that if a team wins
70% of its home games and 30% of its away games then the adjusted winning percentage does not change as the
proportion of home games changes. But if a team wins 75% of its home games and 25% of its away games, or a team wins
90% of its home games and 50% of its away games, then the adjusted winning percentage does change as the proportion
of home games changes: in both cases it increases as the proportion of home games increases. For such teams the
adjusted winning percentage does not provide an incentive to have an equal number of home and away games. In order
to make the adjusted winning percentage not increase as the proportion of home games increases, in both cases the
weights would have to be 0.5 and 1.5 instead of 0.6 and 1.4. (Note: In other cases the weights would have to be
different values in order to have the adjusted winning percentage not depend on the proportion of home games.)
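The note about other cases can be made concrete. Working from the formulas above (this is algebra done here, not part of any official RPI description), the weighted wins and weighted losses are each linear in f, so their ratio is constant exactly when heavy/light = sqrt( p(1 - q) / (q(1 - p)) ), where p and q are the home and away winning rates. The sketch below assumes the two weights sum to 2, as 0.6/1.4 and 0.5/1.5 both do; `invariant_weights` is a hypothetical helper name.

```python
import math


def adjusted_pct(p_home, p_away, f, light, heavy):
    """Adjusted winning percentage for a home share f (no neutral games)."""
    wins = light * p_home * f + heavy * p_away * (1 - f)
    losses = heavy * (1 - p_home) * f + light * (1 - p_away) * (1 - f)
    return wins / (wins + losses)


def invariant_weights(p_home, p_away):
    """Weights (light, heavy), summing to 2, for which the adjusted
    winning percentage does not depend on the home share f.

    Uses heavy/light = sqrt(p_home*(1 - p_away) / (p_away*(1 - p_home))).
    """
    if not (0 < p_home < 1 and 0 < p_away < 1):
        raise ValueError("rates must lie strictly between 0 and 1")
    ratio = math.sqrt(p_home * (1 - p_away) / (p_away * (1 - p_home)))
    light = 2 / (1 + ratio)
    return light, 2 - light


for p_home, p_away in [(0.7, 0.3), (0.75, 0.25), (0.9, 0.5), (0.9, 0.3)]:
    light, heavy = invariant_weights(p_home, p_away)
    values = [adjusted_pct(p_home, p_away, f, light, heavy)
              for f in (0.3, 0.5, 0.8)]
    print(p_home, p_away, round(light, 3), round(heavy, 3),
          [round(v, 3) for v in values])
```

The first three cases recover the weights discussed in the text (0.6/1.4, then 0.5/1.5 twice), while a 90%/30% team would need a more extreme pair.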
I think that a case could be made for changing the weights from 0.6 and 1.4 to values that differ even more,
like 0.5 and 1.5, or perhaps something even more extreme. If the adjusted winning percentage can be increased by
increasing the proportion of home games, then a team can benefit by loading its schedule with more home games, and
there is no incentive for teams to balance the numbers of their home
and away games, which seems to be the fair thing to do for a variety of reasons. Even if the adjusted winning
percentage does not depend on the proportion of home games, there still is no incentive for teams to balance their
home and away games. Only if the adjusted winning percentage penalizes teams that load up their
schedules with a rather large proportion of home games will it provide an incentive for teams not to play a large
proportion of their games on their home court.
Working Within the System
Given that the RPI is an index that the Selection Committee uses, one might wonder if there are any loopholes that
can be exploited to improve a team's rating. I think that in general it would be hard to gain appreciably by
"playing the system" but that there are perhaps a few somewhat subtle points worth noting.
Home Wins versus Road Wins
With a home win given a weight of 0.6 and a road win given a weight of 1.4, one might guess that road wins are the
thing to focus on. But actually, for good teams, home wins are very important. With a schedule that is balanced in
home and away games, it would be better to win 90% of the
home games and only 60% of the away games than it would be to win 80% of the home games and 70% of the away games,
even though in each case the team's overall winning percentage would be 0.750. (The reason a home win
weighted only 0.6 is so valuable is that the alternative is a home loss, which carries a weight of 1.4. If the
lightly weighted win doesn't occur, a heavily weighted loss does, which counters the apparent value of a road win
weighted 1.4. In a sense it's just as important to avoid a costly home loss
with a lightweight home win as it is to earn a valuable road win, and when one actually does the calculations it can
be seen that for teams that have good home and road winning percentages, it is actually better to trade road wins for
home wins.)
The table below gives the adjusted winning percentages for a team with an overall winning percentage of 0.750 under
three different scenarios. It can be seen that it is important to maintain a high winning percentage at home even if
it means having a lower winning percentage on the road. So perhaps in scheduling one would want to avoid taking on
the risk of a home loss. If one wants to accept some tough games in order to get credit for having a high strength
of schedule, it may be better to go on the road for such games and hope for the best. (If a team is going to take a
few road losses in the course of a tough season, the road losses might as well be to good teams in order to at least
have the strength of schedule benefit.)
home winning pct | away winning pct | unadjusted winning pct | adjusted winning pct
0.900 | 0.600 | 0.750 | 0.784
0.850 | 0.650 | 0.750 | 0.772
0.800 | 0.700 | 0.750 | 0.760
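The three scenarios in the table can be reproduced with a few lines of Python. This is a sketch for a balanced schedule (f = 0.5) with the 0.6 and 1.4 weights from the text; the function name `adjusted_pct` and the "weak team" rates near the end are hypothetical, chosen only to show the opposite case.

```python
def adjusted_pct(p_home, p_away, f=0.5, light=0.6, heavy=1.4):
    """Adjusted winning percentage for a home share f (no neutral games)."""
    wins = light * p_home * f + heavy * p_away * (1 - f)
    losses = heavy * (1 - p_home) * f + light * (1 - p_away) * (1 - f)
    return wins / (wins + losses)


# Same overall record (0.750), split differently between home and road.
scenarios = [(0.90, 0.60), (0.85, 0.65), (0.80, 0.70)]
results = [adjusted_pct(h, a) for h, a in scenarios]
for (h, a), adj in zip(scenarios, results):
    print(f"{h:.3f} | {a:.3f} | {(h + a) / 2:.3f} | {adj:.3f}")

# Hypothetical weak team: shifting wins from road to home lowers the value.
weak_before = adjusted_pct(0.40, 0.20)
weak_after = adjusted_pct(0.50, 0.10)  # same overall record, more home wins
print(round(weak_before, 3), round(weak_after, 3))
```

At a balanced schedule, moving a small share of wins from the road to home lowers the weighted wins by half as much as it lowers the total weight, so the trade raises the adjusted percentage only when that percentage is already above 0.500. That matches the remark that the trade favors teams with good home and road records, and the hypothetical weak team shows it going the other way.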
The table below shows the effect of road winning percentage on the overall unadjusted and adjusted winning
percentages for a team that wins 90% of its home games and plays an equal number of home and away games.
Note that with such a high home winning percentage, the adjusted winning percentage stays respectable even as the road
winning percentage gets rather low. Furthermore, it's interesting to note that as the road winning percentage ranges
from 0.300 to 0.800, the adjusted winning percentage is always greater than the unadjusted winning percentage.
home winning pct | away winning pct | unadjusted winning pct | adjusted winning pct
0.900 | 0.900 | 0.900 | 0.900
0.900 | 0.800 | 0.850 | 0.865
0.900 | 0.700 | 0.800 | 0.826
0.900 | 0.600 | 0.750 | 0.784
0.900 | 0.500 | 0.700 | 0.738
0.900 | 0.400 | 0.650 | 0.688
0.900 | 0.300 | 0.600 | 0.632
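The observation that the adjusted figure exceeds the unadjusted one throughout the table has a simple explanation at a balanced schedule. Writing p and q for the home and away rates, a little algebra on the formulas above (worked here, not taken from an official source) shows the difference has the same sign as (p - q)(p + q - 1). So a team that wins more at home than away gets a boost exactly when its overall record is above 0.500. The sketch below checks this; `gap` is a name invented for the example.

```python
def adjusted_pct(p_home, p_away, f=0.5, light=0.6, heavy=1.4):
    """Adjusted winning percentage for a home share f (no neutral games)."""
    wins = light * p_home * f + heavy * p_away * (1 - f)
    losses = heavy * (1 - p_home) * f + light * (1 - p_away) * (1 - f)
    return wins / (wins + losses)


def gap(p_home, p_away):
    """Adjusted minus unadjusted winning percentage at a balanced schedule."""
    return adjusted_pct(p_home, p_away) - (p_home + p_away) / 2


# Home rate fixed at 0.900, road rate as in the table.
for away in (0.8, 0.7, 0.6, 0.5, 0.4, 0.3):
    print(f"away {away:.3f}: gap {gap(0.9, away):+.3f}")
```

For a 0.900 home team the gap vanishes when the road rate is 0.900 (p = q) or 0.100 (p + q = 1), and it turns negative below 0.100.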
Other Scheduling Concerns
On 2/28/06, George Washington was ranked only 29th by the RPI despite a gaudy 24-1 record. This shows that strength
of schedule can have a strong impact on a team's RPI rating. So while it's nice to have a great winning percentage,
one should not overlook strength of schedule.
To a large degree, strength of schedule depends on the quality of other teams in the league, and GW is hurt because a
lot of the other Atlantic 10 teams, which are GW's opponents and thus contribute to its strength of schedule, are
somewhat weak. But when it comes to scheduling nonconference games, one should perhaps keep
a few things in mind. For example, since both a team's opponents and its opponents' opponents contribute towards
the RPI, a team like George Mason should favor a tough opponent from the ACC or SEC over a tough opponent from a weak
conference, like GW from the Atlantic 10. In both cases the tough opponent would help out, but by playing GW one
risks a loss against a tough opponent without getting a boost from GW's opponents being tough. So it would be
better to take a chance against a tough opponent from a top conference, since whether the game is won or lost, the
opponent's tough opponents would at least serve to help raise the RPI. (A beautiful thing about Mason's win over
Wichita State, in addition to it being a road win, is that not only does the fact that Wichita State is a good team
help to boost Mason's RPI value, but the fact that Wichita State's opponents are generally strong also helps to boost
Mason's RPI.)
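The scheduling argument can be illustrated with a toy calculation. The sketch below assumes the composite is 0.25 times a team's own winning percentage, plus 0.50 times its opponents' winning percentage, plus 0.25 times its opponents' opponents' winning percentage, with the last two taken as simple averages over the schedule. Every number and label in it is invented for illustration; none is an actual team's figure.

```python
def rpi(wp, owp, oowp):
    """Assumed composite: 25% own, 50% opponents', 25% opponents' opponents'."""
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp


def schedule_rpi(own_wp, opponents):
    """opponents: list of (opponent_wp, opponent_owp) pairs, one per game."""
    owp = sum(wp for wp, _ in opponents) / len(opponents)
    oowp = sum(opp_owp for _, opp_owp in opponents) / len(opponents)
    return rpi(own_wp, owp, oowp)


# Hypothetical base schedule: nine opponents with identical profiles.
base = [(0.55, 0.50)] * 9
own_wp = 0.70  # held fixed, i.e. the extra game ends the same way either time

strong_conf_opponent = (0.80, 0.60)  # good team whose opponents are also good
weak_conf_opponent = (0.80, 0.45)    # equally good record, weaker opponents

with_strong = schedule_rpi(own_wp, base + [strong_conf_opponent])
with_weak = schedule_rpi(own_wp, base + [weak_conf_opponent])
print(round(with_strong, 4), round(with_weak, 4))
```

Both extra opponents raise the opponents' component equally; the only difference comes from the opponents' opponents component, which is why the opponent from the stronger league gives the higher value whether the game is won or lost.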
Summary and Conclusions: Working Within the System
Based on the observations made above, it seems like a good strategy would be to guard against home losses, and take
chances by playing tough opponents on the road, making sure that the opponents generally have strong opponents and
are not isolated good teams in weak leagues.
Supplements to the RPI
I generally like the RPI because it makes adjustments for strength of schedule and also adjusts to give increased
credit for wins on the road. Not only do these adjustments make it easier to compare teams in a fair way, but
perhaps they will encourage teams to adopt better scheduling practices (although it seems to me that the RPI doesn't
provide a strong enough incentive to cause teams to play fewer home games and more road games). For example, perhaps top teams, in an effort to
improve their RPI values, will start scheduling more challenging opponents.
But the RPI isn't perfect. It could be that it sometimes penalizes outstanding teams in weak conferences too much
(e.g., George Washington in 2006). Jeff Sagarin's ELO CHESS rating method is less punishing to top teams in weak
conferences (although perhaps it can err on the side of not being punishing enough). In general, ELO CHESS, like the
RPI, takes into account strength of schedule; it just does so differently. My guess is that neither method always
does better than the other, and so it may be a good idea to average the
rankings produced by both methods in developing a final ranking of teams.