The Bill James of polimetrics
by Tony Petrangelo
May 30, 2012, 7:00 AM

What the heck is PVI anyway?

Last week I posted about the different types of election numbers that people use in devising metrics for scoring political districts, always a fun topic! That post didn’t advocate for or against any one metric; rather, it was an argument for what types of data should be used in such a metric.

In this post I’m going to discuss the actual metrics that these data can create.

The one that most people are familiar with is Cook PVI, or just PVI. This is a description of PVI from Cook’s website:

Developed for The Cook Political Report by Polidata, the index is an attempt to find an objective measurement of each congressional district that allows comparisons between states and districts, thereby making it relevant in both mid-term and presidential election years.

While other data such as the results of senatorial, gubernatorial, congressional and other local races can help fine tune the exact partisan tilt of a particular district, those kinds of results don’t allow a comparison of districts across state lines. Only Presidential results allow for total comparability.

A Partisan Voting Index score of D+2, for example, means that in the 2004 and 2008 presidential elections, that district performed an average of two points more Democratic than the nation did as a whole, while an R+4 means the district performed four points more Republican than the national average. If a district performed within half a point of the national average in either direction, we assign it a score of EVEN.

Since it came out, PVI has been the standard tool for assigning a partisan score to a district. It’s not the end-all, be-all of analysis, but for a quick “bird’s-eye” view of a congressional district, PVI works pretty well and is easy to calculate.

Let’s take a look:

PA = Democratic share of the two-way Presidential vote nationwide, 2008
PB = Democratic share of the two-way Presidential vote nationwide, 2004

pA = Democratic share of the two-way Presidential vote in the district, 2008
pB = Democratic share of the two-way Presidential vote in the district, 2004

    \[ PVI = \left(\frac{pA+pB}{2}-\frac{PA+PB}{2}\right)\times 100 \]

Doing it this way, a positive score means a district is more Democratic than the nation, and a negative score means it is more Republican. You could use the Republican share instead and you would get the same results, only with the positive and negative scores flipped.

The key here is that only the two-party Presidential vote is being used, as opposed to the share of the entire Presidential vote. Here’s how to calculate the Democratic two-way vote share of an election:

d = Number of votes for Democratic candidate
r = Number of votes for Republican candidate

    \[ x=\frac{d}{(d + r)} \]
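For anyone who wants to play along at home, here is a minimal Python sketch of the two formulas above. The function names, the district vote counts and the national shares are made up for illustration and are not official figures. Rounding half up to get the final label is also my assumption, since Cook's description only says that anything within half a point of the national average is EVEN.

```python
def two_way_share(d, r):
    """Democratic share of the two-party vote: d / (d + r)."""
    return d / (d + r)


def pvi(pA, pB, PA, PB):
    """Signed PVI in points; positive means more Democratic than the nation."""
    return ((pA + pB) / 2 - (PA + PB) / 2) * 100


def format_pvi(score):
    """Render a signed score Cook-style: D+2, R+4 or EVEN.

    Assumption: scores are rounded half up to the nearest whole point.
    """
    if abs(score) <= 0.5:
        return "EVEN"
    party = "D" if score > 0 else "R"
    return f"{party}+{int(abs(score) + 0.5)}"


# Hypothetical district vote counts (not real returns).
pA = two_way_share(d=180_000, r=170_000)  # district, 2008
pB = two_way_share(d=160_000, r=175_000)  # district, 2004

# Illustrative national two-way shares (assumed values, not official figures).
PA, PB = 0.537, 0.488

print(format_pvi(pvi(pA, pB, PA, PB)))  # prints R+2
```

The signed number is kept separate from the label on purpose: the sign carries the party, so sorting or averaging districts works on the raw score, and the letter is only added for display.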

Prior to the recent redistricting, this is how Minnesota’s eight congressional districts rated out in PVI:

1 R+1 (Tim Walz)
2 R+4 (John Kline)
3 EVEN (Erik Paulsen)
4 D+13 (Betty McCollum)
5 D+23 (Keith Ellison)
6 R+7 (Michele Bachmann)
7 R+5 (Collin Peterson)
8 D+3 (Chip Cravaack)

PVI isn’t the only method out there, though. Nate Silver (of course) has his own version, which he calls PPI (Partisan Propensity Index):

I’ve been working for some time on developing an alternative to Cook’s Partisan Voting Index (PVI). Not that there’s anything wrong with PVI; it’s a pretty robust little metric. But, PVI is derived from voting in Presidential elections, whereas normally it’s used to help forecast, or contextualize, Congressional elections. Are there any systematic differences in the ways that votes tend to fall for the Congress, as opposed to the Presidency? Are certain districts better or worse for Democrats, or Republicans, than PVI alone would suggest?

It turns out that there’s one other factor which is fairly useful to look at, which is socioeconomic status. Relative to how they do for the Presidency, Democrats are somewhat more likely to win races for Congress in poorer districts, and somewhat more likely to lose them in wealthier ones. Another way to put this is that a split ticket of Republican for President, Democrat for Congress is more likely to occur in a poor district, whereas a split ticket of Democrat for President, Republican for Congress is more likely to occur in a wealthy one.

Silver’s metric uses only two inputs: the Democratic vote share in the most recent Presidential election and the percentage of people in the district who make less than $25,000.

The output of Silver’s PPI looks quite different from Cook’s PVI. Whereas PVI is formatted as a party letter followed by the partisan lean number, PPI gives you

the percentage chance that the Democrat would have won an open-seat race for Congress in a particular district given the conditions present, on average, between 2002 and 2008 (a period which conveniently featured two good cycles for Republicans and two good ones for Democrats).

Under Silver’s metric, here is how Minnesota’s congressional districts came out pre-redistricting (with each district’s corresponding PVI score in parentheses):

1 49.8% (R+1 Tim Walz)
2 9.4% (R+4 John Kline)
3 23.2% (EVEN Erik Paulsen)
4 96.9% (D+13 Betty McCollum)
5 99.9% (D+23 Keith Ellison)
6 5.3% (R+7 Michele Bachmann)
7 40.0% (R+5 Collin Peterson)
8 74.6% (D+3 Chip Cravaack)

You can see the influence that adding in the socioeconomic data has on the ratings of the districts. Under PPI, CD7 is given a higher probability of a Democratic win than either CD2 or CD3, even though it is more Republican according to PVI.

With most other districts, though, the two metrics agree; CD1 is a coin flip, and CDs 4 through 6 are basically locks for their respective incumbent parties.
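The comparison can be checked directly from the two lists above. This is a quick Python sketch, with variable names of my own choosing, that writes PVI as a signed number (positive for D, negative for R) and flags every pair of districts where the more Republican district by PVI gets the higher Democratic win probability from PPI.

```python
from itertools import permutations

# Signed PVI per district, taken from the list above (D positive, R negative).
pvi_by_cd = {1: -1, 2: -4, 3: 0, 4: 13, 5: 23, 6: -7, 7: -5, 8: 3}

# PPI per district, in percent, taken from the list above.
ppi_by_cd = {1: 49.8, 2: 9.4, 3: 23.2, 4: 96.9, 5: 99.9, 6: 5.3, 7: 40.0, 8: 74.6}

# Pairs (a, b) where a is more Republican than b by PVI,
# yet a has the better Democratic odds under PPI.
disagreements = sorted(
    (a, b)
    for a, b in permutations(pvi_by_cd, 2)
    if pvi_by_cd[a] < pvi_by_cd[b] and ppi_by_cd[a] > ppi_by_cd[b]
)

for a, b in disagreements:
    print(f"CD{a} vs CD{b}")
```

Besides the two CD7 pairs, this also flags CD1 against CD3: CD1 is R+1 and CD3 is EVEN, but PPI gives Democrats much better odds in CD1.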

There are other metrics as well, too many to cover in this post, but one more I’ll mention is the House Vulnerability Index, compiled by David Jarman of Daily Kos Elections (DKE):

Here’s a quick recap of how it works. Check out the chart of vulnerable Democrats below, which indicates that Bobby Bright is in the worst shape. Bobby Bright had the 3rd narrowest margin of victory of any Democrat (0.6%, behind only Tom Perriello at 0.2% and Scott Murphy at 0.4% in the NY-20 special), and he’s in the district with the 4th worst PVI of any Democrat (R+16, behind only Chet Edwards, Gene Taylor, and Walt Minnick). Add them up for a raw vulnerability score of 7, the worst of any Democrat. Slightly below him you might notice that LA-03 gets a margin of 0 (despite that Charlie Melancon won unopposed in 2008); that’s the tweak that I perform for all open seats. With PVI alone (R+12, 13th worst of any Dem-held seat), the raw score is 13, good for 3rd place.
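To make the arithmetic in that description concrete, here is a rough Python sketch of the rank-and-add idea: rank each seat by how narrow the last margin was, rank it by how hostile the district's PVI is, and add the two ranks, with open seats getting a margin rank of 0. The function name, the field names and the four seats are hypothetical. Ranking margins only among seats with an incumbent running is my assumption, and ties are not handled, so this should not be read as Jarman's actual implementation.

```python
def vulnerability_scores(seats):
    """Return (name, raw score) pairs, most vulnerable (lowest score) first."""
    # Rank 1 = narrowest previous margin; open seats are left out (assumption).
    by_margin = sorted((s for s in seats if not s["open"]), key=lambda s: s["margin"])
    margin_rank = {s["name"]: i for i, s in enumerate(by_margin, start=1)}

    # Rank 1 = district leaning hardest against the party holding the seat.
    by_lean = sorted(seats, key=lambda s: s["lean"])
    lean_rank = {s["name"]: i for i, s in enumerate(by_lean, start=1)}

    # Open seats get a margin rank of 0, so their raw score is the PVI rank alone.
    scores = {
        s["name"]: margin_rank.get(s["name"], 0) + lean_rank[s["name"]]
        for s in seats
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical seats. "lean" is PVI from the holding party's point of view,
# so a Democrat sitting in an R+16 district has lean -16.
seats = [
    {"name": "Seat A", "margin": 0.6, "lean": -16, "open": False},
    {"name": "Seat B", "margin": 0.2, "lean": -5, "open": False},
    {"name": "Seat C", "margin": 12.0, "lean": -12, "open": True},
    {"name": "Seat D", "margin": 8.0, "lean": 3, "open": False},
]

for name, score in vulnerability_scores(seats):
    print(name, score)
```

In this toy data the open seat comes out on top even though its last margin was comfortable, which is the effect of the open-seat tweak.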

Both Nate Silver’s PPI and DKE’s House Vulnerability Index either use PVI itself as an input or were largely inspired by it.

PVI paved the way for people like Silver to do things better, similar to the way Bill James paved the way for the multitudes of people who would follow him (like Nate Silver in a previous life) to make better baseball stats.

In a similar way, polimetrics is a constantly evolving field, with new and better stats always coming out. Local media outlets are even getting into the game: MinnPost has developed its own polimetric, which it is creatively calling PVI, for scoring state legislative districts.

If you’ve been reading my stuff for the last few years, you’ll know that I also have my own version of PVI, which I first introduced in July of 2010 and use for scoring Minnesota legislative districts. It’s called hPVI, or hybrid PVI, so you can see how much more creative I am than the MinnPost acronymologists: I added an “h” to my metric!

I’ll be rolling out the hPVI numbers for the new legislative districts over the next couple of weeks, so stay tuned!
