Put a math guy and a volleyball guy together and strange things happen.
The motivation, basically, was that we don't cover volleyball enough. Our label counts tell the story: as I write this, football has 673 posts, basketball 490, hockey 369, and then all the way down there at 42 is volleyball, behind the CFL (153) and at not even half of the NCAA (91). And last I checked, this isn't called The NCAA Blog. (If it were, we'd be doing a lot more work.)
However, unfamiliarity with players is always an issue when you're trying to cover a nationwide league where games often aren't televised or webcast. So resident volleyball nut Andrew Bucholtz and I spent a few hours one night hashing out a system for expressing a volleyball player's statistical contributions in terms of points and, ultimately, in terms of wins.
Many things volleyball players do are directly related to points, of course. A kill is by definition one point for their team, as is a block. An error is one point for the other team, so they should lose one point when they commit an error. But we need to consider the context: if the leaguewide hitting percentage is, say, .150, then one kill is worth just 0.85 points above what the average player would be expected to obtain (1 minus .150). This "above average" comparison is the foundation for these rankings, unimaginatively called the Bucholtz rankings in honour of their co-creator.
We established baseline rates for hitting percentage, blocks per set, and service aces per set for each conference and for each position. Adjustments by conference are required due to differences in level and style of play across the country: for example, the hitting percentage in Canada West was .182 last year, but just .159 in the OUA, a significant difference. And we adjusted by position to account for the higher conversion rates for middles, whose attacks are often unexpected (.219 hitting percentage nationwide) and who often record many more blocks than outsides (0.7 per set vs. 0.3). No positional adjustment was made with respect to service aces.
We should also point out that these rankings don't include digs or assists, so they aren't intended to evaluate setters or liberos, just middles and outside hitters. We've only looked at women's results so far, but the same methodology would easily apply to the men as well.
Take UBC's Liz Cordonier, this year's player of the year, as an example. Given her 527 attempts, we would expect her to have 77 kills minus errors, or a .146 hitting percentage. She actually had a .280 hitting percentage, meaning 148 kills minus errors. Thus, 148 minus 77 gives 71 points that Cordonier added above the average outside hitter in Canada West. We can do this for service aces (4 expected, 24 actual for Cordonier) and blocks (23 expected, 16 actual) and we find that she had a plus/minus of 84.5 points. To convert this to something more meaningful, we divided by 25 (points per set) and then by 3 (sets per match) to get unadjusted matches above average. Finally, we scaled the resulting numbers such that each team's total plus/minus for all of its players roughly added up to that team's record.
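For anyone who wants to follow the arithmetic, here is a rough sketch of the Cordonier example in Python. The variable names are ours for illustration and aren't taken from the actual spreadsheet. It uses the rounded figures quoted above, so it lands on 84 points rather than the 84.5 we got from the unrounded inputs, and it stops before the final scaling to team records, since that step depends on each team's totals.

```python
# Sketch of the plus/minus arithmetic for one player, using the rounded
# figures quoted in the post. All names here are illustrative assumptions.

POINTS_PER_SET = 25
SETS_PER_MATCH = 3

attempts = 527
baseline_hitting_pct = 0.146   # expected rate for her position and conference
actual_hitting_pct = 0.280

# Hitting: expected vs. actual "kills minus errors"
expected_attack = round(attempts * baseline_hitting_pct)   # 77
actual_attack = round(attempts * actual_hitting_pct)       # 148
attack_diff = actual_attack - expected_attack              # 71

# Service aces and blocks: actual minus expected
ace_diff = 24 - 4
block_diff = 16 - 23

plus_minus = attack_diff + ace_diff + block_diff

# Convert points to unadjusted matches above average
unadjusted_matches = plus_minus / POINTS_PER_SET / SETS_PER_MATCH

print(f"Plus/minus: {plus_minus} points")
print(f"Unadjusted matches above average: {unadjusted_matches:.2f}")
```

The same three differences (attack, aces, blocks) are computed for every middle and outside hitter, each against the baseline for her own conference and position.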
In the end, we are able to say Cordonier was worth 2.9 matches (or wins) above average, meaning if you took an average player off a 10-10 team and replaced her with Cordonier, you'd expect that team to go about 13-7. A +2.9 puts Cordonier 10th nationwide, and surprisingly third on her team: Kyla Richey (+3.0) and Jen Hinze (+3.1) both rank ahead. Tops in the country is Montreal's Nadine Alphonse (+4.9), with another Carabin at 17th in Laetitia Tchoualack (+2.3), the 2008 BLG Award winner.
The full women's rankings are here; note that anyone with very little playing time is considered average, because they didn't have enough time to contribute to their plus/minus. This isn't necessarily true, so take the results for anyone with fewer than 30 or 40 sets played with as much salt as you deem appropriate. Also note that the names are, for the most part, left as they were on the CIS site, so last names like "Vallee-V." and "St.georges" come with the territory.
The seven first-team all-Canadians were mostly at the top of these rankings: only Montreal's Alexandra Lojen (49th) and X's Catherine Thornton (105th) were out of the top 15. In Lojen's case, our decision to ignore assists hurts her, since the Montreal setter had nearly 600 of them. For Thornton, her lack of blocks keeps her down: only 0.12 per set, about one-third the baseline rate we compared her against. One could argue that it wasn't Thornton's job to block, but that's one of the issues we run into with a one-size-fits-all stat like this.
Questions are welcome in the comments below or by e-mail.
Nice work - good to see some people applying advanced statistical work to CIS sports.
Let me start by saying the work you have put forth is a good metric for evaluating talent and I am by no means going after the math (as it seems sound) but will raise some points about the data which has been used. This response does not serve as a dismissal of the work but rather hopefully brings some clarity to the results.
Note: all of the stats used come from the CIS WVB stats page
http://english.cis-sic.ca/sports/wvball/2009-10/players?sort=bt&view=&pos=&r=0
The final results of this project are only as good as the stats you are given to work with. In this case (as will be the case with any CIS work) the stats are at best inconsistent and at worst biased or incorrect. Lack of consistency and understanding of correct stat scoring has biased the system towards those with high amounts of blocks.
At a glance, 8 of the top 11 players in the spreadsheet score well in the blocks category with the majority of those players all suiting up as middle blockers.
This creates a number of interesting problems for the math:
1) A number of these players have had their stats inflated due to the incorrect classifying of blocks, namely the difference between solo blocks (1 point) and block assists (0.5 points).
The FIVB rule book states (section 14.1.4) that:
"A collective block is executed by two or three players close to each other and is completed when one of them touches the ball."
Having considerable experience in watching and scoring CIS women's volleyball, I can say that scorers often do not understand this rule. A block assist should be awarded for any play where two or more players make an attempt to block even if it is clear that one player is solely responsible for the block happening.
If we take a look at Nadine Alphonse's (of Montreal) numbers, it becomes clear that her #1 ranking was helped along immensely by her 52 solo blocks, the highest total of any player in the CIS. Furthermore, she is one of only three players in the top 25 to have more solo blocks than block assists.
There are only two explanations for these stats:
a) Either Montreal, as a team, does a poor job of blocking - leaving the onus completely on Alphonse
b) The stats compiled are incorrect, as the scorers were not aware of the proper way to award blocks vs block assists
Having seen Montreal play regularly, the most likely explanation is B. The Carabins are a very talented team that plays a high level of volleyball and most certainly does not rely on only one player for the majority of its blocks.
2) Delving deeper into the issue unveiled in section 1 it also becomes clear that players benefit from being on a team in a conference not called "Canada West". The CW, for the most part, does a very good job of taking stats in all sports thanks to the work of some very vocal SIDs past and present. By no means am I arguing categorically that their stat taking is unquestionable but there is some well earned respect in this field of work.
Looking at the top blockers in the nation, there are only seven of the top 25 that were awarded fewer than 10 solo blocks this season. Each of these players suits up for a team in Canada West. In addition to these seven, there are four other Canada West players who rank in the top 25 in blocks. This group ranges between 14-19 solo blocks, which ranks these players from eighth lowest to 12th lowest in solo blocks among top 25 blockers.
3) The effect of section 2 means that any player in Canada West is handicapped in the scoring system because of the lack of solo blocks given out by CW scorekeepers.
A good example of this phenomenon is that of UBC players Jen Hinze and Kyla Richey. Both of these players were among the top blockers in the nation; however, in the spreadsheet math their overall numbers are hurt by the fact that they were only awarded 7 and 6 solo blocks respectively. They did, however, rank tied for fourth in block assists with 66 apiece.
I will use Hinze and Alphonse to break down the blocking numbers.
Hinze
7 solo
66 assists
40 total (based on point values of 1 and 0.5)
73 "total" (disregarding point values)
14.7 total block score in spreadsheet
68 sets played
Alphonse
52 solo
46 assists
75 total (based on point values of 1 and 0.5)
98 "total" (disregarding point values)
51.9 total block score in spreadsheet
63 sets played
If we break down each of the total values by sets played, it illustrates how much weight the solo block carries.
Hinze (by set)
0.59 total (based on point values of 1 and 0.5)
1.07 "total" (disregarding point values)
(increase of 81%)
Alphonse (by set)
1.19 total (based on point values of 1 and 0.5)
1.56 "total" (disregarding point values)
(increase of 31%)
Using the traditional points method for blocks has Alphonse more than doubling Hinze but when we correct for stat keeper error, Alphonse's advantage shrinks considerably.
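For anyone who wants to check these figures, here is a small sketch of the calculation in Python. The function name and the `assist_weight` parameter are made up for illustration and are not part of the spreadsheet. Note that working from the unrounded totals puts Hinze's increase at roughly 82.5% rather than the 81% above, which came from the rounded per-set numbers.

```python
# Per-set block totals under two weightings of block assists.
# Names are illustrative; the stats are the ones listed above.

def per_set_blocks(solo, assists, sets_played, assist_weight=0.5):
    """Blocks per set, counting each assist as `assist_weight` of a block."""
    return (solo + assist_weight * assists) / sets_played

# (solo blocks, block assists, sets played)
players = {
    "Hinze": (7, 66, 68),
    "Alphonse": (52, 46, 63),
}

results = {}
for name, (solo, assists, sets_played) in players.items():
    weighted = per_set_blocks(solo, assists, sets_played, assist_weight=0.5)
    equal = per_set_blocks(solo, assists, sets_played, assist_weight=1.0)
    increase_pct = (equal / weighted - 1) * 100
    results[name] = (weighted, equal)
    print(f"{name}: {weighted:.2f} weighted, {equal:.2f} equal "
          f"(increase of {increase_pct:.1f}%)")

# Alphonse's advantage over Hinze under each weighting
ratio_weighted = results["Alphonse"][0] / results["Hinze"][0]
ratio_equal = results["Alphonse"][1] / results["Hinze"][1]
print(f"Alphonse/Hinze ratio: {ratio_weighted:.2f} weighted, "
      f"{ratio_equal:.2f} equal")
```

Setting `assist_weight=1.0` is the "all blocks equal" treatment; any value between 0.5 and 1.0 could be tried as a compromise.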
4) Sections 1-3 illustrate the value of solo blocks. It also points out the obvious variation in how this stat is recorded from coast to coast.
Recommendation: To correct for stat keeper inconsistency, all blocks should be considered equal. This would, in theory, handicap any extraordinary player that really did accomplish a high number of solo blocks, but in reality there is no such player in the CIS.
Would it be possible to redo the math with the blocking numbers altered to reflect the generic "total" category?
The next issue would be getting at hitting percentages and kills by middles (as opposed to outside attackers) but that is perhaps for another day.
@B+M: Great point on the unevenness of solo blocks vs. assisted blocks across conferences; that's something I hadn't noticed before, but it makes sense given CIS statkeeping. Rob's away on vacation, but when he comes back, we'll talk about perhaps tweaking that.
As for the middles/outside attackers, we looked at that, and we built a positional adjustment into the formula based on the average kill rate and block rate at each position. Rob can explain the math much better than I can, but suffice it to say that it is there. Were there any particular concerns you had with that?
If there is an adjustment that was written into the math, then it has probably been adequately accounted for.
My only concern was that middles, in the CIS at least, have consistently much higher hitting percentages than outside hitters. This can probably be attributed to the fact that middles' attacks differ in several ways:
1) Mostly attack when passing is good, so sets are likely better.
2) Speed of middle attacks, when well executed, is much harder to defend against than traditional sets to outside
3) Easier to maintain a higher percentage when taking fewer swings and most swings are happening on high percentage plays
4) Game plans usually account for outside hitters first, middles second
Middles do average slightly fewer kills per game due to lack of court time. Hopefully the adjustment in the math adequately captures this difference.
Was there any balance put in for the fact that middles also only play, on average, 65% of the points (50% at the net, and then guessing about 15% on their own serve)? If not, middles' value would only increase in the math.
Interestingly enough, despite the great numbers posted by the middles in this breakdown, I would guess most programs would take an outside hitter as their no. 1 pick in a fantasy draft. Their ability to play every point and affect the game from any spot on the court is a huge bonus over a middle that only sees the court 65% of the time. Teams can be at a real disadvantage in matches when it comes down to the wire and their best player, a middle, is sitting on the bench as they go down in flames.
Yeah, I agree with you on outside hitters probably being more valuable in a vacuum thanks to staying on the court in the back row. That isn't looked at in this, as we chose not to try to include defensive stats. Middles do have higher kill percentages and fewer kills per game (for the reasons you mentioned), and that is considered in the adjustments; Rob can speak to the specific mathematical details of those.
I would be curious to see the # of comments on each sport as opposed to just the tags you all have listed above in this article.
I enjoy the site, and seeing the total posts you list above had me wondering which of the sports draws more visitors to this site.
Thanks.
Better horrendously late than never, I always say.
To the last question - it's football, usually, that gets the most traffic. We tend to dip in the winter months, then by a lot in the April-July period (which is why we write about volleyball statistics!).
To the (legitimate) issues raised with the rankings, or rather the input into them, we'll definitely take a look at the block stats in more detail, see if we can correct for that.
As for middles, we corrected for the higher hitting percentages and higher block rates that players at those positions often have. Before doing that, we had middles all over the top of the list, even more than we do now. Andrew immediately realized we needed a better positional adjustment.
I think we're all on the same page here, but if you have any other concerns don't hesitate to share them.