Litvak's Linear Regression Statistic

From QBWiki

Paul Litvak's Linear Regression Statistic is a statistic created by Paul Litvak after discussion at Illinois Open with Andrew Yaphe about personal statistics. It attempts to quantify a player's ability based on the player's performance against teams of varying strength.


Calculation is relatively straightforward and can be done quickly with either a spreadsheet program (e.g. Excel) or a graphing calculator.

In one column (or list), input the winning percentages of the teams the player faced in each round. In a second column, input the number of points that player scored against that opponent in that round. Graph points scored vs. opponent winning percentage and perform linear regression. The statistic consists of the regression equation.
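The same calculation can be sketched in code. The following is a minimal illustration, not part of the original description: the function name `litvak_regression` and the sample numbers are made up, and opponent winning percentages are assumed to be entered as fractions between 0 and 1.

```python
def litvak_regression(opp_win_pct, points):
    """Ordinary least-squares fit of points = intercept + slope * opp_win_pct.

    opp_win_pct: winning percentage of the opponent faced in each round
                 (assumed here to be a fraction from 0.0 to 1.0).
    points:      points the player scored in the corresponding round.
    Returns (slope, intercept).
    """
    n = len(points)
    if n != len(opp_win_pct) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 rounds")
    mean_x = sum(opp_win_pct) / n
    mean_y = sum(points) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in opp_win_pct)
    if sxx == 0:
        raise ValueError("all opponents have the same winning percentage")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(opp_win_pct, points))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical five-round tournament (invented numbers, for illustration only).
opp_win_pct = [0.0, 0.25, 0.5, 0.75, 1.0]
points = [60, 50, 40, 30, 20]

slope, intercept = litvak_regression(opp_win_pct, points)
print(f"slope={slope:.1f}, intercept={intercept:.1f}")  # slope=-40.0, intercept=60.0
```

A spreadsheet's built-in slope and intercept functions, or a graphing calculator's linear regression mode, should give the same two numbers for the same input.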


According to Jerry Vinokurov, the intercept of the regression equation is "a rough measure of how many points you would score against your own teammates per round". This is a far better estimate when there are no winless teams in the field, since even winless teams can still get tossups. The intercept of the equation, then, roughly represents the player's breadth of knowledge.

The slope of the regression equation roughly indicates the depth of knowledge. A slope above or around zero indicates that a player has extremely deep knowledge of one or more core areas, which that player expects to get questions on every round unless faced with a similarly good specialist in that area. The same holds for players with very low intercepts, though in that case it more likely indicates deep knowledge of a few topics or answers rather than of entire subjects. A sharply negative slope indicates that a player has some command of giveaways and easier clues, but does not yet have the deeper knowledge to be able to compete with better teams.

Dividing the slope coefficient by the intercept gives a third figure, which indicates the percentage by which a player's points per game increase or decrease with opponent strength. Although this is usually thought of as a more accurate indicator of player depth, it can also be seen as a rough measure of a player's "clutch" ability to score points in an attempt to either win or keep a game close against good teams.
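As a rough illustration of this ratio, the sketch below divides a slope by an intercept. The function name `relative_slope` and the input values are hypothetical. It assumes winning percentage is expressed as a fraction from 0 to 1, so the result reads as the fractional change in scoring between a winless opponent and an undefeated one.

```python
def relative_slope(slope, intercept):
    """Slope divided by intercept: change in scoring relative to the intercept.

    Assumes winning percentage is a fraction (0.0 to 1.0), so the result is
    the fractional change going from a winless to an undefeated opponent.
    """
    if intercept == 0:
        raise ValueError("ratio is undefined when the intercept is zero")
    return slope / intercept


# Hypothetical regression result: 60 points at the intercept, slope of -40.
ratio = relative_slope(-40.0, 60.0)
print(f"{ratio:.1%}")  # -66.7%
```

Under these made-up numbers, the player's scoring falls by about two thirds across the full range of opponent strength. A ratio near zero would correspond to the flat slope described above.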

Shortcomings of the Statistic

The statistic does not adequately correct for players who play partial games, and the correlation coefficients of the regressions are very low (meaning that the regression lines fit almost none of the data). Ray Luo also noted that it would be possible to increase one's slope by letting teammates score points against weaker teams, and demonstrated this at ACF Fall with a ten-point performance against one of the weaker teams in the field. Paul Litvak admitted that the statistic was incomplete due to inadequate compensation for the shadow effect.

The relatively small number of data points available from any given tournament makes it uncertain whether the slope-intercept stat is actually showing some useful relationship, or merely interpreting fairly random results.
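One way to check how well a regression line fits a given player's rounds is the coefficient of determination (the square of the Pearson correlation coefficient). The sketch below is only an illustration: the function name `r_squared` and the data are invented, and treating this as the relevant goodness-of-fit measure is an assumption rather than something the original discussion specifies.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between xs and ys.

    Values near 1 mean the points lie close to a straight line;
    values near 0 mean the fitted line explains almost none of the variation.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r squared is undefined when either list is constant")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical four-round sample in which scoring bounces around
# with little relation to opponent strength.
opp_win_pct = [0.0, 0.5, 1.0, 0.5]
points = [40, 10, 30, 60]
fit = r_squared(opp_win_pct, points)
print(f"r squared = {fit:.3f}")
```

With only a handful of rounds, even a fairly high value here can arise by chance, which is the small-sample concern raised above.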


The post in which the statistic was introduced