Re: A little thought experiment

"Thus, for each team, P/20TH divided by the
number of losses (normalized based on a 13 game schedule
and adjusted for field strength) should be an
accurate method for ranking the "bubble" teams."

I am not an expert in statistics; however, I would be
strongly suspicious of that claim. In general:

[1] Saying that two effects are linear does not allow one to
claim that the ratio of the two effects should also be
a good predictor.

[2] There is no good rationale for simply "scaling" the
results up to 13 games when the number of games played is
less than 13.
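
To put a rough number on the concern in [2], here is a minimal sketch. It assumes, purely for illustration, that a team's losses follow a binomial distribution with a fixed per-game loss probability. The function name and the 0.5 default are hypothetical choices of mine, not part of the quoted proposal.

```python
import math

FULL_SCHEDULE = 13  # the 13-game schedule named in the quoted proposal

def rescaled_loss_sd(games_played, loss_prob=0.5):
    """Standard deviation of a team's losses after scaling up to FULL_SCHEDULE.

    Assumes losses ~ Binomial(games_played, loss_prob), an illustrative model only.
    """
    sd_losses = math.sqrt(games_played * loss_prob * (1 - loss_prob))
    # Scaling the loss count by 13/n scales its standard deviation by the same factor.
    return sd_losses * FULL_SCHEDULE / games_played

for n in (2, 6, 13):
    print(n, round(rescaled_loss_sd(n), 2))
```

Under these assumptions, the rescaled loss count for a team that played 2 games is roughly 2.5 times as noisy as for a team that played all 13, so the scaled-up denominator is least reliable for exactly the teams it is being applied to.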

[If, for example, the NE sectional had consisted of a double
round robin instead of a best 2-of-3 playoff, Yale, MIT, and
Williams would each have faced each of the others one
additional time. Would the results of the second set of
matchups have broken down the same way they did the first
time (Yale 2-0, Williams 1-1, MIT 0-2)? Quite possibly.
Would they always? Probably not. Both of MIT's losses
were on the last tossup; conceivably, if the matches
were replayed an infinite number of times, those three
teams would each have roughly equal numbers of wins
and losses. Therefore, should we penalize MIT and
Williams and reward Yale on the basis of statistically
small samples? Probably not. Dividing by losses, rescaled
to 13, accomplishes precisely that result.]
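
As a hedged illustration of the small-sample point, the sketch below enumerates every possible outcome of a single round robin among three teams. It assumes each game is an independent 50/50 toss-up, which is my simplification of the "roughly equal" scenario above and not a measured fact about these teams; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

TEAMS = ("Yale", "Williams", "MIT")

def clean_split_probability():
    """Chance that one round robin among TEAMS ends 2-0, 1-1, 0-2.

    Assumes every game is an independent 50/50 toss-up (illustrative only).
    """
    pairings = list(combinations(TEAMS, 2))  # three games, one per pair
    clean = 0
    total = 0
    # Each outcome picks index 0 or 1 as the winner of every pairing.
    for outcome in product((0, 1), repeat=len(pairings)):
        wins = {team: 0 for team in TEAMS}
        for pair, winner_index in zip(pairings, outcome):
            wins[pair[winner_index]] += 1
        total += 1
        if sorted(wins.values()) == [0, 1, 2]:
            clean += 1
    return Fraction(clean, total)

print(clean_split_probability())
```

Under that assumption, six of the eight equally likely outcomes produce a 2-0, 1-1, 0-2 table, so a split like that is what evenly matched teams would show most of the time and says little by itself about which team is stronger.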

[Furthermore, comparing "tournament standings" and win-loss
records only makes sense if all the SCT tournaments use the
same structure: round-robins only. With any elimination
structure beyond that, teams will have unequal numbers of
meetings, which makes it difficult to say that one team is
better than another. To see this, consider a tie for
fourth ahead of a four-team single-elim playoff. If the team
that wins the tie-breaker then loses in the first
round of the playoff, as it (normally) should, it
accrues an extra loss the other team did not
have.]

(continued . . .)
