On measuring difficulty [II of II]

First off, the packets ranged in difficulty from
over 600 points per 20 TUs (Virginia Tech--which was
why I decided to use it as the "leadoff" packet) to
under 250, but about 2/3 of them were within a standard
deviation of the mean.
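
For readers who want to run the same check on their own packet data, here is a minimal sketch. The scores below are hypothetical placeholders, not the actual Penn Bowl numbers, and the function name `share_within_one_sd` is made up for illustration. For roughly bell-shaped data, about 68% of values fall within one standard deviation of the mean, which is in line with the "about 2/3" figure above.

```python
from statistics import mean, pstdev

# Hypothetical packet averages in points per 20 TUs.
# These are illustrative values only, NOT the real tournament data.
packet_scores = [610, 450, 400, 380, 350, 340, 320, 300, 280, 240]


def share_within_one_sd(scores):
    """Return the fraction of scores within one standard deviation of the mean."""
    mu = mean(scores)
    sd = pstdev(scores)  # population standard deviation of the packet set
    inside = [s for s in scores if abs(s - mu) <= sd]
    return len(inside) / len(scores)


if __name__ == "__main__":
    print(f"mean: {mean(packet_scores):.0f} points/20 TUs")
    print(f"share within one SD: {share_within_one_sd(packet_scores):.0%}")
```

With only a dozen or so packets, one or two outliers (such as a 600-point packet) can shift this share noticeably, so the result is only a rough indicator.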

The most interesting result,
though, was that I *consistently* underestimated
the difficulty of the packets. It goes
without saying that I will do better, on the whole, in
certain subjects, and worse in others. I did not
expect, however, the net difference between my estimates
and the actual scores to be positive.

Moreover, the size of the underestimate was fairly steady,
at **80-100 points/20 TUs:** while my estimated
scores averaged in the low 400s, the actual average
scores were in the low 300s. [The average score I had in
mind was somewhere around 380-420 points/20 TUs. This
assumed a TU conversion rate of ~85%; the
actual TU conversion rate at PB10 was ~72%, which I find
disappointing.]
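
The comparison above boils down to the average signed gap between estimated and actual scores. A rough sketch of that arithmetic follows, using made-up (estimated, actual) pairs rather than the real PB10 figures; `mean_overestimate` is a hypothetical helper name.

```python
from statistics import mean

# Hypothetical (estimated, actual) packet averages in points per 20 TUs.
# Illustrative values only, chosen to resemble the ranges described in the post.
pairs = [(420, 330), (400, 315), (410, 310), (390, 305)]


def mean_overestimate(pairs):
    """Average of (estimated - actual); a positive value means scores were overestimated."""
    return mean(est - act for est, act in pairs)


if __name__ == "__main__":
    gap = mean_overestimate(pairs)
    print(f"average overestimate: {gap} points/20 TUs")
```

A gap that stays positive across nearly every packet, rather than averaging out near zero, is what marks the error as systematic rather than subject-by-subject noise.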

In my mind, this does *not* represent a
flaw in the system, but rather a discrepancy between
my knowledge base and that of the circuit as a
whole. It means, among other things, that I did not do
quite as good a job as I would have liked of keeping
the difficulty at a reasonable level. I tried to push
all questions towards a common medium, making
easy questions harder and hard questions
easier.

For Penn Bowl 11, I will still do that to some
extent, but I will focus more on reducing the high end of
the difficulty spectrum than on raising the low
end.

--STI
