Re: Comments on ICT

R. Robert Hentzel <*topquark_at_rhentzel.yahoo.invalid*> · Wed, 16 Apr 2003 21:06:49 -0000

> However, I do agree with much of what the poster said with regards to
> the questions.  I enjoyed the tournament very much and definitely
> thought the questions were better than what I heard last year.  I
> thought NAQT actually had the difficulty almost perfect, and there
> did seem to be more academic stuff in most of the rounds than I'd
> expected (before someone from NAQT posts and says the distribution is
> always the same, let me just say that this was a personal
> impression).

Subash --

Fair enough.  I won't mention it. :-)

[General dislike of general knowledge snipped along with list of 
enjoyed questions]

> What I would like to take issue with, is what has mostly been brought
> up already.  Probably half or more of the power TU's I had this past
> weekend were "fraudulent," meaning that they were on questions that I
> thought were not pyramidally written.  Now I don't mean to list these
> as an indirect means of self-aggrandizement as Nathan Freeburg
> constantly feels the need to do; I simply feel that they make my
> point.  Among those TUs that come to mind (I'll skip the egregious
> ones the earlier poster mentioned) were a Hyksos TU that
> mentioned "shepherd kings" in the first sentence; a shaky Mimir TU; a
> poor The Nose TU giving major plot immediately; a Satyricon TU that
> had one of the major characters as the second clue; a Gunter Grass TU
> that had his most recent novel as the first clue; a TU on The Wasps
> that had one of its two main characters within the first five words;
> a TU on Vico that started by listing his four stages of history; a
> particularly awful TU on townships that referred to its use in the
> acronym SOWETO; and the list goes on.

In my following comments, I don't want to be seen as defending each 
and every one of NAQT's ICT questions as a flawless jewel; NAQT 
certainly acknowledges that some had problems, including some of 
those that have been mentioned by you and naqtrauma.  This post is
not primarily about evaluating or acknowledging those mistakes,
though I'm certainly willing to discuss individual questions on
the Yahoo! club or in private on behalf of NAQT.

The principal issue that you and naqtrauma (and a few others in 
subsequent posts) have raised is an overwhelming prevalence of 
questions that begin with clues that are too easy; later comments in 
your post claim that 1/3 of the academic questions at the ICT began 
with what were, effectively, giveaways.  This is the larger, and more 
important, issue to which I want to respond, because I think that it 
may stem from a difference in philosophy about what the goals of a 
set of questions should be.

NAQT writes its questions to obtain 85% tossup conversion, 50% bonus 
conversion, and 16% power conversion for the tournament, a criterion 
that is necessarily based around the average team's ability.  NAQT 
does not tune its sets to the abilities of the top teams alone; 
scoring points is necessary both for differentiating teams and for
enjoying quiz bowl, and this is true of teams near both the top and
the bottom of the standings.

With that in mind, the overall difficulty of the ICT (Division I)
sets was close to what we were looking for:  78% tossup conversion,
49% bonus conversion, 15% power conversion.  Statistically, I don't
think that there is good reason to believe that the sets were
ill-suited in difficulty to the field.

The obvious rejoinder is that questions that are too easy do not 
actually differentiate the top teams in the tournament since buzzer 
races are effectively random.  While true in principle, I don't see 
any statistical reason to think that this was an issue at the 2003 
ICT.  Of the 83 games between teams that finished in the top half of 
the field ("top teams"), only 12 (14%) resulted in upsets with 
respect to the standings after round 11.  Three-quarters of these 
were upsets of either one or two places in the final standings; no 
upset among those teams was more than eight places.  These results 
seem quite reasonable and in accordance with what would be expected 
from fair questions of the appropriate difficulty.
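
For anyone who wants to repeat this kind of check on their own
results, here is a minimal sketch of how an upset count can be
computed.  It assumes an "upset" means the winner was ranked below
the loser in whatever standings are used; the function name, team
names, and sample games are hypothetical and are not ICT data.

```python
def upset_rate(games, rank):
    """Count upsets among games.

    games: list of (winner, loser) tuples
    rank:  dict mapping team -> standing (1 is best)
    An upset is a game whose winner has a worse (larger) standing.
    """
    upsets = [(w, l) for w, l in games if rank[w] > rank[l]]
    return len(upsets), len(upsets) / len(games)


# Hypothetical standings and results, for illustration only.
rank = {"A": 1, "B": 2, "C": 3, "D": 4}
games = [("A", "B"), ("C", "B"), ("A", "D"), ("B", "D")]

count, rate = upset_rate(games, rank)
print(count, f"{rate:.0%}")  # 1 upset out of 4 games: prints "1 25%"
```

The same ratio applied to the figures above is 12/83, which rounds
to 14%.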

For what it's worth, using the post-powermatching (round 11) rankings
rather than the overall (round 15) rankings makes the situation seem
worse; whatever its flaws, the overall ranking *better* reflects the
outcome of games in the sense that there were fewer upsets relative
to its ordering.  I am using the round 11 rankings to make the
numbers as bad as possible and because there is a general perception
among those critical of the questions that they were a more accurate
reflection of teams' abilities.

Finally, looking at the questions from the first six rounds of the 
tournament (the only rounds for which I have this data), there were 
only six that were powered in 50% or more of the rooms and only 15 
powered in more than one-third of the rooms.  The former six could be 
categorized, quite correctly, as flawed for being too easy, but they 
represent approximately 4% of the questions heard Friday night.
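
As a rough sketch of the "power-clump" count described above, the
two thresholds (powered in 50% or more of the rooms, and powered in
more than one-third of the rooms) can be applied like this.  The
data layout, function name, and sample numbers are assumptions for
illustration and are not the actual ICT room data.

```python
def power_clumps(rooms_powered, rooms_total):
    """Return question ids powered in >= 1/2 and in > 1/3 of the rooms.

    rooms_powered: dict mapping question id -> number of rooms in
                   which that question was powered
    rooms_total:   number of rooms that heard the questions
    Integer comparisons are used to avoid floating-point edge cases.
    """
    half = [q for q, n in rooms_powered.items() if 2 * n >= rooms_total]
    third = [q for q, n in rooms_powered.items() if 3 * n > rooms_total]
    return half, third


# Hypothetical counts for four questions heard in 12 rooms.
rooms_powered = {"q1": 6, "q2": 5, "q3": 4, "q4": 1}
half, third = power_clumps(rooms_powered, rooms_total=12)
print(half, third)  # ['q1'] ['q1', 'q2']
```

Note that a question powered in exactly one-third of the rooms
("q3" here) is not counted, matching the "more than one-third"
wording.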

NAQT *certainly* agrees that some of the tossups (e.g. Mandelbrot 
set) began with clues that were too easy.  We will definitely work on 
eliminating that for next year.  On the other hand, given the 
principal goal of writing a set that successfully differentiated 
teams of all abilities--including the top ones--I don't see any 
statistical reason to think that it wasn't achieved.

Many of the tossups that have been cited were not, in fact, powered 
in very many rooms (though a few certainly were), and not all claims 
about where clues appeared are correct; some listed as "in the first 
line" are actually in the second, or even third, line behind a number 
of other clues.

To summarize, NAQT is disappointed if anybody, on any team, found the 
questions to be unfair, too easy, too hard, or otherwise not up to 
his or her expectations; certainly we listen to all criticism and 
will work hard to fix flaws that are identified for next year.  Some 
of our ICT questions (Miro) were bad.  Some should have been 
researched more (crystallographic groups).  Some started with a clue 
that was too easy (Finger Lakes).  But, having looked at the overall 
conversion, upset, and power-clump rates, I don't see that any 
criticism of the set as a whole as "too easy for the top teams," "too 
powerable," or "too easy overall" is justified.

-- R. Robert Hentzel
President and Chief Technical Officer,
National Academic Quiz Tournaments, LLC