Difference between revisions of "Difficulty"

Revision as of 13:59, 27 August 2023

Difficulty can refer to either or both of the following:

How hard the questions at the tournament were for the players to answer, as measured either subjectively by the players themselves or objectively through conversion statistics.
How hard the writers or editors of the tournament expect the questions to be, by analogy to a previously-played tournament or general standard. This is often denoted target difficulty.

Theory

Difficulty is a concept that can be applied to an entire tournament, a specific packet, one question, or even a single clue. The ideal standard for difficulty would be an objective assessment of how well-known a fact (or collection of facts) is among the quizbowl community or the wider populace. This is, of course, impossible - determination of difficulty is thus a combination of various imperfect measures.

Subjective

Personal perception of difficulty is the most immediate and straightforward way to determine how hard something is. However, it is inevitably biased by the idiosyncrasies of one's knowledge base - it is possible (and indeed, fairly common) for someone to only know the leadin of a tossup or the hard part of a bonus. The pitfalls of this approach may be succinctly summarized in the aphorism "If I know it, it's too easy; if I don't, it's too hard" (sometimes called the fundamental difficulty error).

Joining together multiple opinions can reduce these fluctuations and produce a single consensus on difficulty; any overarching biases can then be viewed as broad community tendencies, though it's rare that these are obvious. Creating this sort of aggregation is one function of post-tournament discussion, along with identifying errata and talking about other aspects of the set (answer choice, distribution/subdistribution, writing philosophy, logistics, etc.).
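
As a rough illustration of how pooling opinions damps individual idiosyncrasies, the sketch below summarizes a handful of subjective ratings by their median and spread. The 1-to-5 rating scale and the function name are assumptions made for this example and are not an established quizbowl convention.

```python
from statistics import median, pstdev


def consensus(ratings):
    """Summarize subjective difficulty ratings (hypothetical 1-5 scale).

    Returns (median, population standard deviation). The median resists a
    single outlying opinion; the spread shows how much raters disagree.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    return median(ratings), pstdev(ratings)


# Five players rate the same tossup; one found it far harder than the rest.
center, spread = consensus([2, 3, 3, 4, 5])
print(center, round(spread, 2))
```

A large spread suggests that the ratings reflect individual knowledge bases more than any shared sense of difficulty.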

Objective

There is a pervasive notion that the difficulty of a set (or a question, or a clue) may be distilled into a single numerical value which describes how hard it was on an "absolute" scale. Statistics like PPB, power rate, and BPA are examples of objective quantifiers that are often employed for this purpose, though it is more accurate to say that they describe how well the field did. Nevertheless, hard data are very useful for talking about and comparing difficulties.
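
A minimal sketch of two of these statistics follows, assuming PPB means bonus points divided by bonuses heard and power rate means powers divided by tossups heard. The TeamLine record and its field names are hypothetical and do not correspond to any particular stats program.

```python
from dataclasses import dataclass


@dataclass
class TeamLine:
    """Hypothetical per-team tally for one tournament."""
    tossups_heard: int
    powers: int
    bonuses_heard: int
    bonus_points: int


def points_per_bonus(lines):
    """PPB across one or more teams: total bonus points / total bonuses heard."""
    heard = sum(line.bonuses_heard for line in lines)
    if heard == 0:
        return 0.0
    return sum(line.bonus_points for line in lines) / heard


def power_rate(line):
    """Fraction of the tossups a single team heard that it powered."""
    if line.tossups_heard == 0:
        return 0.0
    return line.powers / line.tossups_heard


strong = TeamLine(tossups_heard=200, powers=30, bonuses_heard=100, bonus_points=1850)
weak = TeamLine(tossups_heard=200, powers=10, bonuses_heard=50, bonus_points=600)

print(points_per_bonus([strong]))        # one team's PPB
print(points_per_bonus([strong, weak]))  # PPB pooled over both teams
print(power_rate(strong))
```

The pooled figure changes whenever a team is added to or dropped from the list, which is the sense in which these numbers describe the field rather than the set alone.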

Assertions about how hard something was are tacitly assumed to be based on this sort of evidence, as relying purely on one's personal perception requires an undue amount of generalization. Such statements may be phrased as if it were possible to determine something's absolute difficulty ("this question is too hard...") but any potential confusion from this grammatical convention can be alleviated by making oneself clear ("...because it wasn't powered at any site").

Absolute

There are various metrics which are sometimes used to approximate an absolute measurement of difficulty. These are largely unorthodox, as it is accepted that conventional methods like measuring PPB are very field-dependent:

Wikipedia page views
various Google statistics (Trends, Ngram Viewer, search results)

While it is true that these are free of many of the assumptions that plague other statistics, this cuts both ways: measures like these are not particularly useful because they are so far removed from the actual experience of playing the game.

Relative

Despite how nice it would be, it is generally only possible to determine the relative difficulty of a clue/question/set for numerous reasons:

The composition of a field can have considerable impacts on stats - for instance, the absence of strong players in a category can depress power numbers, making the subject appear more difficult.
The difficulty of individual clues can be skewed considerably by appearing in other sets frequently, recently, or both. This is a major factor in why some questions play significantly easier after time has passed: inclusion of a piece of information into the canon makes it significantly easier for players who pay attention to it.
Without some sort of comprehensive survey of all players, any single value will be incomplete in describing how difficult the community as a whole finds something.
Even a perfect description of something's difficulty within the game will not be able to describe how hard it is in broader society. It is known (and frequently commented on) that the demographics of quiz bowlers are substantially different from the general population. Despite this, metrics like difficulty and importance are often pinned to how well the average person (from the street or in a field) would know it.

One can know the direction in which stats have been skewed by these factors, but not the precise magnitude. These factors are often small enough (and the bins of "difficulties" are large enough) that most observers will broadly agree - for instance, even though some collegiate two-dot sets are harder than others, they are almost always closer to one another than they are to three-dot sets.

Regular difficulty

Main page: Regular difficulty

Regular difficulty is the normative difficulty for questions at a given level of quizbowl. Theoretically, it represents the difficulty level at which any eligible closed team across the whole range of skill levels can play meaningful games against any other eligible team. For example, a regular-difficulty high school set should have a distribution, selection of clues/answers, etc. that allows the more knowledgeable high school team in a given match to consistently win,^[1] regardless of whether it's a match between weak teams, average teams, or strong teams.

In practice, regular difficulty sets may not align with the optimal difficulty for the population of active teams, especially among the subset that are nationally competitive. This can skew either way: in high school, regular difficulty (as set by IS sets) is often considered "too easy", while in college, regular difficulty (currently still set by ACF Regionals) is often considered "too hard".

College Level

See also: Collegiate difficulties

At the college and open levels of quizbowl, the four main general standards of difficulty (in increasing order of difficulty) are: novice, regular, nationals, and post-nationals. The first three levels roughly (but not exactly) correspond to the difficulty level of previous ACF Fall, ACF Regionals, and ACF Nationals sets, respectively; the fourth is reserved for anything harder than ACF Nationals.

There have been efforts to reframe "regular difficulty" as something easier than ACF Regionals, which would be described as "Regionals difficulty" instead. ACF Winter, the ACF tournament intermediate in difficulty between Fall and Regionals, lies in this range and returned after a ten-year hiatus in 2020.

Ophir Lifshitz has created a four-dot difficulty scale to remove ambiguities in difficulty terminology.

High School Level

At the high school level, HSAPQ tournament sets and NAQT IS sets are considered the standard for regular difficulty. Most other sets are described in terms of how much easier or harder than these sets a tournament is expected to be. HSQBRank keeps a set of "stat adjustments" that measures the difficulty of different packet sets: NAQT IS sets are set to zero, while more positive numbers indicate more difficult sets and more negative numbers indicate easier sets.
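
The sketch below shows one way such adjustments could be applied, assuming they act as additive offsets to a team's raw PPB. That assumption, the function name, and every set name and value other than the zero for NAQT IS sets are hypothetical and are not taken from HSQBRank.

```python
# Hypothetical adjustment table: only the 0.0 for NAQT IS sets comes from the
# text above; the other entries are invented placeholders for illustration.
SET_ADJUSTMENTS = {
    "NAQT IS": 0.0,
    "Harder example set": 2.0,    # positive: harder than IS sets
    "Easier example set": -1.5,   # negative: easier than IS sets
}


def adjusted_ppb(raw_ppb, set_name, adjustments=SET_ADJUSTMENTS):
    """Shift a raw PPB onto an IS-set baseline (assumed additive offset)."""
    return raw_ppb + adjustments[set_name]


# The same raw PPB counts for more on a harder set and for less on an easier one.
for name in SET_ADJUSTMENTS:
    print(name, adjusted_ppb(18.0, name))
```

Under this reading, performances on different sets become comparable on a single scale anchored to IS sets.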

Middle School Level

At the middle school level, NAQT MS sets are considered the standard for regular difficulty. The lower number of middle school sets means that difficulty is often pinned to high school sets.

References

[1] Some thoughts on the distribution and regular difficulty by Sen. Estes Kefauver (D-TN) » Sat Nov 13, 2010 9:09 pm

