Re: Difficulty levels of Colorado tests
- Subject: Re: Difficulty levels of Colorado tests
- From: Victor Steinbok <Victor.Steinbok@VERIZON.NET>
- Date: Mon, 28 Jan 2002 11:53:14 -0500
- In-reply-to: <email@example.com>
- Reply-to: Assessment Reform Network Mailing List <ARN-L@LISTS.CUA.EDU>
- Sender: Assessment Reform Network Mailing List <ARN-L@LISTS.CUA.EDU>
At 9:28 AM -0500 1/28/02, Deanna M. De'Liberto wrote:
Ah! A blast from the past!
> To clarify, proficiency on a norm-referenced test is set at the 50th percentile.
True, very true--it is usually set at that, however arbitrary the
decision to classify half the population as "deficient" may be.
> However, with criterion-referenced tests, such as the ones being
> given in Colorado and other states, the proficiency level could be ANYTHING.
I would have chosen ARBITRARY rather than ANYTHING, since it more
aptly describes the actual process. In Massachusetts, the people who
set out to determine proficiency levels on MCAS were given initial
instructions to make as many students fail as possible--I kid you
not! There were two groups who argued for this--some progressives
expected the test to be closer to the authentic assessment models and
hoped that failing scores would drive schools to reform, and John
Silber and his Cohort of Nefarious Sycophants, who claimed they just
wanted to reflect international comparisons, like TIMSS. In the end,
the progressives got screwed, along with the students. Silber's
argument is ludicrous because Massachusetts students did BETTER on
TIMSS than most of the rest of the world--in fact, they did much
better on TIMSS than they are doing on MCAS, DESPITE the 50% increase in
passing rates in one year (and this is ONLY at grade 10--other grade
levels did not see the passing rates rise).
> If you are asking whether the proficiency level should be that high, the
> answer is "It depends on what the purpose of the test is."
Well, actually, does it NOT depend on the NATURE of the test as well?
If it is indeed a minimal competency test (not just alleged to be
one, but one that actually covers only the most basic tasks), it is
conceivable that the proficiency scores can be set high.
> On the road test,
> the proficiency level is set very high and no one has complained about that.
> It all depends on what is being measured and what you are going to do with
> the results.
In California, when I took the "written" test (I already had a
license, so no road test was necessary), you were allowed no more
than two wrong answers out of, I believe, 60 multiple-choice questions
(although I do not recall the exact length of the test)--a cut score
of roughly 97%. You were also allowed to keep your copy of the test
if you failed (with huge crosses on the wrong answers) and to retake
the test (I also do not recall whether there was a cap on the number
of retakes, but it was at least three). The beauty
of it is that the test never changed, so you had people coming in who
were practically illiterate, but with little cheat-sheets that gave
them the correct answers to copy--as long as you could differentiate
A, B, C and D, you were all set. As I recall, the Illinois procedure
was not particularly different, although the tests were not to leave
the premises. Still, there were cheat-sheets circulating at least
among the Russian- and Polish-speaking communities, so that
immigrants with not one word of English could pass the test. If you
consider the implications, you might not want to go out on the road.
> But your questions asked about how to determine the difficulty level of tests
> and that, at least in my mind, is NOT the same as proficiency level.
Deanna is right, of course--proficiency levels and difficulty levels
are NOT the same, BUT they ARE inextricably linked to each other. One
cannot possibly set a proficiency level without CONSIDERING the
difficulty level of the test--something that is, regrettably, rarely
done with state tests. But even this consideration does not mean that
the set levels are not arbitrary (see above).
> I initially thought you were asking about reading levels (which should be set
> at two grade levels below the grade level the test is designed for), and there
> are a number of formulas for determining reading level. Difficulty for tests
> should take reading level into account, but from a more statistical angle,
> difficulty is determined by the proportion of students answering an item
> correctly... thus the more students that answer an item correctly, the
> easier the item is, and the more students answering the item incorrectly, the
> more difficult the item is.
Spoken like a true testing robot. :-) This is exactly the trouble
with psychometric "difficulty" levels--in the classroom, I teach my
students to answer difficult questions, yet some numbskull can tell
me that BECAUSE they answer these questions, the questions are easy.
This is one reason for the Harvard grade inflation flap. Sure, there
is no ABSOLUTE difficulty, but defining it in terms of statistics of
how many students answer the question is pure behaviorist
nonsense--and <this for George Cunningham's benefit> it has very
clear lineage to the behaviorists!
> One can then find the average difficulty of the
> test based upon each item's difficulty. But again, this is different from
> determining proficiency levels, which is more of a standards-setting thing
> (and I could write volumes on that subject).
Then it becomes the chicken-and-egg question. Again, the MCAS passing
rates increased nearly 50% in one year. The DoE said the tests were
comparable and that the cut-off scores were only lowered by one question
(where "only" could have easily accounted for a very large chunk of
the increase, since every student sitting one question below the old
cut now passes), but they vehemently opposed the suggestion that the
test was easier. Go figure!
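
For the record, here is what that statistical notion of difficulty
amounts to in practice. This is a minimal sketch in Python (my choice
of language; the response data are invented purely for illustration),
not anything taken from an actual testing program:

    # Classical item difficulty, as described above: the p-value of an
    # item is the proportion of students answering it correctly, and
    # the test-level figure is the mean of the item p-values.
    # rows = students, columns = items (1 = correct, 0 = incorrect)
    responses = [
        [1, 1, 0, 1],
        [1, 0, 0, 1],
        [1, 1, 1, 1],
        [0, 1, 0, 1],
    ]

    num_students = len(responses)
    num_items = len(responses[0])

    # p-value per item: the HIGHER the number, the "easier" the item
    item_p = [sum(row[i] for row in responses) / num_students
              for i in range(num_items)]

    # average test difficulty: the mean of the item p-values
    average_p = sum(item_p) / num_items

    print(item_p)      # [0.75, 0.75, 0.25, 1.0]
    print(average_p)   # 0.6875

Note what the calculation does NOT contain: anything whatsoever about
the content of the items. That is precisely my complaint above--an
item is "difficult" only because students miss it, never because of
what it asks.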