
Re: The elimination of items when too many students get them correct

Oh, yes, here is the George Cunningham memorial Web site, once again, that summarizes
the previous discussion among Victor, Geo C'ham, & me.

Thank you, George; w/this memorial site looking over your shoulder, I see you've
taken the time to be less strident and authoritarian in your views in this
explanation... ;-} rap.

The Relationship between Criterion, Domain-Referenced, and Norm-Referenced Testing:

George Cunningham wrote:
> Karen, Ken, Art, etc.
> Let me see if I can add my 2 cents' worth here.
> The question of whether items that most students get correct should be
> eliminated or left in the test depends on the purpose of the test. I think
> that is what Karen and several others are pointing out.
> I took some heat a while back when I provided my explanation for the
> difference between NRTs and CRTs. To understand the rationale for item
> selection, this distinction is crucial. I believe that norm-referenced test
> scores are interpreted by comparing students, either by rank ordering them,
> assigning percentile ranks, or interpreting them in terms of their
> relationship with the total score (standard scores). With NRTs, items with
> high p-values (most students get the item correct) are not useful because
> they do not tell us anything about how students compare. It is not the bell
> curve that test developers are seeking so much as high item-total score
> correlations, which occur only when there is variability in student
> responses, and that variability disappears if the item is too easy. Good
> item-total correlations lead directly to high coefficient alpha reliability,
> which is the holy grail of the developers of NRTs.
> As I define them, CRTs are intended to provide information about what a
> student knows and does not know in terms of mastery of objectives. If the
> purpose of an objective is to determine whether a student knows single-digit
> multiplication, and most students get all of the items correct, you know
> they have mastered the objective. This should please the test developer. He
> or she should not be concerned about p-values, item-total correlations, or
> coefficient alpha.
> Unfortunately, the definition of criterion-referenced testing is clouded.
> When words are used for communication, clarity of meaning is important, but
> words are also used to persuade. The public is skeptical of NRTs but tends
> to view CRTs in a more positive light. It is therefore to the advantage of
> politicians, departments of education, and publishers to use a broad
> definition of criterion-referenced testing. They always want to label their
> tests criterion-referenced because that seems more useful and more fair.
> There is no ultimate authority on what words in
> specialized fields mean. All one can do is appeal to reason and the various
> sources. The closest thing we have in measurement to an accepted source of
> knowledge is the _Standards for Educational and Psychological Testing_
> (Joint Standards) published by APA, NCME, and AERA. Here is how they define
> a criterion-referenced test:
> "A test that allows its users to make score interpretations in relation to a
> functional performance level, as distinguished from those interpretations
> that are made in relation to the performance of others. Examples of
> criterion-referenced interpretations include comparisons to cut scores..."
> According to this definition any time you do not have an overt NRT like an
> achievement test where scores are reported as percentile ranks, you have a
> criterion-referenced test. I think this is a useless definition. I think I
> know a lot about criterion-referenced testing, but just about everything I
> know comes from the writing of James Popham. He didn't invent
> criterion-referenced testing (that is usually attributed to an article
> written by Glaser in 1963), but he is the one who has written most about it
> and did the most to popularize it. Here is his definition, and I like it
> much better than the one used in the Joint Standards:
> "A criterion-referenced test is used to ascertain an individual's status
> with respect to a defined assessment domain....So much for the technical
> definition. Put more simply, a criterion-referenced test lets us know what
> an examinee can or can't do. The really distinguishing feature of a
> criterion-referenced test is the clarity with which it describes whatever it
> measures" (_Modern Educational Measurement: A Practitioner's Perspective_,
> p. 27).
> I think this is a much more useful definition than the one found in the
> Joint Standards. According to Popham's definition, the items for norm- and
> criterion-referenced tests are not interchangeable, because the difficulty
> of items for NRTs is selected to maximize test variability, while that of
> items on CRTs is set to measure a specified goal level of student performance.
> I don't think states should palm off NRTs as absolute-standard tests or CRTs.
> If a state is bragging about p-values, item-total correlations, and
> coefficient alphas, it either has an NRT or it is using inappropriate
> test methodology. The two states that come closest to sticking with
> absolute standards seem to be Virginia and Massachusetts; Virginia claims
> to be using a CRT, but it reports NRT test statistics.
> A good example of the failure of the Joint Standards definition is the
> California High School Exit Exam. According to the Joint Standards it is a
> criterion-referenced test. The original cut-score was to be 70% until
> they realized that this would result in too high a failure rate. They
> lowered it to 65% for language arts and 55% for math. By looking at the
> results and adjusting the cut-score they were making norm-referenced
> decisions. It clearly does not meet Popham's definition of
> criterion-referenced testing.
> If anyone is still awake, I will follow this up with some information on
> state testing.
> George Cunningham
> University of Louisville
> ----- Original Message -----
> From: Karen Canty <kscanty@PACBELL.NET>
> Sent: Friday, March 29, 2002 3:19 PM
> Subject: Re: State Assessments/did anyone see frontline?
> > But Art, I thought that's what "No Child Left Behind" means - or am I wrong
> > about that?...what I keep hearing is "we will have standards, we will teach
> > to those standards, and we will test so that no child will be left
> > behind"...so doesn't that mean that eventually there will be a test that
> > everyone gets right? That seems to be the message to me --- but then again
> > I think that was tried somewhere else for about 70 years and it didn't work
> > there either so that couldn't possibly be what the message is....
> >
> > Karen
> >
> > -----Original Message-----
> > From: Assessment Reform Network Mailing List
> > [mailto:ARN-L@LISTS.CUA.EDU]On Behalf Of Art Burke
> > Sent: Friday, March 29, 2002 12:12 PM
> > Subject: Re: State Assessments/did anyone see frontline?
> >
> >
> > I don't know of any list of instruments used in the different state
> > assessment systems - sure would like to see one. Washington uses the Iowas
> > as the NRT component of its statewide assessment and the WASL as the
> > "standards based" component. The question of who should decide the
> > standards will always be a vexed one in an educational system as
> > decentralized as ours.
> >
> > A standards based test could be comprised of items that everybody gets
> > right. What would be the point?
> >
> > Art
> >
> >

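The NRT item statistics Cunningham invokes above (item p-values, item-total correlations, and coefficient alpha) are easy to compute by hand. Here is a minimal Python sketch on a made-up 0/1 response matrix; all the numbers are hypothetical, not from any real test:

```python
# Item statistics for a small, made-up 0/1 response matrix:
# rows = students, columns = items. Illustration only.
responses = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 0, 0, 0],
]
n_students = len(responses)
n_items = len(responses[0])

# p-value of an item: the proportion of students who got it right.
p_values = [sum(row[j] for row in responses) / n_students
            for j in range(n_items)]

totals = [sum(row) for row in responses]  # each student's total score

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def pearson(xs, ys):
    # Item-total correlation; undefined (None) when an item has no variance.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    sx, sy = variance(xs) ** 0.5, variance(ys) ** 0.5
    return cov / (sx * sy) if sx > 0 and sy > 0 else None

item_total = [pearson([row[j] for row in responses], totals)
              for j in range(n_items)]

# Cronbach's coefficient alpha: k/(k-1) * (1 - sum(item vars) / total var)
item_vars = [variance([row[j] for row in responses]) for j in range(n_items)]
alpha = (n_items / (n_items - 1)) * (1 - sum(item_vars) / variance(totals))
```

Note how the first item, which every student answers correctly, has a p-value of 1.0, zero variance, and therefore no defined item-total correlation, and it contributes nothing to alpha. That is precisely why an NRT developer would drop it, while a CRT developer would read the same result as evidence of mastery.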
"May your growth be firm, and laugh for joy!
Your heart's excellence
has prepared for you the very field
on which you must bloom." JS Bach: Peasant Cantata (Bauernkantate)
Richard A. Parkany: SUNY@Albany
Prometheus Educational Services
Upper Hudson & Mohawk Valleys; New York State, USA

To unsubscribe from the ARN-L list, send command SIGNOFF ARN-L