Re: NY Regent's Exam
- Subject: Re: NY Regent's Exam
- From: "Gerald W. Bracey" <gbracey@EROLS.COM>
- Date: Sat, 6 Nov 1999 18:06:59 -0500
- Reply-to: Assessment Reform Network Mailing List <ARN-L@LISTS.CUA.EDU>
- Sender: Assessment Reform Network Mailing List <ARN-L@LISTS.CUA.EDU>
Deborah has hit on something that goes to the core of traditional
psychometrics, something it cannot cope with. The portfolio system in
Vermont has been derided in some quarters because different teachers give
different marks (something that has been noted at least since the Starch and
Elliott studies of 1912).
For traditional psychometrics, such differences cannot be tolerated. The
system is, QED, unreliable.
But Deb is also right when she notes that we don't judge many other things
like that in real life. I imagine that Deb has had the experience that I
have: the categories for judges in refereed journals usually include
"accept as is, accept with minor revisions, accept with major revisions,
reject." I've had more than one article come back with four judges using
all four categories and making cases for their positions.
There has been some work on this in the testing community, but far too
little. More work has been done in the direction of the article I posted
about scoring essays at Measurement Incorporated: it is highly formulaic and
leaves no room for creativity on the part of the writer or discretion on the
part of the scorer.
----- Original Message -----
From: Deborah Meier <dmeier@ESSENTIALSCHOOLS.ORG>
To: <ARN-L@LISTS.CUA.EDU>
Sent: Saturday, November 06, 1999 10:21 AM
Subject: Re: NY Regent's Exam
> The problem we're dealing with doesn't lie in the failure of testmakers to
> consider inter-rater reliability but in the attempt to get such
> reliability! That's my hunch. We don't judge anything else important
> that way (well, very little else anyhow). We don't judge whether people
> are guilty or innocent that way; we don't judge movies or books that way;
> or people; we make very few decisions of grave importance in life that
> way. And if we did we'd have to develop scoring systems that would mimic
> human judgment, thus severely reducing the very qualities about human
> judgment that we most treasure--and educate for. The attempt to produce
> a high level of reliability rests on eliminating human variability, past
> experience, hunches, etc.; by its nature it seeks to turn us from judgers
> to machines. Maybe it works for judging diving? I wonder. One can
> contain subjectivity, require it to be deepened, make it public, insist on
> discussion that may moderate the more extreme forms of it, and allow for
> appeals, etc. As long as test-coaching is a minor part of our work--not
> a substitute for the heart of schooling--then this quality can be
> tolerated. But any educational system that rests on the distrust of its
> teachers as judges has already hobbled itself so seriously as to be a force
> against the interests of children. The sellers of these ideas build their
> case around the distrust--it's an essential ingredient to getting us to
> turn over all this power to "outsiders" and strangers--neutral people
> functioning without a viewpoint of their own, but as mere conduits of a
> fail-proof scoring system! I get carried away thinking about it. Deb
>
> >Hi everybody,
> >
> >MacUser <5alive writes:
> >>I know these tests go through more than one reader, but the readers are
> >>usually "normed" and trained to expect a certain "level" of competency
> >>which is very arbitrary.
> >
> >This is an insightful observation, coming from someone who hasn't scored
> >an English Regents exam. The rubric may be "fabulous" (??!!) but its
> >words mean different things for different scorers. In practice it's a
> >rather messy affair.
> >
> >In NYC English teachers have had to attend numerous workshops where they
> >look at sample test responses and learn how they were scored. I think
> >the coaching reflects the fact that the plain meaning of the rubric is
> >insufficient.
> >
> >On a couple of occasions at these workshops I listened to administrators
> >arguing that a writing sample had "implicit" ideas that the teachers
> >missed in their practice scoring. Both times there were blatant signs in
> >the writing that the student didn't understand the task or the reading.
> >The kids were out of their depth and the teachers were being asked to
> >comb through incoherent writing samples for words or phrases that could
> >be interpreted as points or ideas. Obviously some people are better at
> >this than others.
> >
> >IMO, it's highly arbitrary. I'd love to know if there's been any
> >independent study of the new English Regents with regard to interrater
> >reliability.
> >
> >John Lawhead
> >ESL teacher
> >Bushwick High School
> >718-381-7100 ext. 409
> >theyreback@juno.com
> >
>
>--------------------------------------------------------------------------
> >To unsubscribe from the ARN-L list, send command SIGNOFF ARN-L
> >to LISTSERV@LISTS.CUA.EDU.