Re: Popham at NEA Regional

At 11:47 PM -0800 2/18/02, George Sheridan wrote:
...Beginning with the statement that "Traditionally constructed
standardized tests are inappropriate for judging the quality of
educational programs," he discussed the ESEA, saying that the "No
Child Left Behind Act" is really the "No Teacher Considered
Competent Act." When the law is fully implemented, he said, "The
vast majority of you will be in low-performing 'failing' schools."

A dynamic, wisecracking speaker who declared, "You can't measure
temperature with a tablespoon," Popham outlined the history of
standardized testing in America, emphasizing the World War I Army
Alpha test as forerunner and model of norm-referenced tests. On such
tests, as members of this list are well aware, score spread is
imperative. 50% of students must be at or above average; 50% below.
A reduction in score variation could even result in a negative
reliability coefficient.

A couple of points--the assessment that is most effective in helping
teachers is closer to testing what students DON'T know than what
they know. That's its diagnostic value. Scores on such "tests" would
be inherently low, and cannot be used to gauge effectiveness of
instruction. On the other hand, some of the standardized tests do
exactly the opposite--they look for a benchmark based on a collection
of checkpoints. Although each item may indicate something about the
knowledge of a student, the collection is merely an index, not a
description of the knowledge. When these are aggregated across
populations, they lose any value to the teachers. For example, if
every student in one population answers all Topic A questions
correctly and misses all Topic B questions, their scores would be
indistinguishable from those of a population that answers 50% on
both topics. What does that say about differences in instruction?
Finally, the CRTs pick questions that intentionally differentiate the
population into haves and have-nots, but, again, only for
individual questions. To assume that even *statistically*
this gives an accurate representation of "average knowledge" of the
population is silly, since the tests are designed to rank, not to
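
The aggregation point above can be illustrated with a small Python
sketch. The two populations, the 30-student size, and the names
school_1, school_2, and summarize are hypothetical, made up purely for
illustration; they are not data from any actual test.

```python
# Hypothetical illustration: two populations with identical aggregate
# scores but very different per-topic results.

def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def total_score(student):
    """Overall fraction correct, assuming equal numbers of A and B items."""
    return (student["A"] + student["B"]) / 2

def summarize(population):
    """Return the overall mean score and the per-topic means."""
    return {
        "overall": mean([total_score(s) for s in population]),
        "topic_A": mean([s["A"] for s in population]),
        "topic_B": mean([s["B"] for s in population]),
    }

# Assumed population 1: everyone gets all of Topic A and none of Topic B.
school_1 = [{"A": 1.0, "B": 0.0} for _ in range(30)]
# Assumed population 2: everyone gets half of each topic.
school_2 = [{"A": 0.5, "B": 0.5} for _ in range(30)]

summary_1 = summarize(school_1)
summary_2 = summarize(school_2)
print(summary_1)
print(summary_2)
```

Both populations report the same overall mean, so only the per-topic
breakdown shows that one group apparently never covered Topic B.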

Of course, none of this should be new--no test can meet multiple
goals. More importantly, no test of the students can EVER reveal
anything about the instruction they've undergone, except for weeding
out absolute incompetence, and even that only with well-designed
tests. If the test is poorly done, everyone will appear to be
incompetent--both teachers and students.

