- Subject: Capitol Quips
- From: Susan Ohanian <SOhan70241@AOL.COM>
- Date: Mon, 18 Feb 2002 07:15:09 EST
- Reply-to: Assessment Reform Network Mailing List <ARN-L@LISTS.CUA.EDU>
- Sender: Assessment Reform Network Mailing List <ARN-L@LISTS.CUA.EDU>
Capitol QIPS (Quality Improvement Process Strategy) - a parent perspective on
what's happening in Colorado 2-18-02
On Wednesday, February 13, 2002, the Senate and House Education Committees
convened a joint session to hold a public hearing on the State Accountability
Reports. This was the first time the public had been asked for their input.
Although a Committee was organized at the Colorado Department of Education to
look at the report cards, parents or the public in general were not asked to
be part of this discourse. QIPS has followed the report cards since their
inception in the 2000 legislative session. This is a timeline that describes
issues affecting the report cards, why they are not reliable, and why the
practice of high stakes testing is questioned.
Vicki
SCHOOL ACCOUNTABILITY REPORTS
The following information explains concerns about the reports such as:
· No reliable basis for the bell curve used in the report card
· CSAP writing portion is graded by Kelly Services temporaries
· Proficiency levels are set differently for each test
· Degree of difficulty in the same subject across grade levels is not
· Schools are penalized for children whose parents refuse to let them take
· Report card does not take into account socio-economic levels and mobility
· Report cards are not audited for accuracy
· Schools are arbitrarily excluded from the ratings
· Teachers surveyed describe the negative effects of the report card
· Report cards when released contained innumerable errors
· Other concerns listed at the end of this report
Dr. Lorrie Shepard, Dean of the School of Education at the University of
Colorado at Boulder, has consistently been asked for research on the
education reform in this state and you will see that research quoted. Dr.
Shepard is nationally known for her 30-year career in assessments and has won
the Distinguished Career Award from the National Council on Measurement in
Education. She has served as president of the American Educational Research
Association, president of the National Council on Measurement in Education,
and vice president of the National Academy of Education. Her research
focuses on psychometrics and the use and misuse of tests in educational
settings. Shepard co-chaired the NAE Panel on Standards-Based Reform (a 1995
study funded by the Carnegie Corporation), and authored with Glaser, Linn,
and Bohrnstedt an evaluation of the National Assessment commissioned by
Congress. She has authored and co-authored four books and numerous articles.
At the end of the report, you will find a list of concerns about the report
cards in general and a list of Essential Characteristics that Denver Public
Schools devised to define a longitudinal growth grade. The final material
delineates the references Dr. Shepard has used in her research on high stakes
testing and formative assessment.
On September 29, 1999, Governor Owens said: "I do not want to mince words. We
are facing a crisis in public education in Colorado." He made this statement
following an announcement by the Colorado Department of Education on the
fourth grade CSAP Reading assessment scores. On October 27, 1999, Janet
Bingham of the Denver Post reported, "Nationally, only about a third of
fourth-graders tested met the Reading standard, and Colorado students didn't
do much better." Commissioner Moloney, when interviewed, recognized that
Colorado may be among the best in the nation, but he also said we needed to
understand that the nation wasn't doing too well. According to Dr. Lorrie
Shepard's response to that article, the nation wasn't doing too well because
the standard for proficiency on the national Reading test was set at the 71st
percentile. Proficiency is normally set at the 50th percentile.
FACTS ABOUT THE PERFORMANCE OF COLORADO'S 4TH GRADERS IN READING IN
COMPARISON TO NATIONAL AND INTERNATIONAL NORMS
Dr. Lorrie Shepard
1. In the last IEA international study of Reading, U.S. 4th graders ranked
2nd in the world (out of 32 nations participating) in reading proficiency.
2. At approximately the same time as the IEA study, the National Assessment
of Educational Progress set new performance standards (called "achievement
levels") for its national reading assessment. The achievement levels were
intended to show what 4th graders "should be able to do" and were purposely
not set at the current grade level average. In fact, in 1992 for 4th grade
reading, the standard for proficient performance was at approximately the
71st percentile, meaning that only the top 29% of 4th graders were judged to
be proficient or advanced in reading according to the new standards.
3. As shown in the most recently released 1998 NAEP report in Reading, the
nation's 4th graders have continued to improve since 1992 and Colorado's 4th
graders have "significantly" improved compared to the nation. Thus, Colorado
4th graders are among the most proficient readers in the nation and in the
4. The fact that only "a third of fourth-graders tested met the (NAEP)
reading standard, and Colorado students didn't do much better," is not
surprising given that the standard was set at the 71st percentile nationally.
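The arithmetic behind points 2 and 4 can be illustrated with a minimal Python sketch. The scores below are hypothetical (1,000 evenly spread values, not NAEP data) and the function name is an assumption made for illustration. The sketch only shows that placing a cut score at a given percentile fixes, by construction, the share of students who can meet the standard.

```python
def share_meeting_standard(scores, percentile):
    """Return the fraction of scores at or above the cut score for `percentile`."""
    ranked = sorted(scores)
    # Cut score: the value below which `percentile` percent of the scores fall.
    cut_index = len(ranked) * percentile // 100
    cut_score = ranked[cut_index]
    meeting = sum(1 for s in ranked if s >= cut_score)
    return meeting / len(ranked)

# Hypothetical, evenly spread scores; real test score data would look different.
scores = list(range(1, 1001))

for percentile in (50, 71):
    share = share_meeting_standard(scores, percentile)
    print(f"cut at percentile {percentile}: {share:.0%} meet the standard")
```

With a cut at the 50th percentile, half of the hypothetical students meet the standard; with a cut at the 71st percentile, only 29% do, whatever the students actually know.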
Senate Bill 186 is introduced and institutes report cards for every school in
the state based on the CSAP test.
There was an informational meeting in House Education on this date with the
statistician from Evaluation Software Publishing, Glynn Ligon. When he was
asked how the bell curve was determined, he said it was based on "a 'logical
basis' of where the schools would fall" and he later stated that it did not
have a statistical foundation. Representative Gotlieb asked: "Whose 'logic'
system is the bell curve based on?" The answer was: "The Standard
Distribution of Behavioral Sciences." Representative King believed the bell
curve was skewed because not all children are tested on the same number of
assessments. Ligon explained this by saying there was a very complex
Dr. Elizabeth Pearman, Assistant Professor in the Department of Applied
Statistics and Research Methods (College of Education) at the University of
Northern Colorado (both her master's and doctorate are in Educational
Psychology) checked with 7 of her colleagues who also have their doctorates
in either Educational Psychology or Statistics. No one has ever heard of the
Standard Distribution of Behavioral Sciences.
Senate Bill 186 signed by Governor
Teachers receive virtually no feedback from the CSAP test to help them
improve their instruction. Legislation, passed in February of 2001, does
allow teachers to have the written portion of the CSAP returned to them, but
the cost at $6 per child is prohibitive. Therefore, the CSAP has little or
no diagnostic value to educators or as a basis for a report card.
Don Watson, Director of Assessments at the Colorado Department of Education,
testified that personnel from Kelly Services (temporary personnel) are hired
to grade the tests and one of the criteria for being hired is that they must
have been able to pass the test themselves.
The UCD public survey shows: "Almost 60% of the people surveyed would base
their assessments of school effectiveness on improvement in test scores,
compared to only 32% who would rely on the percentage of students who score
"proficient" or better."
Lorrie Shepard presents her Evaluation of the CSAP Writing test. Children
must score in the 70th percentile to be proficient on the test.
State Board of Education Meeting:
Evie Hudak: "Is the 4th grade reading test harder for 4th graders than the
7th grade reading test is for 7th graders?"
Commissioner Moloney: "This is not a perfect science."
Jared Polis: "When students refuse to take the test, do they receive a zero?"
Jerry Difford: "Actually it is not a zero, it is a -.5."
Jared Polis: "So it penalizes the school."
Jerry Difford: "Yes."
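A small Python sketch can show why a -.5 for each refusal penalizes a school more than a zero would. Only the -.5 value comes from the exchange above. The point value for tested students, the school size, and the simple averaging are all hypothetical assumptions, since the actual rating formula is not described here.

```python
def school_average(tested_points, refusals, refusal_points):
    """Average points per student when each refusal is assigned `refusal_points`."""
    total = sum(tested_points) + refusals * refusal_points
    return total / (len(tested_points) + refusals)

# Hypothetical school: 18 tested students earning 1.0 point each, 2 refusals.
tested = [1.0] * 18

with_penalty = school_average(tested, refusals=2, refusal_points=-0.5)  # value from testimony
as_zero = school_average(tested, refusals=2, refusal_points=0.0)        # for comparison only

print(f"refusals counted as -0.5: {with_penalty:.2f}")
print(f"refusals counted as 0:    {as_zero:.2f}")
```

Under these assumptions the school's average is lower with the -.5 weighting than it would be if refusals counted as zero, even though no tested student scored differently.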
Dr. Ruby Payne (a nationally known researcher and speaker on the effects of
poverty on learning) says when asked about the bell curve and a grading
system: "The bell curve says that for every winner there is also a loser.
We need to throw that concept out."
Patrick McQuillan, a CU Boulder professor, completes a study of Denver Public
Schools, and recommends:
Reconsider high-stakes testing and accountability:
To put all schools in the same testing competition, to assume they have an
equal chance at success, and to dispense rewards and punishments accordingly,
is naive. Rather than remedying the situation, this test may have the
opposite effect: Failing schools will become even more socio-economically
segregated and educational opportunity for low-income students will be
The Colorado State Auditor has conducted a Performance Audit of CDE. The
fourteen-member audit committee consisted of Senator Norma Anderson, Senator
Ron Tupa, Representative Tambor Williams and others.
Report Card Data Should Be Verified
The report cards are required by statute to contain the statement "School
Report Cards prepared by the Colorado Department of Education are
independently audited and verified by [name of firm]." However, this
statement is misleading, implying a level of review that does not exist. The
only verification that is planned is an independent review of the process of
preparing the report cards, not a verification of the data in the report
cards. Without verification of reported data, readers have no assurance that
the information contained in the report cards is reliable. Although auditing
all the data included in the report cards is a major undertaking, the
Department should consider having school districts expand the procedures
conducted by their independent audit firms to cover report card data. In the
absence of audit procedures, the report cards should disclose that all the
information is reported by school districts and is not checked for accuracy.
Jared Polis sends a letter to Commissioner Moloney reprimanding him for
arbitrarily excluding 29 schools from report card ratings.
9 News reports:
About 100 schools did not receive ratings. Those included some schools that
the Department of Education says are so small that releasing the results could
violate federal law by identifying individual students.
On September 13, 2001, Evie Hudak, Gully Stanford and Jared Polis (members of
the State Board of Education) along with Colorado PTA and other stakeholders
held a press conference on the steps of the Colorado Department of Education.
Their intent was to inform the public that the report card is faulty and
that the law was not followed in rating schools. Earlier this year,
Commissioner of Education, William Moloney, had unilaterally and arbitrarily
excluded a significant number of schools from the ratings. This affected the
ratings of the remaining schools. Later that day, Jared Polis called for the
immediate resignation of Commissioner Moloney.
Dr. Shepard and her colleagues have surveyed 1000 teachers in Colorado and
A Survey of
Teachers' Perspectives on High-Stakes Testing in Colorado:
What Gets Taught, What Gets Lost
Teachers Identified the Reporting of CSAP Results (i.e., the School Report
Card) as Most Problematic
· 72% of the teachers surveyed thought that the school report card would
have no effect on the number of poorly qualified teachers leaving the
profession. But, 76% of teachers said that school report cards would
increase the number of well-qualified teachers leaving the profession.
"I find that it is a demoralizing, stressful situation. Teachers are being
judged year to year on completely different students. The pressure put on
teachers has increased to the point where teachers will be leaving the
· 75% of teachers believe that public regard for the teaching profession
will decline as a result of the release of school report cards.
"How the results are used though is what I have a problem with. I think they
are inappropriately used perhaps first by the media and then second by the
state level government but mostly by the media. The way in which they are
reported or used to make comparisons are unhealthy for fostering positive
Bottom line of Conclusions: Less attention should be paid to raising test
scores per se and to evaluating the quality of school solely on the basis of
Parents Receive School Accountability Reports
After receiving report cards districts are reporting:
· Errors in class size
· Errors in number of administrators
· Errors in teachers' salaries
· Errors in administrators' salaries
· Errors in CSAP score data
· Errors in schools being arbitrarily divided into Elementary, Middle, and
High schools regardless of the grades they actually contain
· Errors in reporting CSAP data on students who were supposed to be exempted
· Errors in school history
· Errors in staff information
· Errors in safety information
· Errors in school environment information
· Errors in the number of teachers who left the district last year
· Errors in addresses and phone numbers of schools
· Errors in ratings between schools having the same CSAP scores
· Errors in the comparison lists of schools on the front page of the report
One school district reported that they found at least one if not several
errors on every school's report card in their district. The State has
admitted that "even if the data is 99.999 percent correct, there could still
be 2,500 data errors." When asked about errors on the report cards, the
Colorado Commissioner of Education, William Moloney, was quoted in the Denver
Post on September 25 saying, "For the first six days, we were averaging about
17 calls a day,"
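The State's two quoted numbers can be checked against each other with a short Python calculation. The total number of data points is not given in the text; the figure computed below is only what the 99.999 percent and 2,500 figures imply when taken together.

```python
from fractions import Fraction

# Both figures are taken from the State's quoted statement.
accuracy = Fraction(99999, 100000)   # 99.999 percent correct
stated_errors = 2500

error_rate = 1 - accuracy            # one error per 100,000 data points
implied_data_points = stated_errors / error_rate

print(f"error rate: 1 in {error_rate.denominator:,}")
print(f"implied data points: {int(implied_data_points):,}")
```

Exact fractions are used instead of floating point so that the subtraction from 1 does not introduce rounding error. The result, 250,000,000 data points, is an inference from the quote, not a number the State reported.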
Some schools that were rated Excellent but have since moved down because
certain schools were removed from the rating list won't be able to apply for
the Blue Ribbon School status they were seeking, because they are not rated
Excellent in their own state.
The highest ranking school in the State is a Montessori school in Gilpin
County, where only ONE student was tested. Why was only one student tested?
The CSAP is given to third graders and the school has a total enrollment of
twelve students. Only one child was in the third grade.
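A brief Python sketch illustrates why a result based on a single tested student is fragile. It assumes, hypothetically, that a school's result depends on the percentage of tested students who are proficient; the actual rating formula is not described here, and the function name is ours.

```python
def possible_percent_proficient(students_tested):
    """List every percent-proficient value a school of this size can produce."""
    return [100 * k / students_tested for k in range(students_tested + 1)]

print(possible_percent_proficient(1))
print(possible_percent_proficient(4))
```

With one student tested, the only possible outcomes are 0 percent and 100 percent, so a single child's result places the school at either the very bottom or the very top.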
"The road to education reform in Colorado has been paved with good
intentions. But it has led us to hell nonetheless."--Editorial Writer Billie
Stanton, The Denver Post, in her article referencing the School
The Denver Area Schools Superintendent's Council commissioned a study by
University experts to determine just how difficult the 10th grade Math CSAP
test was for our kids. Dr. Lorrie Shepard, Dean of the School of Education
at CU-Boulder, and her team found that the proficiency level for the 10th
grade Math CSAP is set at the 90th percentile. Also, Advanced Proficiency on
the Math CSAP is set at the 99th percentile. Normally proficiency is set at
the 50th percentile. Shepard was not alone in the undertaking of this study.
Her team consisted of Dominic D. Peressini, Jeffrey A. Frykholm, David R.
Grant, and Damian Betebenner, all of CU-Boulder; William Briggs and William
Juraschek from the University of Colorado at Denver; and Lew Romagnano and
James Loats from the Metropolitan State College of Denver.
1. If this report card is printed in other languages such as Spanish,
how much would the cost be then?
2. Could property values be affected by the grade on this report card? If
property values go down, how will this affect property tax revenue for
3. Special Needs children. Would teachers feel pressure to not include these
children in their classroom because their test scores might bring the grade
4. Would there be pressure on teachers to focus only on "teaching the test"
in order to keep their scores high?
5. How would scholarship organizations respond to children from "B", "C",
"D" or "F" schools?
6. Could this letter grade stigmatize children as well as the school?
7. Will teachers and administrators be reluctant to share ideas because they
are now competitive and could be helping a school to get a better grade than
their own school?
8. According to a Jeffco School District survey, issues affecting report
cards include family issues, drug/alcohol problems, differentiation of
instruction, student mobility, staff development, materials, staff stability.
9. Showing growth over time is important.
10. A highly specialized team of experts needs to create the formula for
ratings that will make them fair.
11. A list of essential characteristics, created by Denver Public Schools,
for formulating a longitudinal growth rating follows.
Essential Characteristics for High Stakes Longitudinal Analysis
1. Longitudinal analysis should be based on individual student growth and
include a high percentage or all students assessed.
2. The methodology must be statistically valid.
3. The methodology employed must be endorsed by statistical experts as valid.
4. Statistically acceptable methods must be employed to separate various
growth categories (e.g., groups scoring higher than expected or lower than
5. The methodology must be valid for all subgroups mentioned in CCR
301-1.01(a), including ethnicity, gender, language background, disability,
exceptional ability, and socio-economic level.
6. The methodology must lend itself to making high stakes decisions about
each school's effectiveness in facilitating individual student growth.
7. The methodology must define or lend itself to an acceptable definition of
'one year's growth'.
8. The methodology and approach must lend itself to production of a parent
report showing the progress of any individual student from year-to-year.
9. Disaggregated results for the following groups should be provided:
district, grade within district, school, ethnicity, gender, language
background, disability, exceptional ability, and socio-economic level.
Dr. Lorrie Shepard's references reflecting the negative effects of high
stakes testing and the value of formative assessment in the classroom.
Colorado Education Issues Forum
"Colorado Public Education Post-9/11: Taking Stock of Standards-based
November 28, 2001
Session I: Standards, Assessment, and Accountability
Lorrie Shepard, University of Colorado at Boulder
Relevant Research Findings and References
U.S. Congress, Office of Technology Assessment. (1992, February). Testing
in American schools: Asking the right questions, OTA-SET-519. Washington,
DC: U.S. Government Printing Office.
· High-stakes testing leads to "test score inflation," meaning that test
scores go up without a corresponding gain in student learning. Numerous
studies have shown that test score gains on familiar and taught-to tests
cannot be verified by independent tests covering the same content.
· High-stakes testing also leads to "curriculum distortion," which helps to
explain how spurious score gains may occur. Studies show that many teachers
eliminate science and social studies, especially in high poverty schools,
because more time is needed for math and reading. But, teaching the test
also involves rote drill in tested subjects so students are unable to use
their knowledge when asked in any other format.
Stipek, D. J. (1996). Motivation and instruction. In D. C. Berliner & R.
C. Calfee (Eds.). Handbook of Educational Psychology (pp. 85-113). New
York: Simon & Schuster Macmillan.
· In the motivational literature, hundreds of studies show the negative
effects of working to look good or to do well on the test instead of working
to understand and master the material. Students who focus on being evaluated
become less intrinsically motivated. They learn less and are less willing to
persist with difficult problems.
Atkin, J. M., Black, P., & Coffey, J. (Eds.). (2001). Classroom assessment
and the National Science Education Standards. Washington, DC: National
Pellegrino, J. W., Chudowsky, N., & Glaser, R. (Eds.). (2001). Knowing what
students know: The science and design of educational assessment. Washington,
DC: National Academy Press. www.nap.edu
Shepard, L. A. (2000). The role of assessment in a learning culture.
Educational Researcher, 29(7), 4-14. www.aera.net
Formative assessment (assessment for learning) requires more than complex
data systems. Formative assessments help make student thinking visible, and
the insights and information provided help the teacher and students see how
to take the next steps in learning.