
Why MCT's? 2nd Try



I transmitted this post to ARN-L on 6 February 2007. But the Feb 2007 ARN-L archives at <http://interversity.org/lists/arn-l/archives/Feb2007_date/index.html> indicate that no posts appeared on the archives after 5 Feb. Could it be that this post shut the February archives down? ;-) Will it do the same for the March ARN-L archives, now online (with no explanation of the archive hiatus that I can find) at <http://interversity.org/lists/arn-l/archives/Mar2007_date/index.html>?

In any case, here's another try:

***************************************************
If you reply to this long (15 kB) post, please don't hit the reply button unless you prune the copy of this post that may appear in your reply down to a few relevant lines; otherwise the entire already archived post may be needlessly resent to subscribers.

In my ARN-L post of 23 Jan 2007 titled "Re: Lauren Resnick and higher-order thinking skills" [Hake (2007a)], I indicated some advantages of the much-maligned Multiple Choice Tests (MCT's), writing [bracketed by lines "HHHHH. . . . ."]:

HHHHHHHHHHHHHHHHHHHHHHHHH
WHY MCT'S? So that the tests can be given to thousands of students in hundreds of courses under varying conditions in such a manner that meta-analyses can be performed, thus establishing general causal relationships in a convincing manner.

CAN MCT'S MEASURE CONCEPTUAL UNDERSTANDING AND HIGHER-ORDER LEARNING?
Wilson & Bertenthal (2005) think so, writing (p. 94):
"Performance assessment is an approach that offers great potential for assessing complex thinking and learning abilities, but multiple choice items also have their strengths. For example, although many people recognize that multiple-choice items are an efficient and effective way of determining how well students have acquired basic content knowledge, many do not recognize that they can also be used to measure complex cognitive processes. For example, the Force Concept Inventory . . . [Hestenes et al. (1992)] . . . is an assessment that uses multiple-choice items to tap into higher-level cognitive processes"
HHHHHHHHHHHHHHHHHHHHHHHHH

Two points regarding Peter Campbell's (2007) ARN-L response of 27 Jan 2007:

111111111111111111111111111111111111111111
1. Peter wrote: "I note that the evidence you cite for the use/value of MCT's comes from studies of high school and college students. Are you aware of any studies that have been done that show any value for younger kids? I would be rather surprised if there were, especially when you consider the kinds of MCT's that young kids are exposed to."

I'm not generally conversant with the literature on the learning of young children [except for the ground breaking work of the forgotten pioneer Louis Paul Benezet (1935/36)]. Possibly because of my ignorance, I know of no research showing that MCT's are of value for younger students (subscribers, please correct me if I'm wrong).

I also would be surprised if there were any research showing that MCT's are of value for younger students. But IF there were, then, in my opinion, their development would probably have been preceded by years of high-caliber qualitative and quantitative research on young children, such as that by cognitive scientist David Klahr et al. (1986-2007). Similar, but less extensive, research by Halloun & Hestenes (1985a,b) on older students preceded development of the MCT Force Concept Inventory.


222222222222222222222222222222222222222222
2. Peter wrote:

CCCCCCCCCCCCCCCCCCCCCCCCCCC
You argue in response to the question "Why MCT's?":

"So that the tests can be given to thousands of students in hundreds of courses under varying conditions in such a manner that meta-analyses can be performed, thus establishing general causal relationships in a convincing manner. "

I'm not convinced by anything other than the fact that the students who did well were good at taking MCT's.
CCCCCCCCCCCCCCCCCCCCCCCCCCC

Peter's reaction might well be justified for the results of most MCT evaluations. However, for my own survey [Hake (1998a,b)] and subsequent confirming work by many other physics education research groups [for references see Hake (2007b)], IT WOULD NOT BE EASY TO ARGUE THAT THE APPROXIMATELY TWO-STANDARD-DEVIATION SUPERIORITY IN NORMALIZED GAINS OF INTERACTIVE ENGAGEMENT (IE) OVER TRADITIONAL (T) COURSES WAS DUE TO THE FACT THAT STUDENTS IN THE IE COURSES JUST HAPPENED TO BE A LOT BETTER AT TAKING MCT'S THAN STUDENTS IN THE T COURSES.

As indicated in "Should We Measure Change? Yes!" [Hake (2007b)] [bracketed by lines "HHHHH. . . ."]:

HHHHHHHHHHHHHHHHHHHHHHHHHH
THE VIEW FROM U.S. DEPARTMENT OF EDUCATION
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
"History" and maturation are among the nine threats to internal validity listed in Table 2.4 of Shadish et al. (2002), are discussed on pages 56-57 of that text, and are reiterated by the PEP . . . [Psychologist, Education specialist, Psychometrician] . . . dominated "Coalition for Evidence-Based Policy" (CEBP) at the U.S. Dept. of Education [USDE (2003)]:

USDE-USDE-USDE-USDE-USDE
There is persuasive evidence that the randomized controlled trial, when properly designed and implemented, is superior to other study designs in measuring an intervention's true effect.

1. "Pre-post" study designs often produce erroneous results. Definition: A "pre-post" study examines whether participants in an intervention improve or regress during the course of the intervention, and then attributes any such improvement or regression to the intervention.

The problem with this type of study is that, without reference to a control group, it cannot answer whether the participants' improvement or decline would have occurred anyway, even without the intervention. This often leads to erroneous conclusions about the effectiveness of the intervention.
USDE-USDE-USDE-USDE-USDE

But CEBP's criticism of pre/post testing is irrelevant for the recent pre/post studies in physics. The reason is that control groups HAVE been utilized - they are the introductory courses taught by the traditional method. The matching is due to the fact that (a) within any one institution the test [interactive engagement (IE)] and control [traditional (T)] groups are drawn from the same generic introductory course taken by relatively homogeneous groups of students, and (b) IE course teachers in all institutions are drawn from the same generic pool of introductory course teachers who, judging from uniformly poor average normalized gains <g> they obtain in teaching traditional (T) courses, do not vary greatly in their ability to enhance student learning.
HHHHHHHHHHHHHHHHHHHHHHHHHH
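
The average normalized gain <g> referred to above is defined in Hake (1998a) as the actual average gain divided by the maximum possible average gain: <g> = (%<post> - %<pre>) / (100 - %<pre>), where %<pre> and %<post> are the class-average pretest and posttest percentage scores. A minimal Python sketch of that arithmetic follows. The function name and the class averages are made up for illustration only; they are not data from any of the surveys cited in this post.

```python
def normalized_gain(pre_percent, post_percent):
    """Average normalized gain <g> = (%post - %pre) / (100 - %pre).

    pre_percent and post_percent are class-average scores in percent.
    """
    if not (0.0 <= pre_percent < 100.0):
        raise ValueError("pretest average must be in the range [0, 100)")
    if not (0.0 <= post_percent <= 100.0):
        raise ValueError("posttest average must be in the range [0, 100]")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


# Hypothetical class averages (illustrative only, not survey data):
# both courses start from the same 45% pretest average.
t_gain = normalized_gain(45.0, 58.0)   # traditional (T) course
ie_gain = normalized_gain(45.0, 72.0)  # interactive engagement (IE) course

print(f"T  course <g> = {t_gain:.2f}")   # prints "T  course <g> = 0.24"
print(f"IE course <g> = {ie_gain:.2f}")  # prints "IE course <g> = 0.49"
```

Because <g> divides by the room left for improvement (100 - %<pre>), courses with different pretest averages can be compared on the same scale, which is what allows T and IE courses at many institutions to be set side by side.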

Richard Hake, Emeritus Professor of Physics, Indiana University
24245 Hatteras Street, Woodland Hills, CA 91367
<rrhake@earthlink.net>
<http://www.physics.indiana.edu/~hake>
<http://www.physics.indiana.edu/~sdi>

REFERENCES
Benezet, L.P. 1935/36. "The teaching of arithmetic I, II, III: The story of an experiment," Journal of the National Education Association 24(8), 241-244 (1935); 24(9), 301-303 (1935); 25(1), 7-8 (1936). The articles were: (a) reprinted in the Humanistic Mathematics Newsletter #6: 2-14 (May 1991); (b) placed on the web along with other Benezetia at the Benezet Centre; online at
<http://www.inference.phy.cam.ac.uk/sanjoy/benezet/>. See also Mahajan & Hake (2000).

Campbell, P. 2007. "Re: Lauren Resnick and higher-order thinking skills," ARN-L post of 27 Jan 2007 12:10:35-0600; online at
<http://interversity.org/lists/arn-l/archives/Jan2007/msg00184.html>.

Hake, R.R. 1998a. "Interactive-engagement vs traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses," Am. J. Phys. 66(1): 64-74; online at
<http://www.physics.indiana.edu/~sdi/ajpv3i.pdf> (84 kB).

Hake, R.R. 1998b. "Interactive-engagement methods in introductory mechanics courses," online at <http://www.physics.indiana.edu/~sdi/IEM-2b.pdf> (108 kB) - a crucial companion paper to Hake (1998a).

Hake, R.R. 2006. "Possible Palliatives for the Paralyzing Pre/Post Paranoia that Plagues Some PEP's" [PEP's = Psychometricians, Education specialists, and Psychologists], Journal of MultiDisciplinary Evaluation, Number 6, November, online at <http://evaluation.wmich.edu/jmde/JMDE_Num006.html>.

Hake, R.R. 2007a. "Re: Lauren Resnick and higher-order thinking skills," online at
<http://interversity.org/lists/arn-l/archives/Jan2007/msg00151.html>,
post of 23 Jan 2007 11:33:28 -0800 to ARN-L, AERA-D, ASSESS, EvalTalk, and PhysLrnR.

Hake, R.R. 2007b. "Should We Measure Change? Yes!" download directly by clicking on <http://www.physics.indiana.edu/~hake/MeasChangeS.pdf> (2.5 MB). Failure to access that URL probably means that a new version (T, U, V, W. . .) has been placed online - it can be accessed as ref. 43 at <http://www.physics.indiana.edu/~hake>. To appear as a chapter in "Evaluation of Teaching and Student Learning in Higher Education," a Monograph of the American Evaluation Association <http://www.eval.org/>. A severely truncated version appears at Hake (2006).

Halloun, I. & D. Hestenes. 1985a. "The initial knowledge state of college physics students." Am. J. Phys. 53: 1043-1055; online at <http://modeling.asu.edu/R&E/Research.html>. Contains the "Mechanics Diagnostic" test, precursor to the "Force Concept Inventory."

Halloun, I. & D. Hestenes. 1985b. "Common sense concepts about motion." Am. J. Phys. 53: 1056-1065; online at <http://modeling.asu.edu/R&E/Research.html>.

Hestenes, D., M. Wells, & G. Swackhamer. 1992. "Force Concept Inventory," Phys. Teach. 30: 141-158; online (except for the test itself) at <http://modeling.asu.edu/R&E/Research.html>. The 1995 revision by Halloun, Hake, Mosca, & Hestenes is online (password protected) at the same URL, and is available in English, Spanish, German, Malaysian, Chinese, Finnish, French, Turkish, Swedish, and Russian.

Klahr, D. et al. 1986-2007. Articles on "Cognition and Instruction," online at <http://www.psy.cmu.edu/faculty/klahr/personal/pubs.htm> / "Cognition and Instruction," where "/" means "click on."

Mahajan, S. & R.R. Hake. 2000. "Is it time for a physics counterpart of the Benezet/Berman math experiment of the 1930's?" Physics Education Research Conference 2000: Teacher Education, online at <http://arxiv.org/abs/physics/0512202>.

Shadish, W.R., T.D. Cook, & D.T. Campbell. 2002. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Houghton Mifflin - information at <http://tinyurl.com/y3e7vw>. A goldmine of references to social-science research.

USDE. 2003. U.S. Department of Education, "Identifying and Implementing Educational Practices Supported by Rigorous Evidence: A User Friendly Guide. Institute of Education Sciences," National Center for Education Evaluation and Regional Assistance. The entire guide is online at
<http://www.ed.gov/rschstat/research/pubs/rigorousevid/rigorousevid.pdf>
(140 kB). The Guide's authoring group, the Coalition for Evidence-Based Policy (CEBP) <http://coexgov.securesites.net/index.php?keyword=a432fbc34d71c7> was formerly a part of the Institute of Education Sciences [IES (2006)], in turn a part of the USDE [for the structure of this bureaucratic colossus see <http://www.ed.gov/about/offices/or/index.html?src=ln>]. The CEBP is now sponsored by the "Council for Excellence in Government" <http://coexgov.securesites.net/index.php>, with "the mission to promote government policymaking based on rigorous evidence of program effectiveness." The CEBP's Board of Advisors
<http://coexgov.securesites.net/index.php?keyword=a432fbc71d7564>
includes luminaries such as famed Randomized Control Trial (RCT) authority Robert Boruch (University of Pennsylvania); political economist David Ellwood (Harvard); former FDA commissioner David Kessler (Univ. of California - San Francisco); past American Psychological Association president Martin Seligman (University of Pennsylvania); psychologist Robert Slavin (Johns Hopkins); economics Nobelist Robert Solow (MIT); and progressive-education basher Diane Ravitch. Unfortunately, no physical scientists, mathematicians, philosophers, or K-12 teachers are members of the CEBP.

Wilson, M.R. & M.W. Bertenthal, eds. 2005. "Systems for State Science Assessment," Nat. Acad. Press; online at <http://www.nap.edu/catalog.php?record_id=11312>.