Re: Why MCT's? 2nd Try
- To: <arn-l@interversity.org>
- Subject: Re: Why MCT's? 2nd Try
- From: Richard Hake <rrhake@earthlink.net>
- Date: Thu, 29 Mar 2007 11:14:17 -0700
- Cc: <PHYSLRNR@LISTSERV.BOISESTATE.EDU>
If you reply to this very long (32 kB) post
please don't hit the reply button unless you
prune the copy of this post that may appear in
your reply down to a few relevant lines,
otherwise the entire already archived post may be
needlessly resent to subscribers.
*******************************************************
ABSTRACT: In an ARN-L post of 7 March 2007, I
pointed out that the approximately two-standard
deviation superiority in normalized gains of
interactive engagement (IE) over traditional (T)
courses does not support Peter Campbell's
suggestion of 27 January 2007 that students in IE
courses do better than students in T courses only
because the IE course students are inherently
better at taking multiple choice tests (MCT's)
than the T students. On 8 Mar 2007 Peter
responded that (a) causal relationships between
IE (T) courses and higher (lower) MCT scores
cannot be established, (b) IE course students
may do better than T course students because the
IE courses preferentially enhance the MCT-taking
abilities of IE students, (c) the claimed IE
superiority may be an artifact of non-random
selection. In this post I suggest that: "a"
indicates a misunderstanding of my research, "b"
is extremely unlikely, and "c" is a misconception
related to the mistaken notion that randomized
control trials are the gold standard of
assessment.
*******************************************************
My ARN-L post of 7 March 2007 "Re: Why MCT's? 2nd
Try" [Hake (2007b)], was my second attempt to
get through to ARN-L with a post "Re: Why MCT's?
(was Lauren Resnick and higher-order thinking
skills)" [Hake (2007a)], transmitted to ARN-L and
PhysLrnR on 6 Feb 2007. In both the 6 Feb and 7
March posts I wrote [bracketed by lines "HHHHHH.
. . ."]
HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Peter's reaction . . .[ of 27 January in Campbell
(2007a) that MCT's only indicate how good
students are at taking MCT's]. . . might well
be justified for the results of most MCT
evaluations. However for my own survey [Hake
(1998a,b)] and subsequent confirming work by many
other physics education research groups [for
references see Hake (2007c)] **it would not be
easy to argue that the approximately two-standard
deviation superiority in normalized gains of
interactive engagement (IE) over traditional (T)
courses, was due to the fact that students in the
IE courses just happened to be a lot better at
taking MCT's than students in the T courses.**
HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
In response, Peter Campbell (2007b) on 8 Mar
2007, made 4 points [my inserts at ". . . .
[insert]. . . ."]:
111111111111111111111111111111111111111111111
1. "Richard - You're arguing that students who
were taught using "interactive engagement" scored
better on MCT's than students who were taught
using "traditional" course methods. You use this
as proof that MCT's reflect the superiority of
"interactive engagement" over "traditional"
course methods. Is that a fair summary? "
NO! Your (a) "scored better on MCT's" needs to be
replaced with (b) "obtained higher pre-to-post
test normalized gains on MCT's."
There's a world of difference between the correct
wording "b" and your incorrect wording "a". I
suspect that you may not have carefully read the
online reports [Hake (1998a,b)] of my research.
Furthermore, it should also be noted that the
"Mechanics Diagnostic" (MD) test and the "Force
Concept Inventory" (FCI) were not just your
everyday problematic MCT's. They were developed
through lengthy and arduous qualitative and
quantitative research by disciplinary experts
[Halloun & Hestenes (1985a,b)] and their use has
been shown to be valid ["internal", "external",
"construct", and "statistical conclusion"- see
e.g. Shadish, Cook, & Campbell (2002, pp.
33-42)]. Also the MD and FCI have been shown to
be consistently reliable, as judged by relatively
high Kuder-Richardson reliability coefficients
KR-20 in the 0.8 to 0.9 range (see, e.g., Halloun
& Hestenes, 1985b; Hake, 1998a, 1998b).
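For readers unfamiliar with KR-20, here is a minimal
sketch of how such a coefficient is computed for a test
scored right/wrong. It uses the textbook formula
KR-20 = [k/(k-1)] * [1 - sum(p_i*q_i)/var(total)], with
the population variance of the total scores. The function
name and the tiny score matrix are hypothetical and for
illustration only; they are not MD or FCI data.

```python
from statistics import pvariance

def kr20(responses):
    """Kuder-Richardson formula 20 for items scored 1 (right) or 0 (wrong).

    responses: one list per student, one 0/1 entry per test item.
    """
    n_students = len(responses)
    k = len(responses[0])  # number of test items
    # p_i: fraction of students answering item i correctly
    p = [sum(student[i] for student in responses) / n_students
         for i in range(k)]
    # Sum of item variances p_i * q_i, with q_i = 1 - p_i
    item_variance_sum = sum(pi * (1 - pi) for pi in p)
    # Population variance of the students' total scores
    total_variance = pvariance([sum(student) for student in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

# Hypothetical 5-student, 4-item score matrix (illustration only)
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 3))  # prints 0.8
```

Values near 1 indicate that the items rank students
consistently; the 0.8 to 0.9 range quoted above is
conventionally regarded as high.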
2222222222222222222222222222222222222222222
2. "If so, here are the problems I see (slightly edited by Hake):
A. You're establishing a causal relationship
between 'interactive engagement' and higher
scores. . . .[NO! between 'interactive
engagement' and normalized pre-to-post test
gains]. . . .
B. You're also establishing a causal relationship
between 'traditional' course methods and lower
scores. . . . .[NO! between 'traditional' course
methods and normalized pre-to-post test gains]. .
. .
C. You're arguing that MCT's can reflect the
superiority of 'interactive engagement.'
Because A and B cannot be established, you
cannot, therefore, argue that MCT's can measure
something that cannot be established. Here's a
rather crude analogy :
1. Babies fed soy milk are happy.
2. Babies fed breast milk are not happy.
3. The babies fed soy milk gained 20 pounds,
whereas the breast fed babies gained only 5
pounds.
4. Conclusion: the 15-pound difference between
the two groups shows the superiority of soy milk
as a nutritional supplement for babies. "
I agree with Peter that the analogy is crude - so
crude in fact that, in my opinion, it's
irrelevant to the present argument - as I hope
will be made clear in the remainder of this post.
An analogy which IS relevant (as I hope will be
made clear in the remainder of this post) is the
following:
1. IE courses show relatively high average normalized gain <<g>> on MCT's.
2. T courses show relatively low average normalized gain <<g>> on MCT's.
3. IE courses gained approximately 2sd's greater in <<g>> than T courses.
4. Conclusion: the above approximately 2sd
difference in the <<g>>' s between the IE and T
courses shows the superiority of IE courses for
enhancing students' conceptual understanding of
Newtonian mechanics.
In the above the double angle brackets <<g>>
indicate an average over courses of the average
normalized gain <g> for each course.
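As a rough sketch of the two levels of averaging: the
course-average normalized gain is defined in Hake (1998a)
as <g> = (%post - %pre)/(100 - %pre), where %pre and
%post are the class-average pretest and posttest scores;
that definition is not restated in this post. The
function names and the (pre, post) pairs below are
hypothetical, chosen only to make the arithmetic easy to
follow.

```python
def normalized_gain(pre_percent, post_percent):
    """Course-average normalized gain <g> = (post - pre) / (100 - pre)."""
    if pre_percent >= 100:
        raise ValueError("pretest average must be below 100%")
    return (post_percent - pre_percent) / (100.0 - pre_percent)

def average_of_gains(courses):
    """<<g>>: the mean over courses of each course's <g>."""
    gains = [normalized_gain(pre, post) for pre, post in courses]
    return sum(gains) / len(gains)

# Hypothetical (pretest %, posttest %) class averages, illustration only
ie_courses = [(40.0, 76.0), (50.0, 80.0)]
t_courses = [(40.0, 52.0), (50.0, 65.0)]

print(average_of_gains(ie_courses))  # <<g>> for the made-up IE courses
print(average_of_gains(t_courses))   # <<g>> for the made-up T courses
```

Because <g> divides the actual gain by the maximum
possible gain, a course with a high pretest average is
not penalized for having less room to improve. This is
why "normalized gain" and "score" are not
interchangeable.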
Peter claims that "A" and "B" above can't be
established. What does he mean? IF he means
that "A" and "B" (with my corrections) can't be
established with complete 100% certainty for all
time, then the same would be true for any
purportedly causal relationship developed through
*scientific* research, and his assertion would be
correct but both trivial and inapplicable. I
never argued that MCT's can measure anything that
can be established with complete 100% certainty
for all time.
The last sentence of the abstract of Hake (1998a)
is: "The conceptual and problem-solving test
results **strongly suggest** that the classroom
use of IE methods can increase mechanics-course
effectiveness well beyond that obtained in
traditional practice. "
Shavelson & Towne (2002, p. 16) put the matter well:
"Mistakes are made as science moves forward.
The process is not infallible [see Lakatos &
Musgrave (1970)]; science advances through
professional criticism and self correction. . .
. .Popper (1959) argues that knowledge always
remains conjectural and potentially revisable,
largely by the process of testing (seeking
refutations) that Popper (1965) himself
described."
In Hake (2007c) I wrote [see that article for
references other than Hake (1998 a,b; 2002a,b)
and Shavelson & Towne (2002)]:
HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
The approximately two-sigma superiority of IE
over T courses in introductory mechanics [shown
in Hake (1998a,b)] has been independently
corroborated in hundreds of courses with widely
varying types of instructors, institutions, and
student populations [see e.g., the references in
Hake (2002a,b)], thus satisfying Shavelson &
Towne's (2002) fifth principle of good scientific
practice [my CAPS]:
"Replicate and Generalize Across Studies: By one
replication we mean, at an elementary level, that
if one investigator makes a set of observations,
another investigator can make a similar set of
observations under the same conditions . . . . .
. . At a somewhat more complex level, REPLICATION
MEANS THE ABILITY TO REPEAT AN INVESTIGATION IN
MORE THAN ONE SETTING (FROM ONE LABORATORY TO
ANOTHER OR FROM ONE FIELD SITE TO A SIMILAR FIELD
SITE) AND REACH SIMILAR CONCLUSIONS."
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
Although ignored by most PEP's [psychologists,
education specialists, and psychometricians], and
even by some physicists [for example, those
contributing to McCray, DeHaan, & Schuck (2003)],
the above indicated research [Hake (1998a,b)] has
been noted positively by workers in many
different disciplines [astronomy, biology,
chemistry, cognitive science, communication,
economics, engineering, geoscience, mathematics,
medicine, physics, and even psychology !]. See
e.g. : Marchese (1997); Swartz (1999); Heller
(1999); Zeilik et al. (1999); Breslow (1999,
2000); Rothman & Narum (1999); Nelson (2000);
Albacete & VanLehn (2000); Stokstad (2001);
Morote & Pritchard (2002); Savinainen & Scott
(2002a,b); Dancy & Beichner (2002); Powell
(2003); Elliott (2003); Klymkowsky et al. (2003);
Wood & Gentile (2003); McConnell et al. (2003);
Evans et al., (2003); Hegedus & Kaput (2004);
Handelsman et al. (2004); Pavelich et al. (2004);
Khodor et al. (2004); DeHaan (2005); Buck & Wage
(2005); Smith et al. (2005); Hilborn (2005);
Moore (2005); Wieman & Perkins (2005); Heron &
Meltzer (2005); Kluck (2005); Bardar et al.
(2006); Froyd et al. (2006); Nuhfer (2006a,b);
and Michael (2006).
HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Furthermore in "Design-Based Research in Physics
Education Research: A Review" [Hake (2007d)] I
wrote [see that article, when published, for
references other than Hake (1998a,b; 2002a,b)] :
HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Average normalized gain differences between T and
IE courses that are consistent with the work of
Hake (1998a, 1998b, 2002a, 2002b) and Figure 1. .
. [of Hake (1998a)]. . . have been reported by:
Redish, Saul, & Steinberg, 1997; Saul, 1998;
Francis, Adams, & Noonan, 1998; Heller, 1999;
Redish & Steinberg, 1999; Redish, 1999; Beichner
et al., 1999; Cummings, Marx, Thornton, & Kuhl,
1999; Novak, Patterson, Gavrin, & Christian,
1999; Bernhard, 2000; Crouch & Mazur, 2001;
Johnson, 2001; Meltzer, 2002a, 2002b; Meltzer &
Manivannan, 2002; Savinainen & Scott, 2002a,
2002b; Steinberg & Donnelly, 2002; Fagan,
Crouch, & Mazur, 2002; Van Domelen & Van
Heuvelen, 2002; Belcher, 2003; Dori &
Belcher, 2004; Hoellwarth, Moelter, & Knight,
2005; Lorenzo, Crouch, & Mazur, 2006; and
Rosenberg, Lorenzo, & Mazur, 2006.
HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Thus it would appear that Peter Campbell's
[corrected] assertion that causal relationships
between:
(a) "interactive engagement" and [normalized pre-to-post test gains] , and
(b) "traditional" course methods and [normalized pre-to-post test gains],
cannot be established is either:
a. trivial if "cannot be established" means
"can't be established with complete 100%
certainty for all time," or
b. problematic if "cannot be established" means
"cannot be shown to have a reasonable likelihood
of being correct."
33333333333333333333333333333333333333333333333
3. "It's entirely possible that students who were
taught using "interactive engagement" scored
better on MCT's. . . .[NO! on average achieved
higher pre-to-posttest average normalized gains].
. . than students who were taught using
"traditional" course methods due to something in
the design of the "interactive engagement"
curriculum/pedagogy. In other words, these
students might have been better trained and
prepared to take MCT's. You do not seem to
control for this variable."
Could it really be that the approximately
two-standard deviation superiority in average
normalized gains of interactive engagement (IE)
over traditional (T) courses was simply due to the
fact that students subjected to IE courses became
more expert in taking MCT's as the course
progressed so that their posttest scores were
elevated over their pretest scores by MCT smarts,
rather than physics smarts?
If so, those who worked for years developing the
IE methods [e.g., Collaborative Peer Instruction,
Microcomputer-Based Laboratories, Concept Tests,
Modeling, Active Learning Problem Sets, Overview
Case Studies, and Socratic Dialogue Inducing
Laboratories] in the courses that I surveyed
will be disappointed that their methods failed to
improve students' conceptual grasp of Newtonian
mechanics more than traditional "direct
instruction" courses with passive-student
lectures, recipe labs, and algorithmic homework
and exam problems. However, not to worry, IE
course developers can probably make a fortune
preparing students for the MCT components of
high-stakes tests such as the SAT's, GRE's, or
NCLB-induced state tests.
But seriously, I think Peter may be hampered by
his unfamiliarity with the nature of IE methods
in physics. It may be worthwhile to quote the
description of a fairly typical IE method [Hake
(1992)] [see that article and Hake (2007c) for
the references]:
HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Socratic Dialogue Inducing (SDI) labs have been
shown [Hake (1998a, 1998b - Table Ic)] to be
relatively effective in guiding students to
construct a coherent conceptual understanding of
Newtonian mechanics. The SDI method might be
characterized as "guided construction," rather
than "guided discovery" or "inquiry." We think
the efficacy of SDI labs is primarily due to the
following essential features:
(1) interactive engagement of students who are
induced to think constructively about simple
Newtonian experiments which produce conflict with
their commonsense understandings;
(2) the Socratic method [e.g., Arons (1973, 1974,
1990, 1993, 1997); Hake (1992, 2002 f,g,h,j)] of
the *historical* Socrates [Vlastos (1990, 1991,
1993)], not Plato's alter ego in the "Meno"!, as
mistakenly assumed by many - even some
physicists; utilized by experienced instructors
who have a good understanding of the material and
are aware of common student preconceptions and
failings;
(3) considerable interaction between students and
instructors and thus a degree of individualized
instruction;
(4) extensive use of multiple representations
(verbal, written, pictorial, diagrammatic,
graphical, and mathematical) to model physical
systems;
(5) real world situations and kinesthetic
sensations (which promote student interest and
intensify cognitive conflict when students'
direct sensory experience does not conform to
their conceptions);
(6) cooperative group effort and peer discussions;
(7) repeated exposure to the coherent Newtonian
explanation in many different contexts.
HHHHHHHHHHHHHHHHHHHHHHHHHHHHH
I'm very doubtful that features "1" - "7" above were collectively:
(a) no more effective in enhancing students'
conceptual understanding of Newtonian mechanics
than traditional passive-student lecture courses
with recipe labs and algorithmic-problem homework
and exams;
(b) so effective in enhancing students' MCT
taking abilities that even despite "a" above, 5
IE (SDI) courses [see Table 1c of Hake (1998b)]
improved their posttest scores over their pretest
scores such as to obtain an average of average
normalized gains <<g>> = (0.60 plus or minus
0.04sd) compared to 14 T courses with <<g>> =
(0.23 plus or minus 0.04sd), for a Cohen (1988)
effect size d = 9.
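The quoted effect size can be checked from the summary
statistics alone. The sketch below uses one common
pooled-standard-deviation form of Cohen's d; the post
does not say which pooling convention was used, but with
equal standard deviations in the two groups the choice
makes no difference. The function name is hypothetical,
and the unit of analysis here is the course (5 IE and 14
T), not the student.

```python
from math import sqrt

def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d with a sample-size-weighted pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)

# Summary values quoted above: 5 IE (SDI) courses vs 14 T courses
d = cohens_d(0.60, 0.04, 5, 0.23, 0.04, 14)
print(round(d, 2))  # about 9.25, consistent with the quoted d = 9
```

For comparison, Cohen (1988) conventionally labels
d = 0.8 as a "large" effect.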
4444444444444444444444444444444444444444444
4. "Finally, it's also entirely possible that
students who were taught using "interactive
engagement" scored better on MCT's than students
who were taught using "traditional" course
methods due to the composition of the students in
the "interactive engagement" group. Were the
students randomly selected to each group? How
large were the groups?"
The abstract of Hake (1998a) states: ". .
.forty-eight courses (N = 4458) which made
substantial use of IE methods achieved an average
[normalized] gain <g>IE-ave = 0.48 ± 0.14 (std
dev), almost two standard deviations of <g>IE-ave
above that of the traditional courses." So, on
average, the size of the IE groups was 4458/48 =
92.9.
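The arithmetic behind "almost two standard deviations"
can be sketched in a few lines. The IE values come from
the abstract quoted above; the T-course average of 0.23
is the value quoted earlier in this post for the 14 T
courses. Variable names are hypothetical.

```python
n_students, n_courses = 4458, 48
avg_course_size = n_students / n_courses  # 92.875 students per IE course

g_ie, sd_ie = 0.48, 0.14  # IE courses: average <g> and its std dev
g_t = 0.23                # T courses: average <g>

# Separation of the two averages, in units of the IE standard deviation
separation = (g_ie - g_t) / sd_ie
print(avg_course_size, round(separation, 2))
```

The separation comes out just under 1.8 IE standard
deviations, which is the "almost two" of the abstract.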
As to random selection, I'll repeat from my
previous post "Re: Why MCT's? 2nd Try" [Hake
(2007b)]: in "Should We Measure Change? Yes!"
[Hake (2007c)] I wrote:
HHHHHHHHHHHHHHHHHHHHHHHHHH
THE VIEW FROM U.S. DEPARTMENT OF EDUCATION
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
"History" and maturation are among the nine
threats to internal validity listed in Table 2.4
of Shadish et al. (2002), are discussed on pages
56-57 of that text, and are reiterated by the
PEP. . . [Psychologist, Education specialist,
Psychometrician]. . . dominated "Coalition for
Evidence-Based Policy" (CEBP) at the U.S. Dept.
of Education [USDE (2003)]:
USDE-USDE-USDE-USDE-USDE
There is persuasive evidence that the randomized
controlled trial, when properly designed and
implemented, is superior to other study designs
in measuring an intervention's true effect.
1. "Pre-post" study designs often produce
erroneous results. Definition: A "pre-post" study
examines whether participants in an intervention
improve or regress during the course of the
intervention, and then attributes any such
improvement or regression to the intervention.
The problem with this type of study is that,
without reference to a control group, it cannot
answer whether the participants' improvement or
decline would have occurred anyway, even without
the intervention. This often leads to erroneous
conclusions about the effectiveness of the
intervention.
USDE-USDE-USDE-USDE-USDE
But CEBP's criticism of pre/post testing is
irrelevant for the recent pre/post studies in
physics. The reason is that control groups HAVE
been utilized - they are the introductory courses
taught by the traditional method. The matching is
due to the fact that (a) within any one
institution the test [interactive engagement
(IE)] and control [traditional (T)] groups are
drawn from the same generic introductory course
taken by relatively homogeneous groups of
students, and (b) IE course teachers in all
institutions are drawn from the same generic pool
of introductory course teachers who, judging from
uniformly poor average normalized gains <g> they
obtain in teaching traditional (T) courses, do
not vary greatly in their ability to enhance
student learning.
HHHHHHHHHHHHHHHHHHHHHHHHHH
Furthermore, it's surprising that Peter appears
to side with the USDE in their mistaken idea that
randomized control trials (RCT's) are the gold
standard of assessment.
In "Will the No Child Left Behind Act Promote
Direct Instruction of Science?" [Hake (2005)], I
gave, as one of the seven reasons why "Direct
Science Instruction" threatens to predominate
nationally under the aegis of the No Child Left
Behind Act, the following [see that article for
references other than Shavelson & Towne (2002)]:
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
MOST INTERACTIVE ENGAGEMENT AND GUIDED INQUIRY
METHODS HAVE NOT BEEN TESTED IN RANDOMIZED
CONTROL TRIALS (RCT'S), THE "GOLD STANDARD" OF
THE U.S. DEPT. OF EDUCATION (USDE)
That a single research method should be
designated as the "gold standard" for evaluating
an intervention's effectiveness appears
antithetical to the report of the NRC's Committee
on Scientific Principles for Education Research
[Shavelson & Towne (2002) - ST]. ST state that
scientific research should "pose significant
questions that can be investigated empirically,"
and "use methods that permit direct investigation
of the questions."
Furthermore, the USDE's RCT gold standard is
considered problematic by a wide array of
scholars. Taking issue with the RCT gold standard
are philosophers Dennis Phillips [Shavelson,
Phillips, Towne, & Feuer (2003)] and Michael
Scriven (2004); mathematicians Burkhardt &
Schoenfeld (2003); engineer Woodie Flowers
[Zaritsky, Kelly, Flowers, Rogers, Patrick
(2003)]; and physicist Andrea diSessa [Cobb,
Confrey, diSessa, Lehrer, & Schauble (2003)].
In addition, the following organizations oppose the RCT gold standard:
(a) American Evaluation Association (AEA)
<http://www.eval.org/doestatement.htm>,
(b) American Educational Research Association (AERA)
<http://www.eval.org/doeaera.htm>, and
(c) National Education Association
<http://www.eval.org/doe.nearesponse.pdf> (88 kB).
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Richard Hake, Emeritus Professor of Physics, Indiana University
24245 Hatteras Street, Woodland Hills, CA 91367
<rrhake@earthlink.net>
<http://www.physics.indiana.edu/~hake>
<http://www.physics.indiana.edu/~sdi>
"Conflict is the gadfly of thought. It stirs us
to observation and memory. It instigates to
invention. It shocks us out of sheep-like
passivity, and sets us at noting and contriving.
Not that it always effects this result; but that
conflict is a sine qua non of reflection and
ingenuity."
John Dewey "Morals Are Human," Dewey: Middle Works, Vol.14, p. 207.
REFERENCES [Tiny URL's courtesy <http://tinyurl.com/create.php>.]
Campbell, P. 2007a. "Re: Lauren Resnick and
higher-order thinking skills," ARN-L post of 27
Jan 2007 12:10:35-0600; online at
<http://interversity.org/lists/arn-l/archives/Jan2007/msg00184.html>.
Campbell, P. 2007b. "Re: Why MCT's? 2nd Try,"
ARN-L post of 8 Mar 2007 14:20:55 -0600, online at
<http://interversity.org/lists/arn-l/archives/Mar2007_date/msg00035.html>.
Cohen, J. 1988. "Statistical power analysis for
the behavioral sciences." Second edition.
Lawrence Erlbaum.
Hake, R.R. 1992. "Socratic pedagogy in the
introductory physics lab." Phys. Teach. 30:
546-552; updated version (4/27/98) online at
<http://www.physics.indiana.edu/~sdi/SocPed1.pdf>
(88 kB).
Hake, R.R. 1998a. "Interactive-engagement vs traditional methods: A
six-thousand-student survey of mechanics test
data for introductory physics courses," Am. J.
Phys. 66(1): 64-74; online at
<http://www.physics.indiana.edu/~sdi/ajpv3i.pdf> (84 kB).
Hake, R.R. 1998b. "Interactive-engagement methods
in introductory mechanics courses," online at
<http://www.physics.indiana.edu/~sdi/IEM-2b.pdf>
(108 kB) - a crucial companion paper to Hake
(1998a).
Hake, R.R. 2002a. "Lessons from the physics
education reform effort," Ecology and Society
5(2): 28; online at
<http://www.ecologyandsociety.org/vol5/iss2/art28/>.
Ecology and Society
(formerly Conservation Ecology) is a free online
"peer-reviewed journal of integrative science and
fundamental policy research" with about 11,000
subscribers in about 108 countries.
Hake, R.R. 2002b. "Assessment of Physics Teaching
Methods," Proceedings of the UNESCO ASPEN
Workshop on Active Learning in Physics, Univ. of
Peradeniya, Sri Lanka, 2-4 Dec. ; online at
<http://www.physics.indiana.edu/~hake/Hake-SriLanka-Assessb.pdf> (84 kB).
Hake, R.R. 2005. "Will the No Child Left Behind
Act Promote Direct Instruction of Science?" Am.
Phys. Soc. 50: 851 (2005); APS March Meeting, Los
Angeles, CA. 21-25 March; online at
<http://www.physics.indiana.edu/~hake/WillNCLBPromoteDSI-3.pdf> (256 kB).
Hake, R.R. 2006. "Possible Palliatives for the
Paralyzing Pre/Post Paranoia that Plagues Some
PEP's" [PEP's = Psychometricians, Education
specialists, and Psychologists], Journal of
MultiDisciplinary Evaluation, Number 6, November,
online at
<http://evaluation.wmich.edu/jmde/JMDE_Num006.html>.
Hake, R.R. 2007a. "Re: Why MCT's? (was Lauren
Resnick and higher-order thinking skills),"
online only at the PhysLrnR archives
<http://tinyurl.com/2rlyju>. Post of 6 Feb 2007
23:22:43-0600 to ARN-L and PhysLrnR.
Unfortunately, as of today, the ARN-L archives
for Feb 2007 at
<http://interversity.org/lists/arn-l/archives/Feb2007_date/index.html>
are incomplete, having been last updated on Feb
05 19:14:05 2007. This archive failure has not,
to my knowledge, been explained by the ARN-L list
manager.
Hake, R.R. 2007b. "Re: Why MCT's? 2nd Try,"
ARN-L post of 7 Mar 2007 21:44:07 -0800, online
at
<http://interversity.org/lists/arn-l/archives/Mar2007/msg00033.html>.
This is the second try to post the message Hake
(2007a) to ARN-L.
Hake, R.R. 2007c. "Should We Measure Change?
Yes!" online as ref. 43 at
<http://www.physics.indiana.edu/~hake>. To appear
as a chapter in "Evaluation of Teaching and
Student Learning in Higher Education," a
Monograph of the American Evaluation Association
<http://www.eval.org/>. A severely truncated
version appears at Hake (2006).
Hake, R.R. 2007d. "Design-Based Research in
Physics Education Research: A Review," in Kelly &
Lesh (2007).
Halloun, I. & D. Hestenes. 1985a. "The initial
knowledge state of college physics students." Am.
J. Phys. 53:1043-1055; online at
<http://modeling.asu.edu/R&E/Research.html>.
Contains the "Mechanics
Diagnostic" test, precursor to the widely used
"Force Concept Inventory" [Hestenes et al. (1992)].
Halloun, I. & D. Hestenes. 1985b. "Common sense
concepts about motion." Am. J. Phys.
53:1056-1065; online at
<http://modeling.asu.edu/R&E/Research.html>.
Hestenes, D., M. Wells, & G. Swackhamer, 1992.
"Force Concept Inventory," Phys. Teach. 30:
141-158; online (except for the test itself) at
<http://modeling.asu.edu/R&E/Research.html>. The
1995 revision by Halloun, Hake, Mosca, & Hestenes
is online (password protected) at the same URL,
and is available in English, Spanish, German,
Malaysian, Chinese, Finnish, French, Turkish,
Swedish, and Russian.
Kelly, A.E. & R.A. Lesh, eds. 2007. "Handbook:
Design-Based Research in Education, " in
preparation. Mahwah, NJ: Lawrence Erlbaum
Associates.
Lakatos, I. and A. Musgrave, eds. 1970.
"Criticism and the growth of knowledge."
Cambridge University Press, information at
<http://tinyurl.com/2lnyto>.
Popper, K. 1959. "The Logic of Scientific Discovery." Basic Books.
Popper, K. 1965. "Conjectures and Refutations." Basic Books.
Shadish, W.R., T.D. Cook, & D.T. Campbell. 2002.
"Experimental and Quasi-Experimental Designs for
Generalized Causal Inference." Houghton Mifflin.
A goldmine of references to the social-science
literature of experimentation. Amazon.com
information at <http://tinyurl.com/yowod6>. Note
the "Look inside this book" feature.
Shavelson, R.J. & L. Towne, eds. 2002.
"Scientific Research in Education," National
Academy Press; online at
<http://www.nap.edu/catalog/10236.html>.
USDE. 2003. U.S. Department of Education,
"Identifying and Implementing Educational
Practices Supported by Rigorous Evidence: A User
Friendly Guide." Institute of Education
Sciences, National Center for Education
Evaluation and Regional Assistance. The entire
guide is online at
<http://www.ed.gov/rschstat/research/pubs/rigorousevid/rigorousevid.pdf>
(140 KB).