Which scientific reasoning research-based assessment should I use in my class?

posted April 10, 2021
by Adrian Madsen, Sarah B. McKagan, Eleanor C. Sayre, and Cassandra A. Paul

This recommendation initially appeared as an article in the American Journal of Physics: A. Madsen, S. B. McKagan, E. C. Sayre, and C. A. Paul, Resource Letter RBAI-2: Research-based assessment instruments: Beyond physics topics, Am. J. Phys. 87, 350 (2019).

Scientific reasoning is an important skill that many faculty would like their students to develop. Most generally, we can think of scientific reasoning skills as those needed to conduct scientific inquiry, including evidence evaluation, inference, and argumentation to form theories about the natural world (Zimmerman 2007). Two assessments of scientific reasoning have been used in physics: the Lawson Classroom Test of Scientific Reasoning (CTSR) (Lawson 1978) and the Scientific Abilities Assessment Rubrics (SAARs) (Etkina et al. 2006). The Physics Lab Inventory of Critical Thinking (PLIC) (Holmes and Wieman 2016) also assesses aspects of students’ scientific reasoning skills, but it focuses on reasoning in the context of labs and is discussed in a separate expert recommendation.

Title: Lawson Classroom Test of Scientific Reasoning (CTSR)
Content: Proportional thinking, probabilistic thinking, correlational thinking, hypothetico-deductive reasoning
Format: Multiple-choice
Intended population: Intro college, high school, middle school
Purpose: Measure concrete- and formal-operational reasoning.

Title: Scientific Abilities Assessment Rubrics (SAARs)
Content: Represent information in multiple ways, design and conduct experiments, communicate scientific ideas, collect and analyze experimental data, evaluate experimental results
Format: Rubric
Intended population: Intro college, high school
Research validation: Gold
Purpose: Assess students’ scientific abilities as evidenced in their written work on experiments or design tasks, and help students self-assess as they develop these abilities.

Lawson Classroom Test of Scientific Reasoning (CTSR)

The Lawson Classroom Test of Scientific Reasoning (CTSR) (Lawson 1978) is a multiple-choice pre/post-test with questions on conservation, proportional thinking, identification of variables, probabilistic thinking, and hypothetico-deductive reasoning. Lawson describes scientific reasoning as consisting of “a mental strategy, plan, or rule used to process information and devise conclusions that go beyond direct experience” (Lawson 2004). The CTSR was originally intended to help instructors classify students’ reasoning abilities as concrete, transitional, or formal, based on the total number of questions they answer correctly. However, this method of classifying students’ reasoning abilities using their CTSR score is antiquated, oversimplified, and problematic (Lawson 1992), and it is not clear that the CTSR measures only one construct.

The validity of the most recent version of the CTSR has recently been studied, and issues were found (Bao et al. 2018). Of the 12 pairs of questions on the CTSR, 5 pairs were found to have design problems; for example, students answered incorrectly even though they understood the content, or they were confused by the details given and misinterpreted the question. The problematic questions came from three clusters: proportional reasoning, control of variables, and hypothetico-deductive reasoning. For a more thorough discussion of the validity of the CTSR, see Bao et al. 2018.

Because of these shortcomings, the overall score should not be used to classify individual students, and the results should not be used as a stand-alone proxy for your students’ reasoning abilities. Instead, looking at the change in the overall CTSR score between pre- and post-test for your class can give you a sense of how your course influences students’ reasoning abilities, because some of the effects of poor question design will probably average out over larger numbers of students. If you do this, you should use the CTSR in combination with other measures of reasoning abilities. Instructors can also use the percentage correct on the CTSR for each cluster of questions to get a sense of their students’ strengths and weaknesses around different aspects of scientific reasoning (while taking into account that questions in the proportional reasoning, control of variables, and hypothetico-deductive reasoning clusters have design issues).

The Lawson test was developed for high school students but has also been used at the introductory college level. The questions were originally based on demonstrations: the instructor would perform a demonstration and then ask students questions about it in an interview format. The most recent version converts these interview questions into a multiple-choice paper-and-pencil test.
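
The two class-level summaries described above (the change in overall score between pre- and post-test, and the percentage correct per cluster) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response data are made up, the function names are hypothetical, and the `CLUSTERS` mapping of question positions to clusters is a placeholder, not the real CTSR layout. Substitute the actual answer key and cluster assignments for the version of the test you use.

```python
from statistics import mean

# HYPOTHETICAL mapping of cluster name -> question positions (0-based).
# This is a placeholder, not the real CTSR question layout.
CLUSTERS = {
    "proportional reasoning": [0, 1],
    "control of variables": [2, 3],
    "probabilistic thinking": [4, 5],
}

def class_mean_percent(responses):
    """Class average of each student's percent correct.

    `responses` is a list of per-student lists of 0 (wrong) / 1 (right).
    """
    return mean(100 * sum(r) / len(r) for r in responses)

def cluster_percent(responses, clusters):
    """Percent correct for each cluster, pooled over all students."""
    result = {}
    for name, items in clusters.items():
        correct = sum(r[i] for r in responses for i in items)
        result[name] = 100 * correct / (len(responses) * len(items))
    return result

# Made-up scored responses: 3 students, 6 questions each.
pre = [[1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 0], [0, 0, 1, 0, 1, 1]]
post = [[1, 1, 1, 0, 1, 0], [1, 1, 0, 1, 1, 0], [1, 0, 1, 1, 1, 1]]

# Change in the class-average score from pre-test to post-test.
change = class_mean_percent(post) - class_mean_percent(pre)
print(f"Change in class mean: {change:.1f} percentage points")

# Per-cluster percent correct, before and after instruction.
pre_clusters = cluster_percent(pre, CLUSTERS)
post_clusters = cluster_percent(post, CLUSTERS)
for name in CLUSTERS:
    print(f"{name}: {pre_clusters[name]:.1f}% -> {post_clusters[name]:.1f}%")
```

A three-student class like this toy example is far too small for question-design effects to average out; the sketch only shows the arithmetic, and the per-cluster numbers for clusters with known design issues should be read with the caution noted above.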

Scientific Abilities Assessment Rubrics (SAARs)

The Scientific Abilities Assessment Rubrics (SAARs) (Etkina et al. 2006) are a set of rubrics used to assess students’ levels of competence around seven different scientific abilities, which are as follows:

  1. The ability to represent physical processes in multiple ways.

  2. The ability to devise and test a qualitative explanation or quantitative relationship.

  3. The ability to modify a qualitative explanation or quantitative relationship.

  4. The ability to design an experimental investigation.

  5. The ability to collect and analyze data.

  6. The ability to evaluate experimental predictions and outcomes, conceptual claims, problem solutions, and models.

  7. The ability to communicate.

The SAARs are used to assess specific scientific abilities as evidenced in students’ written work around experiments or design tasks. The rubrics outline the different levels of performance (0, missing; 1, inadequate; 2, needs some improvement; and 3, adequate) and include a description of each level, to enable students to self-assess as they work toward developing these abilities. In this way, the SAARs enable formative assessment of students’ scientific abilities. Instructors can also use the SAARs to assess their students’ acquisition of these scientific abilities by scoring students’ laboratory write-ups for a particular experiment or design task from 0 to 3 using the descriptions of the different levels on the rubric. Instructors can then compare the distribution of scores for a particular scientific ability at the beginning and end of the course in hopes of seeing more students scoring “adequate.” The SAARs were developed in the context of introductory college courses, though they may also be appropriate for high school and intermediate college classes. The list of scientific abilities is based on literature on the history of the practice of physics, a taxonomy of cognitive skills, recommendations of science educators, and an analysis of science-process test items.
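
The beginning-versus-end comparison of rubric scores described above can be tallied as follows. This is a minimal sketch, not part of the SAARs themselves: the scores are made up, and `score_distribution` is a hypothetical helper name. Only the four level labels come from the rubric description above.

```python
from collections import Counter

# Rubric levels as described in the text.
LEVELS = {
    0: "missing",
    1: "inadequate",
    2: "needs some improvement",
    3: "adequate",
}

def score_distribution(scores):
    """Fraction of write-ups at each rubric level for one scientific ability."""
    counts = Counter(scores)
    total = len(scores)
    return {label: counts.get(level, 0) / total for level, label in LEVELS.items()}

# Made-up scores for one ability (for example, "collect and analyze data"),
# one score per student write-up.
early = [0, 1, 1, 2, 2, 2, 3, 1]   # first lab of the course
late = [1, 2, 2, 3, 3, 3, 3, 2]    # last lab of the course

early_dist = score_distribution(early)
late_dist = score_distribution(late)

# Print the share of write-ups at each level, early versus late.
for label in LEVELS.values():
    print(f"{label:>22}: {early_dist[label]:.0%} -> {late_dist[label]:.0%}")
```

Comparing whole distributions rather than a single average keeps the rubric levels interpretable: an instructor can see directly whether more write-ups reach “adequate” and whether fewer are “missing.”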

Recommendations for choosing a scientific reasoning assessment

Use the CTSR if you want to assess your students’ reasoning skills, possibly in conjunction with an appropriate test of their mathematical skill or physics content knowledge, and alongside other assessments of reasoning skills. Many shortcomings have been identified in the CTSR, and relying on the CTSR score as the only measure of your students’ reasoning skills would be problematic. Do not use this test if you need more detailed information about a specific student or group of students (such as for placement into a particular class), because the design and validity issues with the test do not average out over smaller numbers of students.

Use the SAARs to help your students self-assess their scientific abilities in lab courses. You can also use the SAARs as an instructor to give your students feedback on their competency around specific scientific abilities and sub-abilities and look at how your students’ scores change over the course of your class.