Assessments and Testing – Are We Doing It Correctly?

What Can We Learn From the Research and Clinical Sciences Models?

Several years ago, my colleague, Dr. Paul Eslinger, and I were asked to write a short piece about “high-stakes” testing in K-12 systems for The Education Policy and Leadership Center. At the time, Paul was a neuropsychologist who worked with patients suffering from various forms of cognitive impairment caused by medical conditions ranging from accident-related head trauma to stroke, brain tumors, Alzheimer’s disease, and dementia. At the same time, I was Chief of Developmental Pediatrics and Learning at the College of Medicine at Pennsylvania State University. It was at around this time, in 2000-2004, that I became interested in translating basic and medical neurocognitive research to human learning, with emphasis on how such information might be applied in classrooms.

The importance of “high stakes” testing came to the national forefront as a result of the push for annual improvement on such tests inherent in the then newly approved No Child Left Behind program. The following is an abbreviated portion of our discussion that will serve nicely as background for more specific discussions of assessment in science education in upcoming blogs.

Scientists Use Tests All the Time

Science performs many tests as it tries to understand the physical and biological principles that make up the world around us. Tests and analysis of test results, it might correctly be argued, are the indispensable tools of modern science. However, scientists know that each test is subject to artifacts involved in both the collection and interpretation of data. To help mitigate this unavoidable property of scientific investigation, researchers typically employ a variety of tests whenever possible. Multiple avenues of examination can expose anomalous results of a particular test or, even better, confirm results through several different strategies of analysis. How many tests are enough to be confident of a given physical or biological phenomenon? Simply put, one can never be too confident. The more independent lines of evidence the better. Further, the greater the importance of a particular scientific investigation – the greater the significance of the findings – the more independent tests and analyses are appropriate. Nowhere is this point more important than in fields of investigation in which tests directly impact humans, such as the medical and clinical sciences. Few diagnoses rely on a single test of any kind. The physical examination is coupled with multiple laboratory test results, radiological imaging studies, second opinions by independent experts and so on, in order to achieve convergence of results and consensus of findings. In this very important, ‘high stakes’ endeavor of human health and welfare, no single test, or series of tests for that matter, is given simply to determine if the patient or doctor passes or fails! Instead, standardized tests provide one means of ascertaining the physical state of an individual at a given moment in time but must be combined with other assessments in order to be a sensitive and specific guide for diagnosing a disorder. Then, some of the same tests are used again to evaluate progress and treatment effectiveness and to demonstrate the ultimate cure.
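To make the value of converging evidence concrete, here is a minimal sketch of how confidence in a diagnosis can grow as independent tests agree. It is only an illustration, not a clinical method: the function name, the 10% starting probability, and the 90% sensitivity and specificity are all hypothetical numbers chosen for easy arithmetic, and the calculation assumes the tests are independent of one another given the patient's true state.

```python
def update_probability(prior, sensitivity, specificity, positive=True):
    """Apply Bayes' rule for one test result.

    prior       -- probability of the condition before this test
    sensitivity -- P(test positive | condition present)
    specificity -- P(test negative | condition absent)
    """
    if positive:
        p_result_if_present = sensitivity
        p_result_if_absent = 1.0 - specificity
    else:
        p_result_if_present = 1.0 - sensitivity
        p_result_if_absent = specificity
    numerator = p_result_if_present * prior
    return numerator / (numerator + p_result_if_absent * (1.0 - prior))


# Hypothetical values, chosen only for illustration.
PRIOR = 0.10        # assumed starting probability of the condition
SENSITIVITY = 0.90  # assumed for every test
SPECIFICITY = 0.90  # assumed for every test

# Each positive result from an independent test updates the previous estimate.
after_one = update_probability(PRIOR, SENSITIVITY, SPECIFICITY)
after_two = update_probability(after_one, SENSITIVITY, SPECIFICITY)
after_three = update_probability(after_two, SENSITIVITY, SPECIFICITY)

for count, probability in enumerate([after_one, after_two, after_three], start=1):
    print(f"positive tests: {count}  probability of condition: {probability:.2f}")
```

Under these assumed numbers, a single positive result leaves the diagnosis at roughly even odds, while a second agreeing result raises it to about 90 percent. If the tests were not truly independent, each additional result would add less confidence than this sketch suggests.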

Clearly, this discussion of the prudent use of converging and discriminating tests and exams in scientific investigation and medical practice has some relevance when we discuss high stakes testing in the basic education environment. If science and medicine had chosen, instead, to search for and use a single test to assess all aspects of illness and individuals, with no attempt to independently confirm its validity and predictive value, we would likely not have progressed much over the years.

Declarative and Procedural Memory

The brain has several inter-related memory systems. Two of the most important for education are declarative and procedural memory. Declarative memory refers to the conscious recollection of facts, knowledge, experiences, and events. This system mediates the general, specific and personal aspects of declarative knowledge that are acquired through intentional learning.

In contrast, procedural memory refers to sensory-motor and skill-based learning (e.g., knowledge of how to ride a bike, play the piano, use a microscope, etc.), acquired through direct experience. Procedural memory does not require explicit recollection of initial learning experiences but rather provides the accumulated benefits of hands-on learning and knowledge. Clearly, this critical memory system is not assessed well, if at all, on standardized tests.

Learning and Consolidation Processes

Learning and consolidation processes are closely linked to children’s abilities to both register and retain new information and experiences. While key aspects of learning are better understood (e.g., being prepared, having an overview, paying attention, using and manipulating new information in interesting ways, etc.), consolidation is less clear to many educators. Consolidation is the process responsible for transferring new information from short-term to long-term memory for later retrieval (see figure, below).

[Figure: IPM_Web_SM]

As new information must be “processed” for this to happen, numerous teaching approaches are directed at increasing the amount of attention and processing that are applied. That is, appealing to students’ natural interests, previous knowledge, and “higher order” thinking when consolidating new information will increase the amount of processing the new material receives. On the other hand, simple memorization requires far less processing and little critical thinking. The use of externalizations, such as “hands-on” and “inquiry-based” activities, may be particularly beneficial for increasing information processing.

Consolidation is a complex process in which the brain converts new material from fleeting traces of information to long-term memories. These are then stored in more stable form with existing knowledge in brain areas called association cortices. New information begins this process by entering the brain as new input from various stimuli. Brain structures involved at this level include the cortical areas that serve the major senses of touch (somatosensory cortex), sound (auditory cortex) and sight (visual cortex).

This multi-modal information is then ready for frontal lobe analysis, where executive function helps to filter, compare and interweave it with existing information already present in long-term storage. Thus, sensory-perceptual traces must be “held onto” mentally until the consolidation processes are completed. The overall process of holding onto new information while comparing it to existing information is sometimes referred to as working memory.

Once consolidated, long-term memories are organized into knowledge structures throughout the various association cortices for later retrieval. Thus, students store information in a variety of places in the brain. Further, students retrieve stored information in different ways, largely dependent upon the manner in which the information was initially stored and the way they are asked to retrieve it. Lists of facts, names, dates, geographical locations and so on, without developed meaning, may be difficult to retrieve in isolation – particularly as time passes. Similar information and facts that are more critically developed by the teacher and “deeply processed” by the student will be easier to recall from memory and, more importantly, will be available for more in-depth future thinking. A student’s ability to “transfer” learned material to new situations and circumstances is a good indication of how well they actually know and understand the material. It is difficult for any single standardized test to assess a student’s ability to recall and use information located in all of the many brain structures where long-term memories are stored.


The Interrelationships Among Learning, Intelligence, and Executive Function

Critical thinking and problem solving, perhaps surprisingly, are not really measured through standardized tests of intelligence, specific content knowledge, and even operational skills and judgment when those tests are devoid of appropriate, real-life contexts. This has been demonstrated through numerous kinds of studies in child development and neuropsychology. For example, Eslinger and Damasio (1985) described a patient who retained exceptional intellectual, reasoning, operational skill, and even judgment capacities after a brain tumor, as long as he was assessed with paper and pencil types of measures (e.g., who was president during the Civil War, why do we have child labor laws, what is the basis for the federal taxation system, etc.). All of these measures were ‘out of context’ and not related to any real-life tasks or situations. His scores were in the superior range. However, in any real-life setting, this person could not organize his work, formulate and follow a step-by-step plan to complete an assignment, look ahead and anticipate possibilities, or accurately monitor his own progress. In short, his executive function was impaired.

In child development and throughout adulthood, intelligence and executive function do not go hand in hand. There are virtually no correlations between measures of intelligence (encompassing specific content knowledge, some problem-solving skills and general judgment) and measures of executive function (encompassing capacities for planning, organization, working memory, anticipation, and self-monitoring) (Archibald and Kerns, 1999; Ardila, 1999; Crinella and Yu, 2000; Eslinger and Damasio, 1985; Welsh et al., 1991). Furthermore, while intelligence and standardized paper and pencil tests may adequately assess “what” a child knows, executive function is thought to underlie the “how” of learning and knowledge, such as how Gettysburg was related to the outcome of the Civil War and how energy is related to motion.

Fortunately, executive function is a teachable skill that students can acquire and is related to every content area they study (Eslinger, 1997). Executive function is demonstrated through the process of critical thinking, how children go about problem solving (e.g., identifying and utilizing resources, formulating a plan, seeking feedback, improving upon their first attempt, etc.), and the product of those collective cognitive and behavioral processes.



The utility of employing a single, annual, formalized test to ascertain student achievement will inescapably be linked to the ability of that test to assess all that we view as important in the education of a child. This is no small challenge. As discussed above, an important role of our frontal lobe is the phenomenon known as “executive function”. Executive function is intimately involved in our ability to think critically, solve problems, plan for the future, and follow and modify our plans as new situations arise, while keeping our initial goals in mind. These skills, one could argue, are at the very heart of what we would hope all educated citizens could do (Verner, 2002). While we may well demand that the development and cultivation of executive function be a priority accomplishment of public education, we cannot easily assess, through existing high stakes testing, whether or not it has actually done so. Furthermore, as discussed above, evidence actually suggests that there is little, if any, correlation between executive function and IQ, another cognitive parameter assessed through a formalized, paper and pencil test.

Recommendations for Student Assessments Based on Cognitive Science Considerations

  • Use multiple assessment approaches in order to categorize an individual student’s many strengths and weaknesses. Do not use a single test.
  • In addition to sit-down, “paper and pencil” exams, use testing approaches that can ascertain the student’s procedural knowledge (“how”) abilities.
  • Include significant analysis of executive function capabilities in student assessments.
  • Ensure that recall of information from memory is connected as closely as possible to the mechanism by which it was committed to memory. Try to develop more valid links between instruction and assessment that capitalize on the benefits of contextual cues and setting.
  • Ideally, try to use assessment results as an integral component of a student’s instruction, rather than exclusively as a measure of success or failure. Use assessment results to direct future instruction and/or remediation of individual students.
  • When using tests in the analysis of performance of teachers, schools or school districts, take into account the natural range of student abilities and accentuate longitudinal, multiple parameter analysis as opposed to single, high stakes exams. Just as it is difficult to assess an individual student’s overall success by a single test at a single point in time, it is also difficult to assess the success of an entire educational system by the same measure.
  • Given the political and financial importance associated with high stakes testing to school districts, policy makers should be aware of potential negative impacts of such tests on education. The justifiable desire to assure and monitor quality education for all children in the basic education system may inadvertently result in undesirable instructional strategies such as:
  1. “Teaching to the test” (which may ultimately lead to less content and more test-taking instruction),
  2. Minimizing the importance of procedural skills because they will not be tested on formalized tests (such as correctly operating scientific instruments, reciting poetry, participating in academic debates, drawing a schematic of a model, and so on),
  3. Minimizing the development of executive function abilities as these are not readily assessed by standardized tests,
  4. Diminishing the joy of and respect for learning and the student’s desire to continue school through graduation and beyond.

Finally, the desire to assure and monitor quality education for all children is commendable. However, that goal is also important enough to warrant frequent analysis using multiple forms of testing, with the intention of using results to improve the education of individual students and the instructional strategies of local educational systems. In upcoming blogs we will begin to discuss, much more specifically, how assessments can be used as an integral component of science education and a driving force behind the development of critical thinking skills.


Archibald, S.J., Kerns, K.A. (1999). Identification and description of new tests of executive functioning in children. Child Neuropsychology 5: 115-129.

Ardila, A. (1999). A neuropsychological approach to intelligence. Neuropsychology Review 9: 117-136.

Crinella, F.M., Yu, J. (2000). Brain mechanisms and intelligence. Psychometric g and executive function. Intelligence 27: 299-327.

Eslinger, P.J. (1997). Brain development and learning. Basic Education 41: 6-8.

Eslinger, P.J., Damasio, A.R. (1985). Severe disturbance of higher cognition after bilateral frontal lobe ablation: Patient EVR. Neurology 35: 1731-1741.

Verner, K. (2001). Connections in the Classroom: Brain-Based Learning. Basic Education 45: 3-7.

Verner, K. (2002). Transcending the Status Quo: Scientists and school educators need to join forces to raise student proficiency in science. HHMI Bulletin.

Welsh, M.C., Pennington, B.F., Groisser, D.B. (1991). A normative developmental study of executive function: A window on prefrontal function in children. Developmental Neuropsychology 7: 131-149.