Abstract
Objective. To provide a user’s guide to measuring metacognition in authentic contexts so that educators and researchers can explore students’ metacognition with the aim of improving students’ metacognitive processes and achievement.
Findings. Metacognition can be measured in a variety of ways depending on whether the interest is knowledge, monitoring, or control. These methods include administering surveys, comparing students’ predictions with their actual performance on examinations, and investigating students’ decisions during the learning process.
Summary. Metacognition refers to people’s knowledge about and regulation of their cognitive processes. These aspects of metacognition are important for supporting students’ success in academic and experiential settings. In particular, students who recognize successful learning strategies can accurately monitor their own progress and make effective study decisions that are more likely to help them meet their learning goals. Thus, measuring metacognitive knowledge, monitoring, and control can help educators identify struggling students who may benefit from interventions to improve their metacognitive processes.
INTRODUCTION
Riley begins studying for an examination she will take tomorrow in her pharmacy course. She starts by diligently reviewing her notes from class. As she flips through her notes, she feels confident that she understands infusion kinetics and decides she does not need to devote any more effort to learning that topic. However, she keeps getting the concepts of hepatic clearance and renal clearance confused. To overcome this confusion, Riley decides to reread the section of her textbook about these topics. During the examination, Riley is feeling a bit anxious because although she studied the evening before, she still feels that she does not fully understand hepatic and renal clearance. She completes her examination but before turning it in, she goes through each question and puts a check next to the ones she is uncertain about so she can look up the answers later. Based on this rough analysis, she estimates that she will receive a score of about 80% on the examination. She plans to begin studying earlier for future examinations so that she can space out her study times and use different strategies in hopes of improving her learning and performance on the next examination.
Defining Metacognition
In the aforementioned scenario, Riley is using different aspects of metacognition. Metacognition is generally defined as “thoughts about one’s own thoughts,”1 or more specifically, “one’s knowledge concerning one’s own cognitive processes and products or anything related to them.”2 Metacognition consists of three primary components: knowledge, monitoring, and control (Table 1).3 Metacognitive knowledge refers to facts and beliefs about how we learn, such as our knowledge about the effectiveness of learning strategies or our beliefs about our own ability to learn (ie, self-efficacy4). Riley presumably believed that studying the night before an examination (ie, cramming) and rereading were effective learning strategies, but after taking the examination, she realized that these strategies may have been limited. Metacognitive monitoring refers to evaluating the process of learning or one’s current state of knowledge. Riley engaged in monitoring to discover that she felt confident in her understanding of infusion kinetics (and less confident about hepatic clearance) and to estimate her performance on the examination. Metacognitive control is the regulation of learning activities. Riley decided when she was going to study, what strategies she would use, and, after the examination, how she would study differently for future examinations. Each of these aspects of metacognition can limit or enhance learning depending on the quality of the students’ knowledge, monitoring, and control processes; how students regulate their learning can make a real difference in their ultimate success.
Table 1. Aspects of Metacognition for Educators to Consider When Developing Strategies to Improve Student Learning
Most educators have probably encountered students like Riley who are using metacognitive knowledge and monitoring to inform how they will control their learning and test taking. Some students’ study habits will be sophisticated, while others’ will be much less so. The latter students may have difficulty monitoring their progress and may even believe that some ineffective strategies are effective, which can undermine their learning. Accordingly, this review offers advice on how to assess students’ metacognition by examining the kinds of questions that can be asked (and empirically answered) about students’ metacognition within authentic educational settings, including the typical measures of metacognition and the limitations associated with each. Our goal in reviewing these measures is twofold. First, we want to help educators and discipline-based education researchers5 assess metacognition, which can inform instruction. If, for example, an educator discovers that many students tend to be overconfident about their knowledge of particular concepts, the educator can alert students not to underestimate the difficulty of the content and might decide to spend more time teaching those concepts. Second, measuring metacognition across time will allow educators to evaluate whether an intervention (eg, a change in instruction) improves student metacognition (for advice on conducting instructional research in a classroom, see Dunlosky and colleagues6). Unlike other factors related to achievement (eg, intelligence), metacognition is not hardwired: students’ inaccurate beliefs can be changed, they can learn to use effective strategies, and they can be trained to improve their monitoring and control of learning.7 Thus, finding ways to measure and improve students’ metacognitive knowledge, monitoring, and control can lead to insights into how to improve their achievement.
To help foster research, a list of questions that can be asked about metacognition, both within classroom and experiential settings, is presented in Table 2. Also, although some concepts are illustrated in the context of a pharmacy course, the measurement of metacognition is relevant to any course. Thus, if an educator encounters a particular question worth addressing, this overview can provide some basic research tools to begin answering it within a course, whether it focuses on pharmacy, chemistry, physics, psychology, or other subjects. For each question, we included representative articles that sought to answer the question and which provide in-depth illustrations on how one might approach answering a particular question. By no means is this list comprehensive. Instead, it identifies some of the questions that we and others have been interested in pursuing, both within laboratory settings and in authentic educational contexts.
Table 2. Potential Questions for Educators to Ask About Each Aspect of Metacognition When Developing Strategies to Improve Student Learning
Measuring Metacognitive Knowledge and Beliefs
Research focused on learning has identified the relative effectiveness of different study strategies.8,9 However, do students know which strategies are most effective and use them consistently? This type of question can be answered by assessing students’ metacognitive knowledge. In the aforementioned scenario, the student believed reviewing her notes and rereading her textbook would lead to effective learning. Unfortunately, such relatively passive strategies are less effective than strategies that require the learner to take on a more active role, such as practice testing10,11 or self-explanation.12 This example demonstrates one reason to measure students’ knowledge and beliefs: students can be misguided and have misconceptions about what works best,13 perhaps because many students are not formally taught the basic principles of learning and how to study effectively.14 Accordingly, assessing students’ knowledge and, if necessary, providing guidance to them about when and how to use effective strategies may benefit their achievement.
Another important aspect of metacognitive knowledge pertains to students’ beliefs about their ability to succeed. For instance, as students progress through a course, are they more confident in their ability to learn new material and perform well on examinations? As students advance in their experiential curriculum, do they feel more confident in completing their professional activities (eg, entrustable professional activities), such as educating patients about their medications or identifying patients’ medication-related problems? These questions refer to students’ self-efficacy, or people’s beliefs about their ability to succeed in specific situations.4 Self-efficacy is important because it is related to effective self-regulation; ie, students who are confident in their ability to succeed are more likely to set achievable goals and use effective strategies to reach them.15 Doing so ultimately leads to higher achievement, as supported by the relationship between self-efficacy and grade-point average.16 To reiterate an important point: self-efficacy pertains to a student’s belief about his or her ability and not to the student’s actual ability. Although people’s beliefs about their ability may be related to their actual ability, beliefs and performance can also be misaligned. That is, an intervention can enhance students’ self-efficacy yet have no impact on their performance. Thus, measures of self-efficacy should be viewed as beliefs about ability and not used as surrogates for ability or achievement. We recommend measuring both students’ self-efficacy (a belief) and their performance so that the degree to which students’ beliefs align with their performance can be explored. We return to the issue of measuring judgment accuracy in the Measuring Metacognitive Monitoring section.
Given the importance of metacognitive knowledge and beliefs, how can they be assessed? Metacognitive knowledge and beliefs are typically measured through questionnaires, such as those listed in Table 3. Descriptions of other questionnaires measuring study habits, skills, and attitudes have been published elsewhere.17 Questionnaires are useful because they can be easily modified to meet a specific research goal and can generate informative (and extensive) data sets in a brief session. For instance, several questionnaires have multiple scales, some of which tap students’ knowledge about strategies, their use of strategies, and their self-efficacy for learning. Accordingly, an instructor may be interested in only a subset of the questions pertaining to those scales that are most relevant to the question(s) being addressed. Once equipped with information about students’ metacognitive knowledge, an instructor can address knowledge gaps and potentially use the responses to understand why particular students are struggling. With respect to motivating students to adopt more effective strategies, the good news is that instructing students about effective strategies appears to lead to students’ greater endorsement of them.13,18
Table 3. Popular Questionnaires Aimed at Measuring Students’ Knowledge About Strategies, Efficacy, and Control of Learning
Before administering any questionnaire, we suggest reviewing the properties of each scale and thinking critically about what each one is assessing. Some questionnaires combine outcomes from multiple questions into a single scale, but sometimes the answers to a specific question may be of most interest. For instance, one scale of the Motivated Strategies for Learning Questionnaire19 combines items relevant to both ineffective and effective strategies. We would not expect use of ineffective strategies to predict performance in the same way that effective strategies would, so using this learning strategies scale would not make sense if instructors are only interested in students’ knowledge about effective strategies. Therefore, it is worth carefully considering how each question of a scale is relevant to one’s assessment goals. With this recommendation in mind, note that a full discussion of the challenges associated with using questionnaires20 is beyond the scope of this review.
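To make the scale-versus-item distinction concrete, the following minimal sketch (with invented item numbers and Likert responses; the actual items belonging to any published scale should be taken from the questionnaire’s manual) shows how one might score only the subset of items relevant to a particular assessment goal rather than a full published scale:

```python
# A minimal sketch of scoring a custom subscale from questionnaire responses.
# Item numbers and responses are invented for illustration; consult the
# questionnaire's manual for the items that actually belong to each scale.

responses = {1: 5, 2: 3, 3: 6, 4: 2, 5: 7, 6: 4}  # item number -> 1-7 Likert rating

# Suppose (hypothetically) that items 1, 3, and 5 ask about effective
# strategies and the remaining items about ineffective ones; score only
# the items of interest instead of the full learning strategies scale.
effective_items = [1, 3, 5]
subscale_score = sum(responses[i] for i in effective_items) / len(effective_items)
print(f"Effective-strategies subscale: {subscale_score:.2f}")  # prints 6.00
```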
Measuring Metacognitive Monitoring
Assessments are administered to evaluate student progress toward achieving learning objectives, which allows instructors to assign grades (ie, summative evaluation21). However, assessments can also provide instructors, preceptors, and students with formative evaluation, or information about students’ understanding of the course material.22 Educators can use this information to modify their instruction to target more difficult concepts, and doing so constitutes one of the most effective teaching practices.23 Students can also use this information to identify concepts that they have not yet learned. Students can receive targeted feedback about which concepts they are struggling with, or they can discover this for themselves by monitoring their performance. Note that accurate metacognitive monitoring is crucial for students to gain this formative evaluation. For instance, Riley’s confidence in her understanding of infusion kinetics influenced her decision to stop studying that topic. If she overestimated her understanding, then she would have benefitted from studying infusion kinetics more thoroughly. That is, if students are overconfident in their knowledge of a concept, they may not spend enough time studying.24 Underconfidence may also be problematic because students may spend too much time on material that they have already learned well, which may lead them to use their limited study time inefficiently. Ultimately, accurate monitoring can support the most efficient and effective control of learning.
Monitoring can be measured at different phases of learning in the classroom or in experiential settings, as illustrated in Figure 1.25 If instructors are interested in students’ level of understanding during learning (or acquisition), they should have students judge their current understanding before attempting to demonstrate what they have learned. For instance, instructors could ask students, “How confident are you that you understand infusion kinetics?” If instructors are interested in students’ level of understanding after learning but before they answer questions on an examination, instructors could have students predict their upcoming examination performance: “How confident are you that you will correctly answer questions about infusion kinetics on the exam?” Finally, if instructors are interested in students’ level of understanding after completing an examination, they could have students estimate how many questions they answered correctly (ie, have them make retrospective judgments of their performance). In a clinical setting, instructors could ask students to judge their ability to perform particular tasks (eg, performing a comprehensive medication review) either before or after doing so and receiving feedback from their preceptor.
Figure 1. Types of Monitoring Judgments and Control Decisions and When They Can Be Measured
Note: Judgments can also vary in their level of specificity (item, category, global) at each time point. Inspired by Nelson and Narens.25
As implied by the aforementioned examples, monitoring can also be measured at different levels of specificity. At the global level, students judge their performance for all questions over an entire examination. At the concept level, students judge performance on particular concepts or topics, such as judging how well one will perform on all questions pertaining to the concept of renal clearance. At the item level, students judge whether they correctly answered a particular question on an examination. Most research on metacognitive judgments has focused on the global or item level and relatively little research has focused on the concept level. The relative lack of information about how well students can judge their concept-level knowledge is unfortunate because monitoring at this level is particularly relevant to providing useful formative evaluation. For instance, a student accurately judging that she answered about 70% of the items on an examination correctly (global level) will not necessarily help her figure out what specific concepts she does not understand. In contrast, accurately judging that her test performance was lower than desired because of her failure to perform well on questions about specific concepts (eg, infusion and nonlinear kinetics) would allow her to more effectively guide her restudying.
Collecting monitoring judgments from students can be easily incorporated into a classroom or clinical routine. For example, instructors could include a prediction cover sheet on examinations or have students provide confidence ratings as they answer quiz questions. A supervising clinician could have students assess their ability to complete tasks within a particular domain (eg, answering drug information questions). An example of a cover sheet for predicting examination performance, developed for a course entitled “Foundations in Pharmacokinetics,” is provided in Appendix 1. This cover sheet was used to obtain monitoring judgments at both the global level and the concept level, with the latter focusing on concepts tested on the upcoming examination. A similar sheet could be used after the examination to obtain students’ judgments about how well they performed. Of course, the level (and the stage of learning and testing) at which students’ monitoring is assessed will depend on the questions being addressed.
Once students have made judgments about their knowledge or test performance, how does one estimate their monitoring (or judgment) accuracy? Measuring students’ level of judgment accuracy can help instructors discover the extent to which students can accurately estimate their own knowledge and whether any change in instruction influences their judgment accuracy. Monitoring accuracy involves comparing judgments with performance, and two types of accuracy, ie, absolute and relative accuracy, highlight different aspects of the relationship between students’ judgments and their performance.
Absolute accuracy refers to how well students can estimate their actual level of performance. Absolute accuracy can be measured in three ways: bias, absolute bias, and calibration. Each measure captures a different aspect of how well the judgments match performance, and each one is first computed at the level of each student. Example values based on a hypothetical student are presented in Figure 2. In this figure, judgments and test scores at both the global and concept level are included, which pertain to the cover sheet shown in Appendix 1. This student’s global prediction was 85, with the concept-level predictions ranging from 75 (for hepatic clearance) to 95 (for infusion kinetics).
Figure 2. Judgments and Examination Performance of a Hypothetical Student
Note: The global judgment does not need to be identical to the mean concept-level judgment, and, in most cases, mean examination performance will not equal the mean performance across topics (eg, they will not be equivalent if some questions tap topics that are not represented by the judged topics or if a different number of questions tap each topic on the examination). Resolution was computed as a gamma correlation between concept-level judgments and performance (see text for alternative measures).
Bias is the mean level of judgment minus actual performance, with positive values indicating overconfidence and negative values indicating underconfidence. As shown in Figure 2, this student demonstrated overconfidence in their global judgment, with a bias of +10. For bias at the concept level, the mean judgment (in this case, M=85) is compared to actual performance, and the student’s judgments were also, on average, overconfident at the concept level (bias=+5). Note also that the student was underconfident for one of the concepts (ie, single dose intravenous kinetics). Averaging across all students’ bias scores (which would be a standard approach to presenting descriptive analyses) can be misleading because a mean of 0 (which would appear to be perfect absolute accuracy as measured by bias) could result from large discrepancies in both directions. That is, some students may show extreme overconfidence and others extreme underconfidence, yet their bias scores could average to 0. We recommend analyzing frequency distributions to evaluate whether excellent (close to 0) bias results from averaging across bias values from both over- and underconfident students. Another way to resolve this potential problem is to compute the absolute value of each difference, which is called absolute bias. Because the absolute values are computed before averaging, overconfidence and underconfidence across students cannot cancel each other out, so absolute bias provides a better estimate of how discrepant the magnitude of students’ judgments is from their actual performance.
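To illustrate these computations, here is a minimal sketch using invented global predictions and examination scores for five hypothetical students (any judgments and scores on the same percent scale would work):

```python
# A minimal sketch (with invented data) of computing bias and absolute bias.
# Judgments and examination scores are both on a 0-100 percent scale.

judgments = [85, 90, 75, 95, 80]  # hypothetical global predictions, five students
scores = [75, 95, 60, 80, 90]     # the same students' actual examination scores

# Bias: judgment minus performance; positive = overconfident.
biases = [j - s for j, s in zip(judgments, scores)]
mean_bias = sum(biases) / len(biases)

# Absolute bias: |judgment - performance|; over- and underconfidence no
# longer cancel across students, so this reflects the magnitude of error.
mean_absolute_bias = sum(abs(b) for b in biases) / len(biases)

print(f"Mean bias: {mean_bias:+.1f}")                   # +5.0 (overconfident)
print(f"Mean absolute bias: {mean_absolute_bias:.1f}")  # 11.0
```

Note how the two measures diverge: the individual biases (+10, -5, +15, +15, -10) partially cancel when averaged, whereas absolute bias preserves the size of each discrepancy.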
Another measure of absolute accuracy is called calibration, and it is based on an analysis of an entire calibration curve that maps percent performance as a function of increasing judgments. To construct a calibration curve26 and to estimate the corresponding measures of calibration, it would be ideal to have more observations per student than would be typically collected for concept-level judgments. Thus, calibration analyses would be most appropriate for judgments collected at the individual item level, such as when students provide a confidence judgment for each of their answers for an examination with many questions.
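As a rough illustration of the binning approach, the sketch below groups invented item-level confidence judgments into bins and computes the percentage correct within each bin (the bin edges are an arbitrary choice for illustration, not a standard):

```python
# A minimal sketch of a calibration analysis for item-level confidence
# judgments. Each item has a confidence rating (0-100) and a correctness
# flag (1 = correct, 0 = incorrect). All values are invented.

confidences = [30, 45, 50, 65, 70, 75, 80, 85, 90, 95]
correct =     [0,  0,  1,  1,  0,  1,  1,  1,  1,  1]

bins = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 101)]  # arbitrary edges

for low, high in bins:
    in_bin = [c for conf, c in zip(confidences, correct) if low <= conf < high]
    if in_bin:
        pct = 100 * sum(in_bin) / len(in_bin)
        print(f"Confidence {low}-{min(high, 100)}: {pct:.0f}% correct (n={len(in_bin)})")

# Plotting percent correct against each bin's midpoint yields the calibration
# curve; a perfectly calibrated student's curve falls on the identity line.
```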
To estimate absolute accuracy, students must make judgments using the same metric as task performance because it would not be appropriate to subtract values that are from different metrics (eg, a judgment made on a seven-point Likert scale is not comparable to examination performance measured on a percent scale). We recommend having students make judgments on a percent scale (eg, estimate the percentage of questions they will answer correctly) because this scale makes intuitive sense to students and is a typical metric used for scoring examinations.
Relative accuracy (also referred to as resolution) refers to students’ ability to discriminate between different levels of performance across items or concepts. Many different measures of relative accuracy have been proposed, each with its own strengths and weaknesses.27-33 One common way to estimate resolution is to compute an intra-individual correlation between students’ judgments and their performance. A strong, positive correlation is observed when a student can accurately judge the likelihood of correct performance on one item (or concept) relative to another. In Figure 2, resolution was computed as a Goodman-Kruskal gamma correlation,30 and for the hypothetical student, resolution was close to perfect (which, for a correlation, would be a value of +1.0). Other correlations (eg, Pearson r) can be used as well.31 However, some have argued that correlational approaches provide more biased estimates of resolution than measures based on signal detection theory.33 Our recommendation is to compute as many measures of resolution as the structure of the data allows and to evaluate whether each measure supports the same conclusion.
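As one concrete sketch (again with invented data), the Goodman-Kruskal gamma can be computed by comparing every pair of concept-level judgments with the corresponding pair of performance scores and counting concordant and discordant pairs:

```python
# A minimal sketch of the Goodman-Kruskal gamma correlation between one
# student's concept-level judgments and performance. Data are invented.
# Gamma = (concordant - discordant) / (concordant + discordant); pairs
# tied on either variable are excluded, as the measure requires.

from itertools import combinations

judgments =   [75, 95, 85, 80, 90]  # hypothetical concept-level predictions
performance = [70, 90, 85, 75, 95]  # hypothetical scores on the same concepts

concordant = discordant = 0
for (j1, p1), (j2, p2) in combinations(zip(judgments, performance), 2):
    product = (j1 - j2) * (p1 - p2)
    if product > 0:
        concordant += 1   # pair ordered the same way on both variables
    elif product < 0:
        discordant += 1   # pair ordered oppositely; ties contribute nothing

gamma = (concordant - discordant) / (concordant + discordant)
print(f"Gamma = {gamma:+.2f}")  # +0.80 here; +1.00 would be perfect resolution
```

A Pearson or Spearman correlation over the same two lists could be computed for comparison, in keeping with the recommendation to check whether different measures converge on the same conclusion.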
Measuring Metacognitive Control
Consider Riley once again. After taking her first examination, she plans to study even more in the future. Because she would eventually like to practice pharmacy professionally, her goal is to master all the material she is taught so that she can apply it when she is working in the field. Given this goal, she makes a schedule outlining when and what she will study each day. As she studies and judges her progress, she revises her plan based on her understanding of particular concepts. She also decides that she is going to use some new study techniques that promise to promote long-term retention.
Riley’s plan demonstrates a few key points about metacognitive control. First, her studying is influenced by her specific goals. According to theories of self-regulated learning based on information-processing models,34-38 an effective learner develops goals and plans for how best to attain them. Students vary in their learning goals (eg, mastering all the material versus simply passing a class), and those goals influence how they prepare, such as what they study, when and how they study, and how long they study. Students’ decisions about what and when to study are influenced by other factors as well. In one survey, students reported focusing on topics they found interesting and on whatever assignment was due soonest.14 In addition to using surveys to assess when (and how) students prepare for examinations, technology offers instructors additional opportunities to measure patterns of students’ study behavior. For instance, researchers39,40 have assessed when students accessed and submitted assignments through a learning management system, which provided an objective measure of learning behaviors such as academic procrastination. If these measures indicate students are driven by what is due soonest, they may be waiting until the last minute to cram for important examinations. If so, educators could consider administering low-stakes quizzes each week to encourage students to study more frequently.
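As a hypothetical illustration of this kind of log-based measure (the data structure here is invented; real learning management systems, and the cited studies, use their own export formats), one could summarize how far ahead of each deadline a student submits work:

```python
# A hypothetical sketch of deriving a simple procrastination indicator from
# learning management system logs. Timestamps and record structure are
# invented for illustration only.

from datetime import datetime

# (assignment deadline, when the student submitted)
submissions = [
    (datetime(2020, 9, 14, 23, 59), datetime(2020, 9, 14, 23, 41)),
    (datetime(2020, 9, 21, 23, 59), datetime(2020, 9, 20, 15, 10)),
    (datetime(2020, 9, 28, 23, 59), datetime(2020, 9, 28, 23, 55)),
]

# Hours between submission and deadline; values near zero suggest
# last-minute work, one behavioral signature of procrastination.
lead_times = [(due - done).total_seconds() / 3600 for due, done in submissions]
mean_lead = sum(lead_times) / len(lead_times)
print(f"Mean hours before deadline: {mean_lead:.1f}")
```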
Second, control decisions also include the kinds of study techniques that students explicitly adopt to learn material. In other words, Riley is demonstrating control when she decides to stop rereading her material passively and switch to using a more effective strategy, such as successive relearning.9,41,42 The surveys discussed above will provide some insight into students’ beliefs and use of some strategies, and brief checklists can be used to assess students’ use of effective (and ineffective) techniques and how they may change their use of these techniques across a course.14
A final point concerning Riley’s control of her study is that her monitoring is likely to be intricately linked to her control.43,44 In particular, one function of monitoring in this context is for Riley to identify material that she is struggling to learn. Riley can then decide how to engage differently with the less well-understood material, such as by studying it more, choosing a different strategy for studying it, or seeking help from a peer or instructor. In fact, one of the main reasons cognitive researchers have been interested in metacognitive monitoring is that students use their monitoring to make decisions about how to control their subsequent study. The rationale for this interest is based on the importance of accurate monitoring for effective control,44 which we suspect is intuitively plausible. If Riley’s judgments are perfectly accurate, then when she judges that she does not yet understand a topic and will not correctly answer examination questions about it, she truly does not yet understand that topic and does need to study it more.
In summary, if students inaccurately judge their learning, and in particular, if they are overconfident in their knowledge of a topic, their poor judgments could lead them to underperform. If, after an examination, you have ever had a student say, “I was sure I knew all the material, so why did I perform so poorly?” then you have already encountered the problem of misjudgment and how overconfidence can make students fall short of their learning goals. Again, using techniques to help students inform themselves about how well they understand content (eg, administering mini quizzes for use in formative evaluation) could be useful. Furthermore, with the methods described above for collecting and evaluating judgment accuracy, educators can explore the degree to which any intervention (eg, mini quizzes) actually improves students’ ability to accurately judge their knowledge.
SUMMARY
Most learning takes place outside of the classroom, where students have limited guidance about what and how to study and are faced with ample opportunities to become distracted. Thus, effective self-regulated learning (guided by students’ metacognition) is critical for students to reach their academic goals. Discovering the extent to which students know and use effective learning strategies, believe in their ability to succeed, and successfully monitor and control their own learning can inform instruction.
Appendix 1. Sample Cover Sheet for Examinations Used to Assess Student Monitoring at the Global and Concept Level (modified from Hartwig and Dunlosky53)
