Abstract
Objective. To validate a problem-based learning (PBL) evaluation checklist to assess individual Doctor of Pharmacy (PharmD) students’ performance in a group.
Methods. In 2013, a performance checklist was developed and standardized. To evaluate the reliability and discriminant validity of the checklist, pharmacy students’ evaluation scores from 2015-2016 were assessed along with overall program grade point averages (GPA) and scores on knowledge and problem-solving examinations. IBM SPSS predictive analytics software was used to analyze the data.
Results. Seventy facilitators generated 1506 evaluation reports for 191 (90 third-year and 101 second-year) students over eight PBL cases. The mean (SD) total score was 40.6 (2.5) for P3s and 39.1 (2.7) for P2s out of a possible 44.2 points. Students’ scores improved each semester. Interrater reliability based on intraclass correlation coefficient for all cases was 0.67. Internal reliability as determined by Cronbach alpha was >0.7 for all binary checklist items across all cases. Discriminant validity assessed using Pearson correlation coefficient showed that the total score from the checklist did not correlate with knowledge or problem-solving examination scores.
Conclusion. This unique PBL checklist proved to be a reliable and valid tool to assess student performance in small group sessions in a PharmD curriculum.
INTRODUCTION
Problem-based learning (PBL) is a common pedagogy used in higher education, specifically in health care professions’ education.1,2 The majority of Doctor of Pharmacy (PharmD) programs report using PBL within their curricula.3 Problem-based learning is the student-directed pedagogy that most closely mimics clinical practice in a classroom setting. Skills developed through PBL include problem-solving, critical-thinking, clinical-reasoning, self-directed learning, collaborative practice, flexible knowledge, and intrinsic motivation.4-6 In order to acquire these skills, the PBL process should be authentically implemented using small, collaborative groups of students guided by a trained facilitator.7 The stimulus for group learning is an ill-structured, complex, open-ended problem, such as a realistic patient case.4
At Wayne State University (WSU), we integrated PBL within our pharmacotherapy modules in 2006. In 2014, we separated PBL from the modules to create a four-course PBL series called Pharmacotherapeutic Problem Solving, which spans the second and third professional (P2 and P3) years. Problem-based learning in a pharmacy curriculum offers a student-directed approach to learning pathophysiology, pharmacology, medicinal chemistry, and pharmacotherapeutics within the context of a patient case. The content learned in the PBL courses is unique but complementary to the content of the concurrently delivered didactic pharmacotherapeutic module courses. The framework for the PBL courses was developed by aligning PBL skills with specific program-level ability-based outcomes.
Assessment of knowledge acquired from PBL experiences has been extensively described in the literature. In the pharmacy education literature, the use of written case-based examinations at the end of the PBL process8-10 or as pre- and post-test assessments of knowledge11,12 has been reported. In addition to summative examinations, weekly written patient-care plan submissions have been evaluated using a tick-box rubric.8 Others report using Likert scale rubrics to assess written submissions encompassing the entire PBL process, including the development of facts, hypotheses, and learning questions, as well as drug-related problems and their resolution.13 In the medical literature, students’ written patient summary statements have been assessed using rubrics based on Likert-scale ratings.14
A variety of assessment tools evaluating skills developed by students during PBL have also been described. Summative evaluations of PBL skills have been reported using objective structured clinical examinations (OSCEs)15-17 or problem-solving written examinations.9 Peer evaluations of performance within PBL sessions have been used,8,9,18,19 as have student self-evaluations.9,11,20 The literature on facilitator evaluation of student performance within a small group is limited. Ross and colleagues report on the use of a facilitator evaluation that encompasses PBL skills such as participation, cooperation, and communication skills.9 Romero and colleagues describe the use of verbal feedback on students’ performance within a PBL group.10 Sim and colleagues discuss and evaluate the use of a five-point Likert-scale facilitator evaluation instrument.21 This tool assessed participation and communication skills, cooperation or team-building skills, comprehension or reasoning skills, and knowledge or information-gathering skills. Thirty-four facilitators were surveyed, and 88% agreed that the tool was easy to use. Interrater reliability was assessed using an intraclass correlation coefficient, which indicated overall consistency among facilitators’ scores, with some strict (4 of 34) or indiscriminate (8 of 34) graders.21
Over the past decade at WSU, we used facilitator-graded Likert scales to evaluate student performance within the small group setting. Each facilitator evaluated the members of their group, which consisted of 8 to 9 students. Anecdotally, we observed that facilitators did not discriminate among students’ performances using this assessment tool, leading to inflated evaluation scores. To address these inflated scores, we instituted tally sheets within the sessions, which were used by both the facilitator and two student members of the group to record the quantity and quality of comments made by each student. The facilitator was asked to use these tally sheets to inform their Likert scale-based rating of each student in the group. However, feedback from facilitators and students indicated that the tally sheets inhibited the students’ ability to participate and the facilitators’ ability to coach the group. In addition, student scores still appeared to be inflated.
At WSU, we employ a multi-modal assessment of student learning and performance within the PBL course series using four types of assessments that occur multiple times throughout the semester (Figure 1). One of the assessments requires students to develop individual patient care plans for each of the four cases during the self-directed learning process. These care plans are graded by the case writers, who are content experts. The plans are submitted and graded electronically using rubrics. To maintain the student-driven authenticity of the PBL pedagogy, the rubrics were developed to provide students with general feedback regarding their individual patient care plans rather than content-specific answers. Instead, content-specific answers for the plans are discussed and consensus is reached within the small group during subsequent facilitated sessions. In addition, midterm and final examinations assessing pharmacotherapeutic and disease state knowledge acquisition are administered, each covering two patient cases. Written problem-solving examinations are administered with the midterm and final knowledge-based assessments. During the problem-solving examination, students are presented with a patient case involving a disease state that has not been formally taught within the curriculum. Students are provided with selected primary, secondary, and tertiary resources regarding the disease state. Each student individually completes the steps they would ordinarily complete with their group: identifying facts from the case, developing related hypotheses and learning questions, answering those learning questions using the provided resources, critiquing the provided resources, and developing an answer to the problem in the case by responding to a variety of question formats (eg, multiple choice, multiple answer, short answer, matching, essay).
Wayne State University Problem-Based Learning (PBL) Course Series Assessment Blueprint for the P2 and P3 2015-16 Year
We developed a fourth type of assessment, which is a novel checklist-based tool for facilitators to use within the small-group sessions to evaluate student performance (Appendix 1). The tool was developed to evaluate the following course objectives, which were derived from the Center for the Advancement of Pharmacy Education 2013 Educational Outcomes22: differentiate relevant patient characteristics; generate hypotheses or learning questions; discuss clinical and scientific principles of the disease state; discuss clinical and scientific principles of the pharmacotherapies; identify drug-related problems (DRPs) through a systematic evaluation process; formulate a patient-specific, literature-supported patient plan to resolve DRPs; employ an effective strategy to identify relevant scientific literature and resources; and evaluate the scientific literature to formulate a solution to the problem. Facilitators used the tool over the span of three group sessions, at the end of which they also provided a global assessment of student professionalism and performance. The tool was completed for each patient case (four times) over the course of the semester for each student. The checklist was developed as a unique tool to assess problem-solving skills. We hypothesized that this binary checklist was a reliable and internally valid tool to measure student performance within a PBL small group. In addition, we hypothesized that the facilitator checklist provided discriminant validity for student performance and, therefore, would be weakly correlated with other measures of performance including PBL-course knowledge-based and problem-solving examinations, and program grade point average (GPA). The goal of this study was to evaluate the reliability and discriminant validity of the checklist evaluation tool for assessing pharmacy students’ performance within a small-group PBL environment.
METHODS
A standardized facilitator evaluation checklist for the assessment of student performance in a PBL small group was developed in 2013. The checklist is provided in Appendix 1. The PBL planning committee developed a list of eight course objectives derived from the CAPE 2013 Educational Outcomes22 and WSU’s program ability-based outcomes. For each objective, the committee identified tasks and skills that would measure achievement of the objective. Through this process, 52 items were identified for the checklist. Based on our curricular experience with previous facilitator evaluation forms that used Likert scales, and the associated concern for grade inflation, we chose to employ a binary checklist approach in which a facilitator would give each student a check mark for each item demonstrated in one of the three facilitated group sessions. By using a checklist approach, we hoped to achieve a more objective measure of observed student performance within a small-group PBL experience. Students received a check mark only on the first instance in which they demonstrated a specific task. Subsequent instances were taken into account by the facilitator in the global evaluation questions that applied to all three sessions. This global assessment allowed the facilitator to differentiate students who consistently engaged in the PBL process from those who did not. The specific questions used are found in the Overall Professional Performance and Global Assessment sections of Appendix 1.
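To make the first-instance rule concrete, the following is a minimal sketch of the scoring logic. The item names and weights are hypothetical placeholders, not the actual checklist items or weights (see Appendix 1).

```python
# Minimal sketch of the first-instance scoring rule described above.
# Item names and weights are hypothetical, not the actual checklist.

ITEM_WEIGHTS = {
    "identifies_relevant_patient_facts": 1.2,   # hypothetical weight
    "generates_hypothesis": 1.1,                # hypothetical weight
    "identifies_drug_related_problem": 1.4,     # hypothetical weight
}

def score_student(session_logs):
    """Score one student across the three facilitated sessions.

    session_logs: one set per session, containing the checklist items
    the facilitator observed the student demonstrate. An item earns
    its weight only the first time it is demonstrated.
    """
    achieved = set()
    for session in session_logs:
        achieved |= session & ITEM_WEIGHTS.keys()
    return sum(ITEM_WEIGHTS[item] for item in achieved)

# The first item is demonstrated in two sessions but counts only once.
sessions = [
    {"identifies_relevant_patient_facts"},
    {"identifies_relevant_patient_facts", "generates_hypothesis"},
    {"identifies_drug_related_problem"},
]
print(score_student(sessions))  # approximately 3.7 (1.2 + 1.1 + 1.4)
```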
Once the checklist was finalized and the PBL committee had reviewed all of the items and agreed on the final language of the checklist, the Angoff Method for Standard Setting was used to determine the weighting of each item on the checklist.23 In order to employ this process, we identified 10 judges24 who were familiar with our program and who work with and hire our graduates. These judges were pharmacy practice and pharmaceutical sciences faculty members and included PBL committee members, volunteer preceptors, and area pharmacy administrators from a variety of practice settings. The judges were provided with the checklist and asked to determine what percent of the time a minimally competent pharmacy student at the end of their third professional year would achieve each task. A minimally competent student was defined as someone who would be able to perform with a passing score of 70% within our program. The estimates for each item provided by the judges were then averaged, and those averages were used to determine the weighted score for each item.
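Computationally, the weighting step reduces to averaging the judges’ estimates for each item. A minimal sketch follows, assuming hypothetical estimates from the 10 judges; the actual estimates and resulting weights were established as described above.

```python
# Minimal sketch of the Angoff weighting step: each item's weight is
# derived from the mean of the ten judges' estimates of how often a
# minimally competent end-of-P3 student would achieve the task.
# All estimates below are hypothetical, not the study's data.
import statistics

judge_estimates = {
    "item_01": [90, 85, 80, 95, 90, 85, 88, 92, 80, 85],  # percents
    "item_02": [60, 70, 65, 55, 75, 60, 70, 65, 60, 70],
}

item_weights = {
    item: statistics.mean(estimates) / 100
    for item, estimates in judge_estimates.items()
}
print(item_weights)  # {'item_01': 0.87, 'item_02': 0.65}
```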
The checklist was used for one academic year to assess both P2 and P3 students. The PBL planning committee then evaluated the checklist based on student scores and facilitator feedback and determined that some items were very similar and could be combined. Other items were assessed elsewhere in the course and were eliminated from the checklist. The revised checklist measured 37 items and eight objectives and has been used since fall 2015.
In order to determine the reliability and discriminant validity of the evaluation checklist, we performed a retrospective analysis of facilitator evaluations of P2 and P3 students from the 2015-2016 academic year. Data reports were generated and compiled from E*VALUE (MedHub, Minneapolis, MN), as well as from reports of student program GPAs and knowledge and problem-solving examination performance in each PBL course. Other data included in the database were student year (P2 or P3), PBL group assignment, facilitator assignment, and facilitator type. The facilitators responsible for evaluating student performance were volunteers, first-year pharmacy residents, and pharmaceutical sciences and pharmacy practice faculty members at WSU. The facilitators were trained for facilitation and evaluation through a structured training program, which has been described elsewhere.25 All data were de-identified and numerically coded. This study was approved by the WSU Institutional Review Board as a quality improvement project.
Descriptive statistics were used to describe our student population and mean evaluation scores. The Student t test was used to compare the judges’ standard-setting scores on the checklist items with the scores achieved by students during their PBL sessions. Interrater reliability of all facilitator checklist assessments during the 2015-2016 academic year for P2 and P3 students was assessed using the intraclass correlation coefficient (ICC). Cronbach alpha was used to assess the reliability of the items within the checklist for each case in the P2 and P3 courses, with a conventional cutoff of 0.7 or higher defined as reliable.26 Discriminant validity of the checklist was assessed using the Pearson correlation coefficient (PCC). The PCC was used to assess correlations between facilitator evaluation scores, overall program GPA, knowledge examination scores, and problem-solving examination scores. A p value of ≤.05 (two-tailed) was considered significant. The PCC was also used to assess correlations between checklist items that were achieved by fewer than 80% of students. A PCC of less than 0.3 was considered a weak correlation, 0.3-0.49 a moderate correlation, and 0.5 or greater a strong correlation.27 IBM SPSS Statistics predictive analytics software (IBM Corp, Armonk, NY) was used for statistical analyses.
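The analyses were run in SPSS; for illustration, the minimal sketch below shows equivalent Cronbach alpha and PCC computations in Python with numpy and scipy, using randomly generated placeholder data rather than the study’s data.

```python
# Sketch of computations equivalent to the SPSS analyses.
# The score matrix is hypothetical: 191 students x 37 checklist items.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
scores = rng.random((191, 37))  # placeholder weighted item scores

def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    items = np.asarray(items)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

print(f"Cronbach alpha: {cronbach_alpha(scores):.2f}")  # reliable if >= 0.7

# Discriminant validity: correlate checklist totals with another
# performance measure (placeholder examination scores here).
totals = scores.sum(axis=1)
exam_scores = rng.normal(75, 8, size=191)
r, p = stats.pearsonr(totals, exam_scores)
print(f"PCC: {r:.2f}, p={p:.3f}")  # <0.3 weak, 0.3-0.49 moderate, >=0.5 strong
```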
RESULTS
In 2015-2016, the P2 and P3 students enrolled in WSU’s PBL courses completed 16 cases. These courses enrolled 191 students, who were divided into 24 groups and facilitated by 70 facilitators who generated 1506 evaluation reports (Figure 1). The mean checklist score for the P2 class over eight evaluations was 39.1 (SD=2.7) out of a possible 44.2 points that students could achieve for their performance during the small group PBL sessions. The mean checklist score for the P3 class was 40.6 (SD=2.5). Both P2 and P3 mean evaluation scores improved over the course of each semester for the majority of cases (Figure 2).
Mean Problem-Based Learning Checklist Performance Scores Out of a Maximum of 44.2 Points for Each Case for Both P2 and P3 Classes
The ICC as a measure of interrater reliability was 0.67 (95% CI=0.57-0.73) for all P2 and P3 checklist scores. Cronbach alpha as a measure of the internal reliability of the checklist was computed for each of the 16 cases and ranged from 0.57 to 0.72 for P2 cases and from 0.60 to 0.77 for P3 cases (Table 1). When checklist item number 47 (Appendix 1), “What overall grade on a scale of 0-100% would you give this student?” was omitted from the analysis, the Cronbach alpha increased, ranging from 0.79 to 0.92 for P2 cases and from 0.79 to 0.95 for P3 cases.
Internal Reliability Scores for a Checklist Used to Evaluate P2 and P3 Students’ Performance on Problem-based Learning Cases
To assess the internal validity of the performance checklist, items achieved less frequently by students were analyzed as a subset of the checklist. All items achieved by fewer than 80% of students were positively correlated with one another. Within the P2 class, 13 checklist items were achieved by fewer than 80% of students, yielding 78 pairwise correlations, 73 of which were significant (p≤.05). Of these 73 significant correlations, 56 were moderate or strong (PCC ≥0.3) (Table 2). Within the P3 class, five checklist items were achieved by fewer than 80% of students, yielding 10 pairwise correlations, all of which were significant (p≤.05). Nine of the 10 pairwise correlations were moderate or strong (PCC ≥0.3) (Table 3).
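The pair counts follow directly from the number of low-achievement items: 13 items yield C(13,2)=78 unique pairs, and five items yield C(5,2)=10. A minimal sketch of this pairwise analysis, using placeholder binary data, is shown below.

```python
# Sketch of the item-correlation analysis: keep binary items achieved
# by fewer than 80% of students, then correlate every pair of them.
# The data matrix is a placeholder, not the study's data.
from itertools import combinations
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
items = rng.integers(0, 2, size=(101, 13))  # 101 students x 13 binary items

rates = items.mean(axis=0)
low = [j for j in range(items.shape[1]) if rates[j] < 0.80]

pairs = list(combinations(low, 2))  # 13 low items -> C(13,2) = 78 pairs
for a, b in pairs:
    r, p = stats.pearsonr(items[:, a], items[:, b])
    if p <= 0.05 and r >= 0.3:
        print(f"items {a} and {b}: PCC={r:.2f} (moderate or strong)")
```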
Checklist Item Internal Correlation for P2 Student Performance in Small Group Sessionsa
Checklist Item Internal Correlation for P3 Student Performance in Small Group Sessionsa
To assess the discriminant validity of the checklist, facilitators’ PBL checklist evaluation scores for students’ performance in small group sessions were compared with other measures of performance within the PBL courses (knowledge and problem-solving examination scores) and with overall program performance as measured by program GPA. With regard to P2 student performance, checklist evaluation scores for only three of the eight cases correlated significantly, albeit weakly (PCC <0.3), with student performance on the knowledge examinations. These were case 4 of the fall semester and cases 1 and 2 of the winter semester (Table 4). None of the evaluation scores during the fall or winter semester were significantly correlated with student performance on the problem-solving examinations. Evaluation scores for only two of the eight cases were significantly correlated with overall program GPA (case 1 and case 2 of the winter semester, PCC <0.3) (Table 5).
P2 and P3 Students’ Evaluation Scores on Eight Cases During 2015-2016 Academic Year vs Performance on Knowledge and Problem-Solving Examinationsa
P2 and P3 Evaluation Scores vs Overall Program Grade Point Average
With regard to P3 student performance, evaluation scores from two of the eight cases (cases 2 and 3 of the winter semester) were significantly but weakly correlated (PCC <0.3) with student performance on the knowledge examinations. Weak correlations (PCC <0.3) were found between the problem-solving examinations and evaluation scores from cases 1, 2, and 3 of the winter semester (Table 4). Lastly, students’ overall program GPA was weakly correlated (PCC <0.3) with checklist evaluation scores from case 3 of the fall semester and cases 2 and 3 of the winter semester (Table 5).
DISCUSSION
Assessment of skills gained during the PBL process is necessary to determine whether educational outcomes have been met. The standards for effective and efficient performance evaluation within a PBL small group are not well described or studied. Our results describe the assessment of the reliability and validity of a checklist-based tool that can be used for the evaluation of student performance within a PBL small group in a pharmacy curriculum.
In small group PBL sessions, facilitators use the checklist to evaluate students’ performance by indicating whether or not each task was performed at least once. Students could achieve a maximum total of 44.2 points over the three facilitated group sessions. Mean checklist evaluation scores were higher in the P3 class compared with the P2 class. This may be because the P3 students had an additional year of content knowledge and experience within the PBL process and were more familiar with the expected performance items on the checklist. Mean evaluation scores for both P2 and P3 students generally increased over the course of each semester (Figure 2). Within the P2 course, there was a dip in mean score from fall case 2 to fall case 3. This may have been because of the narrow scope of the case content, which did not provide enough opportunities for all of the students within the group to share information and achieve the specific items on the checklist. For example, therapies for the case disease state were limited; thus, some students may not have had the opportunity to discuss clinical and scientific principles of the pharmacotherapies.
Interestingly, the first case in both the fall and winter semesters resulted in the lowest scores of the year, as though performance reset to a baseline level with the start of each new semester of PBL. Why the gains made during the previous semester were not retained is unclear. We identified three factors within a PBL course that could potentially lead to lower mean scores for the first case of each semester: the group working with a new facilitator, students being exposed to new pharmacotherapeutic topics, and the timing of facilitator training. Within the PBL courses at WSU, facilitator rotation occurs each semester after case 2. In addition, the pharmacotherapeutic modules from which the case topics are drawn run for half of the semester, meaning students are exposed to new pharmacotherapeutic topics after case 2. If these were the causes of the dip in performance, we would expect an additional dip in performance after case 2; however, this was not evident in the results. We suspect that the timing of facilitator training, which is provided at the start of each semester, may have resulted in facilitators more stringently evaluating student performance immediately following this training (ie, during the first case of each semester).
The interrater reliability of the performance checklist was close to the traditional ICC cutoff of 0.7 (ICC=0.67). We calculated the total ICC based on all available checklist items for both P2 and P3 student performance. Because student performance should improve over time in the PBL course series, an ICC of less than 0.7 is expected. Internal reliability of the checklist using Cronbach alpha was also assessed for each case in both the P2 and P3 PBL courses. The internal reliability of the checklist was well above the desired threshold of 0.7 for all cases when checklist item number 47, “What overall grade on a scale of 0-100% would you give this student?” was omitted from the analysis. This result was expected, as this item was not a binary checklist item but rather a qualitative assessment. It was included to allow facilitators to differentiate the consistency of students’ overall participation and performance for each case. Two cases (P2 winter case 4 and P3 winter case 7) had atypically low Cronbach alpha results (0.57 and 0.60, respectively). These results may be explained by cases that did not introduce new topics and, therefore, did not stimulate in-depth exploration. Thus, students discussed topics at a more superficial level, resulting in inconsistency in the achievement of items on the checklist.
To assess the internal validity of the checklist, we evaluated whether the performance checklist was able to differentiate performance among students by assessing the correlation between items. Items achieved by fewer students (less than 80%) were analyzed. In both the P2 and P3 courses, more than 75% of the items that were achieved by fewer than 80% of students were either moderately or strongly correlated with one another, illustrating that stronger performers tended to achieve these items. Of the items identified, 10 of the 13 items achieved by fewer than 80% of P2 students were predicted to be performed by 80% or fewer of students based on the judges’ standard-setting scores. Among P3 students, the five items achieved by fewer than 80% of students were a subset of those same items identified in the P2 class. This indicates that students’ PBL skills continue to develop as they move through the PBL course series from the P2 to the P3 year. We explored potential explanations for why P3 students did not fully achieve the five identified items. For the items regarding the pathophysiology of the disease state, pharmacokinetic and pharmacodynamic parameters, and drug cost and pharmacoeconomic considerations, we identified that a lack of case complexity (ie, fewer medications and drug-related problems included in the case) did not afford multiple students the opportunity to address these issues in their group discussion. We have since reviewed PBL cases across the course series with the goal of increasing their complexity to allow ample opportunities for students to share their knowledge within the group setting in these subject areas. The lack of achievement of the items regarding sharing and defending literature search terms and strategies, and discussing the limitations and sources of bias of supporting literature, may have been due to time constraints within the group setting and the subsequent lack of prompting from the facilitator. Therefore, we have developed modified PBL courses for P3 students that emphasize critical evaluation, application, and synthesis of supporting literature.
The performance checklist was intended to measure students’ problem-solving skills during each PBL session. Checklist scores were compared against knowledge-based and problem-solving examinations and program GPA to assess the discriminant validity of the checklist. As hypothesized, our results indicated that the checklist scores were weakly correlated or not correlated with knowledge-based examinations, problem-solving examinations, or program GPA. The problem-solving examinations were developed with the intent of evaluating students’ problem-solving skills gained through the small group PBL experience. However, these were individual, written examinations that did not afford the students the opportunity to demonstrate the skills assessed using the checklist. A performance-based examination could potentially provide opportunities for summative evaluation of these types of skills. Of note, our P2 and P3 classes were two different groups of students rather than the same cohort of students followed from their second to third year in pharmacy school. In future research, we may consider evaluating the same group of students as they move through their P2 and P3 years, which would allow us to control for potential confounding variables that inevitably occur between two different classes of students.
CONCLUSION
To our knowledge, this is the first checklist-based instrument that has demonstrated reliability and discriminant validity in evaluating student performance in a small group PBL environment. The results of this study suggest that our checklist for evaluating student performance within a PBL small group has the potential to be generalized for use across other pharmacy PBL curricula. The checklist demonstrated interrater reliability among a variety of facilitators, including pharmaceutical sciences faculty members, pharmacy practice faculty members, volunteer adjunct faculty members from various practice settings, and pharmacy residents. Variations that do exist between facilitators could be addressed through continued improvements in standardizing facilitator training. Results of the Cronbach alpha analysis could also be generalized to other programs to help identify cases that require improvement in order to enhance the PBL experience; for instance, the cases identified here as having low alpha scores were replaced with more complex disease state topics. Finally, this checklist differentiates student performance and offers pharmacy programs a mechanism for evaluating skills that are not routinely assessed through conventional examination-based assessments.
Appendix

Problem Based Learning (PBL) Facilitator Checklist to Evaluate Student Performance within Small Group Sessions. Each Checklist Item Represents Possible Tasks that a Student Could Perform in a Session.
- Received January 21, 2018.
- Accepted September 23, 2018.
- © 2019 American Association of Colleges of Pharmacy