Abstract
Objective. To determine whether instructor-prepared classroom examinations for pharmacotherapy courses were aligned with course goals and objectives.
Design. Assessment items from examinations in 2 pharmacotherapy courses were evaluated. Four categories of alignment (depth of knowledge, categorical concurrence, range of knowledge, and balance of representation) were used to match course assessments with objectives.
Assessment. While assessments met the criteria for acceptable alignment, there were areas for improvement. Goals and objectives were unevenly assessed, with 1 goal aligning with 45% of all assessment items. The assessments covered all content categories and the range of knowledge established by the objectives, but objectives under specific goals were not evenly assessed.
Conclusion. This alignment study provided quantitative data useful for review and revision of pharmacotherapy course objectives and assessments and demonstrated the usefulness of alignment assessment as a tool for continuous quality improvement.
INTRODUCTION
Effective educational outcomes require the coordination of curriculum, instruction, and assessment. One measure of educational outcomes is alignment, or the matching of test content to subject area content. Alignment is defined as “…the degree to which expectations and assessments are in agreement and serve in conjunction with one another to guide the system toward students learning what they are expected to know and do.”1 Using this definition, the “expectations” within a higher education course become the course goals and objectives; the corresponding assessments are the examinations.
In an aligned system, all parts of the system work together to guide instruction and facilitate student learning.2 Learning expectations and classroom assessments cover the same content, and the assessments accurately measure students' knowledge across the depth and breadth of the goals and objectives. An analysis of alignment provides evidence of assessment content validity, identifies needs for improvement, and contributes to instructional accountability.
The University of New Mexico College of Pharmacy Curriculum Committee monitors the development and delivery of the curriculum as part of an ongoing commitment to continuous quality improvement. Course goals and objectives are mapped to the expected competencies and outcomes. Instruction is audited for content and delivery methods. Written classroom assessments undergo test-item analysis. However, no work has been conducted to determine whether the written classroom examinations are aligned with the course goals and objectives.
There are 3 major approaches to determining whether expectations and assessments are aligned: sequential development, expert review, and document analysis.2–4 Sequential development is a structured method of alignment. In this logical but time-consuming method, the learning expectations are developed first and then used to develop the curriculum. The last step is to structure the assessments. Sequential development maps each assessment item to an objective, providing methodical evidence of alignment.
Expert review is the method often used when both objectives and assessments have already been developed. The process typically consists of the systematic, item-by-item review of an assessment by a committee of content specialists trained to judge alignment of assessment items with learning objectives. Each assessment item is matched to a learning objective. The analysis does not include a match of item depth or breadth.
Document analysis judges alignment by partitioning and coding goals and objectives, curriculum materials (textbooks), and assessment instruments. Alignment between the 3 can then be compared systematically and quantified. The process is tedious and influenced by simultaneous analyses of objectives, instructional materials, and assessment. Depth and breadth of assessment items are not considered.
Integrating these 3 approaches has led to the development of several rigorous methods for evaluating alignment against specific criteria. One such criterion is content focus, in which the content measured by each assessment item is matched to a learning objective. The alignment of assessment items includes analysis of item depth and breadth. As part of the No Child Left Behind legislation, states are required to demonstrate alignment between educational content and mandated standardized achievement tests for elementary and high school students.3 Three common models use content focus as a framework for designing and implementing alignment studies: (1) the Achieve model,5–7 (2) the Surveys of Enacted Curriculum model, and (3) Webb's alignment model.
While the Achieve model is recognized as a sound alignment method, it is specifically structured to analyze standardized tests used in elementary and secondary education. Achieve alignment services do not include a model for analysis of higher education instructor-generated assessments.
The Surveys of Enacted Curriculum (SEC) model codes standardized primary and secondary educational goals and objectives, instruction (teaching), assessments, and educational materials (textbooks, workbooks) using a framework of standardized content topics and item difficulty.3,6,8 Alignment results for the SEC model are highly quantified but narrowly restricted to standardized tests. Like the Achieve model, the SEC has no model for analysis of higher education instructor-generated assessments.
Webb's model uses a flexible analysis that measures the uniformity with which goals and objectives and assessments share 4 attributes of content focus: depth of knowledge consistency, categorical concurrence, range of knowledge, and balance of representation.2,3,6,9 A panel of local content experts is trained to identify depth of knowledge levels; panel members then independently rate the knowledge level of each assessment item and match it to a corresponding objective. The method is not restricted to a specific content area or set of goals and objectives, making it easily adaptable for any alignment study.10
The Webb model is reliable and comprehensive in alignment studies while unrestricted in subject matter or content.2,11 The model has been used extensively in elementary and secondary educational settings. It is simple and does not require extensive training or outside resources.
The most recent revision of the accreditation standards and guidelines for pharmacy education, developed by the Accreditation Council for Pharmacy Education (ACPE), provides specific detail on the educational outcomes intended to meet the new demands of professional practice and adds new expectations for the assessment of students' achievement.12 The ACPE standards and guidelines and the CAPE 2004 Supplemental Educational Outcomes13 were used to revise the University of New Mexico College of Pharmacy's existing pharmacotherapy course goals and objectives into a single comprehensive document.
Pharmacotherapy faculty members found that the revised goals and objectives accurately reflected the instructional content of the 3 semester courses. These goals and objectives were used in all 3 semesters of pharmacotherapy and anchored the alignment of assessments in this study. The 5 pharmacotherapy goals were supported by 20 objectives as seen in Table 1.
Pharmacotherapy Course Goals and Objectives
Pharmacotherapy instruction is presented in three courses of 6 semester credit hours each. The instruction is presented in blocks by organ system, often with multiple instructors per block. A block examination is given about every 2 weeks. The final examination contains no new material and is comprehensive for the semester. The examinations are intended to challenge students by requiring a synthesis of content material and critical thinking to successfully respond to case-based scenarios. Course management is challenged by the number of instructors and the number of examinations. The purpose of this study was to assess the alignment of pharmacotherapy assessments with the course goals and objectives using Webb's model of alignment.
DESIGN
The University of New Mexico Health Sciences Center Human Research Review Committee determined that this study was exempt from federal regulations. The study used the block examinations from 2 pharmacotherapy courses offered in spring 2008. Course A is the first course in the 3-course sequence and course B is the third course. Both courses used the same course goals and objectives, had similar instructional formats, and were taught by multiple instructors. Each content instructor contributed items for the block examinations, which contained a mix of multiple-choice, matching, and short-answer questions, with the majority of items being short-answer questions.
The analysis included 13 examinations with assessment items written by more than 20 instructors. Course A had 6 block examinations with an average of 43 items per examination. Course B had 7 block examinations with an average of 43.8 items per examination. Final examinations were excluded because they assess no new material and because grade-posting requirements prohibit the use of open-ended assessment items on them.
The alignment analysis procedure, adapted from procedures developed by Webb,1,2,9,14,15 was designed to measure the alignment of course assessment items with course goals and objectives. The analysis was structured to measure 4 categories:
Depth of Knowledge Consistency.
Depth of knowledge consistency evaluates the cognitive level of each assessment item. Three levels of cognitive classification were used in the alignment study.
Level 1: Knowledge.
Level 1 is comparable to Bloom's taxonomy level of knowledge.16 It measures the students' ability to recall previously learned facts or to recite information, ideas, or principles in the approximate form in which they were learned. Item response requires only a single step. Key words indicative of a level 1 item include identify, define, name, use, and list.
Level 2: Application.
Level 2 is comparable to Bloom's comprehension and application levels.16 It requires students to make some decisions in solving the problem and involves more than simple recall. Students must translate material from one situation to another by applying rules, concepts, and principles. Assessment items require more than a single thought process. Key words include classify, organize, estimate, calculate, predict, interpret, and give examples.
Level 3: Strategic Thinking.
Level 3 is a condensation of Bloom's synthesis, evaluation, and analysis levels.16 It requires reasoning, planning, and using evidence, as well as a higher level of thinking. The cognitive demands are complex and abstract in that the task requires more demanding reasoning than the other 2 levels. Items require students to apply prior knowledge to a new situation to develop a solution, to make recommendations based on data, or to evaluate a solution based on criteria or standards of practice. Key words include hypothesize, construct, recommend, summarize, differentiate, and design.
Recognizing that pharmacotherapy courses are key to the development of students' critical-thinking and problem-solving skills, the faculty members established in 2007 an assessment target emphasizing the level 3 (strategic thinking) depth of knowledge. Faculty members agreed that each pharmacotherapy examination would target 20% of all items at level 1, 30% at level 2, and 50% at level 3.
Raters independently assigned a depth of knowledge level (1, 2, or 3) to each assessment item on each examination. If kappa was greater than 0.7, the analysis continued. The raters' independent depth of knowledge level rankings were averaged and rounded to the nearest whole number (1, 2, or 3) to assign the depth of knowledge level for each item. The number of items per depth of knowledge level per goal and objective for all examinations was calculated.
Categorical Concurrence.
Categorical concurrence was used to assess the spread of assessment items across the objectives. Each rater was asked to independently identify the objective(s) that were assessed in each item. Raters could select from 0 to 4 objectives for each assessment item. When an assessment item could not be matched to an objective, it was coded as “no objective found.” A “hit” designated that an assessment item was mapped to an objective. The analysis looked at the number of total hits matched to each objective to establish concurrence between the objectives and the examination items.
Range of Knowledge Correspondence.
Range of knowledge correspondence was used to examine the extent to which all objectives were assessed. The span of knowledge expected of students as listed in the goals and objectives was compared to the span of knowledge represented by the assessment items. The number of hits per objective and per goal was calculated as a percentage of all assessment items to determine the range covered by the assessments.
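To make the mechanics of these calculations concrete, the sketch below illustrates in Python how independent rater judgments might be reduced to a single depth of knowledge level per item and how hits per objective can be tallied and expressed as coverage percentages. The data structures, item identifiers, and the way rater matches are pooled are illustrative assumptions, not the study's actual data or code.

```python
from collections import Counter

# Hypothetical data (not from the study): each item carries the 4 raters'
# independent depth of knowledge ratings and the objectives it was matched to.
items = [
    # (item_id, rater DOK levels, objectives matched)
    ("A1-01", [3, 3, 2, 3], ["1.1", "1.3"]),
    ("A1-02", [1, 1, 1, 1], ["2.2"]),
    ("A1-03", [2, 3, 2, 2], ["1.1"]),
    ("A1-04", [1, 2, 1, 1], []),          # coded as "no objective found"
]

# Depth of knowledge: average the raters' levels and round to the nearest
# whole level (1, 2, or 3); ties are rounded up here.
dok_level = {item_id: int(sum(levels) / len(levels) + 0.5)
             for item_id, levels, _ in items}

# Categorical concurrence: a "hit" maps an assessment item to an objective;
# items with no match are tallied under "no objective found".
hits = Counter()
for item_id, _, objectives in items:
    if not objectives:
        hits["no objective found"] += 1
    for obj in objectives:
        hits[obj] += 1

# Range of knowledge: hits per objective expressed as a percentage of all items.
total_items = len(items)
coverage = {obj: 100 * count / total_items for obj, count in hits.items()}

print(dok_level)   # {'A1-01': 3, 'A1-02': 1, 'A1-03': 2, 'A1-04': 1}
print(coverage)    # e.g. {'1.1': 50.0, '1.3': 25.0, ...}
```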
Balance of Representation.
Balance of representation was calculated using an index to judge the distribution of assessment items among the objectives for a specific goal.17 If all assessment items assigned to a goal are evenly distributed among that goal's objectives, the balance index value for that goal will be 1. A smaller index value corresponds to a less even distribution of items across the objectives under a specific goal.
Webb suggests a balance index value of 0.7 or higher as acceptable for this criterion, indicating that assessment items are reasonably distributed among all of the objectives. Index values between 0.6 and 0.69 indicate that the balance of representation criterion is only weakly met. Lower index values are indicative of an uneven distribution of items across the objectives.4
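The article describes the index qualitatively; in Webb's published formulation the balance index for a goal is commonly computed as BI = 1 − (Σk |1/O − I(k)/H|)/2, where O is the number of objectives under the goal that received hits, I(k) is the number of hits for objective k, and H is the total number of hits for the goal. The sketch below, offered only as an illustration, implements that formulation and compares an even and an uneven spread of hits against the 0.7 criterion; the hit counts are invented.

```python
def balance_index(hits_per_objective):
    """Webb-style balance of representation index for one goal.

    hits_per_objective: hit counts for the objectives under the goal.
    Returns 1.0 when hits are spread evenly; smaller values indicate
    that hits are concentrated on fewer objectives.
    """
    # Webb's formulation considers the objectives that received at least one hit.
    counts = [h for h in hits_per_objective if h > 0]
    if not counts:
        return 0.0
    n_obj = len(counts)
    total = sum(counts)
    deviation = sum(abs(1 / n_obj - h / total) for h in counts)
    return 1 - deviation / 2

# Invented examples: an even spread meets the 0.7 criterion, a skewed one does not.
print(balance_index([10, 10, 10, 10]))  # 1.0   (perfect balance)
print(balance_index([25, 5, 5, 5]))     # 0.625 (below the 0.7 criterion)
```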
Raters
Four pharmacy practice faculty members served as subject matter expert item raters. No rater had an integral role in the management, instruction, or assessment of either of the 2 courses. Each rater practiced clinical pharmacy and was familiar with current pharmacy education and curriculum issues. All raters received one-on-one training from the lead author in analyzing and coding examination items against the course goals and objectives. To evaluate the effectiveness of the training and the consistency of the raters, interrater reliability for each examination was assessed using Fleiss' kappa statistic. No examination had a kappa less than 0.70, indicating substantial agreement among the raters.17 The mean interrater reliabilities for course A and course B were 0.83 and 0.91, respectively.
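The article does not detail how the kappa values were computed; for readers who want to reproduce this kind of reliability check, one option is the Fleiss' kappa implementation in the Python statsmodels package, sketched below with an invented item-by-rater ratings matrix.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Invented ratings: rows are assessment items, columns are the 4 raters,
# values are the depth of knowledge level (1, 2, or 3) each rater assigned.
ratings = np.array([
    [3, 3, 3, 2],
    [1, 1, 1, 1],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [1, 2, 1, 1],
])

# aggregate_raters converts the item-by-rater matrix into the
# item-by-category count table that fleiss_kappa expects.
table, _ = aggregate_raters(ratings)
kappa = fleiss_kappa(table, method="fleiss")
print(f"Fleiss' kappa = {kappa:.2f}")  # the study required values above 0.70
```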
ASSESSMENT
Depth of Knowledge Consistency
The percentage of items at each depth of knowledge level for each course, compared with the target distribution, is shown in Table 2. There was no significant difference between the depth of knowledge distribution for course A and the target distribution. Course B differed (p < 0.05) from both course A and the target distribution of depth of knowledge levels.
Comparison of Depth of Knowledge Target Levels to Course Depth of Knowledge Levels
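The article does not name the statistical test behind these comparisons; one plausible approach for comparing a course's observed depth of knowledge counts against the 20%/30%/50% target is a chi-square goodness-of-fit test, sketched below with invented counts (the actual counts appear in Table 2).

```python
from scipy.stats import chisquare

# Invented counts of items per depth of knowledge level for one course
# (level 1, level 2, level 3); the study's actual counts appear in Table 2.
observed = [70, 110, 78]
total = sum(observed)

# Expected counts under the faculty target of 20% / 30% / 50%.
expected = [0.20 * total, 0.30 * total, 0.50 * total]

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.1f}, p = {p:.4g}")
```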
Analysis of each depth of knowledge level by goal is shown in Figure 1. Item depth of knowledge was inconsistently distributed within course goals. Assessment at level 1 ranged from 12% to 34%, while that at level 3 ranged from 19% to 48%. Goal 3 was most often assessed at level 1 and least often at level 3. The item cognitive levels matched to goal 5 most closely approximated the target distribution.
Distribution of depth of knowledge levels of test item by goals.
Categorical Concurrence
Assessment items would be distributed equally among the goals if each goal were equal in emphasis. The distribution of assessment items matched to objectives within each goal is shown in Table 3. The distribution among goals was uneven, ranging from less than 4% to 45%. “No objective found” accounted for 4.4% of the assessment items. The distribution of assessment items among the objectives within each goal was also uneven. While every objective was matched to multiple assessment items, objectives 3.2 and 5.2 in course B received only 5 hits each.
Categorical Concurrence and Range of Knowledge of Items in Examinations Given in a Pharmacotherapy Course
Range of Knowledge
The breadth of knowledge required in both the goals and objectives and in the assessment items must be comparable. The minimum criterion was set at 1 assessment item per objective for at least half of the objectives.18 Table 3 shows that each objective within a course was matched to a minimum of 5 assessment items. The range of knowledge was judged acceptable in that all goals and objectives were matched with more than 1 assessment item.
Balance of Representation
The balance of representation criterion requires that the knowledge content within the goals and objectives be equally represented in the assessment items. When all assessment items that are matched to a goal are uniformly distributed among the goal objectives, the calculated index value will be 1, a perfect balance.18 The index values graphed in Figure 2 show that goal 1 and goal 5 were met with acceptable balance of representation. Goal 2 was weakly balanced. Goal 3 and goal 4 were not balanced, indicating an uneven distribution of assessment matches to goal objectives.
Balance of representation by course and goals. Index value ≥ 0.7 indicates acceptable balance; 0.60–0.69 indicates weak balance; < 0.60 indicates uneven distribution.
DISCUSSION
The purpose of an alignment study is to estimate the extent of alignment between the course objectives and assessment items. By examining the data from an alignment study, deficiencies can be identified and curricular, instructional, and assessment improvements can be recommended. A strong alignment can support a claim of content validity and consistency across assessments, while a weaker alignment identifies challenges for improvement.
Overall, assessment alignment could be improved by preparing and using a test blueprint that lays out the specifics for test development. Each objective to be measured is weighted in relative importance as a percentage of the total test items. As seen in this study (Table 3), some objectives are likely to be overrepresented and others underrepresented when tests are created without a blueprint.19 If each goal is equally important, then an even distribution of hits is expected. Goal 1 averaged 45% of all hits, while goal 5 averaged only 3%. This distribution is the result of test development without a blueprint.
Instructional faculty members need to review the 5 goals, determine whether each goal is appropriately placed or should be moved to another course, and assign each goal a weight as a percentage. Faculty members need to commit to the use of a test blueprint. If each goal is determined to be equally important, future test items should be evenly divided, with each goal receiving 20% of the assessment items. If the goals are not equally important, relative percentages should be defined to guide improved test development. If the current distribution of assessment items represents the relative importance of each goal, no changes need to be made. A sample test blueprint is provided in Table 4. In this example, goal 3 has been moved from pharmacotherapy to a more suitable course. The remaining goals are weighted with goals 1 and 5 at 20% each, and goals 2 and 4 at 30% each. A distribution of assessment points by goal and cognitive level is suggested.
Example of a Test Blueprint Used in a Pharmacotherapy Course
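As a rough illustration of how blueprint weights translate into item counts, the sketch below distributes a hypothetical 44-item examination across the Table 4 goal weights and the 20%/30%/50% depth of knowledge targets; the code, the 44-item length, and the resulting counts are illustrative assumptions, not part of the study.

```python
# Hypothetical blueprint mirroring the Table 4 example: goal 3 removed,
# goals 1 and 5 weighted at 20% each, goals 2 and 4 at 30% each, combined
# with the faculty's depth of knowledge targets (20% / 30% / 50%).
goal_weights = {"Goal 1": 0.20, "Goal 2": 0.30, "Goal 4": 0.30, "Goal 5": 0.20}
dok_targets = {1: 0.20, 2: 0.30, 3: 0.50}
exam_items = 44  # roughly the average block examination length in this study

# Allocate items per goal and cognitive level; rounding to whole items means
# the cells may total an item or two more or less than the nominal length.
blueprint = {
    goal: {level: round(exam_items * g_wt * l_wt)
           for level, l_wt in dok_targets.items()}
    for goal, g_wt in goal_weights.items()
}

for goal, levels in blueprint.items():
    print(goal, levels)  # e.g. Goal 2 {1: 3, 2: 4, 3: 7}
```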
Depth of knowledge can also be improved. While each goal had assessment items at all 3 depth of knowledge levels, the targets established by instructional faculty members were not met. Specifically, the number of level 2 items was too high, while the number of level 3 items was too low. Overall, instructors were more likely to write knowledge-based (level 1) items and wrote fewer problem-solving (level 3) items than targeted. Level 1 items usually matched only 1 objective, while level 2 and 3 items were more likely to match multiple objectives, indicating that instructors might need coaching in writing assessment items at higher cognitive levels.
The depth of knowledge for each objective should also be reviewed to determine whether the established targets are applicable for each objective. If alternative targets are suggested, the test blueprint should record the suggested percentages of each item level for each objective. Instructional faculty members should be provided with training and coaching in the effective use of a test blueprint to achieve the desired distribution of item difficulty for each objective.
A test blueprint would also address possible issues with categorical concurrence. A weighting of the goals should be followed by a similar weighting of each objective. The distribution of hits per objective measured by this study is quite wide, with 6 of the 20 objectives receiving less than 1% of the hits and a single objective receiving 26%. A weighting of each objective within each goal should be determined by instructional faculty members and any suggested changes recorded on the test blueprint.
Each objective was assessed at least 5 times throughout the course. This met Webb's definition of an acceptable range of knowledge. However, faculty review is needed to determine whether the current alignment is appropriate or whether a more uniform distribution of assessment items per objective is needed. Alternatively, objectives receiving a low number of hits should be reviewed for appropriate placement within the curriculum.
The distribution of assessment items was uniform among the objectives of goals 1 and 5. Balance of representation analysis showed that the remaining 3 goals had uneven distributions. For example, goal 4, which received 25% of all assessment hits, had a single objective accounting for 64% of that goal's hits. Faculty review is needed to determine the acceptability of the distribution of assessment items or to propose revisions in assessment item alignment.
Each alignment criterion represents a different association between the course goals and objectives and the course assessments. According to Webb's model, alignment exists when all 4 criteria have been met: a sufficient number of assessment items match the objectives, at an appropriate level of complexity, with adequate coverage and an overall balance.18 The implementation and use of a test blueprint with carefully weighted goals, weighted objectives under each goal, and specified cognitive levels for the items measuring each objective provides documented guidance to the instructional faculty members developing classroom assessments. A repeat alignment study after implementation of the changes described should document improvement in course assessment structure.
Limitations
Assessment alignment methods were developed to demonstrate or evaluate the relationship between standardized achievement tests and the state educational content standards used in developing elementary and secondary school curricula, specifically in language arts, mathematics, and science. An educational standard, developed and reviewed by educational experts, is composed of a number of goals, each of which contains a number of objectives. These are analyzed and reviewed on an ongoing basis.18 In contrast, the goals and objectives used in college courses are not as rigorously developed, which may alter the outcomes of an assessment alignment.
A review of the literature found no published documentation of an assessment alignment method designed for instructor-prepared assessments used in higher education courses. However, the results of an alignment study should provide a measure of how well assessments cover the course objectives regardless of the educational level. More research in this area would be of benefit to higher education.
SUMMARY
This study demonstrates the usefulness of alignment assessment as a tool for continuous quality improvement. The alignment assessment reveals that while assessments met Webb's criteria for acceptable alignment, there are areas for improvement. All goals and objectives were assessed, but unevenly, with 1 goal aligning with 45% of all assessment items. The assessments covered all content categories and the range of knowledge established by the objectives, but objectives under specific goals were not evenly assessed.
Alignment assessment is a method by which the integration of instructional components (objectives, instructional content, and assessments) can be evaluated. The alignment of course objectives with assessment items indirectly assesses the alignment of the instructional content. More importantly, alignment studies provide quantitative data to be used for the revision of instructional content, course objectives, and assessments.
Received October 2, 2009.
Accepted January 10, 2010.
© 2010 American Journal of Pharmaceutical Education