|Posted on November 22, 2019 at 7:00 AM|
It’s a question I get a lot! And—fair warning!—you probably won’t like my answers.
First, the learning goals we assess are promises we make to our students, their families, employers, and society: Students who successfully complete a course, program, gen ed curriculum, or other learning experience can do the things we promise in our learning goals. Those learning goals also are (or should be) the most important things we want students to learn. As a matter of integrity, we should therefore make sure, through assessment, that every student who completes a learning experience has indeed achieved its learning goals. So my first answer is that you should assess everyone’s work, not a sample.
Second, if you are looking at a sample rather than everyone’s work, you must look at a large enough sample (and a representative enough sample) to be able to generalize from that sample to all students. Political polls take this approach. A poll may say, for example, that 23% of registered voters prefer Candidate X with (in fine print at the bottom of the table) an error margin of plus or minus 5%. That means the pollster is reasonably confident that, if every registered voter could be surveyed, between 18% and 28% would prefer Candidate X.
Here’s the depressing part of this approach: An error margin of 5% (and I wouldn’t want an error margin bigger than that) requires looking at about 400 examples of student work. (This is why those political polls typically sample about 400 people.) Unless your institution or program is very large, once again you need to look at everyone’s work, not a sample. Even if your institution or program is very large, your accreditor may expect you to look separately at students at each location or otherwise break your students down into smaller groups for analysis, and those groups may well be under 400 students.
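For readers who want to check the arithmetic behind that figure of about 400, here is a minimal sketch using the standard textbook formula for estimating a proportion. It assumes a simple random sample, a 95% confidence level (z of about 1.96), and the most conservative case of a 50/50 split. None of these assumptions is spelled out in the post, and the function names are purely illustrative.

```python
import math

Z_95 = 1.96  # assumed: z-score for a 95% confidence level


def sample_size(margin, p=0.5, z=Z_95):
    """Sample size needed for a given error margin in a very large population."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def sample_size_finite(margin, population, p=0.5, z=Z_95):
    """Same estimate, adjusted with the usual finite population correction."""
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def margin_of_error(n, p=0.5, z=Z_95):
    """Approximate error margin for a sample of size n from a large population."""
    return z * math.sqrt(p * (1 - p) / n)


print(sample_size(0.05))                  # large population
print(sample_size_finite(0.05, 200))      # a hypothetical group of 200 students
print(round(margin_of_error(400), 3))     # margin for a sample of 400
```

Under these assumptions the first call gives 385, which is consistent with "about 400." For a hypothetical group of 200 students the corrected figure is 132, which is still most of the group.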
I can think of only three situations in which samples may make sense.
Expensive, supplemental assessments. Published or local surveys, interviews, and focus groups can be expensive in terms of time and/or dollars. These are supplemental assessments—indirect evidence of student learning—and it’s usually not essential to have all students participate in them.
Learning goals you don’t expect everyone to achieve. Some institutions and programs have some statements that aren’t really learning goals but aspirations: things they hope some students will achieve but cannot realistically promise that every student will achieve. Having a passion for lifelong learning and having a commitment to civic engagement are two examples of such aspirations. It may be fine to assess aspirations by looking at samples to estimate how many students are indeed on the path to achieving them.
Making evidence of student learning part of a broader analysis. For many faculty, the burden of assessment is not assessing students in classes—they do that through the grading process. The burden is in the extra work of folding their assessment into a broader analysis of student learning across a program or gen ed requirement. Sometimes faculty submit rubric or test scores to an office or committee; sometimes faculty submit actual student work; sometimes a committee assesses student work. These additional steps can be laborious and time consuming, especially if your institution doesn’t have a suitable assessment information management system. In these situations, samples of student work may save considerable time—if the samples are sufficiently large and representative to yield useful, generalizable evidence of student learning, as discussed above.
For more information on sampling, see Chapter 12 of the third edition of Assessing Student Learning: A Common Sense Guide.
|Posted on August 7, 2019 at 6:30 AM|
In my July 9, 2019, blog post I encouraged using summertime to reflect on your assessment practices, starting with the question, “Why are we assessing?”
Here are the next questions on which I suggest you reflect:
- Who are our audiences for the products we’re generating through our assessment processes?
- What decisions are they making?
- How can the products of our assessment work help them make better decisions?
In other words, before planning any assessment, figure out the decisions the assessment results should inform, then design the assessment to help inform those decisions.
Your answers to the questions I’ve listed will affect the length, format, and even the vocabulary you use in each product. Consider these products of assessment processes:
Assessment results. The key audience for assessment results should be obvious: the faculty and administrators who need them to make decisions, especially what and how to teach, how to help students learn and succeed, and how best to deploy scarce resources.
Faculty and administrators are always making these decisions. The problem is that many people make those decisions in a “data-free zone.” They make a decision simply because someone thinks they have a good idea or perhaps because a couple of students complained about something.
Today time and resources at virtually every institution are limited. We can no longer afford to make decisions simply because people think they have a good idea. Before plunging ahead with a decision, we first need some evidence that we’ve identified the problem correctly and that our solution has a good chance of solving the problem.
In his book How to Measure Anything, Douglas Hubbard points out we’re not aiming to make infallible decisions, just better decisions than we would without assessment results.
Many reports of assessment results say the results have been used only to make tweaks to assignments and course curricula (“We’ll emphasize this more in class”). But what if, say, six programs all find that their seniors can’t analyze data well? That calls for another audience: your institution’s academic leadership team. Your institution needs a process to examine assessment results across programs holistically, and probably qualitatively, to identify any pervasive issues and bring them to the attention of academic leaders so they can provide professional development and other support to address those issues across your institution.
All this suggests that you need to involve your audiences in designing your assessments and your reports of assessment results, to make sure both that you’re providing the information they need and that it’s in a format they can easily understand and use.
Learning goals have several audiences. The most important audience is students, because research has shown that many students learn more effectively when they understand what they’re supposed to be learning. Prospective students (those who are considering enrolling in your institution, program, course, or co-curricular learning experience) are another important audience. Key learning goals might help convince them to enroll (“I’ve always wanted to learn that” or “I can see why it would be important to learn that”). A third audience is potential employers (“These are the skills I’m looking for when I hire people, so I’m going to take a close look at graduates of this program”). And a fourth audience is potential funding sources such as foundations, donors, and government policymakers (“This institution or program teaches important things, the kinds of skills people need today, so it’s a good place to invest our funds”).
All these audiences need learning goals stated in clear, simple terms that they will easily understand. Academic jargon and complex statements have no place in learning goals.
Strategic and unit goals. These often have two key audiences: the employees who will help accomplish the goals and potential funding sources such as donors. Both need goals stated in clear, simple terms that they will easily understand so they can figure out where the institution or unit is headed, what it will look like in a few years, and how they can help achieve the goals.
Curriculum maps. Curriculum maps are a tool to help faculty (1) analyze the effectiveness of their curricula and (2) identify the best places to assess student achievement of key learning goals. So they need to be designed in ways that help faculty accomplish these quickly and easily.
Student assignments. In many assignments we give students, the implicit audience for their work, be it a paper, presentation, or performance, is us: the faculty or staff member giving the assignment. That doesn’t prepare students well for creating work for other audiences. When I taught first-year writing, one assignment was to write solicitations for gifts to a charity, addressed to two different audiences (plus a third statement comparing the two). When I taught statistics, I had students write a one-paragraph summary of their statistical test addressed to the hypothetical individual who requested the analysis. When I taught a graduate course in educational research methods, I had students not only draft the first three chapters of their theses but also deliver mock presentations to a foundation explaining, justifying, and seeking funding for their research.
Documentation of assessment processes (how each learning goal has been assessed). The key audience here is the faculty and staff responsible for the program, course, or other learning experience being assessed. They can use this documentation to avoid reinventing the wheel (“How did we assess this last time?”).
Another audience for documentation of assessment processes is whatever group is overseeing and supporting assessment efforts at your institution, such as an assessment committee. This group can use this documentation to (1) recognize and honor good practices, (2) share those good assessment practices with others at your college, (3) give each program or unit feedback on how well its assessment work meets the characteristics of good assessment practices, and (4) plan professional development to address any pervasive issues they see in how assessment is being done.
Documentation of uses of assessment results. Here again the key audience is the faculty and staff responsible for the program, course, or other learning experience being assessed. They can use this information to track the impact of improvements they’ve attempted (“We tried adding more homework problems but that didn’t help much. Maybe this time we could try incorporating these skills into two other required courses.”).
Why haven’t I mentioned your accreditor as an audience of your assessment products? Accreditors are a potential audience for everything I’ve mentioned here, but they’re a secondary audience. They are most interested in the impact of your assessment products on students, colleagues, and the other audiences I’ve mentioned here. They want to see what you’ve shared with your key audiences—and how those audiences have used what you’ve shared with them. Most of all, they want your summary and candid, forthright analysis of the overall effectiveness of your institution’s or program’s assessment products in helping those audiences make decisions.
|Posted on July 9, 2019 at 3:50 PM|
Summer is a great time to reflect on and possibly rethink your assessment practices. I’m a big believer in form following function, so I think the first question to reflect on should be, “Why are we doing this?” You can then reflect on how well your assessment practices achieve those purposes.
In Chapter 6 of my book Assessing Student Learning I present three purposes of assessment. Its fundamental purpose is, of course, giving students the best possible education. Assessment accomplishes this by giving faculty and staff feedback on what is and isn’t working and insight into changes that might help students learn and succeed even more effectively.
The second purpose of assessment is what I call stewardship. All colleges run on other people’s money, including tuition and fees paid by students and their families, government funds paid by taxpayers, and scholarships paid by donors. All these people deserve assurance that your college will be a wise steward of their resources, spending those resources prudently, effectively, and judiciously. Stewardship includes using good-quality evidence of student learning to help inform decisions on how those resources are spent, including how everyone spends their time. Does service learning really help develop students’ commitment to a life of service? Does the gen ed curriculum really help improve students’ critical thinking skills? Does the math requirement really help students analyze data? And are the improvements big enough to warrant the time and effort faculty and staff put into developing and delivering these learning experiences?
The third purpose of assessment is accountability: assuring your stakeholders of the effectiveness of your college, program, service, or initiative. Stakeholders include current and prospective students and their families, employers, government policy makers, alumni, taxpayers, governing board members…and, yes, accreditors. Accountability includes sharing both successes and steps being taken to make appropriate, evidence-based improvements.
So your answers to “Why are we doing this?” will probably be variations on the following themes, all of which require good-quality assessment evidence:
- We want to understand what is and isn’t working and what changes might help students learn and succeed even more effectively.
- We want to understand if what we’re doing has the desired impact on student learning and success and whether the impact is enough to justify the time and resources we’re investing.
- Our stakeholders deserve to see our successes in helping students learn and succeed and what we’re doing to improve student learning and success.
|Posted on June 8, 2019 at 6:25 AM|
I have the honor of serving as one of the faculty of this year's Mission Fulfillment Fellowship of the Northwest Commission on Colleges and Universities (NWCCU). One of the readings that’s resonated most with the Fellows is Equity and Assessment: Moving Towards Culturally Responsive Assessment by Erick Montenegro and Natasha Jankowski.
A number of the themes of this paper resonate with me. One is that I’ve always viewed assessment as simply a part of teaching, and the paper confirms that there’s a lot of overlap between culturally responsive pedagogy and culturally responsive assessment.
Second, a lot of culturally responsive assessment concepts are simply about being fair to all students. Fairness is a passion of mine and, in fact, the subject of the very first paper I wrote on assessment in higher education twenty years ago. Fairness includes:
- Writing learning goals, rubrics, prompts (assignments), and feedback using simple, clear vocabulary that entry-level students can understand, including defining any terms that may be unfamiliar to some students.
- Matching your assessments to what you teach and vice versa. Create rubrics, for example, that focus on the skills you have been helping students demonstrate, not the task you’re asking students to complete.
- Helping students learn how to do the assessment task. Grade students on their writing skill only if you have been explicitly teaching them how to write in your discipline and giving them writing assignments and feedback.
- Giving students a variety of ways to demonstrate their learning. Students might demonstrate information literacy skills, for example, through a deck of PowerPoint slides, poster, infographic, mini-class, graphic novel, portfolio, or capstone project, to name a few.
- Engaging and encouraging your students, giving them a can-do attitude.
Third, a lot of culturally responsive pedagogy and assessment concepts flow from research over the last 25 years on how to help students learn and succeed, which I’ve summarized in List 26.1 in my book Assessing Student Learning: A Common Sense Guide. We know, for example, that some students learn better when:
- They see clear relevance and value in their learning activities.
- They understand course and program learning goals and the characteristics of excellent work, often through a rubric.
- Learning activities and grades focus on important learning goals. Faculty organize curricula, teaching practices, and assessments to help students achieve important learning goals. Students spend their time and energy learning what they will be graded on.
- New learning is related to their prior experiences and what they already know, through both concrete, relevant examples and challenges to their existing paradigms.
- They learn by doing, through hands-on practice engaging in multidimensional real world tasks, rather than by listening to lectures.
- They interact meaningfully with faculty—face-to-face and/or online.
- They collaborate with other students—face-to-face and/or online—including those unlike themselves.
- Their college and its faculty and staff truly focus on helping students learn and succeed and on improving student learning and success.
These are all culturally responsive pedagogies.
So, in my opinion, the concept of culturally responsive assessment doesn’t break new ground as much as it reinforces the importance of applying what we already know: ensuring that our assessments are fair to all students, using research-informed strategies to help students learn and succeed, and viewing assessment as part of teaching rather than as a separate add-on activity.
How do we apply what we know to students whose cultural backgrounds and experiences are different from our own? In addition to the ideas I’ve already listed, here are some practical suggestions for culturally responsive assessment, gleaned from Montenegro and Jankowski’s paper and my own experiences working with people from a variety of cultures and backgrounds:
- Recognize that, like any human being, you’re not impartial. Grammatical errors littering a paper may make it hard, for example, for you to see the good ideas in it.
- Rather than looking on culturally responsive assessment as a challenge, look on it as a learning experience: a way to model the common institutional learning outcome of understanding and respecting perspectives of people different from yourself.
- Learn about your students’ cultures. Ask your institution to develop a library of short, practical resources on the cultures of its students. For cultures originating in countries outside the United States, I do an online search for business etiquette in that country or region. It’s a great way to quickly learn about a country’s culture and how to interact with people there sensitively and effectively. Just keep in mind that readings won’t address every situation you’ll encounter.
- Ask your students for help in understanding their cultural background.
- Involve students and colleagues from a variety of backgrounds in articulating learning goals, designing rubrics, and developing prompts (assignments).
- Recognize that students for whom English is a second language find it particularly hard to demonstrate their learning through written assignments and oral presentations. They may demonstrate their learning more effectively through non-verbal means such as a chart or infographic.
- Commit to using the results of your assessments to improve learning for all students, not just the majority or plurality.
|Posted on May 10, 2019 at 8:50 AM|
A recent question posted to the ASSESS listserv led to a lively discussion of direct vs. indirect evidence of student learning, including what they are and the merits of each.
I really hate jargon, and “direct” and “indirect” are right at the top of my list of jargon I hate. A few years ago I did a little poking around to try to figure out who came up with these terms. The earliest reference I could find was in a government regulation. That makes sense: governments are great at coming up with abstruse jargon!
I suspect the terms came from the legal world, which uses the concepts of direct and circumstantial evidence. Direct evidence in the legal world is evidence that supports an assertion without the need for additional evidence. Witness knowledge or direct recollection are examples of direct evidence. Circumstantial evidence is evidence from which reasonable inferences may be drawn.
In the legal world, both direct and circumstantial evidence are acceptable and each alone may be sufficient to make a legal decision. Here’s an often-cited example: If you got up in the middle of the night and saw that it’s snowing, that’s direct evidence that it snowed overnight. If you got up in the morning and saw snow on the ground, that’s circumstantial evidence that it snowed overnight. Obviously both are sufficient evidence that it snowed overnight.
But let’s say you got up in the morning and saw that the roads were wet. That’s circumstantial evidence that it rained overnight. But the evidence is not as compelling, because there might be other reasons the roads were wet. It might have snowed and the snow melted by dawn. It might have been foggy. Or street cleaners may have come through overnight. In this example, this circumstantial evidence would be more compelling if it were accompanied by corroborating evidence, such as a report from a local weather station or someone living a mile away who did get up in the middle of the night and saw rain.
So, in the legal world, direct evidence is observed and circumstantial evidence is inferred. Maybe “observed” and “inferred” might be better terms for direct and indirect evidence of student learning. Direct evidence can be observed through student products and performances. Indirect evidence must be inferred through what students tell us, through things like surveys and interviews, or what faculty tell us through things like grades, or some student behaviors such as graduation or job placement.
But the problem with using “observed” and “inferred” is that all student learning is inferred to some extent. If a crime is recorded on video, that’s clearly direct, observable evidence. But if a student writes a research paper or makes a presentation or takes a test, we’re only observing a sample of what they’ve learned, and maybe it’s not a good sample. Maybe the test happened to focus heavily on the concepts the student didn’t learn. Maybe the student was ill the day of the presentation. When we assess student learning, we’re trying to see into a student’s mind. It’s like looking into a black box fitted with lenses that are all somewhat blurry or distorted. We may need to look through several lenses, from several angles, to infer reasonably accurately what’s inside.
In the ASSESS listserv discussion, John Hathcoat and Jeremy Penn both suggested that direct and indirect evidence fall on a continuum. This is why. Some lenses are clearer than others. Some direct evidence is more compelling or convincing than others. If we see a nursing student intubate a patient successfully, we can be pretty confident that the student can perform this procedure correctly. But if we assess a student essay, we can’t be as confident about the student’s writing skill, because the skill level displayed can depend on factors such as the essay’s topic, the time and circumstances under which the student completes the assignment, and the clarity of the prompt (instructions).
So I define direct evidence as not only observable but sufficiently convincing that a critic would be persuaded. Imagine someone prominent in your community who thinks your college, your program, or your courses are a joke—students learn nothing worthwhile in them. Direct evidence is the kind that the critic wouldn’t challenge. Grades, student self-ratings, and surveys wouldn’t convince that critic. But rubric results, accompanied by a few samples of student work, would be harder for the critic to refute.
So should faculty be asked or required to provide direct and indirect evidence of student learning? If your accreditor requires direct and indirect evidence, obviously yes. Otherwise, the need for direct evidence depends on how it will be used. Direct evidence should be used, for example, when deciding whether students will progress or graduate or whether to fund or terminate a program. The need for direct evidence also depends on the likelihood that the evidence will be challenged. For relatively minor uses, such as evaluating a brief co-curricular experience, indirect evidence may be just as useful as direct evidence, if not even more insightful.
One last note on direct/observable evidence: learning goals for attitudes, values, and dispositions can be difficult if not impossible to observe. That’s because, as hard as it is to see into the mind (with that black box analogy), it’s even harder to see into the soul. One of the questions on the ASSESS listserv was what constitutes direct evidence that a dancer dances with confidence. Suppose you’re observing two dancers performing. One has enormous confidence and the other has none. Would you be able to tell them apart from their performances? If so, how? What would you see in one performance that you wouldn’t see in the other? If you can observe a difference, you can collect direct evidence. But if the difference is only in their soul—not observable—you’ll need to rely on indirect evidence to assess this learning goal.
|Posted on April 17, 2019 at 9:00 AM|
Another week, another critique of assessment, this one at the Academic Resource Conference of the WASC Senior College and University Commission.
The fundamental issue is that, more than a quarter century into the higher ed assessment movement, we still aren’t doing assessment very well. So this may be a good time to reconsider, “What is good assessment?”
A lot of people continue to point to the nine Principles of Good Practice for Assessing Student Learning developed by the old American Association for Higher Education back in 1992. In fact, NILOA once published a statement that averred that they are “aging nicely.” I’ve never liked them, however. One reason is that they combine principles of good assessment practice with principles of good assessment results without distinguishing the two. Another is that nine principles are, I think, too many—I’d rather everyone focus on just a few fundamental principles.
But most important, I think they don’t focus on the right things. They overemphasize some minor traits of good assessment (I’ve seen plenty of good assessments conducted without much student involvement, for example) and are silent on some important ones. They say nothing, for example, about the need for assessment to be cost-effective, and I think that omission is a big reason why assessment is under fire today. A year ago, for example, I did a content analysis of comments posted in response to two critiques of assessment published in the Chronicle of Higher Education and the New York Times. Almost 40% of the comments talked about what a waste of time and resources assessment work is.
When I was Director of AAHE’s Assessment Forum in 1999-2000, I argued that it was time to update them, to no avail. In the mid-2000s, I did a lit review of principles of good assessment practice. (You’d be amazed how many there are! Here’s an intriguing one from 2014.) I created a new model of just five principles, which I presented at several conferences. Good assessment practices:
- Lead to results that are useful and used.
- Flow from and focus on clear and important goals.
- Are cost-effective, yielding results that are useful enough to be worth the time and resources invested.
- Yield reasonably accurate and truthful results.
- Are valued.
These are not discrete, of course, so since developing this model I’ve played around with it. About five years ago I took it down to two principles. Under this model, good assessment practices:
- Yield results that are used in meaningful ways to improve teaching and learning. This can only happen if assessment practices focus on clear and important goals and yield reasonably accurate and truthful results. And using assessment results to inform meaningful decisions is the best way to show that assessment work is valued.
- Are sustained and pervasive. This can only happen if assessment practices are cost-effective and are valued.
While I like the simplicity of this model, it buries the idea that assessments should be cost-effective, which we really need to highlight. Today when I do presentations on good assessment, I present the following four traits, because these are the traits that we most need to focus on today. Good assessment practices:
- Lead to results that are useful and used. This is what psychometricians call consequential validity. I continue to think that this is **THE** most important characteristic of effective assessment practices—all other traits of good assessment practice flow from this one. One corollary, for example, is that assessment results must be conveyed clearly, succinctly, and meaningfully, in ways that facilitate decision-making.
- Flow from and focus on clear and important goals. While this is a corollary of the useful-and-used principle, this is so important, and so frequently a shortcoming of current assessment practices, that I highlight it separately. Learning goals need to be not only clear but relevant to students, employers, and society. They represent not what we want to teach but what students most need to learn. And those goals are treated as promises to students, employers, and society; if you pass this course or graduate, you will be able to do these things, and we will use assessments to make sure.
- Are cost-effective, yielding results that are useful enough to be worth the time and resources invested. This is a major shortcoming of many current assessment practices. They suck up enormous amounts of time and dollars, and whatever is learned from them just isn’t worth the time and money invested.
- Are part of everyday life of the college community. In other words, the culture is one of collaboration and evidence-informed planning and decision making.
|Posted on January 16, 2019 at 7:45 AM|
A recent discussion on the ACCSHE listserv reminded me that setting meaningful benchmarks or standards for student learning assessments remains a real challenge. About three years ago, I wrote a blog post on setting benchmarks or standards for rubrics. Let’s revisit that and expand the concepts to assessments beyond rubrics.
The first challenge is vocabulary. I’ve seen references to goals, targets, benchmarks, standards, thresholds. Unfortunately, the assessment community doesn’t yet have a standard glossary defining these terms (although some accreditors do). I now use standard to describe what constitutes minimally acceptable student performance (such as the passing score on a test) and target to describe the proportion of students we want to meet that standard. But my vocabulary may not match yours or your accreditor's!
The second challenge is embedded in that next-to-last sentence. We’re talking about two different numbers here: the standard describing minimally acceptable performance and the target describing the proportion of students achieving that performance level. That makes things even more confusing.
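To make the two numbers concrete, here is a small illustrative sketch. The scores, the 1-to-4 rubric scale, the standard of 3, and the 80% target are all hypothetical values chosen for the example, and the function names are assumptions rather than anything from an established tool.

```python
def share_meeting_standard(scores, standard):
    """Proportion of students scoring at or above the standard."""
    if not scores:
        raise ValueError("no scores to summarize")
    return sum(score >= standard for score in scores) / len(scores)


def target_met(scores, standard, target):
    """True if the share of students meeting the standard reaches the target."""
    return share_meeting_standard(scores, standard) >= target


# Hypothetical rubric scores on a 1-4 scale for ten students.
scores = [4, 3, 2, 3, 4, 1, 3, 3, 4, 2]

STANDARD = 3   # assumed: minimally acceptable performance level
TARGET = 0.80  # assumed: proportion of students we want at or above the standard

share = share_meeting_standard(scores, STANDARD)
print(f"{share:.0%} of students met the standard")
print("Target met" if target_met(scores, STANDARD, TARGET) else "Target not met")
```

In this made-up data set, seven of the ten students score 3 or higher, so 70% meet the standard and the 80% target is not met. The standard describes an individual student's work, while the target describes the group.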
So how do we establish meaningful standards? There are four basic ways. Three are:
1. External standards: Sometimes the standard is set for us by an external body, such as the passing score on a licensure exam.
2. Peers: Sometimes we want our students to do as well as or better than their peers.
3. Historical trends: Sometimes we want our students to do as well as or better than past students.
Much of the time none of these options is available to us, leaving us to set our own standard, what I call a local standard and what others call a competency-based or criterion-referenced standard. Here are the steps to setting a local standard:
Focus on what would not embarrass you. Would you be embarrassed if people found out that a student performing at this level passed your course or graduated from your program or institution? Then your standard is too low. What level do students need to reach to succeed at whatever comes next—more advanced study or a job?
Consider the relative harm in setting the standard too high or too low. A too-low standard means you’re risking passing or graduating students who aren’t ready for what comes next and that you’re not identifying problems with student learning that need attention. A too-high standard may mean you’re identifying shortcomings in student learning that may not be significant and possibly using scarce time and resources to address those relatively minor shortcomings.
When in doubt, set the standard relatively high rather than relatively low. Because every assessment is imperfect, you’re not going to get an accurate measure of student learning from any one assessment. Setting a relatively high bar increases the chance that every student is truly competent on the learning goals being assessed.
If you can, use external sources to help set standards. A business advisory board, faculty from other colleges, or a disciplinary association can all help get you out of the ivory tower and set defensible standards.
Consider the assignment being assessed. Essays completed in a 50-minute class are not going to be as polished as papers created through scaffolded steps throughout the semester.
Use samples of student work to inform your thinking. Discuss with your colleagues which seem unacceptably poor, which seem adequate though not stellar, and which seem outstanding, then discuss why.
If you are using a rubric to assess student learning, the standard you’re setting is the rubric column (performance level) that defines minimally acceptable work. This is the most important column in the rubric and, not coincidentally, the hardest one to complete. After all, you’re defining the borderline between passing and failing work. Ideally, you should complete this column first, then complete the remaining columns.
Now let’s turn from setting standards to setting targets for the proportions of students who achieve those standards. Here the challenge is that we have two kinds of learning goals. Some are essential. We want every college graduate to write a coherent, grammatically correct paragraph, for example. I don’t want my tax returns prepared by an accountant who can complete them correctly only 70% of the time, and I don’t want my prescriptions filled by a pharmacist who can fill them correctly only 70% of the time! For these essential goals, we want close to 100% of students meeting our standard.
Then there are aspirational goals, which not everyone need achieve. We may want college graduates to be good public speakers, for example, but in many cases graduates can lead successful lives even if they’re not. For these kinds of goals, a lower target may be appropriate.
Tests and rubrics often assess a combination of essential and aspirational goals, which suggests that overall test or rubric scores often aren’t very helpful in understanding student learning. Scores for each rubric trait or for each learning objective in the test blueprint are often much more useful.
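As a rough illustration of why trait-level results can say more than an overall score, the sketch below tallies hypothetical rubric ratings by trait. The ratings, the four-level scale, and the choice of level 3 as the minimally acceptable column are all assumptions made for the example; the trait names are borrowed from a writing rubric.

```python
# Hypothetical rubric ratings for four students
# (1 = unacceptable ... 4 = outstanding); illustration only.
ratings = [
    {"focus": 4, "organization": 3, "sentence structure": 2},
    {"focus": 3, "organization": 3, "sentence structure": 2},
    {"focus": 4, "organization": 2, "sentence structure": 3},
    {"focus": 3, "organization": 4, "sentence structure": 1},
]
standard = 3  # assumed: level 3 is the minimally acceptable column


def share_by_trait(ratings, standard):
    """Return, for each trait, the fraction of students at or above the standard."""
    traits = ratings[0].keys()
    return {
        t: sum(1 for r in ratings if r[t] >= standard) / len(ratings)
        for t in traits
    }


# Overall mean rating across all traits and students
all_values = [v for r in ratings for v in r.values()]
overall_mean = sum(all_values) / len(all_values)

by_trait = share_by_trait(ratings, standard)
print(f"Overall mean rating: {overall_mean:.2f}")
for trait, share in by_trait.items():
    print(f"{trait}: {share:.0%} at or above the standard")
```

In this invented data set the overall mean sits just under the standard, which hides the fact that every student meets the standard on focus while only one in four does on sentence structure.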
Bottom line here: I have a real problem with people who say their standard or target is 70%. It’s inevitably an arbitrary number with no real rationale. Setting meaningful standards and targets is time-consuming, but I can think of few tasks that are more important, because it’s what helps ensure that students truly learn what we want them to…and that’s what we’re all about.
By the way, my thinking here comes primarily from two sources: Setting Performance Standards by Cizek and a review of the literature that I did a couple of years ago for a chapter on rubric development that I contributed to the Handbook on Measurement, Assessment, and Evaluation in Higher Education (https://www.amazon.com/Handbook-Measurement-Assessment-Evaluation-Education/dp/1138892157). For a more thorough discussion of the ideas here, see Chapter 22 (Setting Meaningful Standards and Targets) in the new 3rd edition of my book Assessing Student Learning: A Common Sense Guide.
|Posted on September 2, 2018 at 8:25 AM|
In a recent guest post in Inside Higher Ed, “What Students See in Rubrics,” Denise Krane explained her dissatisfaction with rubrics, which can be boiled down to this statement toward the end of her post, “Ideally, rubrics are assignment specific.”
I don’t know where Denise got this idea, but it’s flat-out wrong. As I’ve mentioned in previous blog posts on rubrics, a couple of years ago I conducted a literature review for a chapter on rubric development that I wrote for the second edition of the Handbook on Measurement, Assessment, and Evaluation in Higher Education. The rubric experts I found (for example, Brookhart; Lane; Linn, Baker & Dunbar; and Messick) are unanimous in advocating what they call general rubrics, which assess achievement of the assignment’s learning outcomes, over what they call task-specific rubrics, which assess achievement of the task at hand.
Their reason is exactly what Denise advocates: we want students to focus on long-term, deep learning—in the case of writing, to develop the tools to, as Denise says, grapple with writing in general. Indeed, some experts such as Lane posit that one of the criteria of a valid rubric is its generalizability: it should tell you how well students can write (or think, or solve problems) across a range of tasks, not just the one being assessed. If you use a task-specific rubric, students will learn how to do that one task but not much more. If you use a general rubric, students will learn skills they can use in whole families of tasks.
To be fair, the experts also caution against general rubrics that are too general, such as one writing rubric used to assess student work in courses and programs across an entire college. Many experts (for example, Cooper, Freedman, Lane, and Lloyd-Jones) suggest developing rubrics for families of related assignments—perhaps one for academic writing in the humanities and another for business writing. This lets the rubric include discipline-specific nuances. For example, academic writing in the humanities is often expansive, while business writing must be succinct.
How do you move from a task-specific rubric to a general rubric? It’s all about the traits being assessed—those things listed on the left side of the rubric. Those things should be traits of the learning outcomes being assessed, not the assignment. So instead of listing each element of the assignment (I’ve seen rubrics that literally list “opening paragraph,” “second paragraph,” and so on), list each key trait of the learning goals. When I taught writing, for example, my rubric included traits like focus, organization, and sentence structure.
Over the last few months I’ve worked with a lot of faculty on creating rubrics, and I’ve seen that moving from a task-specific to a general rubric can be remarkably difficult. One reason is that faculty want students to complete the assignment correctly: Did they provide three examples? Did they cite five sources? If this is important, I suggest making “Following directions” one of the learning outcomes of the assignment and including it as a trait assessed by the rubric. Then create a separate checklist of all the components of the assignment. Ask students to complete the checklist themselves before submitting the assignment. Also consider asking students to pair up and complete checklists for each other’s assignments.
To identify the other traits assessed by the rubric, ask yourself, “What does good writing/problem solving/critical thinking/presenting look like?” Focus not on this assignment but on why you’re giving students the assignment. What do you want them to learn from this assignment that they can use in subsequent courses or after they graduate?
Denise mentioned two other things about rubrics that I’d also like to address. She surveyed her students about their perceptions of rubrics, and one complaint was that faculty expectations vary from one professor to another. The problem here is lack of collaboration. Faculty teaching sections of the same course--or related courses--should collaborate on a common rubric that they all use to grade student work. This lets students work on the same important skill over and over again in varying course contexts and see connections in their learning. If one professor wants to emphasize something above and beyond the common rubric, fine. The common elements can be the top half of the rubric, and the professor-specific elements can be the bottom half.
Denise also mentioned that her rubric ran three pages, and she hated it. I would too! Long rubrics focus on the trees rather than the forest of what we’re trying to help students learn. A shorter rubric (I recommend that rubrics fit on one page) focuses students on the most important things they’re supposed to be learning. If it frustrates you that your rubric doesn’t include everything you want to assess, keep in mind that no assessment can assess everything. Even a comprehensive final exam can’t ask every conceivable question. Just make sure that your rubric, like your exam, focuses on the most important things you want students to learn.
If you’re interested in a deeper dive into what I learned about rubrics, see my past blog posts on the topic. My book chapter in the Handbook has the full citations of the authors I've mentioned here.
|Posted on August 14, 2018 at 8:50 AM|
A while back, a faculty member teaching in a community college career program told me, “I don’t need to assess. I know what my students are having problems with—math.”
Well, maybe so, but I’ve found that my perceptions often don’t match reality, and systematic evidence gives me better insight. Let me give you a couple of examples.
Example #1: you may have noticed that my website blog page now has an index of sorts on the right side. I created it a few months ago, and what I found really surprised me. I aim for practical advice on the kinds of assessment issues that people commonly face. Beforehand I’d been feeling pretty good about the range and relevance of assessment topics that I’d covered. The index showed that, yes, I’d done lots of posts on how to assess and specifically on rubrics, a pet interest of mine. I was pleasantly surprised by the number of posts I’d done on sharing and using results.
But what shocked me was how little I’d written on assessment culture: only four posts in five years! Compare that with seventeen posts on curriculum design and teaching. Assessment culture is an enormous issue for assessment practitioners. Now knowing the short shrift I’d been giving it, I’ve written several more blog posts related to assessment culture, bringing the total to ten (including this post).
(By the way, if there’s anything you’d like to see a blog post on, let me know!)
Example #2: Earlier this summer I noticed that some of the flowering plants in my backyard weren’t blooming much. I did a shade study: one sunny day when I was home all day, every hour I made notes on which plants were in sun and which were in shade. I’d done this about five years ago but, as with the blog index, the results shocked me; some trees and shrubs had grown a lot bigger in five years and consequently some spots in my yard were now almost entirely in shade. No wonder those flowers didn’t bloom! I’ll be moving around a lot of perennials this fall to get them into sunnier spots.
So, yes, I’m a big fan of using systematic evidence to inform decisions. I’ve seen too often that our perceptions may not match reality.
But let’s go back to that professor whose students were having problems with math and give him the benefit of the doubt—maybe he’s right. My question to him was, “What are you doing about it?” The response was a shoulder shrug. His was one of many institutions with an assessment office but no faculty teaching-learning center. In other words, they’re investing more in assessment than in teaching. He had nowhere to turn for help.
My point here is that assessment is worthwhile only if the results are used to make meaningful improvements to curricula and teaching methods. Furthermore, assessment work is worthwhile only if the impact is in proportion to the time and effort spent on the assessment. I recently worked with an institution that undertook an elaborate assessment of three general education learning outcomes, in which student artifacts were sampled from a variety of courses and scored by a committee of trained reviewers. The results were pretty dismal—on average only about two thirds of students were deemed “proficient” on the competencies’ traits. But the institutional community is apparently unwilling to engage with this evidence, so nothing will be done beyond repeating the assessment in a couple of years. Such an assessment is far from worthwhile; it’s a waste of everyone’s time.
This institution is hardly alone. When I was working on the new 3rd edition of my book Assessing Student Learning: A Common Sense Guide, I searched far and wide for examples of assessments whose results led to broad-based change and found only a handful. Overwhelmingly, the changes I see are what I call minor tweaks, such as rewriting an assignment or adding more homework. These changes can be good—collectively they can add up to a sizable impact. But the assessments leading to these kinds of changes are worthwhile only if they’re very simple, quick assessments in proportion to the minor tweaks they bring about.
So is assessment worth it? It’s a mixed bag. On one hand, the time and effort devoted to some assessments aren’t worth it—the findings don’t have much impact. On the other hand, however, I remain convinced of the value of using systematic evidence to inform decisions affecting student learning. Assessment has enormous potential to move us from providing a good education to providing a truly great education. The keys to achieving this are commitments to (1) making that good-to-great transformation, (2) using systematic evidence to inform decisions large and small, and (3) doing only assessments whose impact is likely to be in proportion to the time, effort, and resources spent on them.
|Posted on July 30, 2018 at 8:20 AM|
I often hear questions about how long an “assessment cycle” should be. Fair warning: I don’t think you’re going to like my answer.
The underlying premise of the concept of an assessment cycle is that assessment of key program, general education, or institutional learning goals is too burdensome to be completed in its entirety every year, so it’s okay for assessments to be staggered across two or more years. Let’s unpack that premise a bit.
First, know that if an accreditor finds an institution or program out of compliance with even one of its standards—including assessment—Federal regulations mandate that the accreditor can give the institution no more than two years to come into compliance. (Yes, the accreditor can extend those two years for “good cause,” but let’s not count on that.) So an institution that has done nothing with assessment has a maximum of two years to come into compliance, which often means not just planning assessments but conducting them, analyzing the results, and using the results to inform decisions. I’ve worked with institutions in this situation and, yes, it can be done. So an assessment cycle, if there is one, should generally run no longer than two years.
Now consider the possibility that you’ve assessed an important learning goal, and the results are terrible. Perhaps you learn that many students can’t write coherently, or they can’t analyze information or make a coherent argument. Do you really want to wait two, three, or five years to see if subsequent students are doing better? I’d hope not! I’d like to see learning goals with poor results put on red alert, with prompt actions so students quickly start doing better and prompt re-assessments to confirm that.
Now let’s consider the premise that assessments are too burdensome for them all to be conducted annually. If your learning goals are truly important, faculty should be teaching them in every course that addresses them. They should be giving students learning activities and assignments on those goals; they should be grading students on those goals; they should be reviewing the results of their tests and rubrics; and they should be using the results of their review to understand and improve student learning in their courses. So, once things are up and running, there really shouldn’t be much extra burden in assessing important learning goals. The burdens are cranking out those dreaded assessment reports and finding time to get together with colleagues to review and discuss the results collaboratively. Those burdens are best addressed by minimizing the work of preparing those reports and by helping faculty carve out time to talk.
Now let’s consider the idea that an assessment cycle should stagger the goals being assessed. That implies that every learning goal is discrete and that it needs its own, separate assessment. In reality, learning goals are interrelated; how can one learn to write without also learning to think critically? And we know that capstone assignments—in which students work on several learning goals at once—are not only great opportunities for students to integrate and synthesize their learning but also great assessment opportunities, because we can look at student achievement of several learning goals all at once.
Then there’s the message we send when we tell faculty they need to conduct a particular assessment only once every three, four, or five years: assessment is a burdensome add-on, not part of our normal everyday work. In reality, assessment is (or should be) part of the normal teaching-learning process.
And then there are the practicalities of conducting an assessment only once every few years. Chances are that the work done a few years ago will have vanished or at least collective memory will have evaporated (why on earth did we do that assessment?). Assessment wheels must be reinvented, which can be more work than tweaking last year’s process.
So should assessments be conducted on a fixed cycle? In my opinion, no. Instead:
- Use capstone assignments to look at multiple goals simultaneously.
- If you’re getting started with assessment, assess everything, now. You’ve been dragging your feet too long already, and you’re risking an accreditation action. Remember you must not only have results but be using them within two years.
- If you’ve got disappointing results, move additional assessments of those learning goals to a front burner, assessing them frequently until you get results where you want them.
- If you’ve got terrific results, consider moving assessments of those learning goals to a back burner, perhaps every two years or so, just to make sure results aren’t slipping. This frees up time to focus on the learning goals that need time and attention.
- If assessment work is widely viewed as burdensome, it’s because its cost-benefit is out of whack. Perhaps assessment processes are too complicated, or people view the learning goals being assessed as relatively unimportant, or the results aren’t adding useful insight. Do all you can to simplify assessment work, especially reporting. If people don't find a particular assessment useful, stop doing it and do something else instead.
- If assessment work must be staggered, stagger some of your indirect assessment tools, not the learning goals or major direct assessments. An alumni survey or student survey might be conducted every three years, for example.
- For programs that “get” assessment and are conducting it routinely, ask for less frequent reports, perhaps every two or three years instead of annually. It’s a win-win reward: less work for them and less work for those charged with reviewing and offering feedback on assessment reports.
|Posted on June 10, 2018 at 8:45 AM|
Architecture critic Kate Wagner recently said, “All buildings are interesting. There is not a single building that isn’t interesting in some way.” I think we can say the same thing about assessment: All assessment is interesting. There is not a single assessment that isn’t interesting in some way.
Kate points out that what makes seemingly humdrum buildings interesting are the questions we can ask about them—in other words, how we analyze them. She suggests a number of questions that can be easily adapted to assessment:
- How do these results compare to other assessment results? We can compare results against results for other students (at our institution or elsewhere), against results for other learning goals, against how students did when they entered (value-added), against past cohorts of students, or against an established standard. Each of these comparisons can be interesting. (See Chapter 22 of my book Assessing Student Learning for more information on perspectives for comparing results.)
- Are we satisfied with the results? Why or why not?
- What do these results say about our students at this time? Students, curricula, and teaching methods are rapidly changing, which makes them--and assessment--interesting. Assessment results are a piece of history: what students learned (and didn’t learn) at this time, in this setting.
- What does this assessment say about what we and our institution value? What does it say about the world in which we live?
Why do so many faculty and staff fail to find assessment interesting? I’ve alluded to a number of possible reasons in past blog posts, but let me throw out a few that I think are particularly relevant.
1. Sometimes assessment simply isn’t presented as something that’s supposed to be interesting. It’s a chore to get through accreditation, nothing more. Just as Kate felt obliged to point out that even humdrum buildings are interesting, sometimes faculty and staff need to be reminded that assessment should be designed to yield interesting results.
2. Sometimes faculty and staff aren’t particularly interested in the learning goal being assessed. If a faculty member focuses on basic conceptual understanding in her course, she’s not going to be particularly interested in the assessment of critical thinking that she's obliged to do. Rethinking key learning goals and helping faculty and staff rethink their curricula can go a long way toward generating assessment results that faculty and staff find interesting.
3. Some faculty and staff find results mildly interesting, but not interesting enough to be worth all the time and effort that’s gone into generating them. A complex, time-consuming assessment whose results show that students are generally doing fine and are not all that different from past years is interesting but not terribly interesting. The cost-benefit isn’t there. Here the key is to scale back less-interesting assessments—maybe repeat the assessment every two or three years just to make sure results aren’t slipping—and focus on assessments that faculty and staff will find more interesting and useful.
4. Some faculty and staff aren’t really that interested in teaching—they’re far more engaged with their research agenda. And some faculty and staff aren’t really that interested in improving their teaching. Institutional leaders can help here by rethinking incentives and rewards to encourage faculty and staff to try to improve their teaching.
Kate says, “All of us have the potential to be nimble interpreters of the world around us. All we need to do is look around.” Similarly, all of us have the potential to be nimble interpreters of evidence of student learning. All we need to do is use the analytical skills we learned in college and teach to our students to find what's interesting.
|Posted on May 27, 2018 at 7:40 AM|
When I help faculty and co-curricular staff move ahead with their assessment efforts, I probably spend half our time on helping them articulate their learning goals. As the years have gone by, I’ve become ever more convinced that learning goals are the foundation of an assessment structure…and without a solid foundation, a structure can’t be well-constructed.
So what are well-stated learning goals? They have the following characteristics:
They are outcomes: what students will be able to do after they successfully complete the learning experience, not what they will do or learn during the learning experience. Example: Prepare effective, compelling visual summaries of research.
They are clear, written in simple, jargon-free terms that everyone understands, including students, employers, and colleagues in other disciplines. Example: Work collaboratively with others.
They are observable, written using action verbs, because if you can see it, you can assess it. Example: Identify and analyze ethical issues in the discipline.
They focus on skills more than knowledge, conceptual understanding, or attitudes and values, because thinking and performance skills are what employers seek in new hires. I usually suggest that at least half the learning goals of any learning experience focus on skills. Example: Integrate and properly cite scientific literature.
They are significant and aspirational: things that will take some time and effort for students to learn and that will make a real difference in their lives. Example: Identify, articulate, and solve problems in [the discipline or career field].
They are relevant, meeting the needs of students, employers, and society. They focus more on what students need to learn than what faculty want to teach. Example: Interpret numbers, data, statistics, and visual representations of them appropriately.
They are short and therefore powerful. Long, qualified or compound statements get everyone lost in the weeds. Example: Treat others with respect.
They fit the scope of the learning activity. Short co-curricular learning experiences have narrower learning goals than an entire academic program, for example.
They are limited in number. I usually suggest no more than six learning goals per learning experience. If you have 10, 15, or 20 learning goals—or more—everyone focuses on trees rather than the forest of the most important things you want students to learn.
They help students achieve bigger, broader learning goals. Course learning goals help students achieve program and/or general education learning goals; co-curricular learning goals help students achieve institutional learning goals; program learning goals help students achieve institutional learning goals.
For more information on articulating well-stated learning goals, see Chapter 4 of the new 3rd edition of my book Assessing Student Learning: A Common Sense Guide.
|Posted on March 28, 2018 at 6:25 AM|
In my February 28 blog post, I noted that many faculty express frustration with assessment along the following lines:
- What I most want students to learn is not what’s being assessed.
- I’m being told what and how to assess, without any input from me.
- I’m being told what to teach, without any input from me.
- I’m being told to assess skills that employers want, but I teach other things that I think are more important.
- A committee is doing a second review of my students’ work. I’m not trusted to assess student work fairly and accurately through my grading processes.
- I’m being asked to quantify student learning, but I don’t think that’s appropriate for what I’m teaching.
- I’m being asked to do this on top of everything else I’m already doing.
- Assessment treats learning as a scientific process, when it’s a human endeavor; every student and teacher is different.
The underlying theme here is that these faculty don’t feel that they and their views are valued and respected. When we value and respect people:
- We design assessment processes so the results are clearly useful in helping to make important decisions, not paper-pushing exercises designed solely to get through accreditation.
- We make assessment work worthwhile by using results to make important decisions, such as on resource allocations, as discussed in my March 13 blog post.
- We truly value great teaching and actively encourage the scholarship of teaching as a form of scholarship.
- We truly value innovation, especially in improving one’s teaching because, if no one wants to change anything, there’s no point in assessing.
- We take the time to give faculty and staff clear guidance and coordination, so they understand what they are to do and why.
- We invest in helping them learn what to do: how to use research-informed teaching strategies as well as how to assess.
- We support their work with appropriate resources.
- We help them find time to work on assessment and to keep assessment work cost-effective, because we respect how busy they are.
- We take a flexible approach to assessment, recognizing that one size does not fit all. We do not mandate a single institution-wide assessment approach but instead encourage a variety of assessment strategies, both quantitative and qualitative. The more choices we give faculty, the more they feel empowered.
- We design assessment processes so faculty are leaders rather than providers of assessment. We help them work collaboratively rather than in silos, inviting them to contribute to decisions on what, why, and how we assess. We try to assess those learning outcomes that the institutional community most values. More than anything else, we spend more time listening than telling.
- We recognize and honor assessment work in tangible ways, perhaps through a celebratory event, public commendations, or consideration in promotion, tenure, and merit pay applications.
For more information on these and other strategies to value and respect people who work on assessment, see Chapter 14, “Valuing Assessment and the People Who Contribute,” in the new third edition of my book Assessing Student Learning: A Common Sense Guide.
|Posted on March 13, 2018 at 9:50 AM|
In my February 28 blog post, I noted that many faculty have been expressing frustration that assessment is a waste of an enormous amount of time and resources that could be better spent on teaching. Here are some strategies to help make sure your assessment activities are meaningful and cost-effective, all drawn from the new third edition of Assessing Student Learning: A Common Sense Guide.
Don’t approach assessment as an accreditation requirement. Sure, you’re doing assessment because your accreditor requires it, but cranking out something only to keep an accreditor happy is sure to be viewed as a waste of time. Instead approach assessment as an opportunity to collect information on things you and your colleagues care about and that you want to make better decisions about. Then what you’re doing for the accreditor is summarizing and analyzing what you’ve been doing for yourselves. While a few accreditors have picky requirements that you must comply with whether you like them or not, most want you to use their standards as an opportunity to do something genuinely useful.
Keep it useful. If an assessment hasn’t yielded useful information, stop doing it and do something else. If no one’s interested in assessment results for a particular learning goal, you’ve got a clue that you’ve been assessing the wrong goal.
Make sure it’s used in helpful ways. Design processes to make sure that assessment results inform things like professional development programming, resource allocations for instructional equipment and technologies, and curriculum revisions. Make sure faculty are informed about how assessment results are used so they see its value.
Monitor your investment in assessment. Keep tabs on how much time and money each assessment is consuming…and whether what’s learned is useful enough to make that investment worthwhile. If it isn’t, change your assessment to something more cost-effective.
Be flexible. A mandate to use an assessment tool or strategy that’s inappropriate for a particular learning goal or discipline is sure to be viewed as a waste of everyone’s time. In assessment, one size definitely does not fit all.
Question anything that doesn’t make sense. If no one can give a good explanation for doing something that doesn’t make sense, stop doing it and do something more appropriate.
Start with what you have. Your college has plenty of direct and indirect evidence of student learning already on hand, from grading processes, surveys, and other sources. Squeeze information out of those sources before adding new assessments.
Think twice about blind-scoring and double-scoring student work. The costs in terms of both time and morale can be pretty steep (“I’m a professional! Why can’t they trust me to assess my own students’ work?”). Start by asking faculty to submit their own rubric ratings of their own students’ work. Only move to blind- and double-scoring if you see a big problem in their scores of a major assessment.
Start at the end and work backwards. If your program has a capstone requirement, students should be demonstrating achievement in many key program learning goals in it. Start assessment there. If students show satisfactory achievement of the learning goals, you’re done! If you’re not satisfied with their achievement of a particular learning goal, you can drill down to other places in the curriculum that address that goal.
Help everyone learn what to do. Nothing galls me more than finding out what I did wasn’t what was wanted and has to be redone. While we all learn from experience and do things better the second time, help everyone learn what to do so their first assessment is a useful one.
Minimize paperwork and bureaucratic layers. Faculty are already routinely assessing student learning through the grading process. What some resent is not the work of grading but the added workload of compiling, analyzing, and reporting assessment evidence from the grading process. Make this process as simple, intuitive, and useful as possible. Cull from your assessment report template anything that’s “nice to know” versus absolutely essential.
Make assessment technologies an optional tool, not a mandate. Only a tiny number of accreditors require using a particular assessment information management system. For everyone else, assessment information systems should be chosen and implemented to make everyone’s lives easier, not for the convenience of a few people like an assessment committee or a visiting accreditation team. If a system is hard to learn, creates more work, or is expensive, it will create resentment and make things worse rather than better. I recently encountered one system for which faculty had to tally and analyze their results, then enter the tallied results into the system. Um, shouldn’t an assessment system do the work of tallying and analysis for the faculty?
Be sensible about staggering assessments. If students are not achieving a key learning goal well, you’ll want to assess it frequently to see if they’re improving. But if students are achieving another learning goal really well, put it on a back burner, asking for assessment reports on it only every few years, to make sure things aren’t slipping.
Help everyone find time to talk. Lots of faculty have told me that they “get” assessment but simply can’t find time to discuss with their colleagues what and how to assess and how best to use the results. Help them carve out time on their calendars for these important conversations.
Link your assessment coordinator with your faculty teaching/learning center, not an accreditation or institutional effectiveness office. This makes clear that assessment is about understanding and improving student learning, not just a hoop to jump through to address some administrative or accreditation mandate.
|Posted on January 28, 2018 at 7:25 AM|
A couple of years ago I did a literature review on rubrics and learned that there’s no consensus on what a rubric is. Some experts define rubrics very narrowly, as only analytic rubrics (the kind formatted as a grid, listing traits down the left side and performance levels across the top, with the boxes filled in). But others define rubrics more broadly, as written guides for evaluating student work that, at a minimum, list the traits you’re looking for.
But what about something like the following, which I’ve seen on plenty of assignments?
70% Responds fully to the assignment (length of paper, double-spaced, typed, covers all appropriate developmental stages)
15% Grammar (including spelling, verb conjugation, structure, agreement, voice consistency, etc.)
Under the broad definition of a rubric, yes, this is a rubric. It is a written guide for evaluating student work, and it lists the three traits the faculty member is looking for.
The problem is that it isn’t a good rubric. Effective assessments, including rubrics, have the following traits:
Effective assessments yield information that is useful and used. Students who earn less than 70 points for responding to the assignment have no idea where they fell short. Those who earn less than 15 points on organization have no idea why. If the professor wants to help the next class do better on organization, there’s no insight here on where this class’s organization fell short and what most needs to be improved.
Effective assessments focus on important learning goals. You wouldn’t know it from the grading criteria, but this was supposed to be an assignment on critical thinking. Students focus their time and mental energies on what they’ll be graded on, so these students will focus on following directions for the assignment, not developing their critical thinking skills. Yes, following directions is an important skill, but critical thinking is even more important.
Effective assessments are clear. Students have no idea what this professor considers an excellently organized paper, what’s considered an adequately organized paper, and what’s considered a poorly organized paper.
Effective assessments are fair. Here, because there are only three broad, ill-defined traits, the faculty member can be (unintentionally) inconsistent in grading the papers. How many points are taken off for an otherwise fine paper that’s littered with typos? For one that isn’t double-spaced?
So the debate about an assessment should be not whether it is a rubric but rather how well it meets these four traits of effective assessment practices.
If you’d like to read more about rubrics and effective assessment practices, the third edition of my book Assessing Student Learning: A Common Sense Guide will be released on February 13 and can be pre-ordered now. The Kindle version is already available through Amazon.
|Posted on May 21, 2017 at 6:10 AM|
I was impressed with—and found myself in agreement with—Douglas Roscoe’s analysis of the state of assessment in higher education in “Toward an Improvement Paradigm for Academic Quality” in the Winter 2017 issue of Liberal Education. Like Douglas, I think the assessment movement has lost its way, and it’s time for a new paradigm. And Douglas’s improvement paradigm—which focuses on creating spaces for conversations on improving teaching and curricula, making assessment more purposeful and useful, and bringing other important information and ideas into the conversation—makes sense. Much of what he proposes is in fact echoed in Using Evidence of Student Learning to Improve Higher Education by George Kuh, Stanley Ikenberry, Natasha Jankowski, Timothy Cain, Peter Ewell, Pat Hutchings, and Jillian Kinzie.
But I don’t think his improvement paradigm goes far enough, so I propose a second, concurrent paradigm shift.
I’ve always felt that the assessment movement tried to do too much, too quickly. The assessment movement emerged from three concurrent forces. One was the U.S. federal government, which through a series of Higher Education Acts required Title IV gatekeeper accreditors to require the institutions they accredit to demonstrate that they were achieving their missions. Because the fundamental mission of an institution of higher education is, well, education, this was essentially a requirement that institutions demonstrate that their intended student learning outcomes were being achieved by their students.
The Higher Education Acts also required Title IV gatekeeper accreditors to require the institutions they accredit to demonstrate “success with respect to student achievement in relation to the institution’s mission, including, as appropriate, consideration of course completion, state licensing examinations, and job placement rates” (1998 Amendments to the Higher Education Act of 1965, Title IV, Part H, Sect. 492(b)(4)(E)). The examples in this statement imply that the federal government defines student achievement as a combination of student learning, course and degree completion, and job placement.
A second concurrent force was the movement from a teaching-centered to learning-centered approach to higher education, encapsulated in Robert Barr and John Tagg’s 1995 landmark article in Change, “From Teaching to Learning: A New Paradigm for Undergraduate Education.” The learning-centered paradigm advocates, among other things, making undergraduate education an integrated learning experience—more than a collection of courses—that focuses on the development of lasting, transferrable thinking skills rather than just basic conceptual understanding.
The third concurrent force was the growing body of research on practices that help students learn, persist, and succeed in higher education. Among these practices: students learn more effectively when they integrate and see coherence in their learning, when they participate in out-of-class activities that build on what they’re learning in the classroom, and when new learning is connected to prior experiences.
These three forces led to calls for a lot of concurrent, dramatic changes in U.S. higher education:
- Defining quality by impact rather than effort—outcomes rather than processes and intent
- Looking on undergraduate majors and general education curricula as integrated learning experiences rather than collections of courses
- Adopting new research-informed teaching methods that are a 180-degree shift from lectures
- Developing curricula, learning activities, and assessments that focus explicitly on important learning outcomes
- Identifying learning outcomes not just for courses but for entire programs, general education curricula, and even across entire institutions
- Framing what we used to call extracurricular activities as co-curricular activities, connected purposefully to academic programs
- Using rubrics rather than multiple choice tests to evaluate student learning
- Working collaboratively, including across disciplinary and organizational lines, rather than independently
These are well-founded and important aims, but they are all things that many in higher education had never considered before. Now everyone was being asked to accept the need for all these changes, learn how to make these changes, and implement all these changes—and all at the same time. No wonder there’s been so much foot-dragging on assessment! And no wonder that, a generation into the assessment movement and unrelenting accreditation pressure, there are still great swaths of the higher education community who have not yet done much of this and who indeed remain oblivious to much of this.
What particularly troubles me is that we’ve spent too much time and effort on trying to create—and assess—integrated, coherent student learning experiences and, in doing so, left the grading process in the dust. Requiring everything to be part of an integrated, coherent learning experience can lead to pushing square pegs into round holes. Consider:
- The transfer associate degrees offered by many community colleges, for example, aren’t really programs—they’re a collection of general education and cognate requirements that students complete so they’re prepared to start a major after they transfer. So identifying—or assessing—program learning outcomes for them frankly doesn’t make much sense.
- The courses available to fulfill some general education requirements don’t really have much in common, so their shared general education outcomes become so broad as to be almost meaningless.
- Some large universities are divided into separate colleges and schools, each with their own distinct missions and learning outcomes. Forcing these universities to identify institutional learning outcomes applicable to every program makes no sense—again, the outcomes must be so broad as to be almost meaningless.
- The growing numbers of students who swirl through multiple colleges before earning a degree aren’t going to have a really integrated, coherent learning experience no matter how hard any of us tries.
At the same time, we have given short shrift to helping faculty learn how to develop and use good assessments in their own classes and how to use grading information to understand and improve their own teaching. In the hundreds of workshops and presentations I’ve done across the country, I often ask for a show of hands from faculty who routinely count how many students earned each score on each rubric criterion of a class assignment, so they can understand what students learned well and what they didn’t learn well. Invariably a tiny proportion raises their hands. When I work with faculty who use multiple choice tests, I ask how many use a test blueprint to plan their tests so they align with key course objectives, and it’s consistently a foreign concept to them.
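The tally described above takes only a few lines to do. What follows is a minimal sketch in Python, not a prescribed method: the criterion names, the 1-to-4 score scale, and the sample ratings are all hypothetical, and the function name `tally_rubric_scores` is invented for illustration.

```python
from collections import Counter


def tally_rubric_scores(ratings):
    """Count how many students earned each score on each rubric criterion.

    ratings: a list with one dict per student, mapping criterion -> score.
    Returns a dict mapping criterion -> Counter of score -> number of students.
    """
    tallies = {}
    for student in ratings:
        for criterion, score in student.items():
            tallies.setdefault(criterion, Counter())[score] += 1
    return tallies


# Hypothetical ratings for four students on a three-criterion rubric
# (scores run from 1 = poor to 4 = excellent; both are assumptions).
ratings = [
    {"thesis": 4, "evidence": 2, "organization": 3},
    {"thesis": 3, "evidence": 2, "organization": 4},
    {"thesis": 4, "evidence": 1, "organization": 3},
    {"thesis": 4, "evidence": 3, "organization": 3},
]

tallies = tally_rubric_scores(ratings)

# Print one line per criterion, highest score first.
for criterion, counts in tallies.items():
    summary = ", ".join(
        f"score {score}: {n}" for score, n in sorted(counts.items(), reverse=True)
    )
    print(f"{criterion}: {summary}")
```

In this made-up class, most students score well on thesis and organization, while the evidence scores cluster at the low end, which would point to evidence as the place to focus teaching changes.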
In short, we’ve left a vital part of the higher education experience—the grading process—in the dust. We invest more time in calibrating rubrics for assessing institutional learning outcomes, for example, than we do in calibrating grades. And grades have far more serious consequences to our students, employers, and society than assessments of program, general education, co-curricular, or institutional learning outcomes. Grades decide whether students progress to the next course in a sequence, whether they can transfer to another college, whether they graduate, whether they can pursue a more advanced degree, and in some cases whether they can find employment in their discipline.
So where should we go? My paradigm springs from visits to two Canadian institutions a few years ago. At that time Canadian quality assurance agencies did not have any requirements for assessing student learning, so my workshops focused solely on assessing learning more effectively in the classroom. The workshops were well received because they offered very practical help that faculty wanted and needed. And at the end of the workshops, faculty began suggesting that perhaps they should collaborate to talk about shared learning outcomes and how to teach and assess them. In other words, discussion of classroom learning outcomes began to flow into discussion of program learning outcomes. It’s a naturalistic approach that I wish we in the United States had adopted decades ago.
What I now propose is moving to a focus on applying everything we’ve learned about curriculum design and assessment to the grading process in the classroom. In other words, my paradigm agrees with Roscoe’s that “assessment should be about changing what happens in the classroom—what students actually experience as they progress through their courses—so that learning is deeper and more consequential.” My paradigm emphasizes the following.
- Assessing program, general education, and institutional learning outcomes remains an assessment best practice. Those who have found value in these assessments would be encouraged to continue to engage in them and honored through mechanisms such as NILOA’s Excellence in Assessment designation.
- Teaching excellence is defined in significant part by four criteria: (1) the use of research-informed teaching and curricular strategies, (2) the alignment of learning activities and grading criteria to stated course objectives, (3) the use of good quality evidence, including but not limited to assessment results from the grading process, to inform changes to one’s teaching, and (4) active participation in and application of professional development opportunities on teaching including assessment.
- Investments in professional development on research-informed teaching practices exceed investments in assessment.
- Assessment work is coordinated and supported by faculty professional development centers (teaching-learning centers) rather than offices of institutional effectiveness or accreditation, sending a powerful message that assessment is about improving teaching and learning, not fulfilling an external mandate.
- We aim to move from a paradigm of assessment, not just to one of improvement as Roscoe proposes, but to one of evidence-informed improvement—a culture in which the use of good quality evidence to inform discussions and decisions is expected and valued.
- If assessment is done well, it’s a natural part of the teaching-learning process, not a burdensome add-on responsibility. The extra work is in reporting it to accreditors. This extra work can’t be eliminated, but it can be minimized and made more meaningful by establishing the expectation that reports address only key learning outcomes in key courses (including program capstones), on a rotating schedule, and that course assessments are aggregated and analyzed within the program review process.
Under this paradigm, I think we have a much better shot at achieving what’s most important: giving every student the best possible education.
|Posted on November 14, 2015 at 8:15 AM|
It’s actually impossible to determine whether any rubric, in isolation, is valid. Its validity depends on how it is used. What may look like a perfectly good rubric to assess critical thinking is invalid, for example, if used to assess assignments that ask only for descriptions. A rubric assessing writing mechanics is invalid for drawing conclusions about students’ critical thinking skills. A rubric assessing research skills is invalid if used to assess essays that students are given only 20 minutes to write.
A rubric is thus valid only if the entire assessment process—including the assignment given to students, the circumstances under which students complete the assignment, the rubric, the scoring procedure, and the use of the findings—is valid. Valid rubric assessment processes have seven characteristics. How well do your rubric assessment processes stack up?
Usability of the results. They yield results that can be and are used to make meaningful, substantive decisions to improve teaching and learning.
Match with intended learning outcomes. They use assignments and rubrics that systematically address meaningful intended learning outcomes.
Clarity. They use assignments and rubrics written in clear and observable terms, so they can be applied and interpreted consistently and equitably.
Fairness. They enable inferences that are meaningful, appropriate, and fair to all relevant subgroups of students.
Consistency. They yield consistent or reliable results, a characteristic that is affected by the clarity of the rubric’s traits and descriptions, the training of those who use it, and the degree of detail provided to students in the assignment.
Appropriate range of outcome levels. The rubrics’ “floors” and “ceilings” are appropriate to the students being assessed.
Generalizability. They enable you to draw overall conclusions about student achievement. The problem here is that any single assignment may not be a representative, generalizable sample of what students have learned. Any one essay question, for example, may elicit an unusually good or poor sample of a student’s writing skill. Increasing the quantity and variety of student work that is assessed, perhaps through portfolios, increases the generalizability of the findings.
Sources for these ideas are cited in my chapter, “Rubric Development,” in the forthcoming second edition of the Handbook on Measurement, Assessment, and Evaluation in Higher Education to be published by Taylor & Francis.
|Posted on March 17, 2014 at 8:10 AM|
Back in December, I suggested that there are just two traits of “good” assessment:
1. Good assessment practices yield results that are used in meaningful ways to improve teaching and learning.
2. Good assessment practices are sustained and pervasive.
With support from the Teagle Foundation, Larry Braskamp and Mark Engberg at Loyola University Chicago have developed “Guidelines for Judging the Effectiveness of Assessing Student Learning” that have five domains:
1. Having a clear purpose and readiness for assessment
2. Involving stakeholders throughout the assessment process
3. What and how to assess is critical
4. Assessment is telling a story
5. Improvement and follow-up are an integral part of the assessment process
I like these domains for a couple of reasons. First, as I suggested in my February 7, 2014, blog, I’d rather focus on effectiveness than quality, so I like Larry and Mark’s framework of effective assessment rather than good assessment. Second, their five domains are a good explication of my two traits, and you may find them more helpful in communicating with your colleagues and stakeholders.
|Posted on December 20, 2013 at 6:40 AM|
There are many statements of principles of good assessment practice. Several years ago I integrated them into one short list of just five traits that I hoped would be easier to understand and share. Since then I've tweaked my list, taking it down to four traits, then most recently up to six. But as I look at them now, I see just two fundamental principles of good assessment practice, with all the other traits falling within these two principles.
1. Good assessment practices yield results that are used in meaningful ways to improve teaching and learning as well as to inform plans and resource allocation decisions. This is the fundamental characteristic of good assessment. If your results are good enough quality that you can use them, they are good enough. Results are useful if they meet the following traits.
--Results flow from clear, important and relevant learning outcomes. If no one is using assessment results for your stated learning outcomes, perhaps your learning outcomes aren't all that important.
--Results are reasonably accurate and truthful. If you're not looking at enough student work, or your rubric criteria aren't consistently interpreted, for example, your results won't be useful.
--You have justifiable targets or standards for acceptable results. In other words, you've defined what successful results look like. Results can be used for improvement only if you have a good, clear sense of whether or not improvements are warranted and where improvements are most needed.
--Results are easy to find and easy to understand. If people don't have ready access to the results or they can't understand them, they can't use them for improvement.
--Results come from outside your college or program as well as within. External evidence informs your learning outcomes, your standards and your use of results.
2. Good assessment practices are sustained and pervasive. They are not bursts of effort just before an accreditation review, and they are not in just a few pockets here and there. They are part of everyday life. Assessment is sustained and pervasive if it meets the following traits:
--Assessment practices are cost-effective, yielding benefits that are worth the time, effort, and resources put into them. They are kept as simple and practical as possible.
--Assessment practices are adequately supported by the college with professional development, resources, expertise, incentives, and recognition.
--Assessment practices are flexible, evolving over time and varying by discipline and program so the results are of maximum value.
|Posted on November 17, 2013 at 6:55 AM|
Why aren't grades sufficient evidence of student learning?
1. Grades alone do not usually provide meaningful information on exactly what students have and have not learned. So it's hard to use grades alone to decide how to improve teaching and learning.
2. Grading and assessment criteria sometimes differ. Some components of grades reflect classroom management strategies (attendance, timely submission of assignments) rather than achievement of key learning outcomes.
3. Grading standards are sometimes vague or inconsistent. They may weight relatively unimportant (but easier to assess) outcomes more heavily than some major (but harder to assess) outcomes.
4. Grades do not reflect all learning experiences. They provide information on student performance in individual courses and assignments but not student progress in achieving program-wide or institution-wide outcomes.
That said, the grading process can provide excellent evidence of achievement of key learning outcomes, and using information from the grading process in this way can make assessment faster, easier, and more meaningful. NILOA (the National Institute for Learning Outcomes Assessment) has recently published a paper on how Prince George's Community College in Maryland is doing exactly this: http://learningoutcomesassessment.org/OccasionalPapernineteen.html.
You'll see from the NILOA paper that using the grading process to collect assessment evidence works only when faculty are willing to collaborate and agree on at least base grading criteria. I often suggest a two-part rubric: the top half provides the common criteria everyone agrees to, and the bottom half is class-specific criteria that individual faculty want to factor into grades.