|Posted on January 16, 2019 at 7:45 AM||comments (1)|
A recent discussion on the ACCSHE listserv reminded me that setting meaningful benchmarks or standards for student learning assessments remains a real challenge. About three years ago, I wrote a blog post on setting benchmarks or standards for rubrics. Let’s revisit that and expand the concepts to assessments beyond rubrics.
The first challenge is vocabulary. I’ve seen references to goals, targets, benchmarks, standards, thresholds. Unfortunately, the assessment community doesn’t yet have a standard glossary defining these terms (although some accreditors do). I now use standard to describe what constitutes minimally acceptable student performance (such as the passing score on a test) and target to describe the proportion of students we want to meet that standard. But my vocabulary may not match yours or your accreditor's!
The second challenge is embedded in that next-to-last sentence. We’re talking about two different numbers here: the standard describing minimally acceptable performance and the target describing the proportion of students achieving that performance level. That makes things even more confusing.
So how do we establish meaningful standards? There are four basic ways. Three are:
1. External standards: Sometimes the standard is set for us by an external body, such as the passing score on a licensure exam.
2. Peers: Sometimes we want our students to do as well as or better than their peers.
3. Historical trends: Sometimes we want our students to do as well as or better than past students.
Much of the time none of these options is available to us, leaving us to set our own standard, what I call a local standard and what others call a competency-based or criterion-referenced standard. Here are the steps to setting a local standard:
Focus on what would not embarrass you. Would you be embarrassed if people found out that a student performing at this level passed your course or graduated from your program or institution? Then your standard is too low. What level do students need to reach to succeed at whatever comes next—more advanced study or a job?
Consider the relative harm in setting the standard too high or too low. A too-low standard means you’re risking passing or graduating students who aren’t ready for what comes next and that you’re not identifying problems with student learning that need attention. A too-high standard may mean you’re identifying shortcomings in student learning that may not be significant and possibly using scarce time and resources to address those relatively minor shortcomings.
When in doubt, set the standard relatively high rather than relatively low. Because every assessment is imperfect, you’re not going to get an accurate measure of student learning from any one assessment. Setting a relatively high bar increases the chance that every student who clears it is truly competent on the learning goals being assessed.
If you can, use external sources to help set standards. A business advisory board, faculty from other colleges, or a disciplinary association can all help get you out of the ivory tower and set defensible standards.
Consider the assignment being assessed. Essays completed in a 50-minute class are not going to be as polished as papers created through scaffolded steps throughout the semester.
Use samples of student work to inform your thinking. Discuss with your colleagues which seem unacceptably poor, which seem adequate though not stellar, and which seem outstanding, then discuss why.
If you are using a rubric to assess student learning, the standard you’re setting is the rubric column (performance level) that defines minimally acceptable work. This is the most important column in the rubric and, not coincidentally, the hardest one to complete. After all, you’re defining the borderline between passing and failing work. Ideally, you should complete this column first, then complete the remaining columns.
Now let’s turn from setting standards to setting targets for the proportions of students who achieve those standards. Here the challenge is that we have two kinds of learning goals. Some are essential. We want every college graduate to write a coherent, grammatically correct paragraph, for example. I don’t want my tax returns prepared by an accountant who can complete them correctly only 70% of the time, and I don’t want my prescriptions filled by a pharmacist who can fill them correctly only 70% of the time! For these essential goals, we want close to 100% of students meeting our standard.
Then there are aspirational goals, which not everyone need achieve. We may want college graduates to be good public speakers, for example, but in many cases graduates can lead successful lives even if they’re not. For these kinds of goals, a lower target may be appropriate.
Tests and rubrics often assess a combination of essential and aspirational goals, which suggests that overall test or rubric scores often aren’t very helpful in understanding student learning. Scores for each rubric trait or for each learning objective in the test blueprint are often much more useful.
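To make the difference between a standard and a target concrete, here is a minimal sketch of trait-by-trait reporting. Everything in it is an assumption for illustration: the 1-to-4 rubric scale, the trait names, the scores, and the targets of 95% and 60% are all made up, and the function names are hypothetical rather than drawn from any assessment tool.

```python
from dataclasses import dataclass


@dataclass
class Trait:
    name: str
    standard: int   # minimally acceptable rubric level (hypothetical 1-4 scale)
    target: float   # proportion of students we want at or above the standard


def proportion_meeting(scores, standard):
    """Share of students whose score is at or above the standard."""
    if not scores:
        raise ValueError("no scores to summarize")
    return sum(score >= standard for score in scores) / len(scores)


def summarize(traits, scores_by_trait):
    """Return {trait name: (proportion meeting standard, target met?)}."""
    summary = {}
    for trait in traits:
        share = proportion_meeting(scores_by_trait[trait.name], trait.standard)
        summary[trait.name] = (share, share >= trait.target)
    return summary


# Made-up example: one essential trait with a high target,
# one aspirational trait with a lower target.
traits = [
    Trait("sentence structure", standard=3, target=0.95),  # essential
    Trait("public speaking", standard=3, target=0.60),     # aspirational
]
scores_by_trait = {
    "sentence structure": [4, 3, 3, 2, 4, 3, 3, 4, 3, 3],
    "public speaking": [2, 3, 4, 2, 3, 3, 2, 4, 3, 2],
}

for name, (share, met) in summarize(traits, scores_by_trait).items():
    print(f"{name}: {share:.0%} met the standard; target met: {met}")
```

With these made-up numbers, 90% of students meet the standard on the essential trait, which falls short of its 95% target, while 60% meet the standard on the aspirational trait, which satisfies its lower target. Pooling all twenty scores would give a single figure of 75% that hides both findings.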
Bottom line here: I have a real problem with people who say their standard or target is 70%. It’s inevitably an arbitrary number with no real rationale. Setting meaningful standards and targets is time-consuming, but I can think of few tasks that are more important, because it’s what helps ensure that students truly learn what we want them to…and that’s what we’re all about.
By the way, my thinking here comes primarily from two sources: Setting Performance Standards by Cizek and a review of the literature that I did a couple of years ago for a chapter on rubric development that I contributed to the Handbook on Measurement, Assessment, and Evaluation in Higher Education (https://www.amazon.com/Handbook-Measurement-Assessment-Evaluation-Education/dp/1138892157). For a more thorough discussion of the ideas here, see Chapter 22 (Setting Meaningful Standards and Targets) in the new 3rd edition of my book Assessing Student Learning: A Common Sense Guide.
|Posted on September 2, 2018 at 8:25 AM||comments (2)|
In a recent guest post in Inside Higher Ed, “What Students See in Rubrics,” Denise Krane explained her dissatisfaction with rubrics, which can be boiled down to this statement toward the end of her post, “Ideally, rubrics are assignment specific.”
I don’t know where Denise got this idea, but it’s flat-out wrong. As I’ve mentioned in previous blog posts on rubrics, a couple of years ago I conducted a literature review for a chapter on rubric development that I wrote for the second edition of the Handbook on Measurement, Assessment, and Evaluation in Higher Education. The rubric experts I found (for example, Brookhart; Lane; Linn, Baker & Dunbar; and Messick) are unanimous in advocating what they call general rubrics over what they call task-specific rubrics: rubrics that assess achievement of the assignment’s learning outcomes rather than achievement of the task at hand.
Their reason is exactly what Denise advocates: we want students to focus on long-term, deep learning—in the case of writing, to develop the tools to, as Denise says, grapple with writing in general. Indeed, some experts such as Lane posit that one of the criteria of a valid rubric is its generalizability: it should tell you how well students can write (or think, or solve problems) across a range of tasks, not just the one being assessed. If you use a task-specific rubric, students will learn how to do that one task but not much more. If you use a general rubric, students will learn skills they can use in whole families of tasks.
To be fair, the experts also caution against general rubrics that are too general, such as one writing rubric used to assess student work in courses and programs across an entire college. Many experts (for example, Cooper, Freedman, Lane, and Lloyd-Jones) suggest developing rubrics for families of related assignments—perhaps one for academic writing in the humanities and another for business writing. This lets the rubric include discipline-specific nuances. For example, academic writing in the humanities is often expansive, while business writing must be succinct.
How do you move from a task-specific rubric to a general rubric? It’s all about the traits being assessed: those things listed on the left side of the rubric. Those things should be traits of the learning outcomes being assessed, not the assignment. So instead of listing each element of the assignment (I’ve seen rubrics that literally list “opening paragraph,” “second paragraph,” and so on), list each key trait of the learning goals. When I taught writing, for example, my rubric included traits like focus, organization, and sentence structure.
Over the last few months I’ve worked with a lot of faculty on creating rubrics, and I’ve seen that moving from a task-specific to a general rubric can be remarkably difficult. One reason is that faculty want students to complete the assignment correctly: Did they provide three examples? Did they cite five sources? If this is important, I suggest making “Following directions” one of the learning outcomes of the assignment and including it as a trait assessed by the rubric. Then create a separate checklist of all the components of the assignment. Ask students to complete the checklist themselves before submitting the assignment. Also consider asking students to pair up and complete checklists for each other’s assignments.
To identify the other traits assessed by the rubric, ask yourself, “What does good writing/problem solving/critical thinking/presenting look like?” Focus not on this assignment but on why you’re giving students the assignment. What do you want them to learn from this assignment that they can use in subsequent courses or after they graduate?
Denise mentioned two other things about rubrics that I’d also like to address. She surveyed her students about their perceptions of rubrics, and one complaint was that faculty expectations vary from one professor to another. The problem here is lack of collaboration. Faculty teaching sections of the same course--or related courses--should collaborate on a common rubric that they all use to grade student work. This lets students work on the same important skill over and over again in varying course contexts and see connections in their learning. If one professor wants to emphasize something above and beyond the common rubric, fine. The common elements can be the top half of the rubric, and the professor-specific elements can be the bottom half.
Denise also mentioned that her rubric ran three pages, and she hated it. I would too! Long rubrics focus on the trees rather than the forest of what we’re trying to help students learn. A shorter rubric (I recommend that rubrics fit on one page) focuses students on the most important things they’re supposed to be learning. If it frustrates you that your rubric doesn’t include everything you want to assess, keep in mind that no assessment can assess everything. Even a comprehensive final exam can’t ask every conceivable question. Just make sure that your rubric, like your exam, focuses on the most important things you want students to learn.
If you’re interested in a deeper dive into what I learned about rubrics, here are some of my past blog posts. My book chapter in the Handbook has the full citations of the authors I've mentioned here.
|Posted on August 14, 2018 at 8:50 AM||comments (1)|
A while back, a faculty member teaching in a community college career program told me, “I don’t need to assess. I know what my students are having problems with—math.”
Well, maybe so, but I’ve found that my perceptions often don’t match reality, and systematic evidence gives me better insight. Let me give you a couple of examples.
Example #1: you may have noticed that my website blog page now has an index of sorts on the right side. I created it a few months ago, and what I found really surprised me. I aim for practical advice on the kinds of assessment issues that people commonly face. Beforehand I’d been feeling pretty good about the range and relevance of assessment topics that I’d covered. The index showed that, yes, I’d done lots of posts on how to assess and specifically on rubrics, a pet interest of mine. I was pleasantly surprised by the number of posts I’d done on sharing and using results.
But what shocked me was how little I’d written on assessment culture: only four posts in five years! Compare that with seventeen posts on curriculum design and teaching. Assessment culture is an enormous issue for assessment practitioners. Now knowing the short shrift I’d been giving it, I’ve written several more blog posts related to assessment culture, bringing the total to ten (including this post).
(By the way, if there’s anything you’d like to see a blog post on, let me know!)
Example #2: Earlier this summer I noticed that some of the flowering plants in my backyard weren’t blooming much. I did a shade study: one sunny day when I was home all day, every hour I made notes on which plants were in sun and which were in shade. I’d done this about five years ago but, as with the blog index, the results shocked me; some trees and shrubs had grown a lot bigger in five years and consequently some spots in my yard were now almost entirely in shade. No wonder those flowers didn’t bloom! I’ll be moving around a lot of perennials this fall to get them into sunnier spots.
So, yes, I’m a big fan of using systematic evidence to inform decisions. I’ve seen too often that our perceptions may not match reality.
But let’s go back to that professor whose students were having problems with math and give him the benefit of the doubt—maybe he’s right. My question to him was, “What are you doing about it?” The response was a shoulder shrug. His was one of many institutions with an assessment office but no faculty teaching-learning center. In other words, they’re investing more in assessment than in teaching. He had nowhere to turn for help.
My point here is that assessment is worthwhile only if the results are used to make meaningful improvements to curricula and teaching methods. Furthermore, assessment work is worthwhile only if the impact is in proportion to the time and effort spent on the assessment. I recently worked with an institution that undertook an elaborate assessment of three general education learning outcomes, in which student artifacts were sampled from a variety of courses and scored by a committee of trained reviewers. The results were pretty dismal—on average only about two thirds of students were deemed “proficient” on the competencies’ traits. But the institutional community is apparently unwilling to engage with this evidence, so nothing will be done beyond repeating the assessment in a couple of years. Such an assessment is far from worthwhile; it’s a waste of everyone’s time.
This institution is hardly alone. When I was working on the new 3rd edition of my book Assessing Student Learning: A Common Sense Guide, I searched far and wide for examples of assessments whose results led to broad-based change and found only a handful. Overwhelmingly, the changes I see are what I call minor tweaks, such as rewriting an assignment or adding more homework. These changes can be good—collectively they can add up to a sizable impact. But the assessments leading to these kinds of changes are worthwhile only if they’re very simple, quick assessments in proportion to the minor tweaks they bring about.
So is assessment worth it? It’s a mixed bag. On one hand, the time and effort devoted to some assessments aren’t worth it—the findings don’t have much impact. On the other hand, however, I remain convinced of the value of using systematic evidence to inform decisions affecting student learning. Assessment has enormous potential to move us from providing a good education to providing a truly great education. The keys to achieving this are commitments to (1) making that good-to-great transformation, (2) using systematic evidence to inform decisions large and small, and (3) doing only assessments whose impact is likely to be in proportion to the time, effort, and resources spent on them.
|Posted on July 30, 2018 at 8:20 AM||comments (2)|
I often hear questions about how long an “assessment cycle” should be. Fair warning: I don’t think you’re going to like my answer.
The underlying premise of the concept of an assessment cycle is that assessment of key program, general education, or institutional learning goals is too burdensome to be completed in its entirety every year, so it’s okay for assessments to be staggered across two or more years. Let’s unpack that premise a bit.
First, know that if an accreditor finds an institution or program out of compliance with even one of its standards—including assessment—Federal regulations mandate that the accreditor can give the institution no more than two years to come into compliance. (Yes, the accreditor can extend those two years for “good cause,” but let’s not count on that.) So an institution that has done nothing with assessment has a maximum of two years to come into compliance, which often means not just planning assessments but conducting them, analyzing the results, and using the results to inform decisions. I’ve worked with institutions in this situation and, yes, it can be done. So an assessment cycle, if there is one, should generally run no longer than two years.
Now consider the possibility that you’ve assessed an important learning goal, and the results are terrible. Perhaps you learn that many students can’t write coherently, or they can’t analyze information or make a coherent argument. Do you really want to wait two, three, or five years to see if subsequent students are doing better? I’d hope not! I’d like to see learning goals with poor results put on red alert, with prompt actions so students quickly start doing better and prompt re-assessments to confirm that.
Now let’s consider the premise that assessments are too burdensome for them all to be conducted annually. If your learning goals are truly important, faculty should be teaching them in every course that addresses them. They should be giving students learning activities and assignments on those goals; they should be grading students on those goals; they should be reviewing the results of their tests and rubrics; and they should be using the results of their review to understand and improve student learning in their courses. So, once things are up and running, there really shouldn’t be much extra burden in assessing important learning goals. The burdens are cranking out those dreaded assessment reports and finding time to get together with colleagues to review and discuss the results collaboratively. Those burdens are best addressed by minimizing the work of preparing those reports and by helping faculty carve out time to talk.
Now let’s consider the idea that an assessment cycle should stagger the goals being assessed. That implies that every learning goal is discrete and that it needs its own, separate assessment. In reality, learning goals are interrelated; how can one learn to write without also learning to think critically? And we know that capstone assignments—in which students work on several learning goals at once—are not only great opportunities for students to integrate and synthesize their learning but also great assessment opportunities, because we can look at student achievement of several learning goals all at once.
Then there’s the message we send when we tell faculty they need to conduct a particular assessment only once every three, four, or five years: assessment is a burdensome add-on, not part of our normal everyday work. In reality, assessment is (or should be) part of the normal teaching-learning process.
And then there are the practicalities of conducting an assessment only once every few years. Chances are that the work done a few years ago will have vanished or at least collective memory will have evaporated (why on earth did we do that assessment?). Assessment wheels must be reinvented, which can be more work than tweaking last year’s process.
So should assessments be conducted on a fixed cycle? In my opinion, no. Instead:
- Use capstone assignments to look at multiple goals simultaneously.
- If you’re getting started with assessment, assess everything, now. You’ve been dragging your feet too long already, and you’re risking an accreditation action. Remember you must not only have results but be using them within two years.
- If you’ve got disappointing results, move additional assessments of those learning goals to a front burner, assessing them frequently until you get results where you want them.
- If you’ve got terrific results, consider moving assessments of those learning goals to a back burner, perhaps every two years or so, just to make sure results aren’t slipping. This frees up time to focus on the learning goals that need time and attention.
- If assessment work is widely viewed as burdensome, it’s because its cost-benefit is out of whack. Perhaps assessment processes are too complicated, or people view the learning goals being assessed as relatively unimportant, or the results aren’t adding useful insight. Do all you can to simplify assessment work, especially reporting. If people don't find a particular assessment useful, stop doing it and do something else instead.
- If assessment work must be staggered, stagger some of your indirect assessment tools, not the learning goals or major direct assessments. An alumni survey or student survey might be conducted every three years, for example.
- For programs that “get” assessment and are conducting it routinely, ask for less frequent reports, perhaps every two or three years instead of annually. It’s a win-win reward: less work for them and less work for those charged with reviewing and offering feedback on assessment reports.
|Posted on June 10, 2018 at 8:45 AM||comments (2)|
Architecture critic Kate Wagner recently said, “All buildings are interesting. There is not a single building that isn’t interesting in some way.” I think we can say the same thing about assessment: All assessment is interesting. There is not a single assessment that isn’t interesting in some way.
Kate points out that what makes seemingly humdrum buildings interesting are the questions we can ask about them—in other words, how we analyze them. She suggests a number of questions that can be easily adapted to assessment:
- How do these results compare to other assessment results? We can compare results against results for other students (at our institution or elsewhere), against results for other learning goals, against how students did when they entered (value-added), against past cohorts of students, or against an established standard. Each of these comparisons can be interesting. (See Chapter 22 of my book Assessing Student Learning for more information on perspectives for comparing results.)
- Are we satisfied with the results? Why or why not?
- What do these results say about our students at this time? Students, curricula, and teaching methods are rapidly changing, which makes them--and assessment--interesting. Assessment results are a piece of history: what students learned (and didn’t learn) at this time, in this setting.
- What does this assessment say about what we and our institution value? What does it say about the world in which we live?
Why do so many faculty and staff fail to find assessment interesting? I’ve alluded to a number of possible reasons in past blog posts, but let me throw out a few that I think are particularly relevant.
1. Sometimes assessment simply isn’t presented as something that’s supposed to be interesting. It’s a chore to get through accreditation, nothing more. Just as Kate felt obliged to point out that even humdrum buildings are interesting, sometimes faculty and staff need to be reminded that assessment should be designed to yield interesting results.
2. Sometimes faculty and staff aren’t particularly interested in the learning goal being assessed. If a faculty member focuses on basic conceptual understanding in her course, she’s not going to be particularly interested in the assessment of critical thinking that she's obliged to do. Rethinking key learning goals and helping faculty and staff rethink their curricula can go a long way toward generating assessment results that faculty and staff find interesting.
3. Some faculty and staff find results mildly interesting, but not interesting enough to be worth all the time and effort that’s gone into generating them. A complex, time-consuming assessment whose results show that students are generally doing fine and are not all that different from past years is interesting but not terribly interesting. The cost-benefit isn’t there. Here the key is to scale back less-interesting assessments—maybe repeat the assessment every two or three years just to make sure results aren’t slipping—and focus on assessments that faculty and staff will find more interesting and useful.
4. Some faculty and staff aren’t really that interested in teaching—they’re far more engaged with their research agenda. And some faculty and staff aren’t really that interested in improving their teaching. Institutional leaders can help here by rethinking incentives and rewards to encourage faculty and staff to try to improve their teaching.
Kate says, “All of us have the potential to be nimble interpreters of the world around us. All we need to do is look around.” Similarly, all of us have the potential to be nimble interpreters of evidence of student learning. All we need to do is use the analytical skills we learned in college, and now teach our students, to find what's interesting.
|Posted on May 27, 2018 at 7:40 AM||comments (0)|
When I help faculty and co-curricular staff move ahead with their assessment efforts, I probably spend half our time on helping them articulate their learning goals. As the years have gone by, I’ve become ever more convinced that learning goals are the foundation of an assessment structure…and without a solid foundation, a structure can’t be well-constructed.
So what are well-stated learning goals? They have the following characteristics:
They are outcomes: what students will be able to do after they successfully complete the learning experience, not what they will do or learn during the learning experience. Example: Prepare effective, compelling visual summaries of research.
They are clear, written in simple, jargon-free terms that everyone understands, including students, employers, and colleagues in other disciplines. Example: Work collaboratively with others.
They are observable, written using action verbs, because if you can see it, you can assess it. Example: Identify and analyze ethical issues in the discipline.
They focus on skills more than knowledge, conceptual understanding, or attitudes and values, because thinking and performance skills are what employers seek in new hires. I usually suggest that at least half the learning goals of any learning experience focus on skills. Example: Integrate and properly cite scientific literature.
They are significant and aspirational: things that will take some time and effort for students to learn and that will make a real difference in their lives. Example: Identify, articulate, and solve problems in [the discipline or career field].
They are relevant, meeting the needs of students, employers, and society. They focus more on what students need to learn than what faculty want to teach. Example: Interpret numbers, data, statistics, and visual representations of them appropriately.
They are short and therefore powerful. Long, qualified or compound statements get everyone lost in the weeds. Example: Treat others with respect.
They fit the scope of the learning activity. Short co-curricular learning experiences have narrower learning goals than an entire academic program, for example.
They are limited in number. I usually suggest no more than six learning goals per learning experience. If you have 10, 15, or 20 learning goals—or more—everyone focuses on trees rather than the forest of the most important things you want students to learn.
They help students achieve bigger, broader learning goals. Course learning goals help students achieve program and/or general education learning goals; co-curricular learning goals help students achieve institutional learning goals; program learning goals help students achieve institutional learning goals.
For more information on articulating well-stated learning goals, see Chapter 4 of the new 3rd edition of my book Assessing Student Learning: A Common Sense Guide.
|Posted on March 28, 2018 at 6:25 AM||comments (1)|
In my February 28 blog post, I noted that many faculty express frustration with assessment along the following lines:
- What I most want students to learn is not what’s being assessed.
- I’m being told what and how to assess, without any input from me.
- I’m being told what to teach, without any input from me.
- I’m being told to assess skills that employers want, but I teach other things that I think are more important.
- A committee is doing a second review of my students’ work. I’m not trusted to assess student work fairly and accurately through my grading processes.
- I’m being asked to quantify student learning, but I don’t think that’s appropriate for what I’m teaching.
- I’m being asked to do this on top of everything else I’m already doing.
- Assessment treats learning as a scientific process, when it’s a human endeavor; every student and teacher is different.
The underlying theme here is that these faculty don’t feel that they and their views are valued and respected. When we value and respect people:
- We design assessment processes so the results are clearly useful in helping to make important decisions, not paper-pushing exercises designed solely to get through accreditation.
- We make assessment work worthwhile by using results to make important decisions, such as on resource allocations, as discussed in my March 13 blog post.
- We truly value great teaching and actively encourage the scholarship of teaching as a form of scholarship.
- We truly value innovation, especially in improving one’s teaching because, if no one wants to change anything, there’s no point in assessing.
- We take the time to give faculty and staff clear guidance and coordination, so they understand what they are to do and why.
- We invest in helping them learn what to do: how to use research-informed teaching strategies as well as how to assess.
- We support their work with appropriate resources.
- We help them find time to work on assessment and to keep assessment work cost-effective, because we respect how busy they are.
- We take a flexible approach to assessment, recognizing that one size does not fit all. We do not mandate a single institution-wide assessment approach but instead encourage a variety of assessment strategies, both quantitative and qualitative. The more choices we give faculty, the more they feel empowered.
- We design assessment processes so faculty are leaders rather than providers of assessment. We help them work collaboratively rather than in silos, inviting them to contribute to decisions on what, why, and how we assess. We try to assess those learning outcomes that the institutional community most values. More than anything else, we spend more time listening than telling.
- We recognize and honor assessment work in tangible ways, perhaps through a celebratory event, public commendations, or consideration in promotion, tenure, and merit pay applications.
For more information on these and other strategies to value and respect people who work on assessment, see Chapter 14, “Valuing Assessment and the People Who Contribute,” in the new third edition of my book Assessing Student Learning: A Common Sense Guide.
|Posted on March 13, 2018 at 9:50 AM||comments (9)|
In my February 28 blog post, I noted that many faculty have been expressing frustration that assessment is a waste of an enormous amount of time and resources that could be better spent on teaching. Here are some strategies to help make sure your assessment activities are meaningful and cost-effective, all drawn from the new third edition of Assessing Student Learning: A Common Sense Guide.
Don’t approach assessment as an accreditation requirement. Sure, you’re doing assessment because your accreditor requires it, but cranking out something only to keep an accreditor happy is sure to be viewed as a waste of time. Instead approach assessment as an opportunity to collect information on things you and your colleagues care about and that you want to make better decisions about. Then what you’re doing for the accreditor is summarizing and analyzing what you’ve been doing for yourselves. While a few accreditors have picky requirements that you must comply with whether you like them or not, most want you to use their standards as an opportunity to do something genuinely useful.
Keep it useful. If an assessment hasn’t yielded useful information, stop doing it and do something else. If no one’s interested in assessment results for a particular learning goal, you’ve got a clue that you’ve been assessing the wrong goal.
Make sure it’s used in helpful ways. Design processes to make sure that assessment results inform things like professional development programming, resource allocations for instructional equipment and technologies, and curriculum revisions. Make sure faculty are informed about how assessment results are used so they see its value.
Monitor your investment in assessment. Keep tabs on how much time and money each assessment is consuming…and whether what’s learned is useful enough to make that investment worthwhile. If it isn’t, change your assessment to something more cost-effective.
Be flexible. A mandate to use an assessment tool or strategy that’s inappropriate for a particular learning goal or discipline is sure to be viewed as a waste of everyone’s time. In assessment, one size definitely does not fit all.
Question anything that doesn’t make sense. If no one can give a good explanation for doing something that doesn’t make sense, stop doing it and do something more appropriate.
Start with what you have. Your college has plenty of direct and indirect evidence of student learning already on hand, from grading processes, surveys, and other sources. Squeeze information out of those sources before adding new assessments.
Think twice about blind-scoring and double-scoring student work. The costs in terms of both time and morale can be pretty steep (“I’m a professional! Why can’t they trust me to assess my own students’ work?”). Start by asking faculty to submit their own rubric ratings of their own students’ work. Only move to blind- and double-scoring if you see a big problem in their scores of a major assessment.
Start at the end and work backwards. If your program has a capstone requirement, students should be demonstrating achievement in many key program learning goals in it. Start assessment there. If students show satisfactory achievement of the learning goals, you’re done! If you’re not satisfied with their achievement of a particular learning goal, you can drill down to other places in the curriculum that address that goal.
Help everyone learn what to do. Nothing galls me more than finding out what I did wasn’t what was wanted and has to be redone. While we all learn from experience and do things better the second time, help everyone learn what to do so their first assessment is a useful one.
Minimize paperwork and bureaucratic layers. Faculty are already routinely assessing student learning through the grading process. What some resent is not the work of grading but the added workload of compiling, analyzing, and reporting assessment evidence from the grading process. Make this process as simple, intuitive, and useful as possible. Cull from your assessment report template anything that’s “nice to know” versus absolutely essential.
Make assessment technologies an optional tool, not a mandate. Only a tiny number of accreditors require using a particular assessment information management system. For everyone else, assessment information systems should be chosen and implemented to make everyone’s lives easier, not for the convenience of a few people like an assessment committee or a visiting accreditation team. If a system is hard to learn, creates more work, or is expensive, it will create resentment and make things worse rather than better. I recently encountered one system for which faculty had to tally and analyze their results, then enter the tallied results into the system. Um, shouldn’t an assessment system do the work of tallying and analysis for the faculty?
Be sensible about staggering assessments. If students are not achieving a key learning goal well, you’ll want to assess it frequently to see if they’re improving. But if students are achieving another learning goal really well, put it on a back burner, asking for assessment reports on it only every few years, to make sure things aren’t slipping.
Help everyone find time to talk. Lots of faculty have told me that they “get” assessment but simply can’t find time to discuss with their colleagues what and how to assess and how best to use the results. Help them carve out time on their calendars for these important conversations.
Link your assessment coordinator with your faculty teaching/learning center, not an accreditation or institutional effectiveness office. This makes clear that assessment is about understanding and improving student learning, not just a hoop to jump through to address some administrative or accreditation mandate.
|Posted on January 28, 2018 at 7:25 AM||comments (0)|
A couple of years ago I did a literature review on rubrics and learned that there’s no consensus on what a rubric is. Some experts define rubrics very narrowly, as only analytic rubrics: the kind formatted as a grid, listing traits down the left side and performance levels across the top, with the boxes filled in. But others define rubrics more broadly, as written guides for evaluating student work that, at a minimum, list the traits you’re looking for.
But what about something like the following, which I’ve seen on plenty of assignments?
70% Responds fully to the assignment (length of paper, double-spaced, typed, covers all appropriate developmental stages)
15% Organization
15% Grammar (including spelling, verb conjugation, structure, agreement, voice consistency, etc.)
Under the broad definition of a rubric, yes, this is a rubric. It is a written guide for evaluating student work, and it lists the three traits the faculty member is looking for.
The problem is that it isn’t a good rubric. Effective assessments, including rubrics, have the following traits:
Effective assessments yield information that is useful and used. Students who earn less than 70 points for responding to the assignment have no idea where they fell short. Those who earn less than 15 points on organization have no idea why. If the professor wants to help the next class do better on organization, there’s no insight here on where this class’s organization fell short and what most needs to be improved.
Effective assessments focus on important learning goals. You wouldn’t know it from the grading criteria, but this was supposed to be an assignment on critical thinking. Students focus their time and mental energies on what they’ll be graded on, so these students will focus on following directions for the assignment, not developing their critical thinking skills. Yes, following directions is an important skill, but critical thinking is even more important.
Effective assessments are clear. Students have no idea what this professor considers an excellently organized paper, what’s considered an adequately organized paper, and what’s considered a poorly organized paper.
Effective assessments are fair. Here, because there are only three broad, ill-defined traits, the faculty member can be (unintentionally) inconsistent in grading the papers. How many points are taken off for an otherwise fine paper that’s littered with typos? For one that isn’t double-spaced?
So the debate about an assessment should be not whether it is a rubric but rather how well it meets these four traits of effective assessment practices.
If you’d like to read more about rubrics and effective assessment practices, the third edition of my book Assessing Student Learning: A Common Sense Guide will be released on February 13 and can be pre-ordered now. The Kindle version is already available through Amazon.
|Posted on May 21, 2017 at 6:10 AM||comments (6)|
I was impressed with—and found myself in agreement with—Douglas Roscoe’s analysis of the state of assessment in higher education in “Toward an Improvement Paradigm for Academic Quality” in the Winter 2017 issue of Liberal Education. Like Douglas, I think the assessment movement has lost its way, and it’s time for a new paradigm. And Douglas’s improvement paradigm—which focuses on creating spaces for conversations on improving teaching and curricula, making assessment more purposeful and useful, and bringing other important information and ideas into the conversation—makes sense. Much of what he proposes is in fact echoed in Using Evidence of Student Learning to Improve Higher Education by George Kuh, Stanley Ikenberry, Natasha Jankowski, Timothy Cain, Peter Ewell, Pat Hutchings, and Jillian Kinzie.
But I don’t think his improvement paradigm goes far enough, so I propose a second, concurrent paradigm shift.
I’ve always felt that the assessment movement tried to do too much, too quickly. The assessment movement emerged from three concurrent forces. One was the U.S. federal government, which through a series of Higher Education Acts required Title IV gatekeeper accreditors to require the institutions they accredit to demonstrate that they were achieving their missions. Because the fundamental mission of an institution of higher education is, well, education, this was essentially a requirement that institutions demonstrate that their intended student learning outcomes were being achieved by their students.
The Higher Education Acts also required Title IV gatekeeper accreditors to require the institutions they accredit to demonstrate “success with respect to student achievement in relation to the institution’s mission, including, as appropriate, consideration of course completion, state licensing examinations, and job placement rates” (1998 Amendments to the Higher Education Act of 1965, Title IV, Part H, Sect. 492(b)(4)(E)). The examples in this statement imply that the federal government defines student achievement as a combination of student learning, course and degree completion, and job placement.
A second concurrent force was the movement from a teaching-centered to a learning-centered approach to higher education, encapsulated in Robert Barr and John Tagg’s 1995 landmark article in Change, “From Teaching to Learning: A New Paradigm for Undergraduate Education.” The learning-centered paradigm advocates, among other things, making undergraduate education an integrated learning experience, more than a collection of courses, that focuses on the development of lasting, transferable thinking skills rather than just basic conceptual understanding.
The third concurrent force was the growing body of research on practices that help students learn, persist, and succeed in higher education. Among these practices: students learn more effectively when they integrate and see coherence in their learning, when they participate in out-of-class activities that build on what they’re learning in the classroom, and when new learning is connected to prior experiences.
These three forces led to calls for a lot of concurrent, dramatic changes in U.S. higher education:
- Defining quality by impact rather than effort—outcomes rather than processes and intent
- Looking on undergraduate majors and general education curricula as integrated learning experiences rather than collections of courses
- Adopting new research-informed teaching methods that are a 180-degree shift from lectures
- Developing curricula, learning activities, and assessments that focus explicitly on important learning outcomes
- Identifying learning outcomes not just for courses but for entire programs, general education curricula, and even across entire institutions
- Framing what we used to call extracurricular activities as co-curricular activities, connected purposefully to academic programs
- Using rubrics rather than multiple choice tests to evaluate student learning
- Working collaboratively, including across disciplinary and organizational lines, rather than independently
These are well-founded and important aims, but they are all things that many in higher education had never considered before. Now everyone was being asked to accept the need for all these changes, learn how to make these changes, and implement all these changes—and all at the same time. No wonder there’s been so much foot-dragging on assessment! And no wonder that, a generation into the assessment movement and unrelenting accreditation pressure, there are still great swaths of the higher education community who have not yet done much of this and who indeed remain oblivious to much of this.
What particularly troubles me is that we’ve spent too much time and effort on trying to create—and assess—integrated, coherent student learning experiences and, in doing so, left the grading process in the dust. Requiring everything to be part of an integrated, coherent learning experience can lead to pushing square pegs into round holes. Consider:
- The transfer associate degrees offered by many community colleges, for example, aren’t really programs—they’re a collection of general education and cognate requirements that students complete so they’re prepared to start a major after they transfer. So identifying—or assessing—program learning outcomes for them frankly doesn’t make much sense.
- The courses available to fulfill some general education requirements don’t really have much in common, so their shared general education outcomes become so broad as to be almost meaningless.
- Some large universities are divided into separate colleges and schools, each with their own distinct missions and learning outcomes. Forcing these universities to identify institutional learning outcomes applicable to every program makes no sense—again, the outcomes must be so broad as to be almost meaningless.
- The growing numbers of students who swirl through multiple colleges before earning a degree aren’t going to have a really integrated, coherent learning experience no matter how hard any of us tries.
At the same time, we have given short shrift to helping faculty learn how to develop and use good assessments in their own classes and how to use grading information to understand and improve their own teaching. In the hundreds of workshops and presentations I’ve done across the country, I often ask for a show of hands from faculty who routinely count how many students earned each score on each rubric criterion of a class assignment, so they can understand what students learned well and what they didn’t learn well. Invariably a tiny proportion raises their hands. When I work with faculty who use multiple choice tests, I ask how many use a test blueprint to plan their tests so they align with key course objectives, and it’s consistently a foreign concept to them.
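For anyone who has never tried the tally described above, here is a minimal sketch of what it can look like. The criterion names, the 1-to-4 scale, and the four sample students are all made up for illustration; they are assumptions, not data from any real class.

```python
from collections import Counter

# Hypothetical rubric ratings for one assignment: one dict per student,
# mapping each rubric criterion to the level earned (1 = lowest, 4 = highest).
ratings = [
    {"thesis": 4, "evidence": 3, "organization": 2},
    {"thesis": 3, "evidence": 3, "organization": 2},
    {"thesis": 4, "evidence": 2, "organization": 1},
    {"thesis": 2, "evidence": 3, "organization": 3},
]


def tally_by_criterion(ratings):
    """Count how many students earned each level on each rubric criterion."""
    tallies = {}
    for student in ratings:
        for criterion, level in student.items():
            tallies.setdefault(criterion, Counter())[level] += 1
    return tallies


tallies = tally_by_criterion(ratings)
for criterion, counts in tallies.items():
    summary = ", ".join(f"level {lvl}: {n}" for lvl, n in sorted(counts.items()))
    print(f"{criterion}: {summary}")
```

A table like this shows at a glance which criteria most students handled well and which ones deserve more attention the next time the course is taught.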
In short, we’ve left a vital part of the higher education experience—the grading process—in the dust. We invest more time in calibrating rubrics for assessing institutional learning outcomes, for example, than we do in calibrating grades. And grades have far more serious consequences to our students, employers, and society than assessments of program, general education, co-curricular, or institutional learning outcomes. Grades decide whether students progress to the next course in a sequence, whether they can transfer to another college, whether they graduate, whether they can pursue a more advanced degree, and in some cases whether they can find employment in their discipline.
So where should we go? My paradigm springs from visits to two Canadian institutions a few years ago. At that time Canadian quality assurance agencies did not have any requirements for assessing student learning, so my workshops focused solely on assessing learning more effectively in the classroom. The workshops were well received because they offered very practical help that faculty wanted and needed. And at the end of the workshops, faculty began suggesting that perhaps they should collaborate to talk about shared learning outcomes and how to teach and assess them. In other words, discussion of classroom learning outcomes began to flow into discussion of program learning outcomes. It’s a naturalistic approach that I wish we in the United States had adopted decades ago.
What I now propose is moving to a focus on applying everything we’ve learned about curriculum design and assessment to the grading process in the classroom. In other words, my paradigm agrees with Roscoe’s that “assessment should be about changing what happens in the classroom—what students actually experience as they progress through their courses—so that learning is deeper and more consequential.” My paradigm emphasizes the following.
- Assessing program, general education, and institutional learning outcomes remains an assessment best practice. Those who have found value in these assessments would be encouraged to continue to engage in them and honored through mechanisms such as NILOA’s Excellence in Assessment designation.
- Teaching excellence is defined in significant part by four criteria: (1) the use of research-informed teaching and curricular strategies, (2) the alignment of learning activities and grading criteria to stated course objectives, (3) the use of good quality evidence, including but not limited to assessment results from the grading process, to inform changes to one’s teaching, and (4) active participation in and application of professional development opportunities on teaching including assessment.
- Investments in professional development on research-informed teaching practices exceed investments in assessment.
- Assessment work is coordinated and supported by faculty professional development centers (teaching-learning centers) rather than offices of institutional effectiveness or accreditation, sending a powerful message that assessment is about improving teaching and learning, not fulfilling an external mandate.
- We aim to move from a paradigm of assessment, not just to one of improvement as Roscoe proposes, but to one of evidence-informed improvement—a culture in which the use of good quality evidence to inform discussions and decisions is expected and valued.
- If assessment is done well, it’s a natural part of the teaching-learning process, not a burdensome add-on responsibility. The extra work is in reporting it to accreditors. This extra work can’t be eliminated, but it can be minimized and made more meaningful by establishing the expectation that reports address only key learning outcomes in key courses (including program capstones), on a rotating schedule, and that course assessments are aggregated and analyzed within the program review process.
Under this paradigm, I think we have a much better shot at achieving what’s most important: giving every student the best possible education.
|Posted on November 14, 2015 at 8:15 AM||comments (0)|
It’s actually impossible to determine whether any rubric, in isolation, is valid. Its validity depends on how it is used. What may look like a perfectly good rubric to assess critical thinking is invalid, for example, if used to assess assignments that ask only for descriptions. A rubric assessing writing mechanics is invalid for drawing conclusions about students’ critical thinking skills. A rubric assessing research skills is invalid if used to assess essays that students are given only 20 minutes to write.
A rubric is thus valid only if the entire assessment process—including the assignment given to students, the circumstances under which students complete the assignment, the rubric, the scoring procedure, and the use of the findings—is valid. Valid rubric assessment processes have seven characteristics. How well do your rubric assessment processes stack up?
Usability of the results. They yield results that can be and are used to make meaningful, substantive decisions to improve teaching and learning.
Match with intended learning outcomes. They use assignments and rubrics that systematically address meaningful intended learning outcomes.
Clarity. They use assignments and rubrics written in clear and observable terms, so they can be applied and interpreted consistently and equitably.
Fairness. They enable inferences that are meaningful, appropriate, and fair to all relevant subgroups of students.
Consistency. They yield consistent or reliable results, a characteristic that is affected by the clarity of the rubric’s traits and descriptions, the training of those who use it, and the degree of detail provided to students in the assignment.
Appropriate range of outcome levels. The rubrics’ “floors” and “ceilings” are appropriate to the students being assessed.
Generalizability. They enable you to draw overall conclusions about student achievement. The problem here is that any single assignment may not be a representative, generalizable sample of what students have learned. Any one essay question, for example, may elicit an unusually good or poor sample of a student’s writing skill. Increasing the quantity and variety of student work that is assessed, perhaps through portfolios, increases the generalizability of the findings.
Sources for these ideas are cited in my chapter, “Rubric Development,” in the forthcoming second edition of the Handbook on Measurement, Assessment, and Evaluation in Higher Education to be published by Taylor & Francis.
|Posted on March 17, 2014 at 8:10 AM||comments (0)|
Back in December, I suggested that there are just two traits of “good” assessment:
1. Good assessment practices yield results that are used in meaningful ways to improve teaching and learning.
2. Good assessment practices are sustained and pervasive.
With support from the Teagle Foundation, Larry Braskamp and Mark Engberg at Loyola University Chicago have developed “Guidelines for Judging the Effectiveness of Assessing Student Learning” that have five domains:
1. Having a clear purpose and readiness for assessment
2. Involving stakeholders throughout the assessment process
3. What and how to assess is critical
4. Assessment is telling a story
5. Improvement and follow-up are an integral part of the assessment process
I like these domains for a couple of reasons. First, as I suggested in my February 7, 2014, blog, I’d rather focus on effectiveness than quality, so I like Larry and Mark’s framework of effective assessment rather than good assessment. Second, their five domains are a good explication of my two traits, and you may find them more helpful in communicating with your colleagues and stakeholders.
|Posted on December 20, 2013 at 6:40 AM||comments (0)|
There are many statements of principles of good assessment practice. Several years ago I integrated them into one short list of just five traits that I hoped would be easier to understand and share. Since then I've tweaked my list, taking it down to four traits, then most recently up to six. But as I look at them now, I see just two fundamental principles of good assessment practice, with all the other traits falling within these two principles.
1. Good assessment practices yield results that are used in meaningful ways to improve teaching and learning as well as to inform plans and resource allocation decisions. This is the fundamental characteristic of good assessment. If your results are of good enough quality that you can use them, they are good enough. Results are useful if they meet the following traits.
--Results flow from clear, important and relevant learning outcomes. If no one is using assessment results for your stated learning outcomes, perhaps your learning outcomes aren't all that important.
--Results are reasonably accurate and truthful. If you're not looking at enough student work, or your rubric criteria aren't consistently interpreted, for example, your results won't be useful.
--You have justifiable targets or standards for acceptable results. In other words, you've defined what successful results look like. Results can be used for improvement only if you have a good, clear sense of whether or not improvements are warranted and where improvements are most needed.
--Results are easy to find and easy to understand. If people don't have ready access to the results or they can't understand them, they can't use them for improvement.
--Results come from outside your college or program as well as within. External evidence informs your learning outcomes, your standards and your use of results.
2. Good assessment practices are sustained and pervasive. They are not bursts of effort just before an accreditation review, and they are not in just a few pockets here and there. They are part of everyday life. Assessment is sustained and pervasive if it meets the following traits:
--Assessment practices are cost-effective, yielding benefits that are worth the time, effort, and resources put into them. They are kept as simple and practical as possible.
--Assessment practices are adequately supported by the college with professional development, resources, expertise, incentives, and recognition.
--Assessment practices are flexible, evolving over time and varying by discipline and program so the results are of maximum value.
|Posted on November 17, 2013 at 6:55 AM||comments (0)|
Why aren't grades sufficient evidence of student learning?
1. Grades alone do not usually provide meaningful information on exactly what students have and have not learned. So it's hard to use grades alone to decide how to improve teaching and learning.
2. Grading and assessment criteria sometimes differ. Some components of grades reflect classroom management strategies (attendance, timely submission of assignments) rather than achievement of key learning outcomes.
3. Grading standards are sometimes vague or inconsistent. They may weight relatively unimportant (but easier to assess) outcomes more heavily than some major (but harder to assess) outcomes.
4. Grades do not reflect all learning experiences. They provide information on student performance in individual courses and assignments but not student progress in achieving program-wide or institution-wide outcomes.
That said, the grading process can provide excellent evidence of achievement of key learning outcomes, and using information from the grading process in this way can make assessment faster, easier, and more meaningful. NILOA (the National Institute for Learning Outcomes Assessment) has recently published a paper on how Prince George's Community College in Maryland is doing exactly this: http://learningoutcomesassessment.org/OccasionalPapernineteen.html.
You'll see from the NILOA paper that using the grading process to collect assessment evidence works only when faculty are willing to collaborate and agree on at least base grading criteria. I often suggest a two-part rubric: the top half provides the common criteria everyone agrees to, and the bottom half is class-specific criteria that individual faculty want to factor into grades.
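One way to picture the two-part rubric is as a simple split of each student's scores. The sketch below is only an illustration under assumed names: the criteria ("thesis", "evidence", "participation") and the point values are hypothetical, and it does not describe how Prince George's Community College or NILOA implement the idea.

```python
# Hypothetical common criteria that every instructor has agreed to use.
COMMON_CRITERIA = ("thesis", "evidence")


def split_scores(scores, common=COMMON_CRITERIA):
    """Separate one student's rubric scores into the shared (top-half)
    criteria and the class-specific (bottom-half) criteria."""
    common_part = {c: s for c, s in scores.items() if c in common}
    class_part = {c: s for c, s in scores.items() if c not in common}
    return common_part, class_part


# One student's scores; "participation" is a class-specific criterion.
scores = {"thesis": 18, "evidence": 15, "participation": 9}

common_part, class_part = split_scores(scores)
grade_points = sum(scores.values())  # the grade uses both halves
assessment_evidence = common_part    # only the shared half is reported
```

The grade still reflects everything the instructor cares about, while only the agreed-upon criteria feed into program-level assessment.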