|Posted on February 28, 2018 at 10:25 AM||comments (16)|
Two recent op-ed pieces in the Chronicle of Higher Education and the New York Times, and the hundreds of online comments regarding them, make clear that, 25 years into the assessment movement, a lot of faculty really hate assessment.
It’s tempting for assessment people to spring into a defensive posture and dismiss what these people are saying. (They’re misinformed! The world has changed!) But if that’s our response, aren’t we modeling the fractures deeply dividing the US today, with people existing in their own echo chambers and talking past each other rather than really listening and trying to find common ground on which to build? And shouldn’t we be practicing what we preach, using systematic evidence to inform what we say and do?
So I took a deeper dive into those comments. I did a content analysis of the articles and many of the comments that followed. (The New York Times article had over 500 comments—too many for me to handle—so I looked only at NYT comments with at least 12 recommendations.)
If you’re not familiar with content analysis, it’s looking through text to identify the frequency of ideas or themes. For example, I counted how many comments mentioned that assessment is expensive. I do content analysis by listing all the comments as bullets in a Word document, then cutting and pasting the bulleted comments to group similar comments together under headings. I then cut and paste the groups so the most frequently mentioned themes are at the top of the document. There is qualitative analysis software that can help if you don’t want to do this manually.
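For anyone who would rather tally themes in a script than in a Word document, here is a minimal sketch of the counting step. It assumes each comment has already been read and assigned one theme by hand. The `coded_comments` list and the `tally_themes` function are hypothetical names made up for this illustration, and the five sample comments are a tiny subset chosen only to show the mechanics, not the real data.

```python
from collections import Counter

# Hypothetical input: (comment text, theme assigned by a human reader).
# The judgment call of choosing a theme still happens by hand.
coded_comments = [
    ("We faculty are angry over the time and dollars wasted.",
     "waste of time and resources"),
    ("Learning outcomes have disempowered faculty.",
     "not valued or respected"),
    ("Students, not faculty, are responsible for student learning.",
     "unfairly held responsible"),
    ("There's a whole industry out there that's invested in outcomes assessment.",
     "external and economic forces"),
    ("The assessment craze is not only of little value, but it saps the meager "
     "resources of time and money available for classroom instruction.",
     "waste of time and resources"),
]


def tally_themes(coded):
    """Return (theme, count, percent of all comments), most frequent first."""
    counts = Counter(theme for _, theme in coded)
    total = len(coded)
    return [(theme, n, round(100 * n / total))
            for theme, n in counts.most_common()]


for theme, n, pct in tally_themes(coded_comments):
    print(f"{theme}: {n} comments ({pct}%)")
```

The script only does the sorting and counting that I do by cutting and pasting. Deciding which theme a comment belongs to remains a human judgment.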
A caveat: Comments don’t always fall into neat, discrete categories; judgement is needed to decide where to place some. I did this analysis quickly, and it’s entirely possible that, if you’d done this instead of me, you might have come up with somewhat different results. But assessment is not rigorous research; we just need information good enough to help inform our thinking, and I think my analysis is fine for the purpose of figuring out how we might deal with this.
Why take the time to do a content analysis instead of just reading through the comments? Because, when we process a list of comments, there’s a good chance we won’t identify the most frequently mentioned ideas accurately. As I was doing my content analysis, I was struck by how many faculty complained that assessment is (I’m being snarky here) either a vast right-wing conspiracy or a vast left-wing conspiracy, simply because I’d never heard that before. It turned out, however, that there were other themes that emerged far more frequently. This is a good lesson for faculty who think they don’t need to formally assess because they “know” what their students are struggling with. Maybe they do…but maybe not.
So what did I find? As I’d expected, there are many reasons why faculty may hate assessment. I found that most of their complaints fall into just four broad categories:
It’s a waste of an enormous amount of time and resources that could be better spent on teaching. Almost 40% of the comments fell into this category. Some examples:
- We faculty are angry over the time and dollars wasted.
- The assessment craze is not only of little value, but it saps the meager resources of time and money available for classroom instruction.
- Faced with outrage over the high cost of higher education, universities responded by encouraging expensive administrative bloat.
- It is not that the faculty are not trying, but the data and methods in general use are very poor at measuring learning.
- Our “assessment expert” told us to just put down as a goal the % of students we wanted to rate us as very good or good on a self-report survey. Which we all know is junk.
I and what I think is important are not valued or respected. Over 30% of the comments fell into this category. Some examples:
- Assessment of student learning outcomes is an add-on activity that says your standard examination and grading scheme isn’t enough so you need to do a second layer of grading in a particular numerical format.
- The fundamental, flawed premise of most of modern education is that teaching is a science.
- Bureaucratic jargon subtly shapes the expectations of students and teachers alike.
- When the effort to reduce learning to a list of job-ready skills goes too far, it misses the point of a university education.
- Learning outcomes have disempowered faculty.
- The only learning outcomes I value: students complete their formal education with a desire to learn more.
- Assessment reflects a misguided belief that learning is quantifiable.
External and economic forces are behind this. About 15% of comments fell into this category, including those right-wing/left-wing conspiracy comments. Some examples:
- There’s a whole industry out there that’s invested in outcomes assessment.
- The assessment boom coincided with the decision of state legislatures to reduce spending on public universities.
- Educational institutions have been forced to operate out of a business model.
- It is the rise of adjuncts and online classes that has led to the assessment push.
I’m unfairly held responsible for student learning. About 10% of comments fell into this category. Some examples:
- Students, not faculty, are responsible for student learning.
- It is much more profitable to skim money from institutions of higher learning than fixing the underlying causes of the poverty and lack of focus that harm students.
- The root cause is lack of a solid foundation built in K-12.
Two things struck me about these four broad categories. The first one was that they don’t quite align with what I’ve heard as I’ve worked with literally thousands of faculty at hundreds of colleges over the last two decades. Yes, I’ve heard plenty about assessment being useless, and I’ve written about faculty feeling devalued and disrespected by assessment, but I’d never heard the external-forces or blame-game reasons before. And I’ve heard plenty about other reasons that weren’t mentioned in these comments, especially finding time to work on assessment, not understanding how to assess (or how to teach), and moving from a culture of silos to one of collaboration. I think the reason for the disconnect between what I’ve heard and what was expressed here is that these comments reflect the angriest faculty, not all faculty. But their anger is legitimate and something we should all work to address.
[UPDATED 2/28/2018 4:36 PM EST] So what should we do? First, we clearly need better information on faculty experiences and views regarding assessment so we can understand which issues are most pervasive and address them. The Surveys of Assessment Culture developed by Matt Fuller at Sam Houston State University is an important start.
In the meantime, the good news is that the comments in and accompanying these two pieces all represent solvable problems. (No, we can’t solve all of society’s ills, but we can help faculty deal with them.) I’ll share some ideas in upcoming blog posts. If you don’t want to wait, you’ll find plenty of practical suggestions in the new 3rd edition of my book Assessing Student Learning: A Common Sense Guide.
|Posted on February 22, 2018 at 7:00 PM||comments (0)|
I was intrigued by an article in the September 23, 2016, issue of Inside Higher Ed titled “When a C Isn’t Good Enough.” The University of Arizona found that students who earned an A or B in their first-year writing classes had a 67% chance of graduating, but those earning a C had only a 48% chance. The university is now exploring a variety of ways to improve the success of students earning a C, including requiring C students to take a writing competency test, providing resources to C students, and/or requiring C students to repeat the course.
I know nothing about the University of Arizona beyond what’s in the article. But if I were working with the folks there, I’d offer the following ideas to them, if they haven’t considered them already.
1. I’d like to see more information on why the C students earned a C. Which writing skills did they struggle most with: basic grammar, sentence structure, organization, supporting arguments with evidence, etc.? Or was there another problem? For example, maybe C students were more likely to hand in assignments late (or not at all).
2. I’d also like to see more research on why those C students were less likely to graduate. How did their GPAs compare to A and B students? If their grades were worse, what kinds of courses seemed to be the biggest challenge for them? Within those courses, what kinds of assignments were hardest for them? Why did they earn a poor grade on them? What writing skills did they struggle most with: basic grammar, organization, supporting arguments with evidence, etc.? Or, again, maybe there was another problem, such as poor self-discipline in getting work handed in on time.
And if their GPAs were not that different from those of A and B students (or even if they were), what else was going on that might have led them to leave? The problem might not be their writing skills per se. Perhaps, for example, students with work or family obligations found it harder to devote the study time necessary to get good grades. Providing support for that issue might help more than helping them with their writing skills.
3. I’d also like to see the faculty responsible for first-year writing articulate a clear, appropriate, and appropriately rigorous standard for earning a C. In other words, they could use the above information on the kinds and levels of writing skills that students need to succeed in subsequent courses to articulate the minimum performance levels required to earn a C. When I taught first-year writing at a public university in Maryland, the state system had just such a statement, the “Maryland C Standard.”
4. I’d like to see the faculty adopt a policy that, in order to pass first-year writing, students must meet the minimum standard of every writing criterion. Thus, if student work is graded using a rubric, the grade isn’t determined by averaging the scores on various rubric criteria—that lets a student with A arguments but F grammar earn a C with failing grammar. Instead, students must earn at least a C on every rubric criterion in order to pass the assignment. Then the As, Bs, and Cs can be averaged into an overall grade for the assignment.
(If this sounds vaguely familiar to you, what I’m suggesting is the essence of competency-based education: students need to demonstrate competence on all learning goals and objectives in order to pass a course or graduate. Failure to achieve one goal or objective can’t be offset by strong performance on another.)
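To make the difference between the two grading policies concrete, here is a minimal sketch. It assumes a hypothetical 4-point scale (A=4 down to F=0) and hypothetical criterion names. The functions `average_only` and `grade_with_minimum` are illustrations of the idea, not any institution's actual policy.

```python
# Hypothetical grade-point scale, assumed only for this illustration.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}
MINIMUM = "C"  # every criterion must reach at least this level to pass


def average_only(criterion_grades):
    """Plain averaging: strength on one criterion can offset failure on another."""
    points = [GRADE_POINTS[g] for g in criterion_grades.values()]
    return sum(points) / len(points)


def grade_with_minimum(criterion_grades):
    """Return None (not passing) if any criterion falls below the minimum;
    otherwise average the criterion scores into an overall grade."""
    points = [GRADE_POINTS[g] for g in criterion_grades.values()]
    if min(points) < GRADE_POINTS[MINIMUM]:
        return None
    return sum(points) / len(points)


strong_ideas_weak_grammar = {"arguments": "A", "grammar": "F"}
meets_every_standard = {"arguments": "A", "organization": "B", "grammar": "C"}

print(average_only(strong_ideas_weak_grammar))        # 2.0, a C overall
print(grade_with_minimum(strong_ideas_weak_grammar))  # None, does not pass
print(grade_with_minimum(meets_every_standard))       # 3.0, a B overall
```

Under plain averaging, the paper with A arguments and F grammar comes out as a C. Under the minimum-standard policy it does not pass until the grammar reaches the C level.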
5. If they haven’t done so already, I’d also like to see the faculty responsible for first-year writing adopt a common rubric, articulating the criteria they’ve identified, that would be used to assess and grade the final assignment in every section, no matter who teaches it. This would make it easy to study student performance across all sections of the course and identify pervasive strengths and weaknesses in their writing. If some faculty members or TAs have additional grading criteria, they could simply add those to the common rubric. For example, I graded my students on their use of citation conventions, even though that was not part of the Maryland C Standard. I added that to the bottom of my rubric.
6. Because work habits are essential to success in college, I’d also suggest making this a separate learning outcome for first-year writing courses. This means grading students separately on whether they turn in work on time, put in sufficient effort, etc. This would help everyone understand why some students fail to graduate—is it because of poor writing skills, poor work habits, or both?
These ideas all move responsibility for addressing the problem from administrators to the faculty. That responsibility can’t be fulfilled unless the faculty commit to collaborating on identifying and implementing a shared strategy so that every student, no matter which section of writing they enroll in, passes the course with the skills needed for subsequent success.
|Posted on February 13, 2018 at 9:10 AM||comments (0)|
Today marks the release of the third edition of my book Assessing Student Learning: A Common Sense Guide. I approached Jossey-Bass about doing a third edition in response to requests from some faculty who used it as a textbook but were required to use more recent editions. The second edition had been very successful, so I figured I’d update the references and a few chapters and be done. But as I started work on this edition, I was immediately struck by how outdated the second edition had become in just a few short years. The third edition is a complete reorganization and rewrite of the previous edition.
How has the world of higher ed assessment changed?
We are moving from Assessment 1.0 to Assessment 2.0: from getting assessment done (and in many cases not doing it very well) to getting assessment used. Many faculty and administrators still struggle to grasp that assessment is all about improving how we help students learn, not an end in itself, and that assessments should be planned with likely uses in mind. The last edition talked about using results, of course, but the new edition adds a chapter on using assessment results to the beginning of the book. And throughout the book I talk not about “assessment results” but “evidence of student learning,” which is what this is really all about.
We have a lot of new resources. Many new assessment resources have emerged since the second edition was published, including the VALUE rubrics published by AAC&U, the many white papers published by NILOA, and the Degree Qualifications Profile sponsored by Lumina. Learning management systems and assessment information management systems are far more prevalent and sophisticated. This edition talks about these and other valuable new resources.
We are recognizing that different settings require different approaches to assessment. The more assessment we’ve done, the more we’ve come to realize that assessment practices vary depending on whether we’re assessing learning in courses, programs, general education curricula, or co-curricular experiences. The last edition didn’t draw many distinctions among assessment in these settings. This edition features a new chapter on the many settings of assessment, and several chapters discuss applying concepts to specific settings.
We’re realizing that curriculum design is a big piece of the assessment puzzle. We’ve found that, when faculty and staff struggle with assessment, it’s often because the learning outcomes they’ve identified aren’t addressed sufficiently—or at all—in the curriculum. So this book has a brand new chapter on curriculum design, and the old chapter on prompts has been expanded into one on creating meaningful assignments.
We have a much better understanding of rubrics. Rubrics are now so widespread that we have a much better idea of how to design and use them. A couple of years ago I did a literature review of rubric development that turned on a lot of lightbulbs for me, and this edition reflects my fresh thinking.
We’re recognizing that in some situations student learning is especially hard to assess. This edition has a new chapter on assessing the hard-to-assess, such as performances and learning that can’t be graded.
We’re increasingly appreciating the importance of setting appropriate standards and targets in order to interpret and use results appropriately. The chapter on this is completely rewritten, with a new section on setting standards for multiple choice tests.
We’re fighting the constant pull to make assessment too complicated. The pull of some accreditors’ overly complex requirements, some highly structured assessment information management systems, and some assessment practitioners with psychometric training to make things much more complicated than they need to be is strong. That this new edition is well over 400 pages says a lot! This book has a whole chapter on keeping assessment cost-effective, especially in terms of time.
We’re starting to recognize that, if assessment is to have real impact, results need to be synthesized into an overall picture of student learning. This edition stresses the need to sit back after looking through reams of assessment reports and ask, from a qualitative rather than quantitative perspective, what are we doing well? In what ways is student learning most disappointing?
Pushback to assessment is moving from resistance to foot-dragging. The voices saying assessment can’t be done are growing quieter because we now have decades of experience doing assessment. But while more people are doing assessment, in too many cases they’re doing it only to comply with an accreditation mandate. Helping people move from getting assessment done to using it in meaningful ways remains a challenge. So the two chapters on culture in the second edition are now six.
Data visualization and learning analytics are changing how we share assessment results. These things are so new that this edition only touches on them. I think that they will be the biggest drivers in changes to assessment over the coming decade.
|Posted on January 28, 2018 at 7:25 AM||comments (0)|
A couple of years ago I did a literature review on rubrics and learned that there’s no consensus on what a rubric is. Some experts define rubrics very narrowly, as only analytic rubrics: the kind formatted as a grid, listing traits down the left side and performance levels across the top, with the boxes filled in. But others define rubrics more broadly, as written guides for evaluating student work that, at a minimum, list the traits you’re looking for.
But what about something like the following, which I’ve seen on plenty of assignments?
70% Responds fully to the assignment (length of paper, double-spaced, typed, covers all appropriate developmental stages)
15% Grammar (including spelling, verb conjugation, structure, agreement, voice consistency, etc.)
15% Organization
Under the broad definition of a rubric, yes, this is a rubric. It is a written guide for evaluating student work, and it lists the three traits the faculty member is looking for.
The problem is that it isn’t a good rubric. Effective assessments, including rubrics, have the following traits:
Effective assessments yield information that is useful and used. Students who earn less than 70 points for responding to the assignment have no idea where they fell short. Those who earn less than 15 points on organization have no idea why. If the professor wants to help the next class do better on organization, there’s no insight here on where this class’s organization fell short and what most needs to be improved.
Effective assessments focus on important learning goals. You wouldn’t know it from the grading criteria, but this was supposed to be an assignment on critical thinking. Students focus their time and mental energies on what they’ll be graded on, so these students will focus on following directions for the assignment, not developing their critical thinking skills. Yes, following directions is an important skill, but critical thinking is even more important.
Effective assessments are clear. Students have no idea what this professor considers an excellently organized paper, what’s considered an adequately organized paper, and what’s considered a poorly organized paper.
Effective assessments are fair. Here, because there are only three broad, ill-defined traits, the faculty member can be (unintentionally) inconsistent in grading the papers. How many points are taken off for an otherwise fine paper that’s littered with typos? For one that isn’t double-spaced?
So the debate about an assessment should be not whether it is a rubric but rather how well it meets these four traits of effective assessment practices.
If you’d like to read more about rubrics and effective assessment practices, the third edition of my book Assessing Student Learning: A Common Sense Guide will be released on February 13 and can be pre-ordered now. The Kindle version is already available through Amazon.
|Posted on January 9, 2018 at 7:25 AM||comments (3)|
Just before the holidays, the Council of Graduate Schools released Articulating Learning Outcomes in Higher Education. The title is a bit of a misnomer; the paper focuses not on how to articulate learning outcomes but on why it’s a good idea to articulate learning outcomes and why it might be a good idea to have a learning outcome framework such as the Degree Qualifications Profile to articulate shared learning outcomes across doctoral programs.
What I found most useful about the paper was the strong case it makes for the value of articulating learning outcomes. It offers some reasons I hadn’t thought of before, and they apply to student learning at all higher education levels, not just doctoral education. If you work with someone who doesn't see the value of articulating learning outcomes, maybe this list will help.
Clearly defined learning outcomes can:
• Help students navigate important milestones by making implicit program expectations explicit, especially to first-generation students who may not know the “rules of the game.”
• Help prospective students weigh the costs and benefits of their educational investments.
• Help faculty prepare students more purposefully for a variety of career paths (at the doctoral level, for teaching as well as research careers).
• Help faculty ensure that students graduate with the knowledge and skills they need for an increasingly broad range of career options, which at the doctoral level may include government, non-profits, and startups as well as higher education and industry.
• Help faculty make program requirements and milestones more student-centered and intentional.
• Help faculty, programs, and institutions define the value of a degree or other credential and improve public understanding of that value.
• Put faculty, programs, and institutions in the driver’s seat, defining the characteristics of a successful graduate rather than having a definition imposed by another entity such as an accreditor or state agency.
|Posted on December 22, 2017 at 7:15 AM||comments (0)|
Virtually all U.S. accreditors (and some state agencies) require the assessment of student learning, but the specifics--what, when, how--can vary significantly. How can programs with multiple accreditations (say, regional and specialized) serve two or more accreditation masters without killing themselves in the process?
I recently posted my thoughts on this on the ASSESS listserv, and a colleague asked me to make my contribution into a blog post as well.
Bottom line: I advocate a flexible approach.
Start by thinking about why your institution's assessment coordinator or committee asks these programs for reports on student learning assessment. This leads to the question of why they're asking everyone to assess student learning outcomes.
The answer is that we all want to make sure our students are learning what we think is most important, and if they're not, we want to take steps to try to improve that learning. Any reporting structure should be designed to help faculty and staff achieve those two purposes--without being unnecessarily burdensome to anyone involved. In other words, reports should be designed primarily to help decision-makers at your college.
At this writing, I'm not aware of any regional accreditor that mandates that every program's assessment efforts and results must be reported on a common institution-wide template. When I was an assessment coordinator, I encouraged flexibility in report formats (and deadlines, for that matter). Yes, it was more work for me and the assessment committee to review apples-and-oranges reports but less work and more meaningful for faculty--and I've always felt they're more important than me.
So with this as a framework, I would suggest sitting down with each program with specialized accreditation and working out what's most useful for them.
- Some programs are doing for their specialized accreditor exactly what your institution and your regional accreditor want. If so, I'm fine with asking for a cut-and-paste of whatever they prepare for their accreditor.
- Some programs are doing for their specialized accreditor exactly what your institution and your regional accreditor want, but only every few years, when the specialized review takes place. In these cases, if the last review was a few years ago, I think it's appropriate to ask for an interim update.
- Some programs assess certain learning goals for their specialized accreditor but not others that either the program or your institution views as important. For example, some health/medical accreditors want assessments of technical skills but not "soft" skills such as teamwork and patient interactions. In these cases, you can ask for a cut-and-paste of the assessments done for the specialized accreditor but then an addendum of the additional learning goals.
- At least a few specialized accreditors expect student learning outcomes to be assessed but not that the results be used to improve learning. In these cases, you can ask for a cut-and-paste of the assessments done but then an addendum on how the results are being used.
- Some specialized accreditors, frankly, aren't particularly rigorous in their expectations for student learning assessment. I've seen some, for example, that seem happy with surveys of student satisfaction or student self-ratings of their skills. Programs with these specialized accreditations need to do more if their assessment is to be meaningful and useful.
Again, this flexible approach meant more work for me, but I always felt faculty time was more precious than mine, so I always worked to make their jobs as easy as possible and their work as useful and meaningful as possible.
|Posted on December 8, 2017 at 7:00 AM||comments (1)|
Someone on the ASSESS listserv recently asked for recommendations for a good basic book for those getting started with assessment. Here are eight books I recommend for every assessment practitioner's bookshelf (in addition, of course, to my own Assessing Student Learning: A Common Sense Guide, whose third edition is coming out on February 4, 2018).
Assessment Essentials: Planning, Implementing, and Improving Assessment in Higher Education by Trudy Banta and Catherine Palomba (2014): This is a soup-to-nuts primer on student learning assessment in higher education. The authors especially emphasize organizing and implementing assessment.
Learning Assessment Techniques: A Handbook for College Faculty by Elizabeth Barkley and Claire Major (2016): This successor to the classic Classroom Assessment Techniques (Angelo & Cross, 1993) expands and reconceptualizes CATs into a fresh set of Learning Assessment Techniques (LATs)—simple tools for learning and assessment—that faculty will find invaluable.
How to Create and Use Rubrics for Formative Assessment and Grading by Susan Brookhart (2013): This book completely changed my thinking about rubrics. Susan Brookhart has a fairly narrow vision of how rubrics should be developed and used, but she offers persuasive arguments for doing things her way. I’m convinced that her approach will lead to sounder, more useful rubrics.
Creating Significant Learning Experiences: An Integrated Approach to Designing College Courses by L. Dee Fink (2013): Dee Fink is an advocate of backwards curriculum design: identifying course learning goals, identifying how students will demonstrate achievement of those goals by the end of the course, then designing learning activities that prepare students to demonstrate achievement successfully. His book presents an important context for assessment: its role in the teaching process.
Using Evidence of Student Learning to Improve Higher Education by George Kuh, Stan Ikenberry, Natasha Jankowski, Timothy Cain, Peter Ewell, Pat Hutchings, and Jillian Kinzie (2015): The major theme of this book is that, if assessment is going to work, it has to be for you, your colleagues, and your students, not your accreditor. This book is a powerful argument for moving from a compliance approach to one that makes assessment meaningful and consequential. If you feel your college is simply going through assessment motions, this book will give you plenty of practical ideas to make it more useful.
Five Dimensions of Quality: A Common Sense Guide to Accreditation and Accountability by Linda Suskie (2014): I wrote this book after working for one of the U.S. regional accreditors for seven years and consulting for colleges in all the other U.S. accreditation regions. In that work, I found myself repeatedly espousing the same basic principles, including principles for obtaining and using meaningful, useful assessment evidence. Those principles are the foundation of this book.
Assessment Clear and Simple: A Practical Guide for Institutions, Departments, and General Education by Barbara Walvoord (2010): The strength of this book is its size: this slim volume is a great introduction for anyone feeling overwhelmed by all he or she needs to learn about assessment.
Effective Grading by Barbara Walvoord and Virginia Anderson (2010): This is my second favorite assessment book after my own! With its simple language and its focus on the grading process, it’s a great way to help faculty develop or improve assessments in their courses. It introduces them to many important assessment ideas that apply to program and general education assessments as well.
|Posted on November 21, 2017 at 8:25 AM||comments (1)|
From time to time people contact me for advice, not on assessment or accreditation but on how to build a consulting business. In case you’re thinking the same thing, I’m sorry to tell you that I really can’t offer much advice.
My consulting work is the culmination of 40 years of work in higher education. So if you want to spend the next 40 years preparing to get into consulting work, I can tell you my story, but if you want to build a business more quickly, I can’t help.
I began my career in institutional research, then transitioned into strategic planning and quality improvement. These can be lonely jobs, so I joined relevant professional organizations. Some of the institutions where I worked would pay for travel to conferences only if I was presenting, so I presented as often as I could. And I became actively involved in the professional organizations I joined—I was treasurer of one and organized a regional conference for another, for example. All these things helped me network and make connections with people in higher education all over the United States.
All institutional researchers deal with surveys, and early in my career I found people asking me for advice on surveys they were developing. Writing a good survey isn’t all that different from writing a good test, which I’d learned how to do in grad school. (My master’s is in educational measurement and statistics from the University of Iowa.) After finding myself giving the same advice over and over, I wrote a little booklet, which gradually evolved into a monograph on questionnaire surveys published by the Association for Institutional Research. I started doing workshops around the country on questionnaire design.
I love to teach, so concurrently throughout my career I’ve taught as an adjunct at least once a year—all kinds of courses, from developmental mathematics to graduate courses. That’s made a huge difference in my consulting work, because it’s given me credibility with both the teaching and administrative sides of the house.
Then I had a life-changing experience: a one-year appointment in 1999-2000 as director of the Assessment Forum at the old American Association for Higher Education. People often asked me for recommendations for a good soup-to-nuts primer on assessment. At that time, there wasn’t one (there were good books on assessment, but with narrower focuses). So I wrote one, applying what I learned in my graduate studies to the higher education environment, and was lucky enough to get it published. The book, along with conference sessions, continued networking, and simply having that one-year position at AAHE, built my reputation as an assessment expert.
When I went into full-time consulting about six years ago, I did read up a little on how to build a consulting business. I built a website so people could find me, and I built a social media presence and a blog on my website to drive people to the website. But I don’t really do any other marketing. My clients tell me that they contact me because of my longstanding reputation, my book, and my conference sessions.
So if you want to be a consultant, here's my advice. Take 40 years to build your reputation. Start with a graduate degree from a really good, relevant program. Be professionally active. Teach. Get published. Present at conferences. And get lucky enough to land a job that puts you on the national stage. Yes, there are plenty of people who build a successful consulting business more quickly, but I’m not one of them, and I can’t offer you advice on how to do it.
|Posted on November 8, 2017 at 10:05 AM||comments (6)|
I was struck by Nicholas Kristof’s November 6 New York Times article, How to Reduce Shootings. No, I’m not talking here about the politics of the issue, and I’m not writing this blog post to advocate any stance on the issue. What struck me—and what’s relevant to assessment—is how effectively Kristof and his colleagues brought together and compellingly presented a variety of data.
Here are some of the lessons from Kristof’s article that we can apply to assessment reports.
Focus on using the results rather than sharing the results, starting with the report title. Kristof could have titled his piece something like, “What We Know About Gun Violence,” just as many assessment reports are titled something like, “What We’ve Learned About Student Achievement of Learning Outcomes.” But Kristof wants this information used, not just shared, and so do (or should) we. Focus both the title and content of your assessment report on moving from talk to practical, concrete responses to your assessment results.
Focus on what you’ve learned from your assessments rather than the assessments themselves. Every subheading in Kristof’s article states a conclusion drawn from his evidence. There’s no “Summary of Results” heading like what we see in so many assessment reports. Include in your report subheadings that will entice everyone to keep reading.
Go heavy on visuals, light on text. My estimate is that about half the article is visuals, half text. This makes the report a fast read, with points literally jumping out at us.
Go for graphs and other visuals rather than tables of data. Every single set of data in Kristof’s report is accompanied by graphs or other visuals that immediately let us see his point.
Order results from highest to lowest. There’s no law that says you must present the results for rubric criteria or a survey rating scale in their original order. Ordering results from highest to lowest—especially when accompanied by a bar graph—lets the big point literally pop out at the reader.
Use color to help drive home key points. Look at the section titled “Fewer Guns = Fewer Deaths” and see how adding just one color drives home the point of the graphics. I encourage what I call traffic light color-coding, with green for good news and red for results that, um, need attention.
Pull together disparate data on student learning. Kristof and his colleagues pulled together data from a wide variety of sources. The visual of public opinions on guns, toward the end of the article, brings together results from a variety of polls into one visual. Yes, the polls may not be strictly comparable, but Kristof acknowledges their sources. And the idea (that should be) behind assessment is not to make perfect decisions based on perfect data but to make somewhat better decisions based on somewhat better information than we would make without assessment evidence. So if, say, you’re assessing information literacy skills, pull together not only rubric results but relevant questions from surveys like NSSE, students’ written reflections, and maybe even relevant questions from student evaluations of teaching (anonymous and aggregated across faculty, obviously).
Breakouts can add insight, if used judiciously. I’m firmly opposed to inappropriate comparisons across student cohorts (of course humanities students will have weaker math skills than STEM students). But the state-by-state comparisons that Kristof provides help make the case for concrete steps that might be taken. Appropriate, relevant, meaningful comparisons can similarly help us understand assessment results and figure out what to do.
Get students involved. I don’t have the expertise to easily generate many of the visuals in Kristof’s article, but many of today’s students do, or they’re learning how in a graphic design course. Creating these kinds of visuals would make a great class project. But why stop student involvement there? Just as Kristof intends his article to be discussed and used by just about anyone, write your assessment report so it can be used to engage students as well as faculty and staff in the conversation about what’s going on with student learning and what action steps might be appropriate and feasible.
Distinguish between annual updates and periodic mega-reviews. Few of us have the resources to generate a report of Kristof’s scale annually—and in many cases our assessment results don’t call for this, especially when the results indicate that students are generally learning what we want them to. But this kind of report would be very helpful when results are, um, disappointing, or when a program is undergoing periodic program review, or when an accreditation review is coming up. Flexibility is the key here. Rather than mandate a particular report format from everyone, match the scope of the report to the scope of issues uncovered by assessment evidence.
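Two of the lessons above, ordering results from highest to lowest and traffic light color-coding, can be sketched in a few lines of Python. This is only an illustration: the criteria, the percentages, and the 70% target are made up, and the function name is hypothetical.

```python
# Hypothetical data: share of students meeting the standard on each rubric criterion.
results = {
    "Organization": 0.82,
    "Thesis": 0.91,
    "Use of evidence": 0.58,
    "Citations": 0.67,
}
TARGET = 0.70  # assumed target, for illustration only


def ordered_results(results, target):
    """Sort criteria from highest to lowest and flag each as green or red."""
    ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
    return [(name, share, "green" if share >= target else "red") for name, share in ranked]


# Print a rough text bar chart, strongest result first.
for name, share, color in ordered_results(results, TARGET):
    bar = "#" * round(share * 20)
    print(f"{name:<16} {bar:<20} {share:.0%} ({color})")
```

A spreadsheet sort and a bar chart would do the same job; the point is simply that the strongest and weakest results end up at the top and bottom, where readers can't miss them.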
|Posted on October 29, 2017 at 9:50 AM||comments (2)|
Assessment results are often used to make tweaks to individual courses and sometimes individual programs. It can be harder to figure out how to use assessment results to make broad, meaningful change across a college or university. But here’s one way to do so: Use assessment results to drive faculty professional development programming.
Here’s how it might work.
An assessment committee or some other appropriate group reviews annual assessment reports from academic programs and gen ed requirements. As they do, they notice some repeated concerns about shortcomings in student learning. Perhaps several programs note that their students struggle to analyze data. Perhaps several others note that quite a few students aren’t citing sources properly. Perhaps several others are dissatisfied with their students’ writing skills.
Note that the committee doesn’t need reports to be in a common format or share a common assessment tool in order to make these observations. This is a qualitative, not quantitative, analysis of the assessment reports. The committee can make a simple list of the single biggest concern with student learning mentioned in each report, then review the list and see what kinds of concerns are mentioned most often.
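If the committee prefers to keep its list electronically, the tally might look something like this minimal Python sketch. The program names and concerns are invented for illustration, and the function name is hypothetical; a sheet of paper with tick marks works just as well.

```python
from collections import Counter

# Hypothetical data: the single biggest concern noted in each program's report.
biggest_concerns = {
    "Biology": "analyzing data",
    "History": "citing sources",
    "Psychology": "analyzing data",
    "English": "writing skills",
    "Sociology": "citing sources",
    "Economics": "analyzing data",
}


def tally_concerns(concerns_by_program):
    """Return (concern, count) pairs, most frequently mentioned first."""
    return Counter(concerns_by_program.values()).most_common()


for concern, count in tally_concerns(biggest_concerns):
    print(f"{concern}: {count} report(s)")
```

Someone still has to read each report and decide what its biggest concern is; the code only does the counting.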
The assessment committee then shares what they’ve noticed with whoever plans faculty professional development programming—what’s often called a teaching-learning center. The center can then plan workshops, brown-bag lunch discussions, learning communities, or other professional development opportunities to help faculty improve student achievement of these learning goals.
There needn’t be much if any expense in offering such opportunities. Assessment results are used to decide how professional development resources are used, not necessarily increase professional development resources.
|Posted on October 7, 2017 at 8:20 AM||comments (4)|
One of the many things I’ve learned by watching Ken Burns’ series on Vietnam is that Defense Secretary Robert McNamara was a data geek. A former Ford Motor Company executive, he routinely asked for all kinds of data. Sounds great, but there were two (literally) fatal flaws with his approach to assessment.
First, McNamara asked for data on virtually anything measurable, compelling staff to spend countless hours filling binders with all kinds of metrics, far too much data for anyone to absorb. And I wonder what his staff could have accomplished had they not been forced to spend so much time on data collection.
And McNamara asked for the wrong data. He wanted to track progress in winning the war, but he focused on the wrong measures: body counts, weapons captured. He apparently didn’t have a clear sense of exactly what it would mean to win this war and measure progress toward that end. I’m not a military scientist, but I’d bet that more important measures would have included the attitudes of Vietnam’s citizens and the capacity of the South Vietnamese government to deal with insurgents on its own.
There are three important lessons here for us. First, worthwhile assessment requires a clear goal. I often compare teaching to taking our students on a journey. Our learning goal is where we want them to be at the end of the learning experience (be it a course, program, degree, or co-curricular experience).
Second, worthwhile assessment measures track progress toward that destination. Are our students making adequate progress along their journey? Are they reaching the destination on time?
Third, assessment should be limited: just enough information to help us decide if students are reaching the destination on time and, if not, what we might do to help them on their journey. Assessment should never take so much time that it detracts from the far more important work of helping students learn.
|Posted on August 26, 2017 at 8:20 AM||comments (11)|
Chris Coleman recently asked the Accreditation in Southern Higher Education listserv about schedules for assessing program learning outcomes. Should programs assess one or two learning outcomes each year, for example? Or should they assess everything once every three or four years? Here are my thoughts from my forthcoming third edition of Assessing Student Learning: A Common Sense Guide.
If a program isn’t already assessing its key program learning outcomes, it needs to assess them all, right away, in this academic year. All the regional accreditors have been expecting assessment for close to 20 years. By now they expect implemented processes with results, and with those results discussed and used. A schedule to start collecting data over the next few years—in essence, a plan to come into compliance—doesn’t demonstrate compliance.
Use assessments that yield information on several program learning outcomes. Capstone requirements (senior papers or projects, internships, etc.) are not only a great place to collect evidence of learning, but they’re also great learning experiences, letting students integrate and synthesize their learning.
Do some assessment every year. Assessment is part of the teaching-learning process, not an add-on chore to be done once every few years. Use course-embedded assessments rather than special add-on assessments; this way, faculty are already collecting assessment evidence every time the course is taught.
Keep in mind that the burden of assessment is not assessment per se but aggregating, analyzing, and reporting it. Again, if faculty are using course-embedded assessments, they’re already collecting evidence. Be sensitive to the extra work of aggregating, analyzing, and reporting. Do all you can to keep the burden of this extra work to a bare-bones minimum and make everyone’s jobs as easy as possible.
Plan to assess all key learning outcomes within two years—three at most. You wouldn’t use a bank statement from four years ago to decide if you have enough money to buy a car today! Faculty similarly shouldn’t be using evidence of student learning from four years ago to decide if student learning today is adequate. Assessments conducted just once every several years also take more time in the long run, as chances are good that faculty won’t find or remember what they did several years earlier, and they’ll need to start from scratch. This means far more time is spent planning and designing a new assessment—in essence, reinventing the wheel. Imagine trying to balance your checking account once a year rather than every month—or your students cramming for a final rather than studying over an entire term—and you can see how difficult and frustrating infrequent assessments can be, compared to those conducted routinely.
Keep timelines and schedules flexible rather than rigid, adapted to meet evolving needs. Suppose you assess students’ writing skills and they are poor. Do you really want to wait two or three years to assess them again? Disappointing outcomes call for frequent reassessment to see if planned changes are having their desired effects. Assessments that have yielded satisfactory evidence of student learning are fine to move to a back-burner, however. Put those reassessments on a staggered schedule, conducting them only once every two or three years just to make sure student learning isn’t slipping. This frees up time to focus on more pressing matters.
|Posted on August 20, 2017 at 6:35 AM||comments (1)|
Scott Jaschik at Inside Higher Ed just wrote an article tying together two studies showing that many higher ed stakeholders don’t understand, and therefore misinterpret, the term liberal arts.
And who can blame them? It’s an obscure term that I’d bet many in higher ed don’t understand either. When I researched my 2014 book Five Dimensions of Quality: A Common Sense Guide to Accreditation and Accountability, I learned that the term liberal comes from liber, the Latin word for free. In the Middle Ages in Europe, a liberal arts education was for the free individual, as opposed to an individual obliged to enter a particular trade or profession. That paradigm simply isn’t relevant today.
Today the liberal arts are those studies that address knowledge, skills, and competencies that cross disciplines, yielding a broadly-educated, well-rounded individual. Many people use the term liberal arts and sciences or simply arts and sciences to try to make clear that the liberal arts comprise study of the sciences as well as the arts and humanities. The Association of American Colleges & Universities (AAC&U), a leading advocate of liberal arts education, refers to liberal arts as liberal education. Given today’s political climate, that may not have been a good decision!
So what might be a good synonym for the liberal arts? I confess I don’t have a proposal. Arts and sciences is one option, but I’d bet many stakeholders don’t understand that this includes humanities and social sciences, and this term doesn’t convey the value of studying these things. Some of the terms I think would resonate with the public are broad, well-rounded, transferrable, and thinking skills. But I’m not sure how to combine these terms meaningfully and succinctly.
What we need here is evidence-informed decision-making, including surveys and focus groups of various higher education stakeholders to see what resonates with them. I hope AAC&U, as a leading advocate of liberal arts education, might consider taking on a rebranding effort including stakeholder research. But if you have any ideas, let me know!
|Posted on August 8, 2017 at 10:35 AM||comments (2)|
Assessing student learning in co-curricular experiences can be challenging! Here are some suggestions from the (drum roll, please!) forthcoming third edition of my book Assessing Student Learning: A Common Sense Guide, to be published by Jossey-Bass on February 4, 2018. (Pre-order your copy at www.wiley.com/WileyCDA/WileyTitle/productCd-1119426936.html)
Recognize that some programs under a student affairs, student development, or student services umbrella are not co-curricular learning experiences. Giving commuting students information on available college services, for example, is not really providing a learning experience. Neither are student intervention programs that contact students at risk for poor academic performance to connect them with available services.
Focus assessment efforts on those co-curricular experiences where significant, meaningful learning is expected. Student learning may be a very minor part of what some student affairs, student development, and student services units seek to accomplish. The registrar’s office, for example, may answer students’ questions about registration but not really offer a significant program to educate students on registration procedures. And while some college security operations view educational programs on campus safety as a major component of their mission, others do not. Focus assessment time and energy on those co-curricular experiences that are large or significant enough to make a real impact on student learning.
Make sure every co-curricular experience has a clear purpose and clear goals. An excellent co-curricular experience is designed just like any other learning experience: it has a clear purpose, with one or more clear learning goals; it is designed to help students achieve those goals; and it assesses how well students have achieved those goals.
Recognize that many co-curricular experiences focus on student success as well as student learning—and assess both. Many co-curricular experiences, including orientation programs and first-year experiences, are explicitly intended to help students succeed in college: to earn passing grades, to progress on schedule, and to graduate. So it’s important to assess both student learning and student success in order to show that the value of these programs is worth the college’s investment in them.
Recognize that it’s often hard to determine definitively the impact of one co-curricular experience on student success because there may be other intervening factors. Students may successfully complete a first-year experience designed to prepare them to persist, for example, then leave because they’ve decided to pursue a career that doesn’t require a college degree.
Focus a co-curricular experience on an institutional learning goal such as interpersonal skills, analysis, professionalism, or problem solving.
Limit the number of learning goals of a co-curricular experience to perhaps just one or two.
State learning goals so they describe what students will be able to do after and as a result of the experience, not what they’ll do during the experience.
For voluntary co-curricular experiences, start but don’t end by tracking participation. Obviously if few students participate, impact is minimal no matter how much student learning takes place. So participation is an important measure. Set a rigorous but realistic target for participation, count the number of students who participate, and compare your count against your target.
Consider assessing student satisfaction, especially for voluntary experiences. Student dissatisfaction is an obvious sign that there’s a problem! But student satisfaction levels alone are insufficient assessments because they don’t tell us how well students have learned what we value.
Voluntary co-curricular experiences call for fun, engaging assessments. No one wants to take a test or write a paper to assess how well they’ve achieved a co-curricular experience’s learning goals. Group projects and presentations, role plays, team competitions, and Learning Assessment Techniques (Barkley & Major, 2016) can be more fun and engaging.
Assessments in co-curricular experiences need students to give them reasonably serious thought and effort. This can be a challenge when there's no grade to provide an incentive. Explain how the assessment will impact something students will find interesting and important.
Short co-curricular experiences call for short assessments. Brief, simple assessments such as minute papers, rating scales, and Learning Assessment Techniques can all yield a great deal of insight.
Attitudes and values can often only be assessed with indirect evidence such as rating scales, surveys, interviews, and focus groups. Reflective writing may be a useful, direct assessment strategy for some attitudes and values.
Co-curricular experiences often have learning goals such as teamwork that are assessed through processes rather than products. And processes are harder to assess than products. Direct observation (of a group discussion, for example), student self-reflection, peer assessments, and short quizzes are possible assessment strategies.
|Posted on June 19, 2017 at 9:30 AM||comments (1)|
Someone on the ASSESS listserv recently asked how to advise a faculty member who wanted to collect more assessment evidence before using it to try to make improvements in what he was doing in his classes. Here's my response, based on what I learned in a book I discussed in my last blog post called How to Measure Anything.
First, we think of doing assessment to help us make decisions (generally about improving teaching and learning). But think instead of doing assessment to help us make better decisions than we would make without it. Yes, faculty are always making informal decisions about changes to their teaching. Assessment should simply help them make somewhat better informed decisions.
Second, think about the risks of making the wrong decision. I'm going to assume, rightly or wrongly, that the professor is assessing student achievement of quantitative skills in a gen ed statistics course, and the results aren't great. There are five possible decision outcomes:
1. He decides to do nothing, and students in subsequent courses do just fine without any changes. (He was right; this was an off sample.)
2. He decides to do nothing, and students in subsequent courses continue to have, um, disappointing outcomes.
3. He changes things, and subsequent students do better because of his changes.
4. He changes things, but the changes don't help; despite his best effort, changes in his teaching didn't help improve the disappointing outcomes.
5. He changes things, and subsequent students do better, but not because of his changes--they're simply better prepared than this year's students.
So the risk of doing nothing is getting Outcome 2 instead of Outcome 1: Yet another class of students doesn't learn what they need to learn. The consequence is that even more students run into trouble in later classes, on the job, wherever, until the eventual decision is made to make some changes.
The risk of changing things, meanwhile, is getting Outcome 4 or 5 instead of Outcome 3: He makes changes but they don't help. The consequence here is his wasted time and, possibly, wasted money, if his college invested in something like an online statistics tutoring module or gave him some released time to work on this.
The question then becomes, "Which is the worse consequence?" Normally I'd say the first consequence is the worse one: continuing to pass or graduate students with inadequate learning. If so, it makes sense to go ahead with changes even without a lot of evidence. But if the second consequence involves a major investment of time or resources, then it may make sense to wait for more corroborating evidence before making that investment.
One final thought: Charles Blaich and Kathleen Wise wrote a paper for NILOA a few years ago on their research, in which they noted that our tradition of scholarly research does not include a culture of using research. Think of the research papers you've read--they generally conclude either by suggesting how some other people might use the research and/or by suggesting areas for further research. So sometimes the argument to wait and collect more data is simply a stalling tactic by people who don't want to change.
|Posted on May 30, 2017 at 12:10 AM||comments (16)|
I stumbled across a book by Douglas Hubbard titled How to Measure Anything: Finding the Value of “Intangibles” in Business. Yes, I was intrigued, so I splurged on it and devoured it.
The book should really be titled How to Measure Anything Without Killing Yourself because it focuses as much on limiting measurement as on doing it. Here are some of the great ideas I came away with:
1. We are (or should be) assessing because we want to make better decisions than what we would make without assessment results. If assessment results don’t help us make better decisions, they’re a waste of time and money.
2. Decisions are made with some level of uncertainty. Assessment results should reduce uncertainty but won’t eliminate it.
3. One way to judge the quality of assessment results is to think about how confident you are in them by pretending to make a money bet. Are you confident enough in the decision you’re making, based on assessment results, that you’d be willing to make a money bet that the decision is the right one? How much money would you be willing to bet?
4. Don’t try to assess everything. Focus on goals that you really need to assess and on assessments that may lead you to change what you’re doing. In other words, assessments that only confirm the status quo should go on a back burner. (I suggest assessing them every three years or so, just to make sure results aren’t slipping.)
5. Before starting a new assessment, ask how much you already know, how confident you are in what you know, and why you’re confident or not confident. Information you already have on hand, however imperfect, may be good enough. How much do you really need this new assessment?
6. Don’t reinvent the wheel. Almost anything you want to assess has already been assessed by others. Learn from them.
7. You have access to more assessment information than you might think. For fuzzy goals like attitudes and values, ask how you observe the presence or absence of the attitude or value in students and whether it leaves a trail of any kind.
8. If you know almost nothing, almost anything will tell you something. Don’t let anxiety about what could go wrong with assessment keep you from just starting to do some organized assessment.
9. Assessment results have both cost (in time as well as dollars) and value. Compare the two and make sure they’re in appropriate balance.
10. Aim for just enough results. You probably need less data than you think, and an adequate amount of new data is probably more accessible than you first thought. Compare the expected value of perfect assessment results (which are unattainable anyway), imperfect assessment results, and sample assessment results. Is the value of sample results good enough to give you confidence in making decisions?
11. Intangible does not mean immeasurable.
12. Attitudes and values are about human preferences and human choices. Preferences revealed through behaviors are more illuminating than preferences stated through rating scales, interviews, and the like.
13. Dashboards should be at-a-glance summaries. Just like your car’s dashboard, they should be mostly visual indicators such as graphs, not big tables that require study. Every item on the dashboard should be there with specific decisions in mind.
14. Assessment value is perishable. How quickly it perishes depends on how quickly our students, our curricula, and the needs of our students, employers, and region are changing.
15. Something we don’t ask often enough is whether a learning experience was worth the time students, faculty, and staff invested in it. Do students learn enough from a particular assignment or co-curricular experience to make it worth the time they spent on it? Do students learn enough from writing papers that take us 20 hours to grade to make our grading time worthwhile?
|Posted on May 21, 2017 at 6:10 AM||comments (6)|
I was impressed with—and found myself in agreement with—Douglas Roscoe’s analysis of the state of assessment in higher education in “Toward an Improvement Paradigm for Academic Quality” in the Winter 2017 issue of Liberal Education. Like Douglas, I think the assessment movement has lost its way, and it’s time for a new paradigm. And Douglas’s improvement paradigm—which focuses on creating spaces for conversations on improving teaching and curricula, making assessment more purposeful and useful, and bringing other important information and ideas into the conversation—makes sense. Much of what he proposes is in fact echoed in Using Evidence of Student Learning to Improve Higher Education by George Kuh, Stanley Ikenberry, Natasha Jankowski, Timothy Cain, Peter Ewell, Pat Hutchings, and Jillian Kinzie.
But I don’t think his improvement paradigm goes far enough, so I propose a second, concurrent paradigm shift.
I’ve always felt that the assessment movement tried to do too much, too quickly. The assessment movement emerged from three concurrent forces. One was the U.S. federal government, which through a series of Higher Education Acts required Title IV gatekeeper accreditors to require the institutions they accredit to demonstrate that they were achieving their missions. Because the fundamental mission of an institution of higher education is, well, education, this was essentially a requirement that institutions demonstrate that their intended student learning outcomes were being achieved by their students.
The Higher Education Acts also required Title IV gatekeeper accreditors to require the institutions they accredit to demonstrate “success with respect to student achievement in relation to the institution’s mission, including, as appropriate, consideration of course completion, state licensing examinations, and job placement rates” (1998 Amendments to the Higher Education Act of 1965, Title IV, Part H, Sect. 492(b)(4)(E)). The examples in this statement imply that the federal government defines student achievement as a combination of student learning, course and degree completion, and job placement.
A second concurrent force was the movement from a teaching-centered to learning-centered approach to higher education, encapsulated in Robert Barr and John Tagg’s 1995 landmark article in Change, “From Teaching to Learning: A New Paradigm for Undergraduate Education.” The learning-centered paradigm advocates, among other things, making undergraduate education an integrated learning experience—more than a collection of courses—that focuses on the development of lasting, transferrable thinking skills rather than just basic conceptual understanding.
The third concurrent force was the growing body of research on practices that help students learn, persist, and succeed in higher education. Among these practices: students learn more effectively when they integrate and see coherence in their learning, when they participate in out-of-class activities that build on what they’re learning in the classroom, and when new learning is connected to prior experiences.
These three forces led to calls for a lot of concurrent, dramatic changes in U.S. higher education:
- Defining quality by impact rather than effort—outcomes rather than processes and intent
- Looking on undergraduate majors and general education curricula as integrated learning experiences rather than collections of courses
- Adopting new research-informed teaching methods that are a 180-degree shift from lectures
- Developing curricula, learning activities, and assessments that focus explicitly on important learning outcomes
- Identifying learning outcomes not just for courses but for entire programs, general education curricula, and even across entire institutions
- Framing what we used to call extracurricular activities as co-curricular activities, connected purposefully to academic programs
- Using rubrics rather than multiple choice tests to evaluate student learning
- Working collaboratively, including across disciplinary and organizational lines, rather than independently
These are well-founded and important aims, but they are all things that many in higher education had never considered before. Now everyone was being asked to accept the need for all these changes, learn how to make these changes, and implement all these changes—and all at the same time. No wonder there’s been so much foot-dragging on assessment! And no wonder that, a generation into the assessment movement and unrelenting accreditation pressure, there are still great swaths of the higher education community who have not yet done much of this and who indeed remain oblivious to much of this.
What particularly troubles me is that we’ve spent too much time and effort on trying to create—and assess—integrated, coherent student learning experiences and, in doing so, left the grading process in the dust. Requiring everything to be part of an integrated, coherent learning experience can lead to pushing square pegs into round holes. Consider:
- The transfer associate degrees offered by many community colleges, for example, aren’t really programs—they’re a collection of general education and cognate requirements that students complete so they’re prepared to start a major after they transfer. So identifying—or assessing—program learning outcomes for them frankly doesn’t make much sense.
- The courses available to fulfill some general education requirements don’t really have much in common, so their shared general education outcomes become so broad as to be almost meaningless.
- Some large universities are divided into separate colleges and schools, each with their own distinct missions and learning outcomes. Forcing these universities to identify institutional learning outcomes applicable to every program makes no sense—again, the outcomes must be so broad as to be almost meaningless.
- The growing numbers of students who swirl through multiple colleges before earning a degree aren’t going to have a really integrated, coherent learning experience no matter how hard any of us tries.
At the same time, we have given short shrift to helping faculty learn how to develop and use good assessments in their own classes and how to use grading information to understand and improve their own teaching. In the hundreds of workshops and presentations I’ve done across the country, I often ask for a show of hands from faculty who routinely count how many students earned each score on each rubric criterion of a class assignment, so they can understand what students learned well and what they didn’t learn well. Invariably a tiny proportion raises their hands. When I work with faculty who use multiple choice tests, I ask how many use a test blueprint to plan their tests so they align with key course objectives, and it’s consistently a foreign concept to them.
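The rubric tally I ask about in workshops takes only a few lines once scores are in a spreadsheet or list. Here is a minimal sketch in Python; the criterion names, the 1-to-4 score levels, and the student data are all made up for illustration, not drawn from any real class.

```python
from collections import Counter

# Hypothetical rubric scores for one assignment: one dict per student,
# mapping each rubric criterion to the level earned (1 = lowest, 4 = highest).
scores = [
    {"thesis": 4, "evidence": 2, "organization": 3},
    {"thesis": 3, "evidence": 2, "organization": 4},
    {"thesis": 4, "evidence": 1, "organization": 3},
    {"thesis": 2, "evidence": 3, "organization": 3},
]

def tally_rubric(scores):
    """Count how many students earned each score level on each criterion."""
    tallies = {}
    for student in scores:
        for criterion, level in student.items():
            tallies.setdefault(criterion, Counter())[level] += 1
    return tallies

tallies = tally_rubric(scores)
for criterion, counts in tallies.items():
    summary = ", ".join(
        f"{counts[level]} at level {level}" for level in sorted(counts, reverse=True)
    )
    print(f"{criterion}: {summary}")
```

In this invented data set, no student reaches the top level on "evidence" and most sit at level 1 or 2, which is exactly the kind of pattern that tells you what students didn't learn well.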
In short, we’ve left a vital part of the higher education experience—the grading process—in the dust. We invest more time in calibrating rubrics for assessing institutional learning outcomes, for example, than we do in calibrating grades. And grades have far more serious consequences to our students, employers, and society than assessments of program, general education, co-curricular, or institutional learning outcomes. Grades decide whether students progress to the next course in a sequence, whether they can transfer to another college, whether they graduate, whether they can pursue a more advanced degree, and in some cases whether they can find employment in their discipline.
So where should we go? My paradigm springs from visits to two Canadian institutions a few years ago. At that time Canadian quality assurance agencies did not have any requirements for assessing student learning, so my workshops focused solely on assessing learning more effectively in the classroom. The workshops were well received because they offered very practical help that faculty wanted and needed. And at the end of the workshops, faculty began suggesting that perhaps they should collaborate to talk about shared learning outcomes and how to teach and assess them. In other words, discussion of classroom learning outcomes began to flow into discussion of program learning outcomes. It’s a naturalistic approach that I wish we in the United States had adopted decades ago.
What I now propose is moving to a focus on applying everything we’ve learned about curriculum design and assessment to the grading process in the classroom. In other words, my paradigm agrees with Roscoe’s that “assessment should be about changing what happens in the classroom—what students actually experience as they progress through their courses—so that learning is deeper and more consequential.” My paradigm emphasizes the following.
- Assessing program, general education, and institutional learning outcomes remains an assessment best practice. Those who have found value in these assessments would be encouraged to continue to engage in them and honored through mechanisms such as NILOA’s Excellence in Assessment designation.
- Teaching excellence is defined in significant part by four criteria: (1) the use of research-informed teaching and curricular strategies, (2) the alignment of learning activities and grading criteria to stated course objectives, (3) the use of good quality evidence, including but not limited to assessment results from the grading process, to inform changes to one’s teaching, and (4) active participation in and application of professional development opportunities on teaching including assessment.
- Investments in professional development on research-informed teaching practices exceed investments in assessment.
- Assessment work is coordinated and supported by faculty professional development centers (teaching-learning centers) rather than offices of institutional effectiveness or accreditation, sending a powerful message that assessment is about improving teaching and learning, not fulfilling an external mandate.
- We aim to move from a paradigm of assessment, not just to one of improvement as Roscoe proposes, but to one of evidence-informed improvement—a culture in which the use of good quality evidence to inform discussions and decisions is expected and valued.
- If assessment is done well, it’s a natural part of the teaching-learning process, not a burdensome add-on responsibility. The extra work is in reporting it to accreditors. This extra work can’t be eliminated, but it can be minimized and made more meaningful by establishing the expectation that reports address only key learning outcomes in key courses (including program capstones), on a rotating schedule, and that course assessments are aggregated and analyzed within the program review process.
Under this paradigm, I think we have a much better shot at achieving what’s most important: giving every student the best possible education.
|Posted on March 18, 2017 at 8:25 AM||comments (13)|
My last blog post on analyzing multiple choice test results generated a good bit of feedback, mostly on the ASSESS listserv. Joan Hawthorne and a couple of other colleagues thoughtfully challenged my “50% rule”—that any questions that more than 50% of your students get wrong may suggest something wrong and should be reviewed carefully.
Joan pointed out that my 50% rule shouldn’t be used with tests that are so important that students should earn close to 100%. She’s absolutely right. Some things we teach—healthcare, safety—are so important that if students don’t learn them well, people could die. If you’re teaching and assessing must-know skills and concepts, you might want to look twice at any test items that more than 10% or 15% of students got wrong.
With other tests, how hard the test should be depends on its purpose. I was taught in grad school that the purpose of some tests is to separate the top students from the bottom—distinguish which students should earn an A, B, C, D, or F. If you want to maximize the spread of test scores, an average item difficulty of 50% is your best bet—in theory, you should get test scores ranging all the way from 0 to 100%. If you want each test item to do the best possible job discriminating between top and bottom students, again you’d want to aim for a 50% difficulty.
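One way to see why 50% is the theoretical sweet spot: for an item scored simply right or wrong, the variance of the item score is p × (1 − p), where p is the proportion of students who answer correctly. The short Python sketch below just evaluates that formula at a few illustrative values of p (the values are arbitrary examples, not data from any test).

```python
def item_variance(p):
    """Variance of a right/wrong (1/0) item when a proportion p answer correctly."""
    return p * (1 - p)

# Illustrative proportions correct, from a very hard item to a very easy one.
proportions_correct = [0.1, 0.3, 0.5, 0.7, 0.9]
variances = {p: item_variance(p) for p in proportions_correct}

for p, v in variances.items():
    print(f"{p:.0%} correct -> item variance {v:.2f}")

# The proportion with the largest variance, i.e. the most spread-out item scores.
best = max(variances, key=variances.get)
print(f"Largest spread at {best:.0%} correct")
```

The variance peaks at 50% and shrinks toward zero as an item gets very easy or very hard, which is the textbook argument for aiming at 50% when spreading out scores is the only goal.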
But in the real world I’ve never seen a good test with an overall 50% difficulty for several good reasons.
1. Difficult test questions are incredibly hard to write. Most college students want to get a good grade and will at least try to study for your test. It’s very hard to come up with a test question that assesses an important objective but that half of them will get wrong. Most difficult items I’ve seen are either on minutiae, “trick” questions on some nuanced point, or questions that are more tests of logical reasoning skill than course learning objectives. In my whole life I’ve written maybe two or three difficult multiple choice questions that I’ve been proud of: that truly focused on important learning outcomes and didn’t require a careful nuanced reading or logical reasoning skills. In my consulting work, I’ve seen no more than half a dozen difficult but effective items written by others. This experience has led me to suggest that “50% rule.”
2. Difficult tests are demoralizing to students, even if you “curve” the scores and even if they know in advance that the test will be difficult.
3. Difficult tests are rarely appropriate, because it’s rare for the sole or major purpose of a test to be to maximize the spread of scores. Many tests have dual purposes. There are certain fundamental learning objectives we want to make sure (almost) every student has learned, or they’re going to run into problems later on. Then there are some learning objectives that are more challenging—that only the A or maybe B students will achieve—and those test items will separate the A from B students and so on.
So, while I have great respect for those who disagree with me, I stand by my suggestion in my last blog post. Compare each item’s actual difficulty (the percent of students who answered correctly) against how difficult you wanted that item to be, and carefully evaluate any items that more than 50% of your students got wrong.
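If you like to see a rule written out, here is a rough sketch in Python of that comparison. Everything in it is hypothetical: the item labels, the percentages, and the `intended_max_wrong` targets are invented, and you would set your own targets, with much stricter ones for must-know items as Joan suggested.

```python
FIFTY_PERCENT_RULE = 50  # review any item that more than half the class got wrong

# Hypothetical results: percent of students who got each item wrong, and the
# highest percent wrong the instructor intended to accept for that item.
items = {
    "Q1 must-know": {"pct_wrong": 18, "intended_max_wrong": 10},
    "Q2 core": {"pct_wrong": 22, "intended_max_wrong": 30},
    "Q3 core": {"pct_wrong": 61, "intended_max_wrong": 30},
    "Q4 challenge": {"pct_wrong": 55, "intended_max_wrong": 60},
}

def items_to_review(items):
    """Return each flagged item with the reason(s) it deserves a careful look."""
    flagged = {}
    for name, result in items.items():
        reasons = []
        if result["pct_wrong"] > result["intended_max_wrong"]:
            reasons.append("harder than intended")
        if result["pct_wrong"] > FIFTY_PERCENT_RULE:
            reasons.append("more than half got it wrong")
        if reasons:
            flagged[name] = reasons
    return flagged

flagged = items_to_review(items)
for name, reasons in flagged.items():
    print(f"{name}: {'; '.join(reasons)}")
```

A flag here is only a prompt to look at the item, not a verdict. A deliberately challenging item such as the made-up Q4 may turn out to be fine once you have reviewed it.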
|Posted on February 28, 2017 at 8:15 AM||comments (2)|
Next month I’m doing a faculty professional development workshop on interpreting the reports generated for multiple choice tests. Whenever I do one of these workshops, I ask the sponsoring institution to send me some sample reports. I’m always struck by how user-unfriendly they are!
The most important thing to look at in a test report is the difficulty of each item—the percent of students who answered each item correctly. Fortunately these numbers are usually easy to find. The main thing to think about is whether each item was as hard as you intended it to be. Most tests have some items on essential course objectives that every student who passes the course should know or be able to do. We want virtually every student to answer those items correctly, so check those items and see if most students did indeed get them right.
Then take a hard look at any test items that a lot of students got wrong. Many tests purposefully include a few very challenging items, requiring students to, say, synthesize their learning and apply it to a new problem they haven’t seen in class. These are the items that separate the A students from the B and C students. If these are the items that a lot of students got wrong, great! But take a hard look at any other questions that a lot of students got wrong. My personal benchmark is what I call the 50 percent rule: if more than half my students get a question wrong, I give the question a hard look.
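If your test report is hard to read, item difficulty is simple enough to compute yourself from the raw answers. The Python sketch below assumes you can export each student's chosen options alongside the answer key; the key and the five students' responses are invented for illustration.

```python
# Hypothetical answer key for a four-item test and the options five students chose.
answer_key = ["B", "D", "A", "C"]
responses = [
    ["B", "D", "A", "A"],
    ["B", "D", "C", "B"],
    ["B", "A", "A", "D"],
    ["B", "D", "A", "C"],
    ["C", "D", "B", "A"],
]

def percent_correct(responses, answer_key):
    """Item difficulty: percent of students who answered each item correctly."""
    n_students = len(responses)
    return [
        100 * sum(student[i] == key for student in responses) / n_students
        for i, key in enumerate(answer_key)
    ]

difficulty = percent_correct(responses, answer_key)

# The 50 percent rule: flag items that more than half the students got wrong.
needs_hard_look = [
    item_number
    for item_number, pct in enumerate(difficulty, start=1)
    if 100 - pct > 50
]

for item_number, pct in enumerate(difficulty, start=1):
    print(f"Item {item_number}: {pct:.0f}% correct")
print("Give a hard look to items:", needs_hard_look)
```

In this made-up example only item 4 trips the rule. Whether that is a problem depends on whether item 4 was meant to be one of the very challenging items.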
Now comes the hard part: figuring out why more students got a question wrong than we expected. There are several possible reasons including the following:
- The question or one or more of its options is worded poorly, and students misinterpret them.
- We might have taught the question’s learning outcome poorly, so students didn’t learn it well. Perhaps students didn’t get enough opportunities, through classwork or homework, to practice the outcome.
- The question might be on a trivial point that few students took the time to learn, rather than a key course learning outcome. (I recently saw a question on an economics test that asked how many U.S. jobs were added in the last quarter. Good heavens, why do students need to memorize that? Is that the kind of lasting learning we want our students to take with them?)
If you’re not sure why students did poorly on a particular test question, ask them! Trust me, they’ll be happy to tell you what you did wrong!
Test reports provide two other kinds of information: the discrimination of each item and how many students chose each option. These are the parts that are usually user-unfriendly and, frankly, can take more time to decipher than they’re worth.
The only thing I’d look for here is any items with negative discrimination. The underlying theory of item discrimination is that students who get an A on your test should be more likely to get any one question right than students who fail it. In other words, each test item should discriminate between top and bottom students. Imagine a test question that all your A students get wrong but all your failing students answer correctly. That’s an item with negative discrimination. Obviously there’s something wrong with the question’s wording—your A students interpreted it incorrectly—and it should be thrown out. Fortunately, items with negative discrimination are relatively rare and usually easy to identify in the report.
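To make the idea concrete, here is a minimal Python sketch of one simple discrimination statistic: the proportion of top-scoring students who got an item right minus the proportion of bottom-scoring students who did. Comparing the top and bottom 27% is a common convention, but your test-report software may well use a different statistic (a point-biserial correlation, for example), so treat this as an illustration rather than a description of any particular report. The scores and item results are invented.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Upper-lower discrimination index for one item.

    item_correct: 1 if the student got this item right, else 0 (one per student).
    total_scores: each student's total test score, in the same order.
    Returns a value from -1.0 to 1.0; negative means low scorers did better.
    """
    n = len(total_scores)
    group_size = max(1, round(n * fraction))
    # Student positions ranked from highest to lowest total score.
    ranked = sorted(range(n), key=lambda s: total_scores[s], reverse=True)
    top, bottom = ranked[:group_size], ranked[-group_size:]
    p_top = sum(item_correct[s] for s in top) / group_size
    p_bottom = sum(item_correct[s] for s in bottom) / group_size
    return p_top - p_bottom

# Hypothetical class of ten students.
total_scores = [95, 90, 88, 80, 75, 70, 65, 55, 50, 40]
healthy_item = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]  # top students tend to get it right
suspect_item = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]  # top students tend to get it wrong

healthy = discrimination_index(healthy_item, total_scores)
suspect = discrimination_index(suspect_item, total_scores)
print(f"Healthy item: {healthy:+.2f}")
print(f"Suspect item: {suspect:+.2f}")
```

The second item comes out negative, the signal that its wording deserves a close look.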
|Posted on January 26, 2017 at 8:40 AM||comments (6)|
A new survey of chief academic officers (CAOs) conducted by Gallup and Inside Higher Ed led me to the sobering conclusion that, after a generation of work on assessment, we in U.S. higher education remain very, very far from pervasively conducting truly meaningful and worthwhile assessment.
Because we've been working on this so long, as I reviewed the results of this survey, I was deliberately tough. The survey asked CAOs to rate the effectiveness of their institutions on a variety of criteria using a scale of very effective, somewhat effective, not too effective, and not effective at all. The survey also asked CAOs to indicate their agreement with a variety of statements on a five-point scale, where 5 = strongly agree, 1 = strongly disagree, and the other points are undefined. At this point I would have liked to see most CAOs rate their institutions at the top of the scale: either “very effective” or “strongly agree.” So these are the results I focused on and, boy, are they depressing.
Quality of Assessment Work
Less than a third (30%) of CAOs say their institution is very effective in identifying and assessing student outcomes. ’Nuff said on that!
Value of Assessment Work
Here the numbers are really dismal. Less than 10% (yes, ten percent, folks!) of CAOs strongly agree that:
- Faculty members value assessment efforts at their college (4%).
- The growth of assessment systems has improved the quality of teaching and learning at their college (7%).
- Assessment has led to better use of technology in teaching and learning (6%). (Parenthetically, that struck me as an odd survey question; I had no idea that one of the purposes of assessment was to improve the use of technology in T&L!)
And just 12% strongly disagree that their college’s use of assessment is more about keeping accreditors and politicians happy than it is about teaching and learning.
And only 6% of CAOs strongly disagree that faculty at their college view assessment as requiring a lot of work on their parts. Here I’m reading something into the question that might not be there. If the survey asked if faculty view teaching as requiring a lot of work on their parts, I suspect that a much higher proportion of CAOs would disagree because, while teaching does require a lot of work, it’s what faculty generally find to be valuable work--it's what they are expected to do, after all. So I suspect that, if faculty saw value in their assessment work commensurate with the time they put into it, this number would be a lot higher.
Using Evidence to Inform Decisions
Here’s a conundrum:
- Over two thirds (71%) of CAOs say their college makes effective use of data used to measure student outcomes,
- But only about a quarter (26%) said their college is very effective in using data to aid and inform decision making.
- And only 13% strongly agree that their college regularly makes changes in the curriculum, teaching practices, or student services based on what it finds through assessment.
So I’m wondering what CAOs consider effective uses of assessment data!
- About two thirds (67%) of CAOs say their college is very effective in providing a quality undergraduate education.
- But less than half (48%) say it’s very effective in preparing students for the world of work,
- And only about a quarter (27%) say it’s very effective in preparing students to be engaged citizens.
- And (as I've already noted) only 30% say it’s very effective in identifying and assessing student outcomes.
How can CAOs who admit their colleges are not very effective in preparing students for work or citizenship engagement or assessing student learning nonetheless think their college is very effective in providing a quality undergraduate education? What evidence are they using to draw that conclusion?
- While less than half of CAOs say their colleges are very effective in preparing students for work,
- Only about a third (32%) strongly agree that their institution is increasing attention to the ability of its degree programs to help students get a good job.
After a quarter century of work to get everyone to do assessment well:
- Assessment remains spotty; it is the very rare institution that is doing assessment pervasively and consistently well.
- A lot of assessment work either isn’t very useful or takes more time than it’s worth.
- We have not yet transformed American higher education into an enterprise that habitually uses evidence to inform decisions.