
Tuesday, October 27, 2009

Evaluation Methods: Each user is unique. Assess each one first, then look for patterns

On Monday, I talked about my belief, as a novice evaluator and educator, that evaluation (and teaching) should be organized around programmatic goals: describe what every student should learn, study each student's progress toward those goals, and study the program activities that are most crucial (and perhaps most risky) for producing those outcomes.

After some years of experience, however, first at Evergreen and then as a program officer with the Fund for the Improvement of Postsecondary Education (FIPSE), I realized that this Uniform Impact perspective was valuable but limited.

In fact, I now think there are two legitimate, valuable ways to think about, and evaluate, any educational program or service:
  • Uniform Impact: Pay attention to the same learning goal(s) for each and every student (or, if you're evaluating a faculty support program, pay attention to whether all the faculty are making progress in a direction chosen by the program leaders).
  • Unique Uses: Pay attention to the most important positive and negative outcomes for each user of the program, no matter what those outcomes are.
You can see both perspectives in action in many courses. For example, if an instructor gives three papers an “A,” and remarks, “These three papers had almost nothing in common except that, in different ways, they were each excellent,” the instructor is using a Unique Uses perspective to do assessment.

Each of these two perspectives focuses on things that the other would miss. A Unique Uses perspective is especially important in liberal and professional education: both aim to educate students to exercise judgment and make choices. If every student had the same experiences and outcomes, the experience would be training, not liberal or professional education.

Similarly, Unique Uses is important for transformative uses of technology in education, because many of those uses are intended to empower learners and their instructors. For example, when a faculty member assigns students to set their own topics and then use the library and the Internet to do their own research, some of the outcomes can only be assessed through a Unique Uses approach.

What are the basic steps for doing a Unique Uses evaluation?
  1. Pick a selection, probably a random selection, of users of the program (e.g., students).
  2. Use an outsider to ask them what the most important consequences have been from participating in the program, how they were achieved, and why the interviewee thinks their participation in the program helped cause those consequences (evidence).
  3. Use experts with experience in this type of program (Eliot Eisner has called these kinds of people 'connoisseurs' because they have educated judgment honed by long experience) to analyze the interviews. For each user, the connoisseur would summarize the value of the outcome in the connoisseur's eyes, using one or more rating scales.
  4. The connoisseur would also comment on whether and how the program seems to have influenced the outcome for this individual, perhaps with suggestions for how the program could do better next time with this type of user.
  5. The connoisseur(s) then look for patterns in these evaluative narratives about individuals. For example, the connoisseur(s) might notice that many of the participants encountered problems when, in one way or another, their work carried them beyond the expertise of their instructors, and that instructors seemed to have no easy strategy for coping with that.
  6. Finally, the connoisseur(s) write a report to the program with a summary judgment, recommendations for improvement, or both, illustrated with data from relevant cases.
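To make the workflow above a bit more concrete, here is a minimal Python sketch of how the pieces might fit together. Everything in it is hypothetical: the record fields, the 1-5 rating scale, and the keyword-counting shortcut for surfacing patterns are illustrative stand-ins, not part of any actual Flashlight instrument.

```python
import random
from dataclasses import dataclass

@dataclass
class ConnoisseurReview:
    """A connoisseur's judgment of one participant's interview (hypothetical fields)."""
    participant_id: str
    outcome_value: int      # e.g., 1 (little value) .. 5 (exceptional value)
    program_influence: str  # how the program seems to have shaped this outcome
    suggestions: str = ""   # how the program could serve this type of user better

def sample_participants(all_participants, sample_size, seed=None):
    """Step 1: draw a (probably random) selection of program users to interview."""
    rng = random.Random(seed)
    return rng.sample(all_participants, min(sample_size, len(all_participants)))

def summarize_patterns(reviews, themes):
    """Step 5: count how often recurring themes appear across the
    connoisseur's notes, as one crude way to surface patterns."""
    counts = {theme: 0 for theme in themes}
    for review in reviews:
        text = (review.program_influence + " " + review.suggestions).lower()
        for theme in themes:
            if theme.lower() in text:
                counts[theme] += 1
    return counts

# Example: did many participants' projects carry them beyond their instructors' expertise?
reviews = [
    ConnoisseurReview("s01", 5, "Project went beyond the instructor's expertise; student stalled."),
    ConnoisseurReview("s02", 4, "Instructor connected the student with an outside mentor."),
]
print(summarize_patterns(reviews, ["beyond the instructor", "outside mentor"]))
```

In a real study, the pattern-finding step would remain the connoisseur's qualitative judgment; the keyword count above is only a reminder that the evaluative narratives about individuals, not a single test score, are the raw data.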
To repeat, a comprehensive evaluation of almost any academic program or service ought to have both Uniform Impact and Unique Uses components, because each type of study will pick up findings that the other will miss. Some programs (e.g. a faculty development program that works in an ad hoc manner with each faculty member requesting help) are best served if the evaluation is mostly about Unique Uses. A training program (e.g., most versions of Spanish 101) is probably best evaluated using mainly Uniform Impact methods. But most programs and services need some of each method.

There are subtle, important differences between these two perspectives. For example,
  • Defining “excellence”: In a Uniform Impact perspective, program excellence consists of producing great value-added (as measured along program goals) regardless of the characteristics or motivations of the incoming students. In contrast, program excellence in Unique Uses terms is measured in part by generativity: Shakespeare's plays are timeless classics in part because there are so many great, even surprising ways to enact them, even after 400 years. The producer, director and actors are unique users of the text.
  • Defining the "technology": From a Uniform Impact perspective, the technology will be the same for all users. From a Unique Uses perspective, one notices that different users make different choices of which technologies to use, how to use them, and how to use their products.
For more on our recommendations about how to design evaluations, especially studies of educational uses of technology, see the Flashlight Evaluation Handbook. The Flashlight Approach, a PDF in Section I, gives a summary of the key ideas.

Have any evaluations or assessments at your institution used Unique Uses methods? Should they in the future? Please click the comments button below and share your observations and reactions.

PS We've passed 3,300 visits to http://bit.ly/ten_things_table. So far, however, most people seem to look at the summary and perhaps one essay. Come back, read more of these mini-essays, and share more of your own observations!

Monday, October 26, 2009

12. To evaluate ed tech, set learning goals & assess student progress toward them (OK but what does this approach miss?)

It's Monday so let's talk about another one of those things I no longer (quite) believe about evaluation of educational uses of technology. Definition: “Evaluation” for me is intentional, formal gathering of information about a program in order to make better decisions about that program.

In 1975, I was the institutional evaluator at The Evergreen State College in Olympia, Washington. I'd offer faculty help in answering their own questions about their own academic programs (a “program” is Evergreen's version of a course). Sometimes faculty would ask for help in framing a good evaluative question about their programs. I'd respond, “First, describe the skills, knowledge or other attributes that you want your students to gain from their experience in your program.”

“Define one or more Learning Objectives for your students” remains step 1 for most evaluations today, including (but not limited to) evaluating the good news and bad news about technology use in academic programs. In sections A-E of this series, I've described five families of outcomes (goals) of technology use, and suggested briefly how to assess each one.

However, outcomes assessment by itself provides little guidance for how to improve outcomes. So the next step is to identify the teaching/learning activities that should produce those desired outcomes. Then the evaluator gathers evidence about whether those activities have really happened, and, if not, why not. Evidence about activities can be extremely helpful in a) explaining outcomes, b) improving outcomes, and c) investigating the strengths, weaknesses and value of technology (or any sort of resource or facility) for supporting those activities.

Let's illustrate this with an example.

Suppose, for example, that your institution has been experimenting with the use of online chats and emails to help students learn conversational Spanish. As the evaluator, you'd need to have a procedure for assessing each student's competence in understanding and speaking Spanish. Then you'd use that method to assess all students at the end of the program and perhaps also earlier (so you could see what they need at the beginning, how they're doing in the middle, and what they've each gained by the end).

You would also study how the students are using those online communications channels, what the strengths and weaknesses of each channel are for writing in Spanish, whether there is a relationship between each student's use of those channels and their progress in speaking Spanish, and so on.
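For readers who like to see the arithmetic, here is a small Python sketch of the Uniform Impact side of this design: each student's gain on the same speaking assessment (end of term minus beginning), plus a simple correlation between channel use and gain. The data, field names, and scores are invented for illustration; they are not from any real Spanish program.

```python
# Hypothetical records: one dict per student, with proficiency scores from the
# same speaking assessment given at the start and end of the term, plus a count
# of online chat/email exchanges in Spanish (all values are made up).
students = [
    {"id": "a", "pre": 32, "post": 61, "chat_messages": 140},
    {"id": "b", "pre": 45, "post": 58, "chat_messages": 35},
    {"id": "c", "pre": 28, "post": 70, "chat_messages": 210},
    {"id": "d", "pre": 50, "post": 66, "chat_messages": 90},
]

def mean(xs):
    return sum(xs) / len(xs)

def pearson_r(xs, ys):
    """Simple correlation between channel use and each student's gain."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sd_x = sum((x - mx) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)

gains = [s["post"] - s["pre"] for s in students]   # progress toward the program goal
usage = [s["chat_messages"] for s in students]

print("average gain:", mean(gains))
print("use-vs-gain correlation:", round(pearson_r(usage, gains), 2))
```

A positive correlation would not prove that the chats caused the gains, but it would tell you where to look next, for example by examining how the heaviest users actually worked in those channels.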

Your findings from these studies will signal whether online communications are helping students learn to speak Spanish, and how to make the program work better in the future.

Notice that what I've said so far about designing evaluation is entirely defined by program goals: the definition of goals sets the assessment agenda and also tells you which activities are most important to study. I've labeled this the Uniform Impact perspective, because it assumes that the program's goals are what matter, and that those goals are the same for all students.

Does the Uniform Impact perspective describe the way assessment and evaluation are done at your institution? Do any assessments and evaluations that you know of go beyond the suggestions above? (Please add your observations below by using the "Comments" button.)

PS. “Ten Things” is gaining readers! The alias for the table of contents – http://bit.ly/ten_things_table – has been clicked over 3,200 times already. Thanks! If you agree these are important questions for faculty and administrators to consider, please add your own observations to any of these posts, old or new, and spread the word about this series.