
Formative Evaluation: A Practical Guide

By Lisa Neal / November 2006

When designing an online course, countless decisions need to be made if the course is to meet its objectives for the target audience. Formative evaluation provides an easy-to-learn approach for verifying design decisions in order to increase effectiveness.

The purpose of formative evaluation is to obtain user feedback during the design and development stages of a project, such as an online course or a Web site. The feedback generally covers three areas: appeal, usability, and effectiveness at accomplishing a task. Formative evaluation can be conducted on concepts, paper prototypes, screen mock-ups, a working prototype, or an alpha release, with the goal of using the feedback to improve the design at the earliest possible stages, when the cost of making changes is lowest.

Formative evaluation has four stages: planning the sessions, conducting them, compiling the results, and prioritizing what those results show. The stages are not necessarily linear, especially when the detection of significant problems leads to some immediate redevelopment before more sessions are conducted.

Planning the Sessions
Planning a session starts with a review of the goals for a product and an identification of the user populations who will be tested. Together, the goals and user populations set an agenda for the formative evaluation. The ideal number of users depends on how much time can be allocated to formative evaluation and on the stage of development: Generally, fewer testers are used in the earlier stages.

The next step is developing a scenario and scripting a session, which involves determining what a user will see and be asked. In scripting a session, consider the duration, which depends on the product and on the users; a lengthy session can become tedious. Every session should start and end the same way: Thank the user for his or her time and make clear that the feedback is greatly appreciated for helping to improve the product. (In some cases, users are compensated, or at least supplied with snacks, as a way of showing appreciation.) At the beginning of a session, the evaluator should give the user some background on what the product is. The user should never feel like he or she is being tested on his or her knowledge or skills.

The next step is to determine whether any data needs to be collected at either the beginning or end of a session, and whether it is better to use paper or to ask questions. At the beginning of a session, relevant biographical information is collected about a person's role, expertise, and tool use. Afterwards, evaluators may ask questions about overall reactions; sometimes presenting these on paper with a rating scale can be more effective than (and a welcome change from) verbal questioning.
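
As a small illustration, here is a minimal Python sketch of how such end-of-session paper ratings might be tallied; the questions, the 1-to-5 scale, and the scores are hypothetical, not from any actual evaluation.

    # A minimal sketch (hypothetical questions and 1-5 scores) of tallying
    # end-of-session paper ratings, one list of user responses per question.
    ratings = {
        "Overall, I found the course appealing": [4, 5, 3, 4],
        "I could find what I was looking for": [2, 3, 2, 4],
    }

    for question, scores in ratings.items():
        mean = sum(scores) / len(scores)
        print(f"{question}: mean {mean:.1f} across {len(scores)} users")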

For use during the session, a series of questions is prepared for each idea or screen, be it paper, a mock-up, or interactive. The questions may be asked before, during, or after a screen is viewed. For example, before seeing a screen, a user may be asked about expectations. While viewing a screen, the user may be asked whether he or she likes it (appeal), to locate something on the screen (usability), or how to accomplish a specific task (effectiveness). After viewing the screen, the user may be asked how it compared to expectations. Additionally, during the actual session, a think-aloud protocol is generally used, and questions such as "Why did you click there?" or "What are you trying to find?" are asked as needed if the user isn't volunteering the information.
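
To make this structure concrete, here is a minimal sketch, in Python, of how a session script might be organized around screens and question timing; the screen name and question wordings are hypothetical illustrations.

    # A minimal sketch of a session script: each screen pairs with questions
    # asked before, during, and after viewing, covering the three feedback
    # areas. The screen and questions below are hypothetical examples.
    script = [
        {
            "screen": "Course home page (paper mock-up)",
            "before": ["What do you expect to see on the first screen?"],
            "during": [
                "Do you like the look of this page?",                # appeal
                "Where would you click to start the first lesson?",  # usability
                "How would you check your progress so far?",         # effectiveness
            ],
            "after": ["How did this page compare to what you expected?"],
        },
    ]

    # The evaluator walks the list in order; ad-hoc think-aloud prompts such
    # as "Why did you click there?" are asked as needed and stay unscripted.
    for item in script:
        total = len(item["before"]) + len(item["during"]) + len(item["after"])
        print(f'{item["screen"]}: {total} scripted questions')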

A plan for a formative evaluation includes filling roles. Generally two people are required, an evaluator and a scribe, and there may be observers as well. The evaluator may be a designer or developer; if so, he or she should be careful not to appear attached to his or her own work. The evaluator should be friendly and try to elicit reactions and responses. Sometimes the evaluator is not involved in the project at all and is simply briefed on the formative evaluation objectives.

It is hard for the evaluator to take notes while running a session, which is why two people are better. Even when a session is audio- or video-recorded, the most useful results are the higher-level comments and observations, not a record of each utterance. Useful notes include: "Immediately saw where the pull-down menu was to open a file" or "Was excited to know that there would be a chart-template library and discussed extensively how it could be used on the job." The scribe should keep in mind that the notes will be used to plan revisions. The scribe should also note anything that will help in conducting subsequent sessions.

Any observers at a session should try to remain unobtrusive. While it is helpful for developers to observe users, a user may be intimidated by too many observers.

Finally, it is very helpful to pilot a formative evaluation on a colleague before bringing in outsiders or customers, in order to work out any kinks, make sure everyone is clear on his or her role, and estimate the time needed per session. This is especially important for less-experienced evaluators.

Conducting the Sessions
When formative evaluation sessions are well planned and scripted, the only surprises should come in the feedback. The detection of significant issues may prompt a redesign before any further testers are brought in.

During the sessions, the evaluator needs to remember to elicit useful information. For example, if a user says, "I don't like these colors," an evaluator should ask, "Why don't you like them?" or "Are there colors you would prefer?" and then follow up by asking why the user believes the other colors would work better.

Compiling Results
The evaluator and scribe should review notes as soon as possible after a session, ideally before the next session, and enhance the notes with any additional thoughts or observations. It is important that they discuss how the formative evaluations themselves are going so that improvements can be made if necessary. People who act as evaluators or scribes improve and become more comfortable over repeated sessions, and feedback helps them to be more effective in their roles.

The documentation from each session should be shared among the evaluator, scribe, and observers immediately afterwards for enhancements or clarifications. Producing an executive summary that highlights session outcomes is useful for anyone who was not at the sessions but wants to know how they went.

Review and Prioritize
The final step of a formative evaluation is to review the results and determine what actions to take. Designers and developers should provide input since they are most likely to know how easy or hard it will be to implement changes.

First, the notes should be mined for positive and negative feedback, especially anything that can be acted upon. This is where the more-detailed feedback from users pays off, since "I don't like it" doesn't help much without an explanation of why.

Next, feedback is prioritized by criticality and by the difficulty of making the change. Actions are typically revisions to the concept, design, or product, though mixed reactions may make it clear that more testing is needed, perhaps with different users. A prioritization scheme is used, such as 1 (easy to change) to 3 (difficult to change) and A (critical to fix) to C (would be nice to fix). Having few actions to take is ideal, of course, but among those that remain, the best finds are "1-A": critical fixes that are easy to make.
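
As a rough illustration of this triage, here is a minimal Python sketch that sorts findings so that "1-A" items surface first; the findings, user counts, and field names are hypothetical.

    # A minimal sketch (hypothetical findings) of the 1-3 / A-C scheme:
    # difficulty runs from 1 (easy to change) to 3 (difficult to change),
    # criticality from "A" (critical to fix) to "C" (would be nice to fix).
    findings = [
        {"issue": "Palette reads as too Google-like", "difficulty": 1, "criticality": "C", "users": 4},
        {"issue": "Learners cannot find the quiz from the lesson page", "difficulty": 1, "criticality": "A", "users": 7},
        {"issue": "Audio drifts out of sync on slow connections", "difficulty": 3, "criticality": "B", "users": 2},
    ]

    # Sort most-critical first ("A" before "C"), then easiest-to-change
    # first, so critical-and-easy "1-A" items float to the top.
    for f in sorted(findings, key=lambda f: (f["criticality"], f["difficulty"])):
        print(f'{f["difficulty"]}-{f["criticality"]} ({f["users"]} users): {f["issue"]}')

In a real report, each such line would be paired with the brief explanation and proposed solution described below.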

Using the above example of color, if a page was designed using lots of pale blue, with a logo in blue, red, yellow, and green, users might say that it looked too Google-like. This might be considered a fairly minor problem ("C") yet one that is easy to fix ("1") by changing the color palette and making sure the new colors aren't borrowed from another frequently used Web site. A report would then state the number of users who commented, the prioritization ("1-C"), and a brief explanation of the problem and the proposed solution.

The more important examples, of course, are less cosmetic. For a learning application, the ideal situation is to discover in the formative evaluation that people are actually learning what they are intended to learn.

An Example of a Formative Evaluation
The most extensive formative evaluation I have been involved with was done for Plimoth Plantation's Online Learning Center, You Are the Historian: Investigating the First Thanksgiving. The first version of the site was evaluated by 33 graduate students at the Harvard Graduate School of Education, who tested it with almost 100 teachers and children as one of two class projects in a semester course on formative evaluation. The evaluation results led to significant redesign.

There were many positive findings, such as that the visual richness and the use of audio enhanced the appeal of the site. We also found that children and teachers wanted more activities, more audio, and a stronger role for the children who were incorporated into the site as guides. Children who took part in the formative evaluation identified with the guides and were curious to find out more about them. They liked the juxtaposition of modern children with those from 1621. Through pre- and post-testing we also learned that learning was taking place: After using the site, children knew more about what a historian is, what a myth is, and what happened in 1621. Interestingly, many also went from talking about Pilgrims and Indians to talking about the early colonists and the Wampanoag.

The subsequent redesign enhanced the existing features that were especially appealing to children and made significant changes to address the perceived inadequacies. Some screen shots and further detail can be found in Making Learning Fun: Plimoth Plantation's Online Learning Center. The site has won a number of awards and has been adopted by countless classrooms around the world. Some of its success can be attributed to what was learned in the formative evaluation.



