
Do Serious Games Work? Results from Three Studies

By Richard Blunt / December 2009


"Military recruits and entry-level civilians of today not only understand technology in everyday use; they expect it," says Mark Oehlert, the Department of Defense's director of Game and Simulation department at the Defense Aquisition University.

These young workers are digital natives, raised in an environment where they were surrounded by inexpensive, yet highly interactive systems. Today's college generation grew up with video games from infancy. With games and technology at their fingertips, they process more information faster, and in a much different way, than most older people do.

The following studies may help answer some of the questions now surrounding serious games—games whose primary purpose is something other than entertainment, such as military training, education, or physical therapy—and determine the relationship between the use of video games and learning as measured on standardized tests. More research is needed, but these findings provide some answers to both skeptics and supporters.

Purpose of the Studies
As recently as three years ago, Dr. Jan Cannon-Bowers (2006), an eminent researcher in the science of learning, challenged the field to produce evidence for game-based learning and serious games at the Training 2006 Conference and Expo:

"We have plenty of empirical studies about simulations over the last 25 years. We know simulations work. We know simulations improve performance. We know simulations improve learning. Yet, I challenge anyone to show me a literature review of empirical studies about game-based learning. There are none. We are charging head-long into game-based learning without knowing if it works or not. We need studies."

These studies began in 2005 and took two years to complete. At the time, there were no established quantitative studies that showed anything like return on investment or "return on learning."

Return on learning (ROL) refers to metrics that show improvements in grades, increased student throughput, decreased education or training costs, or faster learning.
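
To make the idea concrete, here is a minimal sketch of how ROL-style metrics might be computed. All numbers are hypothetical; nothing here is data from the studies:

    # Hypothetical ROL calculation in Python -- illustrative numbers only,
    # not data from the studies described in this article.

    def percent_change(before, after):
        """Relative change from before to after, as a percentage."""
        return 100.0 * (after - before) / before

    mean_grade_before = 78.2    # class mean before adding a game (hypothetical)
    mean_grade_after = 84.5     # class mean after adding a game (hypothetical)
    cost_before = 410.0         # training cost per completion (hypothetical)
    cost_after = 365.0

    print(f"Grade improvement: {percent_change(mean_grade_before, mean_grade_after):+.1f}%")
    print(f"Cost change:       {percent_change(cost_before, cost_after):+.1f}%")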

Previous studies were vague and not scientifically conducted, citing findings that "supported" learning or "reinforced" learning, or noting that students "enjoyed" the game aspects; hence Dr. Cannon-Bowers' frustration.

These three studies were among the first hard-data studies of game-based learning, which we now call serious games. Consequently, their purpose was to determine the relationship between the use of video games and learning.

Research Design and Methodology
Three studies were conducted at an East Coast university to examine the difference in academic achievement between students who did and did not use video games in learning.

A video game was added to half the classes teaching first-year business students, third-year economics students, and third-year management students. Identical testing situations were used in each respective course, and the data collected included game use, test scores, gender, ethnicity, and age.

A battery of statistical tests was used to test game use effectiveness. The various data groups were compared using a bank of standardized test questions provided with the course text. All students used the same text for their respective business, economics, or management course. Using questions from the texts therefore ensured that, apart from game use, students had the same access to text and class content, reinforcing the credibility of attributing differences in results to participation in the game.
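
The article does not publish the raw scores, but comparing exam-score means between two independent groups is classically done with a two-sample t test. A minimal sketch in Python, using synthetic scores in place of the unpublished data (only the group sizes, 801 and 227, come from Study 1 below):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    # Synthetic stand-ins for exam scores on a 0-100 scale; the means and
    # spreads are invented, only the group sizes come from Study 1.
    no_game = rng.normal(loc=75, scale=10, size=801)
    game = rng.normal(loc=80, scale=10, size=227)

    # Welch's two-sample t test: are the group means different?
    t_stat, p_value = stats.ttest_ind(game, no_game, equal_var=False)
    print(f"mean (game) = {game.mean():.1f}, mean (no game) = {no_game.mean():.1f}")
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")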

Results
Study 1: Business Students and Industry Giant II
The first research study was undertaken at an East Coast university to examine the effectiveness of adding a simulation game as a supplement to an "Introduction to Business and Technology" (BUSN 115) course.

Approximately one-fifth of the students participated in game play, drawn randomly from across course sections and instructors.

The overall purpose of this study was to examine the effectiveness (by comparing exam score means) of the addition of the video game, Industry Giant II, as a supplement to BUSN 115.

The research question: What is the difference in academic achievement between students who use video games in learning and those who do not? In this study, 801 students did not play the game and 227 students did.

As one student said about her experience in the course:

"It was really good having the simulation there because the games really brought it home. While the theory is really great, playing around with the variables we actually saw what would happen in the simulation. It made us really interested in what a business is because when you read about it, it's just ink and you stick the book in the back of your closet when you're done. But in the game, it's an experience that really stays with you, you actually learn about running a business. 'I've done this, I know how this works and why it's good for my business.'"

Figure 1 shows the means of test scores with and without game play, and Figure 2 shows the distribution of grades with and without game play.

Figure 1. Mean test scores with and without game use for BUSN 115 students.

Figure 2. Distribution of letter grades for BUSN 115 students.

Study 2: Economics Students and Zapitalism
The second research study tested whether adding a simulation game to a college-level economics course improved student understanding and application of concepts, as measured by standardized tests. Game participation proved significant: students who played the video game Zapitalism showed a substantial improvement in test scores.

The overall purpose of this study was to examine the effectiveness (by comparing exam score means, and gender means) of the addition of Zapitalism as a supplement to the ECON 312 "Principles of Economics" class.

In this study, 234 students did not play the game, while 322 students did. Figure 3 shows the means of test scores with and without game play, Figure 4 shows test scores by gender, and Figure 5 shows the distribution of grades with and without game play.

Figure 3. Average test scores with and without video game use for ECON 312 students.

Figure 4. Test scores for ECON 312 students by gender, with and without game play.

Figure 5. Distribution of letter grades for ECON 312 students, with and without video game use.

Study 3: Management Students and Virtual-U
The third study used a causal-comparative design to examine the difference in academic achievement between students who did and did not use video games in learning.

The game Virtual-U was added to half the classes teaching third-year management students. Identical testing situations were used, and the data collected included game use, test scores, gender, ethnicity, and age. ANOVA, chi-squared, and t tests were used to test game use effectiveness.
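
As a rough illustration of how such tests apply to data of this shape (all scores, group labels, and counts below are synthetic, not the study's data), in Python:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    # Synthetic exam scores for four hypothetical age brackets of game players.
    age_groups = [rng.normal(loc=m, scale=8, size=60) for m in (82, 81, 80, 74)]

    # One-way ANOVA: do mean scores differ across more than two groups?
    f_stat, p_anova = stats.f_oneway(*age_groups)
    print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.4f}")

    # Chi-squared test of independence: is the letter-grade distribution
    # independent of game use? All counts are hypothetical.
    #                A    B    C   D/F
    observed = [[40, 80, 50, 20],   # played the game
                [25, 70, 75, 45]]   # did not play
    chi2, p_chi2, dof, expected = stats.chi2_contingency(observed)
    print(f"Chi-squared: chi2 = {chi2:.2f}, dof = {dof}, p = {p_chi2:.4f}")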

The overall purpose of this study was to examine the effectiveness (by comparing exam score means, gender means, ethnicity means, and age means) of the addition of the video game as a supplement to the MGMT 303 "Principles of Management" class.

One student's perception: "The greatest impact I got from that course was the hands-on experience using Virtual-U. It helped us improve and learn more from the course."

Figure 6 shows the means of test scores with and without game play, Figure 7 shows test scores by gender, Figure 8 shows test scores by ethnicity (note: no Asian students were enrolled in the no-game group), Figure 9 shows test scores by age, and Figure 10 shows the distribution of grades with and without game play.

Figure 6. Average test scores with and without video game use for MGMT 303 students.

Figure 7. Test scores for MGMT 303 students by gender, with and without game play.

Figure 8. Test scores for MGMT 303 students by ethnicity, with and without game play.

Figure 9. Test scores for MGMT 303 students by age, with and without game play.

Figure 10. Distribution of letter grades for MGMT 303 students by ethnicity, with and without game play.

Conclusion
Between 2006 and 2008, more than $200 million was spent on serious games without evidence of whether they work. The problem addressed by this research was to determine the relationship between the use of video games and learning.

The findings show that classes using the game had significantly higher mean scores than those that did not. There were no significant differences between male and female scores, regardless of game play, while both genders scored significantly higher with game play than without. There were no significant differences between ethnic groups, while all ethnic groups scored significantly higher with game play. Lastly, students aged 40 and under scored significantly higher with game play, whereas students aged 41 and up did not.

In short, the studies found that, at least in some circumstances, the application of serious games significantly increases learning.

The challenge now is to expand this type of research to further understand the efficacy of using serious games. The potential of serious games to create new expectations of learning and performance achievement remains to be proven. More studies like these, demonstrating the return on learning of serious games, will help decision makers overcome the cultural resistance to embracing serious games and game-based learning.



Comments


  • Thu, 08 Apr 2010
    Post by Robert S. Becker, PhD

    It seems to me that much evaluation of academic and corporate learning - including the studies reported by Rick Blunt - measures only cognitive gains. This roughly translates into the ability to think better.

    The problem with this is that serious games run where traditional instruction fears to tread, in the affective and conative domains. But the evaluative strategies mentioned in this discussion don't go there, except when briefly quoting participants who loved their learning experience and felt like they accomplished something important in their lives, beyond getting a good grade from the teacher. Should we look to serious games to do what traditional instruction is doing already? That limits and undermines the promise of the technology. Instead we should measure what games are uniquely empowered to do. On a superficial level, games attract, excite, engage, and satisfy players. On a deeper and more critical level, games endow players with the ability to learn, to apply what they know and feel, and to attain a self-determined outcome. If we are capable of measuring these powers along with the cognitive gains, then I think the research will produce more useful findings.

  • Thu, 25 Mar 2010
    Post by Ben Sawyer

    Crap... my comments didn't post! Thanks cold fusion... Oh well I'll have to come back another day. Glad to see VU was useful. If my extensive comments are found lying in the DB somewhere hopefully they'll get posted otherwise maybe this weekend...

    - Ben

  • Tue, 16 Mar 2010
    Post by Simon Egenfeldt-Nielsen

    I think that the fantasy that we do not have evidence and need more needs to be killed once and for all. There are actually quite a few studies, but they will never solve the underlying issue: namely, that we need to think of assessment, learning, and games differently than other educational forms. See my short article from 2006 that sums up what there was back then (http://egenfeldt.eu/papers/game-overview.pdf)

    I do not see the above studies as more rigorous than a number of other studies (although still interesting), and they do not respond to the concerns that were raised at the Serious Games Summits over the years in relation to assessment (among others by Dr. Jan Cannon-Bowers), but have actually been pretty well known for a number of years in the research tradition most evident in the Simulation & Gaming journal (see my book from 2007 for more info - http://egenfeldt.eu/blog/?page_id=16 ).

  • Thu, 07 Jan 2010
    Post by Daniel Hickey

    Sorry--I just saw that some words dropped out of a sentence, which was supposed to say that I agreed with Mark Notess about the point on controlling for time on task. I did not mean to sound so disagreeable! But that also reminds me of a related point that gets at the difficulty of controlled comparisons of games versus "conventional" curriculum (whatever that is or happens to be). In the key comparison study of QA (Anna Arici's lovely dissertation, referenced in the aforementioned JSET article), some of the students in the QA classes were so enthralled with Taiga that they were logging on at home to play. Post hoc analysis revealed that it was not enough to really matter, but the fact that they were is actually a really important finding. Of course the kids in the text-based comparison classroom were not doing voluntary homework. But the pressure for "scientific" empirical evidence might have led us to block access. Screw that. In the next study, we instead invested in improvements to formative feedback across all four classes, and dramatically increased gains in understanding and achievement, resulting in key design principles that are useful to others, along with effect sizes so large they can't possibly be explained away by some free-time play. In retrospect, we probably should have also run with that initial finding and tested different design strategies for encouraging free-choice engagement in the QA conditions. Meanwhile, in Anna's initial comparison study, the teacher did a far better job teaching the conventional text curriculum (because he was familiar with that format) while he struggled a bit with QA. This biased the results against QA, but it still prevailed on both the problem-solving performance assessment and the standards-oriented achievement test. We could not have reasonably controlled for that, other than arguing that the advantage of the QA condition would have been even larger still. Good design-based research trumps good controlled studies in most circumstances!

  • Thu, 07 Jan 2010
    Post by Daniel Hickey

    This is an interesting article and a particularly interesting discussion thread. First, I disagree with the premise that we must have empirical comparisons (what they really mean to say is controlled studies, as "empirical" is a bit imprecise). I actually think that too many people rush into comparison studies before the technology gets refined. But technology is inevitable and fosters forms of engagement and learning that are not possible without it. In the words of Nora Sabelli in the PCAST report way back in the 1990s, "to think that education is the one information-based industry that will not eventually be transformed by technology is absurd--so let's stop trying to somehow 'prove' that technology 'works'."

    But that is not my main issue. The real problem I have with these studies is that they equate increased scores on an undescribed assessment with "academic achievement." In the absence of any effort to warrant the validity of these assessments, either in the summary or in the longer paper, we really can't draw any meaningful conclusions. The outcomes were more than likely primarily the result of the degree to which successful participation in the game transferred to more successful participation in the exam. In general, if the test was made up of selected-response items focused on basic factual knowledge, a simulation may actually undermine performance if you control for time on task (because games are not very effective at teaching such knowledge). But if the test instead featured a complex performance task or essay where individuals had to explain complex relationships and perhaps provide a rationale for their answers, then the game may help a lot. If the test focused entirely on stuff that was unique to the game, then the game would make an even bigger difference.

    This is one of the reasons I am a bit jaundiced towards shrill calls for more "empirical" studies. Random assignment in particular is very difficult to do well in most educational settings, and controlling for confounds in quasi-experimental designs is surprisingly difficult. When this gets coupled with the challenges of validity, you end up with a bunch of effect sizes that wear the mantle of science but are mostly deliberate or naive bias. You get statistical significance but don't end up with any guidance for building more effective games.

    Don't get me wrong. I have done lots of quasi-experimental studies of all different kinds of innovations (most recently with Quest Atlantis), and some randomized trials too. I just think those studies should come much later, after earlier design studies to maximize learning. Then you have to pay a lot of attention to the relative validity of the measures. In my experience, if you don't have pre-post measures OR randomized assignment OR a massive effect size (at least .7 SD), then it just does not matter if the differences between groups were unlikely to occur by chance.

    Thanks for the thread folks! I am just starting a sabbatical and hope to spend more time hanging out here.

  • Thu, 10 Dec 2009
    Post by Richard Oppenheimer

    In light of both Clark's and Jill's comments, clearly what we used in the pre-computer, pre-internet days that I described was a simulation rather than a game, which Jill classifies as being an activity that includes virtual conflict and that has a win/lose outcome, per her conversation with Zimmerman. This is, of course, the traditional definition of a game, thus Clark's excellent insight that we need to clarify our definitions of games and how they are implemented in the learning environment. That completing practical tasks tends to increase student retention and participation has been proven and reported often in the literature. I think Clark is correct that the first goal is to recognize the opportunity for employing whatever we define as a "game" in the learning experience, which begs the question of who designs, develops, and performs the rigorous testing required to prove an activity's effectiveness in a specific course or situation, as reported in the Blunt article.

    That games may require, by default, the win-lose paradigm, as Jill/Zimmerman state, leads me to believe that Clark's postulate is spot on. If the goal is to ensure that more students take practical and theoretical skills from the learning event, one would have to carefully design the "sim/game" (if you all do not mind my using this term for now) to ensure not only relevance but also timeliness in the learning outcomes. Obviously it would be of little use today to provide extensive skills in FORTRAN through a sim/game experience when teaching future programmers; however, cloud computing concepts would be of definite value for today's students--for a while, at least.

    In the business curriculum, a designer would have to ensure that the concepts in the sim/game were currently viable or basic enough to be of value for some period of time. Though we might change texts every year or so, even in today's rich tools environment, extensively modifying or changing the sim/game every term would not permit adequate time for the required testing to occur--even if budget money were available for both activities! Minor tweaks based on slight changes or ongoing testing would certainly be possible, assuming that the original designer was still around or had created and left behind copious notes for the next programmer/designer.

    So I believe we face several challenges to ensure that sim/games, in whatever terms we define them, not only measurably increase student performance and retention, but are also available in a practical, cost-efficient version for the typical college classroom. Taking this one step further, if the curriculum design were such that the sim/game was part of a two-or-more course sequential offering, students would be able to build upon their skills while progressing in their major. (Students who place out of Course 101 would offer department chairpersons additional challenges in such an environment, but I digress...)

  • Tue, 08 Dec 2009
    Post by Jill Duffy

    Games have a scoring system and a "winner" or win outcome. Sims don't necessarily have one (though they can). In a sim, the object is to complete the simulation. These thoughts come from a conversation I had with Eric Zimmerman, who wrote the book Rules of Play with Katie Salen, in which they state: "A game is a system in which the players engage in an artificial conflict, defined by rules, that results in a quantifiable outcome. ... At the conclusion of a game, a player has either won or lost or received some kind of numerical score."

  • Tue, 08 Dec 2009
    Post by Clark Quinn

    I find it indicative of the problem to see Richard's comments about using a simulation game versus a video game in juxtaposition with Jan's quote which opened the article. We have a definitional problem here: a game like Virtual U *is* a simulation, just tuned into a game. So what is the difference between simulations and "game-based learning"?

    I'm frankly surprised at the concern for 'game-based learning'; as noted, we know that application of concepts leads to better retention. A la Mark's comment, we should be investigating specifically when and how we should tune simulations into games, not whether we should.

  • Tue, 08 Dec 2009
    Post by Richard Oppenheimer

    While I was not able to perform scientifically-sound studies 30 years ago when teaching as a graduate assistant at the University of Florida, I was able anecdotally to observe increased participation, increased comprehension, and slightly higher grades when I added the game "Relocation: A Corporate Decision" (sorry, I forget the publisher's name) to one of the Business and Technical Writing courses I was teaching. Clearly this is not a video game, but a traditional simulation game overlaid on the syllabus and course objectives and supported with the normal textbook and additional readings used in this class. I worked with a recently-relocated local company that made bows and arrows--Bear Archery--as a real-world example for the students and, following the game instructions, we visited and interviewed corporate executives during the term. Simultaneously, my other section proceeded normally and followed the same syllabus without the simulation. It became clear to me after the first set of papers were submitted that those students who were participating in the "gaming" section were significantly more engaged in the course than those who were completing identical assignments in the "traditional" class.

    At the end of the term, final grades were slightly better for the "gaming" section, but intrinsic and extrinsic motivation was clearly evident and anecdotal comments on the "happiness" sheet (what Dr. Sterling Hennis at UNC called the course evaluation form when I studied with him) showed that those engaged in the simulation felt that they had gained some measure of real-world experience during the class. Of course, going off-campus as a class for one interview session, and requiring the students make an appointment and go to the corporate offices on their own at least once more during the term, provided some change from the normal classroom-only environment, and some students were pleased to do that while others found it to be an added chore, which was not the term they used, as I remember.

    Afterward, however, I strongly suggested to the University that we might want to add gaming to this class or, possibly, work with the business school to devise a way to combine a simulation and writing class in some way--what we call Writing Across the Curriculum (WAC) today, of course. Unfortunately, I moved from academia to industry soon after this term ended, but I have never lost sight of the value of a properly designed and delivered simulation game in the traditional college classroom. Given today's electronic environment, I would strongly support such an approach given the proper design and planning for the course. It is interesting to see that this study seems to provide the proper support for such an approach.

  • Sat, 05 Dec 2009
    Post by Teicko

    Are these truly serious games, or do they fall into the category of educational games? I've been under the impression that serious games are a much smaller class of games.

    You may find this research from Forrester about serious games intriguing.

    http://www.enthiosys.com/news-events/forrester-serious-gaming/

  • Fri, 04 Dec 2009
    Post by Guy Boulet

    I understand Mark's concerns, and nothing in the reports clearly indicates whether those games were used in addition to regular learning activities or replaced some of them. If they replaced other activities, then the results are quite significant, since they show that games are far more efficient than the activities they replaced. But if they were used in addition to regular activities, it only shows that extra activities increase learning, which was already known.

    It would have been important to make that clarification in the report.

  • Wed, 02 Dec 2009
    Post by Jill Duffy

    I see what Mark is getting at with his comment, but it doesn't explain why people in the 41-50 age range did NOT see an increase in their grades after adding the supplemental game. I think there's more to it, and I agree that more studies are needed.

  • Tue, 01 Dec 2009
    Post by Mark Notess

    Interesting article--I also took a look at the more detailed writeup of the studies. The main problem I have with these studies is that they don't demonstrate a game-based learning supplement as superior to any other kind of supplemental learning activity. A better study would have some students participate in another kind of supplemental activity, such as an online study group, and the other students in the game supplement. That way, we don't have only one group getting an extra activity. I'd argue that these studies merely show that extra learning activities tend to increase learning outcomes. They don't illuminate the benefits, much less the cost-benefit ratio, of serious games. Unless I'm missing something here about your methodology.