Rating the Unratable

I think I’ve mentioned before that if you haven’t read it, Atul Gawande’s Better is a great book about how things can get systematically improved. It focuses on medicine, but it’s one of those all-purpose insightful books.

Anyway, one section that really stuck with me was the one on the Apgar score (article version here, for the curious). For the unfamiliar, this is a rating given to babies one minute after they’re born and again at five minutes. It rates five things (Complexion, Pulse, Movement, Breathing and Irritability) to quickly generate a score from 0-10. In and of itself it’s not a very detailed piece of information, but it’s simple, easy to assess, easy to communicate, and generally makes an excellent shorthand for the child’s health.
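
For the concretely minded, the arithmetic is about as simple as a score can be. Here is a minimal Python sketch, assuming the standard clinical convention that each of the five signs is rated 0, 1 or 2. The function and key names are my own illustration, not anything from the book.

```python
# The five signs as named in this post; each is rated 0, 1 or 2.
APGAR_SIGNS = ("complexion", "pulse", "movement", "breathing", "irritability")


def apgar_total(ratings):
    """Sum the five 0-2 ratings into a single 0-10 score."""
    missing = [s for s in APGAR_SIGNS if s not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for sign in APGAR_SIGNS:
        if ratings[sign] not in (0, 1, 2):
            raise ValueError(f"{sign} must be 0, 1 or 2")
    return sum(ratings[sign] for sign in APGAR_SIGNS)


# Illustrative ratings only, not real clinical data.
example = {"complexion": 1, "pulse": 2, "movement": 2, "breathing": 2, "irritability": 1}
print(apgar_total(example))  # 8
```

The point is less the code than how little of it there is: five quick judgments and a sum.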

Now, the specifics of the Apgar score are pretty interesting in their own right, but what’s much more interesting is the positive impact it had on successful births. In an illustration of the truism that you must measure something in order to improve it, Apgar scores gave hospitals a yardstick to measure their performance by, so they had something concrete to improve and to judge results by.

What gets me, and what brings this across to gaming for me, is that part of the success lay in the somewhat arbitrary nature of the scoring system. There are a bazillion variables at play when a baby is born, and picking those 5 and saying they’re the ones to score is, from a certain perspective, almost capricious. But, as with a lot of things, it seems to be one of those cases where making a good decision is a much better path than indefinite delay in trying to find the perfect solution.

So, with that in mind, I’m busting out a list of things GMs do. It’s probably a bad list, but I want to start somewhere. Honestly, I doubt we can come to something nearly as useful as the Apgar score, and even if we come up with a list, there’s a whole question of how to use it, but dammit, you have to start somewhere.

When I started on the list I realized that the biggest muddle I encountered was between the GM’s “Solo fun” (that is, design work) and actual play at the table. Both are important, but since I’m trying to take a practical tack on this, I chose to think in terms of how play went at the table. That is, I’m looking to rate things that the GM does in play, perhaps as a list to run through at the end of a session and see how each of these things went.

Removing those non-table elements shortened the list dramatically. I still don’t feel it’s as solid as it could be, but here it is:

  • Playing interesting NPCs (Strong character voice)
  • Setting Presentation (how well does the world hang together?)
  • Scene setting
  • Engaging challenges (Puzzles, fights)
  • Rules mastery
  • Humor
  • Creating Emotional engagement
  • EDIT BASED ON MANY COMMENTS – Maintaining Focus/Pacing.

So what on that list sucks, and what is it missing? Or is the entire methodology flawed, and the list should be entirely different?

30 thoughts on “Rating the Unratable”

  1. David 'Doc Blue' Wendt

    I am not familiar with the development of the Apgar score, but often such scores result from studying MANY variables and then mathematically identifying which combination are most predictive.

    One could take a similar approach to GMing. Brainstorm a large (exhaustive) list of factors, score them, and score the relative success of the game independently. Then collect all of the data and use mathematics to identify the most predictive combination of factors.
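
    As a rough illustration of that idea, here is a minimal Python sketch that computes a Pearson correlation between each factor and an independently recorded overall rating. The session data, factor names and scales are all made up for the example, and a simple per-factor correlation is only a stand-in for the fuller analysis of combinations of factors described above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per session, factors scored 0-2,
# overall success rated separately on a 1-10 scale.
factor_scores = {
    "npcs": [2, 1, 0, 2, 1, 2],
    "rules": [1, 1, 2, 0, 2, 1],
    "pacing": [2, 1, 0, 2, 0, 2],
}
overall = [9, 6, 3, 8, 4, 9]

# Rank factors by how strongly they track the overall rating.
ranked = sorted(
    ((name, pearson(scores, overall)) for name, scores in factor_scores.items()),
    key=lambda pair: abs(pair[1]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name}: r = {r:+.2f}")
```

    With only a handful of sessions the numbers would be noisy, so this is the shape of the approach rather than a way to draw conclusions.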

  2. Sean

    I realize you’re musing aloud, as it were, but a compact, comprehensive list is probably the direction to go. There is great strength in brevity. My ideas generally begin complex and I have to analyze, parse, and pare down. To that end, here are my suggestions:

    Humor can be dropped, as not all games need or should have it; take horror, for example. I’d replace it with Atmosphere. When that’s done, emotional engagement and scene setting could be collapsed into it as well. This reduces the list down from seven to a more palatable five.


  3. Ethan Duty

    The really shorthand way to address the issue is “engagement”. How well do you engage your players and get them to surrender time and brain power to beat imaginary kobolds? At this point, even the rules knowledge rating goes out the window.

  4. Jason

    Okay, comments!

    1. How do you distinguish between “setting presentation” and “scene setting” at the table?
    2. Can “emotional engagement” be merged with one of the other categories?
    3. Keep humor, or expand it into the category of “player management” — the ability of a GM to keep a group connected with each other and with the game. Humor’s a great tool here, even in non-comedic games.

  5. Mountzionryan

    I wonder if you’re not being fundamental enough.

    What about interpersonal communication?

    I know of some decent GMs that have poor habits in this regard.

    Also decision making. GMs need to be able to make quick rulings with confidence.

  6. Zack Walters

    I’m pretty comfortable with the list you have, but I think it’s lacking something along the lines of “player involvement in design.” It’s not that this can’t be covered by your current bullets, but I feel like they’re currently leaving the players fairly passive.

  7. Evan

    I think Sean has some valuable insights. Not to undo his work, but I think that fairness, or a sense of fairness in adjudication, could be very important. Also, from my recent experience at the “Taste of Savage Worlds” event locally, I observed that our GM Paul Charran did a great job in reinforcing positive play and defusing distractions/digressions (often led by my enthusiastic but inexperienced 13-year-old). Some of this might be captured in the emotional engagement, but I think there needs to be something to capture the GM’s appropriate control at the table. In some groups, a GM really needs to be an active director of the action to move things along and keep things on track. That same GM also sometimes needs to sit back and let the players riff off each other and create the game for themselves, to essentially get out of the way (and remove other obstacles even) so that the “art” can happen. Both are hard skills to do well, and for some, including me, knowing where to go on the sliding scale of control is a challenge.

    I think a useful score has to capture something of this.

  8. Benoit

    How about “Player Involvement” or “Involves all Players”?

    I like “Scene Setting.”

    Some things on your list depend upon whether the GM is using a pre-published module or not; interesting puzzles are not up to the GM if he didn’t write the adventure.

    Also, maybe list some things that each category includes. For example: “Scene Setting: Includes all relevant details, describes for all 5 senses, includes non-relevant details to add color to the scene.”

  9. Rob Donoghue

    Several comments about player management, keeping things focused and on track definitely point to a gap. I’d probably call this “Focus” (or maybe “pacing”). This has come up enough that I’m editing it into the list.

  10. Arcane Springboard

    Interesting puzzles doesn’t just mean in the design, but the implementation. In fact, given this is an ‘at the table’ list, it should pretty much exclusively be implementation.

    Did you give the players enough clues to solve the puzzle? If they had difficulty did you deal with it?

    In combats, did you do your best tactically? Did you remember all of the monsters’ abilities/traits?

    I agree though that you should limit it to 5, maybe 6. If it’s much more than the digits in a phone number it’s not concise enough IMO.

  11. captainindigo

    Here’s a list that reflects my understanding of what you’re trying to measure.

    The World
    The Rules
    The Players
    The Characters
    The Story

    Obviously these are crap terms w/ too little focus, but I wanted to make sure I was on the same page in terms of what you were trying to examine and measure.

  12. Arcane Springboard

    Also, I’m not sure how the setting presentation really deals with the ‘at the table’ aspect. Isn’t that more of a preparation aspect (which deserves its own 5 part score, I think)?

  13. Mountzionryan

    If the Apgar is your model, I don’t think your list is fundamental enough.

    Here’s mine:
    1 Interpersonal Communication
    2 Decision Making
    3 Presentation (Playing interesting NPCs, Setting Presentation, Scene setting, maybe Engaging challenges, and Creating Emotional engagement all fall into this category)
    4 Rules Mastery

  14. T.W.Wombat

    First thing that hit me: Table Presence. I see it as a broader category that contains pacing/focus and other things like dealing with rules lawyers and hecklers.

    I agree that humor isn’t appropriate for every game (though it’s solidly in most of mine), so I’d go with something setting-based.

    And short list is very much better. Longer than 5 or 6 and it’ll be too unwieldy to use.

    Another option is to get in touch with Ed Healy and see if he’ll send you a copy of the Iron GM rating sheets to get another perspective. I remember there being a “Pure Gut Reaction” question: “Was this an Iron GM performance?” There are more questions and a broader range of responses, but you can get a feel for the point of the questions. Rules mastery, how “tight” the game ran, player involvement and engagement all came up on that rating sheet.

    Good idea – using a system like this for self-improvement sounds really helpful.

  15. senatorhatty

    I think the difficulty with the comparison to APGAR is that doctors know what is actually healthy, whereas it’s much harder (for me, anyway) to tease out what elements make a GM good, and then whether a given GM would be viewed as, in general, good, as opposed to just suiting my fancy. What the doctors chose to rate may be arbitrary, but the arbitration has already been done on what is “good” in each of those 5 factors.

    Rules Mastery, for example, is less important to me than what I’d call “rules flow.” Which is, I guess, the GM being able to run the game in the system without the rules disrupting the game enjoyment. Sometimes this can be done with rules mastery, but sometimes it can be done by a rules journeyman who is willing to improvise in ways that are enjoyable.

    As I type this, I realize some of my friends would object to this (“why pretend you are playing MnM if you aren’t going to use the rules?”), while others would embrace it, but only if the GM is fair in the application of handwaving.

  16. Noumenon

    I don’t want to criticize if your plan is to identify the critical areas and get into the scoring tomorrow, but this isn’t usable yet. Compared to the Apgar, how do you score these?

    An infant got two points if it was pink all over, two for crying, two for taking good, vigorous breaths, two for moving all four limbs, and two if its heart rate was over a hundred. Ten points meant a child born in perfect condition. Four points or less meant a blue, limp baby.

    Humor gets one point if your players joke during the session, two if you do? Emotional engagement gets one point if your players argue, two points if they cry?

    In the rest of medicine, we measure dozens of specific things: blood counts, electrolyte levels, heart rates, viral titers. But we have no measure that puts them together to grade how the patient as a whole is faring. It’s like knowing, during a basketball game, how many blocked shots and assists and free throws you have had, but not whether you are actually winning.

    We need to be able to rate individual sessions and differentiate them from each other. Something like “strong character voice” is generally going to be the same every session. It’s like rating a doctor on “forceps skill.” That stays the same or improves over time. In contrast, rating on heartbeats per minute tells if the doctor squeezed the forceps too hard on this baby, as well as whether this baby has a well-developed heart and other factors.

    tl;dr: Base the scoring on yes or no questions. Score the session, not the GM.

  17. Goken

    I don’t think “Focus” is quite right, though that isn’t bad. The worst GMs I’ve seen are usually not taking charge and giving guidance where needed. Strong leadership isn’t always needed, depending on the table. I’d say the most important thing for a GM to do is to do whatever is needed based on that table’s needs. Best described by Chatty’s stages: http://critical-hits.com/tag/the-4-stages/

  18. Fred Hicks (Evil Hat Productions)

    If you’re building a list for assessment like the Apgar score, the length of the list needs to be short, and the elements on the list should be specific, measurable, and (if the rating is not a 2 on a 0 to 2 scale) actionable.

    As I’ve been saying on twitter, I’d eliminate anything from the list that’s particularly broad, as I find it’s usually a category that contains other specific, measurable, actionable items already on the list…

  19. Kit

    Somewhere on SG, Lula mentioned how we used to divide authority, on the movie-crew model: director, producer, set designer, fight choreographer, etc. I’ll see if I can reconstruct that list, or find the thread.

  20. Pôl Jackson

    In addition to a simple, Apgar-like rating system, I would also want to track some other variables. For example, I’d be interested to see what effect ‘hours of prep time’ has on the score, over time.

    I’d love to try out a GM scoring system! I imagine I’d rate myself privately after each session, but also get anonymous ratings from the players. I wouldn’t worry too much about the players’ ratings, unless I start seeing a huge gap between ‘how I think I’m doing’ and ‘how they think I’m doing’.

    For non-traditional RPGs that share GM duties between the players, I think a rating system is still useful. The person who owns the game is often in charge of evoking setting and explaining rules, even if they’re not ‘running’ it.
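
    The comparison between ‘how I think I’m doing’ and ‘how they think I’m doing’ could be as simple as the sketch below. This is a minimal Python illustration: the 0-10 scale, the sample numbers and the 2-point threshold for a “huge gap” are all assumptions made up for the example.

```python
from statistics import mean


def rating_gap(self_score, player_scores):
    """Return the self-rating minus the mean of the players' anonymous ratings."""
    if not player_scores:
        raise ValueError("need at least one player rating")
    return self_score - mean(player_scores)


# Hypothetical numbers on a 0-10 scale; the 2-point threshold is an assumption.
GAP_THRESHOLD = 2
gap = rating_gap(8, [5, 6, 4, 5])
if abs(gap) >= GAP_THRESHOLD:
    print(f"Worth a closer look: gap of {gap:+.1f} points")
```

    A positive gap means I rated the session higher than the table did, and a negative one means the players enjoyed it more than I thought.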

  21. Scott Dunphy

    11 years ago, Tim White (of Return to Northmoor) came up with a scoring system that – almost exactly – fits what is being discussed here. The criteria are Preparation, Positive Energy, Focus, Fairness, and Fun. Just like APGAR, the max score is 10 (2+2+1+2+3). While it was (and still is) used to give out “Best GM” prizes at cons in Denver, it is primarily intended as a GM improvement tool. I monitored the stats from several conventions and it definitely worked for lots of people who took it seriously. Not to mention that I used it to improve my own score.

    Here’s a link to the scoring system: http://www.rp-artisans.org/downloads/GM_Rating_Sheet_v1.2.pdf?attredirects=0&d=1
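
    As a sketch of how a weighted sheet like that adds up, here is a minimal Python version. It assumes the 2+2+1+2+3 maximums apply to the criteria in the order listed, which is not stated outright above, and the function name is hypothetical rather than anything taken from the actual rating sheet.

```python
# Maximum points per criterion. Assumption: the 2+2+1+2+3 maximums follow
# the order in which the criteria were listed.
MAX_POINTS = {
    "preparation": 2,
    "positive_energy": 2,
    "focus": 1,
    "fairness": 2,
    "fun": 3,
}


def gm_score(points):
    """Total a rating sheet, checking each entry against its maximum."""
    total = 0
    for criterion, cap in MAX_POINTS.items():
        value = points[criterion]
        if not 0 <= value <= cap:
            raise ValueError(f"{criterion} must be between 0 and {cap}")
        total += value
    return total


# Illustrative ratings for one session.
sheet = {"preparation": 2, "positive_energy": 1, "focus": 1, "fairness": 2, "fun": 2}
print(gm_score(sheet), "out of", sum(MAX_POINTS.values()))  # 8 out of 10
```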

  22. Reverance Pavane

    I have a much simpler list.

    Did they have fun?
    Did you have fun?
    Was play streamlined or turbulent?
    Were players engaged by the events of the game?
    Were you satisfied with what the players achieved?

    The first two are obvious, except that many gamemasters also tend to forget that they are there to have fun too. The next rates whether the game flowed smoothly or stuttered, stalled or took too many divergent or alternate paths. The last rates progress of the development of the campaign and its stories.

  23. Neil Smith

    I’m uncomfortable with singling out the GM as the one being scored. It reinforces the erroneous idea that the GM is the one who is responsible for the fun at the table. Most of the criteria apply equally well to everyone at the table, so everyone should be rated.

  24. sirvalence

    I second the suggestion of “Involves All Players”. While it is everyone’s responsibility (not just the GM) to ensure that everyone is engaged, it’s very easy for the GM to get too caught up in the most outgoing players and not check in with those that are a bit quieter or take longer to decide what to do.

  25. Ifryt

    For building a longer list, I’ll throw in one more element that is very important to me: consequences. That is, building the story organically from what the players give. It helps create a better story and keeps the setting consistent.

  26. Jacinta98

    I have two thoughts on this. Given it’s a subjective score with personalities involved, a friend will score you well, a stranger might misunderstand what you were reaching for in the game, a fan will love you and so on. The Apgar score doesn’t offend anyone if the baby scores low. This might. So personalities and social niceties need to be thought about.
    The other thought is that the questions and levels need to be meaningful. Going back to the Apgar score again – those people use it all day, every day. Which means they have an intimate understanding of the questions and answers. You can’t hope to meet the same frequency or widespread use, so the levels and questions need to be as universally interpreted as possible to make up for the newness and infrequency the users will have with the tool.

  27. Emmett

    To those that commented on rating the players.

    Don’t the players get scored when the GM hands out XP? My primary method of awarding XP is based on how well they played, and almost never on how many enemies they defeated.

    If only there was a way of the players assigning these scores to the GM and then having them act like XP so the GM could do cool stuff with it. They’d have to be things that don’t harm players because then they’d never award any points to the GM. I’m not entirely sure what that could be. The group has to buy the GM dinner? The group has to buy the GM a new supplement? What about if the GM hands over the game to a new GM, he can use those points to start with an advanced character? Just some thoughts.

