Effective Playtesting Rituals


Every community of creators should have a regular playtest series. This is true whether it’s an ad-hoc community of independent creators, an academic research group, or a large corporation. If there isn’t a playtest series, you should start one. This post is for you.

I have been involved in several recurring playtest series. I first discovered this format in SF, through a monthly series called Sandbox at The Go Game. I instituted a weekly playtest series at CTRL Labs after we were acquired by Meta, and as far as I’m aware it’s still running. The Lower Case, an immersive theatre art space I was co-running in NYC, also ran a monthly playtest series. The creative medium (games, apps, theatre, UX research) doesn’t matter: if you create new stuff, you must have a playtest series.

Running an effective playtest series requires ongoing management of subtle sociocultural factors. You can’t just declare a playtest timeslot and hope things work out; this may be worse than running no playtest at all! I’ve found these issues to be surprisingly consistent across creative mediums, social groups, and whether playtests are in-person or remote.

Rules

  1. Schedule playtests at a regular, recurring time.
  2. Eliminate any red tape to sign up as a creator or tester. The only requirement to participate should be to show up, like an unconference.
  3. Participation must be optional for testers.
  4. Testers should not be allowed to watch an experience before playtesting it.
  5. Documentation (photo/video) of an experience during a playtest is not allowed unless done by the creator themselves.
  6. Playtests should be distinct from morale-boosting demos or showcases. Mixing these is harmful.

Why We Playtest

The generally accepted belief is that the purpose of a playtest is for a creator to get feedback on a work-in-progress. This isn’t exactly true; feedback is merely the centre around which other useful outcomes emerge.

The real primary purpose of a playtest is to assist a creator in determining immediate, actionable next steps for their work on an experience. A more useful playtest is one where observations surprise the creator. A playtest whose goal is to determine whether an experience is “good enough” to launch, as measured by a lack of negative feedback, is quite un-useful.

A secondary purpose of playtests is to observe, as a group, the UX flows surrounding experiences. This is especially true for playtests within an institution where most creators share a similar platform. Issues that individuals thought were one-off anecdotes, personal to only them, can be elevated to consistently observed patterns. Personal Anecdote: at CTRL Labs, one research team was playtesting app A. A few days earlier, there had been reports of sporadic crashes in the launcher that all apps used. The engineer debugging this showed up to the playtest session for app A, as it was a chance to watch ~30 people attempt to use the launcher and do live debugging. I recall the issue happened ~3 times (10%), and the engineer was able to pull those people into a side Zoom to determine commonalities.

A playtest session acts as an artificial deadline that forces creators to prepare their experience as a consumable whole, outside of the specific sub-experience they are working on. This requires creators to do modest documentation and QA, and to explain their work to people earlier than they normally would if they could defer these seemingly irrelevant tasks until the sub-experience was “ready to ship”.

The sessions themselves are useful for work awareness and socialization, in particular for newer people to become aware of what other groups are working on or concerned with. This is much faster onboarding than reading an org chart, documentation, source code, or reviews.

Morale is a dangerous, slippery anti-pattern I will call out multiple times. It’s very easy for people of all experience levels to treat playtests as a celebration of “we’re doing good work!” This leads to creators holding back experiences until they’re so polished a playtest doesn’t create actionable outcomes, or testers providing only celebratory feedback instead of problem-solving with the creator on improving an experience. Showing off work regularly is a good practice, but you should channel this desire into separate short “demo sessions”, where creators get 2-5 minutes each.

Psychology of Creators and Testers

Creators are a mess, because genuine creation is difficult and risky, no matter how experienced you are. Testers don’t know how to be useful, and you must steer them towards what you need from them, instead of hoping that open-ended listening to undirected feedback will be miraculously useful.

In the absence of an outside force, creators will iterate on experiences indefinitely and never launch them, even if they don’t identify as perfectionists. In general, creators should be encouraged to launch early, then un-abandon1 if they want to improve the already-launched experience.

Creators should never be forced to show an experience at a specific playtest, but any red tape or requirements will feel to creators like over-pronounced speed bumps in the sales funnel of getting them to playtest their experience. I’ve had to reject proposals that creators should submit titles or abstracts of their playtests early, with specific parameters around playtest duration or tester counts. I’ve had creators ask weeks in advance whether a certain extension cable is available or whether they’re allowed to move chairs: obvious unconscious ploys of anxious procrastination. You should reassure creators that all they need to do is show up (a bit early if they have particular needs) and it will be sorted out. This is why playtests must be on a regular schedule rather than on creator request: a creator will always request a playtest when it’s too late for them to learn anything useful about the experience they want feedback on.

You must pay particular attention to creators who are afraid of looking incompetent; this can happen with newer creators and with experienced creators who feel they have a reputation to uphold. You must find ways to soothe them. If they say something like “I think I’m ready to playtest but there’s a 20% chance it will crash each time”, reassure them that this does not prevent a playtest from happening, as the program can simply be relaunched.
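If relaunching by hand gets tedious mid-session, even a trivial wrapper can absorb a flaky build. Here is a minimal sketch in Python; the “./experience_build” executable name and the relaunch limit are illustrative assumptions, not something a creator actually needs to prepare:

```python
# Minimal relaunch wrapper: keeps restarting a crash-prone playtest
# build, so a "20% chance it crashes" build can still be playtested.
# The executable name "./experience_build" is hypothetical.
import subprocess
import sys

def run_with_relaunch(cmd, max_relaunches=10):
    for attempt in range(1, max_relaunches + 1):
        print(f"[playtest] launching {cmd[0]} (attempt {attempt})")
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return  # clean exit: the tester finished the experience
        print(f"[playtest] crashed with exit code {result.returncode}; relaunching")
    sys.exit("[playtest] too many crashes in one session; hand back to the creator")

if __name__ == "__main__":
    run_with_relaunch(["./experience_build"])
```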

Testers show up because they are excited to learn more by trying out novel stuff. Newer people to a community or institution are more likely to show up; this fresh meat is great news for creators, as experienced testers may have too much prior knowledge to provide unbiased feedback. Testers often show up with the specific motivation of previewing upcoming features they may rely on (or be victimized by) in the future; this gives them an early opportunity to steer development with feedback to be more useful to them. Creators considering bringing an experience to a future session will show up as testers first; they likely want psychological reassurance that the playtest sessions are a safe and useful space. Pairs of creator groups will often show up, testing each others’ experiences as a mutually beneficial transaction of favours, sometimes over separate sessions.

It is up to you to set up a relationship between creators and testers of mutual problem-solving, where testers genuinely want the experience to work for some purpose. You must let only properly engaged and motivated testers into the playtest session, as only they will give useful feedback. In the absence of proper engagement and motivation, testers will give feedback that is:

  • blandly positive, to get out of their obligation as quickly as possible, or out of a need to reassure or celebrate a creator
  • arbitrarily negative, because they didn’t want to seem bland and were grasping at straws
  • over-abstract and intellectual, since they are bored and daydreaming, or want to perform intelligence in front of others. Often it’s hard to reject this feedback, as it is technically correct but will only be relevant in 9 months or for <1% of real users in the wild. This is one of the major problems of focus groups, which I believe are over-used in UX research.
  • humorous. Some laughter is okay, and in fact a lot of creativity is unlocked if people feel allowed to make suggestions that feel dumb. However, unmotivated jokey testers can become a useless peanut gallery.

Chilling Effects

Watch out for these patterns: if you let them creep in, playtests will cease to be useful, then creators will stop submitting experiences and testers will stop showing up.

  1. Creators and/or testers feel that every piece of feedback must be responded to. This creates stilted, instead of free-flowing, feedback; step in to moderate if you see it happening. Testers should give spontaneous feedback without expecting a response, and creators should know that they learn more from watching what testers do than from what they say.
  2. Apologies. Creators will pre- or post-apologize for the insufficiency of their work, and testers will apologize in advance of criticism. This sets a tone that other people will pick up. Stop this.
  3. If a playtest feels like labour to the tester, they will feel disrespected, like their time is being wasted, and they will cease being useful as a tester. Examples: I once saw a playtest where the end result was that each tester made a papercraft object that the creator obviously needed for another project. I saw another where a creator just needed lots of people to use an interface to calibrate a model. Neither engaged with the testers as intellectual human beings capable of providing feedback, and even if a creator goes through the motions of absent-mindedly listening to feedback, testers can tell. If a creator needs an experience to go through a rote QA process as part of a regular release to ensure that nothing breaks, they should frame it as such, and not call it a playtest.
  4. A creator may feel like the playtest is being documented, or non-private to the degree that it is a more public “launch” than they are ready for. If creators feel this way about the recurring sessions you run, they’ll be stuck playtesting by themselves or only with overly-trusted people, in artificial contexts, forever. Sometimes people with too much social power (executives, moneyed producers, celebrities, sponsors) showing up to a playtest can ruin it, as creators and testers can no longer act normally. If a creator feels like the quality of their work is being judged in a playtest, they will bias towards polish over risk, and the playtest ceases to be about learning anything. If you encourage people not to talk about what happens at a playtest, it has the positive effect of making playtests the place to be to see cool, secret stuff before broader awareness, and more testers will show up.
  5. Testers will feel a desire to watch another tester do an experience before they try it themselves. In playtest sessions with multiple experiences, testers will want to do this before choosing one to commit to. Don’t let them! Creators should be able to give testers enough information verbally for them to choose, for example “this will be 5 minutes of typing” or “this is a 20-minute audio-only horror experience in the dark”.
  6. You, the person running the playtest series, will feel a desire to have clearer up-front requirements from creators, such as playtest length, number of testers required, et cetera. This is never information you need to know in advance, and you should resist the urge to require it, as this level of formality will make creators nervous about their often genuine need to make last-minute pivots.

Tips for Creators

If you think your experience isn’t ready to be playtested, you are almost always wrong. However, there are a few exceptions:

  • When it instantly crashes. Anecdote: I’ve seen creators develop entirely in a game engine like Unity, and then, when playtest time comes, build a standalone executable and send it off without ever attempting to open it themselves. Even a crude automated smoke test, like the sketch after this list, catches this.
  • When the creator can’t describe why they are doing the playtest in a single sentence. This happens when the creator is rushed or stressed, and it only ever takes a couple minutes of calm thinking to resolve. You can check this by asking a creator “what are you hoping to get out of today’s playtest?”; if the answer is run-on word vomit or silence, you must suggest they hold off their playtest (possibly until later in the session) until they have an answer.
  • When no one other than the creator has tried the playtest instructions. This pre-playtest doesn’t need to be full-length; it can be as simple as someone else reading or walking through the instructions quickly. This may become your responsibility as the playtest session runner, and it may be necessary even for experienced creators.
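For the instantly-crashes case above, even a crude automated smoke test is enough. A minimal sketch, assuming a hypothetical standalone build named “./experience_build” that should stay alive for at least a few seconds after launch:

```python
# Minimal smoke test: launch the standalone build and check that it
# survives five seconds without exiting. Executable name is hypothetical.
import subprocess
import sys
import time

proc = subprocess.Popen(["./experience_build"])
time.sleep(5)  # give the build a moment to crash, if it is going to
if proc.poll() is not None:
    sys.exit(f"smoke test failed: build exited immediately (code {proc.returncode})")
proc.terminate()
print("smoke test passed: build stayed up for 5 seconds")
```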

A lot has been written elsewhere about asking for feedback. However, I strongly suggest creators be specific about the bounds of the feedback they want. Testers will be able to tell if their feedback is uninteresting to a creator, so help them be in the right neighbourhood!

BAD: “Just looking for general feedback, or if anything seems wrong”

GOOD: “I want to see if people can get through the end-to-end flow of the experience. The visual design of the experience is all placeholders, so we aren’t looking for feedback on that right now.”

Advanced

The number of experiences submitted for testing will fluctuate wildly from session to session. Some creators won’t need their experience playtested immediately, so it is good to maintain a backlog of experiences that can be pulled from for leaner sessions.

Async playtests for works-in-progress are possible. Setting this up successfully at CTRL Labs was one of the influences for my interest in the Async theme. Going async was, at first, a response to the impossibility of finding a weekly time that didn’t conflict for any of the ~200 stakeholders. To run async playtests, we had creators write up a Google Doc of instructions and link that doc in a shared playtest spreadsheet. Testers were told to simply give feedback as text/screenshots at the bottom of a playtest’s Google Doc. There was an understanding that any playtest instructions were ephemeral, so testers should either run through the playtest the day it was shared, or not at all. This was important to reduce the perceived support/maintenance burden for creators. About 40% of the testers who participated were async. A surprise benefit of async playtests is that they lead to creators writing higher-quality instructions, and being more prepared, in advance of the synchronous playtests, since they have to imagine a situation where they aren’t able to look over testers’ shoulders.

  1. Quote attributed to Leonardo Da Vinci: “Art is never finished, only abandoned” ↩︎