A guide to using the SUS survey in user research

What is the System Usability Scale (SUS) survey?

The System Usability Scale survey (or SUS survey for short) is a questionnaire created by John Brooke in 1986 and designed to assess the perceived usability of a service or product (software, app, website, etc.) or system (phone, tablet, computer, etc.). Over the years, it has become a widely used, standardised and recognised tool, adopted by UX practitioners across industries and sectors (such as healthcare or financial services).

It is composed of ten survey questions, all related to aspects of usability (confidence, ease of use, efficiency, learnability, etc.), each answered on a five-point Likert scale. Respondents are asked to rate their agreement with each statement on a scale from ‘strongly disagree’ to ‘strongly agree’. The ten survey questions are:

  1. I think that I would like to use this [service/system/product] frequently.
  2. I found the [service/system/product] unnecessarily complex.
  3. I thought the [service/system/product] was easy to use.
  4. I think that I would need the support of a technical person to be able to use this [service/system/product].
  5. I found the various functions in this [service/system/product] were well integrated.
  6. I thought there was too much inconsistency in this [service/system/product].
  7. I imagine that most people would learn to use this [service/system/product] very quickly.
  8. I found the [service/system/product] very cumbersome to use.
  9. I felt very confident using the [service/system/product].
  10. I needed to learn a lot of things before I could get going with this [service/system/product].
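If you ever administer or score the survey programmatically, the ten statements translate directly into data. Here is a minimal Python sketch (the structure and the SUS_QUESTIONS name are my own, not part of the SUS itself):

```python
# The ten SUS statements, verbatim; [X] stands for the
# service/system/product under evaluation.
SUS_QUESTIONS = [
    "I think that I would like to use this [X] frequently.",
    "I found the [X] unnecessarily complex.",
    "I thought the [X] was easy to use.",
    "I think that I would need the support of a technical person to be able to use this [X].",
    "I found the various functions in this [X] were well integrated.",
    "I thought there was too much inconsistency in this [X].",
    "I imagine that most people would learn to use this [X] very quickly.",
    "I found the [X] very cumbersome to use.",
    "I felt very confident using the [X].",
    "I needed to learn a lot of things before I could get going with this [X].",
]

# The five-point Likert scale used for every statement.
LIKERT_OPTIONS = [
    "Strongly disagree", "Disagree", "Neither agree nor disagree",
    "Agree", "Strongly agree",
]
```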

Once respondents have completed the survey, a quick calculation (see the section ‘How do I calculate the SUS score?’ below) gives you a single score between 0 and 100, which represents your product, service or system’s overall perceived usability. The higher the score, the better the usability.

How do I calculate the SUS score?

Follow these four steps to calculate the final SUS score: 

Step 1

All ODD-numbered questions (Q1, Q3, Q5, Q7, Q9) are positive questions. You’ll need to assign the following score to each odd-numbered question based on the participant’s response:

  • Strongly disagree: 0
  • Disagree: 1
  • Neither agree nor disagree: 2
  • Agree: 3
  • Strongly agree: 4

Step 2

All EVEN-numbered questions (Q2, Q4, Q6, Q8, Q10) are negative questions. You’ll need to assign the following score to each even-numbered question based on the participant’s response:

  • Strongly disagree: 4
  • Disagree: 3
  • Neither agree nor disagree: 2
  • Agree: 1
  • Strongly agree: 0
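If you score responses in a script rather than a spreadsheet, the two mappings above translate directly into code. A minimal Python sketch (the names ODD_SCORES, EVEN_SCORES and item_score are mine, purely illustrative):

```python
# Step 1: positive statements (Q1, Q3, Q5, Q7, Q9).
ODD_SCORES = {
    "Strongly disagree": 0,
    "Disagree": 1,
    "Neither agree nor disagree": 2,
    "Agree": 3,
    "Strongly agree": 4,
}

# Step 2: negative statements (Q2, Q4, Q6, Q8, Q10) are scored in reverse.
EVEN_SCORES = {response: 4 - score for response, score in ODD_SCORES.items()}

def item_score(question_number: int, response: str) -> int:
    """Return the 0-4 score for a single question's response."""
    table = ODD_SCORES if question_number % 2 == 1 else EVEN_SCORES
    return table[response]
```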

Step 3

Add up the scores for each respondent; you should obtain a total between 0 and 40 per respondent. Then add up all the respondents’ totals and divide the result by the number of respondents to the survey.

Step 4

Multiply the average from Step 3 by 2.5 to obtain a score between 0 and 100. This step, even though not mandatory, is strongly recommended: a score out of 100 is much easier for you and your stakeholders to interpret than one out of 40.
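Putting the four steps together, here is a short sketch of the whole calculation, assuming each respondent's answers have already been converted to 0-4 item scores via Steps 1 and 2 (the function name is illustrative):

```python
def sus_score(respondents: list[list[int]]) -> float:
    """Overall SUS score from per-respondent item scores (Steps 3 and 4).

    `respondents` holds one list of ten 0-4 item scores per respondent.
    """
    # Step 3: sum each respondent's items (0-40), then average across respondents.
    totals = [sum(items) for items in respondents]
    average = sum(totals) / len(totals)
    # Step 4: rescale from 0-40 to 0-100.
    return average * 2.5
```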

An example

We conducted usability testing of a piece of software with 8 participants. At the end of each research session, we asked the participant to complete the SUS survey. The list below shows the responses and resulting scores for one of the participants.

Participant 1:

  • Q1: Agree → 3
  • Q2: Disagree → 3
  • Q3: Neither agree nor disagree → 2
  • Q4: Neither agree nor disagree → 2
  • Q5: Agree → 3
  • Q6: Strongly disagree → 4
  • Q7: Strongly agree → 4
  • Q8: Agree → 1
  • Q9: Agree → 3
  • Q10: Strongly disagree → 4
  • Total: 29

We then work out the total in the same way for every respondent. The final (0 to 40) scores for the 8 participants were: P1 = 29, P2 = 25, P3 = 30, P4 = 24, P5 = 30, P6 = 30, P7 = 32 and P8 = 33, giving a sum of 233.

Next, we divide the sum by the number of respondents to get an average on the 0 to 40 scale (see Step 3): 233 / 8 = 29.125.

Finally, we multiply the result by 2.5 to convert it to the 0 to 100 scale (see Step 4). The final SUS score for the software is therefore 29.125 × 2.5 = 72.81.
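As a sanity check, the same arithmetic in code reproduces the result:

```python
# Per-respondent totals (P1-P8) from the example above.
totals = [29, 25, 30, 24, 30, 30, 32, 33]

average = sum(totals) / len(totals)  # 233 / 8 = 29.125
final = average * 2.5                # 72.8125
print(round(final, 2))               # 72.81
```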

How do I interpret the SUS score?

What does a score of 72.81 actually mean? On its own, this score doesn’t indicate whether the usability of our software is good or bad. To make any metric meaningful, including the SUS, it must be compared against a benchmark. Jeff Sauro provides five different ways to interpret the SUS score, based on over 30 years of using this method and 10,000 responses from hundreds of products. Four of these five interpretations are shown below:

This graphic demonstrates the interpretation of SUS scores across four categories: Net Promoter Score, Acceptability, Adjective ratings, and Grades. At the bottom of the graphic, a horizontal scale from 0 to 100 represents the SUS scores. Each of the four categories has a scale that shows its alignment with SUS scores. For the NPS category, a score above approximately 80 is needed to be considered a promoter. In the Acceptability category, with a scale from not acceptable to acceptable, a SUS score above 70 is required to be in the acceptable range. For Adjective ratings, ranging from worst imaginable to best imaginable, a score above 70 qualifies as good. Finally, in the Grades category (ranging from F to A), a score of around 72 or above is needed to achieve a grade of B or higher.
Four interpretations of the SUS score.
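To make these bands easier to reuse when reporting, here is a rough helper. The cut-offs are approximated from the figure above and deliberately simplified; Sauro's published scales are more nuanced, so treat this as illustrative only:

```python
def interpret_sus(score: float) -> str:
    """Rough interpretation bands, approximated from the figure above.

    The exact cut-offs in Sauro's published scales differ slightly;
    treat these as illustrative, not definitive.
    """
    if score >= 80:
        return "Excellent: likely NPS promoter territory, around a grade A"
    if score >= 70:
        return "Good: within the acceptable range, roughly a grade B"
    return "Below the acceptable range: usability needs attention"

print(interpret_sus(72.81))  # Good: within the acceptable range, roughly a grade B
```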

I find that NPS often speaks well to stakeholders. It can be a great starting point to discuss how the SUS score for a specific website I tested correlates (or not) with their NPS.

No matter how you choose to interpret and communicate the SUS score with your stakeholders, keep these two points in mind to avoid misinterpretation:

  • The SUS score should NOT be interpreted or communicated as a percentage. In our example, reading our software’s score of 72.81 as a percentage would mislead stakeholders into thinking it is well above average. Indeed, according to Jeff Sauro, based on the analysis of “more than 5,000 user scores [encompassing] almost 500 studies across a variety of application types”, the average SUS score is 68, not 50.
  • Scores from individual questions are meaningless in themselves. It’s the overall score that matters.

Which channels and methods can I use to distribute the SUS survey?

At Bunnyfoot, all the client projects in which we have used the SUS so far have involved administering the survey at the end of usability testing sessions. Below are a few things to keep in mind when doing so.

Administer the SUS survey after the participant has completed all the tasks, but before asking the usual wrap-up questions

At Bunnyfoot, we usually share the moderator’s screen containing the survey questions. We then either let participants fill in the answers themselves (if the sessions are in-lab) or ask them to tell us their answers and complete the survey on their behalf (if the sessions are online).

Ensure that participants understand each of the survey questions and don’t answer them mechanically

After spending 45 minutes to an hour testing a site with you, participant fatigue tends to creep in. And these survey questions are not the easiest to understand (some are worded positively, others negatively). We therefore ask our participants to read the questions out loud, which helps us spot any misunderstandings.

Be aware of sample size and statistical significance

Moderated usability testing usually comprises only 4 to 10 sessions. This means that the sample size for the survey will be low and the results may well not be statistically significant. It is important to keep that in mind, as you might get some pushback from stakeholders on that front. Fortunately, research carried out by Jeff Sauro suggests that the “SUS can be used on very small sample sizes (as few as two users) and still generate reliable results”. However, Sauro recommends that you “compute a confidence interval around your sample SUS score to understand the variability in your estimate.”
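Here is one way to compute such a confidence interval, using a standard t-based interval around the mean of the per-respondent SUS scores (rescaled to 0-100). This is a generic statistical sketch, not Sauro's exact procedure, and it assumes SciPy is installed:

```python
import statistics
from scipy import stats  # assumes SciPy is available

def sus_confidence_interval(scores: list[float], confidence: float = 0.95):
    """t-based confidence interval around a sample's mean SUS score.

    `scores` are per-respondent SUS scores already rescaled to 0-100.
    """
    n = len(scores)
    mean = statistics.mean(scores)
    sem = statistics.stdev(scores) / n ** 0.5            # standard error of the mean
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    margin = t_crit * sem
    return mean - margin, mean + margin

# Our eight respondents' 0-40 totals, rescaled to 0-100 (multiplied by 2.5):
scores = [72.5, 62.5, 75.0, 60.0, 75.0, 75.0, 80.0, 82.5]
low, high = sus_confidence_interval(scores)
print(f"{low:.1f} to {high:.1f}")  # roughly 66.3 to 79.4
```

A wide interval like this is exactly the caveat to flag for stakeholders: with eight respondents, the ‘true’ score could plausibly sit anywhere in that range.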

Other methods to administer the SUS survey

Are there any other ways to administer the SUS? Certainly. You could embed it in an email or newsletter to your clients or subscribers. You’ll need to choose the right customer panel and timing, though: users need to have used the system or service extensively before completing the survey. Good moments to send such an email include the end of a product or software trial, or several days after purchase as part of a customer satisfaction survey.

An example: GitLab has been utilising the SUS since 2020. They send email surveys to specific users to monitor and enhance product usability. Survey data is analysed quarterly, with respondents’ feedback coded and clustered into themes to identify trends over time.

When should I consider integrating the SUS into my UX research? 

There are different scenarios (or research objectives) where integrating the SUS into your research can be beneficial. Please note this is not an exhaustive list.

You want to take the ‘temperature’ of your product/service’s usability

It can be useful to administer the SUS survey to do a basic ‘evaluation’ of your product/service’s usability. A very low score can serve as a starting point to convince stakeholders of the need for further research or prioritisation of efforts for that specific product/service.

You want to measure the impact of design changes over time

Administering the SUS survey periodically, or during iterative rounds of testing for example, enables you to track changes in usability and assess the impact of design enhancements or modifications over time.

An example: We conducted two rounds of iterative testing for the customer platform of an international insurance company, providing design recommendations between each round. The platform’s SUS score rose from 66 to 82 between Round 1 and Round 2, which gave the team a tangible measure of the effectiveness of the design changes made between the two rounds.

You want to compare different versions of a design

You can incorporate the SUS into usability testing or A/B testing sessions to compare the perceived usability of various design concepts. Note that small variations in scores (particularly if you’re testing the designs with a small sample of participants) shouldn’t be considered a significant indicator of one design being better than the other. Instead, focus on larger, more consistent differences in scores to make more informed decisions about the usability of the designs.
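If you do want to gauge whether a gap between two designs' scores is more than small-sample noise, a simple two-sample t-test is one option. This is a generic sketch with made-up scores, not a method prescribed by the SUS itself, and it assumes SciPy is installed:

```python
from scipy import stats  # assumes SciPy is available

# Hypothetical per-respondent SUS scores (0-100) for two design concepts.
design_a = [77.5, 80.0, 72.5, 85.0, 75.0, 82.5]
design_b = [70.0, 75.0, 72.5, 77.5, 67.5, 80.0]

# Welch's two-sample t-test: is the difference in means larger than
# what sampling noise alone would plausibly produce?
t_stat, p_value = stats.ttest_ind(design_a, design_b, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# With samples this small, a p-value above 0.05 is common even when the
# means differ noticeably; treat small gaps as inconclusive.
```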

An example: In a study for a car insurance company, testing and comparing two different concepts for their ‘car selection and configuration’ online journey yielded similar SUS scores (78 and 75). These scores corroborated our qualitative findings, which highlighted that both concepts provided a positive user experience. We then leveraged the qualitative findings to emphasise which aspects of each prototype appealed to or frustrated participants.

You want to benchmark your product/service’s usability against competitors

You can administer the SUS survey to users of your product/service and users of your competitors’ product/service, or conduct usability testing on identical journeys across your brand and competitors’ brand, adding a SUS survey for each brand at the end. This approach provides a quantifiable score which makes comparison between brands straightforward.

What are the benefits and strengths of the SUS?

The SUS holds several advantages that make it a valuable tool for usability assessment.

It’s quick and cheap to administer

The survey is composed of only 10 questions, making it quick to administer and score. It works well for projects that run on tight budgets.

It’s reliable and backed up by research

The SUS is a reliable instrument backed up by over 500 studies. Because of its widespread use among UX practitioners, the SUS also comes with well-established benchmarks, making it a more dependable indicator of your product’s usability than custom-made surveys set up internally.

It’s agnostic of technologies, software, systems

You can use the SUS survey to evaluate the perceived usability of your new web application as well as a thermometer or a smartwatch. The questions are the same, and they are applicable to all products or services. In an attempt to build a more granular approach, A. Bangor, P. Kortum and J. Miller established a mean score based on the interface type (e.g., TV, web, hardware etc.) and P. Kortum and C. Perez compared usability ratings for home health care devices against other common products.

What are the limitations of the SUS?

The SUS comes with its share of weaknesses and limitations. I’ve tried to compile a (non-exhaustive) list of the SUS’s main limitations below.

The SUS is NOT a tool to diagnose problems

The SUS score is just a number between 0 and 100. It allows you to assess whether something is going wrong with your website or system, but it doesn’t tell you WHAT is going wrong, nor does it pinpoint specific issues. Think of the SUS as a thermometer: it indicates a high or low body temperature, and if your temperature is particularly high or low, you know something is wrong. In SUS terms, any score below 60 or above 80 is particularly interesting – it indicates that usability is particularly bad or particularly good. But the thermometer, just like the SUS, doesn’t tell you what is going on. Therefore, just as a doctor examines you for a comprehensive understanding, combining the SUS with qualitative research is critical to uncover what is happening with your website or system, and why it is happening.

Limited usage of the system or site evaluated will provide less reliable SUS scores

The SUS includes statements that require respondents to reflect on their experience in a ‘holistic’ manner. For example, questions like “I found the system unnecessarily complex” or “I thought the system was easy to use” demand a broader perspective that comes from prolonged interaction with the system and exposure to different functionalities and tasks. I would personally refrain from using the SUS during usability testing sessions where users only test a single task, for instance. Using the Single Ease Question may be more appropriate in that case.

Participants’ perception of their experience might not reflect reality

Participants’ interpretation of their interaction with a system may not accurately reflect the observed effectiveness of that interaction. In short, participants may struggle with a task but then report it as easy. That’s why it’s essential to capture additional metrics during testing, such as task success rate, time spent on different tasks, and error rates, to connect the subjective perception of usability with an objective evaluation of usability.

Response bias impacts participants’ responses

Participants might interpret SUS questions differently, leading to variability in responses even among users with similar experiences. Factors such as mood, fatigue, or external distractions during the evaluation session can further exacerbate this variability. Therefore, it’s extremely valuable to complement the SUS survey with final questions that ask participants to summarise their overall experience of the system or site. When you have few participants in your sessions, these qualitative insights can help temper the impression of good usability suggested by a high SUS score.

The scores may not reflect the users’ learning curve

The SUS may not effectively capture the learning curve associated with a system. Users who initially struggle with a new interface but then become more accustomed to it may still provide a high SUS score. Incorporating additional metrics (e.g. task success rate, time on task, error rate) or longitudinal studies can help account for this dynamic aspect of the user experience.

The SUS score is difficult to interpret without ‘training’

The scoring system of the SUS can be prone to ambiguity (e.g., the mean sits at 68 rather than 50). When sharing results with teammates and stakeholders, differing interpretations of what constitutes a “good” or “bad” score may emerge. Stakeholders may express a desire to delve into individual participant or question scores, which is not the intended use of the SUS. To prevent misinterpretations, involve stakeholders in the research process from the outset, outlining both the main advantages and limitations of the SUS as a research tool and what a ‘good’ or ‘bad’ score looks like.

Need support with your UX research and design?

Bunnyfoot has always prioritised evidence capture and valid, bespoke study designs to truly understand the nature of user engagement, interactions and behaviours, in a holistic way.

If you would like to talk about how we can support you with gaining a deeper understanding of your audience so you can make evidence-driven design decisions, we’d love to talk! Contact us.