Although they’ve been on the market for a while now, there was a huge push from Amazon and Google this past Christmas to get their ‘smart’ assistants – the Echo and the Home respectively – into our homes and lives, and Voice Controlled User Interfaces (VCUIs) / Conversational Agents (CAs) have been flagged on countless blogs and sites – including our own – as key trends to keep an eye on this year. The purpose of both the Amazon Echo and the Google Home is to make our lives easier… but do they really?
The Feature List
On the surface, you’d be hard-pressed to argue with the benefits of these devices. Their listed features include controlling or handling the following:
- Smart lights
- Smart blinds
- Smart thermostats
- Smart plugs
- Music / radio / podcasts
- TV channels
- Shopping lists (and in some cases, making purchases)
- Answering the user’s questions (similar to a search engine)
But while this scope of capabilities is impressive in theory, the reality so far seems to be that consumers’ expectations (shaped, at least in part, by Amazon’s and Google’s own marketing) may have been higher than what is actually being delivered…
User Expectations vs. Device Usability
Despite the potential abilities of VCUIs, the current devices seem to be let down by the experience they provide to users. For example, recent research by Martin Porcheron (PhD student), Stuart Reeves (Assistant Professor) and Joel Fischer (Assistant Professor) at the University of Nottingham examined how five families interacted with an Amazon Echo that was given to them for use at home. Although users were initially very positive about the VCUI and interacted with it often, this enthusiasm tailed off as the device frequently failed to understand what the user was trying to achieve, and occasionally lacked the ability to carry out the action at all.
“What we see is that people typically will try and try again to get a voice-controlled device to work – whether it’s by pronouncing words differently, by trying a different word ordering, or by using different words altogether,” Reeves stated. “Future testing could look at the ways people do this so that devices can respond better to the ways people naturally interact with them.”
Based on their research, Porcheron, Reeves and Fischer estimate that the devices currently have a 30–50% failure rate.
Another study, by Ewa Luger and Abigail Sellen for Microsoft Research, suggested that users’ mental models differed from how the devices actually worked, and that this expectation gap was compounded by the lack of meaningful feedback – or corrective measures – to get users back on track while using the device. Helping users recognise, diagnose, and recover from errors is a core usability heuristic, and neglecting it risks damaging the way users view your product / service.
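To make that heuristic concrete, here’s a minimal sketch of a voice flow that helps the user recognise, diagnose, and recover from a failed request. The `listen` and `speak` callables, the confidence threshold, and the prompts are all hypothetical placeholders – not any real device’s API:

```python
# A minimal sketch of the "recognise, diagnose, recover" pattern for a voice
# interaction. `listen` and `speak` stand in for a device's speech-recognition
# and text-to-speech calls; the threshold and prompts are illustrative only.

MAX_RETRIES = 2
CONFIDENCE_THRESHOLD = 0.7

def handle_request(listen, speak):
    """Re-prompt with increasingly specific guidance, then fail gracefully."""
    prompts = [
        "Sorry, I didn't catch that.",                     # recognise the error
        "I heard you, but I'm not sure what you meant. "
        "You could say, for example, 'play some music'.",  # diagnose + offer a way back
    ]
    for attempt in range(MAX_RETRIES + 1):
        intent, confidence = listen()
        if intent is not None and confidence >= CONFIDENCE_THRESHOLD:
            return intent
        if attempt < MAX_RETRIES:
            speak(prompts[min(attempt, len(prompts) - 1)])
    # Offer a way out rather than looping indefinitely.
    speak("I still couldn't understand that. You could try the companion app instead.")
    return None
```

The key design choice is that each re-prompt tells the user something more specific than a bare “Sorry”, which is exactly the corrective feedback Luger and Sellen found lacking.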
Security Concerns
There is also the issue of security. Over the Christmas break I had several discussions with friends and relatives about how comfortable they were having a device in their homes that’s constantly listening in the background. While the fact that the device microphones are always on may be a concern for some, the devices don’t start recording until their wake words (‘Hey, Google’ or ‘Alexa’) are recognised – much like Apple’s Siri – so users do retain some control. There have also been some more significant scares regarding the devices, such as hackers being able to tap into the VCUIs’ operating systems. These are issues to be aware of when designing VCUI systems; trust plays a huge part in a user’s attitude to a product and can affect not only how they use it but whether they use it at all!
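For illustration, the gating logic behind that behaviour might look something like the sketch below – a rough approximation with hypothetical helper functions, not the actual Echo or Home implementation:

```python
# A rough sketch of wake-word gating. `next_audio_chunk`, `contains_wake_word`
# and `record_and_send` are hypothetical stand-ins; the point is that audio is
# checked locally and discarded unless the wake word is heard.

WAKE_WORDS = ("alexa", "hey google")

def run_device(next_audio_chunk, contains_wake_word, record_and_send):
    while True:
        chunk = next_audio_chunk()              # the microphone is always on...
        if contains_wake_word(chunk, WAKE_WORDS):
            record_and_send()                   # ...but recording/upload starts only here
        # otherwise the chunk is dropped and never leaves the device
```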
Machine Learning (the Good and the Bad)
As these devices are connected to the cloud and employ machine learning, VCUIs will adapt and improve the more people interact with them. However, there have been some well-publicised machine learning failures, such as Microsoft’s Twitter chatbot ‘Tay’, which had to be pulled offline within 24 hours after tweeting unsavoury posts that it had ‘learned’ from the behaviour of other Twitter users. This is a lesson for VCUI developers: as much as these devices should be programmed to absorb as much information as possible, not all interactions should be treated as neutral training input.
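As an illustration of that point, a learning pipeline might screen utterances before absorbing them. The blocklist and `add_to_training_data` below are hypothetical stand-ins, and real moderation is far more involved than keyword matching:

```python
# Illustrative sketch only: screening utterances before they enter a learning
# pipeline, so abusive input isn't absorbed as if it were neutral data.
# The blocklist and `add_to_training_data` are hypothetical.

BLOCKLIST = {"offensive_term_1", "offensive_term_2"}  # a maintained list in practice

def screen_for_learning(utterance, add_to_training_data):
    words = set(utterance.lower().split())
    if words & BLOCKLIST:
        return False                 # quarantine it; don't learn from it
    add_to_training_data(utterance)
    return True
```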
How to Improve Usability in the Future
There are a few key things to keep in mind when designing a VCUI, to bring the usability of the devices more in-line with what consumers expect.
Monitoring how users interact with VCUIs in a testing environment will be crucial for making sure your device can adapt to the variety of responses users give. It’s unlikely that your participants will have a single way of responding to the device; instead, they’re likely to give one of a range of answers: “Yes”, “Yeah”, “Okay”, “Fine”, “Sure” – and so on (see the sketch below). Similarly, it may be worth considering more flexible (or customisable) ‘wake-up’ phrases – though not so flexible that users feel anything they say could set the device off.
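As a simple illustration, those varied responses can be normalised to a single intent before the device acts on them. The word lists here are purely illustrative and would, in practice, be grown from what participants actually say in testing:

```python
# A minimal sketch of mapping the many ways users confirm or decline onto one
# canonical intent. The word lists are illustrative; a real system would grow
# them from testing sessions.

AFFIRMATIVES = {"yes", "yeah", "yep", "okay", "ok", "fine", "sure"}
NEGATIVES = {"no", "nope", "nah", "cancel"}

def classify_confirmation(utterance):
    words = {w.strip(".,!?") for w in utterance.lower().split()}
    if words & AFFIRMATIVES:
        return "CONFIRM"
    if words & NEGATIVES:
        return "DENY"
    return "UNKNOWN"  # hand off to an error-recovery prompt rather than guessing

assert classify_confirmation("Yeah, sure!") == "CONFIRM"
assert classify_confirmation("Hmm, maybe") == "UNKNOWN"
```

Note that the `UNKNOWN` case deliberately falls back to the error-recovery flow discussed earlier, rather than letting the device guess.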
Giving your VCUI a distinct personality will help users to develop a rapport with it – and testing will be key to finding out which personality resonates most. Consider testing several different ‘types’ of voice (male or female, older or younger, different accents) so you can tell which style your participants respond to most positively, and look at which phrases users like to hear. A VCUI that is too formal in tone risks making the interaction less enjoyable; on the flip side, one that is too informal may come across as over-the-top and false.
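If you’re collecting ratings across voice variants, even a toy summary like the one below can show which personality is resonating – the variant names and scores here are invented purely for illustration:

```python
# A toy example of summarising preference scores from voice-personality
# testing. The variants and ratings are invented; real data would come from
# moderated sessions with participants.

from collections import defaultdict

ratings = [                       # (voice variant, rating out of 5) – hypothetical
    ("female_younger", 4), ("female_younger", 5),
    ("male_older", 3), ("male_older", 4),
]

scores = defaultdict(list)
for variant, rating in ratings:
    scores[variant].append(rating)

for variant, values in scores.items():
    print(f"{variant}: mean {sum(values) / len(values):.1f} (n={len(values)})")
```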
If the challenges currently facing VCUIs are ever going to be mitigated, changes to the devices should be driven not only by machine learning but also by extensive user research and user-centred design, ensuring that future updates meet users’ needs and enable both better corrective measures and increased functionality. It will be interesting to see whether Apple’s HomePod – launching later this year – serves as a wake-up call to competitors, or whether it’s just more of the same… Watch this space.
Want to learn more?
- Read our blog: The Difference Between Chatbots and VUIs
- Training course: Designing for Emotion
- Training course: Customer / User Research Methods & Practice