The Brand Manager’s Voice Experience Checklist

For brand managers or agencies considering a voice experience, I offer this marketing development checklist to expedite your journey. It’s based on our experiences building an AI-powered chatbot and an interactive Alexa skill.


Identify the Business Case
Start by asking yourself where a voice experience can truly improve the customer experience (CX). Netflix, the iPhone, and ecommerce didn’t solve a problem; instead, they reduced friction with a better CX.

Visual. Will there be a visual dimension to your voice experience? An avatar can give users a mental image for voice-only situations such as in the car or when using a smart speaker.

Sound. Will you have an audio logo or a score? An audio logo is a few musical notes or a short jingle associated with your brand. Scoring is more elaborate sound or music that accompanies your voice experience.

Voice. The voice of your brand can go one of three ways.

  1. Robot: Use the stock synthetic voice(s) provided by the platform.
  2. Human: Recorded by a voice actor or a person you know. The person reads the dialog, and the recording replaces the platform’s native assistant voice.
  3. Synthetic human: A person reads a training script that lets a computer learn the base elements of that person’s voice. The computer then replicates that voice to turn written dialog into your voice assistant’s speech.

Voice Search Optimization
Like SEO, VSO is positioning your voice assistant to be suggested when a user inquires about the product or service your organization provides. Google uses its search rankings while Alexa draws on the keywords/phrases that skill makers have attached to their skill. Cortana uses Bing, and Siri draws from a variety of sources. Be sure to understand and leverage how your voice platforms surface voice search results.


Conversation Design
Conversation design has two components: the interaction model and the conversation.

The interaction model is how your voice assistant works and what it does. For a pizza ordering voicebot the essence of an interaction model might be “our model will gather information about the pizza and the transaction including the size, toppings, payment info, and customer name and delivery address.”
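For readers who want to see what that looks like in practice, here is a minimal sketch of the pizza interaction model as a set of slots the voicebot has to fill. It is not tied to any platform; the class name, slot names, and prompt wording are all assumptions made up for illustration, drawn only from the pizza example above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PizzaOrder:
    """Hypothetical slots for the pizza example; not a platform API."""
    size: Optional[str] = None
    toppings: List[str] = field(default_factory=list)
    payment_info: Optional[str] = None
    customer_name: Optional[str] = None
    delivery_address: Optional[str] = None


# The order in which the voicebot asks for information (an assumption).
REQUIRED_SLOTS = [
    "size",
    "toppings",
    "payment_info",
    "customer_name",
    "delivery_address",
]

# Illustrative prompt wording, one prompt per slot.
PROMPTS = {
    "size": "What size pizza would you like?",
    "toppings": "Which toppings would you like?",
    "payment_info": "How would you like to pay?",
    "customer_name": "What name is the order under?",
    "delivery_address": "Where should we deliver it?",
}


def missing_slots(order: PizzaOrder) -> List[str]:
    """Return the names of slots that are still empty, in asking order."""
    return [name for name in REQUIRED_SLOTS if not getattr(order, name)]


def next_prompt(order: PizzaOrder) -> Optional[str]:
    """Return the next question to ask, or None when the order is complete."""
    missing = missing_slots(order)
    if not missing:
        return None
    return PROMPTS[missing[0]]
```

The interaction model is the list of slots and the rule for what to ask next. The conversation, covered next, is the wording of those prompts and how they fit the user's situation.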

The conversation embodies the user context and dialog used to gather the pizza information. The user context is vitally important; it’s who, what, when, where, why and how.

  • Who is going to use the voicebot? A person who wants to…order, listen, research, ask…
  • What activity is he or she doing when using it? Driving? Sitting on a train? Walking in the park?
  • When is it being used? Is there a particular time of day they’re likely to use it?
  • Where are they? Home, work, traveling, car, movie theater(!)?
  • Why will they use it? It’s 24/7; Need hands-free; Easier than using the computer…
  • How are they going to use it? Smartphone? Smart speaker? Laptop? Kiosk? All?
  • And: Why would they not use it?

Understanding the user context is the foundation of a great customer experience.

A voice experience that doesn’t sync with user context will disappoint.


Google Assistant and Alexa are the strongest B2C platforms at the moment.

Google Assistant has the largest installed voice assistant base, being on a billion Android phones. Assistant uses Google’s search engine data to surface the brands it recommends; something to consider if you have strong search rankings. It’s also widely recognized as the most capable voice assistant.

Alexa is the smart speaker champ with about 100 million Echos in homes (Q1, 2019). Alexa is a voice-only experience while the other platforms facilitate chat too, i.e., users can type when privacy is desired. Alexa is also the current media darling by virtue of its smart speaker kingship.

Siri is currently a B2C outlier due to the need to link Siri to a corresponding App Store app. Adding voice to your iOS app gives users a native Siri voice experience; worth considering since Siri is on 500 million iPhones and iPads (recent-model Macs too).

Cortana is on 145 million Windows PCs. While not native to a smartphone or smart speaker, Cortana is integrated with Alexa. That makes it possible for a Windows PC user to ask Cortana to ask Alexa to do something (turn on the lights at home) and for Alexa users to ask Alexa to ask Cortana to create an Outlook calendar event, etc.

IBM Watson is a B2B suite of cognitive services. Watson is a popular white-label choice, used by enterprises that want total control of the user experience and functionality. Watson also has data privacy options that are unavailable on the B2C platforms.

Define your business objectives first, then examine the platforms in that context.


Expectations and KPIs
A key performance indicator (KPI) for a voice experience depends on the use case. If your voicebot is designed to handle sales inquiries, then a good KPI might be the voice assistant’s conversion rate versus the conversion rate of a web form or email.
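As a rough sketch of that comparison, the arithmetic is just conversions divided by sessions for each channel, plus the relative difference between the two. The function names and the sample numbers below are assumptions for illustration, not real results.

```python
def conversion_rate(conversions: int, sessions: int) -> float:
    """Share of sessions that ended in a conversion (0.0 to 1.0)."""
    if sessions <= 0:
        raise ValueError("sessions must be a positive number")
    return conversions / sessions


def relative_lift(channel_rate: float, baseline_rate: float) -> float:
    """How much better (or worse) a channel converts relative to a baseline."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be a positive number")
    return (channel_rate - baseline_rate) / baseline_rate


# Made-up sample numbers, for illustration only.
voice_rate = conversion_rate(conversions=30, sessions=400)
web_form_rate = conversion_rate(conversions=50, sessions=1000)
lift = relative_lift(voice_rate, web_form_rate)

print(f"Voice: {voice_rate:.1%}, web form: {web_form_rate:.1%}, lift: {lift:+.0%}")
```

Tracking the same calculation over time gives you a consistent figure to compare future performance against.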

Today, voice is neither a mature technology nor a mature user experience. At this stage, your KPIs serve as a baseline against which future performance can be compared.

This is the era of exploring how and where voice AI can improve the CX/UX and business workflows. Maybe you’ll find a clear path to ROI today but uncertainty should not stop you from exploring. Voice AI is a pivotal technology of the magnitude that ecommerce was 20 years ago.

Today’s voice experience innovators are tomorrow’s market share leaders.
