Building voice augmented experiences in apps: Kumar Rangarajan of Slang Labs at Future of Work 2019
Today, every surface has the potential to be a digital surface. Gartner predicts that by 2020, 30 percent of web browsing will be done without a screen, and by the end of this year, 10 million homes will have a room-based voice device like Amazon Echo or Google Home.
The trend is headed in a clear direction — digital experiences are moving beyond just the screen. In the very near future, we’ll see business, creativity, work, and leisure move toward a world where your phone, laptop, desktop, and TV screens won’t be the sole conduits for digital interactions. New technologies are emerging to create deeper, yet less attached, connections between our physical and digital worlds.
A word is worth a thousand clicks: augmenting an experience using voice
These formed the premise for a workshop led by Kumar Rangarajan, Co-Founder of Slang Labs (and former Co-Founder of Little Eye Labs, a startup acquired by Facebook). Held as part of the recent Future of Work 2019 conference, the workshop covered the fundamentals of designing and implementing Voice Augmented eXperiences (VAX) in mobile apps, how VAX differs from building actions or skills for popular voice assistants like Alexa or Google Assistant, and how this next-generation user experience can reduce drop-offs, help gain new users, and enable easy information retrieval.
“We wanted to understand how well self-serve works for the Indian market. There are several self-service platforms in the West, but one that is purely built in India is rare. Also, the product itself has its own steps. Building tech on a voice base is tough, and it needs to be built on top of an app,” said Kumar.
Is a voice-only UI actually a step backward?
“Speaking out what you want is significantly faster than navigating buttons, menus, and filters,” said Kumar as he set the context for the workshop. It is often claimed that voice UI is the future, but a voice-only UI is actually a step backward. While voice is great as a natural input interface (it’s easy, intuitive, and very fast), Graphical User Interfaces (GUIs) are great as a rich, comprehensive, high-bandwidth output interface. Marrying the two, whether to build the apps of the future or to convert the apps of today into this paradigm, opens your app up to audiences that expect convenience and speed, and that are growing up with the notion that voice interfaces will be everywhere.
Voice or GUI? Why not both!
It’s easy to see why voice-based interactions are growing rapidly. First, voice decreases thought-to-action latency, and it opens up a wider user demographic for your apps because users don’t need to be digitally savvy, or even literate. Second, it can capture true intent, because sometimes what a user wants is not among the listed options. Finally, with voice, there’s less cognitive overload.
It’s also important to understand what’s great about GUIs: they win on functionality and discoverability. The GUI is a mature solution, and a lot of beginner problems are already solved. Finally, its visual-only output and implicitly private operation make it both compelling and discreet.
And can one get the best of both? Yes, VAX can augment your existing visual experience.
VAX – One solution to rule them all
VAX is the concept behind Slang. It’s the notion of adding voice on top of an existing experience (usually touch-based) to make the end user’s experience of the app significantly better. Slang’s VAX solution provides all the mechanisms needed to add a multi-lingual voice experience inside your app. This includes voice processing, speech recognition, multi-lingual language understanding, voice responses (greetings, prompts, confirmations, and so on), UI elements needed to interact with the end user, permission handling, context continuity for hybrid interactions (voice and touch), and much more.
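Conceptually, the components listed above chain into a pipeline: captured audio is transcribed, the transcription is interpreted in the user's language, the resulting intent drives an in-app action while the UI state is preserved so the user can continue by touch, and a voice response confirms the action. The stub below sketches that flow in plain Python. It is not Slang's implementation; every function name and return value here is a hypothetical placeholder.

```python
# Hypothetical sketch of a VAX pipeline's stages (NOT Slang's actual
# API; all names and canned values below are illustrative stubs).

def recognize_speech(audio: bytes) -> str:
    """Stub for the speech-recognition (ASR) stage."""
    return "show my recent orders"  # pretend transcription

def understand(text: str, locale: str) -> dict:
    """Stub for multi-lingual language understanding:
    maps a sentence to an app-specific intent."""
    return {"intent": "view_orders", "locale": locale}

def run_action(intent: dict, ui_state: dict) -> dict:
    """Invoke the app action, mutating UI state in place so the
    user can continue by touch (context continuity)."""
    ui_state["screen"] = intent["intent"]
    return ui_state

def respond(intent: dict) -> str:
    """Stub for the voice response (prompt/confirmation)."""
    return f"Opening {intent['intent'].replace('_', ' ')}"

ui_state = {"screen": "home"}
text = recognize_speech(b"...")                 # audio in, text out
intent = understand(text, locale="hi-IN")       # text in, intent out
ui_state = run_action(intent, ui_state)         # intent drives the UI
print(respond(intent), "->", ui_state)
```

The point of the sketch is the hand-off between stages: because the action updates the same UI state the touch interface reads, a voice command and a tap land the user in the same place.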
What does it do?
Slang is an integrated platform that provides multi-lingual voice-to-action alongside touch in one package, enabling you to break barriers, expand your user base, and reach non-English users. Being multimodal and context-aware, it can enable contextual, natural conversations that span voice and touch. Its simple APIs let you add a voice interface to your app with just a few lines of code. Its powerful recognition capabilities help you infer specific actions from conversational sentences with minimal training. Custom intents let you easily configure app-specific intents in the Slang Console, and the interface also lets you customise Slang to suit your brand experience, free from dependency on any voice assistant ecosystem.
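The idea of inferring an app-specific intent from a conversational sentence can be illustrated with a toy classifier. This is plain Python, not the Slang API, and the intent names and keyword lists are hypothetical examples; a real system would use trained language models rather than keyword matching.

```python
# Toy illustration of mapping conversational sentences to
# app-specific intents. NOT the Slang API: intent names and
# keywords below are hypothetical.

INTENTS = {
    "search_product": ["find", "search", "show", "looking for"],
    "track_order":    ["track", "order", "delivery", "where is"],
    "apply_coupon":   ["coupon", "discount", "promo", "offer"],
}

def infer_intent(utterance: str) -> str:
    """Return the intent whose keywords best match the utterance,
    or 'unknown' if nothing matches."""
    text = utterance.lower()
    scores = {
        intent: sum(1 for kw in keywords if kw in text)
        for intent, keywords in INTENTS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(infer_intent("Where is my order?"))     # track_order
print(infer_intent("Show me running shoes"))  # search_product
print(infer_intent("Play some music"))        # unknown
```

The "unknown" fallback matters in a hybrid interface: when voice recognition fails, the app can simply leave the user on the touch UI instead of blocking them, which is the core advantage of augmenting rather than replacing the visual experience.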
Get in touch with Slang
The workshop ended with a quick Q&A session with Slang experts, and the team invited the audience to send any further questions to firstname.lastname@example.org.