Adding Voice to your mobile app
Most tech-savvy mobile users these days know, at least partially, how to use the general capabilities offered by Siri, Google Assistant and even Cortana (Windows, anyone?). Yet comparatively few custom apps make use of the voice capabilities that the parent OS offers.
Here is our primer on how you can extend your app’s reach beyond the app itself, using the voice assistants on your users’ phones. Why, you ask? Because it’s cool. Well, yeah, it really is. It also lets users reach your services more easily, which in turn increases user engagement. We are of course speaking from having built apps with these capabilities for different businesses.
The Voice “Assistants”
The voice platforms have been evolving for over a decade now and with tremendous strides happening in machine learning, natural language processing, data mining and computing power, we are now in the golden age of actually reaping benefits using Voice as a user input. Exciting times indeed!
Apple, Google and Microsoft call their services voice assistants. At their core, they aim to give you your very own personal secretary: to remind you of things, to answer your queries from the web or from your personal info, to control your phone and, in general, to make your life a little bit easier in this tech age.
Apple and Google have been trying as much as they can to get you the right info you need at the right time. They introduced push notifications, widgets, integration with your calendars for reminders and emails for travel related cues etc. And their greatest trump card till now has been to extend their Voice services to developers to utilize in their custom apps.
Developers and Possibilities
For a minute, let us forget about finding what these platforms are and how these platforms are exposed to developers and let us wander on to the magical land of possibilities. Just sit back and imagine what you will do when you have your own personal assistant to help you with anything (legal and moral matters, of course 😉).
- Your assistant will know exactly where you are at what time and can help you with location-based reminders (“John, since you are near Park Street, why don’t you pick up a box of donuts from Krazzi Donuts on your way home for the party?”).
- Your assistant knows your schedule back to front and you might just need to tell him to postpone a meeting to a later date and he will immediately understand without further requests about additional meeting info.
- Your assistant can help with your child’s math homework that “you” were “supposed” to be doing. Or, complex budget calculations you only offload on him because you just can’t remember all the crazy formulae.
- You never have to tell your assistant to connect you to “Peter” who is actually your dad. You just speak the relation and your trusty assistant has them on the line the next second.
- Your assistant (driving with you to a client meeting) is always ready with info about the meeting, the history with the client and the presentations you have prepared for the meet. Just tell him to give a brief or modify a slide, he will do it instantly.
- When you need to book a party hall for your team’s yearly bash, you just inform your assistant. He finds a date convenient for you, a location and even puts in a reservation since he has all info about the number of people, your preferences and your history.
- Your assistant can book your flight instantly for those last minute mad-rush meetings you need to attend over the weekend. You just speak the destination and presto, your itinerary, hotel reservations and return flight are booked and you might even get a suggestion to attend your favorite band’s concert on that extra evening there.
- Your assistant might even laugh at your stupid jokes that no one else gets to hear.
Ah, the dreams.
Siri and Google Assistant can help you, at least partially, with a lot of these requests. Surprised? They have a lot of info and usage statistics about you to work with, remember? And it is definitely not as scary as some make it sound. For the parts they can’t handle yet, you have apps that have the necessary info and controls to finish the tasks for you.
While it is exciting to think of all this, let us take another moment to understand that supporting “any” kind of custom action your app might provide is technically harder than you might assume, for anyone. Siri and Google Assistant run their voice recognition through advanced deep neural networks on their servers, and they are designed to hold a conversation with you. But the sheer number of ways a user can phrase even a simple request is staggering, and turning the parsed information into meaningful actions and responses would be an immense effort for your app. So both services have defined supported “domains” for third-party apps. This restricts the types of requests users can make to your app through these services, and in return ensures that any generic action within those domains can be handled effectively by the assistant and your app.
As of iOS 11, Apple supports these app domains: VoIP Calling, Messaging, Payments, Lists and Notes, Visual Codes, Photos, Workouts, Ride Booking, Car Commands, CarPlay and Restaurant Reservations. Apps that offer these services to users can integrate Siri with minimal setup. For a detailed implementation guide, look for our follow-up blog with a special focus on SiriKit integration for iOS developers.
And Google supports apps that request Voice support for these actions – Alarm, Communication (calls and VoIP), Fitness (basic actions), Local (booking a cab), Media (music, photos and videos), Productivity (taking a note), Search (using a specific app) and Open (URL or app) actions.
A casual look at their documentation shows that Siri’s support for third-party apps sounds much more exhaustive and powerful than Google’s at this point. But Google Assistant is much more capable than Siri at handling native (system) tasks. Years of search experience have clearly positioned Google to take advantage of this situation, what with Google Home entering our lives too now. But that’s a story for another time 😉. Our focus here is on support for third-party apps, and there the odds are ever so slightly stacked in favor of Apple for now.
Let us look at some practical use cases that are realistic today and our dreams for the future with this, shall we?
As of today, apps with the following use cases can realistically be integrated with these voice assistants.
- Apps providing VoIP Calling, telephony and messaging – “Hey Siri, call mom on Skype” (Siri actually understands relationships if the user has defined them beforehand 😮 )
- Apps maintaining custom lists and notes – “Add milk and bread to Shopping List in Listo”, “Note to self: Mail Jane about the party plans tonight”
- Apps managing your media – “Ok Google, shoot a video using Tuber”
- Apps supporting your fitness journey – “Hey Siri, start tracking a new run with Couch2Roads”, “What is my heart rate now?”
- Apps enabling you to book cabs – “Hey Siri, book a cab from my location to Apple Headquarters now using TaxiTaxi”
- Apps helping you make restaurant reservations – “Reserve a table for 2 tonight at 8 pm at Macello’s using RestaQuickr”
- Apps controlling your car’s electronic systems – “Hey Siri, tune in to 101 FM”, “Turn the temperature down”, “Hey Siri, open my car doors for me”
- Apps helping you with different payments – “AirSend Peter $40 for concert tickets”
- Apps supporting visual QR codes for contact sharing and payment processing – “Show me my QR code from AirSend”, “Scan this code”
- Apps letting you search for custom info – “Ok Google, find the cheapest tickets for tomorrow’s Zeda concert in BookThatShow”
What we like best is that Siri actually asks the user relevant, contextual follow-up questions when necessary. For instance, imagine you want to book a cab using TaxiTaxi. You, being the busy normal human that you are, will not likely remember the exact magic words to speak to Siri every time. You might just say “Hey Siri, book a cab for me with TaxiTaxi”, forgetting to add vital info such as where you want the cab from, where you are going, how big a cab you want or when you want it. If the developers of TaxiTaxi are as shrewd as our developers, they will have expected incomplete requests like this and set up their Siri integration to ask follow-up questions that fill in the missing data. Siri will process the initial request and ask a follow-up, “Where do you want to book this from?”, then “What is your destination?”, then “How many passengers are traveling with you?” and then “Do you want to book it now or later?”. It is an actual meaningful conversation between you and Siri, with details from the app. How cool is that? (We admit, we get childishly excited with technology and practical uses.)
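The follow-up conversation above boils down to asking for whichever details are still missing. Here is a minimal, platform-neutral sketch of that idea in Python. It is not SiriKit code: `RideRequest`, `next_missing` and `run_dialog` are hypothetical names we made up for illustration, and the prompts are the ones quoted above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Follow-up questions, in the order the assistant would ask them.
# The keys are hypothetical field names; the questions come from the text above.
PROMPTS: Dict[str, str] = {
    "pickup": "Where do you want to book this from?",
    "destination": "What is your destination?",
    "passengers": "How many passengers are traveling with you?",
    "when": "Do you want to book it now or later?",
}


@dataclass
class RideRequest:
    """What the app needs before it can book a cab (illustrative only)."""
    pickup: Optional[str] = None
    destination: Optional[str] = None
    passengers: Optional[int] = None
    when: Optional[str] = None


def next_missing(request: RideRequest) -> Optional[str]:
    """Return the first field that is still unknown, or None when complete."""
    for name in PROMPTS:
        if getattr(request, name) is None:
            return name
    return None


def run_dialog(request: RideRequest, answers: Dict[str, object]) -> List[str]:
    """Ask for each missing field in turn.

    `answers` stands in for the user's spoken replies, keyed by field name.
    Returns the list of questions that had to be asked.
    """
    asked: List[str] = []
    while (name := next_missing(request)) is not None:
        asked.append(PROMPTS[name])
        setattr(request, name, answers[name])
    return asked


if __name__ == "__main__":
    # The user only mentioned the destination in the first request.
    request = RideRequest(destination="Apple Headquarters")
    replies = {"pickup": "Park Street", "passengers": 2, "when": "now"}
    for question in run_dialog(request, replies):
        print(question)
```

On a real platform the assistant drives this loop and your app only answers for each detail, so treat this as a picture of the question order rather than an integration recipe.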
But, being the dreamers that we are, we do wish a lot more domains were supported by both platforms. We think the following use cases would be immensely useful for numerous apps and their users if voice assistant support were extended to include their domains.
- Financial Apps
There are hordes of apps in the app stores helping you with different dimensions of your financial needs. But most of these are just calculators. You punch in a bunch of values and calculate what you are looking for. Why should you need to open the app and key these in when (in the future) you could get Siri to pass the data along and fetch the answers using your app’s logic? Consider these examples:
- Mortgage/Auto Loan Calculator: You just need to know whether the loan you have been offered at the current interest rate makes more sense over 5 years or over 10 years, based on the asset’s price and your rough financial position. Just ask – “Calculate the total amount I will pay with 6.8% interest paid yearly on a capital loan of $100k over 5 years vs 10 years using FinanceMeter”. And there, you will have your answer in a few seconds as Siri or Google Assistant communicates with your app FinanceMeter. You can even ask follow-up questions relevant to your first request, such as “How does this compare to the market trend?”, and Siri intelligently maps the context and connects with your app again for more info.
- Investment Growth Calculator: Of course no one can accurately predict how the stock market is going to sway over a few years, but you can have apps that run likely scenarios and list safe bets for you. Just get Siri to do all the mapping for you instantly, without even opening that great InvestmentGuru app you have on your phone. Say – “Siri, check the return rate since inception of the StarTrack Equipments using InvestmentGuru”. And again, Siri does all the work for you and gets you the numbers. Ask more relevant follow-up questions or start a new calculation. Anything man, anything!
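To make the loan request above concrete, here is a rough sketch of the arithmetic an app like FinanceMeter might run behind the voice request. We are assuming a standard amortized loan with one equal payment at the end of each year; the example request does not spell this out, and the function name `total_paid` is ours, not part of any real app or platform API.

```python
def total_paid(principal: float, annual_rate: float, years: int) -> float:
    """Total repaid on an amortized loan with equal yearly payments.

    Assumption: one payment at the end of each year, fixed rate,
    no fees. `annual_rate` is a fraction, e.g. 0.068 for 6.8%.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    if annual_rate == 0:
        # With no interest, you simply pay back what you borrowed.
        return principal
    # Standard annuity formula for the fixed yearly payment.
    payment = principal * annual_rate / (1 - (1 + annual_rate) ** -years)
    return payment * years


if __name__ == "__main__":
    # The request from the text: $100k at 6.8%, 5 years vs 10 years.
    for years in (5, 10):
        total = total_paid(100_000, 0.068, years)
        print(f"{years} years: total paid is about ${total:,.0f}")
```

Under these assumptions the 10-year loan has lower yearly payments but costs more in total, which is exactly the kind of comparison you would want read back to you.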
- Educational Apps
Apps catering to the learning needs of the masses are one of the most popular categories in the app stores. They range from guided tutorials on basic topics in any field to apps that help you improve your memory, guide you into a better mental state or just provide loads of informative news about everything under and above the sun.
- Training Apps: Suppose your app teaches concepts and uses flashcards to test your understanding. You could (in the future) use Siri during your daily commute and just let it test you using the app. You don’t have to open the app or use swipes and clicks to progress; all you need is your home screen and your voice, and you are all prepped up.
- Collaboration Apps: If your app lets users collaborate with their schoolmates or teammates on a project, and they want to share the latest findings or annotations from last night’s research with everyone, they could potentially just ask Siri: “Highlight the latest edits I made to the doc in Togetherly and notify everyone.”
- Medical Apps
There are numerous use cases in medical apps that can benefit from using voice as a medium outside the app. We really wish Apple and Google were working on supporting this vertical exhaustively next. Consider the following:
- Dosage Calculator: Doctors have started using apps to accurately calculate the dosages of medications to be prescribed. Instead of having to search an ever-growing list or use multiple clicks to find what they are looking for, they could simply ask “Check DosageMeter for the prescribed dosage of Paracetamol for an 8-year-old” or “Check DosageMeter for interactions between Drug A and Drug B”.
- Personal Health Records: As a user, all you would need to do to record your daily glucose or blood pressure readings is ask Siri or Google Assistant. Or, as a doctor or nurse, if you want to schedule a treatment for a particular patient, you would just dictate that to Siri or Google Assistant.
Man, we could list at least a dozen more exciting use cases that could be supported if the platforms extended their support ever so slightly, which we are sure they will sooner rather than later. We are indeed heading towards a silent but exciting future with personal assistants, we are sure of it!
If you want to know more about the apps we have created for our clients with Siri or Google Assistant support, please reach out to us. We will be more than happy to discuss your needs and provide solutions that would actually work for you!