
AI in Personal Assistants

Published by Sid Chadha on August 25, 2020


As we continue to use personal assistants such as Google Home, Alexa, or Siri in our lives, it's increasingly important to understand the process behind these assistants, with artificial intelligence being the cornerstone of how they work. Artificial intelligence is applied to personal assistants in two vital ways:

Natural Language Processing

This is perhaps the biggest use of AI in a personal assistant. Natural Language Processing (NLP) is essential for recognizing what the user is saying. With it, a personal assistant can convert your speech into text and work out what you mean.


How does Natural Language Processing work, you may ask? It takes language and breaks it down into shorter, elemental pieces. It can then understand the relationships between those pieces and explore how they work together to form a sentence that makes sense. Personal assistants typically use convolutional neural networks to recognize what the user is saying. Below are the specific natural language processing functions a personal assistant uses. In each case, the overall objective is to take raw language input and use linguistics and algorithms to transform the text so that it delivers greater value to the personal assistant:

  • Content categorization - sorting documents into categories based on their linguistic content

  • Topic discovery/modeling - identifying the meaning or themes of a text collection

  • Contextual extraction - pulling structured information from text-based sources

  • Sentiment analysis - using machine learning to identify the mood or subjective opinions within large amounts of text

  • Speech-to-text and text-to-speech conversion - transforming voice commands into written text and back again; this is the core of how a personal assistant applies natural language processing. It typically works by using a convolutional neural network to analyze the given input

  • Document summarization - automatically generating synopses of large bodies of text
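As a rough illustration of two ideas above (breaking language into elemental pieces, then categorizing content), here is a minimal Python sketch. It is a toy, not how any real assistant works: the `CATEGORY_KEYWORDS` table and the `tokenize` and `categorize` helpers are hypothetical names invented for this example, and real systems rely on trained models rather than hand-written keyword lists.

```python
import re
from collections import Counter

# Hypothetical keyword lists, invented for this illustration only.
CATEGORY_KEYWORDS = {
    "weather": {"weather", "rain", "sunny", "temperature", "forecast"},
    "music": {"play", "song", "music", "album", "volume"},
    "timer": {"timer", "alarm", "remind", "minutes", "wake"},
}


def tokenize(text):
    """Break raw text into lowercase word tokens (the 'elemental pieces')."""
    return re.findall(r"[a-z']+", text.lower())


def categorize(text):
    """Return the category whose keywords appear most often in the text."""
    counts = Counter(tokenize(text))
    scores = {
        category: sum(counts[word] for word in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # If no keyword matched at all, admit that we do not know.
    return best if scores[best] > 0 else "unknown"


print(tokenize("What's the weather today?"))
print(categorize("Will it rain tomorrow, and what's the temperature?"))
print(categorize("Play my favorite song"))
```

Counting keyword overlap is about the simplest possible form of content categorization; it is used here only because it keeps the tokenizing step and the categorizing step easy to see.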

Visualization of how NLP works

Personalized results

AI assistants personalize your experience in numerous ways. The first is the key phrase that activates them, such as "Hey Siri", "Ok Google", or "Alexa". Speaker recognition is used to recognize who is speaking. When a user says "Hey Siri", key parts of the phrase are pooled (as in convolutional neural networks) and compared with past calls from that user; the system then computes a probability via the sigmoid activation function and outputs a decision based on that probability. The images below describe how Apple is able to recognize "Hey Siri":


How "Hey Siri" is recognized and personalized; Apple
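The pool, compare, and decide steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Apple's implementation: the use of cosine similarity for the comparison, the `scale`, `bias`, and `threshold` values, and all function names are hypothetical choices made for this example.

```python
import math


def average_pool(frames):
    """Average per-frame feature vectors into one fixed-length vector."""
    return [sum(column) / len(frames) for column in zip(*frames)]


def cosine_similarity(a, b):
    """Similarity of two vectors: near 1.0 when they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def is_enrolled_user(frames, past_vectors, scale=10.0, bias=-7.0, threshold=0.5):
    """Return (probability, decision) for 'is this the enrolled speaker?'.

    scale, bias and threshold are made-up values for this sketch; a real
    system would learn or tune them.
    """
    if not past_vectors:
        raise ValueError("need at least one past call to compare against")
    current = average_pool(frames)
    mean_similarity = sum(
        cosine_similarity(current, past) for past in past_vectors
    ) / len(past_vectors)
    probability = sigmoid(scale * mean_similarity + bias)
    return probability, probability >= threshold


# Toy two-dimensional "features"; real features would come from the audio.
past_calls = [[1.0, 0.1], [0.9, 0.1]]
print(is_enrolled_user([[1.0, 0.0], [1.0, 0.2]], past_calls))  # similar voice
print(is_enrolled_user([[0.0, 1.0]], past_calls))              # different voice
```

The sigmoid matters because a raw similarity score has no natural cut-off, while a probability can be compared against a single threshold.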

Apple's "Hey Siri" network architecture consists of a 100-neuron hidden layer with a sigmoid activation (i.e., 1x100S), followed by a 100-neuron hidden layer with a linear activation and a softmax layer with 16,000 output nodes. This deep neural network (DNN) is then trained using a speech vector as input. Below is a graphic describing how this works:


How the "Hey Siri" model is trained; Apple
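To make the layer layout concrete, here is a minimal forward pass in plain Python with the same shape: a sigmoid hidden layer, a linear hidden layer, and a softmax output. It is a sketch, not Apple's code. The weights are random rather than trained, the input size is an assumption (the text above does not give one), and `build_network` and `forward` are hypothetical names. The defaults match the 100, 100, and 16,000 sizes from the description, but the demo uses a much smaller output layer so it runs quickly.

```python
import math
import random


def build_network(input_dim, hidden=100, n_outputs=16000, seed=0):
    """Create three layers of random (untrained) weights and zero biases."""
    rng = random.Random(seed)

    def layer(n_in, n_out):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        return weights, [0.0] * n_out

    return [layer(input_dim, hidden), layer(hidden, hidden), layer(hidden, n_outputs)]


def affine(layer, x):
    """Compute W @ x + b for one layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(network, speech_vector):
    """Sigmoid hidden layer, then linear hidden layer, then softmax output."""
    first, second, output = network
    h1 = [1.0 / (1.0 + math.exp(-v)) for v in affine(first, speech_vector)]
    h2 = affine(second, h1)  # linear activation: no nonlinearity applied
    return softmax(affine(output, h2))


# Small sizes (assumed for the demo) so that pure Python finishes quickly.
net = build_network(input_dim=20, n_outputs=50)
probs = forward(net, [0.5] * 20)
print(len(probs), round(sum(probs), 6))
```

Training would adjust the weights so that the softmax output matches the correct label for each speech vector; that step is left out here.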

The second way personal assistants personalize the experience is by learning from users' behaviors and routines and melding them together to predict what a user may want at a certain time or where they may have to go. Furthermore, if an assistant makes a mistake in interpretation or output, it stores this as data so that it doesn't make a similar mistake again. Assistants also use AI to synthesize speech so that they sound more human and the user can better understand what they are saying.
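A very simple way to picture routine learning is to count what a user asks for at each hour, suggest the most frequent request, and adjust the counts when a suggestion turns out to be wrong. The `RoutineModel` class below is a hypothetical sketch of that idea; real assistants presumably use far richer signals and models.

```python
from collections import Counter, defaultdict


class RoutineModel:
    """Toy routine learner: counts which action a user takes at each hour."""

    def __init__(self):
        self._counts = defaultdict(Counter)

    def observe(self, hour, action):
        """Record that the user performed `action` during `hour` (0-23)."""
        self._counts[hour][action] += 1

    def correct(self, hour, wrong_action, right_action):
        """Learn from a mistake: demote the wrong guess, promote the right one."""
        if self._counts[hour][wrong_action] > 0:
            self._counts[hour][wrong_action] -= 1
        self._counts[hour][right_action] += 1

    def predict(self, hour):
        """Return the most frequent action for `hour`, or None if unseen."""
        actions = self._counts.get(hour)
        if not actions:
            return None
        return actions.most_common(1)[0][0]


model = RoutineModel()
for action in ["news", "news", "traffic"]:
    model.observe(7, action)

print(model.predict(7))   # most common request at 7 o'clock
model.correct(7, wrong_action="news", right_action="traffic")
print(model.predict(7))   # the suggestion after one correction
```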

Overall, by using Natural Language Processing and personalizing the experience for each user, personal assistants have become widely adopted: over 45 percent of Americans now own one, and their popularity is spreading like wildfire.

