Chatbots might finally be done saying: “Sorry, I didn’t quite get that.”


I took a swig of my favourite dark roast coffee, opened Slack, and began to type.

/jarvis, schedule a meeting with @gokulraj and @lrode for Friday morning. Title this “Front-end Weekly Check-in.” Please add the note, “Bring me a mug of black coffee, I’ll be sleepy.”

My stomach rumbled. Looks like I’m going to need some good food later.       

/jarvis, find some time for lunch today with @plee. Let her decide on the place.

Leaning back in my chair, I interlocked my fingers and stretched my arms out in front of me. Ever since onboarding Jarvis as my personal assistant, Monday mornings at the office have been slightly less miserable. But the best part about Jarvis is that he doesn’t just work for me. He is a scheduling AI, a mere needle in the haystack of chatbots that have scheduled meetings for Fortune 100 companies, small businesses, and start-ups.

As a software developer myself, what intrigues me most about chatbots like Jarvis is their ability to interact with us in natural language. The subfield of artificial intelligence behind that ability is known as natural language processing (NLP). Recent innovation in NLP has driven drastic improvements in chatbots, sending C-suite executives into a frenzy as they scramble to get on board with the new technology.

Quick disclaimer: For now, Jarvis is a figment of my imagination. He is a fictional AI I contrived for the purpose of this blog, but there are existing, commercially available chatbots that I will touch on later!

Why should you care about chatbots?

With an increasing number of companies integrating chatbots into their existing communications ecosystems to gain a competitive advantage, it’s time you hopped aboard the hype train too.


The chatbot market is on pace to reach $1.25 billion by 2025. A survey of C-suite executives shows that 80% of respondents have already used chatbots or planned to use them by 2020. With nearly 4 billion monthly active users between WhatsApp, Facebook Messenger, and WeChat, instant messaging is now the preferred mode of communication amongst young people. And with millennials set to make up 75% of the global workforce by 2025, virtual communication is our future. That is precisely why chatbots are such a disruptive innovation. As NLP continues to improve, they will only become more integrated into business operations.

Save money

Most firms are garbage at texting back, and it is harming their ability to convert sales leads. The bar chart below depicts the average response time to online customer queries based on an audit of over 2000 American companies:

A bar chart showing companies' average response times to online customer queries. It has a bimodal distribution with two peaks at the two ends of the response time axis; firms either respond within 5 minutes, or don't reply at all.
Graph: Study from the HBR based on average response time to customer queries

In a separate study, the Harvard Business Review discovered that firms that responded to customer queries within an hour were seven times more likely to qualify a lead than firms that responded more than an hour after the original message was sent. Firms that waited longer than a day to respond were 60 times less likely to qualify leads. Chatbots can bridge this gap in the lead generation funnel and potentially save companies a lot of money.

Additionally, further research by IBM points out that firms blow an annual $1.3 trillion on customer support calls. Chatbots can reduce these costs by 30% through expediting response times and liberating live chat support agents for more technical work.

Even my non-programmer friends can (learn to) build (a simple) chatbot

Thanks to advancements in NLP, chatbots are becoming easier and easier to build. A quick Google search yields a plethora of frameworks that allow you to build your own chatbot, with the simplest taking just minutes to set up. Chatfuel and Botsify are two examples that allow you to build fully functioning, commercial-grade chatbots without any coding knowledge at all!

In a nutshell, chatbots are inevitable.  

Use cases of chatbots

Now that we’ve laid out the reasons why you should be interested in building a chatbot, let’s dive into some of their most popular functions.


For starters, scheduling chatbots, as I alluded to in the preface of this blog, are specialized, task-oriented AI. They can understand natural language, but are specifically programmed to do one thing really well—schedule meetings. Scheduling chatbots are really good at carrying out specific sets of commands. However, since they don’t need to be highly contextually aware, they may falter when faced with semantically ambiguous language. Although they are not yet as accurate as humans, scheduling chatbots are much faster. They also don’t complain about having to play “Where’s Waldo?” with your co-workers’ busy calendars to find empty slots for meeting times. Scheduling bots are becoming more widely adopted because of the potential time and money they could save in the workplace, given that 25 million meetings are scheduled by Americans each day. Currently, among the most popular are’s scheduling bots, Andrew and Amy.

Screenshot of user asking's scheduling chatbot in Slack to schedule a meeting with his coworkers.
Example: Using Slack to ask’s scheduling chatbot, Andrew, to book a meeting.
Screenshot showing the calendar event for a meeting scheduled by's scheduling chatbot.
Example: scheduled meetings show up on Google Calendar as an event.


FAQ chatbots are similar to scheduling chatbots in that they are task-oriented AI. If you’ve ever found yourself using Ctrl + F (or Cmd + F) to navigate the FAQ page on a website, then you know how much of a pain it is to read through those long-winded blocks of text. FAQ chatbots are an easy fix to that problem. Data can be gathered from existing FAQ pages, email queries, call logs, and support chat scripts and then fed to chatbots that will train themselves using NLP. Overall, chatbots are a more immersive means of engaging website traffic than traditional FAQ pages. They save you from having to manually create FAQs, and your visitors from having to scour through them.

Personal assistant  

Whereas the first two use cases are highly specialized, personal assistant chatbots are found across a diverse basket of industries, performing a wide array of transactional tasks. The commercial banking sector was an early adopter of personal assistant chatbots, which are already widely integrated into online banking sites and mobile apps. Bank of America’s AI assistant, Erica, is one such example. She can understand natural language and conduct transactional tasks such as providing account balance information, giving credit report updates, paying bills, and transferring funds. Erica can also employ machine learning to conduct more personalized tasks such as offering simple financial advice.

Screenshot of conversation between user and Bank of America's in-app AI chatbot, Erica.
Example: Bank of America’s in-app personal assistant, Erica.

Limitations of chatbots

Although research in NLP has improved the functionality of chatbots, it’s important for us to be aware of what they aren’t able to do. As you may have gathered from the previous section on use cases, chatbots are really good at performing highly specialized tasks. Although it would be amazing to have chatbots that do everything for us, it isn’t realistic. The proof is in our pockets. Virtual assistants, such as Google Assistant, Siri, and Alexa, are AI that try to do everything from setting alarms and scheduling events to engaging in two-way conversations with the user. In trying to tackle so much, these virtual assistants require a very large knowledge domain. While they may be good at performing simple, structured tasks, they struggle with contextual tasks such as understanding and responding to text or speech input.

Below is a non-exhaustive list of challenges and limitations that chatbots face.

Domain-specific knowledge

A hefty challenge facing chatbots is the lack of publicly available domain-specific data that can be used to train them for their specific roles. NLP is a diverse field with many, many distinct tasks, and most task-specific datasets only have between 1,000 and 10,000 annotated training examples. (Annotation is the process of labelling data such as images, text, and audio so that machines can learn from it, and it is done manually by humans.) A chatbot whose purpose is to provide customer support to commercial banking clients, for example, will need to be trained on relevant data such as financial jargon. It takes a lot of time to gather this data.

Understanding informal speech

A big frustration of mine while developing a chatbot for a client was being unable to find enough training data for chatspeak. Chatspeak is any form of deviation from proper language conventions. It covers lowercase text, typos, slang, sentence fragments, incorrect punctuation, and bad grammar. Because the data used to train almost all open-source models and word embeddings is taken from well-written, properly punctuated sources such as Wikipedia, building a chatbot means engineering some way to handle chatspeak. Google Assistant can respond naturally to “what’s your name,” but stumbles when your text input is littered with typos and chatspeak.
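One simple way to start handling chatspeak is to normalize the input before it reaches the model. The sketch below is only an illustration under stated assumptions: the `SLANG` table and the `normalize_chatspeak` function are hypothetical names made up for this example, and the rules are deliberately crude. It does not attempt typo correction, which would need a spell-checker or a model trained on noisy text.

```python
import re

# Hypothetical slang table, invented for illustration. A real bot would need
# a much larger, domain-specific list.
SLANG = {
    "u": "you",
    "r": "are",
    "pls": "please",
    "plz": "please",
    "thx": "thanks",
    "tmrw": "tomorrow",
}


def normalize_chatspeak(text: str) -> str:
    """Lowercase the text, squeeze stretched characters, and expand known slang."""
    text = text.lower()
    # Collapse runs of three or more identical letters or marks ("soooo" -> "so").
    # This is crude: a stretched "coool" becomes "col", not "cool".
    text = re.sub(r"([a-z?!.])\1{2,}", r"\1", text)
    # Replace whole words found in the slang table and leave everything else alone.
    return re.sub(r"[a-z']+", lambda m: SLANG.get(m.group(0), m.group(0)), text)


print(normalize_chatspeak("Can u book a room tmrw pls???"))
```

Rule-based cleanup like this only covers patterns you have thought of in advance, so it complements training on real chat logs and does not replace it.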

Sentiment analysis

Chatbots often struggle to grasp the emotional intent of the user. More accurate sentiment analysis would allow businesses that employ chatbots to gather insight into the satisfaction levels of customers and identify product shortcomings.


Chatbots aren’t great at parsing text or speech correctly for the relationships between subjects, objects, places, desires, and beliefs. The ability to analyze context is essential to understanding the intended meaning of a sentence. To detect sarcasm, for example, we would have to teach AI to parse the possible meanings of a sentence construction and understand which one is intended given the context. Also included in the mush of contextual information that chatbots struggle with is word-sense disambiguation. A common example is a chatbot failing to determine the correct meaning of a homograph, such as the word “bass,” in a sentence.

Multiple intents in a single question

Neuroscientists have long asserted that humans cannot truly multitask. The same is true for chatbots: they struggle to handle more than one request at a time. If, for example, you ask a chatbot on a fashion e-commerce site, “can you help me find some dress shoes to wear to my brother’s wedding and some workout clothes for the gym?” the chatbot might completely ignore the second half of your question. Bummer. Another excuse for you to skip the gym.

Attention & conversation memory

If you make reference to an aforementioned statement without explicitly drawing attention to it, chatbots may struggle to grasp what it is you are referring to. In the conversation below, I asked Google Assistant, “who’s your favourite character?” Even though I didn’t explicitly ask for a favourite character from Game of Thrones, it was heavily implied as I had literally mentioned the TV series right before I asked the question. Chatbots have short-term memories.

Despite these challenges, chatbots are getting better

Pre-trained models

Like I mentioned in the previous section, a lack of data with which to train chatbots is one of the greatest challenges facing NLP because most modern deep learning NLP models are data-hungry. This is where pre-trained models come to the rescue. You can think of pre-trained models as new hires. As a manager, you expect your new hire to speak and understand the language the job requires so that during the onboarding process, you can jump straight into teaching them the technical demands of the job.    

Pre-trained models work the same way. Instead of building your own NLP model from scratch, you can reuse an existing model as a starting point. These pre-trained models can then be fine-tuned on specific NLP tasks, resulting in considerable improvements in accuracy. Pre-trained models promise to handle linguistic constructs including synonyms, context, figures of speech, and entity recognition.

Some existing pre-trained models include BERT, ELMo, and ULMFiT. Of the three, BERT (Bidirectional Encoder Representations from Transformers) is the most popular, and if you are interested in the nitty-gritty details, check out their GitHub here.


A word embedding is the representation of a word as a vector. Word2Vec is one means of creating word embeddings, and there are many open-source word embeddings available that are a great way to quickly enhance a model’s performance. With a set of word embeddings, semantic relationships between words can be found by comparing the relationships between their vectors. This process looks roughly like grouping similar words together, and one of the basic benefits is having a reliable way to find synonyms.
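To make the idea of comparing vectors concrete, here is a minimal sketch using cosine similarity, a common way to measure how closely two vectors point in the same direction. The three-dimensional vectors in `EMBEDDINGS` are made-up numbers chosen for illustration; real embeddings such as those produced by Word2Vec are learned from large corpora and typically have hundreds of dimensions. The helper names are hypothetical as well.

```python
import math

# Toy vectors invented for this example, NOT real Word2Vec output.
EMBEDDINGS = {
    "coffee": [0.9, 0.1, 0.0],
    "espresso": [0.8, 0.2, 0.1],
    "tea": [0.7, 0.3, 0.0],
    "meeting": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, embeddings=EMBEDDINGS):
    """Return the other word whose vector is closest to the given word's vector."""
    target = embeddings[word]
    scored = [
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    return max(scored, key=lambda pair: pair[1])[0]


print(most_similar("coffee"))
```

With these toy numbers, "coffee" lands closest to "espresso" and far from "meeting", which is the kind of grouping that makes embeddings useful for finding near-synonyms.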

Building your own chatbot 101

With everything else out of our way, I present to you a quick crash course on useful knowledge to have in your toolkit once you embark on your own chatbot construction endeavours.

Open-source NLP pipelines

A data pipeline is a network of data modules whereby the output of one module is fed into the input of another. Most frameworks already handle basic NLP tasks, and come with a number of these modules. Pipelines are used to deconstruct sentences into their semantic subsets so that they can be parsed for meaning. Some of these basic tasks are:

Sentence segmentation & word tokenization:

This means breaking down paragraphs into sentences, and sentences into words.
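As a rough illustration of what this step does, the sketch below splits text with two regular expressions. It is a naive approximation rather than how any particular framework does it: the function names are hypothetical, and the sentence splitter will be fooled by abbreviations such as "Dr." or "e.g.".

```python
import re


def split_sentences(paragraph):
    """Split after '.', '!' or '?' when followed by whitespace (naive)."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [part for part in parts if part]


def tokenize(sentence):
    """Return words (keeping internal apostrophes) and punctuation marks."""
    return re.findall(r"\w+(?:['’]\w+)*|[^\w\s]", sentence)


for sentence in split_sentences("My stomach rumbled. Time for lunch!"):
    print(tokenize(sentence))
```

Production pipelines usually rely on trained or rule-rich tokenizers because edge cases like abbreviations, URLs, and emoji are where simple patterns break down.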

Named entity recognition:

This involves labelling named entities (typically proper nouns such as people, places, and organizations) with real-world concepts. For example, in the sentence “Vancouver is an expensive city to live in,” “Vancouver” would be labelled as a geographic entity.


Lemmatization:

This is a process that simplifies the many forms (plural, verb conjugation, etc.) of a word to their base form so that they aren’t mistaken as two completely different words. For example, the words “am,” “is,” and “are” would all be reduced to their base form “be.”

What’s great is that there are many open-source NLP pipelines readily available online. Because of their modular design, you only have to pick and choose the components of the pipeline you need. You can add new modules, or customize existing ones.
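The modular idea can be sketched in a few lines: each module is a function that takes a document, adds its own annotations, and passes the document on. Everything here is a toy built on assumptions. The `LEMMAS` and `PLACES` lookup tables, the module names, and `run_pipeline` are hypothetical, and real pipelines use trained models where this sketch uses dictionaries.

```python
import re

# Hypothetical lookup tables, kept tiny for illustration.
LEMMAS = {"am": "be", "is": "be", "are": "be", "meetings": "meeting"}
PLACES = {"vancouver"}


def tokenize(doc):
    """Module 1: split the raw text into word tokens."""
    doc["tokens"] = re.findall(r"\w+", doc["text"])
    return doc


def lemmatize(doc):
    """Module 2: reduce each token to a base form via the lookup table."""
    doc["lemmas"] = [LEMMAS.get(t.lower(), t.lower()) for t in doc["tokens"]]
    return doc


def tag_entities(doc):
    """Module 3: label tokens found in the place list as geographic entities."""
    doc["entities"] = [(t, "GEO") for t in doc["tokens"] if t.lower() in PLACES]
    return doc


def run_pipeline(text, modules):
    """Feed the output of each module into the next one."""
    doc = {"text": text}
    for module in modules:
        doc = module(doc)
    return doc


doc = run_pipeline(
    "Vancouver is an expensive city to live in",
    [tokenize, lemmatize, tag_entities],
)
print(doc["lemmas"])
print(doc["entities"])
```

Because the pipeline is just a list of functions, dropping a module or swapping in a custom one means editing that list, for example `[tokenize, tag_entities]` if lemmas are not needed.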

Chatbot frameworks

There are many frameworks you can experiment with to build your own chatbot. Here are three of my favourites, arranged from the easiest to the most difficult to use.


  • Open-source developer tool for building chatbots
  • Intuitive, flowchart-type programming


  • Dialogue management engine that is good for building robust, maintainable chatbots
  • It is powerful enough to be an enterprise solution, with case studies that demonstrate its real-world applications
  • This is my personal favourite


TensorFlow
  • TensorFlow is a general-purpose open-source machine learning framework
  • It is extremely powerful and customizable, but also the most difficult to use since it is not specifically designed for chatbots.

Methods of obtaining chatbot training data

As I’ve mentioned numerous times in this blog, the biggest hurdle to building a chatbot is obtaining the massive amount of data you need to train your bot. Below, I’ve identified some methods you can use to obtain that data. The best method will depend on how well it suits your goals and objectives.

Amazon Mechanical Turk (MTurk)

This is an online crowdsourcing marketplace that allows people to solicit human labour (MTurk workers) for tasks that can be remotely performed. MTurk workers are often hired to annotate data that is used to train chatbots. They also review chatbots’ conversations with humans and evaluate their performance.


This is a method whereby each message a chatbot drafts is scored by a model and only sent out if it reaches a certain confidence threshold. If the message fails to hit the confidence threshold, it is first forwarded to human operators to be revised before being sent out (the revised message is used to train the bot afterwards). The downside is that a large number of human operators are needed to keep this process running smoothly.
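The routing logic described above can be sketched as follows. This is a minimal illustration, not a production design: the `ReviewGate` class, its method names, and the 0.8 threshold are all assumptions made for this example, and the confidence score is taken as an input because how it gets computed depends entirely on your model.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewGate:
    """Hypothetical gate that sends confident replies and queues the rest."""

    threshold: float = 0.8  # assumed cutoff; tune it for your own bot
    review_queue: list = field(default_factory=list)
    training_examples: list = field(default_factory=list)

    def handle(self, user_message, draft_reply, confidence):
        """Return the reply to send now, or None if it is waiting on a human."""
        if confidence >= self.threshold:
            return draft_reply
        self.review_queue.append((user_message, draft_reply))
        return None

    def submit_revision(self, user_message, revised_reply):
        """Record the operator's revision as training data and return it for sending."""
        self.review_queue = [
            item for item in self.review_queue if item[0] != user_message
        ]
        self.training_examples.append((user_message, revised_reply))
        return revised_reply


gate = ReviewGate()
print(gate.handle("hi", "Hello! How can I help?", confidence=0.95))
print(gate.handle("where is my refund", "I like turtles", confidence=0.30))
```

The threshold is the main design lever: set it higher and more messages wait on operators, set it lower and more unreviewed replies reach customers.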

Synthetic data

If you have a small amount of data, you can get more mileage out of it through manipulation: programmatically flipping words around, using synonyms, or rearranging sentences. Although it is easy to create synthetic data, it is difficult to make high-quality datasets simply by recycling your existing data into different forms. As an example, you can generate training data using template sentences. The template for a coffee order might look like:

[greeting] [ask for] [item] [please]

Each bracketed section is a slot where a number of variations could fit. A greeting might take the form of a “hi,” “hello,” or “how’s it going?” and still have the same meaning.
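Filling the template programmatically is a matter of taking every combination of slot values. The sketch below assumes a handful of made-up variations per slot (the `SLOTS` values and the `generate_utterances` name are hypothetical) and treats the "please" slot as optional by including an empty string.

```python
from itertools import product

# Slot values invented for illustration; the slot order follows the template.
SLOTS = {
    "greeting": ["hi", "hello", "how's it going?"],
    "ask_for": ["can I get", "I'd like"],
    "item": ["a latte", "a black coffee"],
    "please": ["please", ""],  # empty string makes this slot optional
}


def generate_utterances(slots):
    """Build one sentence for every combination of slot values."""
    utterances = []
    for combination in product(*slots.values()):
        # Skip empty slot values so optional slots leave no double spaces.
        utterances.append(" ".join(part for part in combination if part))
    return utterances


utterances = generate_utterances(SLOTS)
print(len(utterances))
print(utterances[0])
```

Four small slots already yield 24 sentences, but they all share one structure, which is why template data tends to help with coverage more than with variety.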

Train as you go

This method involves launching a minimum viable product (chatbot) and then analyzing its conversation histories to improve it. The downside is that you are training your bot at the expense of your customers, but the upside is that you get to train your bot using data from relevant scenarios (i.e. real customer interactions with your chatbot). Regardless of how you train your bot initially, training as you go should be a step you take at some point in your bot development/maintenance lifecycle.

Some final tips to keep in mind

Last but not least, here are some final pro tips from yours truly.

1. Set up a development workflow

Having a workflow allows you to experiment with different methods and evaluate their effectiveness. Building chatbots requires a lot of testing, and having a clear process will help.

2. Be prepared to handle failure

We’re far from reaching 100% accuracy with chatbots. They’re bound to fail sometimes, and when they do, you should consider how to handle that failure. For example, chatbots often frustrate clients by trying to answer queries they aren’t fully equipped to handle. Instead, chatbots should be built so that they are able to identify and acknowledge gaps in their ability. They can then hand off customers to live chat support agents when necessary.

3. Keep ethical concerns in mind

We’re entering uncharted territory when it comes to AI, and it’s important to be diligent when we’re working with technology this powerful. Make sure customers know they’re interacting with a bot and be transparent about what kind of data you’re collecting. Also, if it’s possible for children to come across your bot, you need to consider whether the content is appropriate.

4. Bigger isn’t always better

In fact, the smaller the domain, the better the bot. As mentioned earlier, the reason virtual assistants like Siri and Google Assistant are so bad at their jobs is that they’re given too many of them. It’s a jack-of-all-trades, master-of-none type of situation. Successfully training a chatbot to do its job well requires a significant amount of data. We’re not yet at the point where a fully functional AI assistant is feasible. If you want your chatbot to be good at what it does, keep its purpose/goal small.

We’ve only grazed the surface

As the heading implies, we’ve barely touched the tip of the NLP iceberg. As more and more research is invested into NLP, technological innovation will only give rise to more NLP models that could disrupt how businesses of the future are run. If you’re interested in diving deeper, three current NLP projects that I highly recommend reading into during lunch break are IBM Project Debater, Google Duplex, and OpenAI GPT-2. Speaking of lunch,

/jarvis, when did you schedule lunch for again? My stomach is growling.