
How to Build a Chatbot — A Lesson in NLP

A 5-step NLP process can help you design simple chatbots

A ‘chatbot’, as the name suggests, is a machine that chats with you. The trick, though, is to make it as human-like as possible. From American Express customer support to Google Pixel’s call screening software, chatbots come in various flavours.

How does it actually work?

Earlier chatbots used a rule-based technique called pattern matching. It is much simpler than the advanced NLP techniques in use today.

What is Pattern Matching?

To understand this, imagine what you would ask a bookseller, for example: “What is the price of __ book?” or “Which books of __ author do you have?” Each of these questions is an example of a pattern that can be matched when similar questions appear in the future.

Pattern matching requires a lot of pre-generated patterns. The chatbot picks the stored pattern that best matches the customer's query and returns the answer attached to it.

How do you think a chat in which a customer asks for the price of a product could be created?

Put simply, the question “May I know the price for *” (where * stands for any product name) is converted into the template “The price of <star/>”. This template acts as a key under which the answers are stored. So we can have the following:

  • The price of iPhone X — $1500
  • The price of Kindle Paperwhite — $100

The code for this in AIML (Artificial Intelligence Markup Language) will look like this:

<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0.1">

<!-- PATTERN MATCHING: <srai> redirects the question to the canonical "THE PRICE OF *" form -->
<category>
   <pattern>MAY I KNOW THE PRICE FOR *</pattern>
   <template>
      <srai>THE PRICE OF <star/></srai>
   </template>
</category>

<!-- PRE-STORED PATTERNS: written in upper case without punctuation,
     because the interpreter normalises user input the same way -->
<category>
   <pattern>THE PRICE OF IPHONE X</pattern>
   <template>iPhone X costs $1500.</template>
</category>

<category>
   <pattern>THE PRICE OF KINDLE PAPERWHITE</pattern>
   <template>The all-new Kindle Paperwhite costs $100. Yay!! You
             have got an offer!! You can get it for $85 if you apply
             the coupon MyGoodBot.
   </template>
</category>

</aiml>
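
The same two-stage lookup can be sketched in plain Python to show roughly what an interpreter does with these categories. This is only an illustration under assumptions: the names `ANSWERS`, `REWRITES`, `normalise` and `respond` are hypothetical, the coupon text is left out for brevity, and a real AIML interpreter handles far more than this single rewrite rule.

```python
import re

# Hypothetical store of pre-generated answers, keyed by the canonical pattern.
ANSWERS = {
    "THE PRICE OF IPHONE X": "iPhone X costs $1500.",
    "THE PRICE OF KINDLE PAPERWHITE": "The all-new Kindle Paperwhite costs $100.",
}

# Rewrite rules play the role of <srai>: map a phrasing onto the canonical pattern.
REWRITES = [
    (re.compile(r"MAY I KNOW THE PRICE FOR (.+)"), r"THE PRICE OF \1"),
]


def normalise(text):
    """Upper-case the input, drop punctuation and collapse whitespace."""
    return " ".join(re.sub(r"[^A-Za-z0-9 ]", " ", text).upper().split())


def respond(text):
    """Return the stored answer for a question, or a fallback message."""
    query = normalise(text)
    for pattern, target in REWRITES:
        match = pattern.fullmatch(query)
        if match:
            query = match.expand(target)  # substitute the captured product name
            break
    return ANSWERS.get(query, "Sorry, I don't know that one.")


print(respond("May I know the price for iPhone X?"))
```

Any question that is neither a stored pattern nor covered by a rewrite rule falls through to the fallback message, which is exactly the limitation of pattern matching.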

NLP Chatbot

Pattern matching is simple and quick to implement, but it can only go so far. It needs a lot of pre-generated templates and is useful only for applications that expect a limited number of questions.


Enter NLP (natural language processing)! NLP is a collection of more advanced techniques that can handle a much broader range of questions. The NLP process for creating a chatbot can be broken down into 5 major steps:

1) Tokenize — Tokenization is the technique for chopping text up into pieces, called tokens, and at the same time throwing away certain characters, such as punctuation. These tokens are linguistically representative of the text.

Tokenizing a sentence
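
A minimal sketch of tokenization using only a regular expression is shown here. The `tokenize` function is a hypothetical helper written for this article, and the rule it applies (keep runs of letters and digits, allow one apostrophe suffix, discard everything else) is an assumption. Production tokenizers handle many more cases.

```python
import re


def tokenize(text):
    """Split text into word tokens and throw away punctuation."""
    # A token is a run of letters/digits, optionally followed by an
    # apostrophe suffix such as 's or 't.
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)


print(tokenize("What's the price of the Kindle Paperwhite?"))
```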

2) Normalisation — Normalisation processes the text to find and correct common spelling mistakes that might alter the intended meaning of the user’s request. A research paper on normalising tweets explains this concept well:

Syntactic normalisation of tweets research
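
One simple way to approximate spelling normalisation is to snap each unknown token to its closest word in a known vocabulary. The sketch below uses `difflib.get_close_matches` from the Python standard library. The `VOCABULARY` list, the `correct_spelling` helper and the 0.8 similarity cutoff are all assumptions chosen for illustration, and this is not the method used in the paper above.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of words the bot already understands.
VOCABULARY = ["what", "is", "the", "price", "of", "kindle", "paperwhite", "iphone"]


def correct_spelling(tokens, vocabulary=VOCABULARY, cutoff=0.8):
    """Replace each unknown token with its closest vocabulary word, if any."""
    corrected = []
    for token in tokens:
        word = token.lower()
        if word in vocabulary:
            corrected.append(word)
            continue
        # get_close_matches returns the best candidates above the cutoff, best first.
        matches = get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return corrected


print(correct_spelling(["pric", "of", "kindel"]))
```

Tokens with no sufficiently close match are left unchanged, so product names the vocabulary has never seen are not mangled.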

3) Recognising Entities — This step helps the chatbot identify what is being talked about, e.g. whether it is an object, a country, a number or the user’s address.

In the example below, Google, IBM and Microsoft are all grouped together as organizations. This step is also known as named entity recognition (NER).

Entities for various words.
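
The idea can be sketched with a lookup table (a "gazetteer") plus a rule for numbers. This is a deliberately naive stand-in: real systems use trained statistical models, and the `GAZETTEER` contents, the label names and the `recognise_entities` helper are all hypothetical.

```python
import re

# Hypothetical gazetteer; a real system would use a trained NER model instead.
GAZETTEER = {
    "google": "ORG",
    "ibm": "ORG",
    "microsoft": "ORG",
    "india": "COUNTRY",
}


def recognise_entities(tokens):
    """Return (token, label) pairs for every token recognised as an entity."""
    entities = []
    for token in tokens:
        label = GAZETTEER.get(token.lower())
        if label is None and re.fullmatch(r"\d+(\.\d+)?", token):
            label = "NUMBER"  # plain integers and decimals
        if label is not None:
            entities.append((token, label))
    return entities


print(recognise_entities(["Google", "IBM", "and", "Microsoft", "hired", "1500", "people"]))
```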

4) Dependency Parsing — In this step we break the sentence into its constituent nouns, verbs, objects, common phrases and punctuation, and work out how those parts relate to one another. This helps the machine identify phrases, which in turn tells it what the user wants to convey.

Dependency parsing example
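
Writing a parser is beyond a short example, but the sketch below shows how a bot might use a parse once it has one. The parse for "close the kitchen door" is written by hand here (in practice a parsing library would produce it), and the `Token` type, the relation names and the `extract_command` helper are assumptions made for illustration.

```python
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    head: int  # index of the governing token; the root points to itself
    dep: str   # grammatical relation to the head


# Hand-annotated parse of "close the kitchen door" (hypothetical parser output).
PARSE = [
    Token("close", 0, "ROOT"),
    Token("the", 3, "det"),
    Token("kitchen", 3, "compound"),
    Token("door", 0, "dobj"),
]


def extract_command(parse):
    """Return the root verb and the noun phrase acting as its direct object."""
    root = next(i for i, t in enumerate(parse) if t.dep == "ROOT")
    obj = next(
        (i for i, t in enumerate(parse) if t.dep == "dobj" and t.head == root),
        None,
    )
    if obj is None:
        return parse[root].text, None
    # Keep the object itself plus any compound modifiers attached to it.
    words = [
        t.text
        for i, t in enumerate(parse)
        if i == obj or (t.head == obj and t.dep == "compound")
    ]
    return parse[root].text, " ".join(words)


print(extract_command(PARSE))
```

The root verb gives the action the user wants and the direct object gives the thing to act on, which is the information the generation step needs.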

5) Generation — Finally, a response is generated. All the steps above fall under NLU (Natural Language Understanding): they help the bot understand the meaning of the sentence the user wrote. This step, however, falls under NLG (Natural Language Generation). It takes the output of the NLU steps and generates a number of candidate sentences with similar meaning. The generated sentences typically vary in the following ways:

  • Word order — “the kitchen light” is similar to “the light in the kitchen”
  • Singular/plural — “the kitchen light” is similar to “the kitchen lights”
  • Questions — “close the door” is similar to “do you mind closing the door?”
  • Negation — “turn on the tv at 19:00” is structurally close to “don’t turn on the tv at 19:00”, although the negation reverses the meaning
  • Politeness — “turn on the tv” is similar to “could you please be so kind as to turn on the tv?”

Based on the context of the user’s question, the bot can reply with one of the options above, and the user leaves satisfied. In many cases, users find it hard to tell a bot from a human.
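
A very small sketch of this step uses fixed templates: the bot fills the same facts into several phrasings and picks one according to context. The `TEMPLATES` table, the style names and the "was the user polite" signal are all assumptions made for illustration. Real NLG systems are considerably more sophisticated.

```python
# Hypothetical response templates that say the same thing in different styles.
TEMPLATES = {
    "plain": "{product} costs ${price}.",
    "polite": "Thank you for asking! {product} costs ${price}.",
    "question": "Would you like to buy {product} for ${price}?",
}


def generate_candidates(product, price):
    """Fill every template with the entities found by the NLU steps."""
    return {
        style: template.format(product=product, price=price)
        for style, template in TEMPLATES.items()
    }


def choose_reply(candidates, user_was_polite):
    """Pick the candidate whose tone matches the user's own."""
    return candidates["polite" if user_was_polite else "plain"]


candidates = generate_candidates("iPhone X", 1500)
print(choose_reply(candidates, user_was_polite=True))
```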


Chatbots are growing steadily and have come a long way since AIML was invented in 1995. Even in 2016, an average user was spending more than 20 minutes interacting over messaging apps, with Kakao, WhatsApp and Line being the top favorites.

Similarweb

Businesses around the world are looking to cut customer care costs and provide round-the-clock customer service through the use of these bots.

80% of businesses want chatbots by 2020
Businesses are beginning to see the benefits of using chatbots for their consumer-facing products

The technology behind chatbots is fairly standard. NLP has a long way to go, but even in its current state it holds a lot of promise for the field of chatbots.