Friday 14 February 2020

Pattern Matching Chatbots - how they work and getting started

Introduction

In this post I am going to describe how you might create a chatbot using simple pattern matching techniques. I look first at what chatbots are, with a brief history of how and why they are made and an obligatory paragraph on the Turing Test, and then at how you can use the code examples to get started easily.

All the code used for creating these is available on GitHub. It is written in Python and I’m unlikely to maintain it, although I would be willing to accept pull requests.

If you want to jump straight to the code examples, search for “Very Simple pattern matching chatbot” 

Contents

1.     An Overview of chatbots
a.     Brief history
b.     Why we use them
c.     Turing Test
2.     Pattern Matching
3.     Machine Learning
4.     Getting Started
5.     Very Simple pattern matching chatbot
6.     More Advanced pattern matching chatbot
a.     Additional techniques that could be used
7.     User context aware chatbot
a.     Additional techniques that could be used
8.     User Input Learning pattern matching chatbot
a.     Cleverbot
b.     Turing Test
c.     Additional techniques that could be used

Overview of Chatbots

Brief History

There are many articles online that discuss how chatbots began. The first implementation of a chatbot, ELIZA, was described by Joseph Weizenbaum of the Massachusetts Institute of Technology (MIT) in the 1966 paper “ELIZA--A Computer Program For the Study of Natural Language Communication Between Man and Machine”.

This created an understanding that machines could be made to interact with humans as if they themselves were human, something that is regularly played out in books, films and other popular culture. This is usually either through the humanization of non-verbal creatures, as in The Lion King, or through machines coming to life, such as AI or C-3PO.

Since ELIZA’s release, many have attempted to improve on Weizenbaum’s work to make ever more sophisticated conversational agents from computers. Sometimes this is through clever matching of input to output, and sometimes it is through the collation of many hours of ‘training’, to give the appearance of intelligence through imitation of real things.

Why we use them

Chatbots originally started as a way of allowing humans to speak to computers in natural language. Normally, commands are written using specific terms and keywords to make a computer perform the actions required. With natural language, a human does not need to be trained to speak in the way of the computer but can instead speak ‘normally’ to achieve what they need.

A great example of this in action would be Alexa or Siri. The user speaks to their phone in a natural way, e.g. “put a timer on for ten minutes”, to which the phone replies “Timer set”. This works well for businesses in particular, as they are able to provide support services for consumers without needing to train users in a specific language. Users can talk to the computer in the same way they would speak to another person.

These virtual assistants have become better and better at interpreting user input, but this is mainly because each has a limited set of specialist topics that it discusses. For example, there are only so many questions an online tourist kiosk might need to answer (ticket prices, opening times etc.), so the chatbot has a limited vocabulary.

From another angle, people have always been obsessed with creating intelligent machines, and language is a form of intelligence we can recognise. The intelligence of machines has long been foreseen, although it rarely advances at the pace anticipated. For example, chatbots are no closer to passing as intelligent now than they were after ELIZA, but with machine learning and automatable methods that allow computers to replicate human natural language, the appearance of intelligence may soon be sufficient to pass ‘The Turing Test’.

Turing Test

The Turing test was developed in 1950 by Alan Turing, not as a way of showing ‘intelligence’ but as a way of discerning if a machine is able to ‘think’. In the test, a human evaluator talks to two ‘people’, one a human and the other a machine, in a text-only format. The evaluator then judges which is the machine and which is the human. If the machine fools enough of the judges, it is deemed to have passed the Turing test and to have been shown to ‘think’.

There are obvious limitations to this method, mostly because ‘think’ has its own philosophical meaning. In my view, our current machines pattern match without being able to imagine, dream or relate conversation to real experience. These things are unique to the person who is thinking whereas a machine will only ever display that which it is programmed to think. 

With this in mind, chatbots can be used to create very human-like conversations for the purpose of being virtual assistants to help people with day-to-day tasks, as a listener to our human qualms and plans, or as a human-like conversation tool to provide information.

Pattern Matching

Pattern matching is a technique used by early chatbots like ELIZA and ALICE (https://www.pandorabots.com/pandora/talk?botid=b8d616e35e36e881). They take a sentence as input, look for patterns and create a response based on the keywords found within. For example, “I am X” gets the response “Why are you X?”.

From the ELIZA paper

ALICE, for her part, can deviate from questions she cannot understand in order to appear more human in her responses, or to appear to have more ‘thought’. For example, “Why is the sky blue” gets the response “I don’t have time to explain it to you” or “that is complicated thing to explain.” Each reply makes sense as a response on its own, but neither necessarily makes for a cohesive conversation.
ALICE example
Both use a mixture of natural language processing (NLP) techniques, which are methods for processing how people speak and finding patterns for the responses we give. Usually this is so that an automated machine can sound more like those it is talking to, in order to give an impression of empathy and/or to trick the user into thinking they are getting support (whether information or emotional) from a human.

They do this by parsing the input into natural language components, or by looking for patterns like names or objects and replying with the same pattern.
For “I love to play football” as an input, “I” is the subject, which translates to “you” in the response, “love” is the emotion, “play” is the verb and “football” is the object.
The response formed could be: “Why do $switchedSubject $emotion $verb $object?”

Input | Output
I love to play football | Why do you love to play football?
I hate modern art | Why do you hate modern art?
She likes to dance | Why does she like to dance?
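
As a rough illustration of the template idea, here is a minimal Python sketch. The lookup tables and the function name build_response are my own assumptions for this example and are not taken from the ELIZA paper or the GitHub code; a real parser would lean on an NLP library rather than hand-written dictionaries.

```python
import re

# Hypothetical lookup tables for this sketch only.
SWITCHED_SUBJECT = {"i": "you", "you": "I", "we": "you"}
THIRD_PERSON = {"he", "she", "it"}
# Maps each emotion word to its base form so "likes" becomes "like".
EMOTIONS = {"love": "love", "loves": "love", "hate": "hate",
            "hates": "hate", "like": "like", "likes": "like"}


def build_response(sentence):
    """Turn '<subject> <emotion> <rest>' into a 'Why ...?' question."""
    words = re.sub(r"[^\w\s]", "", sentence).split()
    if len(words) < 3 or words[1].lower() not in EMOTIONS:
        return None  # the sentence does not fit the pattern
    subject = words[0].lower()
    emotion = EMOTIONS[words[1].lower()]
    rest = " ".join(words[2:])  # verb and/or object, reused unchanged
    if subject in THIRD_PERSON:
        return f"Why does {subject} {emotion} {rest}?"
    switched = SWITCHED_SUBJECT.get(subject, subject)
    return f"Why do {switched} {emotion} {rest}?"
```

Calling build_response("I love to play football") gives “Why do you love to play football?”. Anything that does not start with a subject followed by a known emotion word returns None, which is where a stock reply would be needed.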

This blog doesn’t go deep into NLP techniques, but you can read about some techniques here: https://www.nlp-techniques.org/what-is-nlp/nlp-techniques-list/

Machine Learning

It is worth mentioning machine learning and ‘artificial intelligence’ (AI), which are fast becoming the default method for creating ‘intelligent’ machines. This method attempts to incrementally improve the ability to reach a goal through positive and negative feedback on what is achieved.

One system that uses machine learning and pattern matching together is Cleverbot (https://www.cleverbot.com). Cleverbot takes input from users and stores it both as an input and a response. It then uses popular responses in order to provide the ‘best’ response to user input. For example, if 100 people respond to the input ‘hello’ and 95 of them reply ‘hello’, the bot would take this as positive reinforcement that ‘hello’ is a ‘good’ response, whereas ‘kjjkkkd’, used only once as a reply, would rarely, if ever, be used.
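
To make the popularity idea concrete, here is a small Python sketch of counting replies and favouring the common ones. This is only my reading of the behaviour described above, not Cleverbot’s actual algorithm, and the function names are invented for the example.

```python
import random
from collections import Counter


def record_reply(history, prompt, reply):
    """Count how often each reply has followed a prompt."""
    history.setdefault(prompt, Counter())[reply] += 1


def best_reply(history, prompt):
    """Most frequently seen reply to the prompt, or None if it is unseen."""
    counts = history.get(prompt)
    return counts.most_common(1)[0][0] if counts else None


def weighted_reply(history, prompt, rng=random):
    """Pick a reply at random, weighted by how often it has been used."""
    counts = history.get(prompt)
    if not counts:
        return None
    replies = list(counts)
    weights = [counts[reply] for reply in replies]
    return rng.choices(replies, weights=weights, k=1)[0]
```

With 95 recorded ‘hello’ replies and one ‘kjjkkkd’, best_reply always returns ‘hello’, while weighted_reply would still pick ‘kjjkkkd’ roughly once in 96 attempts. Which of the two behaviours is preferable depends on how much variety you want in the conversation.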

Getting Started

To get started on chatbot creation there are some obvious stepping stones that let you build up a knowledge of what chatbots are, why they exist and what has been tried before:
1.     Create a very simple pattern matching chatbot
2.     Start to look at using NLP techniques to help form more intricate responses
3.     Start to add your own ideas to make the chatbot appear to ‘think’ i.e. add Context
4.     Start to include Machine Learning into input and response generation through neural nets

In this blog I will look at techniques 1, 3 and 4 in more detail, and hopefully provide a platform for you to start on your own Chatbot journey.

Very Simple pattern matching chatbot


The very simplest form of chatbot takes input, looks in a file (txt, csv, json) and, where it finds the input, responds with one of the responses associated with that input. For this simple example we use a JSON file called ‘responses.json’.

very simple response JSON
Each input is a key in the JSON file with an array of potential responses that can be sent back to the user. The steps in this simple interaction are:
1.     Ask the user for input in a loop, and process each input with the next steps in its own function called ‘processInput’

get input from the user in a loop
2.     First, we load the responses file, which in this example is JSON
3.     Check if the user’s unaltered input matches a JSON key, e.g. ‘hi’ is not a JSON key so would result in ‘none’, whereas ‘hello’ is a key and so would not result in ‘none’. If the input is not in the responses file, respond with a simple “I don’t understand”.
4.     If there is a match in the responses file, we need to check how many potential outputs the array holds:
a.     If there is only one potential response, respond with that response
b.     If there is more than one potential response, randomly select one
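
The steps above can be sketched in a few lines of Python. This is a minimal sketch rather than the code from the GitHub repository: the snake_case names mirror the post’s ‘processInput’, the example dictionary only stands in for ‘responses.json’, and the ‘quit’ exit word is my own assumption.

```python
import json
import random

# Hypothetical contents standing in for 'responses.json'.
EXAMPLE_RESPONSES = {
    "hello": ["hi there", "hello to you too"],
    "bye": ["goodbye"],
}


def load_responses(path="responses.json"):
    """Load the input -> list-of-responses mapping from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def process_input(user_input, responses):
    """Return a reply for the unaltered user input."""
    options = responses.get(user_input)  # None when the input is not a key
    if not options:
        return "I don't understand"
    if len(options) == 1:
        return options[0]
    return random.choice(options)


def chat_loop(responses):
    """Keep asking for input until the user types 'quit' (assumed exit word)."""
    while True:
        user_input = input("> ")
        if user_input == "quit":
            break
        print(process_input(user_input, responses))
```

Running chat_loop(load_responses()) starts the conversation from a real file, while chat_loop(EXAMPLE_RESPONSES) is enough to try it out. Note that ‘Hello’ with a capital letter would not match here, which is the first thing the more advanced example addresses.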

Pros:
·      Less risk of bad responses to input (the chatbot says “I don’t understand” instead)
·      Chatbot can stay on a specific subject (“I only really talk about triathlons”)

Cons:
·      Lots of work to increase and expand the chatbot knowledge base
·      Unable to maintain continuous contextual conversation
·      Lack of conversation flow

example simple pattern matching chatbot

This simple example shows how many early chatbots functioned. They would take input and process it to produce output. The methods for processing differed, but this gives a good starting point for implementing processing techniques.

More Advanced pattern matching chatbot


Expanding on the simple pattern matching there are several things we can look to add:
·      Logging any input that resulted in a bad response (to aid in building the knowledge base)
·      Formatting the input for consistency
·      Allowing for partial input matching for producing responses
·      Respond with a question instead of “don’t understand” to keep conversation going
The code base is updated from “Very simple pattern matching chatbot” with bold input showing changes.

Input will still be matched against keys in a responses JSON file, each with an array of potential outputs from which a response is selected.

1.     Input is asked for in a loop; when received, it is processed in the ‘processInput’ function
2.     The responses JSON file is loaded
3.     New: format the input so that case is consistent and so that punctuation is removed. This means that matching is not blocked by differences in case or punctuation; only the text itself matters. In this example we have a ‘formatInput’ function which can be expanded for more complex processing.
format input
4.     New: Loop through all the keys in the responses JSON and check if the input text is inside the current key text. This allows partial pattern matching within the text,
e.g. input: “How are you?” matches JSON input key: “how are you today?”
This could be expanded further by calculating a percentage match against every JSON key and choosing the most likely match above a certain threshold.

Input | Pattern | Match
Hello, how are you? | hello | 25%
Hello, how are you? | how are you | 75%
Hello, how are you? | hello how are you | 100%
Hello, how are you? | bye | 0%


5.     Changed: If there is a match, set a ‘matched’ variable and 
a.     print either the only output in the response array, or 
b.     randomly select from multiple potential outputs
check for pattern in this input
6.     If no match is found:
a.     New: append the unmatched string to a dedicated text file for later processing
b.     New: If the user asked a question, send a stock response
c.     New: If the user didn’t ask a question, the chatbot should ask a question
handle scenarios where there are no matches
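
The new steps above might look something like this in Python. It is a sketch under my own assumptions: the stock replies, the question-word heuristic and the ‘unmatched.txt’ file name are invented for the example, and the snake_case names mirror the post’s ‘formatInput’ and ‘processInput’.

```python
import random
import string

STOCK_REPLY = "I'm not sure about that yet."               # assumed wording
FOLLOW_UP_QUESTION = "What would you like to talk about?"  # assumed wording
QUESTION_WORDS = ("who", "what", "when", "where", "why", "how")


def format_input(text):
    """Lower-case the text and remove punctuation and outer whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).strip()


def find_response(cleaned, responses):
    """Reply from the first key that contains the formatted input."""
    for key, options in responses.items():
        if cleaned and cleaned in format_input(key):
            return options[0] if len(options) == 1 else random.choice(options)
    return None


def fallback_response(raw_text):
    """Stock reply for questions, otherwise ask the user something."""
    asked = (raw_text.strip().endswith("?")
             or format_input(raw_text).startswith(QUESTION_WORDS))
    return STOCK_REPLY if asked else FOLLOW_UP_QUESTION


def log_unmatched(raw_text, path="unmatched.txt"):
    """Append unmatched input to a text file for later review."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(raw_text + "\n")


def process_input(raw_text, responses, log_path="unmatched.txt"):
    """Format, match, and fall back (with logging) when nothing matches."""
    reply = find_response(format_input(raw_text), responses)
    if reply is None:
        log_unmatched(raw_text, log_path)
        reply = fallback_response(raw_text)
    return reply
```

The question check is deliberately crude (an input starting with “however” would count as a question), so treat it as a placeholder for something smarter.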
Pros:
·      Able to provide the closest possible answer through input matching
·      Less risk of bad responses to input (the chatbot asks a question or gives a stock response instead)
·      Won’t respond with anything unvetted
·      Unanswered input can be sorted
·      Allows some improved conversation flow if the chatbot doesn’t know what to say

Cons:
·      Lots of work to increase and expand the chatbot knowledge base
·      Unable to maintain continuous contextual conversation
·      Lack of conversation flow

more advanced pattern matching example
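
The percentage match suggested in step 4 could be sketched as below. I am assuming the score is the share of the input’s distinct words that also appear in the pattern, which reproduces the figures in the table; the function names and the 50% default threshold are illustrative only.

```python
import string


def words_of(text):
    """Lower-case, strip punctuation and return the set of words."""
    table = str.maketrans("", "", string.punctuation)
    return set(text.lower().translate(table).split())


def match_percentage(user_input, pattern):
    """Percentage of the input's distinct words that the pattern contains."""
    input_words = words_of(user_input)
    if not input_words:
        return 0.0
    covered = input_words & words_of(pattern)
    return 100.0 * len(covered) / len(input_words)


def best_pattern(user_input, patterns, threshold=50.0):
    """Return the highest-scoring pattern at or above the threshold."""
    scored = [(match_percentage(user_input, p), p) for p in patterns]
    score, pattern = max(scored, default=(0.0, None))
    return pattern if score >= threshold else None
```

One limitation of this scoring is that a long pattern containing every input word plus many extra ones still scores 100%, so a stricter measure may be needed once the responses file grows.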

User context aware chatbot


Expanding further on the more advanced pattern matching chatbot which does some additional formatting, the next example starts to look at the importance of embedding context into conversations.

There is a ‘userContext.json’ file which will be unique to each user. This could hold a lot of personalised information, but for this example there will be three fields: age, name and location.
JSON for user context
To do this there is a need to add the following logic and files:
·      Check if there is any user context in the input from the user
·      Check the input against an input match file
·      Update the user context based on matched input:
o   Check location against known locations
o   Check age against numbers
o   Check name against position in string input
·      Check response for any context pointers and replace them with any known context values
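
A rough Python sketch of these context checks follows. The trigger phrases, the location list, the number words and the 0.8 similarity cutoff are all assumptions made for the example; they stand in for the JSON files used by the real code, and check_for_user_context mirrors the ‘checkForUserContext’ function described in the steps.

```python
import difflib
import json
import re

KNOWN_LOCATIONS = ["london", "manchester", "cardiff"]   # stand-in for the locations JSON
NUMBER_WORDS = {"ten": 10, "twenty": 20, "thirty": 30}  # deliberately tiny
# Stand-in for the input-matching JSON: trigger phrase -> context type.
CONTEXT_PATTERNS = {"my name is": "name", "i live in": "location", "years old": "age"}


def find_name(words):
    """NAME: take the last word of the input as the name."""
    return words[-1].capitalize() if words else None


def find_location(words, cutoff=0.8):
    """LOCATION: the known location most similar to any word, above a cutoff."""
    best, best_score = None, cutoff
    for word in words:
        for place in KNOWN_LOCATIONS:
            score = difflib.SequenceMatcher(None, word, place).ratio()
            if score >= best_score:
                best, best_score = place, score
    return best


def find_age(words):
    """AGE: the first word that is a number or a known number word."""
    for word in words:
        if word.isdigit():
            return int(word)
        if word in NUMBER_WORDS:
            return NUMBER_WORDS[word]
    return None


def check_for_user_context(user_input, context):
    """Update and return the context dict if the input carries user details."""
    cleaned = re.sub(r"[^\w\s]", "", user_input.lower())
    words = cleaned.split()
    finders = {"name": find_name, "location": find_location, "age": find_age}
    for phrase, kind in CONTEXT_PATTERNS.items():
        if phrase in cleaned:
            value = finders[kind](words)
            if value is not None:
                context[kind] = value
    return context


def save_context(context, path="userContext.json"):
    """Write the updated context back to the user's context file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(context, handle, indent=2)
```

For example, check_for_user_context("Good Afternoon, my name is Aiden", {}) returns a context with the name set to ‘Aiden’. Taking the last word as the name is as fragile as it sounds (“my name is Aiden by the way” would break it), which is part of why this approach gets harder to maintain as it grows.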
The code base is updated from “More advanced pattern matching chatbot” with bold input showing changes.

1.     Input is asked for in a loop; when received, it is processed in the ‘processInput’ function
2.     The responses JSON file is loaded
3.     Format the input so that case is consistent and so that punctuation is removed.
4.     NEW: Check the user’s input to see if it matches an input pattern which denotes that user contextual data is going to be given. This will be processed in the ‘checkForUserContext’ function
a.     First, we load input patterns that might have a match. Each pattern is a string that only has to be partially present in the input. Below is the JSON for input matching, e.g. “Good Afternoon, my name is Aiden” would match with the first key in the JSON file.
Matching input JSON
b.     If there is a match follow one of three paths:
                                               i.     NAME: split the input into words and set the name context as the last word
                                             ii.     LOCATION: check the similarity of each word against known locations; the closest match above a similarity threshold is set as the location
check location against the locations JSON
JSON with locations in
                                            iii.     AGE: Find a numerical or word match to a number in the input
c.     Update the user context file with the new user information
context matching and processing
5.     Loop through all the keys in the responses JSON and check if the input text is inside the current key text. This allows partial pattern matching within the text 
6.     If there is a match, set a ‘matched’ variable and:
a.     print either the only output in the response array, or 
b.     randomly select from multiple potential outputs
7.     If no match is found:
a.     append the unmatched string to a dedicated text file for later processing
b.     If the user asked a question, send a stock response
c.     If the user didn’t ask a question, the chatbot should ask a question
8.     New: The response JSON has been updated to add context variables to make the conversation more personal and intelligent, e.g. “My name is Aiden” gets the reply “hello Aiden”.
responses with pointers to be replaced with context
            To check the response, there is a ‘checkResponse’ function:
a.     Check for the pointer and take the context type after it
b.     If there is no context of that type stored ask the user what that context is
c.     If there is context in the context JSON replace the pointer with the context text
check response and replace the pointer
9.     Respond with the determined output
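
The response check in step 8 might be sketched as follows. The pointer syntax is an assumption on my part: I use a dollar sign followed by the context type (for example “$name”), which may differ from the marker used in the real responses file, and check_response mirrors the post’s ‘checkResponse’.

```python
import re

POINTER = re.compile(r"\$(\w+)")  # assumed pointer syntax, e.g. "$name"


def check_response(response, context):
    """Replace context pointers, or ask for the first missing context value."""
    for kind in POINTER.findall(response):
        # Missing keys and empty values both count as unknown context.
        if not context.get(kind):
            return f"What is your {kind}?"
    return POINTER.sub(lambda match: str(context[match.group(1)]), response)
```

With a context of {"name": "Aiden"}, the response “hello $name” becomes “hello Aiden”. With an empty context the chatbot asks “What is your name?” instead, which gives the user a natural prompt to supply the missing detail.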

Pros:
·      Able to retain user context
·      Conversations flow better and give a semblance of intelligence
·      Could be ramped up quickly with ‘small’ wins i.e. adding job, likes, dislikes etc.

Cons:
·      Carries context but doesn't pick up sentence context, e.g. (user) I am 12 years old 32 (chatbot) Sorry, are you 12 or 32? (user) 32 (chatbot) What?
·      Time consuming to update each pattern type i.e. input, response and locations etc.
·      Hard to get right - the more you add the more complex it becomes and the greater the shortcomings become
Example of pattern matching with user context

User Input Learning pattern matching chatbot


The final example takes the form of chatbot used by the likes of ‘cleverbot’ (http://www.cleverbot.com). This system takes user input and then maps that input to real user responses. The idea is that the response of one user will generally make sense and give the perception of a conversation flowing, even if that conversation has a random or unusual flow.

The responses file in this example keeps growing as the chatbot is used, but the example needs only two files: ‘script.py’ and ‘responses.json’. The responses file takes a slightly different form, with each potential-responses array made up of JSON objects that have a ‘frequency’ and a ‘word’ key.
response JSON with frequency and word
1.     Input is asked for in a loop; when received, it is processed in the ‘processInput’ function
2.     Format the input so that case is consistent and so that punctuation is removed.
3.     The input will now be added as its own response into the response JSON in a function called ‘addInputAsResponse’:
a.     Create a new JSON key and array out of the user input
b.     Open the responses json file and load the data
c.     If the chatbot’s last response is already a JSON key in the responses file, do some additional processing:
                                               i.     If the user’s response matches something that has been said before, the ‘frequency’ for that word is increased
                                             ii.     If the user’s response has not been said before, it is added as a new response the chatbot can use later
d.     If the chatbot’s last response is not a JSON key in the responses file, add it as a JSON key with the user’s response as its first array entry
e.     Save the responses JSON file after updating it
add user input as a potential response
4.     Now the response needs to be processed. This follows the same pattern as the other examples. If there is a match either pick the only available response, or one of multiple responses
5.     If there is no match with any data from the responses JSON file, the new input (from the user) is added as a key to the responses JSON with an empty array. In this event a response is still needed, so further processing occurs in ‘getResponseWithNoData’. The idea is to reduce the number of empty arrays in the JSON keys file:
a.     First open the responses JSON and load the data
b.     Loop over the keys in the JSON, separating those with an empty array from those with a ‘known’ responses array
c.     If there are any empty responses a random ‘selector’ decides whether to use an empty JSON response key or one with responses already (this improves conversation flow). A random response is given from one of the two arrays.
d.     If there are no empty JSON response keys, a random response is given from those available.
generating a response when there is no data
6.     Respond to the user with the determined output
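
Here is a minimal Python sketch of the learning steps above. It works on an in-memory dictionary shaped like the ‘frequency’ and ‘word’ JSON, and the function names mirror ‘addInputAsResponse’ and ‘getResponseWithNoData’. Two details are my own assumptions: the 50/50 ‘selector’, and replying with an empty key’s own text so that the user’s next message teaches the chatbot a response for it.

```python
import json
import random


def add_input_as_response(responses, last_bot_reply, user_input):
    """Store the user's input as a possible response to what the bot last said."""
    entries = responses.setdefault(last_bot_reply, [])
    for entry in entries:
        if entry["word"] == user_input:
            entry["frequency"] += 1  # said before: bump its frequency
            return
    entries.append({"word": user_input, "frequency": 1})  # new response


def get_response_with_no_data(responses, rng=random):
    """Pick something to say when the input has no stored responses."""
    empty = [key for key, entries in responses.items() if not entries]
    known = [e["word"] for entries in responses.values() for e in entries]
    # Assumed selector: half the time say an unanswered key to learn a reply.
    if empty and (not known or rng.random() < 0.5):
        return rng.choice(empty)
    return rng.choice(known) if known else None


def process_input(responses, last_bot_reply, user_input, rng=random):
    """Learn from the input, then choose a reply to it."""
    if last_bot_reply is not None:
        add_input_as_response(responses, last_bot_reply, user_input)
    entries = responses.setdefault(user_input, [])  # unseen input: empty key
    if entries:
        return rng.choice(entries)["word"]
    return get_response_with_no_data(responses, rng)


def save_responses(responses, path="responses.json"):
    """Write the updated responses back to disk."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(responses, handle, indent=2)
```

Input formatting is left out here because it works the same way as in the earlier examples. On a brand new responses file the chatbot simply echoes the user back, and it only starts to sound sensible once enough people have talked to it.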

Pros:
·      Self-growing and self-editing response file
·      Easy to manage
·      Could be greatly improved with little editing of the responses file at regular intervals

Cons:
·      Little or no control over input from users; this would require additional functionality, especially for profanity or ‘offence’, as real-life bots have been maliciously ‘taught’ bad words and phrases
·      Cannot hold context for more than 1 input/response iteration
·      Needs lots of users/variety of speech forms to get the model working well
User Input Learning Example 1
User Input Learning Example 2

Conclusion

All the code provided in this blog has been given to aid anyone wanting to make a quick start on a pattern matching chatbot. These are generally quite basic examples with the best example being the user input learning pattern. To advance further in this vein, neural nets and greater NLP parsing could be looked at and used.
