In the previous blog post, we looked at how chatbots work and the different types of chatbots. Now, we’ll learn how to build one ourselves.
What’s readily available?
Chatbot frameworks are much like any other software framework: they provide tools and utilities, and are usually tied to a particular programming language. Some bot frameworks also offer hosted, interactive development environments that make building bots even easier. AI services, on the other hand, are standalone, cloud-hosted platforms: they typically expose a GUI for building chatbot logic interactively, offer Machine Learning powered NLP capabilities, and communicate via RESTful APIs.
We looked at two major frameworks: Botkit for Node.js and Rasa NLU for Python. Of these, Botkit is really a toy that doesn’t lend itself to real use cases, while Rasa NLU is a more serious contender. AI services, as mentioned above, are cloud-hosted solutions for NLP that help build smart bots able to predict the flow of complex conversations. They provide a UI for constructing and training Machine Learning models that understand language intents and entities. We looked at Wit.ai, api.ai, LUIS.ai, and IBM Watson, and tabulated our results below.
SNIPS NLU, a new entrant, decided to challenge this status quo: they surveyed the scene, ran their own benchmarks, and published the results. What did we do? We took SNIPS and the others on in turn, and, voila, we even beat them! Below we compare our performance against the numbers SNIPS had tabulated. See for yourself!
How did we do this? It is relatively straightforward, and our entire program can be summarized in the steps below.
Step 1: We downloaded pre-trained word vectors learned from different sources, crawl-300d-2M.vec, which is freely available on the Internet. The file contains 2 million words, each represented as a 300-dimensional vector.
Step 2: The chatbot training data is embedded using this file; that is, the 2 million pre-trained word vectors supply the representation for our training utterances.
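Embedding an utterance then reduces to looking up each token in the vector table. A sketch under our assumptions (whitespace tokenization, lowercasing, and a zero-vector fallback for unknown words are our choices, not necessarily the post's):

```python
def embed_utterance(utterance, vectors, dim):
    """Map each token to its pre-trained vector.

    Tokens missing from the vocabulary fall back to an all-zeros
    vector so the sequence keeps its original length.
    """
    zero = [0.0] * dim
    return [vectors.get(token.lower(), zero) for token in utterance.split()]
```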
Step 3: We used an LSTM, since what we need here is the ability to remember for a long time once something is learned, while learning new things slowly. This neural network intent model is trained on the embedded training intents from the data above. With that, one part of the chatbot, intent recognition, is done.
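We won't reproduce the trained network here, but the LSTM's long-memory behaviour comes from its gated cell state. A toy, scalar version of a single step makes the mechanism concrete (the weight names and values are illustrative, not from our model):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step for scalar input and state, showing the gates.

    The cell state c carries long-term memory; the gates decide what
    to forget, what to write, and what to expose.
    """
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])   # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])   # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])   # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"]) # candidate
    c = f * c_prev + i * g    # old memory kept, plus whatever is written
    h = o * math.tanh(c)      # hidden state exposed to the next layer
    return h, c
```

When the forget gate saturates open and the input gate stays shut, the cell state passes through the step almost unchanged, which is exactly the "remember long once learned" property we wanted.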
Step 4: The next part of the chatbot, entity recognition, is built per intent as a bidirectional recurrent LSTM neural network model. That is, for each intent we build entity models over both directions of the sequence, so that an intent paired with any of its entities can flow as a seamless conversation.
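The "bidirectional" part simply means running a recurrent update over the tokens left-to-right and right-to-left, then pairing the two states at each position so every token carries context from both sides. A schematic sketch with a stand-in step function (the real model uses LSTM steps and vector states):

```python
def bidirectional(sequence, step, init=0.0):
    """Run `step(x, state) -> state` over the sequence in both
    directions and pair the two states for each position."""
    forward, state = [], init
    for x in sequence:
        state = step(x, state)
        forward.append(state)
    backward, state = [], init
    for x in reversed(sequence):
        state = step(x, state)
        backward.append(state)
    backward.reverse()  # align right-to-left states with positions
    return list(zip(forward, backward))
```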
Step 5: Test data is obtained from the same file as above, after splitting it accordingly.
Step 6: Intents and entities are predicted from the intent model and the corresponding entity model.
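Put together, prediction is a two-stage dispatch: the intent model scores the utterance, and the winning intent selects which entity model to run. A hypothetical sketch (the model interfaces and names here are ours, not from the post):

```python
def predict(utterance, intent_model, entity_models):
    """Two-stage prediction: classify the intent, then run the
    entity model that was trained for that intent."""
    scores = intent_model(utterance)             # {intent: score}
    intent = max(scores, key=scores.get)         # highest-scoring intent
    entities = entity_models[intent](utterance)  # [(token, label), ...]
    return intent, entities
```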
Voila, we have a chatbot!
Why is it that one developer tinkering with the code was able to beat industry standards?
The large projects we benchmarked ourselves against are open to users the world over, and their performance has to be top-notch at that scale. For a large cloud-based SaaS serving many customers simultaneously, the hardware requirements of a mathematically complex model become unfathomably expensive.
So the large Data Science teams behind those projects spend months building a simple model that generalizes well. In our case, we have a mathematically complex model that is harder to train, but our deployment is a simple test case, so it performs better. And it is the easier path as a programming task, too!
So building a chatbot comes down to a trade-off: do you want a model that is mathematically simple and generalizes well but is difficult to build, or one that is mathematically complex and thus generalizes poorly but is easy to build?
Your answer will determine the chart above!