We’ve already spoken of the Chatbot Hackathon that was announced at Ideas2IT. As it turned out, we used some of the ideas from all of the teams and built one comprehensive chatbot to intelligently gather timesheets from employees and store them in a database, saving our payroll managers and HR massive amounts of time. This is a quick recap of how we built our chatbot.
Before we got down to writing code, we had to answer three basic questions:
And these were the answers we came up with:
Based on these answers, we decided that the following would be our primary objectives in building the chatbot:
With our objectives in place, we were ready to begin building. There were a few things to remember throughout the build process:
We performed a quick analysis of the project at hand and came up with the following modules:
NLP (Natural Language Processing) is an inevitable layer when you’re building a chatbot with AI capabilities. We’ve seen a large-scale democratization of conversational AI, with the arrival of platforms such as API.ai and WIT.ai. The limitation with these frameworks, though, is that they leave very little room to understand background processing logic, and we didn’t want that.
As we scoured the internet for other NLP tools, we began to realize that Python, and tools with Python support, were the way to go when building an AI application. Given all of this, we decided to go with spaCy, which not only suited our requirements but was also accurate and super fast.
Since this was a timesheet chatbot, we built a Spellcheck class and a class for basic validation: noise removal (removing stop words), lemmatization (reducing words to their base forms, e.g., treating “playing” and “played” as “play”), and a pattern check using fuzzywuzzy, which matches string patterns and helps the bot make rational decisions when sending responses to users.
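A minimal sketch of that validation pipeline is below. The stop-word set, the inflection map, and the command list are all tiny illustrative stand-ins (the real system uses spaCy’s lemmatizer and fuzzywuzzy’s ratio scoring; here the standard library’s `difflib` stands in for the latter so the sketch is self-contained):

```python
import difflib
import string

STOP_WORDS = {"the", "a", "an", "is", "i", "my", "for", "on", "to"}  # tiny illustrative set
# Toy inflection map standing in for spaCy's lemmatizer (token.lemma_ in the real pipeline)
LEMMAS = {"playing": "play", "played": "play", "worked": "work", "working": "work"}

def clean(message: str) -> list:
    """Lowercase, strip punctuation, drop stop words, and lemmatize."""
    tokens = message.lower().translate(str.maketrans("", "", string.punctuation)).split()
    return [LEMMAS.get(t, t) for t in tokens if t not in STOP_WORDS]

def best_match(token: str, known_commands: list, threshold: float = 0.8):
    """Fuzzy-match a token against known commands; difflib's ratio
    stands in for fuzzywuzzy's fuzz.ratio here."""
    scored = [(difflib.SequenceMatcher(None, token, cmd).ratio(), cmd) for cmd in known_commands]
    score, cmd = max(scored)
    return cmd if score >= threshold else None

print(clean("I worked on the payroll module"))                     # → ['work', 'payroll', 'module']
print(best_match("timesheat", ["timesheet", "leave", "status"]))   # → 'timesheet'
```

The fuzzy step is what lets the bot recover from typos like “timesheat” instead of rejecting the message outright.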
Our timesheet was already maintained in Google Sheets, for flexibility and, well, because it was free. Since the spreadsheet already served as our database, we used it to read and update records, via Python libraries such as gspread and pygsheets. In addition, day-to-day data was cached in a Redis server by a cron job, for faster and easier access.
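The caching pattern can be sketched as a simple read-through cache. In this sketch a plain dict stands in for the Redis client and a stub stands in for the sheet read (in the real system that would be something like gspread’s `get_all_records()`); the names and sample row are illustrative:

```python
import json
import time

# Stand-in for a Redis client; in production this would be redis.Redis(),
# and fetch_timesheet_rows would call gspread/pygsheets against the sheet.
cache = {}
CACHE_TTL = 24 * 60 * 60  # seconds; the cron job refreshes roughly daily

def fetch_timesheet_rows(sheet_name: str) -> list:
    """Stub for the Google Sheets read (gspread's get_all_records in the real system)."""
    return [{"email": "dev@example.com", "hours": 8, "date": "2017-09-01"}]

def get_rows(sheet_name: str, now=None) -> list:
    """Read-through cache: serve from the cache if fresh, else re-read the sheet."""
    now = now or time.time()
    entry = cache.get(sheet_name)
    if entry and now - entry["ts"] < CACHE_TTL:
        return json.loads(entry["data"])          # cache hit
    rows = fetch_timesheet_rows(sheet_name)       # cache miss: hit the sheet
    cache[sheet_name] = {"ts": now, "data": json.dumps(rows)}
    return rows
```

Serving day-to-day reads from the cache keeps the bot responsive and avoids hammering the Sheets API on every message.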
We initially thought of maintaining one channel per user, but realized this might cause scalability issues, since Redis imposes practical limits on the number of channels. We then decided to maintain all of the conversations in one channel, and user information (location, login, email, etc.) in another, using Redis Pub/Sub. Messages from users are differentiated by id (in our case, we just used email ids).
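Since every conversation shares one channel, each payload has to carry its sender’s id. A minimal sketch of that message envelope follows; the channel name and helper names are illustrative, and the actual `publish`/`subscribe` calls (via redis-py) are only indicated in comments so the sketch stays self-contained:

```python
import json

CONVERSATION_CHANNEL = "conversations"  # single shared channel (name is illustrative)

def make_envelope(email: str, text: str) -> str:
    """Wrap a user message with its sender id before publishing.
    In production this string would go through
    redis_client.publish(CONVERSATION_CHANNEL, payload)."""
    return json.dumps({"id": email, "text": text})

def route(envelope: str):
    """On the subscriber side, recover which user a message belongs to."""
    msg = json.loads(envelope)
    return msg["id"], msg["text"]

payload = make_envelope("dev@example.com", "8 hours on payroll today")
print(route(payload))  # → ('dev@example.com', '8 hours on payroll today')
```

Keying every message by email id is what makes the single shared channel workable: the subscriber can demultiplex conversations without one channel per user.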
We had to whiteboard various conversation scenarios; after all, not all employees speak the same way, and we wanted to be prepared for anything. If the chatbot were to misunderstand or ignore an employee and fail to collect timesheet data, that would pretty much defeat the purpose of automating the process in the first place. Given that, we were really careful about conversation flow, and tried to preempt scenarios as best as we could.
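One way to encode a whiteboarded flow is as a small state table the bot walks through. The states, prompts, and transitions below are purely illustrative, not our exact flow:

```python
# A minimal state table for one flow: greet -> ask hours -> confirm -> done.
FLOW = {
    "start":       {"prompt": "Hi! How many hours did you log today?", "next": "await_hours"},
    "await_hours": {"prompt": "Got it. Shall I record that? (yes/no)", "next": "confirm"},
    "confirm":     {"prompt": "Done. Your timesheet is updated!",      "next": "start"},
}

def step(state: str):
    """Return the bot's prompt for the current state and the state that follows."""
    node = FLOW[state]
    return node["prompt"], node["next"]

state = "start"
for _ in range(3):
    prompt, state = step(state)
    print(prompt)
```

Keeping the flow in data rather than code made it easy to whiteboard new scenarios first and wire them in afterwards.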
We deployed the application with Flask, a lightweight Python web framework, which also made it easy to expose web service calls (REST APIs) to the HTTP client.
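A minimal sketch of such an endpoint is below (assuming Flask is installed; the route name, payload shape, and canned reply are illustrative, not our actual API):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/message", methods=["POST"])  # endpoint name is illustrative
def message():
    """Receive a user message from the HTTP client and return the bot's reply."""
    payload = request.get_json(force=True)
    user, text = payload.get("id", ""), payload.get("text", "")
    # In the real system, `text` would pass through the NLP pipeline here.
    reply = "Thanks {}, recorded: {}".format(user, text)
    return jsonify({"reply": reply})

if __name__ == "__main__":
    app.run(port=5000)  # Flask's built-in development server
```

The HTTP client (in our case, the chat frontend) simply POSTs JSON to this route and renders the bot’s reply.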
Our chatbot is live!
We now have a chatbot that intelligently collects employee data. Our developers are happy for having learned how to build a chatbot (and have built a few more since for clients), HR and admin are happy because they can focus on big-picture activities, and our Data Science team is happy, because they have a convenient, in-house application on which to test out cool stuff such as Knowledge Representation and Reasoning (KR), and other modern ML techniques without the pressures that come with project delivery. Win-win-win! We’re working with chatbots a lot, and are actively investing in this space, so stay tuned for more information.