We’ve already written about the Chatbot Hackathon that was announced at Ideas2IT. As it turned out, we combined ideas from all of the teams and built one comprehensive chatbot that intelligently gathers timesheets from employees and stores them in a database, saving our payroll managers and HR a great deal of time. This is a quick recap of how we built it.
Build Evaluation
Before we got down to writing code, we had to answer three basic questions:

  1. Who is the target audience?
  2. Why are we looking to build a tool for timesheet automation now?
  3. Why are we building a chatbot to do so?

And these were the answers we came up with:

  1. The audience is the backend team and the HR and admin staff who manage timesheets, payroll and other back-office records.
  2. The company is scaling, and as 200 employees become 300, 400 and more, it will get hard to collate timesheet data manually.
  3. Chatbots are great for automatically obtaining records, rather than manually collecting data from PMs and TLs at the end of the month.

Build Objectives
Based on these answers, we decided that the following would be our primary objectives in building the chatbot:

  1. A separate reporting app for the back office is as important as the chatbot that interacts with individual employees. No more Excel sheets.
  2. The chatbot should be really user-friendly. It should make users’ lives as simple as possible, with minimal chat.

With our objectives in place, we were ready to begin building. There were a few things to remember throughout the build process:

  • The most recent technological advances and hacks relevant to chatbots.
  • The best tech stack for optimal performance.
  • Scalability – A growing number of employees engaging with the application at one time.
  • Adaptability – Different environments, devices, etc.

Build Process
We performed a quick analysis of the project at hand, and came up with the following modules:

  • UI, or an application to integrate with (like Hangouts, Messenger, Slack, extensions/plugins etc.)
  • NLP to process conversations, understand intent and context and add ‘intelligence’ to the application.
  • A database to read, write and persist data (in our case, a Google spreadsheet served as the DB for reads and writes).
  • A communication channel (we used Redis’s Pub/Sub model for socket communications, and Redis for caching as well).
  • Server (to deploy and run the application)

Tech Stack

UI

We wanted the bot to be lightweight and easy to install, so we made it a Chrome extension (size < 200 KB) that integrates with Gmail. This made the job really easy: we didn’t have to create a fresh UI, and simply used Hangouts (thanks, Google!). All we had to do was pass the data through JavaScript.

NLP
NLP (Natural Language Processing) is an essential layer when you’re building a chatbot with AI capabilities. We’ve seen a large-scale democratization of conversational AI with the arrival of platforms such as API.ai and Wit.ai. The limitation of these platforms, though, is that they offer very little visibility into the background processing logic, and we didn’t want that.

As we scoured the internet for other NLP tools, we began to realize that Python, and tools with Python support, were the way to go when building an AI application. Given all of this, we decided to go with spaCy, which not only suited our requirements but was also accurate and super fast.

Since this was a timesheet chatbot, we built a Spellcheck class and a class for basic validation. The validation class handles noise removal (removing stop words), lemmatization (reducing inflected words to their base form, e.g. treating playing and played as play) and a pattern check using fuzzywuzzy, which matches string patterns approximately and helps the bot make sensible decisions when responding to users.
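
To make these validation steps concrete, here is a minimal sketch that uses only the Python standard library. It is an illustration under stated assumptions, not the code the bot runs: the stop-word set, the lemma table and the project names are hypothetical, a small lookup table stands in for spaCy’s lemmatizer, and difflib’s SequenceMatcher stands in for fuzzywuzzy.

```python
import re
from difflib import SequenceMatcher

# Hypothetical sample data, chosen only for this illustration.
STOP_WORDS = {"i", "a", "an", "the", "on", "for", "of", "to", "my", "in"}
LEMMAS = {"worked": "work", "working": "work", "hours": "hour", "hrs": "hour"}
KNOWN_PROJECTS = ["payroll portal", "chatbot", "mobile app"]


def normalize(text):
    """Lowercase and tokenize, drop stop words, then map inflections to lemmas."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [LEMMAS.get(tok, tok) for tok in tokens if tok not in STOP_WORDS]


def best_match(phrase, candidates, threshold=0.75):
    """Return the candidate most similar to phrase, or None if nothing is close."""
    if not candidates:
        return None
    # SequenceMatcher.ratio() is a similarity score between 0.0 and 1.0.
    scored = [(SequenceMatcher(None, phrase.lower(), c).ratio(), c) for c in candidates]
    score, candidate = max(scored)
    return candidate if score >= threshold else None
```

With this sketch, `normalize("I worked 8 hours on the chatbot")` returns `['work', '8', 'hour', 'chatbot']`, and `best_match("chatbott", KNOWN_PROJECTS)` returns `'chatbot'` despite the typo. The threshold of 0.75 is an arbitrary starting point and would need tuning against real conversations.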

Database
Our timesheet was already maintained in a Google spreadsheet, for flexibility and, well, because it was free. Since that already served as our database, we used it to read and update records, with Python libraries such as gspread and pygsheets. In addition, day-to-day data was cached in a Redis server by a cron job, for faster and easier access.
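
The caching step can be sketched roughly as follows. This is a hedged illustration, not the actual cron job: the column names (`email`, `date`, `hours`), the key format and the one-day expiry are assumptions. The rows are plain dictionaries, which is the shape gspread’s `get_all_records()` returns, and the cache only needs a `set(key, value, ex=...)` method, which a redis-py client provides. A small in-memory stand-in is included so the sketch runs without a Redis server.

```python
import json


class DictCache:
    """In-memory stand-in for a Redis client, for demonstration only."""

    def __init__(self):
        self.store = {}

    def set(self, key, value, ex=None):
        # A real Redis client would expire the key after `ex` seconds.
        self.store[key] = value


def cache_timesheet_rows(rows, cache, ttl_seconds=86400):
    """Store each row under a per-employee, per-date key; return the keys written."""
    keys = []
    for row in rows:
        # Hypothetical key format: timesheet:<email>:<date>
        key = f"timesheet:{row['email']}:{row['date']}"
        cache.set(key, json.dumps(row), ex=ttl_seconds)
        keys.append(key)
    return keys
```

In a setup like the one described, a scheduled job would read the rows from the spreadsheet and pass them, along with a Redis client, to a function of this kind.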

Communication Channel
We initially thought of maintaining one channel per user, but later found that this might cause scalability issues, since Redis has a restriction on the number of channels. We then decided to maintain all of the conversations in one channel, and user information (Location, Login, Email etc.) in another, using Pub/Sub. Messages from each user are differentiated by id (in our case, we simply used email ids).
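
One way to picture the shared-channel approach is a small message envelope that carries the sender’s id. The sketch below is an assumption-laden illustration rather than the bot’s actual code: the channel name, the JSON envelope fields and the handler registry are hypothetical. The publisher only needs a `publish(channel, message)` method, which a redis-py client provides, so a recording stand-in is used here to keep the example runnable without a server.

```python
import json

CONVERSATION_CHANNEL = "conversations"  # hypothetical channel name


class RecordingClient:
    """Stand-in for a Redis client that just records what was published."""

    def __init__(self):
        self.published = []

    def publish(self, channel, message):
        self.published.append((channel, message))


def publish_message(client, email, text):
    """Wrap the text in an envelope tagged with the sender's email and publish it."""
    envelope = json.dumps({"id": email, "text": text})
    client.publish(CONVERSATION_CHANNEL, envelope)
    return envelope


def route(raw_message, handlers):
    """Decode an envelope and pass its text to the handler registered for that id."""
    envelope = json.loads(raw_message)
    handler = handlers.get(envelope["id"])
    if handler is None:
        return None  # No session is registered for this user.
    return handler(envelope["text"])
```

Because every message names its sender, a single subscriber on the shared channel can dispatch to the right conversation. With redis-py, the raw messages would typically come from the `data` field of the items yielded by `pubsub().listen()`.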
Conversation Flow

We had to whiteboard various conversation scenarios; after all, not all employees speak the same way, and we wanted to be prepared for anything. If the chatbot were to misunderstand or ignore an employee and fail to collect timesheet data, that would pretty much defeat the purpose of automating the process in the first place. Given that, we were really careful about conversation flow, and tried to preempt scenarios as best as we could.

Server
We used Flask, a lightweight Python web framework, to serve the application and expose the web services (REST APIs) that the HTTP client calls.
Our chatbot is live!

We now have a chatbot that intelligently collects employee data. Our developers are happy to have learned how to build a chatbot (and have built a few more since for clients), HR and admin are happy because they can focus on big-picture activities, and our Data Science team is happy because they have a convenient, in-house application on which to test out cool stuff such as Knowledge Representation and Reasoning (KR) and other modern ML techniques, without the pressures that come with project delivery. Win-win-win! We’re working with chatbots a lot, and are actively investing in this space, so stay tuned for more information.

Have something to add to the conversation? We’re all ears!