Building Conversational AI Chatbots with MinIO

Chatbots have become one of the most ubiquitous elements of AI, and they are easily the type of AI that humans interact with most, whether they realize it or not. At their core is Natural Language Processing (NLP), a field of study within the broader domain of AI that deals with a machine's ability to understand language, both written text and the spoken word, as humans do.

The goal of NLP is for the computer to carry out a conversation that is complete in terms of context, tone, sentiment and intent.

Chatbots have evolved remarkably over the past few years, accelerated in part by the pandemic’s push to remote work and remote interaction. As with all AI systems, learning is part of the fabric of the application, and the corpus of data available to chatbots has produced performance that some find unnervingly good.

One advantage of chatbots is that they are packaged as applications and can therefore be embedded into websites and phone lines, and integrated with commerce applications, payment systems and CRM systems.

Another advantage is that enterprise identity, payment and notification services can be safely and reliably integrated into the messaging systems. This makes it easier to support customers' needs and to re-establish a connection with inactive or disconnected users in order to re-engage them.

Different Types of Chatbots

There is an excellent scholarly article by Eleni Adamopoulou and Lefteris Moussiades that outlines the different types of chatbots and what they are useful for. We have paraphrased it below, but we encourage readers to take in the whole article because it also covers some of the foundational building blocks.

  • Knowledge Domain (Open): Open domain chatbots can talk about general topics and respond appropriately.
  • Knowledge Domain (Closed): Closed domain chatbots are focused on a particular knowledge domain and might fail to respond to other questions.
  • Service Provider: This classification considers the sentimental proximity of the chatbot to the user and the amount of intimate interaction that takes place.
  • Inter/Intra personal: These chatbots operate either within the personal domain or within the domain of communication, and are generally considered to be friendly.
  • Informative: These are built from a static fixed data source that is stored ahead of time.
  • Chat based/Conversational: Chat-based/Conversational chatbots talk to the user, like another human being, and their goal is to respond correctly to the sentence they have been given.
  • Task-based: Task-based chatbots perform a specific task such as booking a flight or helping somebody. These are intelligent in the context of asking for information and understanding the user’s input, using rules and NLP.

Architecture Models for Chatbots

Again drawing on the work of Eleni Adamopoulou and Lefteris Moussiades, we find that they break the architecture down into three models. We agree with their classification, so we recreate it below:

  • Rule-based Model
      • Fixed predefined set of rules
      • Recognizes the lexical form of input text
      • Human hand-coded knowledge
  • Retrieval-based Model
      • Queries the knowledge base collected over time
      • Analyzes available resources using APIs
      • Retrieves potential responses from an index and applies matching for the appropriate answer
  • Generative Model
      • More human-like; generates answers based on previous knowledge
      • Applies ML algorithms and deep learning techniques
      • Depends on how you build and train the ML models
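
To make the difference between the first two models concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the implementation from the article cited above: the rules, the sample knowledge base, the function names and the Jaccard overlap threshold are all hypothetical. A generative model is left out because it requires a trained ML model.

```python
import re
from typing import Optional, Set

# Rule-based model: hand-coded patterns matched against the lexical form of the input.
# The rules below are hypothetical examples.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE), "Hello! How can I help you?"),
    (re.compile(r"\bopening hours\b", re.IGNORECASE), "We are open 9am to 5pm."),
]


def rule_based_reply(message: str) -> Optional[str]:
    """Return the response of the first rule whose pattern matches, else None."""
    for pattern, response in RULES:
        if pattern.search(message):
            return response
    return None


# Retrieval-based model: an index of known questions with stored answers.
# The entries are hypothetical examples.
KNOWLEDGE_BASE = {
    "how do i reset my password": "Use the 'Forgot password' link on the sign-in page.",
    "where is my order": "You can track your order from the Orders page.",
}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieval_based_reply(message: str, min_overlap: float = 0.5) -> Optional[str]:
    """Return the stored answer whose question has the highest Jaccard overlap
    with the message, or None if no question reaches min_overlap."""
    query = tokenize(message)
    best_answer, best_score = None, 0.0
    for question, answer in KNOWLEDGE_BASE.items():
        candidate = tokenize(question)
        union = query | candidate
        score = len(query & candidate) / len(union) if union else 0.0
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer if best_score >= min_overlap else None
```

The rule-based function only recognizes what a human wrote a pattern for, while the retrieval-based function can answer any message that is close enough to something already in its index. Neither one produces an answer that was not stored ahead of time, which is the gap a generative model is meant to fill.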

We are interested in the generative models for implementing a modern conversational AI chatbot. Let us look at the chatbot architecture in general and expand further to enable NLP to improve the knowledge base.

General Chatbot Design and Architecture

Our approach follows the generally accepted best practice of using building blocks, which is consistent with how one would scale MinIO. For our chatbot design, we want modularity that allows for a) accurate knowledge representation, b) a strategy for developing answers, and c) predetermined responses for when the machine does not understand.

Typically, these architectures are based on a retrieval-based model.

If we identify the components in the above diagram, then apart from the user interface we need a user message analysis component that handles language understanding and works out the context of the message, with confirmation. We also need a dialog manager that interfaces between the analyzed message and the backend system and can execute actions for a given user message. The dialog manager also interfaces with response generation so that the reply is meaningful to the user. The action execution module, in turn, interfaces with the data sources where the knowledge base is curated and stored.
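
The flow between these components can be sketched in a few lines of Python. This is a simplified illustration, not a reference design: the intents, the keyword matching that stands in for language understanding, the in-memory knowledge base and every function name are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical stand-in for the curated knowledge base held in a data source.
KNOWLEDGE_BASE: Dict[str, str] = {
    "opening_hours": "9am to 5pm, Monday to Friday",
    "return_policy": "30 days with a receipt",
}

# Predetermined response for when the machine does not understand.
FALLBACK = "Sorry, I did not understand that. Could you rephrase?"


@dataclass
class AnalyzedMessage:
    text: str
    intent: Optional[str]  # None when the message was not understood


def analyze_message(text: str) -> AnalyzedMessage:
    """User message analysis: keyword matching stands in for real NLU."""
    lowered = text.lower()
    if "hours" in lowered or "open" in lowered:
        return AnalyzedMessage(text, "opening_hours")
    if "return" in lowered or "refund" in lowered:
        return AnalyzedMessage(text, "return_policy")
    return AnalyzedMessage(text, None)


def execute_action(intent: str) -> Optional[str]:
    """Action execution: look the intent up in the data source."""
    return KNOWLEDGE_BASE.get(intent)


def generate_response(intent: str, result: str) -> str:
    """Response generation: wrap the raw result in a user-facing sentence."""
    templates = {
        "opening_hours": "We are open {result}.",
        "return_policy": "Our return policy is {result}.",
    }
    return templates[intent].format(result=result)


def dialog_manager(text: str) -> str:
    """Route a message through analysis, action execution and response generation."""
    message = analyze_message(text)
    if message.intent is None:
        return FALLBACK
    result = execute_action(message.intent)
    if result is None:
        return FALLBACK
    return generate_response(message.intent, result)
```

Keeping each stage behind its own function is what gives the design its modularity: the keyword matcher can be swapped for an NLU model, or the dictionary for a real data source, without touching the dialog manager.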

Conversational AI Chatbot Architecture with MinIO

With the advent of AI/ML, simple retrieval-based models no longer suffice for business chatbots. The architecture needs to evolve into a generative model to build conversational AI chatbots. Adding human-like conversation capabilities to your business applications by combining NLP, natural language understanding (NLU), and natural language generation (NLG) has become a necessity. These interfaces continue to grow and are becoming one of the preferred ways for users to communicate with businesses.

The recent COVID-19 pandemic accelerated the adoption of conversational AI interfaces. Enterprises were forced to develop interfaces that engage with users in new ways, gather the required user information, and integrate with back-end services to complete the required tasks.

Public cloud service providers have been at the forefront of innovation when it comes to conversational AI with virtual assistants. Innovations in text-to-speech and speech-to-text frameworks and integrations with NLP/NLU frameworks have enabled enterprises to build highly effective chatbot and voice experiences, increase user satisfaction, reduce operational costs, and streamline business processes, all while speeding up time-to-market.

However, with data often distributed across public cloud, private cloud, and on-site locations, a multi-cloud strategy has become a priority. Kubernetes and Dockerization have leveled the playing field, allowing software to be delivered consistently across deployments irrespective of location. MinIO has taken storage to the next level by adopting these advancements: MinIO clusters with replication enabled can now bring the knowledge base to where the compute exists. On the compute side, we can apply NLP and NLU to the knowledge base and continuously improve the content that is delivered in response to customers’ queries through phones, virtual assistants, and bots. Using Kubeflow and Kale pipelines, data analysts and scientists can work in the background to implement continuously learning AI models on the data captured from user habits, and to optimize responses and content so that they serve the frontend components efficiently.

One such example of a generative model, depicted here, takes advantage of the Google Text-to-Speech (TTS) and Speech-to-Text (STT) frameworks to create conversational AI chatbots. MinIO replaces the backend systems, and data is ingested directly into MinIO. As user habits are recorded with NLU, the user data is also made available in MinIO alongside the knowledge base for background analysis and machine learning model implementation. For more information on how to configure Kubeflow and MinIO, follow this blog.
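
As a rough sketch of how recorded user interactions might land in MinIO, the Python function below serializes one conversation turn as JSON and uploads it through a client's put_object method. The bucket name, the object key layout and the record fields are hypothetical choices for this example. The client is assumed to be a minio.Minio instance from the MinIO Python SDK, whose put_object method accepts bucket_name, object_name, data, length and content_type; any object exposing the same method would work.

```python
import io
import json
from datetime import datetime, timezone
from typing import Any


def store_interaction(client: Any, bucket: str, user_id: str,
                      utterance: str, intent: str) -> str:
    """Upload one conversation turn as a JSON object and return its object name.

    `client` is assumed to expose put_object(bucket_name, object_name, data,
    length, content_type), as the MinIO Python SDK client does.
    """
    now = datetime.now(timezone.utc)
    record = {
        "user_id": user_id,
        "utterance": utterance,
        "intent": intent,
        "timestamp": now.isoformat(),
    }
    payload = json.dumps(record).encode("utf-8")
    # Hypothetical key layout: one prefix per user, one object per turn.
    object_name = f"interactions/{user_id}/{now.strftime('%Y%m%dT%H%M%S%fZ')}.json"
    client.put_object(
        bucket_name=bucket,
        object_name=object_name,
        data=io.BytesIO(payload),
        length=len(payload),
        content_type="application/json",
    )
    return object_name
```

Writing each turn as its own small object under a per-user prefix keeps ingestion simple and lets background jobs list and read the data by prefix. The endpoint, credentials and bucket creation are deployment-specific and are not shown here.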

Zooming into the architecture with MinIO, we can take advantage of open source NLP projects such as Apache OpenNLP and Stanford’s CoreNLP package to perform NLU and NLP on the datasets that user activity captures in MinIO. You can also leverage third-party datasets to enrich the knowledge base so that it continuously learns and is optimized for business needs. Then build an NLP pipeline that handles text processing steps such as cleaning, normalization, tokenization, stop word removal, named entity recognition, and stemming and lemmatization. Once processing is complete and the data is stored in a curated bucket, you have a simplified bag-of-words representation, which disregards grammar and even word order but keeps multiplicity. We can then train the model continuously with the data available from existing and new knowledge bases. Let MinIO take care of the underlying storage and manage it for you, as shown in the diagram below.
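
The text processing steps named above can be sketched with nothing but the Python standard library. This is a toy pipeline under stated assumptions, not what OpenNLP or CoreNLP do internally: the stop word list is a small hypothetical sample, the suffix stripper is a crude stand-in for a real stemmer or lemmatizer, and named entity recognition is omitted because it needs a trained model.

```python
import re
from collections import Counter
from typing import List

# Small hypothetical stop word list; real pipelines use a much larger one.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "to", "of", "in"}


def clean(text: str) -> str:
    """Cleaning: drop HTML-like tags and collapse runs of whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def normalize(text: str) -> str:
    """Normalization: lowercase the text."""
    return text.lower()


def tokenize(text: str) -> List[str]:
    """Tokenization: split into alphanumeric tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text)


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Stop word removal: drop very common words that carry little meaning."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token: str) -> str:
    """Crude suffix stripping; a stand-in for a real stemmer or lemmatizer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def bag_of_words(text: str) -> Counter:
    """Run the pipeline and count stems: order is discarded, multiplicity is kept."""
    tokens = remove_stop_words(tokenize(normalize(clean(text))))
    return Counter(stem(t) for t in tokens)
```

For the input "<p>Booked flights and booking a flight</p>", the pipeline yields the counts book: 2 and flight: 2. Reordering the words gives the same result, which is exactly the property of a bag-of-words representation described above.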

Conclusion

Alan Turing said, “I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.” With AI/ML today, it is possible to continuously improve the expected result of user interactions with conversational AI, and not be contradicted.

Let open source software simplify your enterprise conversational AI needs, and let MinIO handle the storage so that you can enable continuous learning and optimize the knowledge base for an improved chatbot experience.