Albert Louca, Omar Farouk, Youssef Mohamed, Michel Ashraf

Supervised by: Dr Eslam Amer, TA Reman Sewilam, TA Ahmed Hazem

Publishing Date



A chatbot is a software application that uses Artificial Intelligence to imitate human conversation. To improve its understanding of the user, our chatbot takes a more complex approach than a typical text-only chatbot: it gathers information from the user’s words, combines it with the sentiment in the user’s voice, and supports both with the speaker’s facial expressions, which are analyzed using Deep Learning. The collected data is then processed to generate the most suitable response for the situation, delivered in a friendly manner to provide more human-like behavior.

1.1 Purpose of this document

The purpose of this document is to clarify the requirements of our system, the Smart Behavioral Chatbot. The system shall communicate with the user through text-to-speech and speech-to-text, and shall read the user’s emotions and sentiments from the voice, face and words in order to retrieve the most suitable answer or response using the BERT model and Machine Learning. This document gives full details of the system’s functionality and back end.

1.2 Scope of this document

Our intended customer base is mainly medical institutions, which will be able to use the chatbot to provide faster, more accurate and more sensitive answers to patients’ inquiries. The system shall be finished in no more than 7 months from now. We will use specific Python libraries, such as TensorFlow, to meet our machine learning requirements.

1.3 System Overview

Our system aims to extend chatbot technology into a smart behavioral system that uses Bidirectional Encoder Representations from Transformers (BERT) together with image and video processing to carry on a conversation with people seamlessly and efficiently. It processes not only the user’s words but also the user’s actions and emotional state during the conversation (e.g. angry, sad, frustrated) through image and voice analysis. The system then composes the most fitting reply or answer through deep reinforcement learning, giving the client the feeling that he/she is talking to a conscious human being.

The client uses a video camera and a microphone to speak with our chatbot. The system extracts frames from the camera and audio from the microphone as its input data. The audio and the frames obtained from the live video are processed synchronously. On the audio side, the system converts the speech to text so that it can understand the user’s inquiry, and it extracts vocal sentiment features by analyzing physical properties of the voice and certain keywords. On the video side, the system uses face detection algorithms and machine learning to identify the client’s feelings: it recognizes feature points on the user’s face and compares them to our dataset in order to detect his/her emotional state throughout the conversation.

Using the collected data and deep reinforcement learning, the system presents the client with the most fitting answer, so that the conversation flows as easily as if it were carried on by a real human being.
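The overview above combines emotional cues from three sources (words, voice and face) before a reply is chosen. The sketch below illustrates one possible way to merge such cues, assuming each analysis step outputs a score per emotion. The weighted-average fusion, the weights, the extra "neutral" label and the function names `fuse_emotions` and `dominant_emotion` are all hypothetical and are not part of the specified design, which does not fix a fusion method.

```python
from typing import Dict, Mapping

# Emotion labels: the first three come from the overview; "neutral" is an
# assumption added for illustration.
EMOTIONS = ("angry", "sad", "frustrated", "neutral")

# Hypothetical per-modality weights (illustrative values, not tuned).
DEFAULT_WEIGHTS = {"text": 0.4, "voice": 0.3, "face": 0.3}


def fuse_emotions(
    scores_by_modality: Mapping[str, Mapping[str, float]],
    weights: Mapping[str, float] = DEFAULT_WEIGHTS,
) -> Dict[str, float]:
    """Weighted average of per-modality emotion scores.

    Weights are renormalised over the modalities actually present, so the
    function still works if, for example, no face was detected in a frame.
    """
    total_weight = sum(weights[m] for m in scores_by_modality)
    if total_weight <= 0:
        raise ValueError("at least one modality with a positive weight is required")

    fused = {emotion: 0.0 for emotion in EMOTIONS}
    for modality, scores in scores_by_modality.items():
        share = weights[modality] / total_weight
        for emotion in EMOTIONS:
            # Emotions a modality did not report count as 0.0.
            fused[emotion] += share * scores.get(emotion, 0.0)
    return fused


def dominant_emotion(fused: Mapping[str, float]) -> str:
    """Return the emotion label with the highest fused score."""
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # Made-up scores standing in for the outputs of the text, voice and
    # face analysis steps.
    example = {
        "text": {"neutral": 0.6, "sad": 0.4},
        "voice": {"sad": 0.7, "neutral": 0.3},
        "face": {"sad": 0.8, "neutral": 0.2},
    }
    fused = fuse_emotions(example)
    print(dominant_emotion(fused), fused)
```

In this sketch the fused label would be passed to the reply-selection step so that the answer can be phrased with respect to the user’s state. A learned fusion model could replace the fixed weights without changing the interface.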

1.4 System Scope

• Our system’s scope is to help clients by providing them with a more professional customer service experience. The system will react much as a human being would by reading the user’s facial expressions through a webcam and extracting his/her voice sentiment through a microphone. This data will be processed using machine learning to give the system more information about the client’s state and emotions, so that it can handle the client’s issue professionally and with respect to the user’s state. The system will also help companies deliver their customer service: it will allow them to serve more than one client at a time while maintaining performance, which is cost-effective and more reliable.

• The outcome of this system is to find suitable answers to the client’s issue while providing him/her with an experience similar to the one a human being would offer.