Abstract:
The intensive development of dialogue management systems (DMS) for numerous businesses
has resulted from the rapid expansion of conversational AI and user chat data. However,
relatively little research has been done on low-resource languages like Azerbaijani. The
primary goal of this project is to test different DMS pipeline setups to find the best natural
language understanding and dialogue management settings. In our project, we created and
tested various DMS pipelines using conversational text data received from one of
Azerbaijan's largest retail banks. Two major DMS modules, Natural Language
Understanding (NLU) and the Dialogue Manager, were studied within the scope of this
project. As the first step of NLU, we used a language identification (LI) component to
determine the language of the text. For user intent identification, traditional ML classifiers (logistic regression,
neural networks, and SVM) were compared with the Dual Intent and Entity Transformer
(DIET) intent classification architecture. For both word and character n-gram based tokens,
we employed diverse combinations of feature generators such as Count Vectorizer, Term
Frequency-Inverse Document Frequency (TF-IDF) Vectorizer, and word embeddings in
these tests. A Named Entity Recognition (NER) component was added to the pipeline
to extract important information from text messages. Entity tags were collected and
passed into the Dialogue Management module as features. The Dialogue Management
module, which includes a Rule-based Policy to handle FAQs and chitchat as well as a
Transformer Embedding Dialogue (TED) Policy to handle more complicated and
unexpected dialogue inputs, was used to monitor all NLU setups. As a result, we propose a
DMS pipeline for a financial assistant that is able to identify intents, named entities, and the
language of the input message, together with policies that allow for the generation of
appropriate actions (for predefined dialogues) and the recommendation of the best next response.
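
To illustrate the feature generators mentioned above, the following is a minimal sketch of word and character n-gram tokenization with count and TF-IDF weighting, written in plain Python. It is not the project's implementation: the function names, the smoothed IDF formula, and the toy messages are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def word_ngrams(text, n=1):
    """Lowercase, split on whitespace, and return word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n=3):
    """Lowercase and return overlapping character n-grams."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def count_vectors(docs, tokenizer):
    """Count-vectorizer style features: one term-count mapping per document."""
    return [Counter(tokenizer(doc)) for doc in docs]


def tfidf_vectors(docs, tokenizer):
    """TF-IDF features using a smoothed IDF (an assumed variant):
    idf(t) = ln((1 + N) / (1 + df(t))) + 1
    """
    counts = count_vectors(docs, tokenizer)
    n_docs = len(docs)
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())  # each document counts once per term
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


# Hypothetical toy messages, not taken from the bank's data.
messages = ["card balance", "block card"]
word_features = tfidf_vectors(messages, word_ngrams)
char_features = tfidf_vectors(messages, lambda m: char_ngrams(m, 3))
```

Either feature mapping could then be passed to a classifier such as logistic regression or an SVM. Character n-grams are often considered for morphologically rich languages because they share features across inflected forms of the same word.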