However, for relatively confidential internal information of an agency or organization, retrieving and analyzing this information takes a lot of time and effort. Even if you provide customer support through on-call staff, they can become overloaded when one staff member does not have enough time to support many customers, or they may be confused when they encounter questions whose answers are not easily retrieved from the internal database.
Solution with QuickSearch
Why QuickSearch?
In the information age, a reliable source of information is as valuable as any mineral resource on earth. We can easily find information on the internet about any event or person, as long as it has been shared publicly.
Sometimes, the staff on duty also have to answer the same questions again and again. Customers, for their part, sometimes cannot clearly explain what they want to learn about your product or service. This costs time and effort on both sides and can leave customers with an undesirable experience!
A QuickSearch application can handle the flow of information internally (without sharing data), provide real-time feedback with acceptable results, and handle similar queries.
How does it work?
Building on our research into text models for word segmentation, tokenization, and information extraction and encoding, we provide a number of models based on local search rather than on shared-database methods.
One of the research directions is keyword-based, built on the idea that keywords carry most of the meaning of a sentence. This helps focus on the most valuable components, eliminate meaningless ones, and reduce processing time. We use the TF-IDF and BM25 algorithms to register data and to calculate keyword relevance scores in the database so that the most relevant information is returned. Sentence-based search is another of the solutions we provide.
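The keyword-based direction can be illustrated with a small BM25 ranker. This is a minimal sketch, not the actual QuickSearch implementation: the `BM25Index` class, the whitespace `tokenize` helper, the sample FAQ entries, and the parameter values `k1=1.5` and `b=0.75` are all assumptions chosen for illustration. Japanese text in particular would need a morphological tokenizer instead of whitespace splitting.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization. Japanese text would need a
    # morphological tokenizer here instead.
    return text.lower().split()


class BM25Index:
    """Hypothetical in-memory BM25 index over a non-empty list of documents."""

    def __init__(self, documents, k1=1.5, b=0.75):
        if not documents:
            raise ValueError("at least one document is required")
        self.documents = documents
        self.k1 = k1  # term-frequency saturation (assumed value)
        self.b = b    # document-length normalization (assumed value)

        # "Registration": tokenize every document and count its terms.
        self.doc_tokens = [tokenize(d) for d in documents]
        self.term_counts = [Counter(tokens) for tokens in self.doc_tokens]
        self.avg_len = sum(len(t) for t in self.doc_tokens) / len(documents)

        # Document frequency: in how many documents each term appears.
        doc_freq = Counter()
        for tokens in self.doc_tokens:
            doc_freq.update(set(tokens))

        # Smoothed IDF, always positive, so rare terms weigh more.
        n = len(documents)
        self.idf = {
            term: math.log(1 + (n - df + 0.5) / (df + 0.5))
            for term, df in doc_freq.items()
        }

    def score(self, query, doc_id):
        counts = self.term_counts[doc_id]
        length = len(self.doc_tokens[doc_id])
        total = 0.0
        for term in tokenize(query):
            tf = counts.get(term, 0)
            if tf == 0:
                continue  # terms absent from the document contribute nothing
            denom = tf + self.k1 * (1 - self.b + self.b * length / self.avg_len)
            total += self.idf[term] * tf * (self.k1 + 1) / denom
        return total

    def search(self, query, top_k=3):
        scored = [(self.score(query, i), i) for i in range(len(self.documents))]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Keep only documents that share at least one keyword with the query.
        return [(self.documents[i], s) for s, i in scored[:top_k] if s > 0]


# Hypothetical FAQ entries used only for this example.
faq = [
    "how do i reset my password",
    "what are the opening hours of the store",
    "how can i change my delivery address",
]
index = BM25Index(faq)
for text, score in index.search("reset my password"):
    print(f"{score:.3f}  {text}")
```

In this sketch, the password entry ranks first because it matches all three query terms, while the delivery entry matches only the common word "my" and receives a much lower score.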
The sentence-based method uses the sentence as the unit for encoding and extracting information through the S-BERT model. The processing flows for data registration and data retrieval are similar, and the L2 metric is used to measure the similarity between the query and the entries in the database. This method works well when meaningful questions are provided. We also provide a solution that combines keywords and sentences, which is both fast and effective in the registration and information retrieval flows.
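Retrieval with the L2 metric amounts to a nearest-neighbour search over sentence embeddings. The sketch below assumes the embeddings have already been produced by an S-BERT model. The three-dimensional vectors, the `l2_distance` and `nearest` helpers, and the sample sentences are hypothetical stand-ins, since real S-BERT vectors typically have hundreds of dimensions.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query_vec, entries, top_k=1):
    """Return the top_k (text, distance) pairs closest to query_vec.

    entries is a list of (text, embedding) pairs, i.e. the registered data.
    A smaller L2 distance means the sentences are more similar.
    """
    scored = [(text, l2_distance(query_vec, vec)) for text, vec in entries]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


# Toy embeddings standing in for S-BERT output (assumed values).
registered = [
    ("How do I reset my password?", [0.9, 0.1, 0.0]),
    ("What are your opening hours?", [0.0, 0.8, 0.3]),
]

# Pretend this vector is the encoded query "I forgot my password".
query_vec = [0.8, 0.2, 0.1]

for text, dist in nearest(query_vec, registered, top_k=2):
    print(f"{dist:.3f}  {text}")
```

Registration and retrieval share the same encoding step in this sketch: both the stored sentences and the query are turned into vectors first, and only the final distance ranking is specific to retrieval.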
We also provide an online search solution, which uses the LlamaIndex library to register and retrieve data. This method aims to use models that have been trained on large amounts of similar data, leading to very efficient data retrieval. However, you must share information with third parties.
In summary, we provide QuickSearch applications with many engine options: keyword-based, sentence-based, combination-based, and an online mode (through the LlamaIndex library). Our research and analysis mainly target Japanese, with English and Vietnamese as upcoming goals. We also provide a chatbot that holds a natural conversation centered on the information registered in the database.
Process & Results
The system has been tested with many types of data, from the FAQ database of a customer care system to information systems for cosmetics and real estate, and the general rules of a company.
For FAQs, our system can provide answers that are very close to the request or provide similar answers from the database. Sentence-based search achieves good results on this data type. For searching for information on renting or buying real estate, keyword-based search has achieved very encouraging results. For the cosmetic information dataset and the general rule set, the information is extensive and spread across many topics, making it very challenging to retrieve the target information. The system provides reasonable results for questions that match the information in the database. We still need to work on improving our search models to provide optimal results for users.
In addition, we provide chatbot functionality to bring a more natural and satisfying experience to users.