Processing of text by computers

Natural Language Processing

In a world of ever-increasing flows of information, the automated processing of natural language can help us focus on what’s relevant.

Motivation

Natural Language Processing (NLP) is the processing of text by computers. By no means a young discipline, NLP has seen its fair share of advances since its humble beginnings in the 1950s. But it was in late 2022 that ChatGPT by OpenAI took the world by storm with its astonishing abilities. There are also alternative, sometimes specialized models that handle specific NLP tasks. Some of them are open source, which is an important point from a security and confidentiality perspective.

Approach

Our focus lies on use cases with high business demand, particularly semantic search, summarization, question answering, and document analysis (classification). For each use case, we evaluated different NLP models based on their performance and on non-functional criteria, e.g., how well they can be integrated into existing solutions and products. The choice between on-premise deployment and a remote API is also an important factor because it affects model maintenance and data privacy.

Improving the Adnovum internal search function is a prime example of a use case. After the semantic search of a BERT-like model proved superior to the Confluence search, we developed an application that only partially indexes the internal Confluence. We then integrated this NLP model into Elasticsearch to include data stores besides Confluence in the indexing and semantic search functions. By adding NLP models tailored to question answering, the application can go beyond merely listing matching documents, and in addition provide a fully formulated response, thus affording the speed and convenience in information retrieval that modern users expect. However, such a solution should be local, i.e., on-premise, due to privacy requirements.
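The ranking step behind such a semantic search can be illustrated with a minimal sketch. It assumes that a BERT-like encoder has already turned the query and each document into a dense vector; the encoder itself is not shown. The document IDs, the three-dimensional toy vectors, and the function names below are hypothetical and serve illustration only. They do not describe the actual Adnovum application or its Elasticsearch integration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two vectors (0.0 for a zero vector)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every document against the query and return the top_k best matches."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; a real encoder produces hundreds of dimensions.
TOY_INDEX = {
    "vacation-policy": [0.9, 0.1, 0.0],
    "vpn-setup": [0.0, 0.2, 0.9],
    "expense-report": [0.6, 0.7, 0.1],
}
TOY_QUERY = [1.0, 0.2, 0.0]

if __name__ == "__main__":
    for doc_id, score in rank_documents(TOY_QUERY, TOY_INDEX):
        print(f"{doc_id}: {score:.3f}")
```

Unlike keyword search, this ranking depends only on how close the vectors are, so a document can match a query without sharing any words with it. In a production setup, a vector-capable search engine would typically perform this comparison over an index instead of scanning every document in Python.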

Expected results

Publicly available NLP models, whether run on-premise or accessed through a remote API, are a perfectly viable approach for the use cases we implemented and validated. Our clients can test these models with their own data, for example, in the areas of semantic search, summarization, and question answering. With further fine-tuning, clients can adapt the models to their domain-specific vocabularies and tasks. Major European languages are supported, including English, German, French, and Italian. Commercial models hosted in cloud environments guarantee data security.

Status

Through workshops, we share the expertise gained from our technical analyses and PoC implementations with clients from various industries. Clients can feed their own data into our demo applications to gain an immediate impression of the added value NLP technologies offer for their internal systems and processes.

Conversational AI

ChatGPT has given rise to new questions surrounding conversational AI, which can benefit from NLP technologies in two ways. The first is improving core features such as speech-to-text transcription and intent recognition. The open-source solution Whisper by OpenAI is an impressive example of how powerful language models can boost transcription. The second is additional functionality, such as automatic conversation summarization. Together with our conversational AI team, we are exploring improvements in both areas.

In sum, ChatGPT on its own can’t replace a professional conversational AI solution. But it can be a major component of a much-improved product. We are currently exploring just how ChatGPT can best contribute to such a solution.