Assessing the added value of ChatGPT to our services
Abderrahim Ait Ben Moh – Hilversum, May 2023
The adoption of AI applications is increasing in professions such as journalism and media monitoring. AI-driven automation has been in development for several years, but the topic gained broad attention with the arrival of ChatGPT, and several monitoring companies are capitalizing on this from a marketing perspective. We investigated whether conversational AI, in its current form as ChatGPT, could add value to our work processes and services.
At RTV Monitor, we implemented various artificial intelligence applications years ago, including automatic speech recognition in multiple languages, automatic recognition of commercials and music, and an image-recognition system that monitors logos and burned-in subtitles.
The breakthroughs in conversational AI with greater context sensitivity, such as ChatGPT, challenge monitoring companies to review their work processes, products, and services. We are convinced that with the rise and broad acceptance of ChatGPT-like technology, people are getting used to a different way of interacting with systems. That will become the new norm.
We also believe that monitoring companies can present the ever-growing volumes of data better and more efficiently when they enable this form of interaction, and can thereby meet their clients' information needs more effectively. It is a worthy goal, but there are still obstacles to overcome.
We extensively investigated whether ChatGPT could enrich RTV Monitor data, with a focus on segmentation, summarization, and sentiment analysis. These are some of our findings:
- ‘Garbage in, garbage out’
For segmentation, summarization, and sentiment analysis alike, we noticed that the quality of the input is crucial for the output. In the case of speech recognition, our conclusion is that its quality has a significant effect on the generated output: a low-quality transcription results in a sentiment or summarization analysis with a high error margin, because errors in the input carry just as much weight as the correct text around them.
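To make this concrete, here is a minimal sketch, assuming a hypothetical sentiment lexicon and invented transcripts (this is not our actual pipeline): a naive lexicon-based score shows how a few misrecognized words can flip the outcome of a sentiment analysis.

```python
# Hypothetical illustration: a tiny sentiment lexicon and two transcripts.
# This is not RTV Monitor's pipeline; it only shows how errors propagate.
POSITIVE = {"support", "compensate", "improve", "good"}
NEGATIVE = {"damage", "problem", "delay", "bad"}

def sentiment_score(text: str) -> int:
    """Positive lexicon hits minus negative hits; the sign gives the label."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

clean = "the government will support and compensate residents for the damage"
noisy = "the government will sport and compost residents for the damage"  # ASR errors

print(sentiment_score(clean))   # 1: the positive words outweigh "damage"
print(sentiment_score(noisy))   # -1: only "damage" survives, the label flips
```

Two misrecognized words are enough to turn a positive item negative, which is exactly the error margin we observed downstream.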
- ‘Hallucinations’
During our tests, we also encountered situations in which ChatGPT generated so-called “hallucinations” when enriching our RTV data: ChatGPT invented incorrect information that was not based on the actual input data. These hallucinations are problematic, especially in analyses and decision-making based on the generated output. The difference can lie in nuances as well as in factual content. It is quite a difference whether the analysis states that “Rutte believes that the Groningers should be compensated for the earthquakes” or that “Rutte believes that compensating Groningers for earthquakes is appropriate.”
- Stability of the output
Another obstacle we observed during our tests is the stability of ChatGPT. Although it can achieve impressive results, the stability of the output can vary, even with similar input. This can lead to inconsistencies and unpredictability in the generated results. It is important to take these stability issues into account and to take the necessary measures to improve the reliability and consistency of the enriched data.
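One source of this variability can be sketched with simulated token probabilities (an assumption for illustration, not the actual ChatGPT internals or API): sampling from a softmax with a temperature parameter shows why repeated runs on the same input can differ, and why a temperature of zero makes the choice deterministic.

```python
# Illustration only: hypothetical next-token scores, not real model output.
import math
import random

def sample(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature)."""
    if temperature == 0:               # deterministic: always pick the argmax
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)                    # subtract the max for numeric stability
    exps = [math.exp(s - m) for s in scaled]
    r = rng.random() * sum(exps)
    for i, e in enumerate(exps):
        r -= e
        if r <= 0:
            return i
    return len(logits) - 1

logits = [2.0, 1.5, 0.5]               # hypothetical scores for three tokens
rng = random.Random(42)

# At temperature 0 every run returns the same token ...
print({sample(logits, 0, rng) for _ in range(20)})    # {0}
# ... while at temperature 1.0 repeated runs on identical input can differ.
print({sample(logits, 1.0, rng) for _ in range(20)})
```

Lowering the sampling temperature is one of the knobs that trades creativity for repeatability, which is why we see it as part of the tuning work ahead.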
At RTV Monitor, we have long seen the opportunities and possibilities that AI offers, but at the same time we remain critical: we consider the quality of our services to be paramount. The findings above show that further steps are needed before conversational AI techniques can provide reliable added value to our services.
These steps include improving the input, identifying and reducing hallucinations, and improving the stability of the output, among other things by tuning the model's many parameters.
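As one example of what identifying hallucinations could look like in practice, here is a minimal sketch (a hypothetical helper, not a production detector): a crude grounding check flags summary sentences whose content words barely overlap with the source transcript.

```python
# Hypothetical grounding check: flag summary sentences that share few
# content words with the source text. A first filter, not a full detector.
import re

STOPWORDS = {"the", "a", "an", "is", "are", "that", "for", "of", "and", "to", "in"}

def content_words(text):
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}

def ungrounded_sentences(source, summary, threshold=0.5):
    """Return summary sentences where less than `threshold` of the
    content words can be found anywhere in the source text."""
    src = content_words(source)
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", summary.strip()):
        words = content_words(sentence)
        if not words:
            continue
        if len(words & src) / len(words) < threshold:
            flagged.append(sentence)
    return flagged

source = "Rutte said the Groningers should be compensated for the earthquakes."
summary = ("Rutte said the Groningers should be compensated. "
           "He also announced a new gas tax.")  # second sentence is invented

print(ungrounded_sentences(source, summary))  # ['He also announced a new gas tax.']
```

Such lexical checks are cheap but blunt: legitimate paraphrases get flagged too, so in practice they could only serve as a first filter before human review.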