Using Topic Modeling to Find Hidden Structures in the Language of Sesotho sa Leboa

Hlaudi Daniel Masethe*, Mosima Anna Masethe, Sunday O. Ojo, Pius A. Owolawi

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Customer calls give service providers direct feedback on their application process. These calls may also hold a wealth of previously undiscovered information about customers' queries and concerns. However, call data are usually unstructured, which makes them difficult to analyze properly. This study aims to extract valuable customer insights from recorded customer calls about the application procedure of a South African government workers' medical scheme (GEMS), using topic modeling techniques from natural language processing. The objective of the research is to examine popular topic modeling algorithms such as LDA, BERTopic, LSA, LSI, and HDP for call content analysis and categorization, in order to gain insights into human behaviors and experiences. By applying these algorithms, the research intends to give the business a thorough grasp of client wants, interests, and concerns, ultimately enabling more efficient decision-making. Natural language processing makes heavy use of topic modeling to derive topics from unstructured text, because it allows topics to be extracted automatically from a large corpus of documents. BERTopic, a transformer-based architecture, outperforms the conventional algorithms with an accuracy of 91%.
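
As a rough illustration of how topics can be extracted automatically from a corpus, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is not the authors' pipeline: the function name `lda_gibbs`, the hyperparameter values, and the toy token lists are all assumptions made for illustration, and the toy documents are invented rather than taken from the study's call data.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    Hypothetical helper: returns per-document topic proportions and the
    highest-count words per topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                       # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed topic proportions for each document.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # Up to three words with the highest non-zero count in each topic.
    top_words = [
        [w for w, c in topic_word[t].most_common() if c > 0][:3]
        for t in range(num_topics)
    ]
    return theta, top_words


# Invented toy "call" documents, already tokenized (illustrative only).
toy_docs = [
    ["application", "form", "status", "application"],
    ["payment", "contribution", "refund", "payment"],
    ["form", "status", "application", "documents"],
    ["refund", "payment", "contribution", "bank"],
]
theta, top_words = lda_gibbs(toy_docs, num_topics=2)
```

Here `theta` holds one topic distribution per document and `top_words` lists the dominant words of each topic. With real transcripts, a library implementation with proper preprocessing would normally be preferred over a hand-written sampler.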

Original language: English
Title of host publication: 2024 Conference on Information Communication Technology and Society, ICTAS 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 75-81
Number of pages: 7
ISBN (Electronic): 9798350314892
DOIs
Publication status: Published - 2024
Externally published: Yes
Event: 8th Conference on Information Communication Technology and Society, ICTAS 2024 - Hybrid, Durban, South Africa
Duration: 7 Mar 2024 to 8 Mar 2024

Publication series

Name: 2024 Conference on Information Communication Technology and Society, ICTAS 2024 - Proceedings

Conference

Conference: 8th Conference on Information Communication Technology and Society, ICTAS 2024
Country/Territory: South Africa
City: Hybrid, Durban
Period: 07/Mar/24 to 08/Mar/24

Keywords

  • BERTopic
  • HDP
  • LDA
  • LSI
  • Language models
  • Term Frequency-Inverse Document
  • Topic Modeling
