TY - GEN
T1 - A Bibliometric Analysis of Techniques for Word Sense Disambiguation in Morphologically Rich Languages
AU - Masethe, Hlaudi D.
AU - Masethe, Mosima A.
AU - Ojo, Sunday O.
AU - Owolawi, Pius A.
AU - Giunchiglia, Fausto
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Word Sense Disambiguation (WSD) remains a considerable challenge in Natural Language Processing, especially for morphologically rich languages (MRLs), where intricate word forms and inflections exacerbate lexical ambiguity. The lack of extensive linguistic resources and cross-lingual tools for these languages further complicates the development of efficient WSD systems. This study presents a bibliometric analysis of significant publications and co-citation networks in WSD, concentrating on MRLs, to investigate how the scientific community has tackled this problem. The study used VOSviewer to map the intellectual structure of the field, charting prominent authors, citation trends, and thematic clusters from data obtained from major academic databases. The analysis identifies Roberto Navigli as the most prominent author, with 530 citations and a total link strength of 4914, mostly due to his contributions to BabelNet and graph-based disambiguation techniques. Other prominent contributors are Eduardo Agirre (semantic similarity), Alessandro Raganato (WSD evaluation frameworks), and Martha Palmer (VerbNet and PropBank). Hwee Tou Ng, Christiane Fellbaum, Rada Mihalcea, and Ted Pedersen are also recognized for their contributions to symbolic, statistical, and hybrid WSD approaches. The study further identifies a cohort of under-cited but significant researchers, such as Pushpak Bhattacharyya, David Yarowsky, and Alexander Gelbukh, whose work underscores the fragmented character of cross-linguistic research in low-resource contexts. The co-citation analysis indicates a robust research foundation built on common tools and frameworks, while highlighting the need for greater international cooperation to extend WSD solutions to under-represented morphologically rich languages.
KW - Bibliometric Analysis
KW - Morphologically Rich Languages (MRLs)
KW - Natural Language Processing (NLP)
KW - VOSviewer
KW - Word Sense Disambiguation (WSD)
UR - https://www.scopus.com/pages/publications/105017805919
U2 - 10.1109/ICTAS64866.2025.11155589
DO - 10.1109/ICTAS64866.2025.11155589
M3 - Conference contribution
AN - SCOPUS:105017805919
T3 - 2025 Annual IEEE Conference on Information Communication Technology and Society, ICTAS 2025 - Proceedings
BT - 2025 Annual IEEE Conference on Information Communication Technology and Society, ICTAS 2025 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 Annual IEEE Conference on Information Communication Technology and Society, ICTAS 2025
Y2 - 23 July 2025 through 25 July 2025
ER -