MULTIMODAL LARGE LANGUAGE MODELS FOR LOW-RESOURCE LANGUAGES AND GLOBAL SOUTH DEPLOYMENT: A COMPREHENSIVE SURVEY OF ARCHITECTURES, BENCHMARKS, AND SOCIOTECHNICAL CHALLENGES

Austin Olom Ogar; Abah Joshua; Aliyu Suleiman Muhammed; Oluwatobi Noah Akande; Faruk Obansa Muhammed; Ibrahim Anka Salihu

Authors

Austin Olom Ogar, Department of Computer Science, Nile University of Nigeria, Plot 681 Cadastral Zone C-OO, Abuja 900001
Abah Joshua
Aliyu Suleiman Muhammed
Oluwatobi Noah Akande
Faruk Obansa Muhammed
Ibrahim Anka Salihu

Abstract

Multimodal Large Language Models (MLLMs) such as LLaVA, InstructBLIP, and Qwen2-VL have unlocked joint reasoning over text and images at an unprecedented scale. The technical literature on MLLM efficiency, domain adaptation, and benchmarking has matured into a substantial corpus. However, the overwhelming majority of surveys are written from the vantage point of high-resource English-language datasets and well-resourced computing infrastructures. This survey takes a different angle. We consolidate the recent literature on multimodal language understanding through the lens of low-resource languages and Global South deployment contexts, where data scarcity, compute constraints, intermittent connectivity, and sociotechnical trust considerations interact in ways that high-resource surveys rarely address. We propose a four-quadrant taxonomy that organises the field around linguistic coverage, data scarcity, compute constraints, and sociotechnical trust. We trace the evolution of multilingual multimodal architectures across an eight-year arc, map the benchmark landscape against nine languages and five modalities, and describe a three-tier edge-regional-global deployment topology suited to low-resource environments. We survey five high-impact application domains: healthcare, agriculture, education, disaster response, and public service. A dedicated section examines sociotechnical considerations, including linguistic justice, algorithmic fairness, and regulatory readiness. We identify six concrete open research problems and outline a future research agenda. The survey is intended as a single-source reference for researchers, policymakers, and practitioners pursuing equitable multimodal AI deployment in low-resource contexts.
