
WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models

Authors: Abdullah Mushtaq, Imran Taj, Rafay Naeem, Ibrahim Ghaznavi, Junaid Qadir

Abstract: Large Language Models (LLMs) are predominantly trained and aligned in ways that reinforce Western-centric epistemologies and socio-cultural norms, leading to cultural homogenization and limiting their ability to reflect global civilizational plurality. Existing benchmarking frameworks fail to adequately capture this bias, as they rely on rigid, closed-form assessments that overlook the complexity of cultural inclusivity. To address this, we introduce WorldView-Bench, a benchmark designed to evaluate Global Cultural Inclusivity (GCI) in LLMs by analyzing their ability to accommodate diverse worldviews. Our approach is grounded in the Multiplex Worldview proposed by Senturk et al., which distinguishes between Uniplex models, reinforcing cultural homogenization, and Multiplex models, which integrate diverse perspectives. WorldView-Bench measures Cultural Polarization, the exclusion of alternative perspectives, through free-form generative evaluation rather than conventional categorical benchmarks. We implement applied multiplexity through two intervention strategies: (1) Contextually-Implemented Multiplex LLMs, where system prompts embed multiplexity principles, and (2) Multi-Agent System (MAS)-Implemented Multiplex LLMs, where multiple LLM agents representing distinct cultural perspectives collaboratively generate responses.
Our results demonstrate a significant increase in Perspectives Distribution Score (PDS) entropy from 13% at baseline to 94% with MAS-Implemented Multiplex LLMs, alongside a shift toward positive sentiment (67.7%) and enhanced cultural balance. These findings highlight the potential of multiplex-aware AI evaluation in mitigating cultural bias in LLMs, paving the way for more inclusive and ethically aligned AI systems.

Submitted 14 May, 2025; originally announced May 2025.

Comments: Preprint. Submitted to the Journal of Artificial Intelligence Research (JAIR) on April 29, 2025

arXiv:2501.03259

Toward Inclusive Educational AI: Auditing Frontier LLMs through a Multiplexity Lens

Authors: Abdullah Mushtaq, Muhammad Rafay Naeem, Muhammad Imran Taj, Ibrahim Ghaznavi, Junaid Qadir

Abstract: As large language models (LLMs) like GPT-4 and Llama 3 become integral to educational contexts, concerns are mounting over the cultural biases, power imbalances, and ethical limitations embedded within these technologies. Though generative AI tools aim to enhance learning experiences, they often reflect values rooted in Western, Educated, Industrialized, Rich, and Democratic (WEIRD) cultural paradigms, potentially sidelining diverse global perspectives. This paper proposes a framework to assess and mitigate cultural bias within LLMs through the lens of applied multiplexity. Multiplexity, inspired by Senturk et al. and rooted in Islamic and other wisdom traditions, emphasizes the coexistence of diverse cultural viewpoints, supporting a multi-layered epistemology that integrates both empirical sciences and normative values. Our analysis reveals that LLMs frequently exhibit cultural polarization, with biases appearing in both overt responses and subtle contextual cues. To address inherent biases and incorporate multiplexity in LLMs, we propose two strategies: Contextually-Implemented Multiplex LLMs, which embed multiplex principles directly into the system prompt, influencing LLM outputs at a foundational level and independent of individual prompts, and Multi-Agent System (MAS)-Implemented Multiplex LLMs, where multiple LLM agents, each representing distinct cultural viewpoints, collaboratively generate a balanced, synthesized response.
Our findings demonstrate that as mitigation strategies evolve from contextual prompting to MAS-implementation, cultural inclusivity markedly improves, evidenced by a significant rise in the Perspectives Distribution Score (PDS) and a PDS Entropy increase from 3.25% at baseline to 98% with the MAS-Implemented Multiplex LLMs. Sentiment analysis further shows a shift towards positive sentiment across cultures,...

Submitted 2 January, 2025; originally announced January 2025.

arXiv:2501.01205

Harnessing Multi-Agent LLMs for Complex Engineering Problem-Solving: A Framework for Senior Design Projects

Authors: Abdullah Mushtaq, Muhammad Rafay Naeem, Ibrahim Ghaznavi, Muhammad Imran Taj, Imran Hashmi, Junaid Qadir

Abstract: Multi-Agent Large Language Models (LLMs) are gaining significant attention for their ability to harness collective intelligence in complex problem-solving, decision-making, and planning tasks. This aligns with the concept of the wisdom of crowds, where diverse agents contribute collectively to generating effective solutions, making it particularly suitable for educational settings. Senior design projects, also known as capstone or final year projects, are pivotal in engineering education as they integrate theoretical knowledge with practical application, fostering critical thinking, teamwork, and real-world problem-solving skills. In this paper, we explore the use of Multi-Agent LLMs in supporting these senior design projects undertaken by engineering students, which often involve multidisciplinary considerations and conflicting objectives, such as optimizing technical performance while addressing ethical, social, and environmental concerns. We propose a framework where distinct LLM agents represent different expert perspectives, such as problem formulation agents, system complexity agents, societal and ethical agents, or project managers, thus facilitating a holistic problem-solving approach. This implementation leverages standard multi-agent system (MAS) concepts such as coordination, cooperation, and negotiation, incorporating prompt engineering to develop diverse personas for each agent.
These agents engage in rich, collaborative dialogues to simulate human engineering teams, guided by principles from swarm AI to efficiently balance individual contributions towards a unified solution. We adapt these techniques to create a collaboration structure for LLM agents, encouraging interdisciplinary reasoning and negotiation similar to real-world senior design projects. To assess the efficacy of this framework, we collected six proposals of engineering and computer science of...

Submitted 2 January, 2025; originally announced January 2025.

arXiv:2206.14608

Traffic Management of Autonomous Vehicles using Policy Based Deep Reinforcement Learning and Intelligent Routing

Authors: Anum Mushtaq, Irfan ul Haq, Muhammad Azeem Sarwar, Asifullah Khan, Omair Shafiq

Abstract: Deep Reinforcement Learning (DRL) uses diverse, unstructured data and makes RL capable of learning complex policies in high-dimensional environments. An Intelligent Transportation System (ITS) based on Autonomous Vehicles (AVs) offers an excellent playground for policy-based DRL. Deep learning architectures solve computational challenges of traditional algorithms while helping in real-world adoption and deployment of AVs. One of the main challenges in AV implementation is that it can worsen traffic congestion on roads if not reliably and efficiently managed. Considering each vehicle's holistic effect and using efficient and reliable techniques could genuinely help optimise traffic flow management and congestion reduction. For this purpose, we proposed an intelligent traffic control system that deals with complex traffic congestion scenarios at intersections and behind the intersections. We proposed a DRL-based signal control system that dynamically adjusts traffic signals according to the current congestion situation at intersections. To deal with congestion on roads behind the intersection, we used a re-routing technique to load-balance the vehicles on road networks. To achieve the actual benefits of the proposed approach, we break down the data silos and use all the data coming from sensors, detectors, vehicles and roads in combination to achieve sustainable results. We used the SUMO micro-simulator for our simulations. The significance of our proposed approach is evident from the results.

Submitted 27 June, 2022; originally announced June 2022.

arXiv:0906.5393

Measurable & Scalable NFRs using Fuzzy Logic and Likert Scale

Authors: Nasir Mahmood Malik, Arif Mushtaq, Samina Khalid, Tehmina Khalil, Faisal Munir Malik

Abstract: Most research on Non-Functional Requirements (NFRs) has presented NFR frameworks that integrate non-functional requirements with functional requirements, whereas we propose that some NFRs, e.g. cost and performance, can be measured, and that NFRs like usability can be scaled. Our novel hybrid approach integrates three things rather than two, i.e. Functional Requirements (FRs), Measurable NFRs (M-NFRs) and Scalable NFRs (S-NFRs). We have also found Fuzzy Logic and the Likert Scale effective for handling discretely measurable as well as scalable NFRs, as these techniques provide a simple way to arrive at a discrete or scalable NFR in contrast to a vague, ambiguous, imprecise, noisy or missing NFR. Our approach can act as a baseline for new NFR and aspect-oriented frameworks by using all types of UML diagrams.

Submitted 29 June, 2009; originally announced June 2009.

Comments: 5 Pages, International Journal of Computer Science and Information Security (IJCSIS)

Journal ref: IJCSIS June 2009 Issue, Vol. 2, No. 1

Showing 1–5 of 5 results for author: Mushtaq, A