MBZUAI Institutional Repository
Recent Submissions
Item Metadata only
Higher Order Cumulants-Based Method for Direct and Efficient Causal Discovery (IEEE, 2025-10-28)
Causal discovery plays a pivotal role in scientific inquiry and subsequent applications in prediction or decision-making. Many methods have been proposed, but many of them rely on independence tests, which are difficult to implement and computationally intensive. In this article, we propose a direct and computationally efficient method to determine the causal relationship between two observed variables in the linear non-Gaussian case. Building on the insight that cumulants provide information about the shape of a probability distribution, we show that the (in)dependence between two observed variables can be directly inferred from the difference in the product of certain joint cumulants of these variables. This concept is named the cause difference criterion. Based on this criterion, we introduce two practical methods, high-order cumulant (HC) and HC-linear non-Gaussian acyclic model (LiNGAM), for causal discovery in the high-dimensional case. Theoretical analyses ensure the identifiability of the proposed criteria and methods. Experimental results indicate that our methods outperform most existing methods.

Item Metadata only
The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Languages (Association for Computational Linguistics, 2025-08-01)
This paper presents the Esethu Framework, a sustainable data curation framework specifically designed to empower local communities and ensure equitable benefit-sharing from their linguistic resources. This framework is supported by the Esethu license, a novel community-centric data license. As a proof of concept, we introduce the Vuk'uzenzele isiXhosa Speech Dataset (ViXSD), an open-source corpus developed under the Esethu Framework and License.
The dataset, containing read speech from native isiXhosa speakers enriched with demographic and linguistic metadata, demonstrates how community-driven licensing and curation principles can bridge resource gaps in automatic speech recognition (ASR) for African languages while safeguarding the interests of data creators. We describe the framework guiding dataset development, outline the Esethu license provisions, present the methodology for ViXSD, and report ASR experiments validating ViXSD's usability in building and refining voice-driven applications for isiXhosa.

Item Metadata only
SED++: A Simple Encoder-Decoder for Improved Open-Vocabulary Semantic Segmentation (IEEE, 2025-10-31)
Open-vocabulary semantic segmentation aims to partition an image into distinct semantic regions based on an open set of categories. Existing approaches primarily rely on image-level pre-trained vision-language models to perform this pixel-level task. In this paper, we propose SED, a simple yet effective encoder-decoder architecture for open-vocabulary semantic segmentation leveraging pre-trained vision-language models. SED consists of a hierarchical image encoder, a text encoder, and a gradual fusion decoder. The hierarchical image encoder and text encoder collaboratively generate a cost volume, which is progressively decoded by the gradual fusion decoder to produce segmentation results. In contrast to a plain encoder, the hierarchical encoder better captures image detail while maintaining linear computational complexity with respect to input size. The gradual fusion decoder adopts a top-down structure to progressively integrate high-resolution features with the cost volume. Furthermore, a category early rejection strategy is introduced in the gradual fusion decoder to filter out non-existent categories at different layers, significantly improving inference efficiency.
Based on SED, we further introduce two modules: non-label text embedding and additional category early rejection in the encoder. Moreover, we extend our method with minimal decoder modification for open-vocabulary video semantic segmentation. Extensive experiments on multiple datasets validate the effectiveness and efficiency of our proposed method. With ConvNeXt-B, our method achieves an mIoU of 34.9% on ADE20K with 150 classes (i.e., A-150) at an inference speed of 69 ms per image on a single A6000 GPU, and an mIoU of 40.2% on the video segmentation dataset VSPW. The implementation will be publicly available at https://github.com/xb534/SED.git.

Item Metadata only
SmartSync: Revolutionizing Remote Work with AIoT Virtual Office Interactivity (Springer Nature, 2025-10-26)
Our homes are where we spend a significant portion of our time, engaging in activities from sleeping to dining. However, due to the COVID pandemic and various economic factors, an increasing number of people have started working from home, thereby escalating the demand for professional and smart home offices. A pivotal aspect of enabling effective work-from-home arrangements is the integration of smart technology in these offices. This objective can be achieved by applying the concept of Artificial Intelligence of Things (AIoT). In this work, we propose a system for real-time interaction between the user and the office environment, leveraging AIoT to collect and analyze data from both physical surroundings and virtual meetings. This approach gives users, especially those participating in virtual meetings on platforms like Zoom, enhanced control over both their virtual and physical workspaces.
Our initial research highlights the significant utility and necessity of AIoT-enhanced meetings, offering users flexible and intelligent options to manage their work and surroundings more effectively.

Item Metadata only
Joint Energy-Efficient Task Scheduling and Trajectory Optimization for Multi-UAV MEC Networks in Low-Altitude Economy (IEEE, 2025-11-03)
Driven by the increasing deployment of Internet of Things (IoT) devices and significant progress in Unmanned Aerial Vehicle (UAV) technologies, UAV-assisted Mobile Edge Computing (UMEC) has emerged as an effective approach to address the challenges of constrained ground network coverage and insufficient computational resources. Leveraging their mobility, flexible deployment, and low cost, UAVs can dynamically support edge networks by providing computation and communication services for latency- and energy-sensitive tasks. Nevertheless, task scheduling and UAV trajectory optimization in UMEC systems still face significant challenges. To address the task scheduling problem for energy-sensitive tasks in the UMEC system, this paper first formulates a mathematical model of the system’s energy consumption and establishes an optimization objective aimed at optimizing the energy cost of the UMEC system. Building upon this, we propose a task scheduling algorithm that integrates Multi-Agent Proximal Policy Optimization (MAPPO) with a bargaining game-based resource coordination mechanism. The proposed algorithm reduces the overall energy consumption and enhances system performance through a bargaining-based resource allocation and matching strategy, alongside MAPPO-driven trajectory optimization. Simulation results show that our method outperforms the baseline algorithms in terms of system utility, achieving up to a 7.4% improvement.
