MBZUAI Institutional Repository
Welcome to MBZUAIRep, the MBZUAI institutional repository. It is the hub for collecting and preserving the university's research output. The library manages this service to collect academic journal articles, conference proceedings, books, book chapters, theses, and dissertations from MBZUAI faculty members, students, staff, and researchers.
Featured Items
Recent Submissions
A train-time loss in a system and method for calibrating object detection
A system and method of training a deep neural network (DNN) for object detection in an object detection system. The object detection system includes a camera and a controller that includes the DNN. The method includes capturing an image with the camera, receiving the image, predicting, using the DNN, a bounding box and corresponding class label, and evaluating the prediction with a total loss function that includes an object detection loss function, a box regression loss function, and a calibration loss function that takes into account precision and confidence. The method outputs a calibrated image with the object bounding box, the corresponding label, and a respective confidence score, in which the confidence score is a probability associated with the predicted class label.

An electromagnetic indirect-driving scanning mirror for wide-field coaxial LiDAR applications (SPIE)
This paper reports an electromagnetic indirect-driving scanning mirror with an enlarged mirror plate (17 mm × 17 mm) supported by high-strength polymer hinges for wide-field coaxial LiDAR (Light Detection and Ranging) applications. An indirect-driving mechanism was developed to achieve a large tilting angle through mechanical amplification while maintaining a relatively high resonance frequency of the enlarged mirror plate. A prototype mirror was designed, fabricated, and tested. A Hall scan position sensor was integrated to monitor the pose of the mirror in real time. The testing results show a coupled resonance frequency of 54.9 Hz with an optical tilting angle of ±60°, corresponding to a field of view (FoV) of 120°. A wide-field coaxial LiDAR setup was also built based on the indirect-driving scanning mirror, and 2D imaging was demonstrated.

A Self-contained Introduction to Large Language Models
Large language models (LLMs) are a significant technique in Artificial Intelligence. There is no shortage of documents describing the basic concepts. This article, as another attempt to give an introduction to LLMs, aims to help beginners with only basic knowledge of machine learning. We try to be self-contained by giving a brief explanation of every basic concept. Further, we use a modularized design that starts from a high-level overview and gradually gets into the details of the GPT-2 model.

Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization
Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning. It operates by randomly masking image patches and reconstructing these masked patches using the unmasked ones. A key limitation of MAE lies in its disregard for the varying informativeness of different patches, as it uniformly selects patches to mask. To overcome this, some approaches propose masking based on patch informativeness. However, these methods often do not consider the specific requirements of downstream tasks, potentially leading to suboptimal representations for these tasks. In response, we introduce the Multi-level Optimized Mask Autoencoder (MLO-MAE), a novel framework that leverages end-to-end feedback from downstream tasks to learn an optimal masking strategy during pretraining. Our experimental findings highlight MLO-MAE's significant advancements in visual representation learning. Compared to existing methods, it demonstrates remarkable improvements across diverse datasets and tasks, showcasing its adaptability and efficiency.

Multiclass Confidence and Localization Calibration for Object Detection
A safety-critical control system and method with train-time calibration of object detection. A controller calibrates prediction by a deep neural network. The train-time calibration includes a multi-class confidence calibration and a bounding box localization calibration. The controller outputs a calibrated image with the object bounding box, the corresponding class label, and a respective confidence score. The confidence score is a probability associated with the predicted class label. The multi-class confidence calibration is determined as a difference between a fused mean confidence and a certainty with accuracy. The fused mean confidence is between a mean logits-based class-wise confidence and class-wise certainty. The controller determines the localization calibration by determining a deviation between a predicted mean bounding box overlap and a predictive certainty of the bounding box.
Communities in MBZUAI iRep
Select a community to browse its collections.