Assessing the Capabilities of Large Language Models for Oil and Gas Industry Applications

Castanedo, F.
Bhattacharya, W.
Ghosh, S.
Takac, M.
Lahlou, S.
Iklassov, Z.
Schaffrath, M.
Reddicharla, N.
Mohan, R.
Yaslam, M.
(and 1 more author)
Abstract
Large Language Models (LLMs) are commonly evaluated on general-domain datasets such as MMLU or GSM8K. However, assessing their performance in a specific domain requires a test set with domain-specific context. This work provides a comprehensive analysis of the capabilities of state-of-the-art LLMs with respect to oil and gas knowledge, conducted in the context of the EnergyAI project within ADNOC. Our assessment uses two custom domain evaluation datasets. First, we measure the ability of the models to handle Multiple-Choice Questions (MCQ) that require solid understanding and reasoning across upstream, midstream, and downstream oil and gas operations. Second, we assess their ability to answer open-ended questions in the same domain, analyzing and quantifying the quality of the responses in terms of relevance, accuracy, and completeness. In addition to these evaluations, we examine other factors such as model size, throughput, response latency, and overall capabilities. We observe that the Llama 3.1 405B model delivers domain-specific performance comparable to proprietary models like GPT-4o. Furthermore, smaller models, such as Llama 3.1 70B and Llama 3.2 90B, deliver excellent results relative to their size and offer an attractive performance-to-cost tradeoff. Nevertheless, all current LLMs, including the most advanced open-source and proprietary models, exhibit notable limitations in their knowledge and reasoning capabilities within the oil and gas domain. These shortcomings highlight the need for targeted domain adaptation, deeper integration of technical knowledge, and rigorous evaluation tailored to the complexities of the industry.
Citation
F. Castanedo et al., “Assessing the Capabilities of Large Language Models for Oil and Gas Industry Applications,” ADIPEC, Nov. 2025, doi: 10.2118/229701-MS.
Source
Proceedings of the Abu Dhabi International Petroleum Exhibition and Conference, 2025
Conference
Abu Dhabi International Petroleum Exhibition and Conference, 2025
Keywords
Deep Learning, Evaluation, Artificial Intelligence, Natural Language, Machine Learning, Large Language Model, LLM, Application, Dataset, Llama-3
Publisher
ADIPEC