PanGu (Huawei)
Huawei PanGu (Chinese: 盘古) is a family of ultra-large pre-trained artificial intelligence models (foundation models) developed by Huawei Cloud. The name "PanGu" refers to Pangu, the first living being in Chinese mythology, who is said to have created the world[1]. The PanGu family covers several domains, including natural language processing (NLP), computer vision (CV), multimodal analysis, predictive modeling, and scientific computing.
History and Development
PanGu-α (2021)
The first model in the family, PanGu-α (PanGu-Alpha), was introduced in April 2021. With 200 billion parameters, it became the largest Chinese language model at the time, surpassing the size of GPT-3 (175 billion) from OpenAI[2].
The model was developed by the Huawei Cloud team in collaboration with Noah's Ark Lab and trained on a cluster of 2048 Huawei Ascend 910 AI processors using the MindSpore framework[3]. The training corpus consisted of 1.1 TB of high-quality Chinese text data. PanGu-α performed strongly on the CLUE (Chinese Language Understanding Evaluation) benchmark, taking first place in the overall rankings[1].
PanGu 3.0 (2023): A Platform Approach
In July 2023, Huawei unveiled the PanGu 3.0 platform, marking a shift from a single model to a multi-layered "5+N+X" architecture focused on industrial applications[4].
- L0 (Base Layer): Five fundamental models (NLP, CV, multimodal, predictive, and scientific computing).
- L1 (Industry Layer): N industry-specific models, fine-tuned from the base models for specific sectors (government, finance, manufacturing, etc.).
- L2 (Scenario Layer): X models for specific application scenarios (virtual assistant, typhoon track prediction, etc.).
This hierarchical approach lets customers either use ready-made solutions or fine-tune industry models on their own data, which simplifies adaptation and reduces its cost.
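The layering can be pictured as a tree in which each scenario model points back to an industry model and, through it, to a base model. The sketch below is only an illustration: the `Model` class, the `lineage` helper, and the specific pairing of "virtual assistant" with "government" and "NLP" are assumptions, since the text does not say which base or industry model a given scenario model derives from.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    """One node in the 5+N+X hierarchy (hypothetical representation)."""
    name: str
    layer: str                       # "L0", "L1", or "L2"
    parent: Optional["Model"] = None


def lineage(model):
    """Walk from a model up to its L0 base, returning layer-qualified names."""
    chain = []
    node = model
    while node is not None:
        chain.append(f"{node.layer}:{node.name}")
        node = node.parent
    return chain


# Hypothetical pairing chosen for illustration; not taken from the source.
nlp = Model("NLP", "L0")
government = Model("government", "L1", parent=nlp)
assistant = Model("virtual assistant", "L2", parent=government)

print(" <- ".join(lineage(assistant)))
# prints: L2:virtual assistant <- L1:government <- L0:NLP
```

A customer who fine-tunes on their own data would, in this picture, attach a new node under an existing L1 model instead of starting from L0.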
PanGu 5.5 (2025): Mixture-of-Experts Architecture
In June 2025, Huawei announced PanGu 5.5, an update aimed at advanced industrial problems. Its key feature is a Mixture-of-Experts (MoE) architecture with 256 expert sub-networks, which brings the total parameter count to 718 billion[5]. An MoE model activates only a subset of its experts for each input rather than the whole network, which, according to Huawei, yields an eight-fold increase in inference efficiency compared to previous generations[6].
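The selective activation described above can be illustrated with a minimal top-k gating sketch. This is not Huawei's implementation: the source states only that there are 256 experts, so the number of experts selected per token (`TOP_K = 8`), the random router scores, and the function names are assumptions made for illustration.

```python
import math
import random

NUM_EXPERTS = 256  # expert count reported for PanGu 5.5
TOP_K = 8          # hypothetical: experts activated per token (not from the source)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_scores, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Weights are the softmax probabilities of the selected experts,
    renormalized so that they sum to 1.
    """
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    mass = sum(probs[i] for i in chosen)
    return [(i, probs[i] / mass) for i in chosen]


rng = random.Random(0)
# Stand-in for the output of a learned router network.
scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
assignment = route_token(scores)
active_fraction = len(assignment) / NUM_EXPERTS
print(f"{len(assignment)} of {NUM_EXPERTS} experts selected for this token")
```

Only the selected experts run for a given token, so the compute per token depends on `TOP_K` and not on the total number of experts.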
Key Architectural and Technical Solutions
The PanGu models are built on a GPT-like transformer architecture but include several innovations for training ultra-large models. In PanGu-α, an additional query layer is placed on top of the stacked transformer layers to explicitly induce the expected output during pre-training[3].
The training and deployment of PanGu models are tightly integrated with Huawei's own hardware and software platform:
- Ascend 910 Processors: Specialized AI accelerators that form the basis of the computing clusters.
- MindSpore Framework: An open-source deep learning platform whose auto-parallel feature combines five dimensions of parallelism (data parallelism, op-level model parallelism, pipeline model parallelism, optimizer model parallelism, and rematerialization) to distribute computation efficiently across thousands of processors[3].
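A back-of-the-envelope calculation shows why a model of PanGu-α's size cannot rely on data parallelism alone, which keeps a full copy of the weights on every device. The parameter count and cluster size come from the text; the 2 bytes per parameter and the 32 GB of device memory are assumptions used only for illustration, and the estimate ignores gradients, optimizer state, and activations, which add substantially more.

```python
import math

PARAMS = 200e9            # PanGu-α parameter count (from the text)
CLUSTER_DEVICES = 2048    # Ascend 910 processors in the cluster (from the text)
BYTES_PER_PARAM = 2       # assumption: 16-bit weights
DEVICE_MEMORY_GB = 32     # assumption: memory per device, for illustration only

# Memory needed just to hold one copy of the weights.
weights_gb = PARAMS * BYTES_PER_PARAM / 1e9

# Would a full replica fit on one device, as pure data parallelism requires?
fits = weights_gb <= DEVICE_MEMORY_GB

# Smallest number of devices over which the weights must be split.
min_devices = math.ceil(weights_gb / DEVICE_MEMORY_GB)

# Weight memory per device if the weights were sharded evenly over the cluster.
per_device_gb = weights_gb / CLUSTER_DEVICES

print(f"weights: {weights_gb:.0f} GB")                  # weights: 400 GB
print(f"fits on one device: {fits}")                    # fits on one device: False
print(f"minimum devices for weights: {min_devices}")    # minimum devices for weights: 13
```

Under these assumptions the weights alone exceed a single device's memory by more than an order of magnitude, which is why the model itself has to be partitioned across devices in addition to the data.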
Specialized Models and Their Applications
PanGu-Weather
One of the best-known models in the family is PanGu-Weather, a global weather forecasting model based on deep learning. In July 2023, a paper describing it was published in the journal Nature[7].
The model has been shown to exceed the accuracy of traditional numerical weather prediction from the European Centre for Medium-Range Weather Forecasts (ECMWF) while running far faster. Generating a 24-hour global forecast takes the model mere seconds instead of several hours of supercomputer time, a speed-up of approximately 10,000 times[7]. In August 2023, PanGu-Weather forecasts were integrated into the ECMWF service for use in operational forecasting[8].
Industrial Applications
PanGu models have been implemented in over 500 scenarios across 30 industries. Some examples include:
- Agriculture: The Chinese Academy of Agricultural Sciences (CAAS) used PanGu to develop a breeding model, which helped cultivate an experimental rice variety with improved lodging resistance[5].
- Oil and Gas Industry: China National Petroleum Corporation (CNPC) uses a PanGu model to detect pipeline defects automatically with sub-millimeter accuracy, increasing efficiency by about 40%[9].
- Public Administration: In Shenzhen, an intelligent assistant named "Xiaofu" was created, which provides citizens with information about public services based on a corpus of over 200,000 local documents[4].
- Pharmacology: The PanGu Drug Molecule model is used to accelerate the screening of drug candidates. According to Huawei, it helped discover a new class of antibiotics, described as the first breakthrough in this field in 40 years[4].
Open-Source Release
In June 2025, Huawei announced the open-sourcing of some models from the PanGu family. The following were released to the public[10]:
- PanGu Dense Model 7B (7 billion parameters).
- PanGu Pro MoE Model 72B (72 billion parameters).
The release is intended to stimulate innovation and build an open ecosystem around the Huawei Ascend hardware platform, and is regarded as a strategic response to global competition in the AI field[10].
Further Reading
- Zeng, W.; et al. (2021). PanGu‑α: Large‑Scale Autoregressive Pretrained Chinese Language Models. PDF.
- Huawei (2021). HDC.Cloud 2021: Huawei Releases Six Ground‑breaking Products to Supercharge the Cloud and Intelligent Transformation of Business. Online news.
- Huawei Cloud (2023). Reshaping Industries with AI: Huawei Cloud Launches PanGu Models 3.0 and Ascend AI Cloud Services. Online news.
- Bi, K.; et al. (2023). Accurate Medium‑Range Global Weather Forecasting with 3D Neural Networks. Nature, 620, 560–566. DOI:10.1038/s41586‑023‑06185‑3.
- Technology Magazine (2025). What Huawei PanGu 5.5 Models Mean for Industrial AI. Online article.
- MindSpore Team (2021). MindSpore: An All‑Scenario Deep Learning Computing Framework (White Paper v1.1). PDF.
- Zhang, S.; et al. (2024). Ascend 910 NPU SoC Architecture for Large‑Scale AI Training. arXiv:2407.11888. Online preprint.
- AIbase News (2025). Huawei Open Sources Dense PanGu 7B and Mixture‑of‑Experts PanGuPro 72B. Online news.
- CNPC & Huawei Cloud (2024). Kunlun: Large‑Scale AI Model for Oil and Gas Pipeline Defect Detection. Online case study.
- MindSpore Docs (2024). Automatic Parallel — Five‑Mode Hybrid Strategy in MindSpore. Online documentation.
- Press, O.; et al. (2021). Train Short, Test Long: Attention with Linear Biases Enables Input‑Length Extrapolation. arXiv:2108.12409.
- Law, M. (2025). How Huawei PanGu 5.5 AI Models Transform Industry Operations. AI Magazine. Online article.
Notes
- [1] "HDC.Cloud 2021: Huawei Releases Six Groundbreaking Products to Supercharge the Cloud and Intelligent Transformation of Business". Huawei.
- [2] Wodecki, Ben (27 Apr 2021). "Huawei has created the world's largest Chinese language model". AI Business.
- [3] Zeng, Wei, et al. (Apr 2021). "PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models". Technical Report.
- [4] "Reshaping Industries with AI: Huawei Cloud Launches Pangu Models 3.0 and Ascend AI Cloud Services". HUAWEI CLOUD. 7 Jul 2023.
- [5] Law, Marcus (23 Jun 2025). "What Huawei Pangu 5.5 Models Mean for Industrial AI". Technology Magazine.
- [6] "How Huawei Pangu 5.5 AI Models Transform Industry Operations". AI Magazine.
- [7] "Prestigious science journal Nature publishes paper about Pangu Weather AI Model authored by HUAWEI CLOUD researchers". Huawei News. 6 Jul 2023.
- [8] Bi, Kaifeng, et al. (2023). "Accurate medium-range global weather forecasting with 3D neural networks". Nature.
- [9] "CNPC and Huawei Cloud Jointly Launch the 'Kunlun' Model for the Oil and Gas Industry".
- [10] "Huawei Open Sources Dense Pangu 7B and Mixture of Experts Model with 72B Parameters". AIbase News. 30 Jun 2025.