PanGu (Huawei)

From Systems Analysis Wiki

Huawei PanGu (Chinese: 盘古) is a family of ultra-large pre-trained artificial intelligence models (foundation models) developed by Huawei Cloud. The name refers to Pangu, who in Chinese mythology is the first living being and the creator of the world[1]. The family covers several domains: natural language processing (NLP), computer vision (CV), multimodal analysis, predictive modeling, and scientific computing.

History and Development

PanGu-α (2021)

The first model in the family, PanGu-α (PanGu-Alpha), was introduced in April 2021. With 200 billion parameters, it was the largest Chinese language model at the time and exceeded the parameter count of OpenAI's GPT-3 (175 billion)[2].

The model was developed by the Huawei Cloud team in collaboration with Noah's Ark Lab and trained on a cluster of 2048 Huawei Ascend 910 AI processors using the MindSpore framework[3]. The training corpus consisted of 1.1 TB of high-quality Chinese text. PanGu-α performed strongly on the CLUE (Chinese Language Understanding Evaluation) benchmark, taking first place in the overall ranking[1].

PanGu 3.0 (2023): A Platform Approach

In July 2023, Huawei unveiled the PanGu 3.0 platform, marking a shift from a single model to a multi-layered "5+N+X" architecture focused on industrial applications[4]. The architecture has three layers:

  • L0 (Base Layer): Five fundamental models (NLP, CV, multimodal, predictive, and scientific computing).
  • L1 (Industry Layer): N industry-specific models, fine-tuned from the base models for specific sectors (government, finance, manufacturing, etc.).
  • L2 (Scenario Layer): X models for specific application scenarios (virtual assistant, typhoon track prediction, etc.).

This hierarchical approach lets customers either use ready-made solutions or fine-tune industry models on their own data, which simplifies adaptation and reduces its cost.
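The layering can be pictured as a small tree in which every model records the model it was fine-tuned from. The following is a minimal sketch, not Huawei's implementation: the `Model` class, the `lineage` helper, and the pairing of the NLP base model with the government and virtual-assistant examples are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    """One node in the hypothetical L0 -> L1 -> L2 hierarchy."""
    name: str
    layer: str                         # "L0", "L1", or "L2"
    parent: Optional["Model"] = None   # the model this one was fine-tuned from


def lineage(model: Model) -> list[str]:
    """Return the chain of models from the L0 base down to `model`."""
    chain = []
    node: Optional[Model] = model
    while node is not None:
        chain.append(f"{node.layer}:{node.name}")
        node = node.parent
    return list(reversed(chain))


# Hypothetical instances; only the layer roles come from the article.
nlp = Model("nlp", "L0")
government = Model("government", "L1", parent=nlp)
assistant = Model("virtual-assistant", "L2", parent=government)

print(lineage(assistant))
# ['L0:nlp', 'L1:government', 'L2:virtual-assistant']
```

Storing only the parent link keeps each scenario model traceable to its base model, which mirrors the idea that L1 and L2 models are derived from the five L0 models rather than trained from scratch.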

PanGu 5.5 (2025): Mixture-of-Experts Architecture

In June 2025, Huawei announced PanGu 5.5, an update aimed at solving advanced industrial problems. Its key feature is a Mixture-of-Experts (MoE) architecture with 256 expert sub-networks, which raised the total number of parameters to 718 billion[5]. An MoE model activates only a subset of its experts for a given input instead of running the whole network, which, according to Huawei, provides an eight-fold increase in inference efficiency compared to previous generations[6].
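The idea of activating only part of the model can be illustrated with top-k routing, a common MoE gating pattern. This is a minimal sketch under stated assumptions and not a description of PanGu 5.5 itself: the source does not say how many experts are active per input, so `TOP_K = 8` is an arbitrary choice, the experts are toy scalar functions, and the gate scores are random rather than produced by a learned router.

```python
import math
import random

NUM_EXPERTS = 256  # expert count given in the article
TOP_K = 8          # assumption: the number of active experts is not stated in the source


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(gate_logits, k=TOP_K):
    """Pick the k highest-scoring experts and renormalize their gate weights."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    weights = softmax([gate_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate_logits, k=TOP_K):
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    return sum(w * experts[i](x) for i, w in route(gate_logits, k))


# Toy experts: each scales its scalar input by a fixed factor in [0.5, 1.5].
rng = random.Random(0)
experts = [(lambda x, s=rng.uniform(0.5, 1.5): s * x) for _ in range(NUM_EXPERTS)]
# In a real model these scores would come from a learned router applied to the token.
gate_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route(gate_logits)
y = moe_forward(2.0, experts, gate_logits)
print(f"active experts: {len(selected)} of {NUM_EXPERTS}")
```

With these settings only 8 of the 256 experts are evaluated per input, which is why the compute cost of an MoE model tracks the active parameters rather than the total parameter count.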

Key Architectural and Technical Solutions

The PanGu language models are built on a GPT-like autoregressive transformer architecture, with several additions for training ultra-large models. To steer generation, a query layer is stacked on top of the transformer layers; it is intended to explicitly induce the expected output during the pre-training stage[3].
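The article does not describe the query layer further. The sketch below assumes the design described in the PanGu-α technical report[3], in which the layer resembles an ordinary transformer layer except that an embedding of the next position serves as the attention query. It is a simplification: the learned projection matrices are treated as identity, only a single attention head is shown, and the function and variable names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def query_layer(hidden_states, next_pos_embedding):
    """Attend over the hidden states, using the next position's embedding as the query.

    Assumption: projection matrices are omitted (identity), single head only.
    """
    d = len(next_pos_embedding)
    scores = [
        sum(q * h for q, h in zip(next_pos_embedding, hs)) / math.sqrt(d)
        for hs in hidden_states
    ]
    weights = softmax(scores)
    # Weighted average of the hidden states, one output vector of size d.
    return [sum(w * hs[j] for w, hs in zip(weights, hidden_states)) for j in range(d)]


# Three toy hidden states of dimension 2 and a toy next-position embedding.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
p_next = [0.5, -0.5]
out = query_layer(hidden, p_next)
print(out)
```

The point of the sketch is only the choice of query: in a standard self-attention layer the query would be derived from the hidden states themselves, while here it comes from the position to be predicted.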

The training and deployment of PanGu models are tightly integrated with Huawei's own hardware and software platform:

  • Ascend 910 Processors: Specialized AI accelerators that form the basis of the computing clusters.
  • MindSpore Framework: An open-source deep learning platform whose auto-parallel feature combines five dimensions of parallelism (data parallelism, op-level model parallelism, pipeline model parallelism, optimizer model parallelism, and rematerialization) to distribute computation efficiently across thousands of processors[3].
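To make the idea of combining parallelism types concrete, the sketch below maps a flat device rank onto a three-dimensional grid of pipeline stage, data-parallel replica, and model shard. It is plain Python and does not use the MindSpore API. The degrees 16, 8, and 16 are illustrative values chosen only because they multiply to the 2048 processors mentioned earlier; they are not the configuration reported for PanGu-α. Only the data, model, and pipeline dimensions are modeled.

```python
def device_coordinates(rank, data, model, pipeline):
    """Map a flat device rank to (pipeline_stage, data_replica, model_shard)."""
    if not 0 <= rank < data * model * pipeline:
        raise ValueError("rank out of range")
    stage, rest = divmod(rank, data * model)   # which pipeline stage
    replica, shard = divmod(rest, model)       # which replica, which shard within it
    return stage, replica, shard


# Illustrative degrees only (assumption): 16 x 8 x 16 = 2048 devices.
DATA, MODEL, PIPELINE = 16, 8, 16
assert DATA * MODEL * PIPELINE == 2048

coords = [device_coordinates(r, DATA, MODEL, PIPELINE) for r in range(2048)]
print(device_coordinates(0, DATA, MODEL, PIPELINE))     # (0, 0, 0)
print(device_coordinates(2047, DATA, MODEL, PIPELINE))  # (15, 15, 7)
```

Every device receives a unique coordinate, so each one holds one shard of one pipeline stage and processes one slice of the batch. An auto-parallel system has to choose such degrees and the placement automatically rather than by hand.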

Specialized Models and Their Applications

PanGu-Weather

One of the best-known models in the family is PanGu-Weather, a global weather forecasting model based on deep learning. In July 2023, a paper describing it was published in the journal Nature[7].

The model has been shown to exceed the accuracy of the traditional numerical weather prediction methods used by the European Centre for Medium-Range Weather Forecasts (ECMWF) while running far faster. Generating a 24-hour global forecast takes the model seconds rather than several hours of supercomputer time, a speed-up of approximately 10,000 times[7]. In August 2023, PanGu-Weather forecasts were integrated into the ECMWF service for operational meteorological use[8].

Industrial Applications

PanGu models have been implemented in over 500 scenarios across 30 industries. Some examples include:

  • Agriculture: The Chinese Academy of Agricultural Sciences (CAAS) used PanGu to develop a breeding model, which helped cultivate an experimental rice variety with improved lodging resistance[5].
  • Oil and Gas Industry: CNPC uses a PanGu model for the automatic detection of pipeline defects with sub-millimeter accuracy, increasing efficiency by ~40%[9].
  • Public Administration: In Shenzhen, an intelligent assistant named "Xiaofu" was created, which provides citizens with information about public services based on a corpus of over 200,000 local documents[4].
  • Pharmacology: The PanGu Drug Molecule model is used to accelerate the screening of drug candidates. According to Huawei, it helped discover a new class of antibiotics, which the company describes as the first breakthrough in this field in 40 years[4].

Open-Source Release

In June 2025, Huawei announced the open-sourcing of some models from the PanGu family. The following were released to the public[10]:

  • PanGu Dense Model 7B (7 billion parameters).
  • PanGu Pro MoE Model 72B (72 billion parameters).

The move is intended to stimulate innovation and build an open ecosystem around the Huawei Ascend hardware platform, as a strategic response to global competition in AI[10].

Further Reading

  • Zeng, W.; et al. (2021). PanGu‑α: Large‑Scale Autoregressive Pretrained Chinese Language Models. PDF.
  • Huawei (2021). HDC.Cloud 2021: Huawei Releases Six Ground‑breaking Products to Supercharge the Cloud and Intelligent Transformation of Business. Online news.
  • Huawei Cloud (2023). Reshaping Industries with AI: Huawei Cloud Launches PanGu Models 3.0 and Ascend AI Cloud Services. Online news.
  • Bi, K.; et al. (2023). Accurate Medium-Range Global Weather Forecasting with 3D Neural Networks. Nature, 619, 533–538. DOI:10.1038/s41586-023-06185-3.
  • Technology Magazine (2025). What Huawei PanGu 5.5 Models Mean for Industrial AI. Online article.
  • MindSpore Team (2021). MindSpore: An All‑Scenario Deep Learning Computing Framework (White Paper v1.1). PDF.
  • Zhang, S.; et al. (2024). Ascend 910 NPU SoC Architecture for Large‑Scale AI Training. arXiv:2407.11888. Online preprint.
  • AIbase News (2025). Huawei Open Sources Dense PanGu 7B and Mixture‑of‑Experts PanGuPro 72B. Online news.
  • CNPC & Huawei Cloud (2024). Kunlun: Large‑Scale AI Model for Oil and Gas Pipeline Defect Detection. Online case study.
  • MindSpore Docs (2024). Automatic Parallel — Five‑Mode Hybrid Strategy in MindSpore. Online documentation.
  • Press, O.; et al. (2021). Train Short, Test Long: Attention with Linear Biases Enables Input‑Length Extrapolation. arXiv:2108.12409.
  • Law, M. (2025). How Huawei PanGu 5.5 AI Models Transform Industry Operations. AI Magazine. Online article.

Notes

  1. “HDC.Cloud 2021: Huawei Releases Six Groundbreaking Products to Supercharge the Cloud and Intelligent Transformation of Business”. Huawei.
  2. Wodecki, Ben (27 Apr 2021). “Huawei has created the world's largest Chinese language model”. AI Business.
  3. Zeng, Wei, et al. (Apr 2021). “PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models”. Technical Report.
  4. “Reshaping Industries with AI: Huawei Cloud Launches Pangu Models 3.0 and Ascend AI Cloud Services”. HUAWEI CLOUD. 7 Jul 2023.
  5. Law, Marcus (23 Jun 2025). “What Huawei Pangu 5.5 Models Mean for Industrial AI”. Technology Magazine.
  6. “How Huawei Pangu 5.5 AI Models Transform Industry Operations”. AI Magazine.
  7. “Prestigious science journal Nature publishes paper about Pangu Weather AI Model authored by HUAWEI CLOUD researchers”. Huawei News. 6 Jul 2023.
  8. Bi, Kaifeng, et al. (2023). “Accurate medium-range global weather forecasting with 3D neural networks”. Nature.
  9. “CNPC and Huawei Cloud Jointly Launch the 'Kunlun' Model for the Oil and Gas Industry”.
  10. “Huawei Open Sources Dense Pangu 7B and Mixture of Experts Model with 72B Parameters”. AIbase News. 30 Jun 2025.