YandexGPT (language model)

From Systems Analysis Wiki

YandexGPT (Yet another GPT) is a family of large language models developed by Yandex and first introduced in May 2023.[1] YandexGPT models are used in the Alisa voice assistant, Yandex Search, and other services, and are also available via the public API of the Yandex Cloud platform.[2]

YaLM-100B (2022) was a preceding open-source research model with 100 billion parameters. It served as a proof of concept; YandexGPT itself was developed separately for commercial use.[3]

Release History

Major Versions
Date | Release | Key Features
Jun 2022 | YaLM-100B | 100B parameters, 1.7 TB of training data; Apache 2.0.[3]
May 17, 2023 | YandexGPT 1.0 | Integration into Alisa.[1]
Sep 7, 2023 | YandexGPT 2 | +67% quality improvement on internal benchmarks.[4]
Mar 28, 2024 | YandexGPT 3 Pro / Lite | New enterprise API lineup.[5]
Oct 24, 2024 | YandexGPT 4 Pro / Lite | 32,000-token context; hidden reasoning (chain of thought).[6]
Feb 25, 2025 | YandexGPT 5 Pro | Parity with GPT-4o in 64% of tasks.[7]
Mar 31, 2025 | YandexGPT 5 Lite Instruct | 8-billion-parameter model released as open source; Llama format.[8]

Architecture and Training

  • Base architecture: Transformer, optimized for the Russian language.
  • YandexGPT 5 Lite: Llama-compatible; pre-training ≈ 15 trillion tokens, subsequent fine-tuning ≈ 320 billion.[8]

Context and Limits

  • Architectural context limit: 32,000 tokens (versions 4/5).[6]
  • The public API limits a single request (prompt + completion) to 7,400 tokens.[9]
  • The maximum response size is 2,000 tokens, according to the "Quotas and limits" documentation.[10]
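The two limits interact: the completion budget is bounded both by the per-request total and by the response cap. The sketch below is a hypothetical helper (not part of any official SDK) that computes the largest allowed completion size; token counts are taken as given rather than computed by a tokenizer.

```python
MAX_REQUEST_TOKENS = 7400     # prompt + completion, per the API reference [9]
MAX_COMPLETION_TOKENS = 2000  # response cap, per "Quotas and limits" [10]

def clamp_max_tokens(prompt_tokens: int, requested_completion: int) -> int:
    """Return the largest completion budget that stays within both limits."""
    if prompt_tokens >= MAX_REQUEST_TOKENS:
        raise ValueError("prompt alone exceeds the 7,400-token request limit")
    budget = MAX_REQUEST_TOKENS - prompt_tokens
    return min(requested_completion, budget, MAX_COMPLETION_TOKENS)

# A 6,000-token prompt leaves at most 1,400 tokens for the reply,
# while a short prompt is still capped at 2,000 response tokens.
print(clamp_max_tokens(6000, 2000))  # -> 1400
print(clamp_max_tokens(100, 5000))   # -> 2000
```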

Current Models (June 2025)

Model | Parameters | Context | License | Notes
YandexGPT 5 Pro | N/A | 32,000 | Proprietary | Available via API and Alisa Pro.[7]
YandexGPT 5 Lite | 8 billion | 32,000 | Yandex GPT-Lite License | Open source; Llama-compatible.[8]
YaLM-100B | 100 billion | 2,048 | Apache 2.0 | Original research project.[3]

Benchmarks

  • Internal tests: 5 Pro achieved parity with GPT-4o in 64% of tasks and a 67% performance improvement over 4 Pro.[7]
  • ru-LLM Arena: YandexGPT holds the leading position in ELO rating among Russian-language models.[11]

Fine-tuning

The LoRA method is officially supported for 5 Lite; a usage example is published in the model card.[8]
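For readers unfamiliar with the technique, LoRA (Hu et al., 2021) freezes the pretrained weight matrix and learns a low-rank additive update, scaled by alpha / r. The sketch below illustrates only the general idea with toy shapes; it is not the official recipe from the model card, and the dimensions bear no relation to the 8B model.

```python
import numpy as np

# Toy dimensions: d_out x d_in frozen weight, rank-r adapter.
d_out, d_in, r, alpha = 16, 32, 4, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight
A = rng.standard_normal((r, d_in))      # trainable, rank r
B = np.zeros((d_out, r))                # trainable, zero-init

def lora_forward(x: np.ndarray) -> np.ndarray:
    """Forward pass with the low-rank adapter: W x + (alpha/r) * B A x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialised to zero, the adapter starts as an exact no-op:
assert np.allclose(lora_forward(x), W @ x)
# Only the small A and B matrices are trained, not the full W:
print(A.size + B.size, "trainable vs", W.size, "frozen parameters")
```

The zero initialisation of B is the standard trick that makes fine-tuning start from the unmodified base model.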

API Modes

  • Synchronous — for fast responses (Lite).
  • Asynchronous — for resource-intensive tasks (Pro).[2]
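A synchronous call can be sketched as a plain REST request. The endpoint and JSON field names below follow the public Yandex Cloud Quickstart at the time of writing and may change; `<folder-id>` and the IAM token are placeholders the caller must supply. The code only assembles the request body and does not send it.

```python
import json

# Synchronous completion endpoint per the Quickstart; the asynchronous
# mode uses a separate endpoint and returns an operation to poll.
SYNC_URL = "https://llm.api.cloud.yandex.net/foundationModels/v1/completion"

def build_request(folder_id: str, prompt: str, max_tokens: int = 500) -> dict:
    """Assemble the JSON body for a synchronous YandexGPT Lite call."""
    return {
        "modelUri": f"gpt://{folder_id}/yandexgpt-lite",
        "completionOptions": {
            "stream": False,
            "temperature": 0.3,
            "maxTokens": str(max_tokens),  # the docs show this as a string
        },
        "messages": [{"role": "user", "text": prompt}],
    }

body = build_request("<folder-id>", "Briefly explain what YandexGPT is.")
print(json.dumps(body, ensure_ascii=False, indent=2))
# Sending it would look like:
#   requests.post(SYNC_URL, json=body,
#                 headers={"Authorization": "Bearer <IAM-token>"})
```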

Multimodality

The YandexGPT family remains text-based; multimodal services ("Neuro", "YandexArt", "Yandex Vision") are developed separately.[6]


Notes

  1. "Yandex adds ChatGPT analog to Alisa". RBC.
  2. "Getting started with YandexGPT (Quickstart)". Yandex Cloud Docs.
  3. "yandex/YaLM-100B: Pretrained language model with 100B". GitHub.
  4. "How Yandex decided to monetize its ChatGPT analog". RBC.
  5. "Yandex introduced the third generation of YandexGPT neural networks". RBC.
  6. "A more powerful family of YandexGPT 4 models". Habr.
  7. "Yandex integrates YandexGPT 5 Pro into the Alisa Pro chat". AdIndex.
  8. "yandex/YandexGPT-5-Lite-8B-pretrain". Hugging Face.
  9. "ChatYandexGPT API Reference (max_tokens = 7400)". LangChain Docs.
  10. "Yandex Cloud service quotas and limits → Foundation Models". Yandex Cloud Docs.
  11. "llmarena/llmarena — a Russian crowdsourcing platform for LLM evaluation". GitHub.