Open-weight and closed-weight models

Open-weight and closed-weight models are two fundamentally different approaches to the development and distribution of large language models (LLMs), forming a key dichotomy in the modern artificial intelligence ecosystem. The choice between these approaches affects technical capabilities, economics, security, and the future development of AI^[1].

The distinction lies in the accessibility of the model's trained parameters (weights). Developers of open-weight models publish the weights, allowing the community to use, modify, and deploy the models locally. Developers of closed-weight models, in contrast, keep the weights secret and provide access to the models' capabilities only through proprietary APIs or restricted licenses^[2].

Definitions and Key Differences

Open-Weight Models

Open-weight models are systems in which the trained parameters (weights) of a neural network are publicly available for use, modification, and distribution. According to Andrej Karpathy, a founding member of OpenAI, releasing such a model is akin to "handing over the binary of an operating system": users receive a functional product, but typically without access to the training code or training data.

Key characteristics:

Local Deployment: The ability to run the model on one's own hardware, ensuring full data control and privacy.
Fine-tuning: The ability to adapt the model for specific tasks and domains.
Transparency and Auditing: Researchers can study the internal mechanisms of the model to identify biases and vulnerabilities.
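
As a rough illustration of what local deployment demands, the sketch below estimates the memory needed just to hold a model's weights at different numeric precisions. The parameter counts and the bytes-per-parameter table are illustrative assumptions; real deployments also need memory for activations and the attention cache, which this estimate ignores.

```python
# Rough weight-memory estimate for local deployment.
# Assumption: memory ~= number of parameters x bytes per parameter (weights only).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # Hypothetical model sizes chosen only to show the scaling.
    for name, params in [("7B model", 7e9), ("405B model", 405e9)]:
        for precision in ("fp16", "int4"):
            size = weight_memory_gb(params, precision)
            print(f"{name} @ {precision}: ~{size:.1f} GB")
```

The estimate shows why quantization matters for local use: halving the bytes per parameter halves the hardware needed to load the model at all.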

Closed-Weight Models

Closed-weight models (also known as proprietary models) are systems whose parameters are kept as a trade secret and are accessible only through an API or under restricted licenses. Developers such as OpenAI and Anthropic retain complete control over the architecture, training methods, and inference mechanisms. The GPT-4 technical report explicitly declines to disclose such details, citing "the competitive landscape and the safety implications of large-scale models"^[3].

Key characteristics:

Centralized Control: The developer manages updates, security, and usage policies.
Ease of Use: Access via an API frees users from the need to manage complex infrastructure.
Opacity: Without access to internal mechanisms, independent auditing is limited to black-box testing through the API, which makes it harder to understand the reasons for erroneous or biased responses.

Distinction from Open Source

It is important to distinguish between the terms open-weight and open-source. A true open-source model involves the publication of all artifacts necessary for reproduction: weights, architecture, training code, and datasets. Most modern "open" models, such as Llama from Meta, are open-weight but not fully open-source, as their training data and precise training methods remain private.
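
The distinction can be expressed as a simple rule over which artifacts a release publishes. The following is a minimal sketch, not an official taxonomy: the `Release` type, its field names, and the example releases are hypothetical and only encode the criteria stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """Which artifacts a model release makes public (hypothetical schema)."""
    weights: bool
    architecture: bool
    training_code: bool
    training_data: bool


def classify(release: Release) -> str:
    """Classify a release using the criteria described in the text."""
    if not release.weights:
        return "closed-weight"
    # Open source requires every artifact needed for reproduction.
    if release.architecture and release.training_code and release.training_data:
        return "open-source"
    return "open-weight"


if __name__ == "__main__":
    # Illustrative releases, not statements about any specific product.
    api_only = Release(False, False, False, False)
    weights_only = Release(True, True, False, False)
    fully_open = Release(True, True, True, True)
    for r in (api_only, weights_only, fully_open):
        print(r, "->", classify(r))
```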

Comparative Analysis: Performance, Cost, and Innovation

Performance and Customization

Historically, closed-weight models like GPT-4 have led on general benchmarks. However, the performance gap is narrowing rapidly. According to the Stanford AI Index 2025, the gap between the leading closed-weight and open-weight models shrank from 8% to 1.7% within about a year^[1]. Powerful open-weight models, such as Llama 3.1 405B from Meta and DeepSeek-V3, demonstrate comparable results and, on some tasks (especially programming), superior ones^[4].
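
To make the size of that change concrete, the short calculation below separates the absolute change in the gap (in percentage points) from its relative reduction. It uses only the two figures quoted above; the function name is an illustrative choice.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a quantity shrank, relative to its starting value."""
    return (before - after) / before


gap_before = 8.0  # percent, as quoted from the AI Index
gap_after = 1.7   # percent, as quoted from the AI Index

absolute_change = gap_before - gap_after
relative_change = relative_reduction(gap_before, gap_after)

print(f"Absolute change: {absolute_change:.1f} percentage points")
print(f"Relative reduction: {relative_change:.1%}")
```

In other words, the gap fell by 6.3 percentage points, which is a reduction of roughly 79% relative to where it started.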

The key advantage of open-weight models lies in deep customization. Fine-tuning on domain-specific data can allow them to outperform larger but more general-purpose closed-weight models in narrow domains, such as medicine or law.

Economic Aspects

Training Cost: Creating frontier models is extremely expensive. The training of GPT-4 is estimated to have cost over $100 million. The developers of the open-weight DeepSeek-V3 report similar performance at a cost of about $5.5 million for the final training run, a figure that excludes earlier research and experiments. Lower costs of this kind broaden access to the creation of powerful systems.
Usage Cost (Inference): Closed-weight models are billed on a pay-per-use basis via an API, which can lead to high expenses at large volumes. Open-weight models deployed locally require an initial investment in infrastructure but can have a significantly lower total cost of ownership (TCO) at scale.
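
The inference trade-off described above can be sketched as a break-even calculation. All prices below are hypothetical placeholders, not quotes from any provider, and the model assumes a single up-front infrastructure cost plus a constant marginal cost per million tokens.

```python
def api_cost(tokens_m: float, api_price_per_m: float) -> float:
    """Pay-per-use cost for a given volume (in millions of tokens)."""
    return tokens_m * api_price_per_m


def local_cost(tokens_m: float, fixed_cost: float, local_price_per_m: float) -> float:
    """Up-front infrastructure cost plus marginal cost of local inference."""
    return fixed_cost + tokens_m * local_price_per_m


def break_even_tokens_m(fixed_cost: float, api_price_per_m: float,
                        local_price_per_m: float) -> float:
    """Volume (millions of tokens) beyond which local deployment is cheaper."""
    saving_per_m = api_price_per_m - local_price_per_m
    if saving_per_m <= 0:
        return float("inf")  # local deployment never catches up
    return fixed_cost / saving_per_m


if __name__ == "__main__":
    # Hypothetical numbers: $200k hardware, $10 vs $2 per million tokens.
    volume = break_even_tokens_m(200_000, 10.0, 2.0)
    print(f"Break-even at ~{volume:,.0f} million tokens")
```

Below the break-even volume the API is cheaper; above it, the fixed investment is amortized and local deployment wins, which is the sense in which TCO improves "at scale".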

Impact on Scientific Research and Innovation

Open-weight models are transforming scientific research by supporting reproducibility and democratizing access. Researchers worldwide can analyze, critique, and improve open models, which creates a dynamic ecosystem and accelerates progress. Closed models, by contrast, contribute to a "reproducibility crisis," as claimed results cannot be independently verified.

Security and Ethical Dilemmas

The issue of security is a central dilemma in the debate between openness and control.

Closed-Weight Approach (Centralized Prevention): Developers like OpenAI and Anthropic take a preventive approach. They implement complex security filters, conduct intensive red teaming, and adhere to strict policies, such as Anthropic's Responsible Scaling Policy, committing not to deploy models that exceed certain risk thresholds^[5].
Open-Weight Approach (Decentralized Resilience): This philosophy, borrowed from the open-source world, holds that "given enough eyeballs, all bugs are shallow": the community can find and fix vulnerabilities more quickly. However, openness also creates risks, since malicious actors can just as easily study models to find vulnerabilities or remove safety mechanisms through fine-tuning.

Some research suggests that human intent, rather than model availability, is the primary risk factor: by one estimate, about 90% of documented cases of generative AI misuse involve the exploitation of permitted capabilities rather than harm originating in the systems themselves.

Regulatory Approaches: EU and US

EU AI Act: Adopts a preventive, risk-based approach. The act imposes strict obligations on models with "systemic risk" (those trained with more than 10²⁵ floating-point operations, or FLOPs, of cumulative compute) but provides limited exceptions for open-source models that do not pose such a risk. This creates an incentive for transparency but also adds regulatory complexity.
US Approach: Based on promoting innovation and managing risks through industry standards. President Biden's Executive Order 14110 (issued in October 2023 and rescinded in January 2025) directed the NTIA to study open-weight models; the resulting report recommends refraining from immediate restrictions on them and proposes instead a monitoring system to support evidence-based decision-making^[6].
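
The compute threshold in the EU AI Act entry above counts total operations performed during training, not operations per second. A common rule of thumb, used here as an assumption rather than as the legal method of calculation, estimates training compute as about 6 operations per parameter per training token. The parameter and token counts in the example are illustrative.

```python
EU_SYSTEMIC_RISK_THRESHOLD = 1e25  # cumulative training operations


def estimate_training_flops(num_params: float, num_tokens: float) -> float:
    """Rule-of-thumb estimate: ~6 operations per parameter per token."""
    return 6.0 * num_params * num_tokens


def exceeds_threshold(num_params: float, num_tokens: float) -> bool:
    """True if the estimate is above the 10^25 threshold."""
    return estimate_training_flops(num_params, num_tokens) > EU_SYSTEMIC_RISK_THRESHOLD


if __name__ == "__main__":
    # Hypothetical training runs: (label, parameters, training tokens).
    runs = [("small run", 7e9, 2e12), ("large run", 405e9, 15e12)]
    for label, params, tokens in runs:
        flops = estimate_training_flops(params, tokens)
        print(f"{label}: ~{flops:.2e} operations, "
              f"above threshold: {exceeds_threshold(params, tokens)}")
```

Under this approximation, a 7-billion-parameter model trained on 2 trillion tokens stays well below the threshold, while a 405-billion-parameter model trained on 15 trillion tokens exceeds it.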

Key Models and Players

Comparative table of leading open-weight and closed-weight models
Model Type	Model	Developer	Key Feature
Open-weight	Llama 3.1	Meta	High performance, setting the standard for open models; large community.
Open-weight	Mixtral 8x7B	Mistral AI	Mixture of Experts (MoE) architecture, providing high performance with low inference costs.
Closed-weight	GPT-4 / GPT-4o	OpenAI	Historical performance leader, strong multimodal capabilities.
Closed-weight	Claude Opus 4	Anthropic	Focus on safety and ethics (Constitutional AI), large context window.

Links

Stanford AI Index Report 2025 — Annual report on the state of AI.
NTIA report on open-weight models

Literature

OpenAI et al. (2023). GPT-4 Technical Report. arXiv:2303.08774.
Touvron, H. et al. (2023). Llama 2: Open Foundation and Fine-Tuned Chat Models. arXiv:2307.09288.
DeepSeek-AI (2024). DeepSeek-V3 Technical Report. arXiv:2412.19437.
Kapoor, S.; Bommasani, R. et al. (2024). On the Societal Impact of Open Foundation Models. arXiv:2403.07918.
U.S. NTIA (2024). Dual-Use Foundation Models with Widely Available Model Weights. NTIA Report.
Stanford HAI (2025). Artificial Intelligence Index Report 2025. Full PDF.
Anthropic (2023). Responsible Scaling Policy. Anthropic RSP.
Klyman, K. et al. (2024). A Design Framework for Open-Source Foundation Model Safety. arXiv:2406.10415.
Kembery, E.; Reed, T. (2024). AI Safety Frameworks Should Include Procedure for Model Access Decisions. arXiv:2411.10547.
European Commission (2024). General-Purpose AI Models in the AI Act – Q&A. EU AI Act FAQ.
Zhang, X. et al. (2025). Mitigating Cyber Risk in the Age of Open-Weight LLMs. arXiv:2505.17109.
Biderman, S. et al. (2024). Risks and Opportunities of Open-Source Generative AI. arXiv:2405.08597.

Notes

[1] “Artificial Intelligence Index Report 2025”. Stanford University HAI. Retrieved July 4, 2025.
[2] Karpathy, Andrej. “On Open-sourcing LLMs”. X (formerly Twitter).
[3] “GPT-4 Technical Report”. OpenAI.
[4] “DeepSeek-V2 and DeepSeek-Coder-V2 Technical Report”.
[5] “Anthropic's Responsible Scaling Policy”. Anthropic.
[6] “Dual-Use Foundation Models with Widely Available Model Weights”. U.S. Department of Commerce, NTIA. (2024).
