Claude (Anthropic)

From Systems Analysis Wiki

Claude is a family of multimodal large language models (LLMs) developed by the research company Anthropic.

Built on a transformer architecture, Claude models are positioned as AI assistants designed to be helpful, honest, and harmless. A key feature of their development is the Constitutional AI methodology, which aims to produce steerable, ethically aligned systems.

History and Philosophy

Founding and Mission of Anthropic

Anthropic was founded in 2021 by former senior employees of OpenAI, including siblings Dario[1] and Daniela Amodei[2]. Their departure was prompted by disagreements with OpenAI's leadership regarding its direction, particularly concerns that the partnership with Microsoft and increasing commercialization could compromise its commitment to AI safety.

Anthropic's mission is "to develop and maintain advanced AI for the long-term benefit of humanity." The company is registered as a Public Benefit Corporation (PBC) in the United States, which legally obligates it to balance financial interests with public benefit. This approach is reinforced by an unusual governance structure featuring the Long-Term Benefit Trust (LTBT), an independent body with the authority to elect a portion of the board of directors, intended to ensure adherence to the safety mission.

Philosophy: HHH and Constitutional AI

The behavior of Claude models is based on the HHH formula: Helpful, Honest, and Harmless. To implement these principles, Anthropic developed its own training methodology—Constitutional AI (CAI).

Unlike traditional RLHF (Reinforcement Learning from Human Feedback), where human annotators directly rate the model's responses, CAI relies on a "constitution": a set of explicit ethical principles against which the model learns to critique and revise its own answers. This makes the training process more scalable, transparent, and controllable.

Architecture and Key Technologies

Transformer-based Foundation

Like other modern LLMs, Claude uses a decoder-only transformer architecture that autoregressively generates text token by token. However, Anthropic has introduced significant improvements aimed at enhancing performance, safety, and steerability.

Constitutional AI (CAI)

The CAI training process consists of two stages:

  1. Supervised Learning Phase: The model generates responses to prompts and then, guided by the "constitution," critiques and revises its own responses. The model is subsequently fine-tuned on the revised answers.
  2. Reinforcement Learning from AI Feedback (RLAIF) Phase: The model generates pairs of responses, and a feedback model, guided by the "constitution," selects the better one. This data is used to train a preference model (reward model), which then serves as the reward signal for fine-tuning the main model with reinforcement learning.
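The RLAIF labeling step above can be sketched as follows. This is a toy illustration only: the principle list, the scoring rule, and all names are invented for demonstration, whereas the real pipeline uses an LLM guided by the constitution to judge each response pair.

```python
# Toy sketch of RLAIF preference labeling (not Anthropic's actual pipeline).
# A crude heuristic "judge" stands in for the constitution-guided LLM critic.

CONSTITUTION = [
    "Choose the response that is more helpful to the user.",
    "Choose the response that avoids harmful or toxic content.",
]

HARMFUL_MARKERS = {"insult", "threat"}  # stand-in for a learned harm check


def judge(response: str) -> int:
    """Score a response: word count as a helpfulness proxy, minus a harm penalty."""
    score = len(response.split())
    if any(marker in response.lower() for marker in HARMFUL_MARKERS):
        score -= 100
    return score


def label_pair(prompt: str, resp_a: str, resp_b: str) -> dict:
    """Emit one preference record, as used to train a reward model."""
    if judge(resp_a) >= judge(resp_b):
        chosen, rejected = resp_a, resp_b
    else:
        chosen, rejected = resp_b, resp_a
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}


record = label_pair(
    "How do I respond to a rude email?",
    "Reply with an insult so they back off.",
    "Stay calm, address the facts, and keep a professional tone.",
)
```

Records like `record` are what the preference (reward) model is trained on; the reinforcement-learning stage then optimizes the main model against that reward.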

Long Context and Multimodality

One of Claude's main advantages is its large context window: 100,000 tokens in Claude 2, increased to 200,000 tokens in Claude 2.1 and the Claude 3 family, and retained in later generations, with windows of up to 1 million tokens offered for some Claude 4 models. This allows the models to analyze entire books, codebases, or multi-hour transcripts within a single prompt.
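A back-of-the-envelope check shows why a 200,000-token window covers an entire book. The 4-characters-per-token ratio used here is a common rough heuristic for English prose, not an exact tokenizer; real counts come from the model's own tokenizer.

```python
# Rough estimate of whether a document fits in a given context window.
# CHARS_PER_TOKEN = 4 is a heuristic for English text, not an exact value.

CHARS_PER_TOKEN = 4


def estimated_tokens(text_chars: int) -> int:
    """Heuristic token count from character count."""
    return text_chars // CHARS_PER_TOKEN


novel_chars = 500_000                     # roughly a 500-page novel
tokens = estimated_tokens(novel_chars)    # about 125,000 tokens
fits_in_200k = tokens <= 200_000
```

By this estimate a full novel uses a bit more than half of a 200,000-token window, leaving room for the prompt and the model's answer.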

Starting with the Claude 3 family, the models became multimodal, gaining the ability to process images alongside text.

Hybrid Reasoning and Agentic Capabilities

Starting with Claude 3.7 Sonnet and continuing in the Claude 4 family, a hybrid reasoning architecture was introduced. It allows the models to switch between two modes:

  • Fast Answers: The standard mode for simple tasks.
  • Extended Thinking: For complex tasks, the model pauses to "think," performing internal reasoning steps, calling tools (such as web search or code execution), and formulating a better-grounded response. This makes the process more transparent and reliable.
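In API terms, the two modes correspond to the presence or absence of a thinking budget in the request. The sketch below builds a Messages API request body following Anthropic's documented `thinking` parameter for hybrid-reasoning models; the model id and token budgets are illustrative values, and no request is actually sent.

```python
# Sketch of a Messages API request body with extended thinking toggled on or
# off. The `thinking` field shape follows Anthropic's public documentation;
# the model id and budgets here are illustrative assumptions.


def build_request(prompt: str, think: bool) -> dict:
    body = {
        "model": "claude-3-7-sonnet-latest",  # illustrative model id
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": prompt}],
    }
    if think:
        # Reserve part of the output budget for internal reasoning tokens.
        body["thinking"] = {"type": "enabled", "budget_tokens": 2048}
    return body


fast = build_request("What is 2 + 2?", think=False)
deep = build_request("Prove that the square root of 2 is irrational.", think=True)
```

The same model serves both modes; the caller decides per request whether the extra reasoning latency is worth it.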

Evolution of Claude Models

Claude 1 and 2 (2023)

  • Claude 1 (March 2023): The first public version, accompanied by the lighter Claude Instant model for fast tasks. A 100,000-token context window was added in May 2023.
  • Claude 2 (July 2023): An improved version that became publicly available via a web interface. It showed significant improvements in coding (71% on Codex HumanEval) and mathematics. In November 2023, Claude 2.1 was released with a 200,000 token context window.

Claude 3 (March 2024)

A family of models that surpassed GPT-4 on several benchmarks for the first time.

  • Versions: Haiku (fastest), Sonnet (balanced), and Opus (most powerful).
  • Key Innovations: Introduction of multimodality (image analysis), significant improvements in reasoning and coding, and a reduction in unwarranted refusals. Opus achieved 86.8% on MMLU.

Claude 3.5 (June 2024)

An intermediate generation focused on increasing intelligence and speed.

  • Claude 3.5 Sonnet: Surpassed Claude 3 Opus in performance while being twice as fast. It introduced the Artifacts feature[3]—an interactive panel for working with generated code or documents.

Claude 3.7 and Claude 4 (2025)

A generation focused on agentic capabilities and complex reasoning.

  • Claude 3.7 Sonnet (February 2025): Introduced hybrid reasoning, allowing the model to combine fast answers with deep, step-by-step reasoning.
  • Claude 4 (May 2025): A flagship family (Opus 4 and Sonnet 4) with a focus on autonomous AI agents. The models are capable of performing multi-step tasks, operating files and graphical interfaces (via the Computer Use capability), calling tools, and maintaining long-running work sessions (up to several hours) without performance degradation. Opus 4 achieved 72.5% on the SWE-bench Verified coding benchmark.
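The tool calling mentioned above works by declaring tools in the request; the model then emits structured calls that the client executes. The sketch below follows the JSON Schema shape of Anthropic's public tool-use documentation, but the tool itself (`read_file`) and the model id are made-up examples.

```python
# Sketch of declaring one tool for an agentic Messages API request.
# The `input_schema` format follows Anthropic's tool-use docs; the
# `read_file` tool and the model id are illustrative assumptions.

read_file_tool = {
    "name": "read_file",
    "description": "Read a text file from the working directory and return its contents.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Relative path to the file."},
        },
        "required": ["path"],
    },
}

request_body = {
    "model": "claude-opus-4-0",  # illustrative model id
    "max_tokens": 1024,
    "tools": [read_file_tool],
    "messages": [{"role": "user", "content": "What does README.md say?"}],
}
```

In an agent loop, the client runs each tool call the model requests, returns the result as a new message, and repeats until the task is done.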

Summary Table of Claude Generations

Evolution of Key Characteristics of Claude Models
Generation    | Release Year | Key Versions        | Max. Context Window | Key Innovations
Claude 1      | 2023         | Claude, Instant     | 100,000 tokens      | First public release, large context.
Claude 2      | 2023         | Claude 2, 2.1       | 200,000 tokens      | Improved coding and reasoning, public availability.
Claude 3      | 2024         | Opus, Sonnet, Haiku | 200,000+ tokens     | Multimodality (images), surpassed GPT-4.
Claude 3.5    | 2024         | Sonnet, Haiku       | 200,000+ tokens     | Increased speed and intelligence, "Artifacts" feature.
Claude 4 / 3.7 | 2025        | Opus, Sonnet        | 200,000+ tokens     | Hybrid reasoning, agentic capabilities, tool use.

Application and Availability

Claude models are available through several channels:

  • Web interface claude.ai: Provides free access (to the Sonnet model) and paid subscriptions (Pro, Max) with access to more powerful models (Opus) and higher limits.
  • Developer API: Anthropic offers a commercial API that allows developers to integrate Claude into third-party applications. Prices vary depending on the model (Haiku is the cheapest, Opus is the most expensive).
  • Cloud Platforms: Claude is available through Amazon Bedrock and Google Cloud Vertex AI, which simplifies its use in enterprise environments.
  • Integrations: Claude is integrated into popular services such as Slack, Notion, and Quora (in the Poe chatbot).
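For the Developer API channel, a request reduces to one HTTPS call. The sketch below only constructs the headers and body without sending anything; the header names follow Anthropic's public API documentation, while the model id is an illustrative assumption, and in practice the official `anthropic` SDK wraps all of this.

```python
import json
import os

# Sketch of preparing a Messages API call over plain HTTP. Header names
# follow Anthropic's API docs; the model id is an illustrative assumption.
# Nothing is sent here; a real call would POST this to API_URL.

API_URL = "https://api.anthropic.com/v1/messages"


def build_call(prompt: str) -> tuple:
    """Return (headers, body) for a minimal Messages API request."""
    headers = {
        "x-api-key": os.environ.get("ANTHROPIC_API_KEY", "sk-placeholder"),
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    body = json.dumps({
        "model": "claude-3-5-haiku-latest",  # cheapest tier, illustrative id
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    })
    return headers, body


headers, body = build_call("Summarize Constitutional AI in one sentence.")
```

Pricing differs per model, so the same request shape can be pointed at Haiku for cheap bulk work or at Opus for the hardest tasks by changing only the `model` field.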

Literature

  • Ouyang, L. et al. (2022). Training Language Models to Follow Instructions with Human Feedback. arXiv:2203.02155.
  • Bai, Y. et al. (2022). Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. arXiv:2204.05862.
  • Bai, Y. et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073.
  • Bulatov, A. et al. (2023). Scaling Transformer to 1M Tokens and Beyond with RMT. arXiv:2304.11062.
  • Jimenez, C. E. et al. (2023). SWE-bench: Can Language Models Resolve Real-World GitHub Issues?. arXiv:2310.06770.
  • Yuan, W. et al. (2024). Self-Rewarding Language Models. arXiv:2401.10020.
  • Yang, A. et al. (2024). Context Parallelism for Scalable Million-Token Inference. arXiv:2411.01783.
  • Miranda, L. J. V. et al. (2024). Hybrid Preferences: Learning to Route Instances for Human vs AI Feedback. arXiv:2410.19133.
  • Chittepu, Y. et al. (2025). Reinforcement Learning from Human Feedback with High-Confidence Safety Constraints. arXiv:2506.08266.
  • Yuan, W. et al. (2025). Process-based Self-Rewarding Language Models. arXiv:2503.03746.
  • Yang, B. et al. (2025). Long Context Windows in Generative AI: An AI Atlas Report. [4] (tech-report, open review).

Notes

  1. “Dario Amodei”. In Wikipedia [1]
  2. “Daniela Amodei”. In Wikipedia [2]
  3. “What Are Artifacts and How Do I Use Them? | Anthropic Help Center”.[3]