GPT-5.6 Sol, Terra, and Luna release notes

OpenAI has launched a limited preview of the GPT-5.6 model series, introducing a tiered capability structure consisting of Sol (flagship), Terra (balanced), and Luna (fast and affordable). The release focuses on advancing agentic capabilities in coding, biology, and cybersecurity while implementing a more rigorous, layered safety stack to mitigate high-risk offensive use.

New Model Tiers and Pricing

OpenAI is transitioning to a naming convention where the version number represents the generation and the name represents the capability tier. This allows different tiers to advance on their own schedules.

Model          Positioning                        Input Price (per 1M tokens)   Output Price (per 1M tokens)
GPT-5.6 Sol    Flagship / Highest Intelligence    $5.00                         $30.00
GPT-5.6 Terra  Balanced / Everyday Work           $2.50                         $15.00
GPT-5.6 Luna   Fast / Lowest Cost                 $1.00                         $6.00
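
To make the price gap between tiers concrete, the following is a minimal sketch that estimates the uncached cost of a single request from the listed per-token prices. The tier keys ("sol", "terra", "luna"), the function name request_cost, and the example token counts are illustrative assumptions, not official model identifiers or API calls.

```python
# Prices in USD per 1M tokens, copied from the pricing table.
# The dictionary keys are shorthand chosen for this sketch, not API model names.
PRICES = {
    "sol": {"input": 5.00, "output": 30.00},
    "terra": {"input": 2.50, "output": 15.00},
    "luna": {"input": 1.00, "output": 6.00},
}


def request_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the uncached cost in USD of one request on the given tier."""
    price = PRICES[tier]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 20,000 input tokens and 2,000 output tokens.
for tier in PRICES:
    print(f"{tier}: ${request_cost(tier, 20_000, 2_000):.4f}")
```

For this assumed workload, the request costs about $0.16 on Sol, $0.08 on Terra, and $0.032 on Luna, which reflects the 5 : 2.5 : 1 ratio in the listed prices.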

Prompt Caching Updates

GPT-5.6 introduces more predictable prompt caching with support for explicit cache breakpoints and a minimum cache life of 30 minutes. Cache writes are billed at 1.25x the uncached input rate, while cache reads maintain a 90% discount.
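
The write premium and read discount imply a break-even point for caching a shared prompt prefix. The sketch below works through that arithmetic under several assumptions that the release notes do not spell out: the 90% discount is taken relative to the uncached input rate, the first call pays the cache write, and every later call arrives within the cache lifetime and is billed as a read. The function name prefix_cost and the example numbers are hypothetical.

```python
WRITE_MULTIPLIER = 1.25  # cache write, relative to the uncached input rate
READ_MULTIPLIER = 0.10   # cache read, assuming the 90% discount applies to that same rate


def prefix_cost(input_price_per_m: float, prefix_tokens: int, calls: int, cached: bool) -> float:
    """Cost in USD of sending the same prompt prefix `calls` times.

    Only the shared prefix is counted; output tokens and per-call input are ignored.
    """
    base = prefix_tokens * input_price_per_m / 1_000_000
    if not cached or calls == 0:
        return base * calls
    # One cache write on the first call, then a cache read on every later call.
    return base * (WRITE_MULTIPLIER + READ_MULTIPLIER * (calls - 1))


# Hypothetical workload: a 50,000-token prefix on Sol ($5.00 per 1M input tokens).
for calls in (1, 2, 10):
    uncached = prefix_cost(5.00, 50_000, calls, cached=False)
    cached = prefix_cost(5.00, 50_000, calls, cached=True)
    print(f"{calls} call(s): uncached ${uncached:.4f}, cached ${cached:.4f}")
```

Under these assumptions, caching costs more for a prefix that is used only once, because the single call pays the 1.25x write rate, and it becomes cheaper as soon as the prefix is reused a second time.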

Advanced Capabilities and Agentic Workflows

GPT-5.6 Sol introduces two new operational modes to handle complex, multi-step reasoning tasks:

  • max reasoning effort: Grants the model additional time to reason deeply before responding.
  • ultra mode: Leverages subagents to accelerate complex work, moving beyond the capabilities of a single agent.

Domain-Specific Performance

  • Coding: Sol sets a new state of the art on Terminal-Bench 2.1, specifically improving command-line workflows that require tool coordination and iteration.
  • Biology: On GeneBench v1, Sol outperforms GPT-5.5 in long-horizon genomics and quantitative-biology analyses while utilizing fewer tokens.
  • Cybersecurity: Sol improves the performance-efficiency frontier for vulnerability research. On ExploitBench, it is competitive with Mythos Preview while using approximately one-third of the output tokens. On ExploitGym, all three 5.6 models show strong improvements in cyber capabilities as reasoning effort increases.

Layered Safeguard Stack and Safety Framework

To balance the increased power of the models with the risk of misuse, OpenAI has implemented a layered safeguard stack. The goal is to enable legitimate defensive work (e.g., patch development, security education) while constraining prohibited offensive activity.

Safety Layers

  1. Model-Level Training: The models are trained to refuse prohibited cyber assistance, even when faced with jailbreak attempts or disguised intent.
  2. Real-Time Classifiers: Misuse classifiers monitor output during generation. High-risk detections may pause generation for review by a larger reasoning model.
  3. Account-Level Signals: Systems analyze patterns across multiple conversations to distinguish persistent malicious behavior from legitimate dual-use security research.
  4. Differentiated Access: Sensitive capabilities are not made broadly available by default during the preview phase.
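
As a rough illustration of the second layer only, the sketch below shows one way a classifier could monitor output during generation and pause it when a risk score crosses a threshold. It is a toy under stated assumptions, not a description of OpenAI's implementation: the names stream_with_classifier, StreamResult, and toy_score, the 0.8 threshold, and the string-matching "classifier" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class StreamResult:
    text: str = ""
    paused_for_review: bool = False


def stream_with_classifier(
    chunks: Iterable[str],
    risk_score: Callable[[str], float],
    threshold: float = 0.8,  # hypothetical cutoff chosen for this sketch
) -> StreamResult:
    """Emit chunks until the running output scores at or above the threshold."""
    result = StreamResult()
    for chunk in chunks:
        candidate = result.text + chunk
        if risk_score(candidate) >= threshold:
            # Stop before emitting this chunk; a real system would hand the
            # conversation to a reviewer (for example, a larger model) here.
            result.paused_for_review = True
            break
        result.text = candidate
    return result


def toy_score(text: str) -> float:
    """Toy stand-in for a classifier: flags any text containing a marker string."""
    return 1.0 if "FLAGGED" in text else 0.0


result = stream_with_classifier(["step one, ", "FLAGGED detail, ", "step three"], toy_score)
print(result)
```

Scoring the accumulated output rather than each chunk in isolation is a design choice in this sketch: it lets the check catch content that only becomes risky once several chunks are read together.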

Automated Red-Teaming

OpenAI used over 700,000 A100-equivalent GPU hours for automated red-teaming to identify "universal jailbreaks": attacks that work across many contexts rather than only against narrow, specific prompts. Third-party human expert red-teaming supplements this automated effort.

Deployment and Government Coordination

GPT-5.6 is currently in a limited preview for a small group of trusted partners. OpenAI stated that this phased approach was taken at the request of the U.S. government to coordinate capabilities ahead of a broader release.

OpenAI explicitly noted that it does not believe government-mandated access processes should become the long-term default, because they restrict access for developers and cyber defenders. The company is working with the Administration to develop a repeatable process for future releases under the cyber Executive Order framework.

Community Perspectives and Critiques

Discussion among technical users on Hacker News highlights several points of contention regarding the release:

  • Government Influence: Users expressed concern over the U.S. government acting as a bottleneck for AI innovation. One user noted, "This amount of courting the current administration is pretty scary imo."
  • Pricing Trends: Some developers observed a trend of increasing costs for "mini" or entry-level models over time, suggesting that users are being forced into more expensive tiers.
  • Competitive Landscape: There is skepticism regarding how Sol compares to competitors like Claude Fable 5. Some users pointed to the Agent Arena leaderboard, where Fable 5 currently holds a high rank in tool orchestration.
  • Version Naming: Critics questioned why a "next-generation" model is labeled as version 5.6 rather than a major version jump to GPT-6.
