Anthropic has introduced its new flagship model, Claude Opus 4.8. This time, the company has focused on the model's “honesty” and the reliability of its answers.
Anthropic says that Claude Opus 4.8 hides errors it makes when writing code about four times less often than the previous version. The model more readily admits when it is unsure of an answer and is more reluctant to present unverified information as fact.
According to test results published by the company, the model also scored higher on security and on behaving in line with the user’s interests.
In benchmark testing, Claude Opus 4.8 led in six of seven tests. On the SWE-Bench Pro programming test, the model scored 69.2%, ahead of GPT-5.5 and Gemini 3.1 Pro.
Along with the update, a new “effort” setting has been added to the Claude chatbot. Users can now choose how many resources the model spends on a task. The faster mode is about 2.5 times faster and costs less than previous versions.
In addition, the Claude Code platform has launched a “dynamic workflow” feature. It allows the AI agent to schedule long, multi-stage tasks and run hundreds of parallel sub-agents within a single session.
