z.ai's open source GLM-5 achieves record low hallucination rate and leverages new RL 'slime' technique

Chinese AI startup Zhipu AI, aka z.ai, is back this week with an eye-popping new frontier large language model: GLM-5.

The latest in z.ai's ongoing and consistently impressive GLM series, it retains an open source MIT License (well suited to enterprise deployment) and, in one of several notable achievements, posts a record-low hallucination rate on the independent Artificial Analysis Intelligence Index v4.0.

With a score of -1 on the AA-Omniscience Index, a 35-point improvement over its predecessor, GLM-5 now leads the entire AI industry, including U.S. competitors like Google, OpenAI and Anthropic, in knowledge reliability by knowing when to abstain rather than fabricate information.

Beyond its reasoning prowess, GLM-5 is built for high-utility knowledge work. It features native "Agent Mode" capabilities that allow it to turn raw prompts or source materials directly into professional office documents, including ready-to-use .docx, .pdf, and .xlsx files.

Whether generating detailed financial reports, high school sponsorship proposals, or complex spreadsheets, GLM-5 delivers results in real-world formats that integrate directly into enterprise workflows.

It is also disruptively priced at roughly $0.80 per million input tokens and $2.56 per million output tokens, approximately 6x cheaper than proprietary competitors like Claude Opus 4.6, making state-of-the-art agentic engineering more affordable than ever before. Here's what else enterprise decision makers should know about the model and its training.

Technology: scaling for agentic efficiency

At the heart of GLM-5 is a massive leap in raw parameters. The model scales from the 355B parameters of GLM-4.5 to 744B parameters, with 40B active per token in its Mixture-of-Experts (MoE) architecture. This growth is supported by an increase in pre-training data to 28.5T tokens.
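For a sense of proportion, those figures can be turned into a few ratios. The short Python sketch below uses only the numbers quoted in this article; the variable names are illustrative, and the tokens-per-parameter figure is a rough back-of-envelope measure, not something z.ai reports.

```python
# Figures quoted in the article (B = billion, T = trillion).
GLM_45_TOTAL_PARAMS = 355e9
GLM_5_TOTAL_PARAMS = 744e9
GLM_5_ACTIVE_PARAMS = 40e9
PRETRAIN_TOKENS = 28.5e12

# Share of the network that does work for any single token in the MoE design.
active_fraction = GLM_5_ACTIVE_PARAMS / GLM_5_TOTAL_PARAMS

# Growth in total parameter count over GLM-4.5.
scale_up = GLM_5_TOTAL_PARAMS / GLM_45_TOTAL_PARAMS

# Pre-training tokens per total parameter (back-of-envelope only).
tokens_per_param = PRETRAIN_TOKENS / GLM_5_TOTAL_PARAMS

print(f"active per token: {active_fraction:.1%}")
print(f"scale-up vs GLM-4.5: {scale_up:.2f}x")
print(f"tokens per parameter: {tokens_per_param:.1f}")
```

In other words, only about 5.4% of the model's parameters are used for any given token, even though the total parameter count has roughly doubled.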

To address training inefficiencies at this magnitude, z.ai developed "slime," a novel asynchronous reinforcement learning (RL) infrastructure.

Traditional RL often suffers from "long-tail" bottlenecks, where an entire batch waits in lockstep for its slowest trajectory to finish. slime breaks this lockstep by allowing trajectories to be generated independently, enabling the fine-grained iterations necessary for complex agentic behavior.
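The cost of that lockstep is easy to see in a toy model. The sketch below is not slime's scheduler; it is a simplified illustration with made-up rollout durations and hypothetical function names, comparing a generator that waits for the slowest rollout in every batch with one where each worker starts the next rollout as soon as it is free.

```python
import heapq

def lockstep_time(durations, workers):
    """Each step starts `workers` rollouts together and waits for the slowest."""
    total = 0.0
    for start in range(0, len(durations), workers):
        total += max(durations[start:start + workers])
    return total

def independent_time(durations, workers):
    """Each worker picks up the next rollout as soon as it becomes idle."""
    free_at = [0.0] * workers          # time at which each worker is next free
    heapq.heapify(free_at)
    for d in durations:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + d)
    return max(free_at)

# Illustrative durations: mostly short rollouts plus two long-tail ones.
durations = [1, 1, 1, 9, 1, 1, 1, 9]
print(lockstep_time(durations, workers=4))     # 18.0
print(independent_time(durations, workers=4))  # 11.0
```

With these invented numbers, the two long rollouts each stall a full batch in the lockstep case, while independent scheduling lets the other workers keep going.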

By integrating system-level optimizations like Active Partial Rollouts (APRIL), slime addresses the generation bottlenecks that typically consume over 90% of RL training time, significantly accelerating the iteration cycle for complex agentic tasks.

The framework's design is centered on a tripartite modular system: a high-performance training module powered by Megatron-LM, a rollout module that uses SGLang and custom routers for high-throughput data generation, and a centralized Data Buffer that manages prompt initialization and rollout storage.
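The way those three pieces relate can be sketched with Python's standard asyncio library. This is only a structural illustration: it does not use Megatron-LM, SGLang, or any slime API, and every name in it (rollout_worker, trainer, run_pipeline) is hypothetical. Rollout workers play the generation role, a queue stands in for the Data Buffer, and the trainer consumes finished rollouts in batches as they arrive.

```python
import asyncio
import random

async def rollout_worker(worker_id, prompts, buffer):
    """Stand-in for the rollout module: pull a prompt, 'generate', store the result."""
    while True:
        try:
            prompt = prompts.get_nowait()
        except asyncio.QueueEmpty:
            return                                   # no prompts left
        await asyncio.sleep(random.uniform(0.001, 0.01))  # simulated generation time
        await buffer.put({"prompt": prompt, "worker": worker_id})

async def trainer(buffer, total, batch_size):
    """Stand-in for the training module: consume rollouts from the buffer in batches."""
    batches, batch = [], []
    for _ in range(total):
        batch.append(await buffer.get())
        if len(batch) == batch_size:
            batches.append(batch)
            batch = []
    if batch:                                        # keep any final partial batch
        batches.append(batch)
    return batches

async def run_pipeline(num_prompts=8, num_workers=3, batch_size=4):
    prompts = asyncio.Queue()                        # prompt initialization
    for i in range(num_prompts):
        prompts.put_nowait(f"prompt-{i}")
    buffer = asyncio.Queue()                         # rollout storage
    workers = [asyncio.create_task(rollout_worker(w, prompts, buffer))
               for w in range(num_workers)]
    batches = await trainer(buffer, num_prompts, batch_size)
    await asyncio.gather(*workers)
    return batches

batches = asyncio.run(run_pipeline())
print([len(b) for b in batches])  # [4, 4]
```

The point of the separation is that generation and training only meet at the buffer, so neither side has to wait on the other's internal schedule.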

By enabling adaptive verifiable environments and multi-turn compilation feedback loops, slime provides the robust, high-throughput foundation required to move AI from simple chat interactions toward rigorous, long-horizon systems engineering.

To keep deployment manageable, GLM-5 integrates DeepSeek Sparse Attention (DSA), preserving a 200K context capacity while drastically reducing costs.

End-to-end knowledge work

z.ai is framing GLM-5 as an "office" tool for the AGI era. While previous models focused on snippets, GLM-5 is built to deliver ready-to-use documents.

It can autonomously transform prompts into formatted .docx, .pdf, and .xlsx files, ranging from financial reports to sponsorship proposals.

In practice, this means the model can decompose high-level goals into actionable subtasks and perform "Agentic Engineering," where humans define quality gates while the AI handles execution.
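That division of labor can be sketched in a few lines of Python. This is a hypothetical illustration, not GLM-5's Agent Mode or any z.ai API: the Subtask class, the execute_plan function, and the placeholder outputs are all invented here. Each subtask carries human-defined gates, and execution stops for review at the first gate that fails.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Subtask:
    name: str
    run: Callable[[], str]                       # stand-in for the model doing the work
    gates: List[Callable[[str], bool]] = field(default_factory=list)

def execute_plan(subtasks):
    """Run subtasks in order; stop and escalate at the first failed gate."""
    accepted = []
    for task in subtasks:
        output = task.run()
        if not all(gate(output) for gate in task.gates):
            return accepted, f"needs human review: {task.name}"
        accepted.append((task.name, output))
    return accepted, "done"

# Placeholder plan: the second step deliberately returns nothing and fails its gate.
plan = [
    Subtask("draft summary", lambda: "placeholder summary", [lambda out: len(out) > 0]),
    Subtask("build table", lambda: "", [lambda out: len(out) > 0]),
    Subtask("export report", lambda: "report.docx", [lambda out: out.endswith(".docx")]),
]
accepted, status = execute_plan(plan)
print(status)  # needs human review: build table
```

The gates here are trivial checks; in a real deployment they would be whatever acceptance criteria the human owner of the workflow chooses to enforce.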

High performance

GLM-5's benchmarks make it the most powerful open source model in the world, according to Artificial Analysis. It surpasses Chinese rival Moonshot's Kimi K2.5, released just two weeks ago, a sign that Chinese AI companies have nearly caught up with far better resourced proprietary Western rivals.

According to z.ai's own materials shared today, GLM-5 ranks near state-of-the-art on several key benchmarks:

SWE-bench Verified: GLM-5 achieved a score of 77.8, outperforming Gemini 3 Pro (76.2) and approaching Claude Opus 4.6 (80.9).

Vending Bench 2: In a simulation of running a business, GLM-5 ranked #1 among open-source models with a final balance of $4,432.12.

Beyond performance, GLM-5 is aggressively undercutting the market. Live on OpenRouter as of February 11, 2026, it is priced at approximately $0.80–$1.00 per million input tokens and $2.56–$3.20 per million output tokens. It falls in the mid-range compared to other leading LLMs, but based on its top-tier benchmarking performance, it's what one might call a "steal."

Model	Input (per 1M tokens)	Output (per 1M tokens)	Total Cost (1M in + 1M out)	Source
Qwen 3 Turbo	$0.05	$0.20	$0.25	Alibaba Cloud
Grok 4.1 Fast (reasoning)	$0.20	$0.50	$0.70	xAI
Grok 4.1 Fast (non-reasoning)	$0.20	$0.50	$0.70	xAI
deepseek-chat (V3.2-Exp)	$0.28	$0.42	$0.70	DeepSeek
deepseek-reasoner (V3.2-Exp)	$0.28	$0.42	$0.70	DeepSeek
Gemini 3 Flash Preview	$0.50	$3.00	$3.50	Google
Kimi-k2.5	$0.60	$3.00	$3.60	Moonshot
GLM-5	$1.00	$3.20	$4.20	Z.ai
ERNIE 5.0	$0.85	$3.40	$4.25	Qianfan
Claude Haiku 4.5	$1.00	$5.00	$6.00	Anthropic
Qwen3-Max (2026-01-23)	$1.20	$6.00	$7.20	Alibaba Cloud
Gemini 3 Pro (≤200K)	$2.00	$12.00	$14.00	Google
GPT-5.2	$1.75	$14.00	$15.75	OpenAI
Claude Sonnet 4.5	$3.00	$15.00	$18.00	Anthropic
Gemini 3 Pro (>200K)	$4.00	$18.00	$22.00	Google
Claude Opus 4.6	$5.00	$25.00	$30.00	Anthropic
GPT-5.2 Pro	$21.00	$168.00	$189.00	OpenAI

At the low end of its price range ($0.80/$2.56), GLM-5 is roughly 6x cheaper on input and nearly 10x cheaper on output than Claude Opus 4.6 ($5/$25). This release also confirms rumors that Zhipu AI was behind "Pony Alpha," a stealth model that previously crushed coding benchmarks on OpenRouter.
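The size of the gap with Claude Opus 4.6 depends on which end of GLM-5's quoted price range is used. The Python sketch below works through both cases using only prices from this article; the function names and the example workload of 10 million input and 2 million output tokens are hypothetical.

```python
# Per-million-token prices in USD (input, output), taken from the article.
GLM5_LOW = (0.80, 2.56)      # low end of the OpenRouter range
GLM5_HIGH = (1.00, 3.20)     # figure used in the comparison table
OPUS_46 = (5.00, 25.00)

def price_ratio(cheaper, pricier):
    """Return (input ratio, output ratio) of the pricier model over the cheaper one."""
    return pricier[0] / cheaper[0], pricier[1] / cheaper[1]

def job_cost(prices, input_tokens, output_tokens):
    """Cost in USD of a job, given per-million-token prices."""
    return (prices[0] * input_tokens + prices[1] * output_tokens) / 1_000_000

low_in, low_out = price_ratio(GLM5_LOW, OPUS_46)
high_in, high_out = price_ratio(GLM5_HIGH, OPUS_46)
print(f"low end:  {low_in:.2f}x input, {low_out:.2f}x output")
print(f"high end: {high_in:.2f}x input, {high_out:.2f}x output")

# Hypothetical workload: 10M input tokens and 2M output tokens.
print(f"GLM-5 (table price): ${job_cost(GLM5_HIGH, 10_000_000, 2_000_000):.2f}")
print(f"Claude Opus 4.6:     ${job_cost(OPUS_46, 10_000_000, 2_000_000):.2f}")
```

At the low-end prices the ratios come out to about 6.25x on input and 9.77x on output; at the $1.00/$3.20 shown in the table they narrow to 5x and about 7.8x.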

However, despite the high benchmarks and low cost, not all early users are enthusiastic about the model, noting that its high performance doesn't tell the whole story.

Lukas Petersson, co-founder of the safety-focused autonomous AI protocol startup Andon Labs, remarked on X: "After hours of reading GLM-5 traces: an incredibly effective model, but far less situationally aware. Achieves goals via aggressive tactics but doesn't reason about its situation or leverage experience. This is scary. This is how you get a paperclip maximizer."

The "paperclip maximizer" refers to a hypothetical situation described by Oxford philosopher Nick Bostrom back in 2003, in which an AI or other autonomous creation accidentally causes an apocalyptic scenario or human extinction by following a seemingly benign instruction, like maximizing the number of paperclips produced, to an extreme degree. In pursuing that objective, it redirects all the resources necessary for human (or other) life, or otherwise makes life impossible.

Should your enterprise adopt GLM-5?

Enterprises seeking to escape vendor lock-in will find GLM-5's MIT License and open-weights availability a significant strategic advantage. Unlike closed-source competitors that keep intelligence behind proprietary walls, GLM-5 allows organizations to host their own frontier-level intelligence.

Adoption is not without friction. The sheer scale of GLM-5, at 744B parameters, requires a massive hardware floor that may be out of reach for smaller firms without significant cloud or on-premise GPU clusters.

Security leaders must weigh the geopolitical implications of a flagship model from a China-based lab, especially in regulated industries where data residency and provenance are strictly audited.

Furthermore, the shift toward more autonomous AI agents introduces new governance risks. As models move from "chat" to "work," they begin to operate across apps and files autonomously. Without robust agent-specific permissions and human-in-the-loop quality gates established by enterprise data leaders, the risk of autonomous error rises sharply.

Ultimately, GLM-5 is a "buy" for organizations that have outgrown simple copilots and are ready to build a truly autonomous office.

It is for engineers who need to refactor a legacy backend or who require a "self-healing" pipeline that doesn't sleep.

While Western labs continue to optimize for "Thinking" and reasoning depth, z.ai is optimizing for execution and scale.

Enterprises that adopt GLM-5 today aren't just buying a cheaper model; they're betting on a future where the most valuable AI is the one that can finish the project without being asked twice.
