Discover the best AI tools curated for professionals.

AIUnpacker

Search everything

Find AI tools, reviews, prompts, and more

Quick links

Grok Comes to Amazon Bedrock: What It Means for Enterprise AI

Grok is now a first-class citizen on Bedrock. If you're an AWS shop, here's what's available, what it costs, and whether you should bother switching from Claude or Nova.

AIUnpacker

AIUnpacker Editorial

June 17, 2026

9 min read
AIUnpacker

AIUnpacker

Jun 17, 2026 · 9m read

Jun 17, 2026 9 min

Key Takeaways

Grok is now a first-class citizen on Bedrock. If you're an AWS shop, here's what's available, what it costs, and whether you should bother switching from Claude or Nova.

Editorial Disclosure & Affiliate Notice

This content is published for informational and educational purposes only. It is not intended as a substitute for professional, legal, financial, or medical advice. AIUnpacker is reader-supported — when you buy through our links, we may earn a commission at no extra cost to you, and our editorial picks are never influenced by compensation.

  • For educational purposes only. Nothing here should be taken as a guarantee, recommendation, or professional recommendation.
  • AI-assisted editing. Drafts are produced with AI assistance and reviewed by our human editorial team.
  • Opinions are our own. Also, we are not affiliated with most tools we cover unless explicitly stated.
  • Information may be outdated. Verify pricing, features, and policies directly with the vendor.
  • Last reviewed: June 17, 2026.

Read more on our About page, Terms and Editorial Policy.

Grok Comes to Amazon Bedrock: What It Means for Enterprise AI

If your AI roadmap runs through AWS, you’ve got a new option to weigh. xAI’s Grok 4.3 went generally available on Amazon Bedrock on June 15, 2026, making xAI the third major independent AI lab on the platform alongside Anthropic and OpenAI (AWS announcement, June 15, 2026). The headline price — $1.25 per million input tokens and $2.50 per million output tokens — is the cheapest a US-lab frontier reasoning model goes on Bedrock right now (AWS Bedrock pricing page).

That’s the elevator pitch. The full picture has more nuance than the launch tweet. I’ve spent the last few days digging through the docs, the benchmarks, and the pricing tables, and I want to walk you through what actually shipped, what it costs in real workloads, and where the gotchas live. If you’re an enterprise AI buyer on AWS, this is the breakdown I wish I’d had on Monday morning.

What Just Landed on June 15, 2026

Grok 4.3 is now live on Bedrock in three US regions — Oregon (us-west-2), N. Virginia (us-east-1), and Ohio (us-east-2) — as In-Region inference only. No Geo Cross-Region, no Global Cross-Region, no multi-region failover for the launch (AWS Grok 4.3 model card).

The model itself wasn’t new on launch day. xAI started beta testing Grok 4.3 on April 17, 2026, and flipped it to the API default on April 30 (VentureBeat, May 1, 2026). Bedrock is the distribution layer, not a new model release. What you’re getting:

  • Model ID: xai.grok-4.3
  • Context window: 1 million tokens
  • Max output: 30,000 tokens per request
  • Input modalities: Text and image; output is text only
  • Reasoning: Always-on by default, configurable via reasoning.effort (none, low, medium, high)
  • Service tiers: Standard, Priority, Flex — Reserved is not supported

A quick note on that “always-on” reasoning. Unlike Claude or Nova, where you toggle thinking on or off, Grok 4.3 thinks before every response. You can suppress reasoning tokens from the output by setting effort: none, but the internal process still runs. For high-volume pipelines that don’t need deep reasoning, this matters — you’re paying for thinking you may not want.

How Much Does Grok 4.3 Cost on Bedrock?

On-demand pricing is $1.25 per million input tokens and $2.50 per million output tokens, with cached input at $0.20 per million. Confirmed on the AWS Bedrock pricing page (us-west-2 region) and matching the direct xAI API rate.

Here’s how that stacks up against the other Bedrock frontier options an enterprise team is likely to weigh. Pricing is per million tokens, on-demand, US regions:

ModelProviderInputOutputContextEndpoint
Grok 4.3xAI$1.25$2.501MMantle
Amazon Nova ProAmazon$0.80$3.20300KRuntime
DeepSeek V3.2DeepSeek$0.62$1.85128KRuntime
Claude Sonnet 4.6Anthropic$3.00$15.00200KRuntime
Claude Opus 4.7Anthropic$5.00$25.00200KRuntime
GPT-5.4 (Bedrock)OpenAI$2.75$16.50128KRuntime

(Sources: AWS Bedrock pricing, truefoundry Bedrock pricing analysis, June 12, 2026)

“Grok 4.3 is as smart as Sonnet 4.6 and 5x cheaper and faster.” — Bindu Reddy, CEO of Abacus AI, on X, May 1, 2026.

Three things stand out from that table. First, Grok 4.3 is the only frontier-class reasoning model under $1.50 input on Bedrock. Second, it’s the only one with a million-token context window. Third — and this is the part the marketing glosses over — it’s also the only one running on a non-standard endpoint.

The Three Gotchas Nobody’s Talking About

If you only read the AWS announcement, you’ll assume Grok 4.3 drops into your existing Bedrock code with a model ID change. It does not. Here are the three traps that will bite teams who don’t read the model card carefully.

1. The Mantle Endpoint Breaks Your Standard SDK Code

Grok 4.3 doesn’t run on the bedrock-runtime endpoint that Claude, Titan, and most other Bedrock models use. It runs on Mantle, a new distributed inference engine inside Bedrock that’s OpenAI-compatible (AWS Mantle docs).

What that means in practice:

  • The endpoint URL is https://bedrock-mantle.{region}.api.aws/openai/v1, not the standard Bedrock Runtime URL
  • Converse API and InvokeModel are not supported
  • You need to use an OpenAI-compatible client (the openai Python SDK works) and point it at Mantle
  • Default parameters differ from OpenAI: temperature defaults to 0.7, top_p to 0.95, max_completion_tokens to 131,072

If your platform team has standardized on the Bedrock Converse API, adding Grok 4.3 is a separate integration project, not a config flip. Budget the engineering work before you promise it in a roadmap.

2. The 200K Context Cliff Doubles Your Bill

The 1 million-token context window is real, but pricing doubles for any request over 200,000 total tokens. That’s a context-pricing tier that kicks in silently if you’re not watching.

For long-document workloads — contract review, case-law corpora, full financial filings, multi-file RAG pipelines — the effective cost can land well above the headline rate. A pipeline that routinely assembles 600K-token prompts is not running at $1.25/M input. It’s running at roughly double. Model your costs at the higher tier before you commit a budget.

3. The Vendor Is Mid-Restructuring

xAI merged into SpaceX in February 2026, with plans to dissolve xAI as a separate entity and fold Grok and X into a SpaceX AI division. Nine of xAI’s eleven original co-founders have departed (The Register, May 29, 2026).

The compliance checklist passes — xAI maintains SOC 2 Type II, HIPAA eligibility, and GDPR (VentureBeat, May 1, 2026). But compliance describes the model and the platform, not the organization behind it. A regulated buyer betting on stable API behavior and a multi-year roadmap deserves more diligence than the certifications alone suggest.

How Grok 4.3 Actually Performs

On the Artificial Analysis Intelligence Index, Grok 4.3 scores 53 at high reasoning effort and around 38 at low effort (Artificial Analysis, April 30, 2026). At high effort, that’s a clear jump over the prior Grok 4.20 and places it in competitive territory with Claude Sonnet 4.6. At low effort, it’s middling.

The agentic story is the genuinely strong one:

  • GDPval-AA agentic ELO: 1,500 — a 321-point jump over Grok 4.20’s 1,179
  • #1 on Vals AI CaseLaw v2 (79.3% accuracy) — legal reasoning
  • #1 on Vals AI CorpFin — corporate finance
  • Tau2-Bench Telecom: 98% — customer-support tool calling, up 8 points from prior gen

The honest caveat: Grok 4.3 also lost about 8 points on the non-hallucination metric versus Grok 4.20, even as it gained roughly 8 points on factual accuracy (AA-Omniscience). More right answers, but also more confident wrong ones. For finance, healthcare, and legal workflows where a fabricated citation is a liability, that regression is the number to weigh against the marketing language.

On coding, the news is mixed. Independent reviewers at Andon Labs reported Grok 4.3 as a “big regression” on Vending-Bench 2, describing the model as having “narcolepsy problems” in long agentic simulations. It scores only 11% on ProofBench (math). So: strong on agentic tool use in narrow domains, weaker on general coding and complex math.

What This Means for Your AWS Stack

If you’re already running workloads on Bedrock, here’s the practical read.

1. For high-volume agentic and tool-calling pipelines, Grok 4.3 is the price-performance pick on the platform right now. At roughly a third of Claude Sonnet’s input cost and a sixth of its output cost, with competitive agentic scores, it’s a credible default for customer support automation, contract review, and financial document Q&A. Account for the 44% extra output tokens that always-on reasoning emits versus prior generations, and check your real spend before budgeting the headline rate.

2. For long-document RAG with prompts routinely above 200K tokens, model the doubled context tier first. If your pipeline pushes past the cliff on most calls, the headline advantage may disappear against a 200K-window model at higher per-token rates.

3. For Bedrock-standardized teams on the Converse API, scope the Mantle integration as its own workstream. It’s an OpenAI-compatible client path, not a model swap. The migration is tractable but real.

4. For regulated verticals (finance, healthcare, legal), the certifications are in place, but the non-hallucination regression and vendor-in-flux risk warrant extra diligence. Pilot before you commit, and keep a frontier accuracy model in the loop for high-liability outputs.

“The energy drink of frontier models: it’ll keep you up, but you won’t enjoy the experience and you’ll regret it in the morning.” — Corey Quinn, The Register, May 29, 2026.

That’s harsh but worth weighing against the price.

How AWS Sees This (Beyond the Headline)

There’s a structural read here that the launch coverage mostly skips. AWS now hosts all three independent frontier labs on Bedrock: Anthropic (with a $100 billion-plus compute commitment and up to 5 gigawatts of Trainium capacity per the Anthropic-Amazon announcement, April 20, 2026), OpenAI (with an expanded $38 billion agreement plus another $100 billion in commitments), and now xAI.

Bedrock now serves “more than 100,000 organizations worldwide” per AWS’s own marketing page. The pattern across these launches — model shows up on Bedrock shortly after a major compute deal — suggests the catalog is partly a sales funnel for Trainium silicon, not just an enterprise model marketplace.

What that means for you: the model lineup on Bedrock is going to keep expanding. Mantle is positioned to be the on-ramp for OpenAI-compatible third-party models (it launched OpenAI-compatible support in early 2026 per the AWS Mantle announcement, March 2026). If your team standardizes on Mantle for new integrations now, you’ll be ready for whatever ships next.

Should You Actually Switch?

The honest answer: it depends on the workload.

Here’s a quick decision framework:

  1. Cost-sensitive agentic workloads at scale — Strong candidate. Cheapest US-lab reasoning model on Bedrock, with the strongest agentic benchmark jumps of any 2026 release I’ve tracked.
  2. Long-document RAG above 200K tokens — Run the cliff math first. If your prompts routinely exceed 200K, the effective cost may erase the headline advantage.
  3. Regulated finance/healthcare/legal — Pilot with guardrails. Certifications pass, but the hallucination regression and vendor restructuring are real risk factors.
  4. Bedrock-standardized teams on Converse/InvokeModel — Scope the Mantle integration work before promising it. It’s a separate code path, not a model ID swap.

For most teams, the right sequence is the same: shortlist Grok 4.3 on price, prototype against Mantle on your own prompts, measure real spend with the 200K cliff factored in, and run your highest-liability prompts through an accuracy check before trusting it in production.

The Bottom Line

Grok on Amazon Bedrock is a real event, not just a catalog line. xAI is the third independent frontier lab on the platform, and at $1.25 / $2.50 per million tokens it’s the cheapest US-lab frontier reasoning model there. For high-volume agentic and tool-use workloads, that price-performance profile is genuinely compelling — the agentic benchmark jump over the prior generation is the strongest part of the story.

The fine print is where teams will win or lose money. The 1M window doubles in price above 200K tokens. The model lives on the Mantle endpoint, so it’s a separate integration for anyone standardized on Bedrock’s own SDK. And the accuracy story is mixed: more correct answers, but a measured regression on hallucination that matters most in exactly the regulated verticals AWS is targeting.

Frontier capability is no longer the scarce thing. Distribution, price discipline, and organizational stability are. Grok 4.3 brings the first two convincingly. The third — with a vendor mid-restructuring and a founder exodus — is the open question. Don’t decide off a headline. Run your own eval on the prompts you care about, with the cliff, the endpoint, and the accuracy trade-off all priced in.

Get our weekly AI digest

The latest AI tools, prompts, and insights — delivered every Tuesday.

No spam. Unsubscribe anytime.

AIUnpacker

AIUnpacker Editorial Team

Verified

A collective of engineers, journalists, and AI practitioners dedicated to providing clear, unbiased analysis of the AI tools shaping tomorrow.