AWS GovCloud (US) adds OpenAI GPT OSS and NVIDIA Nemotron models to Amazon Bedrock
Amazon Bedrock now supports OpenAI’s open-weight GPT OSS models (120B and 20B) and NVIDIA Nemotron models (Nano 9B v2, Nano 12B v2, Nano 30B, Super 120B) in AWS GovCloud (US) for U.S. government workloads.
1 source · cross-referenced
- Amazon Bedrock in AWS GovCloud (US) now supports OpenAI’s open-weight GPT OSS models (120B and 20B) and NVIDIA Nemotron models (Nano 9B v2, Nano 12B v2, Nano 30B, Super 120B).
- The models are served via a next-generation inference engine with zero operator access and run entirely within the AWS GovCloud (US) compliance boundary.
- Agencies can use a unified API to select models for specific use cases without changing application code.
AWS GovCloud (US) now supports OpenAI’s open-weight GPT OSS models (gpt-oss-120b and gpt-oss-20b) and NVIDIA Nemotron models (Nemotron 3 Super 120B, Nemotron 3 Nano 30B, and the Nano 9B v2 and Nano 12B v2 variants) in Amazon Bedrock.
The models are served via Amazon Bedrock’s next-generation inference engine, which operates with zero operator access and runs entirely within the AWS GovCloud (US) compliance boundary. This ensures no AWS operator, customer, or model provider can access customer data such as prompts or completions.
Agencies can invoke the models through two endpoints: the bedrock-mantle endpoint, which is OpenAI-compatible and supports the OpenAI Python and TypeScript SDKs, and the bedrock-runtime endpoint, which uses the Converse and InvokeModel APIs via the AWS SDK and includes access to native Amazon Bedrock features such as Guardrails.
The NVIDIA Nemotron family includes a 120B parameter hybrid mixture-of-experts (MoE) model (Nemotron 3 Super) that activates only 12B parameters per token, delivering up to 5x higher throughput than the previous generation, and a 30B parameter model (Nemotron 3 Nano) that activates approximately 3B parameters per token, reducing reasoning-token generation by up to 60% and delivering 4x higher throughput.
OpenAI’s GPT OSS models include a 120B parameter model designed for production and high-reasoning use cases and a 20B parameter model optimized for lower latency and specialized or local deployments. Both models provide a 128K-token context window and support up to 16K output tokens.
The models are available in AWS GovCloud (US) Regions, which are physically located in the United States and administered exclusively by U.S. citizens. These Regions support compliance frameworks including FedRAMP High, DoD Cloud Computing Security Requirements Guide Impact Levels 2, 4, and 5, ITAR, and CJIS.
Agencies can use a unified API to select the appropriate model for specific use cases without modifying application code, enabling deployment of agentic applications and mission workflows such as automated security control assessments, multi-document intelligence synthesis, contract and acquisition analysis, and policy compliance checking.
- Jul 4, 2026 · TechCrunch — AI
Alibaba bans employees from using Anthropic’s Claude Code
Trust75 - Jul 4, 2026 · AWS — Machine Learning Blog
AWS details how Amazon Bedrock can be used to detect AI-generated phishing emails
Trust78 - Jul 4, 2026 · AWS — Machine Learning Blog
AWS SageMaker AI adds multi-turn reinforcement learning training loop with serverless execution
Trust79