DeepInfra Now Available as Hugging Face Inference Provider with SDK Support
The serverless inference platform DeepInfra has been integrated into Hugging Face's ecosystem, allowing developers to access models through the Hub's web interface and Python/JavaScript SDKs, with requests billed either directly by DeepInfra or routed through Hugging Face.
- DeepInfra joined Hugging Face Inference Providers, enabling access to models like DeepSeek V4, Kimi-K2.6, and GLM-5.1 through the Hub's UI and client SDKs
- Users can configure API keys in account settings and choose between direct billing (via DeepInfra) or routed requests billed through Hugging Face
- Integration extends to agent harnesses including Pi, OpenCode, and Hermes Agents, with text generation and conversational tasks supported at launch
- Hugging Face PRO subscribers receive $2 monthly inference credits applicable across all supported providers
Hugging Face has added DeepInfra to its growing roster of integrated Inference Providers, giving developers native access to the serverless platform's model catalog directly from model pages on the Hub. DeepInfra, which maintains a catalog exceeding 100 models and emphasizes competitive per-token pricing, initially supports text generation and conversational tasks, with LLMs including DeepSeek V4, Kimi-K2.6, and GLM-5.1 available at launch. Expanded support for text-to-image, text-to-video, and embedding tasks is planned.
The integration surfaces DeepInfra as an option in Hugging Face's web UI, allowing users to invoke models directly from model cards. Developers can set preferences in account settings, specifying which providers to prioritize and whether to supply custom API keys. Two billing modes are available: customers can route requests through Hugging Face (charged to their HF account at standard provider rates with no markup), or authenticate directly to DeepInfra and incur charges on their own DeepInfra account.
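At the HTTP level, the two billing modes differ only in which endpoint you call and whose key you send. A minimal stdlib-only sketch is below; the router and DeepInfra endpoint URLs and the model id are illustrative assumptions, not details from the announcement:

```python
import json
import os
import urllib.request

# Assumed endpoints: Hugging Face's OpenAI-compatible router (routed billing,
# charged to your HF account) vs. DeepInfra's own OpenAI-compatible API
# (direct billing, charged to your DeepInfra account).
ROUTED_URL = "https://router.huggingface.co/v1/chat/completions"
DIRECT_URL = "https://api.deepinfra.com/v1/openai/chat/completions"


def build_request(url: str, token: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat request; the URL/token pair decides who bills you."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
    )


# Routed mode: Hugging Face token, billed through HF at provider rates.
# (Model id is a placeholder -- substitute any model DeepInfra serves.)
req = build_request(
    ROUTED_URL,
    os.environ.get("HF_TOKEN", "hf_xxx"),
    "deepseek-ai/DeepSeek-V4",
    "Hello",
)
# Direct mode: swap in DIRECT_URL and a DeepInfra API key instead.

if os.environ.get("HF_TOKEN"):  # only send the request when a real token exists
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request body works against either endpoint, which is what makes switching billing modes a one-line change.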
Both the Python (huggingface_hub >= 1.11.2) and JavaScript (@huggingface/inference) SDKs now include DeepInfra support, and the integration also works with standard OpenAI-compatible client libraries. It extends to popular agent frameworks such as Pi, OpenCode, and Hermes Agents, reducing the friction of plugging DeepInfra-hosted models into downstream applications.
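In the Python SDK this follows the same pattern Hugging Face uses for its other inference providers: a `provider` argument on `InferenceClient`. A sketch, assuming an `HF_TOKEN` environment variable and an illustrative model id:

```python
import os


def build_messages(prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-messages shape both SDKs expect."""
    return [{"role": "user", "content": prompt}]


def main() -> None:
    # Requires `pip install "huggingface_hub>=1.11.2"` and an HF token.
    # With an HF token, the request is routed and billed through Hugging Face;
    # a DeepInfra key would bill directly instead.
    from huggingface_hub import InferenceClient

    client = InferenceClient(provider="deepinfra", api_key=os.environ["HF_TOKEN"])
    completion = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-V4",  # illustrative repo id
        messages=build_messages("Summarize what an inference provider is."),
    )
    print(completion.choices[0].message.content)


if __name__ == "__main__" and os.environ.get("HF_TOKEN"):
    main()
```

The JavaScript SDK mirrors this shape, so code written against one provider can switch to DeepInfra by changing only the provider name.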
Hugging Face PRO subscribers receive $2 in monthly inference credits usable across all integrated providers, while free-tier users access a smaller quota of free inference capacity.
- Apr 29, 2026 · Hugging Face