DeepInfra Now Available as Hugging Face Inference Provider with SDK Support
The serverless inference platform DeepInfra has been integrated into Hugging Face's ecosystem, allowing developers to access models through the Hub's web interface and Python/JavaScript SDKs with billing options for direct or routed requests.
1 source · single source
- DeepInfra joined Hugging Face Inference Providers, enabling access to models like DeepSeek V4, Kimi-K2.6, and GLM-5.1 through the Hub's UI and client SDKs
- Users can configure API keys in account settings and choose between direct billing (via DeepInfra) or routed requests billed through Hugging Face
- Integration extends to agent harnesses including Pi, OpenCode, and Hermes Agents, with support for text generation and conversational tasks launching initially
- Hugging Face PRO subscribers receive $2 monthly inference credits applicable across all supported providers
Hugging Face has added DeepInfra to its growing roster of integrated Inference Providers, giving developers native access to the serverless platform's model catalog directly from model pages on the Hub. DeepInfra, which maintains a catalog exceeding 100 models and emphasizes competitive per-token pricing, initially supports text generation and conversational tasks, with LLMs including DeepSeek V4, Kimi-K2.6, and GLM-5.1 available at launch. Expanded support for text-to-image, text-to-video, and embedding tasks is planned.
The integration surfaces DeepInfra as an option in Hugging Face's web UI, allowing users to invoke models directly from model cards. Developers can establish preferences in account settings, specifying which providers to prioritize and whether to supply custom API keys. Two billing modes are available: customers can route requests through Hugging Face (charged to their HF account at standard provider rates with no markup), or authenticate directly to DeepInfra and incur charges on their own DeepInfra account.
Both the Python (huggingface_hub >= 1.11.2) and JavaScript (@huggingface/inference) SDKs now include DeepInfra support, enabling programmatic access via standard OpenAI-compatible client libraries. The integration extends to popular agent frameworks such as Pi, OpenCode, and Hermes Agents, reducing the friction of plugging DeepInfra-hosted models into downstream applications.
Hugging Face PRO subscribers receive $2 in monthly inference credits usable across all integrated providers, while free-tier users access a smaller quota of free inference capacity.
- May 21, 2026 · TechCrunch
Spotify launches ElevenLabs-powered audiobook creation tool for independent authors
Trust54 - May 20, 2026 · Hugging Face
Hugging Face releases six Ettin reranker models with distillation training recipe
Trust74 - May 19, 2026 · Google AI — Blog
Google announces voice features, image editor, and personal AI agent for Workspace
Trust77