Tools · Apr 29, 2026

DeepInfra Now Available as Hugging Face Inference Provider with SDK Support

The serverless inference platform DeepInfra has been integrated into Hugging Face's ecosystem, allowing developers to access models through the Hub's web interface and Python/JavaScript SDKs with billing options for direct or routed requests.



TL;DR
  • DeepInfra joined Hugging Face Inference Providers, enabling access to models like DeepSeek V4, Kimi-K2.6, and GLM-5.1 through the Hub's UI and client SDKs
  • Users can configure API keys in account settings and choose between direct billing (via DeepInfra) or routed requests billed through Hugging Face
  • Integration extends to agent harnesses including Pi, OpenCode, and Hermes Agents, with text generation and conversational tasks supported at launch
  • Hugging Face PRO subscribers receive $2 monthly inference credits applicable across all supported providers

Hugging Face has added DeepInfra to its growing roster of integrated Inference Providers, giving developers native access to the serverless platform's model catalog directly from model pages on the Hub. DeepInfra, which maintains a catalog exceeding 100 models and emphasizes competitive per-token pricing, initially supports text generation and conversational tasks, with LLMs including DeepSeek V4, Kimi-K2.6, and GLM-5.1 available at launch. Expanded support for text-to-image, text-to-video, and embedding tasks is planned.

The integration surfaces DeepInfra as an option in Hugging Face's web UI, allowing users to invoke models directly from model cards. Developers can establish preferences in account settings, specifying which providers to prioritize and whether to supply custom API keys. Two billing modes are available: customers can route requests through Hugging Face (charged to their HF account at standard provider rates with no markup), or authenticate directly to DeepInfra and incur charges on their own DeepInfra account.

Both the Python (huggingface_hub >= 1.11.2) and JavaScript (@huggingface/inference) SDKs now include DeepInfra support, enabling programmatic access via standard OpenAI-compatible client libraries. The integration extends to popular agent frameworks such as Pi, OpenCode, and Hermes Agents, reducing the friction of plugging DeepInfra-hosted models into downstream applications.

Hugging Face PRO subscribers receive $2 in monthly inference credits usable across all integrated providers, while free-tier users access a smaller quota of free inference capacity.

Sources
  1. Hugging Face — DeepInfra on Hugging Face Inference Providers

Stories may contain errors. Dispatch is assembled with AI assistance and curated by human editors; despite the trust-score filter, mistakes happen. We correct publicly — every article links to its revision history. Nothing here is financial, legal, or medical advice. Verify before relying on any claim.

© 2026 Dispatch. No ads. No sponsorships. No paid placement. Reader-supported via Ko-fi.

Built by a person who cares about honest AI news.