Models | Machine Learning Inference

Qwen/

Qwen2.5-72B-Instruct

text-generation

Qwen2.5 is a model pretrained on a large-scale dataset of up to 18 trillion tokens, offering significant improvements in knowledge, coding, mathematics, and instruction following compared to its predecessor Qwen2. The model is also better at generating long texts, understanding structured data, and generating structured outputs, and it supports over 29 languages.

bfloat16

32k

$0.04/$0.10 in/out Mtoken

Qwen/

Qwen2.5-7B-Instruct

text-generation

The 7 billion parameter Qwen2.5 excels in language understanding, multilingual capabilities, coding, mathematics, and reasoning.
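
The `in/out Mtoken` prices in this listing read as US dollars per million input tokens and per million output tokens. The following is a minimal sketch of how such a label could be turned into a per-request cost estimate. The function names `parse_in_out_price` and `estimate_cost_usd` are hypothetical helpers written for illustration, not part of any provider SDK, and the sketch assumes billing is linear in token count with no minimums or rounding.

```python
def parse_in_out_price(label: str) -> tuple[float, float]:
    """Parse a label such as '$0.04/$0.10 in/out Mtoken' into (input, output) USD prices.

    Hypothetical helper: assumes the first whitespace-separated field
    has the form '$<in>/$<out>'.
    """
    price_field = label.split()[0]            # e.g. '$0.04/$0.10'
    price_in, price_out = price_field.split("/")
    return float(price_in.lstrip("$")), float(price_out.lstrip("$"))


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      price_in: float, price_out: float) -> float:
    """Estimate the cost of one request, given prices in USD per million tokens."""
    tokens_per_mtoken = 1_000_000
    return (input_tokens * price_in + output_tokens * price_out) / tokens_per_mtoken


if __name__ == "__main__":
    price_in, price_out = parse_in_out_price("$0.04/$0.10 in/out Mtoken")
    # A request with 2,000 prompt tokens and 500 generated tokens.
    cost = estimate_cost_usd(2_000, 500, price_in, price_out)
    print(f"estimated cost: ${cost:.6f}")
```

Single-price entries such as `$0.005 / Mtoken` can be handled with the same arithmetic by passing zero output tokens.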

fp8

32k

$0.06/$0.15 in/out Mtoken

Qwen/

Qwen2.5-Coder-32B-Instruct

text-generation

Qwen2.5-Coder is the latest series of code-specific Qwen large language models (formerly known as CodeQwen). It brings significant improvements in code generation, code reasoning, and code fixing, and provides a more comprehensive foundation for real-world applications such as code agents. It enhances coding capabilities while maintaining its strengths in mathematics and general competencies.

32k

Replaced

Qwen/

Qwen2.5-Coder-7B

text-generation

Qwen2.5-Coder-7B is a powerful code-specific large language model with 7.61 billion parameters. It's designed for code generation, reasoning, and fixing tasks. The model covers 92 programming languages and has been trained on 5.5 trillion tokens of data, including source code, text-code grounding, and synthetic data.

bfloat16

125k

$0.20/$0.60 in/out Mtoken

Qwen/

Qwen2.5-VL-32B-Instruct

text-generation

32k

$0.002 / Mtoken

Qwen/

Qwen3-Embedding-0.6B

embeddings

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. Building upon the dense foundational models of the Qwen3 series, it provides a comprehensive range of text embeddings and reranking models in various sizes (0.6B, 4B, and 8B).

32k

$0.005 / Mtoken

Qwen/

Qwen3-Embedding-4B

embeddings

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. Building upon the dense foundational models of the Qwen3 series, it provides a comprehensive range of text embeddings and reranking models in various sizes (0.6B, 4B, and 8B).

32k

$0.010 / Mtoken

Qwen/

Qwen3-Embedding-8B

embeddings

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. Building upon the dense foundational models of the Qwen3 series, it provides a comprehensive range of text embeddings and reranking models in various sizes (0.6B, 4B, and 8B).

32k

$0.010 / Mtoken

Qwen/

Qwen3-Reranker-0.6B

reranker

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. Building upon the dense foundational models of the Qwen3 series, it provides a comprehensive range of text embeddings and reranking models in various sizes (0.6B, 4B, and 8B).

32k

$0.025 / Mtoken

Qwen/

Qwen3-Reranker-4B

reranker

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. Building upon the dense foundational models of the Qwen3 series, it provides a comprehensive range of text embeddings and reranking models in various sizes (0.6B, 4B, and 8B).

32k

$0.050 / Mtoken

Qwen/

Qwen3-Reranker-8B

reranker

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. Building upon the dense foundational models of the Qwen3 series, it provides a comprehensive range of text embeddings and reranking models in various sizes (0.6B, 4B, and 8B).

$10.00 per M characters

ResembleAI/

chatterbox

text-to-speech

Chatterbox is Resemble AI's first production-grade open source TTS model. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs, and is consistently preferred in side-by-side evaluations. Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. It's also the first open source TTS model to support emotion exaggeration control, a powerful feature that makes your voices stand out.

fp8

8k

Replaced

Sao10K/

L3-70B-Euryale-v2.1

text-generation

Euryale 70B v2.1 is a model from Sao10K focused on creative roleplay.

bfloat16

8k

Replaced

Sao10K/

L3-8B-Lunaris-v1

text-generation

A generalist / roleplaying model merge based on Llama 3. Sao10K has carefully selected the values based on extensive personal experimentation and has fine-tuned them to create a customized recipe.

fp8

8k

$0.02/$0.05 in/out Mtoken

Sao10K/

L3-8B-Lunaris-v1-Turbo

text-generation

fp8

128k

$0.65/$0.75 in/out Mtoken

Sao10K/

L3.1-70B-Euryale-v2.2

text-generation

Euryale 3.1 - 70B v2.2 is a model from Sao10K focused on creative roleplay.

fp8

128k

$0.65/$0.75 in/out Mtoken

Sao10K/

L3.3-70B-Euryale-v2.3

text-generation

L3.3-70B-Euryale-v2.3 is a model from Sao10K focused on creative roleplay.

$0.10 / video

Wan-AI/

Wan2.1-T2V-1.3B

text-to-video

The Wan2.1 1.3B model is a lightweight, efficient text-to-video generator. Despite its compact size, it delivers impressive performance across benchmarks and generates high-quality 480P videos.
