New LLMs Available: All Qwen3's Best Models

Three new Qwen3 models just landed on NetMind API. From long-context thinking models to lean coders, we’ve got them all plugged into our Model Library and ready to run.

Whether you’re building a LLM-powered assistant, a cost-efficient code interpreter, or just need to prototype with SOTA-level LLM models—there’s something here for you.

Model at a glance

Qwen3-235B-A22B-Thinking-2507

This is your go-to model for deep reasoning over long context windows. It’s the most capable general-purpose model in this drop.

Total parameters: 235B
Active parameters: 22B (Mixture-of-Experts)
Native context window: 262K tokens
Function calling: Supported
Pricing: $0.20 (input) / $0.80 (output) per million tokens
RPM: 60
Feature: This model supports only thinking mode. The default chat template automatically includes a token of <think> to enforce model thinking. Therefore, it is normal for the model's output to contain only </think> without an explicit opening <think> tag.

Qwen3-235B-A22B-Instruct-2507

Fine-tuned for instruction following and conversational reliability. If you want chat performance without compromising capability, this is it.

Total parameters: 235B
Active parameters: 22B (MoE)
Native context window: 256K tokens
Function calling: Supported
Pricing: $0.20 (input) / $0.80 (output) per million tokens
RPM: 60
Feature: This model supports only non-thinking mode and does not generate <think></think> blocks in its output. Meanwhile, specifying enable_thinking=False is no longer required.

Qwen3-Coder-30B-A3B-Instruct
A smaller, sharper coding model designed for speed and affordability. Perfect for in-IDE agents, repo summarizers, or lightweight RAG pipelines.

Total parameters: 30B
Active parameters: 3B (MoE)
Native context window: 128K tokens
Function calling: Supported
Pricing: $0.04 (input) / $0.17 (output) per million tokens
RPM: 60
Feature: Qwen3-Coder series excel in coding tasks. This is a light-weight alternative to the 480B-parameter Qwen3-Coder.

All three models support function calling out of the box and run on NetMind’s ultra-efficient low-latency infra, so you can deploy with confidence—whether you're automating workflows, building tools, or just experimenting.

If you build something with these, we want to see it.

User-Agent: