Qwen3.6 Is Now Live on NetMind: Plus, Flash, and the Open-Source 35B-A3B

We're thrilled to announce that the full Qwen3.6 family is now live on the NetMind Model Library: Qwen3.6-Plus, Qwen3.6-Flash, and the open-source Qwen3.6-35B-A3B.

Alibaba's Qwen team built the 3.6 family for the real-world agent era: long-horizon planning, repository-scale reasoning, native tool use, and low-latency inference at production cost. For your OpenClaw agent, Claude Code, Cursor, internal coding assistant, or production AI workflow, you can now plug any of the three variants into NetMind with one OpenAI-compatible endpoint and start building.

Qwen3.6, Built for Real-World Agents at Every Scale

The Qwen3.6 series is designed as a lineup, not a single model. Each variant solves a different real-world constraint (depth of reasoning, throughput and cost, or on-premises ownership) while sharing the same architectural DNA across the family.

All three variants share a new hybrid architecture: efficient linear attention combined with sparse mixture-of-experts routing. The practical outcome is that context length stops being a latency tax. Plus and Flash both process up to 1 million tokens on NetMind, and the open-source 35B-A3B delivers frontier-class reasoning with just 3 billion parameters active per token.

Benchmark-Topping Agentic Coding

Both Qwen3.6-Plus and the open-source 35B-A3B post some of the strongest publicly reported coding and reasoning benchmarks in their respective tiers.

Qwen3.6-Plus (flagship)

SWE-bench Verified: 78.8 · SWE-bench Multilingual: 73.8 · SWE-bench Pro: 56.6
Terminal-Bench 2.0: 61.6. The highest published score on agentic terminal execution across the comparison set.
MMMU: 86.0. Frontier-tier multimodal reasoning.
NL2Repo: 37.9 · QwenWebBench (Elo): 1502. Leadership on long-horizon repository generation and front-end code.

Qwen3.6-35B-A3B (open-source, 3B active params)

SWE-bench Verified: 73.4 · SWE-bench Multilingual: 67.2 · SWE-bench Pro: 49.5
Terminal-Bench 2.0: 51.5. Real-world terminal task execution, not just isolated code snippets.
LiveCodeBench v6: 80.4 · NL2Repo: 29.4 · MCPMark: 37.0
AIME26: 92.7 · GPQA: 86.0 · MMLU-Pro: 85.2. Deep reasoning and STEM strength beyond coding.

For developers, the point is simpler: both are genuinely deployable agent-grade systems, not bench-optimized chat models. The 35B-A3B result is especially striking: frontier-class reasoning you can actually self-host.

Three Models, Three Deployment Patterns

Most model families ship one flagship and call the rest variants. Qwen3.6 takes a different approach: each of the three models is engineered for a distinct production pattern.

Qwen3.6-Plus

Qwen3.6-Plus is the hybrid-architecture flagship: linear attention combined with sparse mixture-of-experts routing, built for agentic coding, front-end development, and complex long-context tasks. Choose it when you want the model to spend minutes thinking and orchestrating, not milliseconds responding.

Qwen3.6-Flash

Qwen3.6-Flash is the commercial speed tier, sharing the same hybrid linear-attention + sparse MoE design. It supports a 1M-token context window with thinking mode, tuned for fast and cost-efficient inference on agentic and complex tasks. This is the always-on workhorse for real-time chat, in-IDE autocomplete, latency-bound agent loops, and high-volume production traffic.

Qwen3.6-35B-A3B

Qwen3.6-35B-A3B is the open-source tier: Apache 2.0 license, 35B total parameters with 3B active per token, and a native context window of 262K tokens extensible to ~1M. You can fine-tune it, deploy it commercially, run it on-prem, and ship it inside your own product, with zero vendor lock-in.

That matters because production agent workloads are rarely uniform. Useful platforms need a flagship for the hard, high-stakes tasks, a speed tier for the volume of day-to-day interactions, and an open-source tier for fine-tuning, edge deployment, or sovereignty requirements. Qwen3.6 ships all three in one coherent family.

Pick the Right Qwen3.6 for the Job

Qwen3.6-Plus: when you need frontier reasoning, 1M context, and long-horizon autonomy for complex coding work, such as: "Refactor this 50K-line monorepo over a long agent session until the tests pass."
Qwen3.6-Flash: when you need speed, scale, and production cost-efficiency, such as: "Serve a real-time coding copilot at production scale without blowing the budget."
Qwen3.6-35B-A3B: when you need open weights, Apache 2.0 freedom, and the ability to self-host or fine-tune, such as: "Self-host a fine-tuned coding agent on our own GPUs, commercially licensed."

All three support function calling and reasoning out of the box, all three run on NetMind's ultra-efficient low-latency infrastructure, and all three share the same OpenAI-compatible endpoint.

Models at a Glance

Qwen3.6-Plus

Provider: Alibaba / Qwen Team
Model type: Text generation / agentic coding flagship
Architecture: Hybrid linear attention + sparse Mixture-of-Experts
Context on NetMind: 1,000K tokens
Reasoning mode: Supported
Function calling: Supported
RPM: 600
NetMind pricing: $0.274 input / $1.644 output per million tokens
Best for: Long-horizon coding agents, 1M-context refactors, agentic front-end development

Qwen3.6-Flash

Provider: Alibaba / Qwen Team
Model type: Text generation / speed-optimized commercial
Architecture: Hybrid linear attention + sparse Mixture-of-Experts
Context on NetMind: 1,000K tokens
Reasoning mode: Supported (thinking mode)
Function calling: Supported
RPM: 600
NetMind pricing: $0.164 input / $0.986 output per million tokens
Best for: High-throughput production apps, real-time UIs, latency-bound agents, cost-efficient inference at scale

Qwen3.6-35B-A3B

Provider: Alibaba / Qwen Team
Model type: Text generation / open-source agentic coding
Architecture: Sparse Mixture-of-Experts (256 experts, 8 routed + 1 shared)
Scale: 35B total parameters, 3B active parameters
Context on NetMind: 262K tokens (natively; extensible to ~1M)
Reasoning mode: Supported
Function calling: Supported
Open weights: Available on Hugging Face
License: Apache 2.0
RPM: 600
NetMind pricing: $0.247 input / $1.479 output per million tokens
Best for: Self-hosted agents, fine-tuning, on-prem deployments, commercial open-source apps

Why Use Qwen3.6 on NetMind?

One API, Unified Access

NetMind gives you managed access to all three through the same OpenAI-compatible API surface you already use:

Swap the model name to pick the right tier. Everything else (your SDK, your prompts, your tool definitions, your agent loop) stays the same.

OpenClaw & Hermes Integration Snippets Provided

We know you want to power your most productive tools with Qwen as easily as possible, so we have provided ready-to-use code for you to copy and paste.

Built for the Modern Agent Stack

The Qwen3.6 lineup fits especially well into workflows where the model needs to do more than produce a polished answer:

Coding agents that inspect files, edit code, run tests, and iterate (Plus for hard tasks, Flash for fast iteration).
OpenClaw-style autonomous workflows where the agent must plan and act over many steps.
Repository-scale refactoring where 1M context and persistence matter.
Real-time developer tools (autocomplete, inline suggestions, chat-driven refactors) that demand sub-second latency.
Fine-tuned vertical agents built on 35B-A3B for domain-specific or on-prem deployments.
Front-end and artifact generation where the model must turn vague requirements into usable interfaces.

With the full Qwen3.6 family now on NetMind, developers get a complete tier stack for agent work: frontier reasoning, cost-efficient scale, and open-source ownership when you need it.

Start Building with Qwen3.6 Today

The full Qwen3.6 family is now available in the NetMind Model Library:

Qwen3.6-Plus: $0.274 / $1.644 per million input/output tokens, 1M context.
Qwen3.6-Flash: $0.164 / $0.986 per million input/output tokens, 1M context.
Qwen3.6-35B-A3B: $0.247 / $1.479 per million input/output tokens, 262K context, Apache 2.0 open weights.

If you build something with these, we want to see it. Join the discussion in our Reddit community.

User-Agent: