Devstral Small 2
Mistral's latest coding-focused model. Agentic coding capabilities.
All tracked Mistral models — pricing, context windows, lifecycle state, and release history in one place.
Mistral's flagship reasoning model with chain-of-thought capabilities. 128K context.
Smaller reasoning model. Open-weight. 128K context.
Medium-size coding model. Balanced capability and cost.
Open-weight small model. 24B parameters, 128K context, vision support.
Mistral's coding specialist. 256K context, strong code generation.
Mistral's largest non-reasoning model. 123B parameters, 128K context.
Vision-capable variant of Mistral Large. 128K context, image understanding.
Open-weight 12B multimodal model. 128K context, image understanding.
First Mistral Small 3 model. 24B params, text-only. Superseded by 3.1 with vision.
Pioneering open-weight MoE model. 8 experts, 7B each. 32K context.
Mistral's first open-weight model. 7B params, 32K context. Still used for local inference.
Predecessor → successor chains tracked for Mistral models.