
Parameters

The learnable numerical weights inside a neural network — often cited as billions ("7B", "70B") as a rough proxy for model size and capability.

Parameters are the numerical values inside a neural network that are adjusted during training. Every connection between neurons has a weight (a parameter); every layer has biases (more parameters). When you see "Llama 3 70B" or "GPT-4 is estimated at ~1.8 trillion parameters," that count refers to these weights.
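To make this concrete, here is a minimal PyTorch sketch that builds a tiny two-layer network and counts its parameters. The layer sizes (784, 128, 10) are arbitrary choices for illustration, not taken from any real model.

```python
import torch.nn as nn

# A tiny two-layer network: 784 -> 128 -> 10 (sizes chosen arbitrarily).
model = nn.Sequential(
    nn.Linear(784, 128),  # weights: 784 * 128, biases: 128
    nn.ReLU(),            # activations have no parameters
    nn.Linear(128, 10),   # weights: 128 * 10, biases: 10
)

# Every weight matrix and bias vector is a parameter tensor;
# summing their element counts gives the "parameter count".
total = sum(p.numel() for p in model.parameters())
print(f"{total:,} parameters")  # 101,770
```

A "7B" model is the same idea at scale: billions of such weight and bias entries spread across many layers.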

Why parameter count matters (and its limits):

- More parameters generally means more capacity to store patterns from training data
- Larger models tend to be more capable at reasoning, instruction-following, and knowledge recall
- But parameter count alone does not determine quality: training data quality, training methodology, and architecture choices matter as much
- A well-trained smaller model (e.g. Mistral 7B) can outperform a poorly-trained larger one

Common parameter scales:

- ~7B: fast, runs on consumer hardware, limited reasoning
- ~13B–30B: mid-range, runs on good consumer hardware with quantisation
- ~70B: powerful, requires significant GPU memory (see the memory sketch below)
- 100B+: frontier models, run on data centre infrastructure
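The hardware requirements follow from simple arithmetic: raw weight storage is roughly parameter count times bytes per parameter. The sketch below is a back-of-the-envelope estimate that ignores activations, KV cache, and framework overhead; the precisions and model sizes are assumptions for illustration.

```python
# Rough weight-memory estimate: params * bytes per parameter.
# Ignores activations, KV cache, and runtime overhead, so real
# usage is higher; figures are illustrative only.
BYTES_PER_PARAM = {"fp16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    return num_params * BYTES_PER_PARAM[precision] / 1e9

for num_params, name in [(7e9, "7B"), (70e9, "70B")]:
    for precision in ("fp16", "int8", "int4"):
        gb = weight_memory_gb(num_params, precision)
        print(f"{name} @ {precision}: ~{gb:.1f} GB")
```

This is why a 7B model at fp16 (~14 GB) fits on a single consumer GPU, while a 70B model needs either data-centre hardware or aggressive quantisation (~35 GB even at int4).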

Most hosted model providers (Anthropic, OpenAI, Google) do not disclose exact parameter counts for their frontier models.

Example

Claude Opus 4.7 is, judging by its capability, likely a very large model (plausibly 100B+ parameters), though Anthropic does not publish the exact count. That parameter scale contributes to its ability to handle complex multi-step reasoning.