Regime
Loading regime
Crossover
Monthly cost by tokens per day
Self-host
Hosted API
Assumptions defaults
Close call
Costs are close enough to inspect
Costs are close enough that assumptions could flip the recommendation. RunWhere can hand this composition to a live analysis path.
Uses this model, cloud, GPU, quantization, and usage shape.
Provenance
Sources behind this artifact
cited: Hugging Face Hub model metadata cited: OpenRouter model catalog snapshot measured: OpenRouter endpoint throughput snapshot cited: Cloud GPU pricing snapshot calculated: memory bandwidth divided by active parameter bytes; capacity sized from output-token peak measured: OpenRouter p50 throughput times provider count compared with peak output tokens/sec calculated: required GPU hours plus 0.1 FTE labor; utilization affects capacity sizing calculated: 1.5x regime threshold
Artifact deepseek-v3-2--gcp--v1.1.0-d33b3cda - ruleset 1.1.0