Basic example
Select only fast models:
How it works
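The sketch below models ordered strategy evaluation in plain Python rather than in the product's own configuration syntax. It assumes that each strategy acts as a filter, that strategies are tried in order, and that the first strategy matching at least one model wins. The helper names (`lookup`, `select_models`, `is_fast`, `accept_any`), the sample models, and the 500 ms threshold are all hypothetical.

```python
from typing import Callable

Model = dict  # nested dict mirroring the documented m.* variables
Strategy = Callable[[Model], bool]


def lookup(model: Model, path: str):
    """Resolve a dotted path such as 'metrics.global.request_count'."""
    value = model
    for key in path.split("."):
        value = value[key]
    return value


def select_models(models: list, strategies: list) -> list:
    """Return the matches of the first strategy that matches anything."""
    for strategy in strategies:
        matches = [m for m in models if strategy(m)]
        if matches:
            return matches
    return []


def is_fast(m: Model) -> bool:
    # Hypothetical threshold: average upstream latency under 500 ms.
    return lookup(m, "metrics.global.latency.upstream_ms_avg") < 500


def accept_any(m: Model) -> bool:
    # Final fallback strategy: every model is acceptable.
    return True


models = [
    {"id": "model-a", "metrics": {"global": {"latency": {"upstream_ms_avg": 1200}}}},
    {"id": "model-b", "metrics": {"global": {"latency": {"upstream_ms_avg": 300}}}},
]

selected = select_models(models, [is_fast, accept_any])
print([m["id"] for m in selected])  # ['model-b']
```

If no model passes `is_fast`, the same call falls through to `accept_any` and returns every model.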
Strategies execute in order:
Performance-based filtering
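As a rough illustration of performance-based filters, the following Python predicates read the documented latency and error-rate variables from nested dicts. The thresholds, function names, and sample models are assumptions chosen for the example, not recommended values.

```python
def make_model(model_id, avg_ms, error_rate):
    """Build a sample record shaped like the documented m.* variables."""
    return {
        "id": model_id,
        "metrics": {
            "global": {
                "latency": {"upstream_ms_avg": avg_ms},
                "error_rate": {"total": error_rate},
            }
        },
    }


def low_latency(m, max_avg_ms=500):
    return m["metrics"]["global"]["latency"]["upstream_ms_avg"] < max_avg_ms


def high_reliability(m, max_error_rate=0.01):
    return m["metrics"]["global"]["error_rate"]["total"] < max_error_rate


def balanced(m):
    # Looser limits on both axes, but both must hold at once.
    return low_latency(m, 1000) and high_reliability(m, 0.05)


models = [
    make_model("fast-flaky", 200, 0.08),
    make_model("slow-steady", 1500, 0.001),
    make_model("middle", 800, 0.02),
]

for name, predicate in [("low latency", low_latency),
                        ("high reliability", high_reliability),
                        ("balanced", balanced)]:
    print(name, [m["id"] for m in models if predicate(m)])
# low latency ['fast-flaky']
# high reliability ['slow-steady']
# balanced ['middle']
```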
Low latency
Prefer the fastest models:
High reliability
Prefer models with low error rates:
Balanced
Balance speed and reliability:
Cost optimization
Prefer cheaper models
Select mini/turbo models when available:
Cost tiers
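The variable list in this guide has no price field, so one plausible way to express cost tiers is by model name. The Python sketch below assumes that cheaper models carry `mini` or `turbo` in their `id` and that `mini` ranks below `turbo` in cost. Both are assumptions, as are the tier names and the sample ids.

```python
# Hypothetical tiers, cheapest first; the last tier accepts any model.
COST_TIERS = [
    ("mini", lambda m: "mini" in m["id"]),
    ("turbo", lambda m: "turbo" in m["id"]),
    ("any", lambda m: True),
]


def pick_tier(models):
    """Return (tier name, matching model ids) for the first non-empty tier."""
    for name, matches_tier in COST_TIERS:
        ids = [m["id"] for m in models if matches_tier(m)]
        if ids:
            return name, ids
    return None, []


models = [{"id": "example-large"}, {"id": "example-turbo"}, {"id": "example-mini"}]

print(pick_tier(models))      # ('mini', ['example-mini'])
print(pick_tier(models[:2]))  # ('turbo', ['example-turbo'])
print(pick_tier(models[:1]))  # ('any', ['example-large'])
```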
Define cost tiers with fallback:
Provider preference
Prefer specific provider
Try a specific provider first:
Avoid specific provider
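Both provider patterns reduce to a filter on `provider_id` with a fallback to the full list. The Python sketch below shows one way to model that. The function names and the provider ids `provider-a` and `provider-b` are placeholders, not real identifiers.

```python
def prefer_provider(models, provider_id):
    """Use the preferred provider's models if any exist, else all models."""
    preferred = [m for m in models if m["provider_id"] == provider_id]
    return preferred or models


def avoid_provider(models, provider_id):
    """Drop a provider's models unless nothing else is available."""
    others = [m for m in models if m["provider_id"] != provider_id]
    return others or models


models = [
    {"id": "model-1", "provider_id": "provider-a"},
    {"id": "model-2", "provider_id": "provider-b"},
]

print([m["id"] for m in prefer_provider(models, "provider-a")])     # ['model-1']
print([m["id"] for m in avoid_provider(models, "provider-a")])      # ['model-2']
print([m["id"] for m in avoid_provider(models[:1], "provider-a")])  # ['model-1']
```

The last call shows the "unless necessary" case: when the avoided provider is the only one available, its models are still returned.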
Exclude a provider unless necessary:
Multi-criteria filtering
Complex performance criteria
Tiered selection
Prioritize different criteria:
Metadata-based filtering
Use custom metadata to categorize and filter models:
Compliance filtering
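Because `m.metadata` is a free-form object, the keys used for compliance are up to you. The Python sketch below assumes a hypothetical `compliance` key holding a list of certification labels. The key name, the labels, and the helper `meets_compliance` are illustrative only.

```python
def meets_compliance(model, required):
    """True when the model's metadata lists every required certification."""
    certified = set(model.get("metadata", {}).get("compliance", []))
    return set(required) <= certified


models = [
    {"id": "model-a", "metadata": {"compliance": ["soc2", "hipaa"]}},
    {"id": "model-b", "metadata": {"compliance": ["soc2"]}},
    {"id": "model-c", "metadata": {}},  # no compliance metadata at all
]

compliant = [m["id"] for m in models if meets_compliance(m, ["soc2", "hipaa"])]
print(compliant)  # ['model-a']
```

Treating missing metadata as "not compliant" is the conservative choice here: a model is selected only when its metadata states the requirement explicitly.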
Filter by compliance requirements:
Feature-based filtering
Filter by supported capabilities:
Vision models
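Capability filters are membership checks on the list-valued variables `input_modalities` and `supported_features`. The Python sketch below assumes that image support appears as the string `"image"` in `input_modalities`. That value, the feature names `tools` and `streaming`, and the helper names are assumptions.

```python
def accepts_images(model):
    """True when the model lists 'image' among its input modalities."""
    return "image" in model.get("input_modalities", [])


def supports(model, feature):
    """True when the model lists the given capability."""
    return feature in model.get("supported_features", [])


models = [
    {"id": "text-only", "input_modalities": ["text"],
     "supported_features": ["tools"]},
    {"id": "vision", "input_modalities": ["text", "image"],
     "supported_features": ["tools", "streaming"]},
]

print([m["id"] for m in models if accepts_images(m)])         # ['vision']
print([m["id"] for m in models if supports(m, "tools")])      # ['text-only', 'vision']
print([m["id"] for m in models if supports(m, "streaming")])  # ['vision']
```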
Select models that can process images:
Real-world examples
High-volume production
Optimize for cost and speed:
High-reliability application
Optimize for reliability over cost:
Development environment
Prefer self-hosted for development:
Monitoring strategy effectiveness
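One lightweight way to check whether a strategy behaves as intended is to count how often each model is chosen. The Python sketch below keeps the counts in memory and assumes your application can call a hook whenever a model is selected. `record_selection` and `selection_share` are hypothetical names and are not part of any documented API.

```python
from collections import Counter

selection_counts = Counter()


def record_selection(model_id):
    """Call this wherever your application learns which model was chosen."""
    selection_counts[model_id] += 1


def selection_share():
    """Return each model's fraction of all recorded selections."""
    total = sum(selection_counts.values())
    if total == 0:
        return {}
    return {model_id: n / total for model_id, n in selection_counts.items()}


for model_id in ["model-a", "model-a", "model-b", "model-a"]:
    record_selection(model_id)

print(selection_share())  # {'model-a': 0.75, 'model-b': 0.25}
```

A share that differs sharply from what the strategy order suggests, such as the fallback model dominating, is a hint that an earlier filter is too strict.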
Track which models are selected:
Available variables
| Variable | Type | Description |
|---|---|---|
| `m.id` | string | Model identifier |
| `m.provider_id` | string | Provider identifier |
| `m.author_id` | string | Model author identifier |
| `m.display_name` | string | Human-readable model name |
| `m.custom` | boolean | Whether this is a custom model |
| `m.metadata` | object | Custom metadata |
| `m.input_modalities` | list | Supported input types |
| `m.output_modalities` | list | Supported output types |
| `m.supported_features` | list | Supported capabilities |
| `m.metrics.global.request_count` | number | Total requests |
| `m.metrics.global.latency.upstream_ms_avg` | number | Average latency (ms) |
| `m.metrics.global.latency.upstream_ms_p95` | number | P95 latency (ms) |
| `m.metrics.global.error_rate.total` | number | Overall error rate (0-1) |
| `m.metrics.global.error_rate.timeout` | number | Timeout error rate (0-1) |
| `m.metrics.global.error_rate.rate_limit` | number | Rate limit error rate (0-1) |
Best practices
- Start simple - Begin with basic filters and add complexity only as needed
- Include fallbacks - Always end with a final strategy that accepts any model
- Monitor metrics - Use metrics to validate that your strategy is working
- Test strategies - Test with production traffic patterns
- Use request_count - Filter by `m.metrics.global.request_count > 0` to ensure metrics exist
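The `request_count` guard matters because a model with no traffic has no meaningful metrics. The Python sketch below assumes that error rates read as zero when no requests have been recorded, which would otherwise make an untested model look perfectly reliable. That assumption, the 0.01 threshold, and all function names are hypothetical.

```python
def has_metrics(model):
    """True once at least one request has been recorded for the model."""
    return model["metrics"]["global"]["request_count"] > 0


def is_reliable(model, max_error_rate=0.01):
    # Check the guard first so an empty error rate is never trusted.
    stats = model["metrics"]["global"]
    return has_metrics(model) and stats["error_rate"]["total"] < max_error_rate


def accept_any(model):
    # Final fallback so that selection never comes back empty.
    return True


def first_match(models, strategies):
    """Return ids matched by the first strategy that matches anything."""
    for strategy in strategies:
        ids = [m["id"] for m in models if strategy(m)]
        if ids:
            return ids
    return []


new_model = {"id": "new-model", "metrics": {"global": {
    "request_count": 0, "error_rate": {"total": 0.0}}}}
proven_model = {"id": "proven-model", "metrics": {"global": {
    "request_count": 5000, "error_rate": {"total": 0.002}}}}

print(first_match([new_model, proven_model], [is_reliable, accept_any]))  # ['proven-model']
print(first_match([new_model], [is_reliable, accept_any]))                # ['new-model']
```

The second call shows the fallback at work: the untested model is still usable, but only after the metric-based strategy has found nothing.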
See also
- Model Selection Strategies - Complete selection reference
- Metrics Reference - Understanding metrics
- Configuration Schema - All configuration options