Llama 3.1 8B Instruct
Compact Llama 3.1 deployment template for general-purpose inference
This template targets low-latency general inference with balanced resource usage.