LLM Inference vLLM 7B

Run a 7B instruction model with vLLM for efficient token throughput

Use this template to launch a standard OpenAI-compatible endpoint with vLLM.