NVIDIA NIM
This example shows how to deploy DeepSeek-R1-Distill-Llama-8B using NVIDIA NIM and dstack.
Prerequisites
Once dstack is installed, clone the repo and run dstack init.
$ git clone https://github.com/dstackai/dstack
$ cd dstack
$ dstack init
Deployment
Here's an example of a service that deploys DeepSeek-R1-Distill-Llama-8B using NIM.
type: service
name: serve-distill-deepseek

image: nvcr.io/nim/deepseek-ai/deepseek-r1-distill-llama-8b
env:
  - NGC_API_KEY
  - NIM_MAX_MODEL_LEN=4096
registry_auth:
  username: $oauthtoken
  password: ${{ env.NGC_API_KEY }}
port: 8000
# Register the model
model: deepseek-ai/deepseek-r1-distill-llama-8b

# Uncomment to leverage spot instances
#spot_policy: auto

# Cache downloaded models
volumes:
  - instance_path: /root/.cache/nim
    path: /opt/nim/.cache
    optional: true

resources:
  gpu: A100:40GB
  # Uncomment if using multiple GPUs
  #shm_size: 16GB
Running a configuration
To run a configuration, use the dstack apply command.
$ export NGC_API_KEY=...
$ dstack apply -f examples/deployment/nim/.dstack.yml
 #  BACKEND  REGION  RESOURCES                    SPOT  PRICE
 1  vultr    ewr     6xCPU, 60GB, 1xA100 (40GB)   no    $1.199
 2  vultr    ewr     6xCPU, 60GB, 1xA100 (40GB)   no    $1.199
 3  vultr    nrt     6xCPU, 60GB, 1xA100 (40GB)   no    $1.199
Submit the run serve-distill-deepseek? [y/n]: y
Provisioning...
---> 100%
If no gateway is created, the model will be available via the OpenAI-compatible endpoint at <dstack server URL>/proxy/models/<project name>/.
$ curl http://127.0.0.1:3000/proxy/models/main/chat/completions \
    -X POST \
    -H 'Authorization: Bearer <dstack token>' \
    -H 'Content-Type: application/json' \
    -d '{
      "model": "deepseek-ai/deepseek-r1-distill-llama-8b",
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant."
        },
        {
          "role": "user",
          "content": "What is Deep Learning?"
        }
      ],
      "max_tokens": 128
    }'
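The same request can also be issued from Python. The following is a minimal sketch using only the standard library, not an official dstack client: the helper names build_chat_request and send_chat_request are hypothetical, and the response shape read in send_chat_request assumes the endpoint returns a standard OpenAI-style chat completion object.

```python
import json
import urllib.request


def build_chat_request(server_url, project, token, model, prompt, max_tokens=128):
    """Build (but do not send) a chat completion request for the dstack model proxy.

    The URL layout follows this guide: <dstack server URL>/proxy/models/<project name>/
    """
    url = f"{server_url.rstrip('/')}/proxy/models/{project}/chat/completions"
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the assistant's reply text.

    Assumption: the response follows the OpenAI chat completion format,
    where the reply is at choices[0].message.content.
    """
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

To use it, call build_chat_request with your dstack server URL (for example http://127.0.0.1:3000), the project name (main in the curl command above), your dstack token, and the model name registered in the configuration (deepseek-ai/deepseek-r1-distill-llama-8b), then pass the result to send_chat_request.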
When a gateway is configured, the OpenAI-compatible endpoint is available at https://gateway.<gateway domain>/.
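The two endpoint locations described in this section can be captured in one small helper. This is an illustrative sketch only: model_base_url is a hypothetical name, and the gateway domain passed to it is whatever domain your gateway was created with.

```python
def model_base_url(server_url=None, project=None, gateway_domain=None):
    """Return the base URL of the OpenAI-compatible endpoint.

    With a gateway:    https://gateway.<gateway domain>/
    Without a gateway: <dstack server URL>/proxy/models/<project name>/
    """
    if gateway_domain:
        return f"https://gateway.{gateway_domain}/"
    if not server_url or not project:
        raise ValueError(
            "server_url and project are required when no gateway is configured"
        )
    return f"{server_url.rstrip('/')}/proxy/models/{project}/"
```

For example, model_base_url("http://127.0.0.1:3000", "main") gives the proxy URL used in the curl command, while model_base_url(gateway_domain="example.com") gives the gateway form. Appending chat/completions to the gateway base URL is assumed here by analogy with the proxy case.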
Source code
The source code for this example can be found in examples/deployment/nim.
What's next?
- Check services
- Browse the DeepSeek AI NIM