Volumes

Beyond Kubernetes: 2024 recap and what's ahead for AI infra

At dstack, we aim to simplify the development, training, and deployment of AI models by offering an alternative to the complex Kubernetes ecosystem. Our goal is to enable seamless AI infrastructure management across any cloud or hardware vendor.

As 2024 comes to a close, we reflect on the milestones we've achieved and look ahead to the next steps.

Introducing instance volumes to persist data on instances

Until now, dstack supported data persistence only through network volumes, which are managed by the cloud provider. These are convenient, but sometimes you just want a simple cache on the instance, or you need to mount an NFS share on your SSH fleet. To address this, we're introducing instance volumes, which cover both cases.

type: task
name: llama32-task

env:
  - HF_TOKEN
  - MODEL_ID=meta-llama/Llama-3.2-3B-Instruct
commands:
  - pip install vllm
  - vllm serve $MODEL_ID --max-model-len 4096
ports: [8000]

volumes:
  - /root/.dstack/cache:/root/.cache

resources:
  gpu: 16GB..
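
The configuration above covers the cache case: the path before the colon (`/root/.dstack/cache`) is a directory on the instance, and the path after it (`/root/.cache`) is where that directory appears inside the run. The NFS case can be sketched the same way. This is a minimal illustration, not a tested setup: it assumes the share is already mounted on every host of the SSH fleet, and the run name `nfs-task` and the paths `/mnt/nfs-storage` and `/storage` are hypothetical.

```yaml
type: task
# Hypothetical run name, for illustration only
name: nfs-task

commands:
  # List the contents of the share from inside the run
  - ls /storage

volumes:
  # <path on the instance>:<path inside the run>
  # Assumes the NFS share is already mounted at /mnt/nfs-storage on each fleet host
  - /mnt/nfs-storage:/storage
```

Because the instance path here is itself a network mount, data written to `/storage` would be visible from every host that mounts the same share, whereas the cache example only persists data on the single instance where the run was placed.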

Using volumes to optimize cold starts on RunPod

Deploying custom models in the cloud often runs into long cold starts: the time it takes to provision a new instance and download the model. This matters most for autoscaled services, where new model replicas need to be provisioned quickly.

Let's explore how dstack optimizes this process using volumes, with an example of deploying a model on RunPod.