On-Demand Nodes

Rent GPU compute by the second. Pay only for verifiable cycles. Filter the mesh by GPU model, memory, region, and latency. OpenAI-compatible endpoints.

Two ways to rent

Spot — pay per cycle

Best for short jobs, experiments, and batch inference. Bid against the current market price. A higher bidder can reclaim the node unless you reserve it.

Reserved — pay per hour

Lock a specific node for 1h to 30d. Up to 40% cheaper than spot for long-running workloads. Cannot be reclaimed.
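To compare the two modes for a given job, a rough cost estimate can help. The sketch below assumes a spot rate quoted per hour and billed per second, a spot price that stays constant for the whole job (in practice it moves with the market), and a reservation billed per whole hour at the full 40% discount, which the text above gives only as an upper bound. The function names and the $2.00 rate are hypothetical.

```python
SECONDS_PER_HOUR = 3600
MAX_RESERVATION_HOURS = 30 * 24  # reservations run from 1h to 30d


def spot_cost(hourly_rate: float, seconds: int) -> float:
    """Cost of a spot job billed per second at a constant hourly rate (assumption)."""
    return hourly_rate * seconds / SECONDS_PER_HOUR


def reserved_cost(spot_hourly_rate: float, hours: int, discount: float = 0.40) -> float:
    """Cost of a reservation billed per whole hour at a discount to spot.

    The 0.40 default is the best case ("up to 40%"), not a guaranteed rate.
    """
    if not 1 <= hours <= MAX_RESERVATION_HOURS:
        raise ValueError("reservations run from 1 hour to 30 days")
    return spot_hourly_rate * (1 - discount) * hours


rate = 2.00  # hypothetical spot rate in $/hr
print(f"90 s spot job: ${spot_cost(rate, 90):.4f}")
print(f"72 h on spot:  ${spot_cost(rate, 72 * SECONDS_PER_HOUR):.2f}")
print(f"72 h reserved: ${reserved_cost(rate, 72):.2f}")
```

Under these assumptions, short jobs cost cents on spot, while a multi-day run is where the reserved discount starts to matter.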

Filter the mesh

The deployment dashboard exposes filters by GPU model, memory, region, and latency.
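The same four filters (GPU model, memory, region, latency) can be pictured as a simple predicate over a node list. This is an illustrative local sketch, not the dashboard's implementation or a VouchGPU API: the `Node` fields, the `filter_nodes` helper, and the sample nodes and regions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    gpu_model: str
    vram_gb: int
    region: str
    latency_ms: float


def filter_nodes(nodes, gpu_model=None, min_vram_gb=0, region=None, max_latency_ms=None):
    """Return nodes that match every supplied filter, lowest latency first."""
    matches = [
        n for n in nodes
        if (gpu_model is None or n.gpu_model == gpu_model)
        and n.vram_gb >= min_vram_gb
        and (region is None or n.region == region)
        and (max_latency_ms is None or n.latency_ms <= max_latency_ms)
    ]
    return sorted(matches, key=lambda n: n.latency_ms)


# Made-up sample data for illustration only.
nodes = [
    Node("n1", "NVIDIA H100", 80, "eu-west", 42.0),
    Node("n2", "NVIDIA A100", 80, "us-east", 18.5),
    Node("n3", "RTX 4090", 24, "us-east", 9.0),
    Node("n4", "NVIDIA A100", 80, "us-east", 65.0),
]

picked = filter_nodes(nodes, min_vram_gb=48, region="us-east", max_latency_ms=50)
print([n.node_id for n in picked])  # ['n2']
```

Filters left unset are ignored, so narrowing the search is a matter of adding one constraint at a time.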

Pricing snapshot

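| GPU          | VRAM           | AWS On-Demand | VouchGPU spot | You save      |
|--------------|----------------|---------------|---------------|---------------|
| NVIDIA H100  | 80 GB          | $12.29 / hr   | $2.84 / hr    | −77%          |
| NVIDIA A100  | 80 GB          | $4.10 / hr    | $0.96 / hr    | −77%          |
| NVIDIA L40S  | 48 GB          | $2.45 / hr    | $0.62 / hr    | −75%          |
| RTX 4090     | 24 GB          | -             | $0.31 / hr    | consumer tier |
| Apple M3 Max | 128 GB unified | -             | $0.48 / hr    | consumer tier |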
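The "You save" figures follow from the two hourly prices: savings = 1 − (VouchGPU spot / AWS On-Demand), rounded to a whole percent. A short check using the three rows of the pricing snapshot that list both prices (the `savings_pct` helper is just for illustration):

```python
# (AWS On-Demand $/hr, VouchGPU spot $/hr), copied from the pricing snapshot
PRICES = {
    "NVIDIA H100": (12.29, 2.84),
    "NVIDIA A100": (4.10, 0.96),
    "NVIDIA L40S": (2.45, 0.62),
}


def savings_pct(reference: float, price: float) -> int:
    """Percentage saved relative to the reference price, rounded to a whole percent."""
    return round((1 - price / reference) * 100)


for gpu, (aws, vouch) in PRICES.items():
    print(f"{gpu}: -{savings_pct(aws, vouch)}%")
```

The consumer-tier rows have no AWS price listed, so no savings figure is computed for them.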

API access

Every deployment exposes OpenAI-compatible chat-completion and embeddings endpoints out of the box. Existing code that uses the OpenAI SDK works once you point base_url at your deployment and supply your VouchGPU key.

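from openai import OpenAI

# Point the standard OpenAI client at your deployment instead of api.openai.com.
client = OpenAI(
    base_url="https://<your-deployment>.vouchgpu.xyz/v1",
    api_key="<your-vouchgpu-key>",
)

# Same chat-completions call you would make against OpenAI.
resp = client.chat.completions.create(
    model="llama-3.1-70b",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)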
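If you are not using the SDK, "OpenAI-compatible" means the endpoints accept the OpenAI wire format: a JSON POST with a Bearer token. The sketch below builds, without sending, a chat-completion request and an embeddings request using only the standard library. The `/chat/completions` and `/embeddings` paths and body fields follow the OpenAI API; the `build_request` helper and the embedding model placeholder are hypothetical, and which models a deployment serves is an assumption to check against your own deployment.

```python
import json
import urllib.request

BASE_URL = "https://<your-deployment>.vouchgpu.xyz/v1"  # placeholder, as in the SDK example
API_KEY = "<your-vouchgpu-key>"  # placeholder


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request in the OpenAI wire format."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


chat_req = build_request(
    "/chat/completions",
    {"model": "llama-3.1-70b", "messages": [{"role": "user", "content": "hello"}]},
)

# The embedding model name is a placeholder; use whatever your deployment serves.
embed_req = build_request(
    "/embeddings",
    {"model": "<your-embedding-model>", "input": ["hello"]},
)

print(chat_req.get_method(), chat_req.full_url)
print(embed_req.get_method(), embed_req.full_url)
# To send for real: urllib.request.urlopen(chat_req), then json.load() the response.
```

Nothing here touches the network until you call `urlopen`, so the request shape can be inspected safely with placeholder values.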