Insufficient cluster resources to launch trial - has only 0 GPUs


I am following this tutorial (which is basically this) in order to use Ray Tune for hyperparameter optimization. My model trains fine on the GPU without the optimization, but now I want to optimize.

I applied the tutorial to my code, but when I try to kick off the run:

result = tune.run(
    train,
    resources_per_trial={"gpu": 1},
    config=config,
    num_samples=10,
    scheduler=scheduler,
    progress_reporter=reporter,
    checkpoint_at_end=False,
)

I'm stuck with:

TuneError: Insufficient cluster resources to launch trial: trial requested 1 CPUs, 1 GPUs, but the cluster has only 6 CPUs, 0 GPUs, 12.74 GiB heap, 4.39 GiB objects (1.0 node:XXX).
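For what it's worth, the feasibility check behind this error is simple: every resource a trial requests must fit within the cluster's reported totals, and here the reported GPU total is 0. A toy sketch of that check (illustrative only, not Ray's actual code):

```python
def can_launch(trial_request, cluster_resources):
    """Toy feasibility check: a trial can launch only if every
    requested resource fits within the cluster's reported totals."""
    return all(
        cluster_resources.get(resource, 0) >= amount
        for resource, amount in trial_request.items()
    )

# The situation from the error message: 1 CPU + 1 GPU requested,
# but the cluster reports 6 CPUs and 0 GPUs.
print(can_launch({"CPU": 1, "GPU": 1}, {"CPU": 6, "GPU": 0}))  # False
```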

But then again, when I take a look at the Ray dashboard, both GPUs are clearly listed there.

Why isn't ray tune seeing my GPUs? How do I make this work?
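One thing worth ruling out: Ray derives its GPU count from the environment, notably CUDA_VISIBLE_DEVICES, so a variable that is set but empty makes Ray report 0 GPUs even while nvidia-smi and the dashboard show the cards. A rough sketch of that masking logic (my approximation, not Ray's implementation):

```python
import os

def visible_gpu_count(env=None):
    """Approximate how many GPUs a CUDA process (and hence Ray) can see
    from CUDA_VISIBLE_DEVICES. Returns None when the variable is unset,
    in which case autodetection of all physical GPUs applies."""
    env = os.environ if env is None else env
    visible = env.get("CUDA_VISIBLE_DEVICES")
    if visible is None:
        return None  # unset: all GPUs visible via autodetection
    ids = [d for d in visible.split(",") if d.strip()]
    return len(ids)  # set-but-empty masks every GPU (count 0)

print(visible_gpu_count({"CUDA_VISIBLE_DEVICES": ""}))     # 0
print(visible_gpu_count({"CUDA_VISIBLE_DEVICES": "0,1"}))  # 2
```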

Specs:

GPU 0: TITAN Xp
GPU 1: GeForce GTX 1080 Ti
CUDA 10.1
Python 3.7
PyTorch 1.7
Debian 9.12
ray tune 1.0.1.post1

Edit:

ray.init(num_gpus=1)
ray.get_gpu_ids()

[]
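A note on this check: as far as I understand it, ray.get_gpu_ids() reports the GPUs assigned to the calling worker process, and the driver process is assigned none, so an empty list in the driver does not by itself prove the cluster has no GPUs. A toy illustration of that per-worker bookkeeping (hypothetical names, not Ray internals):

```python
def assign_gpus(total_gpus, worker_requests):
    """Toy per-worker GPU bookkeeping: ids are handed out only to
    workers that request them; the driver process holds none, so a
    get_gpu_ids()-style call there returns an empty list."""
    free = list(range(total_gpus))
    assignments = {"driver": []}  # the driver never holds GPU ids
    for worker, n in worker_requests.items():
        assignments[worker] = [free.pop(0) for _ in range(n)]
    return assignments

print(assign_gpus(2, {"trial_0": 1, "trial_1": 1}))
# {'driver': [], 'trial_0': [0], 'trial_1': [1]}
```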


There is 1 answer

Answered by Asimandia:

I would suggest checking the placement_strategy and max_t parameters. The first can cause freezes depending on your system specification, and the second can exceed the total time allowed for the computation.
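To illustrate the max_t point: with an ASHA/HyperBand-style scheduler, max_t caps how long any trial may run, and the promotion rungs are derived from it, so a mismatched max_t can stop trials almost immediately. A toy computation of such rung milestones (my sketch of the general scheme, not Ray's exact code; parameter names mirror Tune's ASHAScheduler):

```python
def asha_milestones(max_t, grace_period=1, reduction_factor=4):
    """Sketch of ASHA-style rung milestones: trials are considered for
    promotion at grace_period * reduction_factor**k and hard-stopped
    at max_t. A too-small max_t ends every trial almost immediately."""
    milestones, t = [], grace_period
    while t < max_t:
        milestones.append(t)
        t *= reduction_factor
    milestones.append(max_t)  # no trial runs past max_t
    return milestones

print(asha_milestones(100))  # [1, 4, 16, 64, 100]
print(asha_milestones(2))    # [1, 2] - trials barely get to run
```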