TPU not found on Google VM (jax version 0.2.16)


I'm running a TPU v3-8 VM on Google Cloud. On the VM, I installed jax with pip install "jax[tpu]==0.2.16" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html.

Unfortunately, calling jax.device_count() prints the message "No GPU/TPU found, falling back to CPU". The same happens with pip install jax==0.2.12. It only works when I install the newest jax version with pip install "jax[tpu]>=0.2.16" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html. However, I need jax version 0.2.12 or 0.2.16 because I would like to train GPT-J on a TPU following the tutorial at https://github.com/kingoflolz/mesh-transformer-jax/blob/master/howto_finetune.md

How can I get it running with these versions?


There is 1 answer

Anisha Mazumder

Could you please try explicitly setting TPU_LIBRARY_PATH to the current location of libtpu.so? It is most likely /home/<your username>/.local/lib/python3.8/site-packages/libtpu/libtpu.so
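If you would rather not hard-code the path, here is a minimal sketch of how the variable could be set from Python. It assumes that the pip-installed libtpu package is importable under the name libtpu and ships libtpu.so in its package directory, as the path above suggests. The helper names find_libtpu and configure_tpu_library_path are made up for this example and are not part of jax.

```python
import importlib.util
import os
from pathlib import Path
from typing import Optional


def find_libtpu() -> Optional[str]:
    """Return the path to libtpu.so inside an installed `libtpu` package, or None.

    Assumption: the pip package exposes a top-level `libtpu` package that
    contains libtpu.so. The package is located without being imported.
    """
    try:
        spec = importlib.util.find_spec("libtpu")
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / "libtpu.so"
        if candidate.is_file():
            return str(candidate)
    return None


def configure_tpu_library_path() -> Optional[str]:
    """Set TPU_LIBRARY_PATH if it is unset and libtpu.so was found.

    An existing value is left untouched, so a manual export still wins.
    Returns the value that is in effect afterwards (None if nothing is set).
    """
    path = find_libtpu()
    if path is not None:
        os.environ.setdefault("TPU_LIBRARY_PATH", path)
    return os.environ.get("TPU_LIBRARY_PATH")
```

Call configure_tpu_library_path() before the first import jax in your script, then check jax.devices() to see whether the TPU cores show up. Setting the variable before the import is the cautious ordering; I have not checked exactly when these old jax versions read it. Exporting TPU_LIBRARY_PATH in the shell before starting Python should have the same effect.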

Here is the relevant GitHub issue: https://github.com/google/jax/issues/13321

As mentioned there: "The underlying problem is that this version of jax still expected libtpu.so to be automatically installed in the VM image (https://github.com/google/jax/blob/jax-v0.2.16/jax/_src/cloud_tpu_init.py#L104), which the TPU VM base image no longer does."