Who uses Google TPUs for inference in production?
I am really puzzled by TPUs. I've been reading everywhere that TPUs are powerful and a great alternative to NVIDIA GPUs.
I have been playing with TPUs for a couple of months now, and to be honest I don't understand how people can use them in production for inference:
- almost no resources online showing how to run modern generative models like Mistral, Yi 34B, etc. on TPUs
- poor compatibility between JAX and PyTorch
- very hard to understand the memory consumption of the TPU chips (no nvidia-smi equivalent)
- rotating IP addresses on TPU VMs
- almost impossible to get my hands on a TPU v5
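(On the memory-visibility point: there is no nvidia-smi, but JAX does expose per-device memory counters. A minimal sketch, assuming a JAX install whose backend reports stats; on plain CPU `memory_stats()` typically returns nothing, and the exact dict keys such as `bytes_in_use` are backend-dependent:)

```python
import jax

# Query each local accelerator (TPU core, GPU, or CPU) for memory stats.
# On TPU/GPU backends this returns a dict like
# {"bytes_in_use": ..., "peak_bytes_in_use": ..., ...}; on backends
# without support it returns None or raises.
for dev in jax.local_devices():
    try:
        stats = dev.memory_stats()
    except Exception:
        stats = None
    if stats:
        print(f"{dev}: {stats.get('bytes_in_use')} bytes in use, "
              f"peak {stats.get('peak_bytes_in_use')}")
    else:
        print(f"{dev}: memory_stats() not available on this backend")
```

It's crude compared to nvidia-smi (no per-process breakdown, and you have to run it inside your own JAX process), but it's the closest built-in equivalent I know of.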
Is it only me? Or did I miss something?
I totally understand that TPUs can be useful for training though.
Comments URL: https://news.ycombinator.com/item?id=39670121
Points: 16
# Comments: 2
from Hacker News: Front Page https://ift.tt/UB0tSEi