Quantization of torch tensors to reduce storage size


I have a number (N) of vectors of size 192 x 1, currently stored as torch tensors. Each element in a tensor is a float. These N vectors are compared against a reference vector by their similarity.

Now, N can be very large, so these vectors start to consume significant memory after a while. Therefore, I'd like to try some quantization method to reduce the memory needed to store them, for instance by storing them with fewer bits (at the cost of lower precision).

Is there a convenient way to do this while preserving the comparability between a reference vector and the N stored vectors?

I've tried using:

q_vec = torch.quantize_per_tensor_dynamic(vec, torch.quint8, True)

But according to sys.getsizeof(q_vec), the size is still the same as before quantization. Obviously, I'm probably missing something or misunderstanding the concept here. Any nudge in the right direction would be highly appreciated.


1 Answer

Answered by Snykral fa Ashama

sys.getsizeof(q_vec) only reports the size of the Python tensor object itself, which is constant regardless of the data the tensor holds. To measure the actual storage used by the elements, use instead:

q_vec.element_size() * q_vec.nelement()

Then compare the sizes before and after quantization.
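A minimal sketch of that comparison, assuming PyTorch is installed. It quantizes a float32 vector to quint8 (one byte per element), measures both tensors with element_size() * nelement(), and shows that similarity to the original is preserved by dequantizing before the comparison:

```python
import torch

vec = torch.randn(192)  # float32 vector, 4 bytes per element

# Dynamic per-tensor quantization to 8-bit (reduce_range=False here)
q_vec = torch.quantize_per_tensor_dynamic(vec, torch.quint8, False)

def tensor_bytes(t):
    """Bytes used by the tensor's elements (not the Python wrapper)."""
    return t.element_size() * t.nelement()

print(tensor_bytes(vec))    # 192 * 4 = 768 bytes (float32)
print(tensor_bytes(q_vec))  # 192 * 1 = 192 bytes (quint8)

# Comparability is preserved by dequantizing before the similarity check
sim = torch.nn.functional.cosine_similarity(q_vec.dequantize(), vec, dim=0)
```

Note that the quantized tensor also carries a small constant overhead for its scale and zero-point, so the per-element count above is the dominant but not the only cost.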