I am trying my hand at edge AI, specifically running YOLO models on a Jetson Orin Nano (8GB). I am doing some preliminary research on how many video streams I can process simultaneously (with object detection) on the Orin Nano board. The claimed performance is 40 TOPS (for INT8 precision).
My first idea was to look at the computational cost of running the YOLOv8 models, and from the official documentation I got this:
- YOLOv8n = 10.5 GFLOPs | 3.5 params (M)
- YOLOv8s = 29.7 GFLOPs | 11.4 params (M)
- YOLOv8m = 80.6 GFLOPs | 26.2 params (M)
- YOLOv8l = 167.4 GFLOPs | 44.1 params (M)
- YOLOv8x = 260.6 GFLOPs | 68.7 params (M)
If I understand correctly, these figures assume FP32 precision, so I cannot compare them directly with the Orin Nano's performance, which is expressed in INT8 operations.
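For reference, this is the naive calculation I had in mind. It is only a sketch of an ideal upper bound: the 30 fps per stream and the 100% utilisation are my own assumptions, and it treats the FP32 GFLOPs figures as if they were directly comparable with INT8 operations, which is exactly the point I am unsure about.

```python
# Back-of-envelope upper bound:
#   streams = peak ops per second / (ops per frame * frames per second)
# Assumptions (mine, not from any documentation): 30 fps per stream,
# 100% utilisation, and FP32 GFLOPs treated 1:1 as INT8 ops.

PEAK_TOPS = 40.0        # claimed INT8 peak of the Orin Nano 8GB
FPS_PER_STREAM = 30.0   # hypothetical camera frame rate

# GFLOPs per frame, as listed above
GFLOPS_PER_FRAME = {
    "YOLOv8n": 10.5,
    "YOLOv8s": 29.7,
    "YOLOv8m": 80.6,
    "YOLOv8l": 167.4,
    "YOLOv8x": 260.6,
}


def max_streams(gflops_per_frame, peak_tops=PEAK_TOPS,
                fps=FPS_PER_STREAM, utilisation=1.0):
    """Ideal number of streams the compute budget could sustain."""
    ops_per_second_per_stream = gflops_per_frame * 1e9 * fps
    return peak_tops * 1e12 * utilisation / ops_per_second_per_stream


for name, gflops in GFLOPS_PER_FRAME.items():
    print(f"{name}: at most {max_streams(gflops):.1f} streams")
```

Real throughput will certainly be lower than this, since a peak TOPS figure is never fully reached in practice; passing a `utilisation` below 1.0 gives a more cautious estimate.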
I then looked to see if there were any versions of YOLO with INT8 weights and found YOLO-NAS, which seems to have comparable performance to the standard YOLO models, but with fewer resource demands.
The problem is that the official documentation does not give data on the TOPS (or GOPS) required for the models, only the number of parameters:
- YOLO-NAS S = 19.0 params (M)
- YOLO-NAS M = 51.1 params (M)
- YOLO-NAS L = 66.9 params (M)
I then tried the following code to obtain the number of operations needed, but it gives a value that does not seem plausible to me: "[...] Total mult-adds (T): 1.04"
    from torchinfo import summary

    # yolo_nas_l is the YOLO-NAS L model, loaded beforehand
    summary(
        model=yolo_nas_l,
        input_size=(16, 3, 640, 640),  # (batch, channels, height, width)
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"],
    )
Indeed, going by a rough estimate based on the number of parameters, YOLO-NAS L should be similar to YOLOv8x, yet YOLOv8x needs only 260.6 GFLOPs, not 1.04 T. Is there an error in how I am computing this? Also, since the model is INT8, we should no longer talk about FLOPs (floating-point operations) but OPs... is this correct?
Would anyone be able to give me some tips on where (or how) to find the complexity of the YOLO-NAS models, expressed in OPs, in a form I can use to estimate what fits within the resources available on the Orin Nano (which, as a reminder, are expressed in TOPS)?
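One thing I noticed while rechecking: the input_size I passed to summary has a batch of 16, and as far as I understand torchinfo reports mult-adds for the whole input, so the total may need dividing by the batch size. Below is a minimal sketch of that normalisation. It assumes that this is how torchinfo counts and that one mult-add equals two operations; I have not verified either against the YOLO-NAS documentation.

```python
def per_image_cost(total_mult_adds, batch_size, ops_per_mult_add=2):
    """Split a whole-batch mult-add total into per-image figures.

    ops_per_mult_add=2 is an assumed convention (one multiply plus one add).
    """
    mult_adds = total_mult_adds / batch_size
    return mult_adds, mult_adds * ops_per_mult_add


# 1.04 T mult-adds reported for input_size=(16, 3, 640, 640)
macs, ops = per_image_cost(1.04e12, 16)
print(f"{macs / 1e9:.1f} G mult-adds per image")
print(f"{ops / 1e9:.1f} G ops per image (if 1 mult-add = 2 ops)")
```

If those assumptions hold, the per-image figure is of the same order of magnitude as the YOLOv8l/x numbers rather than four times larger.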
Thanks a lot!