I've a python program that goes over tables in a DB (not mine) and for each column from type number it performs some mathematical operations such as stdev. However, there are some columns with very numbers and when I'm trying to execute:
select STDDEV(big_col) from table1;
I'm getting the error:
Number out of representable range: type FIXED[SB16](38,0){not null}, value 3.67864e+38
Any idea how can I handle this one? It's ok for me just to ignore this values in this case but I don't want my query to fail.
Thanks, Nir.
As @dnoeth mentioned in the comment section, casting the standard deviation as
DOUBLE
should fix the issue:STDDEV(CAST(big_col as DOUBLE))
.The OP asked how the resulting standard deviation seems to be significantly smaller than e+38 (which is the max number of digits that the
NUMBER
format can hold), then why do we need to cast this number asDOUBLE
?The reason for this lies in the standard deviation formula:
The first step in this process is to subtract the mean of the column from each individual value in that column. All those values are then squared. Now this square determines the 88 upper bound for the values that the
NUMBER
format needs to be able to handle before further arithmetic operations, like dividing by the number of records (N
), and taking a square root reduce it down to the final answer. The final value of standard deviation that you get from this process could be significantly smaller than the sum of squares that's required to be calculated first.