I was drawing a boxplot and labelling it with the quartiles and the min and max values. This worked fine for a few columns; for some columns, however, the summary statistics did not exactly match the boxplot statistics.
For example, the summary command gave a median value of 2320, whereas boxplot.stats gave the value 2319.5.
I was using the Statlog (German Credit Data) Data Set for credit risk scoring.
Dataset link: https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)
Different functions can format values differently. By default, the printed value depends on options("digits"), which is 7 significant digits (not decimal places) unless you change it, so what is printed is often not the exact stored value. A function can also use a different setting of its own when displaying numbers: with the default options, the print method for summary shows only about 4 significant digits, which is why 2319.5 appears as 2320. A reliable way to see the value as it is stored internally is to use dput().
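As a minimal sketch of that difference, the snippet below uses a small made-up vector (an assumption for illustration, not the German Credit data) whose median is 2319.5. It compares what is stored with what is displayed; the variable names are hypothetical.

```r
# Hypothetical data: the two middle values are 2319 and 2320,
# so the median is 2319.5.
x <- c(1000, 2319, 2320, 5000)

med_direct  <- median(x)                  # computed directly
med_boxplot <- boxplot.stats(x)$stats[3]  # third element of $stats is the median

# summary() displays its result with fewer significant digits by default,
# so the median may look rounded in the printed table.
print(summary(x))

# dput() shows the stored values rather than the formatted display.
dput(med_direct)
dput(med_boxplot)

# Rounding the stored value to 4 significant digits mimics the display.
rounded_for_display <- signif(med_direct, 4)
print(rounded_for_display)
```

If the two dput() lines show the same number, the discrepancy is only in how the value is printed, not in how it is computed.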
Both functions compute the same value for the median, but boxplot.stats prints out more decimal places. For quantiles other than the median there is another factor: there are different ways of computing them. The quantile function offers 9 different methods (see ?quantile), and boxplot.stats takes the edges of the box from fivenum (Tukey's hinges), which can differ from the default quartiles reported by summary.
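To illustrate how much the method can matter, here is a small sketch that computes the first quartile of a made-up vector (again an assumption, not the actual dataset) with each of the 9 types and compares it with the lower hinge used by the boxplot. The variable names are hypothetical.

```r
# Hypothetical data, chosen so that the methods visibly disagree.
x <- c(1000, 2319, 2320, 5000)

# First quartile under each of the 9 quantile() types.
q25_by_type <- sapply(1:9, function(k) {
  quantile(x, probs = 0.25, type = k, names = FALSE)
})
names(q25_by_type) <- paste0("type", 1:9)
print(q25_by_type)

# The boxplot does not call quantile(): its box edges are the hinges
# returned by fivenum().
lower_hinge <- fivenum(x)[2]
box_lower   <- boxplot.stats(x)$stats[2]
print(c(hinge = lower_hinge, boxplot = box_lower))
```

On small samples like this one the default (type 7) quartile and the hinge can differ noticeably; on larger samples the gap is usually much smaller.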