I'm using Apache Phoenix to create a table in HBase because it provides secondary indexes as well as SQL-like datatypes. I created a table using Phoenix with both DOUBLE and VARCHAR columns.
CREATE TABLE INVOICE (ROWKEY VARCHAR NOT NULL PRIMARY KEY, CF1.INVOICEID VARCHAR, CF1.TOTALAMOUNT DOUBLE, CF1.STATUS VARCHAR, CF1.CREATEDATE DATE);
Phoenix stores the DOUBLE values in HBase as a byte array, like below:
column=CF1:TOTALAMOUNT, timestamp=1434102384451, value=\xC0m@\x00\x00\x00\x00\x01
I wrote a MapReduce program that reads the values directly through the HBase Scan API, without using Phoenix. It works fine for VARCHAR values, but the other datatypes, which are stored as byte arrays, come back with different values. Compare the Phoenix and MapReduce outputs below: all positive double values come back negative, and negative double values come back as values like 0.018310546875.
public void map(ImmutableBytesWritable key, Result value, Context context)
        throws IOException, InterruptedException {
    // CF and TOTALAMOUNT are the column family and qualifier as byte[].
    // This decodes the raw cell bytes with the plain HBase codec.
    double val = Bytes.toDouble(value.getValue(CF, TOTALAMOUNT));
    context.write(key, new Text(String.valueOf(val)));
}
MapReduce output:
AQIMPNEW_12345689_SQ123,-100.00000000000001
aqipm2037|4567899,0.018310546875,
aqipm2047|456789,-4.9E-324,
Phoenix output:
| TOTALAMOUNT |
| 100.0 |
| -234.0 |
| 0.0 |
Phoenix uses its own conversion scheme to store each datatype in HBase (the bytes are encoded so that they also sort in value order). When you fetch data through Phoenix, it decodes the bytes with that same scheme before showing them to you. So instead of connecting to HBase directly from your MR code, use the Phoenix MapReduce integration.
Refer: https://phoenix.apache.org/phoenix_mr.html
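A rough sketch of that integration, assuming Phoenix 4.x with phoenix-core on the job classpath (InvoiceWritable and InvoiceMapper are illustrative names for this answer, not part of the Phoenix API; check the page above for your version's exact API):

import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;
import org.apache.phoenix.mapreduce.util.PhoenixMapReduceUtil;

public class InvoiceJob {

    // Phoenix hands each row to the mapper as an already-decoded JDBC
    // ResultSet, so TOTALAMOUNT arrives as a proper double.
    public static class InvoiceWritable implements DBWritable {
        String invoiceId;
        double totalAmount;

        public void readFields(ResultSet rs) throws SQLException {
            invoiceId = rs.getString("INVOICEID");
            totalAmount = rs.getDouble("TOTALAMOUNT");
        }

        public void write(PreparedStatement ps) throws SQLException {
            ps.setString(1, invoiceId);
            ps.setDouble(2, totalAmount);
        }
    }

    public static class InvoiceMapper
            extends Mapper<NullWritable, InvoiceWritable, Text, Text> {
        @Override
        protected void map(NullWritable key, InvoiceWritable row, Context context)
                throws IOException, InterruptedException {
            context.write(new Text(row.invoiceId),
                    new Text(String.valueOf(row.totalAmount)));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "invoice-totals");
        job.setJarByClass(InvoiceJob.class);
        // Tells Phoenix which table/query feeds the mappers.
        PhoenixMapReduceUtil.setInput(job, InvoiceWritable.class, "INVOICE",
                "SELECT INVOICEID, TOTALAMOUNT FROM INVOICE");
        job.setMapperClass(InvoiceMapper.class);
        // ... reducer and output configuration as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}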
However, if you still want to connect to HBase directly, you have to use the same encoders and decoders that Phoenix uses.
Refer to the class org.apache.phoenix.schema.PDataType: http://grepcode.com/file/repo1.maven.org/maven2/org.apache.phoenix/phoenix/2.2.3-incubating/org/apache/phoenix/schema/PDataType.java#PDataType.BaseCodec.encodeDouble%28double%2Cbyte[]%2Cint%29
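For DOUBLE, the decode boils down to undoing a sign-dependent bit flip. A minimal standalone sketch of that logic, based on the codec in the class linked above (in newer Phoenix versions you can instead call PDouble.INSTANCE.getCodec().decodeDouble(...) directly):

import org.apache.hadoop.hbase.util.Bytes;

public final class PhoenixDoubleDecoder {

    // Phoenix encodes a double by flipping bits (and adding 1) so the raw
    // bytes sort in numeric order; this reverses that transformation.
    public static double decodeDouble(byte[] bytes, int offset) {
        long l = Bytes.toLong(bytes, offset);
        l--;                                            // undo the +1 from encode
        l ^= (~l >> (Long.SIZE - 1)) | Long.MIN_VALUE;  // undo the sign-dependent flip
        return Double.longBitsToDouble(l);
    }
}

In your map() you would then call PhoenixDoubleDecoder.decodeDouble(value.getValue(CF, TOTALAMOUNT), 0) instead of Bytes.toDouble(...), which returns the doubles Phoenix originally wrote rather than the garbled values above.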