HBase shell: QualifierFilter does not filter out columns when combined with a logical OR and SingleColumnValueFilter

In the customer_product table, I have two column families: 'personal_info' (with the columns 'first_name' and 'last_name') and 'product_purchase' (with the columns 'product' and 'related_product').

I am trying to write a query that finds all products whose related product is "Pens" or "Comfortchair". I want to display only the 'product' column and nothing else.

I have tried the scan queries below; each is followed by its output:

scan 'customer_product', {FILTER => "((SingleColumnValueFilter('product_purchase','related_product', =, 'binary:Pens')) OR\
(SingleColumnValueFilter('product_purchase','related_product', =, 'binary:Comfortchair'))) AND\
(QualifierFilter(=, 'binary:product'))"}

This outputs:

 [email protected]                               column=personal_info:first_name, timestamp=1581381162345, value=David
 [email protected]                               column=personal_info:last_name, timestamp=1581381162348, value=Pierce
 [email protected]                               column=product_purchase:product, timestamp=1581381162350, value=Bed
 [email protected]                               column=personal_info:first_name, timestamp=1581381162110, value=Debra
 [email protected]                               column=personal_info:last_name, timestamp=1581381162114, value=Dantha
 [email protected]                               column=product_purchase:product, timestamp=1581381162118, value=Chair
 [email protected]                                 column=personal_info:first_name, timestamp=1581381162077, value=Kia
 [email protected]                                 column=personal_info:last_name, timestamp=1581381162080, value=Bobby
 [email protected]                                 column=product_purchase:product, timestamp=1581381162083, value=Notebooks
 [email protected]                                 column=product_purchase:related_product, timestamp=1581381162087, value=Pens
 [email protected]                                  column=personal_info:first_name, timestamp=1581381162336, value=Matthew
 [email protected]                                  column=personal_info:last_name, timestamp=1581381162338, value=Cyril
 [email protected]                                  column=product_purchase:product, timestamp=1581381162341, value=Pencils
 [email protected]                                  column=product_purchase:related_product, timestamp=1581381162343, value=Pens

For this query:

scan 'customer_product', {COLUMN => 'product_purchase',\
FILTER => "((SingleColumnValueFilter('product_purchase','related_product', =, 'binary:Pens')) OR\
(SingleColumnValueFilter('product_purchase','related_product', =, 'binary:Comfortchair'))) AND\
(QualifierFilter(=, 'binary:product'))"}

it outputs:

 [email protected]                               column=product_purchase:product, timestamp=1581381162350, value=Bed
 [email protected]                               column=product_purchase:product, timestamp=1581381162118, value=Chair
 [email protected]                                 column=product_purchase:product, timestamp=1581381162083, value=Notebooks
 [email protected]                                 column=product_purchase:related_product, timestamp=1581381162087, value=Pens
 [email protected]                                  column=product_purchase:product, timestamp=1581381162341, value=Pencils
 [email protected]                                  column=product_purchase:related_product, timestamp=1581381162343, value=Pens

Why isn't the related_product column (value "Pens") filtered out of the output?

Also,

scan 'customer_product', {COLUMN => 'product_purchase:product',\
FILTER => "((SingleColumnValueFilter('product_purchase','related_product', =, 'binary:Pens')) OR\
(SingleColumnValueFilter('product_purchase','related_product', =, 'binary:Comfortchair'))) AND\
(QualifierFilter(=, 'binary:product'))"}

prints every product; no filtering happens at all.

I do not know what I am missing.
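
One thing that may be worth checking, based on how the HBase reference guide describes SingleColumnValueFilter: when the tested column is absent from a row, the whole row is emitted unless the optional filterIfMissing argument is set to true. That would be consistent with the "Bed" and "Chair" rows appearing (they have no related_product cell), and with the third scan, where COLUMN => 'product_purchase:product' leaves related_product out of the scan so it looks missing for every row. The sketch below only builds the filter string with the two optional boolean arguments (filterIfMissing, latestVersionOnly) added. The helper name related_product_filter is hypothetical, and it has not been run against this table, so whether it also changes the QualifierFilter behaviour remains an open question.

```ruby
# Hypothetical helper (not part of HBase): builds the filter string in plain Ruby
# so it can be inspected before being handed to `scan`.
def related_product_filter(values, filter_if_missing: true)
  clauses = values.map do |value|
    # The two trailing booleans are the optional filterIfMissing and
    # latestVersionOnly arguments of SingleColumnValueFilter.
    "SingleColumnValueFilter('product_purchase', 'related_product', =, " \
      "'binary:#{value}', #{filter_if_missing}, true)"
  end
  "(#{clauses.join(' OR ')}) AND QualifierFilter(=, 'binary:product')"
end

filter = related_product_filter(%w[Pens Comfortchair])
puts filter

# Inside the HBase shell (not plain Ruby) the string would then be used as:
#   scan 'customer_product', {FILTER => filter}
```

Note that the scan in this sketch does not restrict COLUMN to 'product_purchase:product', because the column being tested has to be part of the scan for SingleColumnValueFilter to see it.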
