MemSQL Weird Insert/Update Behaviour

723 views Asked by At

We are using single node MemSQL and everything was working fine but when we are trying to move our MemSQL setup to use multi node the insert/update statements are behaving very weirdly

My table structures are like below , have removed many columns , to keep it short

CREATE /*!90618 REFERENCE*/ TABLE `fact_orderitem_hourly_release_update` 
(
  `order_id` int(11) NOT NULL DEFAULT '0',
  `customer_login` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
  `warehouse_id` int(11) DEFAULT NULL,
  `city` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
  `store_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`order_id`)
);

CREATE TABLE `fact_orderitem_hourly_scale` (
  `order_id` int(11) NOT NULL DEFAULT '0',
  `order_group_id` int(11) NOT NULL DEFAULT '0',
  `item_id` int(11) NOT NULL,
  `sku_id` int(11) NOT NULL DEFAULT '0',
  `sku_code` varchar(45) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
  `po_type` varchar(20) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
  `store_order_id` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
  `bi_last_modified_on` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00.000000',
  PRIMARY KEY (`item_id`,`sku_id`),
  /*!90618 SHARD */ KEY `sku_id` (`sku_id`),
  KEY `idx_fact_orderitem_hourly_lmd` (`bi_last_modified_on`),
  KEY `idx_fact_orderitem_hourly_ord` (`order_id`),
  KEY `idx_order_group_id` (`order_group_id`),
  KEY `idx_store_order_id` (`store_order_id`)
);

My Load Script :

mysql -h$LiveMemSQL_DB -u$LiveMemSQL_USER --password=$LiveMemSQL_PASS -P$LiveMemSQL_PORT --verbose reports_and_summary < /home/titan/brand_catalog/upsert_memsql_orl_update.sql

Contents of .SQL File :

--start of .sql file
    TRUNCATE TABLE reports_and_summary.fact_orderitem_hourly_release_update;

#Load data into staging

LOAD DATA LOCAL INFILE '/myntra/redshift/delta_files/live_scale_order_release_upd.txt' INTO TABLE reports_and_summary.fact_orderitem_hourly_release_update LINES TERMINATED BY '\n';


#Insert/Update statement  
INSERT INTO reports_and_summary.fact_orderitem_hourly_scale
(
item_id,
sku_id,
customer_login,
order_status,
is_realised,
is_shipped,
shipping_charge,
gift_charge,
warehouse_id,
city,
store_id
)
select
fo.item_id,
fo.sku_id,
fr.customer_login,
fr.order_status,
fr.is_realised,
fr.is_shipped,
fr.shipping_charge,
fr.gift_charge,
fr.warehouse_id,
fr.city,
fr.store_id
from fact_orderitem_hourly_release_update fr
join fact_orderitem_hourly_scale fo
on fr.order_id=fo.order_id
ON duplicate key update
customer_login=values(customer_login),
order_status=values(order_status),
is_realised=values(is_realised),
is_shipped=values(is_shipped),
shipping_charge=values(shipping_charge),
gift_charge=values(gift_charge),
warehouse_id=values(warehouse_id),
city=values(city),
store_id=values(store_id);

--End .sql file

When I trigger the above .sql through mysql command line client , it works sometimes and it doesn't many of times , and some times if I execute the same .sql file continuously 5-10 times , the updates will get effected in one of those runs , and sometimes say for example if there are 3 records with order_id 101 and status SHIPPED and we got an update in merge table say the order status has been changed to DELIVERED , ideally status of all 3 orders should be changed to DELIVERED , but only one or 2 of the rows associated with an order are getting updated but if I execute the same .sql file content through MySQLWorkbench it works perfectly fine , I may sound stupid , but this is what is happening and I am struggling from last 2 days with this weird behavior

Please find the below screen cast , where I captured this behaviour https://www.youtube.com/watch?v=v2HN-n4V0MI&feature=youtu.be

1

There are 1 answers

6
Jack Chen On BEST ANSWER

Your staging table is a reference table, and writes to reference tables are replicated asynchronously to the cluster. This is why sometimes your updates work as expected and sometimes they don't.

You can

  • wait for a bit after writing into the reference table
  • make the staging table non-reference