We are using single node MemSQL and everything was working fine but when we are trying to move our MemSQL setup to use multi node the insert/update statements are behaving very weirdly
My table structures are like below , have removed many columns , to keep it short
CREATE /*!90618 REFERENCE*/ TABLE `fact_orderitem_hourly_release_update`
(
`order_id` int(11) NOT NULL DEFAULT '0',
`customer_login` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`warehouse_id` int(11) DEFAULT NULL,
`city` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`store_id` int(11) DEFAULT NULL,
PRIMARY KEY (`order_id`)
);
CREATE TABLE `fact_orderitem_hourly_scale` (
`order_id` int(11) NOT NULL DEFAULT '0',
`order_group_id` int(11) NOT NULL DEFAULT '0',
`item_id` int(11) NOT NULL,
`sku_id` int(11) NOT NULL DEFAULT '0',
`sku_code` varchar(45) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`po_type` varchar(20) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`store_order_id` varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
`bi_last_modified_on` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00.000000',
PRIMARY KEY (`item_id`,`sku_id`),
/*!90618 SHARD */ KEY `sku_id` (`sku_id`),
KEY `idx_fact_orderitem_hourly_lmd` (`bi_last_modified_on`),
KEY `idx_fact_orderitem_hourly_ord` (`order_id`),
KEY `idx_order_group_id` (`order_group_id`),
KEY `idx_store_order_id` (`store_order_id`)
);
My Load Script :
mysql -h$LiveMemSQL_DB -u$LiveMemSQL_USER --password=$LiveMemSQL_PASS -P$LiveMemSQL_PORT --verbose reports_and_summary < /home/titan/brand_catalog/upsert_memsql_orl_update.sql
Contents of .SQL File :
--start of .sql file
TRUNCATE TABLE reports_and_summary.fact_orderitem_hourly_release_update;
#Load data into staging
LOAD DATA LOCAL INFILE '/myntra/redshift/delta_files/live_scale_order_release_upd.txt' INTO TABLE reports_and_summary.fact_orderitem_hourly_release_update LINES TERMINATED BY '\n';
#Insert/Update statement
INSERT INTO reports_and_summary.fact_orderitem_hourly_scale
(
item_id,
sku_id,
customer_login,
order_status,
is_realised,
is_shipped,
shipping_charge,
gift_charge,
warehouse_id,
city,
store_id
)
select
fo.item_id,
fo.sku_id,
fr.customer_login,
fr.order_status,
fr.is_realised,
fr.is_shipped,
fr.shipping_charge,
fr.gift_charge,
fr.warehouse_id,
fr.city,
fr.store_id
from fact_orderitem_hourly_release_update fr
join fact_orderitem_hourly_scale fo
on fr.order_id=fo.order_id
ON duplicate key update
customer_login=values(customer_login),
order_status=values(order_status),
is_realised=values(is_realised),
is_shipped=values(is_shipped),
shipping_charge=values(shipping_charge),
gift_charge=values(gift_charge),
warehouse_id=values(warehouse_id),
city=values(city),
store_id=values(store_id);
--End .sql file
When I trigger the above .sql through mysql command line client , it works sometimes and it doesn't many of times , and some times if I execute the same .sql file continuously 5-10 times , the updates will get effected in one of those runs , and sometimes say for example if there are 3 records with order_id 101 and status SHIPPED and we got an update in merge table say the order status has been changed to DELIVERED , ideally status of all 3 orders should be changed to DELIVERED , but only one or 2 of the rows associated with an order are getting updated but if I execute the same .sql file content through MySQLWorkbench it works perfectly fine , I may sound stupid , but this is what is happening and I am struggling from last 2 days with this weird behavior
Please find the below screen cast , where I captured this behaviour https://www.youtube.com/watch?v=v2HN-n4V0MI&feature=youtu.be
Your staging table is a reference table, and writes to reference tables are replicated asynchronously to the cluster. This is why sometimes your updates work as expected and sometimes they don't.
You can