Loading a CSV file into Infobright with LOAD DATA INFILE


I have a table (created successfully in Infobright).

I am using a Windows system.

CREATE TABLE `file_records` (
  `id` int(11) NOT NULL ,
  `file_id` int(11)  NULL,
  `file_url` varchar(255)  NULL,
  `switch_id` int(11)  NULL,
  `carrierid_supplier` int(11)  NULL,
  `technical_profileid_supplier` int(11)  NULL,
  `carrierid_customer` int(11)  NULL,
  `technical_profileid_customer` int(11)  NULL,
  `billing_increment_supplier` varchar(10)  NULL,
  `billing_increment_customer` varchar(10)  NULL,
  `billable_duration_supplier` int(11)  NULL,
  `call_duration_seconds` int(11)  NULL,
  `billable_duration_customer` int(11)  NULL,
  `destination` varchar(50)  NULL,
  `destination_country` varchar(50)  NULL,
  `destination_country_number` varchar(50)  NULL,
  `rateplanid_supplier` int(11)  NULL,
  `rateplanid_customer` int(11)  NULL,
  `rate_supplier` int(11)  NULL,
  `rate_customer` int(11)  NULL,
  `rate_total_supplier` varchar(10)  NULL,
  `rate_total_customer` varchar(10)  NULL,
  `rate_effective_date_supplier` date  NULL,
  `rate_effective_date_customer` date  NULL,
  `call_hour` varchar(10)  NULL,
  `sequence_number` int(11)  NULL,
  `version` varchar(10)  NULL,
  `record_type` varchar(1)  NULL,
  `connection_type` varchar(16)  NULL,
  `session_id` varchar(36)  NULL,
  `release_cause` smallint(6)  NULL,
  `start_time_date` datetime  NULL,
  `answer_time_date` datetime  NULL,
  `release_tod` datetime  NULL,
  `greenwich_mean_time` varchar(32)  NULL,
  `release_cause_protocol_stack` varchar(32)  NULL,
  `binary_value_protocol_stack` smallint(6)  NULL,
  `first_release_dialogue` varchar(1)  NULL,
  `origination_trunkid` int(11)  NULL,
  `voip_protocol` varchar(6)  NULL,
  `source_number` varchar(128)  NULL,
  `source_host_name` varchar(128)  NULL,
  `destination_number` varchar(128)  NULL,
  `destination_host_name` varchar(128)  NULL,
  `callid` varchar(128)  NULL,
  `remote_payload_ipaddress` varchar(16)  NULL,
  `remote_payload_udpaddress` varchar(6)  NULL,
  `local_payload_ipaddress` varchar(16)  NULL,
  `local_payload_udpaddress` varchar(6)  NULL,
  `codec_list` varchar(128)  NULL,
  `ingress_packets` int(11)  NULL,
  `egress_packets` int(11)  NULL,
  `ingress_octets` int(11)  NULL,
  `egress_octets` int(11)  NULL,
  `ingress_packet_loss` int(11)  NULL,
  `ingress_delay` int(11)  NULL,
  `ingress_packet_jitter` int(11)  NULL,
  `supplierid` mediumint(9)  NULL,
  `protocol` varchar(6)  NULL,
  `termination_source_number` varchar(128)  NULL,
  `termination_source_host` varchar(128)  NULL,
  `termination_destination_number` varchar(128)  NULL,
  `termination_destination_host_name` varchar(128)  NULL,
  `termination_callid` varchar(128)  NULL,
  `termination_remote_payload_ipaddress` varchar(16)  NULL,
  `termination_remote_payload_udpaddress` varchar(6)  NULL,
  `termination_local_payload_ipaddress` varchar(16)  NULL,
  `termination_local_payload_udpaddress` varchar(6)  NULL,
  `termination_codec_list` varchar(128)  NULL,
  `termination_ingress_packets` int(11)  NULL,
  `termination_egress_packets` int(11)  NULL,
  `termination_ingress_octets` int(11)  NULL,
  `termination_egress_octets` int(11)  NULL,
  `termination_ingress_packet_loss` int(11)  NULL,
  `termination_ingress_delay` int(11)  NULL,
  `termination_ingress_packet_jitter` int(11)  NULL,
  `final_route_indication` varchar(1)  NULL,
  `routing_digits` varchar(64)  NULL,
  `call_duration` mediumint(9)  NULL,
  `post_dial_delay` mediumint(9)  NULL,
  `ring_time` mediumint(9)  NULL,
  `call_duration_ms` int(11)  NULL,
  `confid` varchar(32)  NULL,
  `rpid` varchar(32)  NULL,
  `route_entry_index` tinyint(4)  NULL,
  `route_table_used` mediumint(9)  NULL,
  `lnp_dipped` varchar(1)  NULL,
  `ingress_lrn` varchar(32)  NULL,
  `egress_lrn` varchar(32)  NULL,
  `cnam_dipped` tinyint(4)  NULL,
  `dnc_dipped` tinyint(4)  NULL,
  `origination_device_name` varchar(15)  NULL,
  `termination_device_name` varchar(15)  NULL,
  `ers_dipped` varchar(1)  NULL,
  `oli_digits` varchar(8)  NULL
) ENGINE=Brighthouse;

And I am loading the CSV data file using:

LOAD DATA INFILE 'C:/Users/dhairya/Desktop/a.csv' INTO TABLE file_records
FIELDS TERMINATED BY ';'
LINES TERMINATED BY '\n'
(sequence_number, version, record_type, connection_type, session_id, release_cause,
@start_time_date, @answer_time_date, @release_tod, greenwich_mean_time,
release_cause_protocol_stack, binary_value_protocol_stack, first_release_dialogue,
origination_trunkid, voip_protocol, source_number, source_host_name,
destination_number, destination_host_name, callid, remote_payload_ipaddress,
remote_payload_udpaddress, local_payload_ipaddress, local_payload_udpaddress,
codec_list, ingress_packets, egress_packets, ingress_octets, egress_octets,
ingress_packet_loss, ingress_delay, ingress_packet_jitter, supplierid, protocol,
termination_source_number, termination_source_host, termination_destination_number,
termination_destination_host_name, termination_callid,
termination_remote_payload_ipaddress, termination_remote_payload_udpaddress,
termination_local_payload_ipaddress, termination_local_payload_udpaddress,
termination_codec_list, termination_ingress_packets, termination_egress_packets,
termination_ingress_octets, termination_egress_octets,
termination_ingress_packet_loss, termination_ingress_delay,
termination_ingress_packet_jitter, final_route_indication, routing_digits,
call_duration, post_dial_delay, ring_time, call_duration_ms, confid, rpid,
route_entry_index, route_table_used, lnp_dipped, ingress_lrn, egress_lrn,
cnam_dipped, dnc_dipped, origination_device_name, termination_device_name,
ers_dipped, oli_digits, routing_match, termination_pdd, proxy_charge_information,
jurisdictoinal_indication_parameter, matched_digits_ers, route_table_irs,
route_sequence_number_ers, jurisdiction_type_ers)
SET start_time_date = 0, answer_time_date = 0, release_tod = 0, file_id = 1;

I always get the error ERROR 2 (HY000): Wrong data or column definition.

A sample CSV line is below:

018002724;V1.17;R;No-Media;[email protected];0001;Sat Apr 27 23:57:21 2013;Sat Apr 27 23:57:54 2013;Sat Apr 27 23:58:44 2013;0;Normal BYE; 200;O;030057;SIP;0186645212;80.84.30.12;6282330238419;98.123.123.000;7006204640450523785-1367107040-32218;80.84.30.12;39264;NA;NA;G.729,G.723.1,RFC 2833,G.711a_64k,G.711u_64k;0;0;0;0;0;0;0;090069;SIP;0186645212;98.122.122.122;6282330238419;112.123.123.123;[email protected];123.123.123.2;24992;NA;NA;G.729,RFC 2833;0;0;0;0;0;0;0;F;6282330238419;49;1;32;49680;12E192A8 345A3974 ADF7A11C A6919E62;;1;103;0;;;0;0;SP Belgacom - B;Vinculum WHS Ve;0;;62823;1140;;;;0;0;0;

I have verified that the line terminator of the CSV file is '\n'.

The queries above work fine in plain MySQL.

1 answer

Answer by StampyCode:

The problem is that the BRIGHTHOUSE loader does not support specifying columns in the LOAD DATA INFILE command. It also doesn't support setting default values for columns in that command.

To load this data into a BRIGHTHOUSE table, you have 4 options:

  • Make the table schema match the CSV file layout,
  • Make the CSV layout match the table schema,
  • Use SET @BH_DATAFORMAT = 'mysql' to use the standard MySQL loader instead of the BRIGHTHOUSE one,
  • Or create a temporary table that matches your CSV layout, load into that, then write its contents out to a new CSV file whose layout matches the original table, and load that.

I know it's painful; I've been working with an Infobright database for the past six months and it's a real headache to use.

Also, don't forget these handy settings when working with the BRIGHTHOUSE loader:

  • SET @BH_DATAFORMAT = 'txt_variable'; This means use the BRIGHTHOUSE loader.
  • SET @BH_DATAFORMAT = 'mysql'; This means use the MySQL loader.
  • SET @BH_REJECT_FILE_PATH = '/tmp/new_file_name'; This copies any line from your CSV file that doesn't match the schema you're loading into out to this file. It is especially helpful for debugging, or when you have large CSV files with headers you cannot remove, because the BRIGHTHOUSE loader does not implement the IGNORE n LINES clause of LOAD DATA INFILE.
  • SET @BH_ABORT_ON_COUNT = 1000; Aborts the file load command when this number of lines in your input CSV file does not match the schema of the table you're loading into.

Also bear in mind that there is a bug in the BRIGHTHOUSE CSV exporter: backslashes are not escaped in the resulting CSV file. So if you want to load that CSV back into your database schema, you will need to use LOAD DATA INFILE '...' INTO TABLE ... FIELDS ESCAPED BY '', or the importer will treat each backslash as the start of an escape sequence that changes the meaning of the character following it.
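To see why that matters, here is a toy Python illustration (not Infobright or MySQL code) of MySQL's default LOAD DATA escape rules, where an unescaped backslash in the data changes the field on re-import: `\n` becomes a newline, `\t` a tab, and so on. With FIELDS ESCAPED BY '' the field would instead be taken verbatim.

```python
# Escape sequences recognized by MySQL's default LOAD DATA importer.
MYSQL_ESCAPES = {'0': '\0', 'b': '\b', 'n': '\n', 'r': '\r',
                 't': '\t', 'Z': '\x1a', '\\': '\\'}

def mysql_unescape(field):
    """Interpret a raw field the way the default (ESCAPED BY '\\') importer would."""
    out, i = [], 0
    while i < len(field):
        if field[i] == '\\' and i + 1 < len(field):
            nxt = field[i + 1]
            # Known sequences map to control characters; for any other
            # character the backslash is simply dropped.
            out.append(MYSQL_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(field[i])
            i += 1
    return ''.join(out)
```

So a Windows path like `C:\new` exported without escaping would come back as `C:` + newline + `ew`, which is exactly the corruption that FIELDS ESCAPED BY '' avoids.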

Here is a handy SQL block that might help you to load CSV into a BRIGHTHOUSE table with mismatched schema:

CREATE TABLE `tmptable` ENGINE=MySQL SELECT * FROM `maintable` WHERE 0 LIMIT 1;

SET @BH_DATAFORMAT = 'mysql';

LOAD DATA INFILE '/tmp/myfile' INTO TABLE `tmptable`
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' 
ESCAPED BY '' 
LINES TERMINATED BY '\n' 
IGNORE 1 LINES
(field1, field2, field3) SET field4 = 'somevalue'; /*as per your CSV layout*/

SELECT * FROM tmptable INTO OUTFILE '/tmp/mynewfile.csv' 
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' 
ESCAPED BY '' 
LINES TERMINATED BY '\n';

DROP TABLE `tmptable`;

SET @BH_DATAFORMAT = 'txt_variable';
SET @BH_REJECT_FILE_PATH = '/tmp/failed_csv_lines.csv';
SET @BH_ABORT_ON_COUNT = 10;

LOAD DATA INFILE '/tmp/mynewfile.csv' INTO TABLE `maintable`
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' 
ESCAPED BY '' 
LINES TERMINATED BY '\n';

(The WHERE 0 LIMIT 1 trick works where LIMIT 0 does not: in plain MySQL, LIMIT 0 works fine, but with BRIGHTHOUSE it throws a really strange error that took me a while to track down, so I use this method instead.)

I know this answer is a bit late coming, but I hope it's useful to anyone who stumbles upon this question.