I have a table with millions of records in MySQL -> imported into Infobright using notes in wiki pages, no problem!
Here is the table syntax
CREATE TABLE `myTable` (
`a` varchar(255) COLLATE latin1_bin DEFAULT NULL,
`b` varchar(255) COLLATE latin1_bin DEFAULT NULL,
`c` bigint(20) NOT NULL,
`d` bigint(20) NOT NULL,
`e` int(10) NOT NULL
) ENGINE=BRIGHTHOUSE
now I need to run a select query 450 times, each time a different 'a' is being used a constraint as such:
SELECT d,e FROM `myTable` WHERE a = 'myString';
The goal is to speed up the amount of time all the queries in total is being called. But I am having problems with that. When I run the select query about 450 times, on average it takes 0.52 seconds each query!
When I run it through MySQL however, it takes about 1.7 milliseconds per query!
How can I optimize this to beat MySQL time? Would it require me to use the 'IN' clause, instead of the '=' with an additional selection of 'a' ontop of d,e? eg:
SELECT a,d,e FROM `myTable` WHERE a IN ('myString1','myString2'.... etc )
Infobright excels at working with numeric data (specifically when it comes to query conditions). They do not support traditional BTREE indexes, as they use a "knowledge grid" meta-data system. When you run
WHERE a = 'myString'
, you're forcing the engine to open up every packrow (usually 50,000 rows of data per packrow) and do a string comparison against each record.MySQL does have support for the BTREE indexes, which will offer better performance for text-based query conditions like you've noted here. Given that you've said you need to run the query 450 times, I will assume that you have at least 450 unique values for `a`.
If you have a text column like that with a very small set of unique values (like... some sort of status indicator, for example), you would greatly benefit from defining the column as a LOOKUP column
NOTE: This sort of table setup offers the best performance for low-cardinality columns, preferably with a small size allocation for varchar(), not 255.
For this set of queries, you would probably be better off sticking with MySQL, or re-working the data to convert these values to numeric data in Infobright.
Since this question is about 4 months old, I'm kinda just posting this as a reference for anybody else that stumbles upon it