Which engine should be used for more than 100 insert queries per second?


I have read about the differences, pros, and cons of MyISAM and InnoDB.
But I am still confused: for a table receiving 100+ insert queries per second (basically for tracking purposes), which engine should I use?

I referred to What's the difference between MyISAM and InnoDB?
Based on my understanding, MyISAM locks the whole table for each insert, so InnoDB should be used for its row-level locking.
But on the other hand, MyISAM's performance is said to be 100 times better.


So what is the optimal and correct choice, and why?


There are 2 answers

Rick James (Best Answer)

Simple code that does one-row INSERTs without any tuning maxes out at about 100 rows per second in any engine, especially InnoDB.

But, it is possible to get 1000 rows per second or even more.

The quick fix for InnoDB is to set innodb_flush_log_at_trx_commit = 2; that will uncork the main thing stopping InnoDB at 100 inserts/second using a commodity spinning disk. Setting innodb_buffer_pool_size to about 70% of available RAM is also important.
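A minimal sketch of those two settings as runtime changes (the 8G buffer pool value is an assumption for a server with roughly 12 GB of RAM; adjust to your hardware, and put both in my.cnf so they survive a restart):

    -- Flush the log to the OS on every commit, but fsync to disk only
    -- about once per second; a crash can lose up to ~1 second of
    -- transactions, in exchange for much faster commits.
    SET GLOBAL innodb_flush_log_at_trx_commit = 2;

    -- Buffer pool at ~70% of RAM (8G assumes a ~12 GB server).
    -- Resizable at runtime only in MySQL 5.7.5+; on older versions,
    -- set it in my.cnf and restart.
    SET GLOBAL innodb_buffer_pool_size = 8 * 1024 * 1024 * 1024;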

If a user is inserting multiple rows into the same table at the same time, then LOAD DATA or a batch Insert (INSERT ... VALUES (...), (...), ...) of 100 rows or more will insert ten times as fast. This applies to any Engine.
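For example, a sketch of a batched insert into a hypothetical tracking table (table and column names are assumptions):

    -- One statement, one commit, many rows.
    INSERT INTO tracking_events (user_id, event_type, created_at)
    VALUES
        (101, 'click', NOW()),
        (102, 'view',  NOW()),
        (103, 'click', NOW());

In practice, batches of 100 to 1000 rows capture most of the benefit.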

MyISAM is not 100 times as fast; it is not even 10 times as fast as InnoDB. Today (5.6 or newer), you would be hard pressed to find a well tuned application that is more than a little faster in MyISAM. You are, or will be, I/O-limited.

As for corruption -- No engine suffers from corruption except during a crash. A power failure may mangle MyISAM indexes, usually recoverably. Moreover, a batch insert could be half done. InnoDB will be clean -- the entire batch is done or none of it is done; no corruption.
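A sketch of that atomicity with InnoDB (table name is hypothetical):

    -- Either both rows are committed or, after a crash, neither is;
    -- the table never ends up with half a batch.
    START TRANSACTION;
    INSERT INTO tracking_events (user_id, event_type, created_at)
    VALUES (104, 'view', NOW()), (105, 'click', NOW());
    COMMIT;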

ARCHIVE saves disk space, but costs CPU.

MEMORY is often faster because it has no I/O. But you have too much data for that Engine, correct?

MariaDB with TokuDB can probably run faster than anything I describe here; but you have not indicated the need for it.

100 rows inserted per second ≈ 8.6M/day ≈ 3 billion/year. Will you be purging the data eventually? Will you be querying the data? Purging: Let's talk about PARTITION. Querying: Let's talk about Summary Tables.
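A sketch of date-based partitioning for cheap purging (table layout and partition boundaries are assumptions):

    -- Old data is dropped instantly with DROP PARTITION instead of a
    -- slow DELETE. Every unique key must include the partition column,
    -- hence the composite primary key.
    CREATE TABLE tracking_events (
        id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
        created_at DATETIME NOT NULL,
        user_id    INT UNSIGNED NOT NULL,
        event_type VARCHAR(32) NOT NULL,
        PRIMARY KEY (id, created_at)
    ) ENGINE=InnoDB
    PARTITION BY RANGE (TO_DAYS(created_at)) (
        PARTITION p20230101 VALUES LESS THAN (TO_DAYS('2023-01-02')),
        PARTITION p20230102 VALUES LESS THAN (TO_DAYS('2023-01-03')),
        PARTITION pmax      VALUES LESS THAN MAXVALUE
    );

    -- Purge a whole day almost instantly:
    ALTER TABLE tracking_events DROP PARTITION p20230101;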

Indexing: Minimize the number of indexes. If you have a 'random' index, such as a UUID, and you have a billion rows, you will be stuck with 100 rows/second, regardless of which Engine and regardless of any tuning. Do I need to explain further?
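A sketch of the contrast (schemas are hypothetical):

    -- Insert-friendly: an AUTO_INCREMENT key always appends to the
    -- "hot" end of the primary-key index.
    CREATE TABLE events_fast (
        id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        payload VARCHAR(255)
    ) ENGINE=InnoDB;

    -- Insert-hostile at scale: each UUID lands on a random index page,
    -- forcing a disk read per insert once the index outgrows the
    -- buffer pool.
    CREATE TABLE events_slow (
        id      CHAR(36) NOT NULL PRIMARY KEY,  -- e.g. UUID()
        payload VARCHAR(255)
    ) ENGINE=InnoDB;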

If this is a queuing system, I say "Don't queue it, just do it."

Bottom line: Use InnoDB. Tune it. Use batch inserts. Avoid random indexes. Etc.

O. Jones

You are correct that MyISAM is a faster choice if your operational use case is lots of insertions. But that answer can change drastically based on the kind of use you make of the data. If this is an archival application you might consider the ARCHIVE storage engine. It is best for write-once, read-rarely applications.

You should investigate INSERT DELAYED as it will allow your client programs to fire-and-forget these inserts rather than waiting for completion. This burns RAM in your mysqld process, though. If that style of operation meets your needs, this is a compelling reason to go with MyISAM.
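A sketch (table name is hypothetical; note that INSERT DELAYED works only with MyISAM, MEMORY, ARCHIVE, and BLACKHOLE tables, was deprecated in MySQL 5.6, and was removed in 5.7):

    -- Fire-and-forget: the client returns immediately; the server
    -- queues the row in memory and writes it when the table is free.
    INSERT DELAYED INTO tracking_events (user_id, event_type, created_at)
    VALUES (106, 'click', NOW());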

Beware indexes in the target table of your inserts. Maintaining indexes is a big part of the server's insert workload.

Don't forget to look into MariaDB. It's a compatible fork of MySQL with some more advanced storage engines and features.

I have experience with a similar application. In our case, the application scaled up beyond the original insert rate, and the server could not keep up. (It's always good when an application workload grows!) We ended up doing two things, one after the other.

  1. Using a message queuing system, and running just a couple of processes to actually do the inserts. The original clients wrote their logging records to the message queue rather than directly to the database. (Amazon AWS's SQS is an example of such a queuing system).
  2. Reworking the insert process to use LOAD DATA INFILE to load great gobs of log rows at once (see the sketch below).
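A sketch of the bulk-load step from item 2 (file path and column list are assumptions):

    -- Load a whole accumulated batch of log rows in one statement.
    LOAD DATA INFILE '/var/tmp/tracking_batch.csv'
    INTO TABLE tracking_events
    FIELDS TERMINATED BY ',' ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    (user_id, event_type, created_at);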

(You probably have figured out that this kind of workload isn't feasible on a cheap shared hosting service or an AWS micro instance.)