I have a Perl script (on Ubuntu 12.04 LTS) writing to 26 Tokyo Cabinet hash (TCH) files, with keys roughly equally distributed across them. Insert speed drops from about 240,000 inserts/min at the start to about 14,000 inserts/min after 3 million inserts (spread evenly over all the files). Individually the shard files are no larger than 150 MB, and together they come to around 2.7 GB.
After every 100K inserts to a file, I run optimize on that TCH file with bnum set to 4x the record count at that point and options set to TLARGE, and I make sure xmsiz is large enough to cover the bucket array (as mentioned in Why does tokyo tyrant slow down exponentially even after adjusting bnum?).
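For reference, here is roughly the per-shard setup as a minimal sketch using the TokyoCabinet Perl binding; the shard path, record counts, and the 8-bytes-per-bucket sizing under TLARGE are illustrative assumptions:

```perl
use strict;
use warnings;
use TokyoCabinet;

my $shard = "shard00.tch";       # illustrative shard path
my $expected = 1_000_000;        # illustrative expected record count

my $hdb = TokyoCabinet::HDB->new();

# Tune before open: bnum = 4x expected records; -1 keeps apow/fpow defaults.
$hdb->tune(4 * $expected, -1, -1, $hdb->TLARGE);

# Map enough extra memory to hold the bucket array
# (assuming 8 bytes per bucket under TLARGE).
$hdb->setxmsiz(4 * $expected * 8);

$hdb->open($shard, $hdb->OWRITER | $hdb->OCREAT)
    or die "open error: " . $hdb->errmsg($hdb->ecode);

$hdb->put("key", "value")
    or warn "put error: " . $hdb->errmsg($hdb->ecode);

# Periodic re-optimize: grow the bucket array as the shard fills up.
my $rnum = $hdb->rnum();
$hdb->optimize(4 * $rnum, -1, -1, $hdb->TLARGE)
    or warn "optimize error: " . $hdb->errmsg($hdb->ecode);

$hdb->close();
```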
Even after this, inserts start fast and then gradually fall from 240k inserts/min to 14k inserts/min. Could it be due to holding multiple TCH connections (26) open in a single script? Or is there a configuration setting I'm missing? Would disabling journaling help? The thread above says journaling affects performance only once a TCH file grows beyond 3-4 GB, and my shards are under 150 MB each.
I would turn off journaling and measure what changes. The cited thread talks about a 2-3 GB TCH file, but if you sum the sizes of your 26 TCH files, you are in the same league: for the filesystem, the total range of data being written to should be the relevant parameter.
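On ext4 you can remove the journal entirely with tune2fs; a sketch, assuming the shards live on /dev/sdb1 mounted at /data (both placeholders). The filesystem must be unmounted first:

```
umount /data                       # journal removal requires an unmounted filesystem
tune2fs -O ^has_journal /dev/sdb1  # drop the ext4 journal
e2fsck -f /dev/sdb1                # check the filesystem after the feature change
mount /data
```

A less invasive first test is remounting with data=writeback and comparing insert rates before committing to a journal-less filesystem.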