I'm currently playing around with a two-node GlusterFS 3.5 cluster in replication mode. This is to get some insight into the system before implementing a real 3-node cluster on server hardware.
The test hardware is nothing high-end: an Intel Atom D2550 and an Intel i5, connected via an Ethernet crossover cable on their Gbit ports.
In the Gluster file system there are about 20,000 mostly small files (basically a Debian installation), which is similar to the real world usage it will need to handle later (on different hardware).
Since some legacy software running on the brick unfortunately needs to poll most of these files periodically, the latency of stat calls is a factor.
I did a simple test (GlusterFS mounted on the Gluster node itself):
# time find | wc -l
22174
real 0m18.542s
user 0m0.224s
sys 0m0.789s
From what I know, this may be so slow because GlusterFS has to check with the other node on each stat call.
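For reference, the measurement itself can be reproduced on any directory tree. The sketch below builds a throwaway tree of small files (the file names and count are made up for illustration); pointing DIR at the GlusterFS mount instead times the real volume:

```shell
# Reproduce the measurement on a throwaway tree of small files.
# Set DIR to the GlusterFS mount point instead to time the real volume.
DIR=$(mktemp -d)
for i in $(seq 1 100); do
    echo "payload" > "$DIR/file$i"
done
cd "$DIR"
count=$(find | wc -l)   # "." plus 100 files = 101 entries
echo "$count"
```

Wrapping the find in `time`, as in the transcripts above, lets you compare identical trees on the FUSE mount, the raw brick, and other mount types.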
When polling the brick storage directory directly, from the second trial onwards I get timings around 0.16 seconds (as expected, since everything is probably served from cache).
However, when I shut down the other node, so that there is only one node left, I get pretty similar results:
# time find | wc -l
22174
real 0m16.445s
user 0m0.213s
sys 0m0.702s
How can that be? What's slowing down Gluster in this case?
In general, how can I minimize read latency in a redundant GlusterFS setup? It would not be a problem if the directory listing temporarily lagged behind the real state during recovery after a crash, if that improved listing performance.
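Since you can tolerate slightly stale metadata, one knob worth trying on the FUSE client is to raise the kernel-side attribute and directory-entry cache timeouts, so repeated stats of the same files are answered from the kernel cache instead of going to Gluster each time. These mount options exist in the FUSE client, but the values below are only illustrative, and the server/volume names are placeholders:

```shell
# Increase FUSE metadata caching on the client mount (values are examples).
# attribute-timeout: how long the kernel caches stat() results (seconds)
# entry-timeout:     how long it caches directory-entry lookups (seconds)
# server1:/testvol and the mount point are placeholders for your setup.
mount -t glusterfs \
    -o attribute-timeout=600,entry-timeout=600 \
    server1:/testvol /mnt/glusterfs
```

Note that this only helps repeat traversals, not the first cold one, and that clients may see stale attributes for up to the configured timeout.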
Try mounting via the NFS compatibility layer. It is actually faster when you have a lot of small files: http://www.gluster.org/community/documentation/index.php/GlusterFS_General_FAQ#Is_the_gluster_client_or_NFS_client_faster.3F
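For example (server name, volume name, and mount point are placeholders; Gluster's built-in NFS server speaks NFSv3 over TCP, so the mount must request version 3):

```shell
# Mount the volume through Gluster's built-in NFSv3 server instead of FUSE.
# Replace server1, testvol, and the mount point with your own values.
# nolock may or may not be needed depending on your lock-manager setup.
mount -t nfs -o vers=3,mountproto=tcp,nolock \
    server1:/testvol /mnt/glusterfs-nfs
```

The NFS client benefits from the kernel's NFS attribute cache, which is why directory traversals over many small files are often faster than over the FUSE mount.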