I have pretty straightforward setup of CouchDB on my Mint/Debian box. My Java webapp was sufferring rather long delays on querying CouchDB, so I started to seek for the causes.
EDIT: The query pattern is lots of small queries and small JSON objects (like 300 bytes up / 1Kbyte down).
Wireshark dumps are pretty nice, showing mostly 3-5 millis request-response turnaround. JVM frame sampling showed me that socket code (client side queries to the Couch) is somewhat busy, but nothing remarkable. Then I tried to profile the same with ApacheBench and oops: I currently see that keep-alive introduces steady extra 39ms delay over non-persistent setups.
Does anyone know how to explain this? Maybe persistent connections increase the congestion window on the TCP layer and then are idling out due to TCP_WAIT and small request/response sizes, or something like that? Should this option (TCP_WAIT) be ever switched ON for loopback tcp connections?
w@mint ~ $ uname -a
Linux mint 2.6.39-2-486 #1 Tue Jul 5 02:52:23 UTC 2011 i686 GNU/Linux
w@mint ~ $ curl http://127.0.0.1:5984/
{"couchdb":"Welcome","version":"1.1.1"}
running with keep alive, average 40 millis per request
w@mint ~ $ ab -n 1024 -c 1 -k http://127.0.0.1:5984/
>>>snip
Server Software: CouchDB/1.1.1
Server Hostname: 127.0.0.1
Server Port: 5984
Document Path: /
Document Length: 40 bytes
Concurrency Level: 1
Time taken for tests: 41.001 seconds
Complete requests: 1024
Failed requests: 0
Write errors: 0
Keep-Alive requests: 1024
Total transferred: 261120 bytes
HTML transferred: 40960 bytes
Requests per second: 24.98 [#/sec] (mean)
Time per request: 40.040 [ms] (mean)
Time per request: 40.040 [ms] (mean, across all concurrent requests)
Transfer rate: 6.22 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 1 40 1.4 40 48
Waiting: 0 1 0.7 1 8
Total: 1 40 1.3 40 48
Percentage of the requests served within a certain time (ms)
50% 40
>>>snip
95% 40
98% 41
99% 44
100% 48 (longest request)
No keepalive, and voila - 1 ms per request, mostly.
w@mint ~ $ ab -n 1024 -c 1 http://127.0.0.1:5984/
>>>snip
Time taken for tests: 1.080 seconds
Complete requests: 1024
Failed requests: 0
Write errors: 0
Total transferred: 236544 bytes
HTML transferred: 40960 bytes
Requests per second: 948.15 [#/sec] (mean)
Time per request: 1.055 [ms] (mean)
Time per request: 1.055 [ms] (mean, across all concurrent requests)
Transfer rate: 213.89 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 1 1 1.0 1 11
Waiting: 1 1 0.9 1 11
Total: 1 1 1.0 1 11
Percentage of the requests served within a certain time (ms)
50% 1
>>>snip
80% 1
90% 2
95% 3
98% 5
99% 6
100% 11 (longest request)
Okay, now with keep-alive on but also asking to close the connection via http header. Also 1 ms per request or so.
w@mint ~ $ ab -n 1024 -c 1 -k -H 'Connection: close' http://127.0.0.1:5984/
>>>snip
Time taken for tests: 1.131 seconds
Complete requests: 1024
Failed requests: 0
Write errors: 0
Keep-Alive requests: 0
Total transferred: 236544 bytes
HTML transferred: 40960 bytes
Requests per second: 905.03 [#/sec] (mean)
Time per request: 1.105 [ms] (mean)
Time per request: 1.105 [ms] (mean, across all concurrent requests)
Transfer rate: 204.16 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 1 1 1.2 1 14
Waiting: 0 1 1.1 1 13
Total: 1 1 1.2 1 14
Percentage of the requests served within a certain time (ms)
50% 1
>>>snip
80% 1
90% 2
95% 3
98% 6
99% 7
100% 14 (longest request)
Yeah, this is related to tcp socket setup options. This configuration now leveled off all three cases at 1ms per request.
See this for details: http://wiki.apache.org/couchdb/Performance#Network