I have configured a 2-node cluster with Hadoop and installed HBase on it. It was working properly: I ran some basic MapReduce jobs in Hadoop and was able to create and list some tables in HBase as well. However, I have very little data in HDFS/HBase and no jobs were running. After a while I started to get "java.net.SocketException: Too many open files" errors in the HBase logs.
I have looked for solutions, but the answers are mainly about increasing the limit. I am curious, however, about why there are so many open files in the first place. This cluster is not used by any other program, and I have not run anything other than the simple MapReduce tasks from the tutorials.
Why could this be?
EDIT
As Andrzej suggested, I ran lsof | grep java and observed that there are lots of connections on different ports waiting to be closed. Here are just a few lines of the output:
java 29872 hadoop 151u IPv6 158476883 0t0 TCP os231.myIP:44712->os231.myIP:50010 (CLOSE_WAIT)
java 29872 hadoop 152u IPv6 158476885 0t0 TCP os231.myIP:35214->os233.myIP:50010 (CLOSE_WAIT)
java 29872 hadoop 153u IPv6 158476886 0t0 TCP os231.myIP:39899->os232.myIP:50010 (CLOSE_WAIT)
java 29872 hadoop 155u IPv6 158476892 0t0 TCP os231.myIP:44717->os231.myIP:50010 (CLOSE_WAIT)
java 29872 hadoop 156u IPv6 158476895 0t0 TCP os231.myIP:44718->os231.myIP:50010 (CLOSE_WAIT)
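To get a rough count of how many of these there are in total (and per remote endpoint), something like this should work, using the PID 29872 from the output above:

# total number of sockets stuck in CLOSE_WAIT for the java process
lsof -p 29872 | grep CLOSE_WAIT | wc -l

# the same, grouped by remote endpoint
lsof -nP -p 29872 | awk '/CLOSE_WAIT/ {print $9}' | sort | uniq -c | sort -rn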
Now the question becomes: why don't they close automatically if the connection is no longer in use? If they do not get closed automatically, is there any way to close them with a crontab script or something similar?
Thanks
HBase keeps all of its files open all the time. Here is an example: if you have 10 tables with 3 column families each, an average of 3 files per column family, and 100 regions per table per Region Server, there will be 10*3*3*100 = 9000 file descriptors open. This math doesn't take into account JAR files, temp files, etc.
The suggested value for ulimit is 10240, but you might want to set it to a value that better matches your case.
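For example, assuming the daemons run as the hadoop user, you could check the current limit and raise it persistently roughly like this (adjust the user name and value to your setup):

# check the limit the daemons actually run with
su - hadoop -c 'ulimit -n'

# raise it persistently by adding these lines to /etc/security/limits.conf
hadoop  soft  nofile  10240
hadoop  hard  nofile  10240

Restart the Hadoop and HBase daemons afterwards so they pick up the new limit.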