1. the problem
I deployed a webapp in Tomcat, and I found that it shutdowns randomly, the time varies from 2 or 3 hours to 2 or 3 days. The log in catalina.out
is:
26224 2015-06-10 13:59:04.110 {http-nio-8080-exec-3} INFO com.timediff.controller.user.UserProfileController#getUserHome - /user/profile/home done, curUid: 889
26225 10-Jun-2015 14:15:35.050 INFO [Thread-11] org.apache.coyote.AbstractProtocol.pause Pausing ProtocolHandler ["http-nio-8080"]
26226 10-Jun-2015 14:15:35.052 INFO [Thread-11] org.apache.coyote.AbstractProtocol.pause Pausing ProtocolHandler ["ajp-nio-8009"]
26227 10-Jun-2015 14:15:35.053 INFO [Thread-11] org.apache.catalina.core.StandardService.stopInternal Stopping service Catalina
26228 10-Jun-2015 14:15:35.058 INFO [localhost-startStop-2] org.springframework.web.context.support.XmlWebApplicationContext.doClose Closing WebApplicationContext for namespace 'timediff-dispatcher-servlet': startup date [Wed Jun 10 13:38:14 CST 2015]; root of context hierarchy
26229 10-Jun-2015 14:15:35.059 INFO [localhost-startStop-2] org.springframework.context.support.DefaultLifecycleProcessor.stop Stopping beans in phase 2147483647
26230 2015-06-10 14:15:35.061 {localhost-startStop-2} INFO org.quartz.core.QuartzScheduler#standby - Scheduler TimediffScheduler_$_iZu1skaofy1Z1433914696931 paused.
26231 10-Jun-2015 14:15:35.072 INFO [localhost-startStop-2] org.springframework.scheduling.quartz.SchedulerFactoryBean.destroy Shutting down Quartz Scheduler
26232 2015-06-10 14:15:35.072 {localhost-startStop-2} INFO org.quartz.core.QuartzScheduler#shutdown - Scheduler TimediffScheduler_$_iZu1skaofy1Z1433914696931 shutting down.
26233 2015-06-10 14:15:35.075 {localhost-startStop-2} INFO org.quartz.core.QuartzScheduler#standby - Scheduler TimediffScheduler_$_iZu1skaofy1Z1433914696931 paused.
26234 2015-06-10 14:15:35.077 {localhost-startStop-2} INFO org.quartz.core.QuartzScheduler#shutdown - Scheduler TimediffScheduler_$_iZu1skaofy1Z1433914696931 shutdown complete.
26235 10-Jun-2015 14:15:35.082 INFO [localhost-startStop-2] org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor.shutdown Shutting down ExecutorService 'quartzThreadPool'
26236 2015-06-10 14:15:35.103 {localhost-startStop-2} INFO com.timediff.listener.StopMemoryLeakListener#lambda$contextDestroyed$0 - driver: com.mysql.jdbc.Driver@7657b26d is de-registered.
26237 2015-06-10 14:15:35.104 {localhost-startStop-2} INFO com.timediff.listener.StopMemoryLeakListener#contextDestroyed - AbandonedConnectionCleanupThread shutdown.
26238 10-Jun-2015 14:15:35.150 INFO [Thread-11] org.apache.coyote.AbstractProtocol.stop Stopping ProtocolHandler ["http-nio-8080"]
26239 10-Jun-2015 14:15:35.152 INFO [Thread-11] org.apache.coyote.AbstractProtocol.stop Stopping ProtocolHandler ["ajp-nio-8009"]
26240 10-Jun-2015 14:15:35.154 INFO [Thread-11] org.apache.coyote.AbstractProtocol.destroy Destroying ProtocolHandler ["http-nio-8080"]
26241 10-Jun-2015 14:15:35.156 INFO [Thread-11] org.apache.coyote.AbstractProtocol.destroy Destroying ProtocolHandler ["ajp-nio-8009"]
on stackoverflow, this question and this question is very similar to my situation, but I'm still stumbled.
Now I'll give a detailed description of my problem:
2. more detail
2.1 tomcat and jdk version
Tomcat: 8.0.22
JDK: 1.8.0_45
2.2 jvm options in catalina.sh:
CATALINA_OPTS="-server -Xms1g -Xmx1g -XX:MaxMetaspaceSize=512m -Xmn512m
-XX:SurvivorRatio=8
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=70 -XX:+ScavengeBeforeFullGC
-XX:+CMSScavengeBeforeRemark
-XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails
-Xloggc:/opt/logs/gc/timediff-gc.log
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M
-Dsun.net.inetaddr.ttl=120 -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/opt/logs/gc/timediff-oom.hprof
-Djava.rmi.server.hostname=**.**.**.**
-Dcom.sun.management.jmxremote.port=1099
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false"
2.3 there are no exception logs in my webapp related to the abortion of tomcat, and I'm sure I never called System.exit(), and there are no code blocks like:
try {
} catch(Exception e) {
// do nothing
}
2.4 while I actually found Allocation Failure in gc log:
2015-06-10T15:36:28.589+0800: 3099.795: [GC (Allocation Failure) 3099.795: [ParNew: 419780K->382K(471872K), 0.0125816 secs] 469721K->50348K(996160K), 0.0126820 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2015-06-10T15:37:30.141+0800: 3161.347: [GC (Allocation Failure) 3161.347: [ParNew: 419838K->372K(471872K), 0.0062445 secs] 469804K->50338K(996160K), 0.0063629 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2015-06-10T15:38:41.680+0800: 3232.886: [GC (Allocation Failure) 3232.886: [ParNew: 419828K->369K(471872K), 0.0064920 secs] 469794K->50356K(996160K), 0.0066009 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2015-06-10T15:39:43.222+0800: 3294.428: [GC (Allocation Failure) 3294.428: [ParNew: 419825K->384K(471872K), 0.0058772 secs] 469812K->50372K(996160K), 0.0059823 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2015-06-10T15:40:54.758+0800: 3365.964: [GC (Allocation Failure) 3365.964: [ParNew: 419840K->388K(471872K), 0.0056674 secs] 469828K->50395K(996160K), 0.0069850 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
I think that maybe this is the cause, but the result of TOP and jvisualVM makes it unclear:
web@iZu1skaofy1Z:/usr/local/apache-tomcat-8.0.22/logs$ free -m
total used free shared buffers cached
Mem: 3951 3087 864 0 190 553
-/+ buffers/cache: 2343 1608
Swap: 0 0 0
top - 15:50:05 up 16 days, 5:11, 2 users, load average: 0.33, 0.17, 0.09
Tasks: 128 total, 2 running, 126 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.8 us, 0.5 sy, 0.0 ni, 98.5 id, 0.0 wa, 0.2 hi, 0.0 si, 0.0 st
KiB Mem: 4046820 total, 3161260 used, 885560 free, 194880 buffers
KiB Swap: 0 total, 0 used, 0 free. 566984 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27307 web 20 0 2068604 865872 22048 S 0.7 21.4 20:20.28 java
16557 web 20 0 3680756 801708 13740 S 0.0 19.8 2:02.99 java
15597 mysql 20 0 1800972 526220 6636 S 0.0 13.0 36:26.08 mysqld
2.4 I deployed another tomcat on the same server, but I changed the shutdown port and connector port, I don't think that they are conflict.
I've tried my best, maybe I forget something during the analysis, please help to give me some tips, thanks in advance!
update(2015-07-04): after I switch from user web
to user root
when running tomcat, the problem never occurs. So I doubt that tomcat is kill by system because of user privilege, if you have any idea, please tell me, thanks!
This answer (from one of the questions you found) seems good.
Something is telling Tomcat to stop. And since it doesn't happen when Tomcat is run as
root
, I think the cause is some other (non-system) process (maybe a script or a cron job) sending a signal (probably,SIGTERM
) to Tomcat, likekill <tomcat pid>
. Maybe that other process is also run as userweb
- that would explain why that process can't killroot
's Tomcat. Or maybe that other process just searches for processes to kill, and one of the criteria is "processes owned byweb
".I suggest you carefully read crontabs for users
root
andweb
, system-wide crontab and everything in/etc/cron.*/
folders. You can also check if any other processes owned byweb
are suddenly terminated. And building Tomcat from source, with some tracing added (as suggested in the answer I mentioned), seems a good idea.