We've had a memory problem for a long time. I've finally tracked down how to replicate the issue but I do not know what is causing it or how to fix it.
We have a number of CFCs in a web-accessible /controller directory that handle form submits and processing. When a CFC is called directly with no method argument, the server begins to chew up memory.
For instance, a URL like http://www.domain.com/controller/LoginController.cfc will run in the browser until it times out. The /CFIDE directory has been locked down and is not publicly accessible,
so the cfcexplorer is not (or should not be) available.
We use FusionReactor to monitor our instances. Our servers are set for 20GB of heap space. On a fresh restart after loading the application, memory will cruise around 800MB.
With normal traffic, memory will fluctuate between 5GB and 10GB with regular garbage collection. After a while, the server eventually reaches 98% capacity. It tends to run fine there
for hours or sometimes even days, until some spike in traffic pushes it over and an OutOfMemory error occurs. Garbage collection recovers no memory and there are no active
long-running threads reported by FusionReactor. Only a server restart will recover it.
Using FusionReactor (which we've just installed, and which is how I finally got some insight into this issue), I was inspecting the PermGen memory space and found that it accounted for
85% of the heap. This didn't seem right at all. I performed a memory dump and loaded it into MAT through Eclipse to analyze it. I found that there were 10 objects in memory
measuring 1.7GB each (10 x 1.7GB = 17GB, approx. 85% of the 20GB heap). These objects look like this:
Class Name | Shallow Heap | Retained Heap | Percentage
byte[1769628928] @ 0x4d963b198 ...128.................POST......../controller/LoginController.cfc......../controller/LoginController.cfc........173.14.93.66........173.14.93.66........www.domain.com........443........HTTP/1.1.......;D:\websites\domain\system\controller\Lo...| 1,769,628,944 | 1,769,628,944 | 8.60%
So I restarted CF on one of our servers. Checked FusionReactor and saw no memory usage. Then went to a browser and called the cfc first like this:
http://www.domain.com/controller/LoginController.cfc?method=foo
This resulted in the onMissingMethod handler properly kicking in and redirecting to the appropriate error page, with no effect on the server.
But then calling this:
http://www.domain.com/controller/LoginController.cfc
results in a page hang. FusionReactor reports no active requests even though one is running, which is why we couldn't identify the problem while it was happening. Worse, refreshing the memory view shows it slowly increasing by tenths of a percent with no reported activity. The timeout on the server is set to 5 minutes. I'm assuming that eventually the request gets killed and its buffer is orphaned at 1.7GB. This didn't bring down the server; it just spiked the memory, which now runs at a flat 3GB where garbage collection recovers nothing. This seems to explain why, over time, random calls to these URLs slowly chew up and hold onto memory.
Next I called the URL from multiple browser tabs. This spiked the memory almost instantaneously to 98%. FusionReactor now showed two long-running requests at 10 seconds and climbing, even though there were over 15 browser tabs running. Force-killing the thread seemed to do nothing. Only a server restart solved the problem.
So now I've identified the issue specifically (phantom threads creating huge orphaned objects in PermGen heap) and how to replicate the issue.
How or why requests are made directly to the cfc I have no idea. Possibly bots or occasional weird browser behavior.
All the huge objects are instances of jrun.servlet.jrpp.ProxyEndpoint.
What specifically is causing this issue, and how do I fix it?
This is CF 9.0.1 Standard on Win2003 Server running Java 1.7.0_25.
Thanks!
I know it'd represent a big shift in how you do things, but I've always avoided allowing CF to unnecessarily create CFCs. Unless they've changed how they do things (I last played with this several versions ago), hitting the CFC directly causes a new instance to be created.
If you're up for a small test, maybe try setting up a simple front controller/delegate .cfm page and moving the CFCs within 'controller' to the application scope. There are certainly more elegant architectures to handle it (short of moving to a full-bore framework), but you could:
Use Application.cfc to set an instance of something (like LoginController) into the application scope and then use a simple "invoke.cfm" page that basically expects the name of one of these application-scoped CFCs to invoke along with parameters. Something like (just for example's sake):
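A rough, untested sketch of that idea. The application name, the "controller.LoginController" component path, the application.controllers struct, the url.controller / url.method parameter names and the /error.cfm page are all placeholders I made up for illustration, so adjust them to your setup.

```cfm
<!--- Application.cfc (sketch; names are assumptions, not your real config) --->
<cfcomponent output="false">
    <cfset this.name = "domainApp">

    <cffunction name="onApplicationStart" returntype="boolean" output="false">
        <!--- Create each controller once and keep it in the application scope. --->
        <cfset application.controllers = structNew()>
        <cfset application.controllers.LoginController = createObject("component", "controller.LoginController")>
        <cfreturn true>
    </cffunction>
</cfcomponent>
```

```cfm
<!--- invoke.cfm (sketch): e.g. /invoke.cfm?controller=LoginController&method=login --->
<cfparam name="url.controller" type="string" default="">
<cfparam name="url.method" type="string" default="">

<!--- Reject unknown controllers and requests with no method up front. --->
<cfif NOT structKeyExists(application.controllers, url.controller) OR NOT len(trim(url.method))>
    <cflocation url="/error.cfm" addtoken="false">
</cfif>

<!--- Pass URL and form values through as named arguments, minus the routing keys. --->
<cfset args = structNew()>
<cfset structAppend(args, url)>
<cfset structAppend(args, form)>
<cfset structDelete(args, "controller")>
<cfset structDelete(args, "method")>
<cfset structDelete(args, "fieldnames")>

<cfinvoke component="#application.controllers[url.controller]#"
          method="#url.method#"
          argumentcollection="#args#"
          returnvariable="result">
```

With something like this in place, the .cfc files themselves no longer need to be web accessible, so a bare request to /controller/LoginController.cfc could simply be blocked at the web server. In real use you'd probably also want to check url.method against a list of allowed method names rather than invoking whatever arrives in the URL.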
Note that this'd cause your 'controllers' to be stateful, so thread safety would need to be considered (but it should be already, anyhow).