I'm looking for a way to automatically detect changes in the runtime performance of my code. It would work much like JUnit, but instead of testing the code's functionality it would test for sudden changes in speed. As far as I know, no tool does this automatically right now.
So the first question is: are there any tools available that will do this?
The second question is: if there are no tools available and I need to roll my own, what issues need to be addressed?
If the second question is relevant, then here are the issues that I see:
- Variability depending on the environment it is run on.
- How to detect changes, since microbenchmarks in Java have a large variance.
- If Caliper collects the results, how to get them out of Caliper so that they can be saved in a custom format. Caliper's documentation is lacking.
I don't know of any separate tools to handle this, but JUnit has an optional parameter called timeout (in milliseconds) in the @Test annotation:
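A minimal sketch of such a test, assuming JUnit 4 is on the classpath. The class name, the sumOfSquares workload, and the 500 ms limit are made up for illustration and stand in for whatever code you actually want to guard:

```java
import org.junit.Test;

import static org.junit.Assert.assertEquals;

public class SumOfSquaresPerformanceTest {

    // Hypothetical workload standing in for the code under test.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // JUnit 4 fails this test if the method runs longer than 500 ms.
    @Test(timeout = 500)
    public void sumOfSquaresFinishesWithinBudget() {
        // Still check the result, so a fast but wrong answer does not pass.
        assertEquals(385L, sumOfSquares(10));
        sumOfSquares(1_000_000);
    }
}
```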
So, you could write additional unit-tests to check that certain parts work "fast enough". Of course, you'd need to somehow first decide what is the maximum amount of time a particular task should take to run.
There will always be some variability, but to minimize it I'd use Hudson or a similar automated build and test server to run the tests, so the environment is the same each time (of course, if the server running Hudson also handles all sorts of other tasks, those tasks can still affect the results). You need to take this into account when deciding the maximum running time for a test: leave some "head room", so that a test that takes, say, 5% longer than usual doesn't fail straight away.
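One way to express the "head room" idea in code is to compare the median of several runs against a stored baseline plus a tolerance. This is only a sketch: the TimeBudget class, its method names, and the idea of using the median are my own assumptions, not part of JUnit or Hudson.

```java
import java.util.Arrays;

class TimeBudget {

    // Median of the samples; less sensitive to one unusually slow run than the mean.
    static long median(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("need at least one sample");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    // Maximum allowed time: the baseline plus head room (0.20 means 20%).
    static long budgetMillis(long baselineMillis, double headRoom) {
        return Math.round(baselineMillis * (1.0 + headRoom));
    }

    // True if the median of the measured runs exceeds the budget.
    static boolean isRegression(long[] samplesMillis, long baselineMillis, double headRoom) {
        return median(samplesMillis) > budgetMillis(baselineMillis, headRoom);
    }
}
```

With a baseline of 100 ms and 20% head room, runs of 100, 105, 300, 98 and 102 ms have a median of 102 ms, so the single 300 ms outlier does not fail the check.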
Microbenchmarks in Java are rarely reliable. I'd test larger chunks with integration tests (such as handling a single HTTP request, or whatever you have) and measure the total time. If a test fails because it takes too long, isolate the problematic code by profiling, or measure and log the running time of the separate parts of the test during the run to see which part takes the most time.
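Measuring the separate parts of a test could look roughly like this, using only System.nanoTime from the JDK. PhaseTimer and the phase names are hypothetical helpers I made up for the example:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PhaseTimer {

    // Elapsed nanoseconds per phase, kept in the order the phases first ran.
    private final Map<String, Long> elapsedNanos = new LinkedHashMap<>();

    // Runs the work and adds its elapsed time to the named phase.
    void time(String phase, Runnable work) {
        long start = System.nanoTime();
        try {
            work.run();
        } finally {
            elapsedNanos.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    // Name of the phase with the largest total time, or null if nothing was timed.
    String slowestPhase() {
        String slowest = null;
        long max = -1;
        for (Map.Entry<String, Long> entry : elapsedNanos.entrySet()) {
            if (entry.getValue() > max) {
                max = entry.getValue();
                slowest = entry.getKey();
            }
        }
        return slowest;
    }

    // Prints one line per phase with its total time in milliseconds.
    void log() {
        elapsedNanos.forEach((phase, nanos) ->
                System.out.printf("%-12s %10.3f ms%n", phase, nanos / 1_000_000.0));
    }
}
```

In a test you would wrap each step, for example timer.time("parse", () -> parse(request)), and call timer.log() at the end or whenever the total time goes over the limit.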
Unfortunately, I don't know anything about Caliper.