I've seen benchmarks on Yesod's homepage, but they are mostly for static files. And the benchmarks on Snap's website are outdated.
I'm trying to expose a Haskell module as a service. The server's logic is to receive the function name and arguments in JSON, invoke the corresponding Haskell function, and deliver the output as JSON. Referential transparency guarantees thread safety and makes it possible to memoize and cache function results.
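Here is a rough sketch of what I have in mind, using Warp and aeson purely as an example (the dispatch table and the "add"/"reverse" functions are placeholders, not my real module):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Aeson         as A
import           Data.Aeson         ((.:))
import           Data.Aeson.Types   (parseMaybe)
import qualified Data.Map           as M
import           Network.HTTP.Types (status200, status400, status404)
import           Network.Wai        (Application, responseLBS, strictRequestBody)
import           Network.Wai.Handler.Warp (run)

-- A request carries a function name and its arguments as a JSON value.
data Call = Call String A.Value

instance A.FromJSON Call where
  parseJSON = A.withObject "Call" $ \o ->
    Call <$> o .: "name" <*> o .: "args"

-- Pure dispatch table; because the functions are referentially transparent,
-- their results could also be memoized or cached.
functions :: M.Map String (A.Value -> Maybe A.Value)
functions = M.fromList
  [ ("add",     \v -> do (x, y) <- parseMaybe A.parseJSON v
                         pure (A.toJSON (x + y :: Int)))
  , ("reverse", \v -> do s <- parseMaybe A.parseJSON v
                         pure (A.toJSON (reverse (s :: String))))
  ]

app :: Application
app req respond = do
  body <- strictRequestBody req
  case A.decode body of
    Nothing -> respond (responseLBS status400 [] "malformed request")
    Just (Call name args) ->
      case M.lookup name functions >>= ($ args) of
        Nothing  -> respond (responseLBS status404 [] "unknown function or bad arguments")
        Just out -> respond (responseLBS status200
                               [("Content-Type", "application/json")]
                               (A.encode out))

main :: IO ()
main = run 3000 app
```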
If I were to support concurrent connections on the order of 2k-5k, how would I go about implementing it? How scalable can this approach be?
I would highly recommend making the choice between Warp/Yesod and Snap based on which system provides you with the best set of tools for creating your application. Both Warp and Snap use the same underlying GHC I/O manager, and both are highly optimized. I would be surprised if a well-written application for either system, doing anything non-trivial, showed a significant performance gap.
Your last paragraph is a bit vague, but I think the basic answer for either Warp or Snap is to just write your code, and the I/O manager will scale as well as possible. If you really find concurrent connections to be the bottleneck, you could consider trying out the prefork technique, using GHC 7.8 (not yet released, but it has a much improved I/O manager), or using multiple servers.
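To make the "just write your code" point concrete, here is a minimal sketch with Warp (the port, the placeholder handler, and the build/RTS flags are illustrative only): accept connections as usual and let the threaded runtime and the I/O manager do the multiplexing.

```haskell
{-# LANGUAGE OverloadedStrings #-}

-- Compile with:  ghc -O2 -threaded server.hs
-- Run with:      ./server +RTS -N
import Control.Concurrent (getNumCapabilities)
import Network.HTTP.Types (status200)
import Network.Wai (Application, responseLBS)
import Network.Wai.Handler.Warp (defaultSettings, runSettings, setPort)

-- Placeholder handler; in practice this would be the JSON dispatch
-- application from the question.
app :: Application
app _req respond =
  respond (responseLBS status200 [("Content-Type", "text/plain")] "ok")

main :: IO ()
main = do
  caps <- getNumCapabilities
  putStrLn ("Listening on port 3000 with " ++ show caps ++ " capabilities")
  -- Warp handles each connection in a lightweight (green) thread, and the
  -- GHC I/O manager multiplexes those threads over epoll/kqueue, so a few
  -- thousand concurrent connections require nothing special in the handler.
  runSettings (setPort 3000 defaultSettings) app
```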