Hope some of you can give some pointers on this one.
I generate code that has to make calls to remote resources like web services or databases.
Consider this piece of code:
    using System.Collections.Generic;
    using System.Linq;

    class Parent {
        IEnumerable<Child> Children;

        int SumChildren() {
            // note the AsParallel
            return Children.AsParallel().Sum(c => c.RemoteCall());
        }
    }

    class Child {
        public int RemoteCall() {
            // call some webservice. I'd like to pool these calls
            // without having to rewrite the rest of this code
            throw new System.NotImplementedException(); // placeholder for the remote call
        }
    }
For 50 children, it's going to make 50 calls to the service, paying the per-call overhead 50 times. In my real-life cases this could easily be a million calls, bringing the whole thing to a crawl.
What I would like to do is batch these calls in a way that is transparent to the calling thread/task. So instead of calling the service directly, it calls some central queue (the 'train station') that batches these calls.
When it does that, the calling task blocks. The queue then waits for X calls to accumulate and makes one call to the remote service with a list of requests.
When the result comes back, the queue hands each return value to the right task and unblocks it. For the calling thread, all of this remains hidden and it looks like just another function call.
Can this be done? Are there primitives in the TPL that will let me do this?
It kinda smells like the CCR with lots of things going on at the same time waiting for other stuff to complete.
I could of course rewrite this code to build the list of requests in the Parent class and then call the service once. The thing is that in my real problem all this code is generated, so I would have to 'look inside' the implementation of Child.RemoteCall, making this a whole lot more complicated than it already is. Also, the Child could be a proxy to a remote object, etc. That would be very hard, if doable at all; I'd rather isolate this complexity.
Hope this makes sense to someone. If not, let me know and I'll elaborate.
If the queue receives x calls (x < X) then the calling task will block until another task pushes the total >= X. If you only have one task that wants to make N * x calls, it will get stuck.
If your application usually has a lot of tasks running, then you might only see this problem intermittently, for example under unusually low load or during a clean shutdown.
You could solve this by adding a timeout, so that the queue sends the batched requests anyway if no request has been added within a time limit, and/or the first request has been waiting longer than a time limit.
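As a rough illustration of the size-or-timeout idea, here is a minimal sketch of such a queue. The names `BatchingQueue`, `Call` and `sendBatch` are made up for this example, and `sendBatch` stands in for whatever batched endpoint the remote service actually offers (assumed here to return results in the same order as the requests). The timeout variant shown is "first request has been waiting longer than a limit".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical "train station": callers block in Call() until their batch is answered.
public sealed class BatchingQueue<TRequest, TResult>
{
    private sealed class Pending
    {
        public TRequest Request;
        public TaskCompletionSource<TResult> Completion;
    }

    private readonly object _gate = new object();
    private readonly int _batchSize;
    private readonly TimeSpan _maxWait;
    // Assumption: the service accepts a list and answers in request order.
    private readonly Func<IReadOnlyList<TRequest>, IReadOnlyList<TResult>> _sendBatch;
    private readonly Timer _timer; // not disposed in this sketch
    private List<Pending> _pending = new List<Pending>();

    public BatchingQueue(int batchSize, TimeSpan maxWait,
        Func<IReadOnlyList<TRequest>, IReadOnlyList<TResult>> sendBatch)
    {
        _batchSize = batchSize;
        _maxWait = maxWait;
        _sendBatch = sendBatch;
        // Created disarmed; armed when the first request of a batch arrives.
        _timer = new Timer(_ => Flush(), null,
            Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan);
    }

    // Looks like an ordinary synchronous call to the caller.
    public TResult Call(TRequest request)
    {
        var completion = new TaskCompletionSource<TResult>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        bool full;
        lock (_gate)
        {
            _pending.Add(new Pending { Request = request, Completion = completion });
            full = _pending.Count >= _batchSize;
            if (_pending.Count == 1 && !full)
                _timer.Change(_maxWait, Timeout.InfiniteTimeSpan);
        }
        if (full) Flush(); // the caller that fills the batch sends it
        return completion.Task.GetAwaiter().GetResult(); // block until answered
    }

    // Sends whatever is pending, whether the batch is full or the timer fired.
    private void Flush()
    {
        List<Pending> batch;
        lock (_gate)
        {
            if (_pending.Count == 0) return;
            batch = _pending;
            _pending = new List<Pending>();
            _timer.Change(Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan);
        }
        try
        {
            var results = _sendBatch(batch.Select(p => p.Request).ToList());
            for (int i = 0; i < batch.Count; i++)
                batch[i].Completion.SetResult(results[i]);
        }
        catch (Exception ex)
        {
            // Propagate a failed batch to every caller still waiting on it.
            foreach (var p in batch) p.Completion.TrySetException(ex);
        }
    }
}
```

Note that with `AsParallel()` only a limited number of calls are in flight at once (roughly one per core by default), so a batch size larger than that would only ever be flushed by the timeout. Blocking worker threads like this is also not free; an async version would scale better, but it would not be transparent to the generated code.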
Perhaps you are on the right track with this approach. Could you find a way of replacing the generated method implementation with a hand-coded one, whether by delegation, inheritance, a lambda, or by enhancing your generator?
One point that I'm not quite clear on is which parts of the code are generated (hard to modify) and which parts of the code can be modified to solve this problem?
If it's neither of the above then you have to be able to modify something in order to solve the problem. Are the Parent and Child instances built by an AbstractFactory? If so, then it might be possible to insert a proxy to the Child instances that can be used to modify the non-functional aspects of their behavior(s).
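To make the proxy idea concrete, here is a small sketch. It assumes something the question does not confirm: that callers can be made to depend on an interface (called `IChild` here) and obtain instances from a factory. `GeneratedChild`, `BatchingChildProxy`, `ChildFactory` and the `enqueueAndWait` delegate are all hypothetical names; the delegate stands in for whatever batching mechanism is used.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumption: generated code talks to children through this interface.
public interface IChild
{
    int RemoteCall();
}

// Stand-in for the generated class; the real one would call the web service directly.
public sealed class GeneratedChild : IChild
{
    public int Id { get; }
    public GeneratedChild(int id) { Id = id; }
    public int RemoteCall() => Id * 10; // placeholder for the direct remote call
}

// Hand-written proxy: same interface, but the call is handed to a batching
// mechanism supplied from outside instead of going to the service directly.
public sealed class BatchingChildProxy : IChild
{
    private readonly int _id;
    private readonly Func<int, int> _enqueueAndWait;

    public BatchingChildProxy(int id, Func<int, int> enqueueAndWait)
    {
        _id = id;
        _enqueueAndWait = enqueueAndWait;
    }

    public int RemoteCall() => _enqueueAndWait(_id);
}

// Hypothetical factory: the one place that decides which implementation callers get.
public sealed class ChildFactory
{
    private readonly Func<int, int> _enqueueAndWait;

    // Pass null to get the plain generated children (no batching).
    public ChildFactory(Func<int, int> enqueueAndWait = null)
    {
        _enqueueAndWait = enqueueAndWait;
    }

    public IChild Create(int id) =>
        _enqueueAndWait == null
            ? (IChild)new GeneratedChild(id)
            : new BatchingChildProxy(id, _enqueueAndWait);
}
```

With this in place, the Parent code stays as it is (`Children.AsParallel().Sum(c => c.RemoteCall())`), and whether the calls are batched is decided only where the factory is constructed.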