Hello veteran R users,
I'm quite new to R and am wondering if there's any way to parallelize my process. My dataset is derived from a pcap file, from which I've extracted the packets belonging to a particular protocol, MODBUS/TCP. There are over 800k packets, and every two consecutive packets correspond to the query and response of a single (i.e., the same) MODBUS transaction.
Since some values appear only in the query and others only in the response, I wrote an initial for loop that walks through the table line by line to "line up" the data, so that I end up with a single row per transaction containing the variables from both the query and response rows. The only way to tell a query from a response is the source/destination port number, which I check in conditional if statements.
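Roughly, the loop looks like this (a simplified sketch: the column names `tid`, `src_port`, `dst_port`, `value` are placeholders, and the real data has many more columns):

```r
library(data.table)

# Simplified stand-in for my packet table; rows alternate strictly:
# query row, then its matching response row.
pkts <- data.table(tid      = rep(1:3, each = 2),
                   src_port = rep(c(1024L, 502L), 3),
                   dst_port = rep(c(502L, 1024L), 3),
                   value    = c(NA, 10, NA, 20, NA, 30))

npairs <- nrow(pkts) / 2
merged <- data.table(tid        = integer(npairs),   # preallocated result
                     query_port = integer(npairs),
                     resp_value = numeric(npairs))

for (j in seq_len(npairs)) {
  i <- 2L * j - 1L                                   # index of the query row
  if (pkts$dst_port[i] == 502L) {                    # dst port 502 => query
    set(merged, j, "tid",        pkts$tid[i])
    set(merged, j, "query_port", pkts$src_port[i])
    set(merged, j, "resp_value", pkts$value[i + 1L]) # taken from the response
  }
}
```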
I'm using data.table, setting keys, and preallocating the merged results table. Functions applied to whole vectors (columns of the result data.table) execute quickly; it's the row-by-row loop that's slow.
My PC runs Debian wheezy and has 4 cores. Since each output row depends on two consecutive input rows, my understanding from what I've read is that the loop can't simply be parallelized as-is. However, is there some way I can partition the dataset, process the partitions in parallel, and then merge the results? The run took over 3 hours, so perhaps there's some other optimization I can apply.
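For example, would something along these lines be sound? (`pair_up()` stands in for my existing pairing logic, and chunks are cut on pair boundaries so a query/response pair never straddles two chunks; again the columns are placeholders):

```r
library(parallel)
library(data.table)

# Hypothetical partition-and-merge sketch; pkts stands in for the full
# packet table, with rows strictly alternating query/response.
pkts <- data.table(tid   = rep(1:8, each = 2),
                   value = as.numeric(1:16))

pair_up <- function(chunk) {
  q <- chunk[seq(1, .N, by = 2)]                 # query rows
  r <- chunk[seq(2, .N, by = 2)]                 # response rows
  data.table(tid = q$tid, resp_value = r$value)
}

# Assign each query/response pair (two consecutive rows) to one of 4
# contiguous chunks so that no pair straddles a chunk boundary.
n_pairs  <- nrow(pkts) / 2
chunk_of <- rep(cut(seq_len(n_pairs), breaks = 4, labels = FALSE), each = 2)
chunks   <- split(pkts, chunk_of)

# Process the chunks on 4 cores, then recombine in order.
merged <- rbindlist(mclapply(chunks, pair_up, mc.cores = 4))
```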
Any guidance/pointers greatly appreciated. Thanks!
Update: I've since reimplemented the code in C, and have discovered Rcpp, which I'm currently exploring. This seems to be the way to go.
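As a first Rcpp experiment, I'm sketching the pairing loop like this (column names are placeholders; compiled from R with `Rcpp::sourceCpp("pair_up.cpp")`):

```cpp
// pair_up.cpp -- a sketch of the pairing loop in C++ via Rcpp.
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
DataFrame pair_up(IntegerVector tid, IntegerVector src_port,
                  NumericVector value) {
  int n = tid.size() / 2;              // one output row per query/response pair
  IntegerVector out_tid(n), out_port(n);
  NumericVector out_value(n);
  for (int i = 0; i < n; ++i) {
    out_tid[i]   = tid[2 * i];         // fields from the query row
    out_port[i]  = src_port[2 * i];
    out_value[i] = value[2 * i + 1];   // field from the response row
  }
  return DataFrame::create(_["tid"]        = out_tid,
                           _["query_port"] = out_port,
                           _["resp_value"] = out_value);
}
```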