I have a basic question; sorry, it might be very silly and generic, but it is very important for our understanding. How does the generated C/C++ code implement parallel (AND) states? Does it use a multi-threaded approach?
I have generated C source code for a simple scenario with only 2 parallel states, as shown in this figure. The code is complicated even for this simple scenario, but I did not see any threading in it.
You are confusing parallelism with concurrency. In Simulink terms, all parallel states will be executed within a single time step. This is acceptable in simulation but not for code generation for a real-time system, where you might want to make full use of multiple execution cores. This also applies to Simulink blocks in general. Asynchronous blocks (with different sample rates) and parallel states execute sequentially on the same thread (in the step function). The only thing you have any control over is the order of execution. See below:
http://uk.mathworks.com/help/stateflow/ug/execution-order-for-parallel-states.html
For example, here is the generated code for the parallel states below:
Note that in the step() function, on lines 13 and 15, the outputs are assigned according to the execution order set in Stateflow.
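To make the single-threaded behaviour concrete, here is a minimal hand-written sketch of the pattern. It is not actual generated code: `ChartState`, `stateA_during`, `stateB_during` and `chart_step` are hypothetical names chosen for illustration, and real generated code is considerably more verbose.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a chart with two parallel (AND) states, A and B.
// All names are illustrative assumptions, not identifiers from generated code.
struct ChartState {
    int out1 = 0;
    int out2 = 0;
    std::vector<std::string> trace;  // records the order in which states ran
};

// "During" action of parallel state A (assumed execution order 1).
static void stateA_during(ChartState& c) {
    c.out1 += 1;
    c.trace.push_back("A");
}

// "During" action of parallel state B (assumed execution order 2).
static void stateB_during(ChartState& c) {
    c.out2 = c.out1 * 2;
    c.trace.push_back("B");
}

// One call corresponds to one time step. Both "parallel" states run
// back to back on the caller's thread; no threads are created.
void chart_step(ChartState& c) {
    stateA_during(c);
    stateB_during(c);
}
```

Because B runs after A within the same call, B already sees the value A wrote in that step. Swapping the two calls would change the result, which is why the execution order matters.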
You might want to consider PCT (Parallel Computing Toolbox) to see if it supports your needs.
http://uk.mathworks.com/solutions/parallel-computing/index.html
Alternatively, depending on the target hardware, you might find it suitable to manually write source code using C++11 std::thread and bring that into your simulation using S-Functions and/or the Legacy Code Tool.
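As a rough sketch of what such hand-written code could look like, assuming a C++11 compiler and a target that supports std::thread: `Outputs`, `task_a`, `task_b` and `concurrent_step` are hypothetical names, and the S-Function or Legacy Code Tool wrapper needed to call this from a model is not shown.

```cpp
#include <functional>  // std::ref
#include <thread>

// Hypothetical output structure; each task writes only its own field,
// so the two threads never touch the same memory location.
struct Outputs {
    int out1 = 0;
    int out2 = 0;
};

// Illustrative work items standing in for the bodies of two parallel states.
void task_a(Outputs& o) { o.out1 = 1; }
void task_b(Outputs& o) { o.out2 = 2; }

// Runs both tasks on separate threads and waits for both to finish
// before returning, so the caller sees a completed step.
void concurrent_step(Outputs& o) {
    std::thread ta(task_a, std::ref(o));
    std::thread tb(task_b, std::ref(o));
    ta.join();
    tb.join();
}
```

If the tasks shared data, you would need a mutex or atomics. Creating threads on every step is also expensive, so a real-time design would more likely keep long-lived worker threads. Depending on the toolchain, you may need to link against the platform thread library (for example `-pthread` with GCC).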