I have a question. If a thread modifies a variable, will a thread on the same physical core (running on the other hyper-thread) see the modification earlier than threads on other cores? Or does it have to wait until all the other cores see it?
I've been trying to pin two threads to the same physical core, but I get a performance degradation. I know that's because the two hardware threads share a lot of resources. But in terms of synchronization, will it help to put the threads on the same physical core?
Thanks!
The answer depends on the platform (especially the underlying architecture). That being said, on the (mainstream) x86-64 architecture, threads sharing the same core communicate faster than threads on different cores or even different sockets. One main reason is that the two threads will often share the same L1 cache (and if not, the L2 cache). Thus, one thread can directly read what the other just wrote. Moreover, the threads can often run in parallel thanks to simultaneous multithreading (called Hyper-Threading on Intel CPUs), which reduces the communication latency (there is no scheduling quantum to wait for). Meanwhile, threads on different cores have to communicate through a (slow) bus or share data using the L3 cache (significantly slower than the L1/L2).
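To check which logical CPUs actually share a physical core before pinning, one option on Linux is to read the sysfs topology files. The following is only a minimal sketch, assuming Linux with sysfs mounted at `/sys`; the helper names `parse_cpu_list`, `siblings_of` and `pin_to_core_of` are made up for this illustration.

```python
import os

def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,4" or "0-3,8" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

def siblings_of(cpu):
    """Return the logical CPUs that share a physical core with `cpu` (Linux only)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    with open(path) as f:
        return parse_cpu_list(f.read())

def pin_to_core_of(cpu):
    """Restrict the caller to the hardware threads of the core that owns `cpu`."""
    sibs = siblings_of(cpu)
    # os.sched_setaffinity is only available on some Unix platforms (e.g. Linux);
    # pid 0 means "the caller".
    os.sched_setaffinity(0, set(sibs))
    return sibs
```

For example, on a machine where `siblings_of(0)` returns two CPU ids, pinning one thread to each of them places both on the same physical core. The actual ids depend on how the kernel enumerates the CPUs, so they should be read rather than assumed.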
When your workload is bound by communication (latency or throughput), it is often better to put threads close to each other (i.e. on the same core). When the number of threads per core exceeds the number of hardware threads, performance decreases due to preemptive multitasking. When the workload is compute-bound, it is better to put the threads on separate cores. Note that on modern x86 processors, threads running on the same core can even share the computing resources (ALUs) at the instruction level.
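The placement rule above can be sketched as a small helper. This is just an illustration under stated assumptions: `choose_cpus` is a hypothetical function, and `cores` is assumed to be a list with one inner list of logical CPU ids per physical core (for instance built from the sysfs topology files on Linux).

```python
def choose_cpus(cores, n_threads, communication_bound):
    """Pick logical CPU ids for n_threads.

    cores: list of lists of logical CPU ids, one inner list per physical core
           (assumed input format for this sketch).
    """
    if communication_bound:
        # Pack threads: fill all hardware threads of a core before using the next core.
        order = [cpu for core in cores for cpu in core]
    else:
        # Spread threads: first hardware thread of every core, then the second, ...
        depth = max((len(core) for core in cores), default=0)
        order = [core[i] for i in range(depth) for core in cores if i < len(core)]
    if n_threads > len(order):
        # More software threads than hardware threads means time-slicing.
        raise ValueError("more threads than hardware threads available")
    return order[:n_threads]

# Hypothetical 4-core machine with 2 hardware threads per core.
cores = [[0, 4], [1, 5], [2, 6], [3, 7]]
packed = choose_cpus(cores, 2, communication_bound=True)    # same core
spread = choose_cpus(cores, 2, communication_bound=False)   # separate cores
```

With the hypothetical topology above, the communication-bound case puts both threads on one core and the compute-bound case puts them on two different cores. Whether this actually helps should still be measured on the target machine.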