Multi-threaded ray tracer significantly faster under Mavericks than Yosemite


I'm writing a path tracer (ray tracer) to teach myself Swift programming. Ray tracing is well suited to parallelization because each pixel can be rendered independently. Here's how I set up my main loop, which allows up to 8 pixel tasks to be in flight at once.

let group: dispatch_group_t = dispatch_group_create()
let queue: dispatch_queue_t = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0)
// Serial queue that serializes writes to the shared bitmap
let lockQueue: dispatch_queue_t = dispatch_queue_create("lockQueue", nil)
// Counting semaphore that caps the number of in-flight tasks at 8
let semaphore: dispatch_semaphore_t = dispatch_semaphore_create(8)

for every_pixel {  // pseudocode: iterate over every pixel of the image
    // Block until one of the 8 slots is free
    dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER)
    dispatch_group_enter(group)
    dispatch_group_async(group, queue) { () -> Void in
        let pixelColor = castRayAndIntersectObjects()

        dispatch_sync(lockQueue) { () -> Void in
            PutPixelOnBitmap(pixelColor)
        }

        dispatch_group_leave(group)
        dispatch_semaphore_signal(semaphore)
    }
}

// Wait for all outstanding pixel tasks to finish
dispatch_group_wait(group, DISPATCH_TIME_FOREVER)
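
For comparison, here is a minimal sketch of a coarser-grained variant that dispatches one task per row instead of one per pixel. It is written in current Swift GCD syntax rather than the C-style calls above. `shadePixel` and `render` are hypothetical names, with `shadePixel` standing in for `castRayAndIntersectObjects()`. Each row writes only to its own slice of the pixel buffer, so this sketch needs no lock queue or semaphore. It is meant as something to compare against, not as a confirmed fix for the slowdown.

```swift
import Dispatch

// Hypothetical stand-in for the real per-pixel work (castRayAndIntersectObjects).
func shadePixel(x: Int, y: Int) -> UInt32 {
    return UInt32(truncatingIfNeeded: x &* 31 &+ y &* 17)
}

// Renders the image with one GCD work item per row.
func render(width: Int, height: Int) -> [UInt32] {
    var pixels = [UInt32](repeating: 0, count: width * height)
    pixels.withUnsafeMutableBufferPointer { buffer in
        guard let base = buffer.baseAddress else { return }
        // concurrentPerform blocks until every iteration has finished,
        // and GCD decides how many threads to use.
        DispatchQueue.concurrentPerform(iterations: height) { y in
            for x in 0..<width {
                // Rows write to disjoint index ranges, so no lock is needed.
                base[y * width + x] = shadePixel(x: x, y: y)
            }
        }
    }
    return pixels
}

let image = render(width: 64, height: 48)
print(image.count)
```

With whole rows as the unit of work, the per-task dispatch overhead is paid once per row rather than once per pixel, and the bitmap write no longer goes through a serial queue.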

After installing OS X Yosemite, I noticed a huge performance drop. A render that took 28 seconds under Mavericks now takes 70 seconds on my quad-core MacBook Pro. Activity Monitor shows the process using about 750% CPU under Mavericks, but only about 200% under Yosemite. It's as if Yosemite is only able to run two threads at a time.

I'm using the -O (fastest) optimization level for Swift, and I'm running the exact same Swift executable on both operating systems.

If I modify my code to be a single-threaded application, I get the same render time on both operating systems. So it seems to be a threading problem.
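
A small timing harness along these lines makes the single-threaded versus multi-threaded comparison repeatable on both systems. This is only a sketch in current Swift syntax: `traceRow` is a hypothetical CPU-bound stand-in for tracing one row, `secondsElapsed` is a hypothetical helper, and the workload size is arbitrary.

```swift
import Dispatch

// Hypothetical CPU-bound workload standing in for tracing one row of pixels.
func traceRow(_ y: Int) -> Double {
    var sum = 0.0
    for i in 1...20_000 {
        sum += Double(i + y).squareRoot()
    }
    return sum
}

// Runs `body` once and returns the elapsed wall-clock time in seconds.
func secondsElapsed(_ body: () -> Void) -> Double {
    let start = DispatchTime.now().uptimeNanoseconds
    body()
    let end = DispatchTime.now().uptimeNanoseconds
    return Double(end - start) / 1_000_000_000
}

let rows = 256
var serial = [Double](repeating: 0, count: rows)
var parallel = [Double](repeating: 0, count: rows)

// Single-threaded baseline.
let serialTime = secondsElapsed {
    for y in 0..<rows { serial[y] = traceRow(y) }
}

// Same work spread across threads by GCD.
let parallelTime = secondsElapsed {
    parallel.withUnsafeMutableBufferPointer { buffer in
        guard let base = buffer.baseAddress else { return }
        DispatchQueue.concurrentPerform(iterations: rows) { y in
            base[y] = traceRow(y)  // disjoint indices, so no lock is needed
        }
    }
}

print("serial: \(serialTime) s, parallel: \(parallelTime) s")
print("speedup: \(serialTime / parallelTime)x")
```

If the threading itself is healthy, the reported speedup should approach the number of cores. A speedup near 2x on a quad-core machine would match the 200% CPU figure seen under Yosemite.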

Did I set up multi-threading correctly? I'm wondering if I'm doing something wrong, or if I discovered a problem with Yosemite.

More information (November 16, 2014):

When I run the profiler on Yosemite, I see that the threads are blocking on a lock: most of the time is spent in __psynch_mutexwait and __psynch_mutexdrop.

On Mavericks, the profiler doesn't show __psynch_mutexwait and __psynch_mutexdrop at all, which is why the render runs significantly faster there. So the question is: why would something lock on Yosemite, and why does it not lock on Mavericks?
