I had a look at the SSE and MMX instruction sets, and there are no instructions for 3-channel image processing. Of course, for many operations, such as averaging two images, you can use the same instructions regardless of the pixel layout. But when it comes to operations like unshuffling the channels or mixing different channels by a linear transformation, it seems a lot easier to use 32-bit images.
How do the performance characteristics of typical image processing tasks compare for 24-bit vs. 32-bit images?
24 bit/pixel images are faster if your images are large and the operations are simple (such as alpha blending).
Very often the operations in image processing are quite simple, but you execute millions of them. So the time spent moving data between main memory and the CPU can easily dominate the performance of an algorithm.
Therefore 24 bit/pixel images can offer an advantage over 32 bit/pixel images, because there is 25% less data to move around.
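To put a number on that 25%, here is a minimal sketch of the arithmetic. The helper name `image_bytes` is made up for this illustration, and it assumes tightly packed rows with no padding, which real image formats do not always guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes occupied by an image, assuming tightly
 * packed rows (no per-row padding or stride). */
static size_t image_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    /* Only the two layouts discussed here: 24 bit/pixel and 32 bit/pixel. */
    assert(bytes_per_pixel == 3 || bytes_per_pixel == 4);
    return width * height * bytes_per_pixel;
}
```

For a 1920x1080 image this gives 6,220,800 bytes at 24 bit/pixel and 8,294,400 bytes at 32 bit/pixel, so every pass over the image touches about 2 MB less memory in the 24-bit case.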
Writing image-processing code that performs well with 24 bit/pixel is a pain though. The SSE instructions don't really fit the data, so you have to shuffle bytes around, and then you have to deal with all the different alignments.
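The byte shuffling looks roughly like the scalar sketch below, which expands packed 3-byte pixels into 4-byte pixels and back. The function names and the choice of 0xFF for the padding byte are assumptions for illustration. A SIMD version would do the same rearrangement on 16 bytes at a time with a byte-shuffle instruction (for example SSSE3 `pshufb`), and it also has to cope with the fact that in a 24-bit buffer pixel i starts at byte 3*i, so only every 16th pixel begins on a 16-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: expand packed 24 bit/pixel data to 32 bit/pixel.
 * The fourth byte of each output pixel is padding, set to 0xFF here. */
static void rgb24_to_rgbx32(const uint8_t *src, uint8_t *dst, size_t pixel_count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < pixel_count; ++i) {
        dst[4 * i + 0] = src[3 * i + 0];
        dst[4 * i + 1] = src[3 * i + 1];
        dst[4 * i + 2] = src[3 * i + 2];
        dst[4 * i + 3] = 0xFF; /* padding byte */
    }
}

/* Hypothetical helper: pack 32 bit/pixel data back to 24 bit/pixel,
 * dropping the padding byte of each pixel. */
static void rgbx32_to_rgb24(const uint8_t *src, uint8_t *dst, size_t pixel_count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < pixel_count; ++i) {
        dst[3 * i + 0] = src[4 * i + 0];
        dst[3 * i + 1] = src[4 * i + 1];
        dst[3 * i + 2] = src[4 * i + 2];
    }
}
```

One possible use is to unpack a block of pixels into a small 32-bit scratch buffer, run the SSE-friendly code on that, and pack the result again, although the conversion itself costs time that has to be weighed against the memory traffic saved.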
If the images you are working with are small and fit in the L1 or L2 cache, things are different and the CPU time will dominate the performance. In these cases 32 bit/pixel performs faster.