I've done a lot of reading on the two subjects, and I still cannot quite figure it out. From what I understand, Perlin noise (in 2D) uses a square grid, and you get the value at a point by calculating the contribution of each of the four corners of the square you are in.
Simplex noise would be, from what I understand, also a square grid (in 2D). Instead of getting the value by calculating the contribution of the surrounding four corners, you split the square into two triangles and get the contribution from the three corners of the triangle you are currently in.
Do I understand this correctly? If so, isn't this just another way to calculate the contribution of the corners, and not another way of generating noise?
Half right. Simplex noise also sums contributions from corners, but in 2D the actual cell shape is an equilateral triangle, not half of a square. (The bit about half squares in Gustavson's 2005 paper refers to skewed space; the skew is just a convenient way for the computer to figure out which triangle a point is in.)
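To make the skewing trick concrete, here is a minimal Python sketch of the 2D cell lookup as I understand it from Gustavson's description. The function name simplex_corners and the helper unskew are my own, not from any library, and this only finds the triangle; it does not compute noise.

```python
import math

# Standard 2D skew / unskew factors.
F2 = (math.sqrt(3.0) - 1.0) / 2.0
G2 = (3.0 - math.sqrt(3.0)) / 6.0

def simplex_corners(x, y):
    """Return the three corners (in ordinary, unskewed space) of the
    triangle containing (x, y). Hypothetical helper for illustration."""
    # Skew the point so that the triangle pairs become unit squares.
    s = (x + y) * F2
    i = math.floor(x + s)
    j = math.floor(y + s)

    # Offset of the point from the unskewed cell origin.
    t = (i + j) * G2
    x0 = x - (i - t)
    y0 = y - (j - t)

    # Each skewed square is two triangles; pick the lower or upper one.
    i1, j1 = (1, 0) if x0 > y0 else (0, 1)

    def unskew(ci, cj):
        u = (ci + cj) * G2
        return (ci - u, cj - u)

    return [unskew(i, j), unskew(i + i1, j + j1), unskew(i + 1, j + 1)]
```

If you measure the distances between the three returned corners, they all come out equal, which is the point: the "half square" only exists in skewed coordinates, and the cell you actually evaluate in is equilateral.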
Because the corners are now in different places and are blended differently, the resulting noisy image has different visual properties, and is thus considered a different type of noise. In particular, one will find triangular 60-degree artifacts in simplex noise, which the eye is less trained to notice (as demonstrated in formal gardening), instead of the right-angle artifacts of classic Perlin noise. The circular kernel (each corner's contribution fades out with distance from that corner, the same in every direction) also adds lumpiness to the image.
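For what "circular kernel" means in practice, here is a rough Python sketch of a single corner's contribution in 2D simplex noise. The names kernel and corner_contribution are made up for illustration, the gradient (gx, gy) is assumed to come from whatever hash or permutation table the implementation uses, and the 0.5 radius constant follows Gustavson's reference code (some implementations use 0.6 instead).

```python
def kernel(dx, dy, r2=0.5):
    """Radially symmetric falloff: depends only on the squared distance
    from the corner, and is exactly zero beyond the radius."""
    t = r2 - dx * dx - dy * dy
    if t <= 0.0:
        return 0.0
    return t ** 4

def corner_contribution(dx, dy, gx, gy):
    """Contribution of one corner, where (dx, dy) is the offset of the
    sample point from that corner and (gx, gy) is the corner's gradient
    (assumed to be supplied by the caller)."""
    return kernel(dx, dy) * (gx * dx + gy * dy)
```

The noise value is then the sum of corner_contribution over the three corners of the triangle, times a scaling constant. Compare that with classic Perlin noise, where the four corner values are blended with a fade curve applied separately along x and y; that per-axis interpolation is where the axis-aligned look comes from.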