I've a question regarding to a statement written in this paper, "Generalized Integrated Interleaved Codes". The paper mentions that erasure decoding of Reed-Solomon (RS) code incurs no miscorrection, but error-only decoding of RS code incurs a high miscorrection rate if the correction capability is too low.
From my understanding, I think the difference between erasure decoding and error-only decoding is that erasure decoding does not require to compute the error locations. On the other hand, error-only decoding requires to know the error locations, which can be computed by Berlekamp–Massey algorithm. I wonder if the miscorrection for error-only decoding comes from computing the wrong error locations? If yes, why the miscorrection rate is related to the correction capability of the RS code?
Yes. For example, consider an RS code with 6 parities, which can correct 3 errors. Assume that 4 errors have occurred, and that a 3 error correction attempt created an additional 3 errors, for a total of 7 errors. It will produce a valid codeword, but the wrong codeword.
There are situations where the probability of miscorrection can be lowered. If the message is a shortened message, say 64 bytes of data and 6 parities for a total of 70 bytes, then if a 3 error case produces an invalid location, miscorrection can be avoided. In this case, the odds of 3 random locations being valid is (70/255)^3 ~= .02 (2%).
Another way to avoid miscorrection is to not use all of the parities for correction. With 6 parities, the correction could be limited to 2 errors, leaving 2 parities for detection purposes. Or use 7 parities, for 3 error correction, with 1 parity used for detection.
Follow up based on comments:
First note that there are 3 decoders that can be used for BCH view Reed Solomon: PGZ (Peterson Gorenstein Zierler) matrix decoder, BKM (Berlekamp Massey) discrepancy decoder , and Sugiyama's extended Euclid decoder. PGZ has greater time complexity O((n-k)^3) than BKM or Euclid, so most implementations use BKM or Euclid. You can read a bit more about these decoders here:
https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction
Getting back to 6 parities, 4 errors. All valid RS(n+6, n) codewords differ from each other by at least 7 elements. If 4 of the elements of a message are in error, there may be a valid codeword that differs by only 3 more elements from that message with 4 error elements, and in this case, all 3 decoders will find that the message differs from a valid codeword by 3 elements, and "correct" those 3 elements to produce a valid codeword, but in this case the wrong valid codeword, which will differ by 7 elements from the original codeword. With 5 elements in error, a 2 or 3 error miscorrection could occur, and with 6 or more elements in error, a 1 or 2 or 3 error miscorrection could occur.
Invalid location - consider a RS code based on GF(2^8), which allows for a message size up to 255 bytes. Valid locations for a 255 byte message are 0 to 254. If the message size is less than 255 bytes, for example 64 data + 6 parity = 70 bytes, then locations 0 to 69 are valid, while locations 70 to 254 are invalid. In what would otherwise be a case of miscorrection, if a calculated location is out of range, then the decoder has detected an uncorrectable message, rather than miscorrect it. Assume a garbled message and that the decoder generates 3 random locations in the range 0 to 254, the probability of all 3 being in the range 0 to 69 is (70/255)^3.
Another case where miscorrection is avoided is when the number of distinct roots of the error locator polynomial does not match the degree of the polynomial. Consider a 3 error case with generated error locator polynomial x^3 + a x^2 + b x + c. If there are more than 3 errors in the message, then the generated polynomial may have less than 3 distinct roots, such as a double root, or zero roots, or ... , in which case miscorrection is avoided and the message is detected as being uncorrectable.