How to remove characters that cannot be encoded in windows-1251 from a string

I have text containing characters that cannot be represented in the windows-1251 charset. For example:

中华全国工商业联合会-HelloWorld

I have a method for converting from UTF8 to windows-1251:

static string ChangeEncoding(string text)
{
    if (string.IsNullOrEmpty(text))
        return "";
    Encoding win1251 = Encoding.GetEncoding("windows-1251");
    Encoding utf8 = Encoding.UTF8;
    byte[] utf8Bytes = utf8.GetBytes(text);
    byte[] win1251Bytes = Encoding.Convert(utf8, win1251, utf8Bytes);
    return win1251.GetString(win1251Bytes);
}

It currently returns this:

??????????-HelloWorld

I don't want to show the characters that could not be converted to windows-1251. In this case I want just:

-HelloWorld

How can I do this?

There is 1 answer

Dilshod K (best answer)

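Following @JeroenMostert's suggestion, this method worked for me. Passing an EncoderReplacementFallback with an empty string makes the encoder drop characters that windows-1251 cannot represent instead of replacing them with '?':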

    public static string ChangeEncoding(string text)
    {
        // An empty replacement string drops unencodable characters instead of emitting '?'.
        Encoding win1251 = Encoding.GetEncoding("windows-1251",
            new EncoderReplacementFallback(string.Empty), new DecoderExceptionFallback());
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return win1251.GetString(Encoding.Convert(Encoding.UTF8, win1251, utf8Bytes));
    }
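
Since a .NET string is already Unicode, the detour through UTF-8 bytes is probably not needed: encoding straight to windows-1251 and decoding back should give the same result. Below is a minimal sketch of that variant, not a definitive implementation. The class name Win1251Filter and method name StripUnencodable are hypothetical, chosen only for illustration. The RegisterProvider call is an assumption about the runtime: it is needed on .NET Core / .NET 5+, where code-page encodings such as windows-1251 are not available by default, and it should be harmless on .NET Framework 4.6 or later.

```csharp
using System.Text;

// Hypothetical helper for illustration; not part of the original answer.
public static class Win1251Filter
{
    public static string StripUnencodable(string text)
    {
        if (string.IsNullOrEmpty(text))
            return "";

        // Assumption: running on .NET Core / .NET 5+, where code-page
        // encodings must be registered before GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding win1251 = Encoding.GetEncoding(
            "windows-1251",
            new EncoderReplacementFallback(string.Empty), // drop unencodable chars
            new DecoderExceptionFallback());

        // Encode, then decode: characters without a windows-1251 mapping
        // are replaced by the empty string while encoding to bytes.
        return win1251.GetString(win1251.GetBytes(text));
    }
}
```

With the input from the question, Win1251Filter.StripUnencodable("中华全国工商业联合会-HelloWorld") should return "-HelloWorld", while Cyrillic text, which windows-1251 can represent, passes through unchanged.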