NHunspell suggestions for non-ANSI (?) characters


I tried the NHunspell NuGet package as follows:

var hunspell = new NHunspell.Hunspell(@"AffPath", @"DicPath");
hunspell.Add("Upadeṣasāhasrī");

var suggestions = hunspell.Suggest("Upadesasahasri");
Console.WriteLine(suggestions.First());

Unfortunately, the suggestion reads "Upade?asahasri". The "ṣ" (s with dot below) comes back as a question mark, while the "ā" (a with macron) and the "ī" (i with macron) come back as plain "a" and "i", respectively.
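One way to tell whether the characters are already lost inside the returned string, or only when the console renders it, is to print the code points instead of the glyphs. The following is a minimal sketch: the class `CodePointDump` and the helper `Describe` are made up for illustration and are not part of NHunspell, and the hard-coded string merely stands in for `suggestions.First()`.

```csharp
using System;
using System.Linq;

static class CodePointDump
{
    // Hypothetical helper (not part of NHunspell):
    // formats every UTF-16 code unit of the text as U+XXXX.
    public static string Describe(string text)
    {
        return string.Join(" ", text.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    static void Main()
    {
        // Stand-in for suggestions.First(). Unicode escapes are used so the
        // result does not depend on the encoding of the source file.
        string suggestion = "Upade\u1E63as\u0101hasr\u012B";

        // Only ASCII is written, so the console encoding cannot alter the output.
        Console.WriteLine(Describe(suggestion));
    }
}
```

An intact "ṣ" appears as U+1E63 in this dump, whereas a character that was really replaced inside the string appears as U+003F (the question mark).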

To the best of my knowledge, the native Hunspell DLL fully supports Unicode. I therefore suspect that the NHunspell C# layer breaks something. The relevant source code looks like this:

internal delegate IntPtr HunspellSuggestDelegate(IntPtr handle, [MarshalAs(UnmanagedType.LPWStr)] string word);


IntPtr strings = MarshalHunspellDll.HunspellSuggest(this.unmanagedHandle, word);

int stringCount = 0;
IntPtr currentString = Marshal.ReadIntPtr(strings, stringCount * IntPtr.Size);

while (currentString != IntPtr.Zero)
{
    ++stringCount;
    result.Add(Marshal.PtrToStringUni(currentString));
    currentString = Marshal.ReadIntPtr(strings, stringCount * IntPtr.Size);
}

I am not a marshalling expert at all, but UnmanagedType.LPWStr and Marshal.PtrToStringUni seem to take Unicode into account. Nevertheless, it obviously does not work. Does anyone have a suggestion (pun intended)?
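The UTF-16 marshalling calls can be checked in isolation, without the native Hunspell DLL. The following is a minimal sketch under that assumption: the class name `MarshalRoundTrip` is made up, and the code exercises only the managed `Marshal` methods, not NHunspell or the native library.

```csharp
using System;
using System.Runtime.InteropServices;

static class MarshalRoundTrip
{
    static void Main()
    {
        // The word from the question, written with escapes so the test
        // does not depend on the encoding of the source file.
        string word = "Upade\u1E63as\u0101hasr\u012B";

        // Copy the managed string into unmanaged memory as null-terminated UTF-16.
        IntPtr buffer = Marshal.StringToHGlobalUni(word);
        try
        {
            // Read it back the same way the NHunspell excerpt reads each suggestion.
            string roundTripped = Marshal.PtrToStringUni(buffer);

            // Prints True if no character was altered on the way.
            Console.WriteLine(string.Equals(word, roundTripped, StringComparison.Ordinal));
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

If this prints True, the UTF-16 conversion on the managed side preserves the characters, which would point to a later step (for example, something inside the native library or the console output) rather than to `Marshal.PtrToStringUni` itself.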

Thanks, Thomas
