The ICU4X icu_segmenter::WordSegmenter seems like the best word segmenter out there.
However, I don't understand at all how data providers work with word segmentation. It seems very complicated to me, and I couldn't find any examples.
I need it for Thai. I guess it uses the LSTM segmenter by default, and out of the box it's better than anything I've seen before. It still has trouble with a lot of exotic names, which is why I'd like to add my own dictionary to it for personal use.
How can I do that?
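
To make the goal concrete, here is a minimal sketch in plain Rust of the effect I'm after. It does not use the ICU4X API at all. It assumes the break positions are byte offsets into the text (with 0 and the text length included), and it merges adjacent segments back together when they spell a name from a custom list. The function name merge_known_names, the sample name, and the hard-coded break positions are made up for illustration; they are not real segmenter output.

```rust
use std::collections::HashSet;

/// Merge adjacent segments whose concatenation is an entry in `names`.
/// `breaks` holds byte offsets into `text`, starting at 0 and ending at
/// `text.len()`; every offset must fall on a char boundary.
fn merge_known_names(text: &str, breaks: &[usize], names: &HashSet<&str>) -> Vec<usize> {
    let mut out = Vec::new();
    if breaks.is_empty() {
        return out;
    }
    out.push(breaks[0]);
    let mut i = 0;
    while i + 1 < breaks.len() {
        // Default: keep the next break as it is.
        let mut next = i + 1;
        // Try the longest run of segments starting at breaks[i] first.
        for j in (i + 2..breaks.len()).rev() {
            if names.contains(&text[breaks[i]..breaks[j]]) {
                next = j;
                break;
            }
        }
        out.push(breaks[next]);
        i = next;
    }
    out
}

fn main() {
    // Hypothetical input: a greeting followed by a name that got split in two.
    let text = "สวัสดีสมชาย";
    // Hand-written break offsets (each Thai code point is 3 bytes in UTF-8).
    let breaks = [0, 18, 24, 33];
    let names: HashSet<&str> = ["สมชาย"].into_iter().collect();

    let merged = merge_known_names(text, &breaks, &names);
    for pair in merged.windows(2) {
        println!("{}", &text[pair[0]..pair[1]]);
    }
}
```

Post-processing like this only helps when a name is split into several pieces. It can't repair a break that lands in the wrong place, which is why I'd rather give the dictionary to the segmenter itself.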