Dear community,
How much has the SpanCategorizer improved your models? I have been using textcat to categorize text, with a recall of about 85%, and I wonder how much difference a SpanCategorizer could make. I am trying to predict whether a question in a questionnaire will elicit confidential (personally identifiable) information, such as a name, telephone number, address, or social security number. Some questions are very long, and textcat then gets confused. I expect that catching key terms should improve the prediction, but I would like to know how much improvement others have seen in their models. Many thanks for your answers!