Why are SST-2, CoLA, and models trained on both so commonly used for measuring bias and for subsequent debiasing? Is it simply because the GLUE benchmark is widely accepted and used for research purposes? Since SST-2 in particular consists of movie reviews, what is the expected gain from debiasing such data? Would it not be more reasonable to debias datasets with more inherent bias (although I assume it is not obvious at first which datasets are biased)?
My current understanding is that, regardless of whether a trained model and/or dataset actually exhibits biases, debiasing is an important step toward more fairness. SST-2 and CoLA thereby serve as general-purpose datasets that cover a wide range of biases and offer a common setup for research.
Someone basically told me that SST-2 and CoLA do not make sense for debiasing, even though they are used in plenty of papers, which made me question this.