Suppose I wanted to split my NER dataset that looks like this:
Data: "Jokowi is the president of Indonesia"
Label: ['B-Person', 'O', 'O', 'O', 'O', 'B-Country']
Is there a Python library or algorithm that ensures the class distribution is the same in the train and test sets? Any suggestions would be appreciated.

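You can try StratifiedShuffleSplit from the scikit-learn library (sklearn.model_selection). Note that it stratifies on a single label per sample, so for NER you first need to reduce each sentence's tag sequence to one stratification key.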
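
Here is a minimal sketch of that idea, not a definitive recipe. The toy corpus and the stratification_key helper are hypothetical: the helper assumes BIO-style tags and uses the sorted set of entity types in a sentence as the label to stratify on.

```python
from collections import Counter

import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit


def stratification_key(tags):
    """Collapse one sentence's tag sequence into a single label (hypothetical helper)."""
    # Drop the B-/I- prefix and keep the sorted set of entity types.
    types = sorted({t.split("-", 1)[-1] for t in tags if t != "O"})
    return "+".join(types) if types else "NONE"


# Hypothetical toy corpus: one tag sequence per sentence.
tag_sequences = (
    [["B-Person", "O", "O", "B-Country"]] * 4
    + [["B-Person", "O", "O"]] * 4
    + [["O", "O", "O"]] * 4
)
keys = [stratification_key(tags) for tags in tag_sequences]

splitter = StratifiedShuffleSplit(n_splits=1, test_size=0.5, random_state=0)
# Only y drives the stratification, so X can be a placeholder of the right length.
train_idx, test_idx = next(splitter.split(np.zeros(len(keys)), keys))

train_counts = Counter(keys[i] for i in train_idx)
test_counts = Counter(keys[i] for i in test_idx)
print("train:", train_counts)
print("test: ", test_counts)
```

This balances the sentence-level keys, so the token-level tag distribution is only approximately matched. A key that occurs in just one sentence makes StratifiedShuffleSplit raise a ValueError, so rare combinations may need to be merged into a catch-all key first.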