I have a dataset df:
| | category | var 1 | ... | var 32 | weighting | country |
|---|---|---|---|---|---|---|
| 1 | blue | 1.0 | ... | 54.2 | 3.0 | US |
| 2 | pink | 0.0 | ... | 101.0 | 1.0 | other |
| 3 | blue | 1.0 | ... | 49.9 | 3.0 | US |
| 4 | green | 1.0 | ... | 72.2 | 9.0 | US |
I'm using a kNN classifier (with `country` as the target) but need it to take into account the sample weights already included in my dataset. Looking at the sklearn package, I can see that `KNeighborsClassifier()` has a `weights` argument. Can I set it to `weights=df.weighting`, or do I have to go about this another way?
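The `weights` argument of `KNeighborsClassifier` controls how the neighbours' votes are weighted (`'uniform'`, `'distance'`, or a callable applied to the distances); it does not accept per-sample weights, so `weights=df.weighting` will not do what you want. Instead, you can explode the samples by weight, i.e. repeat each row according to its weighting, or you can think about creating a custom weighted distance function.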
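A minimal sketch of the "explode by weight" idea, assuming the weights are whole numbers (as in the sample rows) and using only the two numeric columns shown. The toy values, the `feature_cols` list, and `n_neighbors=3` are illustrative assumptions, not taken from the real dataset:

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Toy data mirroring the rows shown in the question (illustrative only).
df = pd.DataFrame({
    "category": ["blue", "pink", "blue", "green"],
    "var 1": [1.0, 0.0, 1.0, 1.0],
    "var 32": [54.2, 101.0, 49.9, 72.2],
    "weighting": [3.0, 1.0, 3.0, 9.0],
    "country": ["US", "other", "US", "US"],
})

# Repeat each row `weighting` times. Assumes non-negative whole-number weights;
# fractional weights would first need scaling and rounding.
repeats = df["weighting"].round().astype(int).to_numpy()
exploded = df.loc[df.index.repeat(repeats)].reset_index(drop=True)

# Numeric features only in this sketch; `category` would need encoding first.
feature_cols = ["var 1", "var 32"]

knn = KNeighborsClassifier(n_neighbors=3)  # n_neighbors chosen arbitrarily
knn.fit(exploded[feature_cols], exploded["country"])

# Predict on the original (un-exploded) rows.
prediction = knn.predict(df[feature_cols])
```

Because a heavily weighted row now appears many times, it can fill several of the `k` neighbour slots by itself, which is how the weight influences the vote. The trade-off is memory: the exploded frame has as many rows as the sum of the weights, so large weights may need to be scaled down first.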