Why is smogn extremely slow?

Question


631 views · Asked by Maciek Woźniak on 24 November 2020 at 19:24

I am using smogn's smoter to balance my data for regression. I have 130k samples, 3 feature columns, and 1 target column. smoter is taking ages to balance the data, whereas SMOTE from imblearn for classification took seconds. Am I doing something wrong, or is it just the size of the data? The time estimated by smoter is around 20 h to balance all the data. I also checked how long it would take for e.g. 20% of the data, so 13k samples: the estimated time was around 2 h.

import smogn
smogn.smoter(
    
    ## main arguments
    data = df_gonzalez_healthy,           ## pandas dataframe
    y = 'healthy',          ## string ('header name')
    k = 9,                    ## positive integer (k < n)
    samp_method = 'extreme',  ## string ('balance' or 'extreme')

    ## phi relevance arguments
    rel_thres = 0.80,         ## positive real number (0 < R < 1)
    rel_method = 'auto',      ## string ('auto' or 'manual')
    rel_xtrm_type = 'high',   ## string ('low' or 'both' or 'high')
    rel_coef = 2.25           ## positive real number (0 < R)
)


There is 1 answer

**Radhakrishna S P** · Answer 1 · 2021-02-02T06:57:44+00:00


I don't think you're doing anything wrong; many smogn users run into the same slowness.

It's probably because the implementation relies on a lot of for loops.

The author/developer has already said he's working on making smogn more efficient.
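To give a feel for why a lot of for loops can hurt, here is a minimal sketch that compares a pairwise Euclidean distance matrix computed with explicit Python loops against the same matrix computed with NumPy broadcasting. This is not smogn's actual code: the function names `pairwise_dist_loops` and `pairwise_dist_vectorized` are made up for illustration, and the assumption is that the slow part resembles the pairwise distance work a k-nearest-neighbour step needs.

```python
import time

import numpy as np


def pairwise_dist_loops(X):
    """Euclidean distance matrix built with explicit Python loops (hypothetical helper)."""
    n = X.shape[0]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            D[i, j] = np.sqrt(np.sum((X[i] - X[j]) ** 2))
    return D


def pairwise_dist_vectorized(X):
    """Same matrix built with NumPy broadcasting, no Python-level loops (hypothetical helper)."""
    diff = X[:, None, :] - X[None, :, :]  # shape (n, n, d)
    return np.sqrt((diff ** 2).sum(axis=-1))


# Small synthetic data set: 300 rows, 3 feature columns as in the question.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))

t0 = time.perf_counter()
D_loops = pairwise_dist_loops(X)
t1 = time.perf_counter()
D_vec = pairwise_dist_vectorized(X)
t2 = time.perf_counter()

print(f"loops:      {t1 - t0:.4f} s")
print(f"vectorized: {t2 - t1:.4f} s")
```

Both functions return the same matrix; only the time spent differs. Note that this is not a drop-in fix for 130k rows: a full 130k × 130k matrix of 64-bit floats would need roughly 135 GB, so the sketch only shows the per-iteration overhead of Python loops.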
