'DataFlowAnalysis' object has no attribute 'op_MAKE_FUNCTION' in Numba


I haven't seen this specific scenario in my research for this error in Numba. This is my first time using the package so it might be something obvious.

I have a function that engineers features in a data set by adding, multiplying and/or dividing each pair of columns in a dataframe called data, and I wanted to test whether Numba would speed it up:

import numpy as np
import pandas as pd
from itertools import combinations
from numba import jit

@jit
def engineer_features(engineer_type,features,joined):
    #choose which features to engineer (must be > 1)
    engineered = features

    if len(engineered) > 1:
        if 'Square' in engineer_type:
            sq = data[features].apply(np.square)
            sq.columns = map(lambda s:s + '_^2',features)

        for c1,c2 in combinations(engineered,2):
            if 'Add' in engineer_type:
                data['{0}+{1}'.format(c1,c2)] = data[c1] + data[c2]
            if 'Multiply' in engineer_type:
                data['{0}*{1}'.format(c1,c2)] = data[c1] * data[c2]
            if 'Divide' in engineer_type:
                data['{0}/{1}'.format(c1,c2)] = data[c1] / data[c2]

        if 'Square' in engineer_type and len(sq) > 0:
            data= pd.merge(data,sq,left_index=True,right_index=True)

        return data

When I call it with lists of features, engineer_type and the dataset:

engineer_type = ['Square','Add','Multiply','Divide']   

df = engineer_features(engineer_type,features,joined)

I get the error: Failed at object (analyzing bytecode) 'DataFlowAnalysis' object has no attribute 'op_MAKE_FUNCTION'


There are 2 answers

Carlos Vega On

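Same question here. I think the problem is the lambda. Creating a function object inside the function body (a lambda or a nested def) compiles to the MAKE_FUNCTION bytecode, and the Numba version raising this error has no handler for that opcode, which is why it fails while analyzing bytecode rather than when the code runs.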
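To check this without Numba, the standard library's dis module can list the opcodes a function compiles to. This is only a rough sketch: the function names below are made up for illustration, and the explicit loop is just one possible way to avoid the lambda (a list comprehension would not help on older Python versions, where comprehensions also create a nested function).

```python
import dis


def rename_with_lambda(features):
    # Same pattern as the question: a lambda created inside the function body.
    return list(map(lambda s: s + '_^2', features))


def rename_with_loop(features):
    # Hypothetical alternative: a plain loop, so no function object is created.
    renamed = []
    for s in features:
        renamed.append(s + '_^2')
    return renamed


def opnames(func):
    # Collect the names of the bytecode instructions in func's own code object.
    return {ins.opname for ins in dis.get_instructions(func)}


if __name__ == '__main__':
    print('MAKE_FUNCTION' in opnames(rename_with_lambda))
    print('MAKE_FUNCTION' in opnames(rename_with_loop))
```

Both functions return the same list of names; only the first one contains the opcode named in the error message.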

Aseem On

I had this same error. Numba doesn't support pandas objects. I converted the important columns from my pandas df into a bunch of arrays and it worked successfully under @jit. Arrays are also much faster than a pandas df, in case you need to process large data.
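A minimal sketch of that approach, under some assumptions: the columns have already been pulled out of the dataframe as float NumPy arrays (for example with something like df[c].to_numpy()), and the names pairwise_ops and engineer_pairs are invented for this example. Only the numeric kernel is compiled; the string formatting and the loop over column pairs stay in plain Python. The import fallback is there so the sketch still runs where Numba is not installed.

```python
from itertools import combinations

import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: fall back to plain Python without Numba.
    def njit(func):
        return func


@njit
def pairwise_ops(a, b):
    # Numeric kernel working only on arrays, no pandas objects.
    n = a.shape[0]
    added = np.empty(n)
    multiplied = np.empty(n)
    divided = np.empty(n)
    for i in range(n):
        added[i] = a[i] + b[i]
        multiplied[i] = a[i] * b[i]
        divided[i] = a[i] / b[i]
    return added, multiplied, divided


def engineer_pairs(columns):
    # columns: dict mapping column name -> 1-D float array (hypothetical layout).
    out = dict(columns)
    for c1, c2 in combinations(columns, 2):
        added, multiplied, divided = pairwise_ops(columns[c1], columns[c2])
        out['{0}+{1}'.format(c1, c2)] = added
        out['{0}*{1}'.format(c1, c2)] = multiplied
        out['{0}/{1}'.format(c1, c2)] = divided
    return out
```

The resulting dict can be turned back into a dataframe with pd.DataFrame(result) if needed. Whether this is actually faster than the pandas column arithmetic depends on the data size, so it is worth timing both.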