My code is as follows:
df['O_ID'] = (df.apply(get_match_id, arguments=[df['pickup_longitude'], df['pickup_latitude']])).jit_cuda()
When I first called jit_cuda(), I got the error "No Module named cupy".
After installing cupy-cuda101 (the build that matches my CUDA version), I get a new error:
Traceback (most recent call last):
File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 3580, in table_part
values[name] = df.evaluate(name)
File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 2616, in evaluate
return self._evaluate_implementation(expression, i1=i1, i2=i2, out=out, selection=selection, filtered=filtered, internal=internal, parallel=parallel, chunk_size=chunk_size)
File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 5352, in _evaluate_implementation
dtypes[expression] = df.data_type(expression, internal=False)
File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 1998, in data_type
data = self.evaluate(expression, 0, 1, filtered=True, internal=True, parallel=False)
File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 2616, in evaluate
return self._evaluate_implementation(expression, i1=i1, i2=i2, out=out, selection=selection, filtered=filtered, internal=internal, parallel=parallel, chunk_size=chunk_size)
File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 5427, in _evaluate_implementation
value = scope.evaluate(expression)
File "F:\Anaconda3\lib\site-packages\vaex\scopes.py", line 97, in evaluate
result = self[expression]
File "F:\Anaconda3\lib\site-packages\vaex\scopes.py", line 139, in __getitem__
self.values[variable] = self.evaluate(expression) # , out=self.buffers[variable])
File "F:\Anaconda3\lib\site-packages\vaex\scopes.py", line 103, in evaluate
result = eval(expression, expression_namespace, self)
File "<string>", line 1, in <module>
File "F:\Anaconda3\lib\site-packages\vaex\expression.py", line 1073, in __call__
return self.f(*args, **kwargs)
File "F:\Anaconda3\lib\site-packages\vaex\expression.py", line 1120, in wrapper
return cupy.asnumpy(func(*args))
File "cupy\core\fusion.pyx", line 905, in cupy.core.fusion.Fusion.__call__
File "cupy\core\fusion.pyx", line 754, in cupy.core.fusion._FusionHistory.get_fusion
File "<string>", line 6, in f
NameError: name 'lambda_function_1' is not defined
How should I solve it?
My understanding is that just-in-time compilation in vaex works only for virtual columns, that is, expressions or columns computed mainly through arithmetic operations using numpy methods or pure Python arithmetic.
When using apply, the function can be quite abstract, basically whatever you want, so it may not be possible to compile it. If you can rewrite your .apply function using numpy expressions, then you are likely able to use the jit_cuda method to accelerate it. Vaex does not recommend using .apply anyway, since it is hard to parallelize and should be used as a "last resort" of sorts.
Source: https://vaex.io/docs/tutorial.html#Just-In-Time-compilation