How can I use CUDA with vaex (a Python library)?


My code is as follows:

df['O_ID'] = (df.apply(get_match_id, arguments=[df['pickup_longitude'], df['pickup_latitude']])).jit_cuda()

When I first called jit_cuda(), I got the error "No Module named cupy".

After installing cupy-cuda101 (the build that matches my CUDA version), I get a new error:

Traceback (most recent call last):
  File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 3580, in table_part
    values[name] = df.evaluate(name)
  File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 2616, in evaluate
    return self._evaluate_implementation(expression, i1=i1, i2=i2, out=out, selection=selection, filtered=filtered, internal=internal, parallel=parallel, chunk_size=chunk_size)
  File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 5352, in _evaluate_implementation
    dtypes[expression] = df.data_type(expression, internal=False)
  File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 1998, in data_type
    data = self.evaluate(expression, 0, 1, filtered=True, internal=True, parallel=False)
  File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 2616, in evaluate
    return self._evaluate_implementation(expression, i1=i1, i2=i2, out=out, selection=selection, filtered=filtered, internal=internal, parallel=parallel, chunk_size=chunk_size)
  File "F:\Anaconda3\lib\site-packages\vaex\dataframe.py", line 5427, in _evaluate_implementation
    value = scope.evaluate(expression)
  File "F:\Anaconda3\lib\site-packages\vaex\scopes.py", line 97, in evaluate
    result = self[expression]
  File "F:\Anaconda3\lib\site-packages\vaex\scopes.py", line 139, in __getitem__
    self.values[variable] = self.evaluate(expression)  # , out=self.buffers[variable])
  File "F:\Anaconda3\lib\site-packages\vaex\scopes.py", line 103, in evaluate
    result = eval(expression, expression_namespace, self)
  File "<string>", line 1, in <module>
  File "F:\Anaconda3\lib\site-packages\vaex\expression.py", line 1073, in __call__
    return self.f(*args, **kwargs)
  File "F:\Anaconda3\lib\site-packages\vaex\expression.py", line 1120, in wrapper
    return cupy.asnumpy(func(*args))
  File "cupy\core\fusion.pyx", line 905, in cupy.core.fusion.Fusion.__call__
  File "cupy\core\fusion.pyx", line 754, in cupy.core.fusion._FusionHistory.get_fusion
  File "<string>", line 6, in f
NameError: name 'lambda_function_1' is not defined

How should I solve it?


There is 1 answer

Joco On

My understanding is that just-in-time compilation in vaex works only for virtual columns, or for expressions that are computed mainly with arithmetic operations, using numpy functions or plain Python arithmetic.

With apply, the function can be arbitrary Python code, so it may not be possible to compile it.

If you can rewrite your .apply function using numpy expressions, then you are likely able to use the jit_cuda method to accelerate it. Vaex does not recommend using .apply anyway, since it is hard to parallelize and should be used only as a "last resort" of sorts.
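To illustrate what "rewriting with numpy expressions" could look like, here is a minimal sketch. The body of get_match_id is not shown in the question, so grid_cell_id and the grid constants below are purely hypothetical stand-ins; the point is only that the function uses nothing but arithmetic and numpy functions, which is the kind of expression the JIT methods are meant for.

```python
import numpy as np

# Hypothetical grid parameters, chosen only for illustration.
# The real get_match_id logic is not shown in the question.
LON_MIN = -74.3
LAT_MIN = 40.5
CELL_SIZE = 0.01   # cell width/height in degrees
N_COLS = 100       # number of cells per grid row


def grid_cell_id(lon, lat):
    """Map coordinates to a grid cell id, element-wise.

    Uses only arithmetic and numpy functions, so it works on whole
    arrays at once instead of being called once per row.
    """
    col = np.floor((lon - LON_MIN) / CELL_SIZE)
    row = np.floor((lat - LAT_MIN) / CELL_SIZE)
    return row * N_COLS + col


# Plain numpy arrays standing in for the two columns.
lon = np.array([-74.295, -73.995])
lat = np.array([40.505, 40.755])
ids = grid_cell_id(lon, lat)

# Assumed vaex usage, following the pattern in the tutorial linked below
# (not run here; requires vaex, cupy and a CUDA-capable GPU):
#   df['O_ID'] = grid_cell_id(df.pickup_longitude, df.pickup_latitude).jit_cuda()
```

Called with vaex columns instead of numpy arrays, a function like this should produce a virtual column rather than going through .apply, which is what makes it a candidate for jit_cuda. Whether your actual matching logic can be expressed this way depends on what get_match_id does.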

Source: https://vaex.io/docs/tutorial.html#Just-In-Time-compilation