Altair with Vaex


I am trying to use Vaex together with Altair, but I am having trouble passing Vaex dataframes to Altair.

When trying to make a simple line chart

alt.Chart(df)\
.mark_line()\
.encode(alt.X('x'), alt.Y('y1'))

I get an error saying that

[the] encoding field[s] is[are] specified without a type; the type cannot be automatically inferred because the data is not specified as a pandas.DataFrame.

but if I specify the types explicitly

alt.Chart(df)\
.mark_line()\
.encode(alt.X('x:T'), alt.Y('y1:Q'))

I get an error saying that

altair.vegalite.v4.api.Chart->0, validating 'additionalProperties'

Additional properties are not allowed ('y1', 'x', 'y2' were unexpected)

It seems to me that there is some problem linking a Vaex dataframe to Altair, but I have no idea how to get around it.

Here is the full code:

import altair as alt
import numpy as np
import vaex
import datetime

base = datetime.datetime.today()
dates = [base - datetime.timedelta(days=x) for x in range(10)]

y1 = np.sin(range(10))
y2 = np.cos(range(10))

df = vaex.from_arrays(x=dates, y1=y1, y2=y2)

alt.Chart(df)\
.mark_line()\
.encode(alt.X('x:T'), alt.Y('y1:Q')) #.encode(alt.X('x'), alt.Y('y1'))

There is 1 answer

jakevdp (best answer)

Altair is not compatible with Vaex. The easiest way to proceed is to convert your Vaex dataframe to pandas when you pass it to an Altair chart; for example:

alt.Chart(df.to_pandas_df())

There is very little downside to this conversion: pandas is a hard requirement of Altair, and Altair will always serialize the data to JSON in order to pass it to Vega-Lite. For the size of datasets that Altair can handle, the efficiency of data representation and serialization provided by Vaex is not particularly important.
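If you prefer to keep the conversion explicit at each call site, it can be wrapped in a small helper. This is only a sketch: `as_pandas` and `line_chart` are hypothetical names that are not part of Altair or Vaex, and the helper assumes that anything with a `to_pandas_df()` method is a Vaex dataframe.

```python
def as_pandas(data):
    """Return data.to_pandas_df() when that method exists, else data unchanged.

    Duck-typed on purpose: Vaex dataframes expose to_pandas_df(), while
    pandas dataframes (and anything else) are passed through untouched.
    """
    to_pandas = getattr(data, "to_pandas_df", None)
    return to_pandas() if callable(to_pandas) else data


def line_chart(data, x, y):
    """Build a line chart from a Vaex or pandas dataframe (hypothetical helper)."""
    import altair as alt  # imported lazily so as_pandas is usable on its own

    return alt.Chart(as_pandas(data)).mark_line().encode(alt.X(x), alt.Y(y))
```

With the dataframe from the question, this would be called as line_chart(df, 'x:T', 'y1:Q'). Once the data is a pandas dataframe, Altair should also be able to infer the encoding types, so the type suffixes become optional.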

If you want this to happen automatically, you can register a new data transformer that supports Vaex. This should do the trick:

import altair as alt

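def vaex_data_transformer(df):
    # Convert Vaex dataframes to pandas; inputs without to_pandas_df()
    # (for example pandas dataframes) pass through unchanged.
    try:
        df = df.to_pandas_df()
    except AttributeError:
        pass
    # Hand the result to Altair's default transformer.
    return alt.data.default_data_transformer(df)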

alt.data_transformers.register('vaex', vaex_data_transformer)
alt.data_transformers.enable('vaex')

With this enabled, alt.Chart() will accept a Vaex dataframe anywhere that a pandas dataframe is accepted.