I am trying to use Vaex together with Altair, but I am having trouble passing Vaex dataframes to Altair.
When I try to make a simple line chart:
alt.Chart(df)\
.mark_line()\
.encode(alt.X('x'), alt.Y('y1'))
I get an error saying that
[the] encoding field[s] is[are] specified without a type; the type cannot be automatically inferred because the data is not specified as a pandas.DataFrame.
But if I specify the types explicitly:
alt.Chart(df)\
.mark_line()\
.encode(alt.X('x:T'), alt.Y('y1:Q'))
I get an error saying that
altair.vegalite.v4.api.Chart->0, validating 'additionalProperties'
Additional properties are not allowed ('y1', 'x', 'y2' were unexpected)
It seems to me that there is some problem linking a Vaex dataframe to Altair, but I have no idea how to get around it.
Here is the full code:
import altair as alt
import numpy as np
import vaex
import datetime
base = datetime.datetime.today()
dates = [base - datetime.timedelta(days=x) for x in range(10)]
y1 = np.sin(range(10))
y2 = np.cos(range(10))
df = vaex.from_arrays(x=dates, y1=y1, y2=y2)
alt.Chart(df)\
.mark_line()\
.encode(alt.X('x:T'), alt.Y('y1:Q')) #.encode(alt.X('x'), alt.Y('y1'))
Altair is not compatible with Vaex. The easiest way to proceed is to convert your Vaex dataframe to pandas, for example with df.to_pandas_df(), before passing it to an Altair chart.
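A minimal sketch of that conversion, using the column names from the question. to_pandas_df() is Vaex's own conversion method; the helper names as_pandas and line_chart are hypothetical, and the duck-typed check is an assumption made so the helper does not need to import Vaex itself.

```python
def as_pandas(data):
    """Return a pandas DataFrame for Vaex input; pass other objects through."""
    # Assumption: anything exposing to_pandas_df() is a Vaex dataframe.
    if hasattr(data, "to_pandas_df"):
        return data.to_pandas_df()
    return data


def line_chart(data, x="x:T", y="y1:Q"):
    """Build the line chart from the question on pandas-converted data."""
    import altair as alt  # imported here so as_pandas works without Altair

    return alt.Chart(as_pandas(data)).mark_line().encode(alt.X(x), alt.Y(y))
```

With the df from the question, line_chart(df) would stand in for the failing alt.Chart(df) call.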
There is very little downside to this conversion: pandas is a hard requirement of Altair, and Altair always serializes the data to JSON in order to pass it to Vega-Lite. For the dataset sizes that Altair can handle, the efficiency of data representation and serialization that Vaex provides is not particularly important.
If you want this to happen automatically, you can register a new data transformer that converts Vaex dataframes to pandas before handing them on.
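One way such a transformer could look is sketched below. alt.data_transformers.register, alt.data_transformers.enable and alt.default_data_transformer are existing Altair calls; the names make_vaex_transformer and enable_vaex_transformer and the registry key "vaex" are hypothetical. Whether a given Altair version actually routes non-pandas objects through the active transformer has not been checked here, so treat this as a starting point rather than a confirmed fix.

```python
def make_vaex_transformer(fallback):
    """Wrap an existing data transformer so Vaex input is converted first."""

    def vaex_transformer(data, **kwargs):
        # Assumption: an object with to_pandas_df() is a Vaex dataframe.
        if hasattr(data, "to_pandas_df"):
            data = data.to_pandas_df()
        # Everything else is handled exactly as the wrapped transformer would.
        return fallback(data, **kwargs)

    return vaex_transformer


def enable_vaex_transformer():
    """Register the wrapper under the name 'vaex' and make it active."""
    import altair as alt

    transformer = make_vaex_transformer(alt.default_data_transformer)
    alt.data_transformers.register("vaex", transformer)
    alt.data_transformers.enable("vaex")
```

Calling enable_vaex_transformer() once at the top of a notebook or script would activate it. Wrapping the default transformer, rather than whatever is currently active, avoids the wrapper calling itself after it has been enabled.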
With this enabled, alt.Chart() will accept a Vaex dataframe anywhere that a pandas dataframe is accepted.