Transfer/unbox Julia's DataFrame objects and use them as C++ objects


I would like to unbox complex objects, such as DataFrames created in Julia, from C++ so that I can use them as input data for my C++ code.

For example, after loading the MixedModels sample dataset dyestuff and converting it to a DataFrame, I can describe it:

using DataFrames, MixedModels

dyestuff = MixedModels.dataset(:dyestuff)
describe(DataFrame(dyestuff))

[Image: description of the dyestuff DataFrame]

Note that the output of MixedModels.dataset is of type Arrow.Table. According to fieldnames(typeof(dyestuff)), its fields are (:names, :types, :columns, :lookup, :schema, :metadata).

How can I effectively and efficiently transform/unbox a complex object (such as a data frame) from Julia into C++?

Because it is not possible (AFAIK) to share such a complex data frame object directly, I thought it would be convenient to extract the separate fields as primitive objects and build my own DataFrame class in C++. So in C++ I am trying:

// to make sure I can execute julia code from C++
jl_eval_string("println(describe(DataFrame(dyestuff)))"); // 2×7 DataFrame
jl_eval_string("println(typeof(dyestuff))");              // Arrow.Table

// Get the length of the column names field of Julia's dyestuff table.
jl_value_t *n_p = jl_eval_string("length(getfield(dyestuff, :names))");
int64_t n = jl_unbox_int64(n_p); // length() returns an Int (Int64 on 64-bit Julia); n is 2 because there are 2 column names.

// Trying to load the actual column names (as an array of strings), following the Julia manual:
jl_array_t *names_list = (jl_array_t *)jl_eval_string("String.(getfield(dyestuff, :names))");
string *names = (string *)jl_array_data(names_list);
cout << jl_array_len(names_list) << endl;
for (size_t i = 0; i < jl_array_len(names_list); i++)
{
    cout << " " << names[i] << endl;
}

But when I print the names, I get strange characters that seem to be the raw bytes of the object rather than the column names.

 batch��
��Rmath��
��yield�F���1ath��
��NaNMath�F���min th�F��� min  ��
��QuadGK�F���1adGKG���1adGK��
��GLMs(G���max eXG��� max  ��
��JSON3epG���1ON3�G���1ON3��
��NLopt�G���  pt�G���1optL��
��AdaptH���   tH���   0H���     HH���│9`H���  dfxH���90df_�H���90df_�H���Any �H��� Any  �H���90ets�H���90ndI���Any 4 I��� Any  8I���90dlPI���   inghI���90pe�I���90it2�I���┼ets�I���  code�I���  1tf��
��   1 ��
��│J��� 

Note that the output of the Julia code is:

julia> String.(getfield(dyestuff, :names))
2-element Vector{String}:
 "batch"
 "yield"
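
A possible explanation for the garbled output, offered as a sketch rather than a confirmed answer: a Julia Vector{String} stores references to boxed Julia String objects, so its data pointer cannot be reinterpreted as an array of C++ std::string, whose memory layout is unrelated. The sketch below reads each element through jl_array_ptr_ref and copies its bytes with jl_string_ptr and jl_string_len. It assumes the embedding headers of Julia 1.6 to 1.10 (the array accessor macros may differ in other versions) and that DataFrames and MixedModels are installed in the active environment.

```cpp
#include <julia.h>

#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

int main() {
    jl_init();

    // Same setup as in the question.
    jl_eval_string("using DataFrames, MixedModels");
    jl_eval_string("dyestuff = MixedModels.dataset(:dyestuff)");

    jl_value_t *res = jl_eval_string("String.(getfield(dyestuff, :names))");
    if (res == nullptr || jl_exception_occurred()) {
        std::cerr << "Julia evaluation failed" << std::endl;
        jl_atexit_hook(1);
        return 1;
    }

    jl_array_t *arr = (jl_array_t *)res;
    JL_GC_PUSH1(&arr); // keep the array alive while its elements are read

    std::vector<std::string> names;
    const size_t n = jl_array_len(arr);
    for (size_t i = 0; i < n; i++) {
        // Each element is a pointer to a boxed Julia String, not a std::string.
        jl_value_t *s = jl_array_ptr_ref(arr, i);
        if (s != nullptr && jl_is_string(s)) {
            // Copy the bytes into memory owned by C++.
            names.emplace_back(jl_string_ptr(s), jl_string_len(s));
        }
    }
    JL_GC_POP();

    for (const std::string &name : names) {
        std::cout << " " << name << std::endl;
    }

    jl_atexit_hook(0);
    return 0;
}
```

Copying into std::string means the C++ side no longer depends on Julia's garbage collector once the loop has finished.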

Getting the names field is just the first step. Other challenges will come up when working with the remaining data frame fields, such as types, columns, metadata, and lookup, so any suggestions on that front would be helpful. Thanks in advance!
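
To make the idea of a home-made C++ DataFrame class concrete, here is a minimal sketch of a column-oriented container built from names and typed columns. Everything in it is hypothetical: the Column alias, the DataFrame class, and make_example are illustrative names rather than an existing library, the example values are placeholders and not the real dyestuff data, and C++17 is assumed for std::variant.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical column type: one typed vector per column.
using Column = std::variant<std::vector<std::int64_t>,
                            std::vector<double>,
                            std::vector<std::string>>;

// Hypothetical minimal column-oriented table (not an existing library class).
class DataFrame {
public:
    // Append a named column; every column must have the same number of rows.
    void add_column(std::string name, Column values) {
        const std::size_t n = column_size(values);
        if (!columns_.empty() && n != nrow_)
            throw std::invalid_argument("column length mismatch: " + name);
        nrow_ = n;
        names_.push_back(std::move(name));
        columns_.push_back(std::move(values));
    }

    std::size_t nrow() const { return nrow_; }
    std::size_t ncol() const { return columns_.size(); }
    const std::vector<std::string>& names() const { return names_; }

    // Typed access by name; throws if the name is unknown or the type differs.
    template <typename T>
    const std::vector<T>& column(const std::string& name) const {
        for (std::size_t i = 0; i < names_.size(); ++i)
            if (names_[i] == name)
                return std::get<std::vector<T>>(columns_[i]);
        throw std::out_of_range("no such column: " + name);
    }

private:
    // Number of rows in a column, whatever its element type.
    static std::size_t column_size(const Column& c) {
        return std::visit([](const auto& v) { return v.size(); }, c);
    }

    std::vector<std::string> names_;
    std::vector<Column> columns_;
    std::size_t nrow_ = 0;
};

// Builds a tiny table shaped like dyestuff; the values are placeholders.
inline DataFrame make_example() {
    DataFrame df;
    df.add_column("batch", std::vector<std::string>{"A", "A", "B"});
    df.add_column("yield", std::vector<std::int64_t>{10, 20, 30});
    return df;
}
```

With a container like this, each Julia column would be copied into the matching std::vector once, and typed lookups such as df.column<std::int64_t>("yield") stay on the C++ side afterwards.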

UPDATE: For anyone interested, a parallel discussion is taking place at https://discourse.julialang.org/t/transfer-unbox-julias-dataframe-objects-and-use-as-c-object/105440/5
