I want to analyze data from ArangoDB. The data are available as a tree structure.
I want to analyze them with Pandas now. I have used Pandas before, but those datasets all had a flat structure, e.g. name, date, price, ... (all in one line, like a CSV). Below is an example of what my dataset looks like.
- What is the best option to analyze these data?
- How can I flatten the dataset into a 'normal' tabular structure, e.g. CSV?
What my data looks like
└───dataset
    ├───createdAt
    ├───currency
    ├───date
    ├───lineItems
    │   ├───createdAt
    │   ├───customer
    │   │   ├───id
    │   │   └───plant
    │   ├───id
    │   ├───price
    │   └───unit
    ├───metaData
    │   └───originSystem
    ├───netPrice
    │   └───0
    │       └───netPrice
    └───payment
        ├───adress
        │   ├───name
        │   └───street
        └───number
I know that pandas.json_normalize exists, but unfortunately the dataset is more complex, and I have more than one dataset with a tree structure to analyze.
Example
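import pandas as pd

# `result` is assumed to be one document fetched from ArangoDB, already available as a dict
df = pd.json_normalize(result['dataset']['lineItems'])

# I could also get the whole dataset as a dict:
# dict_arangodb = ArangoDB(...)
# ...
# df = pd.json_normalize(dict_arangodb)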
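
For reference, here is a minimal, self-contained sketch of how far I get with json_normalize on a single document. All sample values are made up, and I am assuming that lineItems and netPrice are lists of objects (the `0` in the tree suggests a list index); the real documents may differ. It uses `record_path` to produce one row per line item and `meta` to copy selected parent fields onto each row.

```python
import pandas as pd

# Hypothetical document shaped like the tree above; every value is invented.
result = {
    "dataset": {
        "createdAt": "2021-01-01T00:00:00Z",
        "currency": "EUR",
        "date": "2021-01-01",
        "lineItems": [  # assumption: a list of line-item objects
            {"createdAt": "2021-01-01T00:00:00Z",
             "customer": {"id": "c1", "plant": "p1"},
             "id": "li1", "price": 10.0, "unit": "kg"},
            {"createdAt": "2021-01-02T00:00:00Z",
             "customer": {"id": "c2", "plant": "p2"},
             "id": "li2", "price": 20.0, "unit": "kg"},
        ],
        "metaData": {"originSystem": "erp"},
        "netPrice": [{"netPrice": 30.0}],  # assumption: a list, hence the "0" level
        "payment": {
            "adress": {"name": "ACME", "street": "Main St 1"},
            "number": "42",
        },
    }
}

# One row per line item; nested dicts inside each item become dotted columns.
# record_prefix avoids name clashes between item fields and parent fields.
df = pd.json_normalize(
    result["dataset"],
    record_path="lineItems",
    meta=[
        "currency",
        "date",
        ["metaData", "originSystem"],
        ["payment", "number"],
        ["payment", "adress", "name"],
    ],
    record_prefix="lineItem.",
)

# The flat frame can then be exported like any other table.
csv_text = df.to_csv(index=False)
print(df.columns.tolist())
```

This works for one known layout, but the `record_path` and `meta` lists have to be written by hand for every dataset, which is the part I would like to avoid.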