The 200MB of data is collected over 300 days, about 600KB per day.
Currently I use d3.tsv to load one file containing all of the data, and then use setTimeout to loop through each day.
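Roughly like this (a minimal sketch; "alldata.tsv", the day column, and drawDay are stand-ins for my actual file and render logic):

d3.tsv("alldata.tsv", function(error, data) {
  if (error) throw error;
  var day = 0;
  (function step() {
    // render only the rows belonging to the current day, then advance
    drawDay(data.filter(function(d) { return +d.day === day; }));
    if (++day < 300) setTimeout(step, 1000);
  })();
});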
But the problem is that loading 200MB of data into the client's browser can take a few minutes...
How can I overcome this? Can we use some prefetching and caching technique?
Kibana and Hue can visualize gigabytes of data, and both of them seem to use D3 for visualization.
How do they solve the delay in loading data from the server to the client side?
One way I am thinking of is to load each day's data once per second and merge it into the client's memory. But how do I do the merge inside of d3.tsv's callback?
var mergedData = [];
filelist.forEach(function(f) {
  d3.tsv(f, function(error, data) {
    // Following code cannot work as expected:
    // this callback fires later, after the loop has already finished
    mergedData = mergedData.concat(data);
  });
});
Because d3.tsv loads asynchronously (its callback only fires onload), the loop finishes before any file arrives, so mergedData is still empty when I try to use it.
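One workaround I can think of is to count completed loads, so the merged array is only used after every file has arrived (a minimal sketch; drawAll is a hypothetical callback that runs once everything is merged):

var mergedData = [];
var remaining = filelist.length;
filelist.forEach(function(f) {
  d3.tsv(f, function(error, data) {
    if (error) throw error;
    mergedData = mergedData.concat(data);       // safe: runs once per response
    if (--remaining === 0) drawAll(mergedData); // fires after the last file loads
  });
});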
Is there a cleaner trick to work this out?
As far as Kibana is concerned, it works on Elasticsearch, which operates on millions of records, computes the analytic numbers using aggregations, and displays them as charts.
Hue works on Hadoop.
So, in short, they compute the statistics on the big data with backend support (Elasticsearch, in Kibana's case) and use D3 only to display the resulting numbers as a bar chart or some other chart.
Hence, with large data you should consider using backend support: merge all the data on the server, compute the numbers there, and let D3 show them on a graph.
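A minimal sketch of that pattern, assuming d3 v3 and a hypothetical /api/stats endpoint that returns the pre-aggregated daily numbers as TSV with day and value columns; the browser downloads only the small aggregate and renders it as a bar chart:

d3.tsv("/api/stats", function(error, stats) {
  if (error) throw error;
  var width = 900, height = 300;
  var svg = d3.select("body").append("svg")
      .attr("width", width)
      .attr("height", height);
  // Scale bar heights to the largest aggregated value
  var y = d3.scale.linear()
      .domain([0, d3.max(stats, function(d) { return +d.value; })])
      .range([height, 0]);
  var barWidth = width / stats.length;
  svg.selectAll("rect")
      .data(stats)
    .enter().append("rect")
      .attr("x", function(d, i) { return i * barWidth; })
      .attr("y", function(d) { return y(+d.value); })
      .attr("width", barWidth - 1)
      .attr("height", function(d) { return height - y(+d.value); });
});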