I would like to draw a network chart with D3.js and am having trouble formatting my data. The result I'm aiming for is this chart: https://bl.ocks.org/mbostock/1062288

Note 1: I have to do this on a file that contains more than 1 million lines.

Note 2: I'm using PHP to create the JSON, but Python is fine too.

As input, I have this data set (simplified):

from, to
https://example.org/, https://example.org/dir1/page/1.html
https://example.org/, https://example.org/dir1/page/2.html
https://example.org/, https://example.org/dir1/page/3.html
https://example.org/, https://example.org/dir2/page/1.html
https://example.org/, https://example.org/dir2/page/2.html
https://example.org/, https://example.org/dir3/page/1.html
https://example.org/, https://example.org/dir4/page/2.html
https://example.org/, https://example.org/dir5/page/3.html
https://example.org/dir1/page/1.html, https://example.org/
https://example.org/dir1/page/1.html, https://example.org/dir1/page/2.html
https://example.org/dir1/page/1.html, https://example.org/dir1/page/3.html
https://example.org/dir1/page/1.html, https://example.org/dir2/page/1.html
https://example.org/dir1/page/2.html, https://example.org/
https://example.org/dir1/page/3.html, https://example.org/dir1/page/2.html
https://example.org/dir1/page/3.html, https://example.org/dir2/page/1.html
https://example.org/dir1/page/3.html, https://example.org/dir2/page/1.html
https://example.org/dir2/page/1.html, https://example.org/dir6/page/1.html
https://example.org/dir3/page/1.html, https://example.org/dir7/page/1.html
https://example.org/dir5/page/1.html, https://example.org/
https://example.org/dir6/page/1.html, https://example.org/
https://example.org/dir6/page/1.html, https://example.org/dir7/page/1.html

I would like to convert this into a tree of nodes in which every directory in the URL becomes a node, and each following path segment becomes a child of the previous one.

For example, the URL https://example.org/dir1/page/1.html will have dir1 as a node, page as its child, then 1.html as a child of page, and so on.
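To make the splitting rule concrete, here is a minimal Python sketch of it. The helper name `url_segments` is hypothetical, and treating the scheme plus host as the root node is an assumption based on the example above.

```python
from urllib.parse import urlsplit

def url_segments(url):
    """Return the site root followed by each non-empty path segment."""
    parts = urlsplit(url.strip())
    root = f"{parts.scheme}://{parts.netloc}/"  # assumed root node, e.g. https://example.org/
    return [root] + [seg for seg in parts.path.split("/") if seg]

print(url_segments("https://example.org/dir1/page/1.html"))
# ['https://example.org/', 'dir1', 'page', '1.html']
```

Each element of the returned list would be one level of the tree, with the last element as the leaf.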

What I want to get is something like:

    {
      "name": "https://example.org/",
      "children": [
        {
          "name": "dir1",
          "children": [
            {
              "name": "page",
              "children": [
                {"name": "1", "size": 3534},
                {"name": "2", "size": 3534},
                {"name": "3", "size": 3534}
              ]
            }
          ]
        },
        {
          "name": "dir2",
          "children": [
            {
              "name": "page",
              "children": [
                {"name": "1", "size": 3534},
                {"name": "2", "size": 3534}
              ]
            }
          ]
        },
        {
          "name": "dir3",
          "children": [
            {
              "name": "page",
              "children": [
                {"name": "1", "size": 3534}
              ]
            }
          ]
        },
        {
          "name": "dir4",
          "children": [
            {
              "name": "page",
              "children": [
                {"name": "2", "size": 3534}
              ]
            }
          ]
        },
        {
          "name": "dir5",
          "children": [
            {
              "name": "page",
              "children": [
                {"name": "3", "size": 3534}
              ]
            }
          ]
        }
      ],
      ...
    }
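For reference, this is a rough Python sketch of the kind of conversion I have in mind, not a finished solution. It reads the rows one at a time (so the 1M-line file never has to be loaded whole) and indexes children by name in nested dicts so each lookup is cheap. Several things here are assumptions: the function names are made up, `size` is taken to be the number of rows that mention the URL (I have not defined what it should be), leaf names keep their file extension, and the from/to link direction is ignored because only the hierarchy is built.

```python
import csv
import io
import json
from urllib.parse import urlsplit

def segments(url):
    """Split a URL into its root plus one entry per path segment."""
    parts = urlsplit(url.strip())
    return [f"{parts.scheme}://{parts.netloc}/"] + [s for s in parts.path.split("/") if s]

def add_url(index, url):
    """Create one nested-dict entry per segment and count a hit on the last one."""
    level = index
    node = None
    for seg in segments(url):
        node = level.setdefault(seg, {"count": 0, "children": {}})
        level = node["children"]
    node["count"] += 1

def to_d3(name, node):
    """Convert the nested-dict index into the name/children/size layout."""
    if node["children"]:
        return {"name": name,
                "children": [to_d3(n, c) for n, c in node["children"].items()]}
    return {"name": name, "size": node["count"]}  # size = assumed hit count

def build_tree(rows):
    """Consume an iterable of [from, to] rows and return a list of root nodes."""
    index = {}
    for row in rows:
        for url in row[:2]:
            add_url(index, url)
    return [to_d3(name, node) for name, node in index.items()]

# Small in-memory sample; for the real file, pass open(path, newline="") instead.
sample = io.StringIO(
    "from, to\n"
    "https://example.org/, https://example.org/dir1/page/1.html\n"
    "https://example.org/, https://example.org/dir1/page/2.html\n"
    "https://example.org/dir1/page/1.html, https://example.org/dir2/page/1.html\n"
)
reader = csv.reader(sample, skipinitialspace=True)
next(reader)  # skip the header row
print(json.dumps(build_tree(reader), indent=1))
```

Nodes that have children do not get a `size` in this sketch, which matches the layout above where only leaves carry one.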

Any ideas? Thank you.
