Additional Transform to Support Flat Hierarchies #3733

Open
PBI-David opened this issue Apr 24, 2023 · 1 comment
Labels
feature-request For requesting new features or transforms

Comments

@PBI-David
Contributor

PBI-David commented Apr 24, 2023

I'm currently building tree layouts (https://vega.github.io/vega/examples/tree-layout/) as seen here:

image

Typically, hierarchical data is stored in source systems in one of two different formats:

  1. As a narrow table of parent and child columns, e.g.
[
  {"id": "A", "parent": null},
  {"id": "B", "parent": "A"},
  {"id": "C", "parent": "A"},
  {"id": "D", "parent": "C"},
  {"id": "E", "parent": "C"}
]
  2. As a wide, flattened table with each level of the hierarchy as its own column, e.g.

[
  {"id": "A", "job": "Doctor", "region": "East"},
  {"id": "B", "job": "Doctor", "region": "East"},
  {"id": "C", "job": "Lawyer", "region": "East"},
  {"id": "D", "job": "Lawyer", "region": "East"},
  {"id": "E", "job": "Doctor", "region": "West"},
  {"id": "F", "job": "Doctor", "region": "West"},
  {"id": "G", "job": "Lawyer", "region": "West"},
  {"id": "H", "job": "Lawyer", "region": "West"}
]
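
For comparison, the narrow format needs no reshaping before layout. A minimal sketch of a data entry that feeds the parent/child rows from item 1 to Stratify and Tree is shown here; the dataset name `tree` and the layout settings are assumptions borrowed from the linked tree-layout example, not requirements.

```json
{
  "name": "tree",
  "values": [
    {"id": "A", "parent": null},
    {"id": "B", "parent": "A"},
    {"id": "C", "parent": "A"},
    {"id": "D", "parent": "C"},
    {"id": "E", "parent": "C"}
  ],
  "transform": [
    {"type": "stratify", "key": "id", "parentKey": "parent"},
    {
      "type": "tree",
      "method": "tidy",
      "size": [{"signal": "height"}, {"signal": "width - 100"}],
      "as": ["y", "x", "depth", "children"]
    }
  ]
}
```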

The first format is already catered for in Vega by supplying it to a Stratify transform and then a Tree transform. However, I don't believe there is currently any support for the second format, which is quite common in my experience. I originally thought the Nest transform would handle these cases, but Nest only supports data where all leaves are at the same level, i.e. it cannot handle a ragged hierarchy like the following (the last record, for Accountant, stops at a depth of 3, whereas the other records go to a depth of 4):

      "values": [
        {"id": "A", "job": "Doctor", "region": "East", "company": "Acme"},
        {"id": "B", "job": "Doctor", "region": "East", "company": "Acme"},
        {"id": "C", "job": "Lawyer", "region": "East", "company": "Acme"},
        {"id": "D", "job": "Lawyer", "region": "East", "company": "Acme"},
        {"id": "E", "job": "Doctor", "region": "West", "company": "Acme"},
        {"id": "F", "job": "Doctor", "region": "West", "company": "Acme"},
        {"id": "G", "job": "Lawyer", "region": "West", "company": "Acme"},
        {"id": "H", "job": "Lawyer", "region": "West", "company": "Acme"},
        {"id": null, "job": "Accountant", "region": "North", "company": "Acme"}
      ],
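
For reference, a Nest-based attempt on these records might be configured as in the sketch below. The key order (company, then region, then job) and the layout settings are assumptions. Because Nest makes every input record a leaf beneath its key groups, the Accountant record with a null id would still be placed one level below its job node instead of ending the branch there.

```json
[
  {
    "type": "nest",
    "keys": ["company", "region", "job"],
    "generate": true
  },
  {
    "type": "tree",
    "method": "tidy",
    "size": [{"signal": "height"}, {"signal": "width - 100"}],
    "as": ["y", "x", "depth", "children"]
  }
]
```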

Incidentally, it might be an idea to update the documentation on Nest to clarify that only hierarchies with leaves at the same level are supported, as this wasn't 100% clear to me until I tested it.

In summary, a transform that turns the wide format of example 2 into the narrow format of example 1, ready to feed to a Stratify transform, would be really helpful. The relevant d3 issue is here: d3/d3-hierarchy#149
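
To make the request concrete, the ragged Acme records above would come out of such a transform roughly as follows. This is a hand-derived sketch, not output from any existing transform. The level order (company, then region, then job, then id) and the `|`-joined path keys, which keep same-named nodes such as the two Doctor groups distinct, are assumptions.

```json
[
  {"id": "Acme", "parent": null},
  {"id": "East|Acme", "parent": "Acme"},
  {"id": "West|Acme", "parent": "Acme"},
  {"id": "North|Acme", "parent": "Acme"},
  {"id": "Doctor|East|Acme", "parent": "East|Acme"},
  {"id": "Lawyer|East|Acme", "parent": "East|Acme"},
  {"id": "Doctor|West|Acme", "parent": "West|Acme"},
  {"id": "Lawyer|West|Acme", "parent": "West|Acme"},
  {"id": "Accountant|North|Acme", "parent": "North|Acme"},
  {"id": "A", "parent": "Doctor|East|Acme"},
  {"id": "B", "parent": "Doctor|East|Acme"},
  {"id": "C", "parent": "Lawyer|East|Acme"},
  {"id": "D", "parent": "Lawyer|East|Acme"},
  {"id": "E", "parent": "Doctor|West|Acme"},
  {"id": "F", "parent": "Doctor|West|Acme"},
  {"id": "G", "parent": "Lawyer|West|Acme"},
  {"id": "H", "parent": "Lawyer|West|Acme"}
]
```

Rows in this shape could go straight into Stratify with `key` set to `id` and `parentKey` set to `parent`. The Accountant branch ends at the job level because that record's id is null.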

PBI-David added the feature-request label Apr 24, 2023
@PBI-David
Contributor Author

For anyone else facing this issue, you can get a result with a series of transforms, but it is clunky and native support would still be better.

image

{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "description": "An example of Cartesian layouts for a node-link diagram of hierarchical data.",
  "width": 600,
  "height": 600,
  "padding": 5,
  "data": [
    {
      "name": "tree",
      "values": [
        {"id": "A", "job": "Doctor", "region": "East", "company": "Acme"},
        {"id": "B", "job": "Doctor", "region": "East", "company": "Acme"},
        {"id": "C", "job": "Lawyer", "region": "East", "company": "Acme"},
        {"id": "D", "job": "Lawyer", "region": "East", "company": "Acme"},
        {"id": "E", "job": "Doctor", "region": "West", "company": "Acme"},
        {"id": "F", "job": "Doctor", "region": "West", "company": "Acme"},
        {"id": "G", "job": "Lawyer", "region": "West", "company": "Acme"},
        {"id": "H", "job": "Lawyer", "region": "West", "company": "Acme"},
        {"id": null, "job": "Accountant", "region": "North", "company": "Acme"}
      ],
      "transform": [
        {
          "type": "formula",
          "expr": "[datum.id, datum.job + '|'+datum.region+ '|'+datum.company]",
          "as": "p2"
        },
        {
          "type": "formula",
          "expr": "[datum.job + '|'+datum.region+ '|'+datum.company, datum.region+ '|'+datum.company]",
          "as": "p1"
        },
        {
          "type": "formula",
          "expr": "[datum.region+ '|'+datum.company, datum.company]",
          "as": "p0"
        },
        {"type": "formula", "expr": "[ 'Acme', null]", "as": "p-1"},
        {"type": "fold", "fields": ["p-1", "p0", "p1", "p2"]},
        {"type": "project", "fields": ["key", "value"]},
        {"type": "aggregate", "groupby": ["key", "value"]},
        {"type": "formula", "expr": "datum.value[0]", "as": "nodeKey"},
        {
          "type": "formula",
          "expr": "split(datum.value[0],'|')[0]",
          "as": "nodeName"
        },
        {"type": "formula", "expr": "datum.value[1]", "as": "parentKey"},
        {"type": "filter", "expr": "datum.nodeName != 'null'"},
        {"type": "stratify", "key": "nodeKey", "parentKey": "parentKey"},
        {
          "type": "tree",
          "method": {"signal": "'tidy'"},
          "size": [{"signal": "height"}, {"signal": "width - 100"}],
          "separation": true,
          "as": ["y", "x", "depth", "children"]
        }
      ]
    },
    {
      "name": "links",
      "source": "tree",
      "transform": [
        {"type": "treelinks"},
        {"type": "linkpath", "orient": "horizontal", "shape": "diagonal"}
      ]
    }
  ],
  "scales": [
    {
      "name": "color",
      "type": "linear",
      "range": {"scheme": "magma"},
      "domain": {"data": "tree", "field": "depth"},
      "zero": true
    }
  ],
  "marks": [
    {
      "type": "path",
      "from": {"data": "links"},
      "encode": {
        "update": {"path": {"field": "path"}, "stroke": {"value": "#ccc"}}
      }
    },
    {
      "type": "symbol",
      "from": {"data": "tree"},
      "encode": {
        "enter": {"size": {"value": 100}, "stroke": {"value": "#fff"}},
        "update": {
          "x": {"field": "x"},
          "y": {"field": "y"},
          "fill": {"scale": "color", "field": "depth"}
        }
      }
    },
    {
      "type": "text",
      "from": {"data": "tree"},
      "encode": {
        "enter": {
          "text": {"field": "nodeName"},
          "fontSize": {"value": 9},
          "baseline": {"value": "middle"}
        },
        "update": {
          "x": {"field": "x"},
          "y": {"field": "y"},
          "dx": {"signal": "datum.children ? -7 : 7"},
          "align": {"signal": "datum.children ? 'right' : 'left'"},
          "opacity": {"signal": "1"}
        }
      }
    }
  ]
}
