JSON to tree diagram - PullRequest
0 votes
/ June 28, 2018

I have a heavily nested JSON object. Is there a way I could view it as a hierarchical tree diagram? I have looked at several resources, such as Pydot and Plotly, but none of them could render JSON in my format.

JSON file:

{
  "found_intents": {
    "_DATE": {}
  },
  "sentence": "What is your name",
  "tree": [
    [
      {
        "canonical": null,
        "concept": "_START_TAG",
        "correct_string": "<start>",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "<start>",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "<start>",
            "language": "english",
            "span": [
              0,
              1
            ],
            "span_string": "<start>",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "<start>",
        "language": "english",
        "span": [
          0,
          1
        ],
        "span_string": "<start>",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_WHAT_IS",
        "correct_string": "what is",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "what",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "what",
            "language": "english",
            "span": [
              1,
              2
            ],
            "span_string": "what",
            "weight": 1.0
          },
          {
            "canonical": null,
            "concept": "",
            "correct_string": "is",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "is",
            "language": "english",
            "span": [
              2,
              3
            ],
            "span_string": "is",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "what is",
        "language": "english",
        "span": [
          1,
          3
        ],
        "span_string": "what is",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_DICTIONARY",
        "correct_string": "your",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "your",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "your",
            "language": "english",
            "span": [
              3,
              4
            ],
            "span_string": "your",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "your",
        "language": "english",
        "span": [
          3,
          4
        ],
        "span_string": "your",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_DICTIONARY",
        "correct_string": "name",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "name",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "name",
            "language": "english",
            "span": [
              4,
              5
            ],
            "span_string": "name",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "name",
        "language": "english",
        "span": [
          4,
          5
        ],
        "span_string": "name",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_END_TAG",
        "correct_string": "<end>",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "<end>",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "<end>",
            "language": "english",
            "span": [
              5,
              6
            ],
            "span_string": "<end>",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "<end>",
        "language": "english",
        "span": [
          5,
          6
        ],
        "span_string": "<end>",
        "weight": 1.0
      }
    ],
    [
      {
        "canonical": null,
        "concept": "_START_TAG",
        "correct_string": "<start>",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "<start>",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "<start>",
            "language": "english",
            "span": [
              0,
              1
            ],
            "span_string": "<start>",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "<start>",
        "language": "english",
        "span": [
          0,
          1
        ],
        "span_string": "<start>",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_WHAT_IS",
        "correct_string": "what is",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "what",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "what",
            "language": "english",
            "span": [
              1,
              2
            ],
            "span_string": "what",
            "weight": 1.0
          },
          {
            "canonical": null,
            "concept": "",
            "correct_string": "is",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "is",
            "language": "english",
            "span": [
              2,
              3
            ],
            "span_string": "is",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "what is",
        "language": "english",
        "span": [
          1,
          3
        ],
        "span_string": "what is",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_DICTIONARY",
        "correct_string": "your",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "your",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "your",
            "language": "english",
            "span": [
              3,
              4
            ],
            "span_string": "your",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "your",
        "language": "english",
        "span": [
          3,
          4
        ],
        "span_string": "your",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_THEATRE_ID",
        "correct_string": "name",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "name",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "name",
            "language": "english",
            "span": [
              4,
              5
            ],
            "span_string": "name",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "name",
        "language": "english",
        "span": [
          4,
          5
        ],
        "span_string": "name",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_END_TAG",
        "correct_string": "<end>",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "<end>",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "<end>",
            "language": "english",
            "span": [
              5,
              6
            ],
            "span_string": "<end>",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "<end>",
        "language": "english",
        "span": [
          5,
          6
        ],
        "span_string": "<end>",
        "weight": 1.0
      }
    ],
    [
      {
        "canonical": null,
        "concept": "_START_TAG",
        "correct_string": "<start>",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "<start>",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "<start>",
            "language": "english",
            "span": [
              0,
              1
            ],
            "span_string": "<start>",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "<start>",
        "language": "english",
        "span": [
          0,
          1
        ],
        "span_string": "<start>",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_WHAT_IS",
        "correct_string": "what is",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "what",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "what",
            "language": "english",
            "span": [
              1,
              2
            ],
            "span_string": "what",
            "weight": 1.0
          },
          {
            "canonical": null,
            "concept": "",
            "correct_string": "is",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "is",
            "language": "english",
            "span": [
              2,
              3
            ],
            "span_string": "is",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "what is",
        "language": "english",
        "span": [
          1,
          3
        ],
        "span_string": "what is",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_THEATRE_ID",
        "correct_string": "your",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "your",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "your",
            "language": "english",
            "span": [
              3,
              4
            ],
            "span_string": "your",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "your",
        "language": "english",
        "span": [
          3,
          4
        ],
        "span_string": "your",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_DICTIONARY",
        "correct_string": "name",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "name",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "name",
            "language": "english",
            "span": [
              4,
              5
            ],
            "span_string": "name",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "name",
        "language": "english",
        "span": [
          4,
          5
        ],
        "span_string": "name",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_END_TAG",
        "correct_string": "<end>",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "<end>",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "<end>",
            "language": "english",
            "span": [
              5,
              6
            ],
            "span_string": "<end>",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "<end>",
        "language": "english",
        "span": [
          5,
          6
        ],
        "span_string": "<end>",
        "weight": 1.0
      }
    ],
    [
      {
        "canonical": null,
        "concept": "_START_TAG",
        "correct_string": "<start>",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "<start>",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "<start>",
            "language": "english",
            "span": [
              0,
              1
            ],
            "span_string": "<start>",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "<start>",
        "language": "english",
        "span": [
          0,
          1
        ],
        "span_string": "<start>",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_WHAT_IS",
        "correct_string": "what is",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "what",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "what",
            "language": "english",
            "span": [
              1,
              2
            ],
            "span_string": "what",
            "weight": 1.0
          },
          {
            "canonical": null,
            "concept": "",
            "correct_string": "is",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "is",
            "language": "english",
            "span": [
              2,
              3
            ],
            "span_string": "is",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "what is",
        "language": "english",
        "span": [
          1,
          3
        ],
        "span_string": "what is",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_THEATRE_ID",
        "correct_string": "your",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "your",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "your",
            "language": "english",
            "span": [
              3,
              4
            ],
            "span_string": "your",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "your",
        "language": "english",
        "span": [
          3,
          4
        ],
        "span_string": "your",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_THEATRE_ID",
        "correct_string": "name",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "name",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "name",
            "language": "english",
            "span": [
              4,
              5
            ],
            "span_string": "name",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "name",
        "language": "english",
        "span": [
          4,
          5
        ],
        "span_string": "name",
        "weight": 1.0
      },
      {
        "canonical": null,
        "concept": "_END_TAG",
        "correct_string": "<end>",
        "definition": "",
        "details": [
          {
            "canonical": null,
            "concept": "",
            "correct_string": "<end>",
            "definition": "",
            "details": [],
            "e.g.": [],
            "grammar": "<end>",
            "language": "english",
            "span": [
              5,
              6
            ],
            "span_string": "<end>",
            "weight": 1.0
          }
        ],
        "e.g.": [],
        "grammar": "<end>",
        "language": "english",
        "span": [
          5,
          6
        ],
        "span_string": "<end>",
        "weight": 1.0
      }
    ]
  ]
}

More precisely, I would like to extract the average number of branches per level.

The expected results would be:

Level 0: 4  
Level 1: (1 + 2 + 1 + 1)/4  
Level 2: 0

1 Answer

0 votes
/ June 28, 2018

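It's not entirely clear what you're trying to do, but the following code counts the number of dicts and lists at each nesting depth. We use a collections.deque as a stack to traverse the structure depth-first (the visiting order doesn't matter here, since every container is stored together with its own depth). For each container we count the nested containers it holds, store that count in a defaultdict of lists keyed by depth, and push those nested containers onto the stack for further processing. Containers that hold no nested containers are not recorded, so they don't pull the averages down. Once all objects have been counted, we compute the mean number of branches at each level.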

I won't paste your data into this code, since it's about 740 lines long. I refer to the data as data_string; on my machine I simply wrapped your data in triple quotes, but of course you can save it to a file and load it with json.load.
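If you go the file route, loading could look roughly like the sketch below. The helper name load_json_file and the file name data.json are just placeholders I made up for illustration; only open and json.load are real library calls.

```python
import json

def load_json_file(path):
    """Parse the JSON document stored at `path` and return the Python object."""
    # json.load reads from an open file object, unlike json.loads,
    # which expects the JSON text itself as a string.
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Hypothetical usage, assuming the question's data was saved as data.json:
# data = load_json_file("data.json")
```

With that in place, the json.loads(data_string) line in the code below would be replaced by the load_json_file call.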

import json
from collections import defaultdict, deque

data = json.loads(data_string)

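def get_branches(obj):
    # Maps depth -> list of branch counts, one per container at that depth
    branches = defaultdict(list)
    # Stack of (container, depth) pairs still to be processed
    stack = deque()
    stack.append((obj, 0))
    while stack:
        obj, depth = stack.pop()
        newdepth = depth + 1
        branch_count = 0
        # For a dict, only the values can hold nested containers
        if isinstance(obj, dict):
            obj = obj.values()
        for child in obj:
            if isinstance(child, (list, dict)):
                branch_count += 1
                stack.append((child, newdepth))
        # Containers with no nested containers are not recorded
        if branch_count:
            branches[depth].append(branch_count)
    return branches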

branches = get_branches(data)
for depth in sorted(branches.keys()):
    row = branches[depth]
    mean = sum(row) / len(row) if row else None
    print('Level', depth, row, mean)

Output

Level 0 [2] 2.0
Level 1 [4, 1] 2.5
Level 2 [5, 5, 5, 5] 5.0
Level 3 [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3] 3.0
Level 4 [1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2, 1] 1.2
Level 5 [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3] 3.0
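
The counts above treat every nested dict and list as a branch, including the "span" and "e.g." lists. If by "branches" you only mean the parse nodes themselves (the entries under "tree" and their "details" children), a narrower count might look like the sketch below. This is my guess at the intent, not something the question confirms; the function name details_branching and the tiny sample are made up for illustration, and unlike get_branches it also counts nodes with no children.

```python
from collections import defaultdict

def details_branching(parses):
    """Average number of "details" children per node at each depth.

    `parses` is assumed to be the list stored under the "tree" key:
    a list of parses, each parse being a list of node dicts that
    carry a "details" list of child nodes.
    """
    counts = defaultdict(list)
    # Depth 0 holds every top-level node of every parse.
    frontier = [node for parse in parses for node in parse]
    depth = 0
    while frontier:
        next_frontier = []
        for node in frontier:
            children = node.get("details", [])
            # Leaf nodes are recorded with a count of 0.
            counts[depth].append(len(children))
            next_frontier.extend(children)
        frontier = next_frontier
        depth += 1
    return {d: sum(c) / len(c) for d, c in counts.items()}

# A made-up two-node parse, much smaller than the question's data.
sample = [[
    {"concept": "_WHAT_IS", "details": [{"details": []}, {"details": []}]},
    {"concept": "_DICTIONARY", "details": [{"details": []}]},
]]
print(details_branching(sample))  # {0: 1.5, 1: 0.0}
```

Called as details_branching(data["tree"]) on the question's data, depth 0 should come out at 1.2, the same mean as the Level 4 row above, because both count the children of the 20 top-level nodes' "details" lists.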