Execution time increases when reading multiple levels of repeated data from a JSON structure in Google BigQuery
0 votes
/ March 30, 2020

I'm running into query execution time problems when reading multiple levels of a repeated JSON structure. The query works fine with up to 6-7 fields, but it slows down as I read more columns.

Input data:

{
"ProjectId": "P.2000002",
"OperationId": "O.2000002.01",
"ActivityId": "A.2000002.01.01",
"Description": "",
"Combos": [
  {
    "ComboId": "9146",
    "Demands": {
      "DownHoleTools": {
        "PrimaryTools": [
          {
            "ToolCode": "19139",
            "ToolDescription": "ABCD-C/D",
            "IsEdoApplicable": true,
            "Source": "A",
            "DemandDurationInfo": {
              "StartDate": "2019-09-09T17:42:10",
              "EndDate": "2019-09-19T23:00:00"
            },
            "HashNumber": 1,
            "ClassificationName": "ABCD-C/D",
            "ClassificationType": 0,
            "GroupInfo": {
              "Code": "1519",
              "Description": "ABCD"
            },
            "CategoryInfo": {
              "Code": "1519",
              "Description": "ABCD"
            },
            "Comments": "",
            "IsDeleted": false,
            "PartNumber": "",
            "Description": "",
            "CreatedDate": "0001-01-01T00:00:00",
            "CreatedBy": "",
            "LastModifiedDate": "0001-01-01T00:00:00",
            "LastModifiedBy": "",
            "Id": "1"
          },
          {
            "ToolCode": "7030",
            "ToolDescription": "VSIB-P",
            "IsEdoApplicable": false,
            "Source": "A",
            "DemandDurationInfo": {
              "StartDate": "2019-09-09T17:42:12",
              "EndDate": "2019-09-19T23:00:00"
            },
            "HashNumber": 1,
            "ClassificationName": "VSIB-P",
            "ClassificationType": 0,
            "GroupInfo": {
              "Code": "1519",
              "Description": "ABCD"
            },
            "CategoryInfo": {
              "Code": "1519",
              "Description": "ABCD"
            },
            "Comments": "",
            "IsDeleted": false,
            "PartNumber": "",
            "Description": "",
            "CreatedDate": "0001-01-01T00:00:00",
            "CreatedBy": "",
            "LastModifiedDate": "0001-01-01T00:00:00",
            "LastModifiedBy": "",
            "Id": "1"
          },
          {
            "ToolCode": "3707",
            "ToolDescription": "HILT-TLD-H",
            "IsEdoApplicable": false,
            "Source": "A",
            "DemandDurationInfo": {
              "StartDate": "2020-02-12T15:18:32",
              "EndDate": "2020-02-13T15:18:32"
            },
            "HashNumber": 1,
            "ClassificationName": "HILT-TLD-H",
            "ClassificationType": 0,
            "GroupInfo": {
              "Code": "842",
              "Description": "HILT"
            },
            "CategoryInfo": {
              "Code": "842",
              "Description": "HILT"
            },
            "Comments": "",
            "IsDeleted": false,
            "PartNumber": "",
            "Description": "HILT-TLD-H",
            "CreatedDate": "0001-01-01T00:00:00",
            "CreatedBy": "",
            "LastModifiedDate": "0001-01-01T00:00:00",
            "LastModifiedBy": "",
            "Id": "1"
          },
          {
            "ToolCode": "3707",
            "ToolDescription": "HILT-TLD-H",
            "IsEdoApplicable": false,
            "Source": "A",
            "DemandDurationInfo": {
              "StartDate": "2020-02-12T15:18:32",
              "EndDate": "2020-02-13T15:18:32"
            },
            "HashNumber": 2,
            "ClassificationName": "HILT-TLD-H",
            "ClassificationType": 0,
            "GroupInfo": {
              "Code": "842",
              "Description": "HILT"
            },
            "CategoryInfo": {
              "Code": "842",
              "Description": "HILT"
            },
            "Comments": "",
            "IsDeleted": false,
            "PartNumber": "",
            "Description": "HILT-TLD-H",
            "CreatedDate": "0001-01-01T00:00:00",
            "CreatedBy": "",
            "LastModifiedDate": "0001-01-01T00:00:00",
            "LastModifiedBy": "",
            "Id": "1"
          }
        ],
        "BackupTools": [

        ]
      },
      "SurfaceTools": {
        "PrimaryTools": [
          {
            "ToolCode": "19153",
            "ToolDescription": "MDT_Surface Eqpt",
            "IsEdoApplicable": false,
            "Source": "A",
            "DemandDurationInfo": {
              "StartDate": "2020-02-12T15:18:32",
              "EndDate": "2020-02-13T15:18:32"
            },
            "HashNumber": 1,
            "ClassificationName": "MDT_Surface Eqpt",
            "ClassificationType": 1,
            "GroupInfo": {
              "Code": "965",
              "Description": "MDT Accessories"
            },
            "CategoryInfo": {
              "Code": "965",
              "Description": "MDT Accessories"
            },
            "Comments": "",
            "IsDeleted": false,
            "PartNumber": "",
            "Description": "MDT_Surface Eqpt",
            "CreatedDate": "0001-01-01T00:00:00",
            "CreatedBy": "",
            "LastModifiedDate": "0001-01-01T00:00:00",
            "LastModifiedBy": "",
            "Id": "1"
          },
          {
            "ToolCode": "19153",
            "ToolDescription": "MDT_Surface Eqpt",
            "IsEdoApplicable": false,
            "Source": "A",
            "DemandDurationInfo": {
              "StartDate": "2020-02-12T15:18:32",
              "EndDate": "2020-02-13T15:18:32"
            },
            "HashNumber": 2,
            "ClassificationName": "MDT_Surface Eqpt",
            "ClassificationType": 1,
            "GroupInfo": {
              "Code": "965",
              "Description": "MDT Accessories"
            },
            "CategoryInfo": {
              "Code": "965",
              "Description": "MDT Accessories"
            },
            "Comments": "",
            "IsDeleted": false,
            "PartNumber": "",
            "Description": "MDT_Surface Eqpt",
            "CreatedDate": "0001-01-01T00:00:00",
            "CreatedBy": "",
            "LastModifiedDate": "0001-01-01T00:00:00",
            "LastModifiedBy": "",
            "Id": "1"
          }
        ],
        "BackupTools": [

        ]
      },
      "Techniques": {
        "PrimaryTools": [

        ],
        "BackupTools": [

        ]
      },
      "Services": [

      ],
      "Tools": ""
    },
    "ComboType": 2,
    "HashCode": "",
    "SequenceNumber": "",
    "ConveyanceInfo": "",
    "CreatedDate": "0001-01-01T00:00:00",
    "CreatedBy": "",
    "LastModifiedDate": "0001-01-01T00:00:00",
    "LastModifiedBy": "",
    "Id": "98e9418f-e50a-417b-affb-5fc4c1f71f39"
  },
  {
    "ComboId": "5970",
    "Demands": {
      "DownHoleTools": {
        "PrimaryTools": [

        ],
        "BackupTools": [

        ]
      },
      "SurfaceTools": {
        "PrimaryTools": [

        ],
        "BackupTools": [

        ]
      },
      "Techniques": {
        "PrimaryTools": [

        ],
        "BackupTools": [

        ]
      },
      "Services": [

      ],
      "Tools": ""
    },
    "ComboType": 1,
    "HashCode": "",
    "SequenceNumber": "",
    "ConveyanceInfo": "",
    "CreatedDate": "0001-01-01T00:00:00",
    "CreatedBy": "",
    "LastModifiedDate": "0001-01-01T00:00:00",
    "LastModifiedBy": "",
    "Id": "944cf025-2a8c-4372-9f87-6c80c844ac68"
  },
  {
    "ComboId": "5971",
    "Demands": {
      "DownHoleTools": {
        "PrimaryTools": [

        ],
        "BackupTools": [

        ]
      },
      "SurfaceTools": {
        "PrimaryTools": [

        ],
        "BackupTools": [

        ]
      },
      "Techniques": {
        "PrimaryTools": [

        ],
        "BackupTools": [

        ]
      },
      "Services": [

      ],
      "Tools": ""
    },
    "ComboType": 0,
    "HashCode": "",
    "SequenceNumber": "",
    "ConveyanceInfo": "",
    "CreatedDate": "0001-01-01T00:00:00",
    "CreatedBy": "",
    "LastModifiedDate": "0001-01-01T00:00:00",
    "LastModifiedBy": "",
    "Id": "0a9338b2-aa95-4d5a-8e57-1305e78fec0c"
  },
  {
    "ComboId": "26793",
    "Demands": {
      "DownHoleTools": {
        "PrimaryTools": [

        ],
        "BackupTools": [

        ]
      },
      "SurfaceTools": {
        "PrimaryTools": [

        ],
        "BackupTools": [

        ]
      },
      "Techniques": {
        "PrimaryTools": [

        ],
        "BackupTools": [

        ]
      },
      "Services": [
        {
          "Code": "GIWS",
          "Name": "Grease Injection WHE Service",
          "Description": "",
          "GroupInfos": "",
          "ClassificationType": 4,
          "LegacySystemMapping": [
            {
              "LegacyId": "EE61B186-CE57-46A8-B280-FE913CC8FF33",
              "LegacySystemMappedProperty": "Grease Injection WHE Service",
              "LegacySystemName": "ODM"
            },
            {
              "LegacyId": "2e4b1aea-3d91-43b6-9c32-a165a546ed39",
              "LegacySystemMappedProperty": "Grease Injection WHE Service",
              "LegacySystemName": "OSCompliance"
            }
          ],
          "Source": "A",
          "Comments": "",
          "Id": "9c405cc1-5231-4baf-864f-7974bb4fbe07"
        },
        {
          "Code": "SCNNGWS",
          "Name": "Slick Cable Non-Grease Injection WHE Service",
          "Description": "",
          "GroupInfos": "",
          "ClassificationType": 4,
          "LegacySystemMapping": [
            {
              "LegacyId": "7B41D3DD-A6A7-47AF-81CD-5D8C248183B5",
              "LegacySystemMappedProperty": "Slick Cable Non-Grease Injection WHE Service",
              "LegacySystemName": "ODM"
            },
            {
              "LegacyId": "399e61ed-353a-404f-aad3-1e84c46cb273",
              "LegacySystemMappedProperty": "Slick Cable Non-Grease Injection WHE Service",
              "LegacySystemName": "OSCompliance"
            }
          ],
          "Source": "A",
          "Comments": "",
          "Id": "0a4848d3-7d93-4721-a556-2ecf1b0a7f43"
        },
        {
          "Code": "TPWS",
          "Name": "Third Party WHE Service",
          "Description": "",
          "GroupInfos": "",
          "ClassificationType": 4,
          "LegacySystemMapping": [
            {
              "LegacyId": "CAF5754A-3F90-40C6-82AC-0A4F484A4E74",
              "LegacySystemMappedProperty": "Third Party WHE Service",
              "LegacySystemName": "ODM"
            },
            {
              "LegacyId": "50ff1668-0b3b-489c-a7ad-794e327028e5",
              "LegacySystemMappedProperty": "Third Party WHE Service",
              "LegacySystemName": "OSCompliance"
            }
          ],
          "Source": "A",
          "Comments": "",
          "Id": "ee923f9e-210f-4c36-8fc4-f1e9521d0cbe"
        },
        {
          "Code": "WLPPS",
          "Name": "Wireline Low Pressure Packoff Service",
          "Description": "",
          "GroupInfos": "",
          "ClassificationType": 4,
          "LegacySystemMapping": [
            {
              "LegacyId": "CB79CDE1-2271-418D-9E67-D4B07E94AC61",
              "LegacySystemMappedProperty": "Wireline Low Pressure Packoff Service",
              "LegacySystemName": "ODM"
            },
            {
              "LegacyId": "514650dd-5b71-4889-b991-0d77dd355666",
              "LegacySystemMappedProperty": "Wireline Low Pressure Packoff Service",
              "LegacySystemName": "OSCompliance"
            }
          ],
          "Source": "A",
          "Comments": "",
          "Id": "2216c1c0-41ba-40aa-90c2-1096e8191d2c"
        }
      ],
      "Tools": ""
    },
    "ComboType": 2,
    "HashCode": "",
    "SequenceNumber": "",
    "ConveyanceInfo": "",
    "CreatedDate": "0001-01-01T00:00:00",
    "CreatedBy": "",
    "LastModifiedDate": "0001-01-01T00:00:00",
    "LastModifiedBy": "",
    "Id": "446d43e1-6476-4408-b850-e5f233933ba9"
  }
],
"CreatedDate": "2019-09-09T13:12:14.94",
"CreatedBy": "VHiremath",
"LastModifiedDate": "2020-02-27T07:40:08.071",
"LastModifiedBy": "VPanath",
"Id": "5d764fae3d6a351088a1c9d3"

}

Code:

CREATE TEMPORARY FUNCTION CUSTOM_JSON_EXTRACT(json STRING, json_path STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS '''
  var result = jsonPath(JSON.parse(json), json_path);
  if(result){return result;} 
  else {return [];}
'''
OPTIONS (
    library="gs://temp-dev-workspace/json_temp/jsonpath-0.8.0.js"
);
SELECT distinct
  job_id,
  combo_id,
  tool_code,
  tool_description,
  is_edo_applicable,
  Source,
  Demand_Start_date,
  Demand_End_date,
  Hash_Number,
  Classification_Name,
  Classification_Type,
  GroupInfo_Code,
  GroupInfo_Description,
  CategoryInfo_Code,
  CategoryInfo_Description,
  Is_Deleted,
  Part_Number,
  Description,
  Created_Date,
  Created_By,
  Last_Modified_Date,
  Last_Modified_By  

FROM temp.dbm_eqp_data,
UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[*].ComboId')) combo_id
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[*].ToolCode')) tool_code
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].ToolDescription')) tool_description
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].IsEdoApplicable')) is_edo_applicable
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].Source')) Source
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].DemandDurationInfo.StartDate')) Demand_Start_date
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].DemandDurationInfo.EndDate')) Demand_End_date
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].HashNumber')) Hash_Number
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].ClassificationName')) Classification_Name
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].ClassificationType')) Classification_Type
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].GroupInfo.Code')) GroupInfo_Code
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].GroupInfo.Description')) GroupInfo_Description
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].CategoryInfo.Code')) CategoryInfo_Code
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].CategoryInfo.Description')) CategoryInfo_Description
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].IsDeleted')) Is_Deleted
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].PartNumber')) Part_Number
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].Description')) Description
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].CreatedDate')) Created_Date
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].CreatedBy')) Created_By
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].LastModifiedDate')) Last_Modified_Date
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(conv_column, '$.Combos[?(@.ComboId=="' || combo_id || '")].Demands.DownHoleTools.PrimaryTools[?(@.ToolCode=="' || tool_code || '")].LastModifiedBy')) Last_Modified_By

Output: [image not available]

1 Answer

1 vote
/ March 31, 2020

Below is a solution for BigQuery Standard SQL (it also resolves the performance problem):

#standardSQL
CREATE TEMPORARY FUNCTION CUSTOM_JSON_EXTRACT(json STRING, json_path STRING) 
RETURNS ARRAY<STRING> LANGUAGE js AS '''
  var result = jsonPath(JSON.parse(json), json_path);
  if(result){return result;} 
  else {return [];}
'''
OPTIONS (library='gs://my_test__bucket/jsonpath-0.8.0.js');

CREATE TEMP FUNCTION JSON_2_ARRAY(input STRING) RETURNS ARRAY<STRING>
  LANGUAGE js AS 'return JSON.parse(input).map(x => JSON.stringify(x));';

SELECT 
  job_id,
  combo_id,
  JSON_EXTRACT_SCALAR(primary_tools, '$.ToolCode') tool_code,
  JSON_EXTRACT_SCALAR(primary_tools, '$.ToolDescription') tool_description,
  JSON_EXTRACT_SCALAR(primary_tools, '$.IsEdoApplicable') is_edo_applicable,
  JSON_EXTRACT_SCALAR(primary_tools, '$.Source') Source,
  JSON_EXTRACT_SCALAR(primary_tools, '$.DemandDurationInfo.StartDate') Demand_Start_date,
  JSON_EXTRACT_SCALAR(primary_tools, '$.DemandDurationInfo.EndDate') Demand_End_date,
  JSON_EXTRACT_SCALAR(primary_tools, '$.HashNumber') Hash_Number,
  JSON_EXTRACT_SCALAR(primary_tools, '$.ClassificationName') Classification_Name,
  JSON_EXTRACT_SCALAR(primary_tools, '$.ClassificationType') Classification_Type,
  JSON_EXTRACT_SCALAR(primary_tools, '$.GroupInfo.Code') GroupInfo_Code,
  JSON_EXTRACT_SCALAR(primary_tools, '$.GroupInfo.Description') GroupInfo_Description,
  JSON_EXTRACT_SCALAR(primary_tools, '$.CategoryInfo.Code') CategoryInfo_Code,
  JSON_EXTRACT_SCALAR(primary_tools, '$.CategoryInfo.Description') CategoryInfo_Description,
  JSON_EXTRACT_SCALAR(primary_tools, '$.IsDeleted') Is_Deleted,
  JSON_EXTRACT_SCALAR(primary_tools, '$.PartNumber') Part_Number,
  JSON_EXTRACT_SCALAR(primary_tools, '$.Description') Description,
  JSON_EXTRACT_SCALAR(primary_tools, '$.CreatedDate') Created_Date,
  JSON_EXTRACT_SCALAR(primary_tools, '$.CreatedBy') Created_By,
  JSON_EXTRACT_SCALAR(primary_tools, '$.LastModifiedDate') Last_Modified_Date,
  JSON_EXTRACT_SCALAR(primary_tools, '$.LastModifiedBy') Last_Modified_By
FROM `temp.dbm_eqp_data`,
UNNEST(JSON_2_ARRAY(JSON_EXTRACT(conv_column, '$.Combos'))) combo
LEFT JOIN UNNEST(CUSTOM_JSON_EXTRACT(combo, '$.ComboId')) combo_id
LEFT JOIN UNNEST(JSON_2_ARRAY(JSON_EXTRACT(combo, '$.Demands.DownHoleTools.PrimaryTools'))) primary_tools

with the following output:

Row job_id  combo_id    tool_code   tool_description    is_edo_applicable   Source  Demand_Start_date   Demand_End_date Hash_Number Classification_Name Classification_Type GroupInfo_Code  GroupInfo_Description   CategoryInfo_Code   CategoryInfo_Description    Is_Deleted  Part_Number Description Created_Date    Created_By  Last_Modified_Date  Last_Modified_By     
1   1   9146    19139   VSIT-C/D    true    A   2019-09-09T17:42:10 2019-09-19T23:00:00 1   VSIT-C/D    0   1519    VSIT    1519    VSIT    false           0001-01-01T00:00:00     0001-01-01T00:00:00      
2   1   9146    7030    VSIB-P  false   A   2019-09-09T17:42:12 2019-09-19T23:00:00 1   VSIB-P  0   1519    VSIT    1519    VSIT    false           0001-01-01T00:00:00     0001-01-01T00:00:00      
3   1   9146    3707    HILT-TLD-H  false   A   2020-02-12T15:18:32 2020-02-13T15:18:32 1   HILT-TLD-H  0   842 HILT    842 HILT    false       HILT-TLD-H  0001-01-01T00:00:00     0001-01-01T00:00:00      
4   1   9146    3707    HILT-TLD-H  false   A   2020-02-12T15:18:32 2020-02-13T15:18:32 2   HILT-TLD-H  0   842 HILT    842 HILT    false       HILT-TLD-H  0001-01-01T00:00:00     0001-01-01T00:00:00      
5   1   5970    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null     
6   1   5971    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null     
7   1   26793   null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null    null     
...
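
A likely reason the original query degrades as columns are added: ToolCode 3707 appears twice under combo 9146, so every `[?(@.ToolCode=="3707")]` filter returns two values, and each additional LEFT JOIN UNNEST doubles the rows for that tool before DISTINCT collapses them. Each UDF call also re-parses the whole document. The plain JavaScript sketch below is only a rough model of that row growth, not a measurement of BigQuery. The names `doc`, `rowsPerElement` and `joinChainRowCount` are hypothetical, the sample is trimmed to two combos and two fields, and it assumes a combo without tools contributes a single all-null row.

```javascript
// Trimmed stand-in for one conv_column value (only the fields used in this sketch).
const doc = JSON.stringify({
  Combos: [
    { ComboId: "9146", Demands: { DownHoleTools: { PrimaryTools: [
      { ToolCode: "19139", HashNumber: 1 },
      { ToolCode: "7030", HashNumber: 1 },
      { ToolCode: "3707", HashNumber: 1 },
      { ToolCode: "3707", HashNumber: 2 },
    ] } } },
    { ComboId: "5970", Demands: { DownHoleTools: { PrimaryTools: [] } } },
  ],
});

// Same body as the JSON_2_ARRAY UDF: a JSON array string becomes
// an array of JSON strings, one per element.
function json2Array(input) {
  return JSON.parse(input).map(x => JSON.stringify(x));
}

// Model of the answer's query: one output row per tool object,
// or one all-null row for a combo with no tools (the LEFT JOIN case).
function rowsPerElement(docString) {
  const rows = [];
  const combos = json2Array(JSON.stringify(JSON.parse(docString).Combos));
  for (const comboStr of combos) {
    const combo = JSON.parse(comboStr);
    const tools = json2Array(JSON.stringify(combo.Demands.DownHoleTools.PrimaryTools));
    if (tools.length === 0) {
      rows.push({ combo_id: combo.ComboId, tool_code: null, hash_number: null });
      continue;
    }
    for (const toolStr of tools) {
      const tool = JSON.parse(toolStr);
      rows.push({ combo_id: combo.ComboId, tool_code: tool.ToolCode, hash_number: tool.HashNumber });
    }
  }
  return rows;
}

// Model of the question's query: rows produced (before DISTINCT) when
// `fieldCount` joins are chained and each one filters PrimaryTools by ToolCode.
function joinChainRowCount(docString, fieldCount) {
  let total = 0;
  for (const combo of JSON.parse(docString).Combos) {
    const tools = combo.Demands.DownHoleTools.PrimaryTools;
    if (tools.length === 0) {
      total += 1; // assumed: a single all-null row
      continue;
    }
    for (const tool of tools) {
      // Every field join returns one value per tool sharing this ToolCode.
      const matches = tools.filter(t => t.ToolCode === tool.ToolCode).length;
      total += Math.pow(matches, fieldCount);
    }
  }
  return total;
}

console.log(rowsPerElement(doc).length);  // 5
console.log(joinChainRowCount(doc, 19));  // 1048579
```

The question's query chains 19 field joins after `tool_code`, so under this model the two 3707 rows alone expand to 2 × 2^19 intermediate rows, while unnesting whole tool objects keeps one row per tool regardless of how many fields are selected.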