JSONDecodeError in Python when using the requests module
0 votes
/ July 13, 2020
base_url = "https://github.com/statsbomb/open-data/tree/master/data/"

comp_url =  base_url + "matches/{}/{}.json"
match_url = base_url + "events/{}.json"

This is the link that contains the data.

I used a function to parse the different kinds of data in it:

def parsing_data(comp_id,season_id):
    matches = requests.get(url= comp_url.format(comp_id,season_id)).json()
    match_ids =  [m['match_id'] for m in matches]

    for id in match_ids:
        events = requests.get(url= match_url.format(id)).json()
        shots = [x for x in events if x['type']['name'] == 'Shot']

        all_events = []
        for s in shots:
            attribute = {
               'Match_ID' : id,
               'Team' : s['possession_team']['name'],
               'Player': s['player']['name'],
               'Minute': s['minute'],
               'X_shot': s['location'][0],
               'Y_shot': s['location'][1],
               'Shot_with': s['body_part']['name'],
               'Outcome': s['outcome']['name']
            }
            all_events.append(attribute)

    return pd.DataFrame(all_events)

But I get JSONDecodeError: Expecting value: line 6 column 1 (char 5) when I call the function like this:

comp_id = 43
season_id = 3

df = parsing_data(comp_id,season_id)

Can anyone help me with this?

Answers [ 2 ]

0 votes
/ July 13, 2020

You requested the GitHub web page for the directory, which returns HTML rather than JSON. Use the link to the raw file contents instead, for example:

https://raw.githubusercontent.com/statsbomb/open-data/master/data/
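The URL change can be sketched as a small helper. This is only an illustration: to_raw_url is a hypothetical name, and it assumes the usual /<owner>/<repo>/tree|blob/<branch>/... path layout with a branch name that contains no slash.

```python
from urllib.parse import urlsplit, urlunsplit


def to_raw_url(github_url: str) -> str:
    """Rewrite a github.com /tree/ or /blob/ URL as a raw.githubusercontent.com URL."""
    parts = urlsplit(github_url)
    if parts.netloc != "github.com":
        return github_url  # already raw, or not a GitHub page: leave untouched
    segments = parts.path.split("/")
    # Expected path shape: /<owner>/<repo>/<tree|blob>/<branch>/<rest...>
    if len(segments) < 5 or segments[3] not in ("tree", "blob"):
        raise ValueError(f"not a GitHub tree/blob URL: {github_url}")
    del segments[3]  # drop the 'tree' or 'blob' marker
    return urlunsplit(
        ("https", "raw.githubusercontent.com", "/".join(segments), "", "")
    )


base_url = to_raw_url("https://github.com/statsbomb/open-data/tree/master/data/")
print(base_url + "matches/{}/{}.json".format(43, 3))
# https://raw.githubusercontent.com/statsbomb/open-data/master/data/matches/43/3.json
```

The helper only rewrites the string; it does not check that the file exists.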

Also, body_part and outcome are nested under the shot key of each event.

For fetching, you can read the response body with requests.get(url=...).content and convert it to a JSON object with json.loads(string). Calling .json() on the response also works once the URL points at the raw file.
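To see what json.loads does with each kind of body, here is a minimal sketch that runs without network access. parse_json_body is a hypothetical helper, and both bodies are made-up stand-ins: a tiny JSON array, and an HTML page that begins with five blank lines.

```python
import json


def parse_json_body(body):
    """Parse a response body (bytes or str) as JSON, with a clearer error for HTML."""
    try:
        return json.loads(body)  # json.loads accepts str as well as UTF-8 bytes
    except json.JSONDecodeError as err:
        text = body.decode("utf-8", errors="replace") if isinstance(body, bytes) else body
        preview = text.strip()[:40]
        raise ValueError(
            f"body is not JSON ({err.msg}); it starts with {preview!r}"
        ) from err


# Stand-in for the bytes a raw JSON URL would return.
matches = parse_json_body(b'[{"match_id": 8656}]')
print([m["match_id"] for m in matches])

# Stand-in for an HTML page: the parser fails at the first non-whitespace character.
html_body = "\n\n\n\n\n<!DOCTYPE html>\n<html>...</html>"
try:
    json.loads(html_body)
except json.JSONDecodeError as err:
    print(err)  # Expecting value: line 6 column 1 (char 5)
```

An "Expecting value" error this close to the start of the body usually means the server sent something other than JSON, so printing the first few characters of the response is a quick check.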

You can then write the code as:

import requests
import pandas as pd
import json

base_url = "https://raw.githubusercontent.com/statsbomb/open-data/master/data/"

comp_url =  base_url + "matches/{}/{}.json"
match_url = base_url + "events/{}.json"

def parsing_data(comp_id,season_id):
    matches = json.loads(requests.get(url=comp_url.format(comp_id,season_id)).content)
    match_ids =  [m['match_id'] for m in matches]

    for id in match_ids:
        events = requests.get(url= match_url.format(id)).json()
        shots = [x for x in events if x['type']['name'] == 'Shot']

        all_events = []
        for s in shots:
            attribute = {
               'Match_ID' : id,
               'Team' : s['possession_team']['name'],
               'Player': s['player']['name'],
               'Minute': s['minute'],
               'X_shot': s['location'][0],
               'Y_shot': s['location'][1],
               'Shot_with': s['shot']['body_part']['name'],
               'Outcome': s['shot']['outcome']['name']
            }
            all_events.append(attribute)

    return pd.DataFrame(all_events)

comp_id = 43
season_id = 3

df = parsing_data(comp_id,season_id)

Thanks

0 votes
/ July 13, 2020

base_url needs to be changed to fetch the raw JSON content. There were also two errors in the Shot_with and Outcome lookups: body_part and outcome live under the shot key of each event.
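The nesting is easier to see on a single event. The dictionary below is a hand-written, heavily trimmed illustration that keeps only the keys the script reads; real event records contain many more fields. shot_row is a hypothetical helper name.

```python
# Hand-written, trimmed-down shot event (illustrative, not a full record).
sample_event = {
    "type": {"name": "Shot"},
    "possession_team": {"name": "England"},
    "player": {"name": "Kieran Trippier"},
    "minute": 4,
    "location": [96.0, 43.0],
    "shot": {
        "body_part": {"name": "Right Foot"},
        "outcome": {"name": "Goal"},
    },
}


def shot_row(match_id, event):
    """Flatten one shot event into the columns used in this thread."""
    shot = event["shot"]  # body_part and outcome sit one level down, under 'shot'
    return {
        "Match_ID": match_id,
        "Team": event["possession_team"]["name"],
        "Player": event["player"]["name"],
        "Minute": event["minute"],
        "X_shot": event["location"][0],
        "Y_shot": event["location"][1],
        "Shot_with": shot["body_part"]["name"],
        "Outcome": shot["outcome"]["name"],
    }


print("body_part" in sample_event)          # False: not a top-level key
print("body_part" in sample_event["shot"])  # True
print(shot_row(8656, sample_event))
```

Looking up s['body_part'] directly therefore raises KeyError, which is the second error you would hit after fixing the URL.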

This script works:

import requests
import pandas as pd


# changed the base_url to get raw content:
base_url = "https://raw.githubusercontent.com/statsbomb/open-data/master/data/"

comp_url =  base_url + "matches/{}/{}.json"
match_url = base_url + "events/{}.json"

def parsing_data(comp_id,season_id):
    url = comp_url.format(comp_id,season_id)
    matches = requests.get(url=url).json()
    match_ids =  [m['match_id'] for m in matches]

    for id in match_ids:
        events = requests.get(url= match_url.format(id)).json()
        shots = [x for x in events if x['type']['name'] == 'Shot']

        all_events = []
        for s in shots:
            attribute = {
               'Match_ID' : id,
               'Team' : s['possession_team']['name'],
               'Player': s['player']['name'],
               'Minute': s['minute'],
               'X_shot': s['location'][0],
               'Y_shot': s['location'][1],
               'Shot_with': s['shot']['body_part']['name'], # <-- added 'shot'
               'Outcome': s['shot']['outcome']['name']      # <-- added 'shot'
            }
            all_events.append(attribute)

    return pd.DataFrame(all_events)

comp_id = 43
season_id = 3

df = parsing_data(comp_id,season_id)
print(df)

Prints:

    Match_ID     Team                     Player  Minute  X_shot  Y_shot   Shot_with  Outcome
0       8656  England            Kieran Trippier       4    96.0    43.0  Right Foot     Goal
1       8656  England              Harry Maguire      13   111.0    37.0        Head    Off T
2       8656  Croatia               Ivan Perišić      18    94.0    20.0  Right Foot    Off T
3       8656  Croatia                 Ante Rebić      20    98.0    41.0   Left Foot  Blocked
4       8656  Croatia               Ivan Perišić      22    87.0    26.0  Right Foot    Off T
5       8656  Croatia                 Ante Rebić      31   101.0    50.0   Left Foot    Saved
6       8656  England              Jesse Lingard      35   102.0    41.0  Right Foot    Off T
7       8656  England  Raheem Shaquille Sterling      36   104.0    52.0   Left Foot  Blocked
8       8656  Croatia              Šime Vrsaljko      42    88.0    51.0  Right Foot    Off T
9       8656  England              Jesse Lingard      55    96.0    45.0   Left Foot  Blocked
10      8656  Croatia               Ivan Rakitić      60    97.0    34.0   Left Foot    Off T
11      8656  Croatia               Ivan Perišić      64   103.0    41.0  Right Foot  Blocked
12      8656  England                 Harry Kane      66   118.0    56.0  Right Foot    Off T
13      8656  Croatia               Ivan Perišić      67   114.0    40.0   Left Foot     Goal
14      8656  Croatia               Ivan Perišić      71   112.0    30.0   Left Foot     Post
15      8656  Croatia                 Ante Rebić      71   111.0    44.0   Left Foot    Saved
16      8656  Croatia           Marcelo Brozović      72    98.0    48.0  Right Foot    Off T
17      8656  England              Jesse Lingard      76   115.0    55.0  Right Foot  Wayward
18      8656  England     Jordan Brian Henderson      77    95.0    45.0  Right Foot    Off T
19      8656  Croatia            Mario Mandžukić      82   113.0    52.0  Right Foot    Saved
20      8656  Croatia               Ivan Perišić      83   113.0    24.0  Right Foot    Off T
21      8656  Croatia               Dejan Lovren      89    89.0    57.0  Right Foot    Off T
22      8656  England                 Harry Kane      91   113.0    33.0        Head    Off T
23      8656  England                  Eric Dier      97    92.0    51.0  Right Foot  Blocked
24      8656  England                John Stones      98   113.0    49.0        Head  Blocked
25      8656  Croatia            Andrej Kramarić     101   106.0    58.0   Left Foot  Blocked
26      8656  Croatia            Andrej Kramarić     105   101.0    34.0   Left Foot  Blocked
27      8656  Croatia            Mario Mandžukić     106   114.0    39.0  Right Foot    Saved
28      8656  Croatia           Marcelo Brozović     107   111.0    27.0   Left Foot    Off T
29      8656  Croatia            Mario Mandžukić     108   114.0    33.0   Left Foot     Goal
30      8656  Croatia               Ivan Perišić     113   107.0    32.0   Left Foot  Blocked
31      8656  Croatia           Marcelo Brozović     115    97.0    22.0  Right Foot    Saved
32      8656  Croatia            Andrej Kramarić     119   109.0    52.0  Right Foot    Off T
...
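
One thing to watch in the script above: all_events = [] is assigned inside the for id in match_ids loop, so the list is emptied at the start of every match and the returned DataFrame only holds shots from the last match processed. A minimal sketch of the accumulation pattern, assuming the events have already been downloaded into a dict keyed by match ID (collect_shots, the second match ID, and the player names in that match are all made up for illustration):

```python
def collect_shots(events_by_match):
    """Build one row per shot across every match in events_by_match."""
    rows = []  # created once, before the loop, so earlier matches are kept
    for match_id, events in events_by_match.items():
        for event in events:
            if event["type"]["name"] != "Shot":
                continue  # skip passes, carries, and other event types
            rows.append({
                "Match_ID": match_id,
                "Player": event["player"]["name"],
                "Outcome": event["shot"]["outcome"]["name"],
            })
    return rows


# Hypothetical, heavily trimmed events for two matches.
events_by_match = {
    8656: [
        {"type": {"name": "Pass"}, "player": {"name": "Harry Kane"}},
        {"type": {"name": "Shot"}, "player": {"name": "Harry Kane"},
         "shot": {"outcome": {"name": "Off T"}}},
    ],
    1001: [
        {"type": {"name": "Shot"}, "player": {"name": "Player B"},
         "shot": {"outcome": {"name": "Saved"}}},
    ],
}

rows = collect_shots(events_by_match)
print(len(rows))                       # 2
print([r["Match_ID"] for r in rows])   # [8656, 1001]
```

The resulting list can be passed to pd.DataFrame(rows) as in the script; in parsing_data itself the equivalent change is to move all_events = [] above the for id in match_ids line.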