Trying to get a Python program to print selected stats from web scraping
1 vote
/ August 5, 2020

I'm new to Beautiful Soup and am looking for a way to have the user enter which team they want and which week, and then have the script print certain stats for that week. When I enter the team and the week number, nothing is printed and the script just drops straight back to the command prompt.

Here is my code:

import requests  
from bs4 import BeautifulSoup  

team = input('''What team are you looking for?
    crd - Arizona Cardinals
    atl - Atlanta Falcons
    rav - Baltimore Ravens
    buf - Buffalo Bills
    car - Carolina Panthers
    chi - Chicago Bears
    cin - Cincinnati Bengals
    cle - Cleveland Browns
    dal - Dallas Cowboys
    den - Denver Broncos
    det - Detroit Lions
    gnb - Green Bay Packers
    htx - Houston Texans
    clt - Indianapolis Colts
    jax - Jacksonville Jaguars
    kan - Kansas City Chiefs
    sdg - Los Angeles Chargers
    ram - Los Angeles Rams
    mia - Miami Dolphins
    min - Minnesota Vikings
    nwe - New England Patriots
    nor - New Orleans Saints
    nyg - New York Giants
    nyj - New York Jets
    rai - Oakland Raiders
    phi - Philadelphia Eagles
    pit - Pittsburgh Steelers
    sfo - San Fransisco 49ers
    sea - Seattle Seahawks
    tam - Tampa Bay Buccaneers
    oti - Tennessee Titans
    was - Washington Football Team

    Enter the 3 letter code for the team: ''')
week = int(input('What week are you looking for? '))
  
url = 'https://www.pro-football-reference.com/teams/' + team.lower() + '/2019.htm'  
page = requests.get(url)  
  
soup = BeautifulSoup(page.content, 'html.parser')     

week_num = soup.find_all('th', attrs={"data-stat": "week_num", "class": "right", "scope": "row"})
total_off = soup.find_all('td', attrs={"data-stat": "yards_off", "class": "right"})
total_def = soup.find_all('td', attrs={"data-stat": "yards_def", "class": "right"})
pass_yards_off = soup.find_all('td', attrs={"data-stat": "pass_yds_off", "class": "right"})
pass_yards_def = soup.find_all('td', attrs={"data-stat": "pass_yds_def", "class": "right"})
rush_yards_off = soup.find_all('td', attrs={"data-stat": "rush_yds_off", "class": "right"})
rush_yards_def = soup.find_all('td', attrs={"data-stat": "rush_yds_def", "class": "right"})
team_score = soup.find_all('td', attrs={"data-stat": "pts_off", "class": "right"})
opp_score = soup.find_all('td', attrs={"data-stat": "pts_def", "class": "right"})




for i in range(len(week_num)):
    if week in week_num:
        print('Week Number: ' + week_num[i].text.strip(),
            'Total Off: ' + total_off[i].text.strip(),
            'Total Def: ' + total_def[i].text.strip(),
            'Passing Yards Off: ' + pass_yards_off[i].text.strip(),
            'Passing Yards Def: ' + pass_yards_def[i].text.strip(),
            'Rushing Yards Off: ' + rush_yards_off[i].text.strip(),
            'Rushing Yards Def: ' + rush_yards_def[i].text.strip(), '\n')

Here is the output when I run it:

What team are you looking for?
    crd - Arizona Cardinals
    atl - Atlanta Falcons
    rav - Baltimore Ravens
    buf - Buffalo Bills
    car - Carolina Panthers
    chi - Chicago Bears
    cin - Cincinnati Bengals
    cle - Cleveland Browns
    dal - Dallas Cowboys
    den - Denver Broncos
    det - Detroit Lions
    gnb - Green Bay Packers
    htx - Houston Texans
    clt - Indianapolis Colts
    jax - Jacksonville Jaguars
    kan - Kansas City Chiefs
    sdg - Los Angeles Chargers
    ram - Los Angeles Rams
    mia - Miami Dolphins
    min - Minnesota Vikings
    nwe - New England Patriots
    nor - New Orleans Saints
    nyg - New York Giants
    nyj - New York Jets
    rai - Oakland Raiders
    phi - Philadelphia Eagles
    pit - Pittsburgh Steelers
    sfo - San Fransisco 49ers
    sea - Seattle Seahawks
    tam - Tampa Bay Buccaneers
    oti - Tennessee Titans
    was - Washington Football Team

    Enter the 3 letter code for the team: nwe
What week are you looking for? 6

Answers [ 2 ]

1 vote
/ August 5, 2020

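The if condition in the loop needs to be changed. week is an int, while week_num is a list of Beautiful Soup tags, so the test week in week_num is never true and nothing gets printed.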

import requests  
from bs4 import BeautifulSoup  

team = input('''What team are you looking for?
    crd - Arizona Cardinals
    atl - Atlanta Falcons
    rav - Baltimore Ravens
    buf - Buffalo Bills
    car - Carolina Panthers
    chi - Chicago Bears
    cin - Cincinnati Bengals
    cle - Cleveland Browns
    dal - Dallas Cowboys
    den - Denver Broncos
    det - Detroit Lions
    gnb - Green Bay Packers
    htx - Houston Texans
    clt - Indianapolis Colts
    jax - Jacksonville Jaguars
    kan - Kansas City Chiefs
    sdg - Los Angeles Chargers
    ram - Los Angeles Rams
    mia - Miami Dolphins
    min - Minnesota Vikings
    nwe - New England Patriots
    nor - New Orleans Saints
    nyg - New York Giants
    nyj - New York Jets
    rai - Oakland Raiders
    phi - Philadelphia Eagles
    pit - Pittsburgh Steelers
    sfo - San Fransisco 49ers
    sea - Seattle Seahawks
    tam - Tampa Bay Buccaneers
    oti - Tennessee Titans
    was - Washington Football Team

    Enter the 3 letter code for the team: ''')

week = int(input('What week are you looking for? '))
  
url = 'https://www.pro-football-reference.com/teams/' + team.lower() + '/2019.htm'  
page = requests.get(url)  

soup = BeautifulSoup(page.content, 'html.parser')     

week_num = soup.find_all('th', attrs={"data-stat": "week_num", "class": "right", "scope": "row"})
total_off = soup.find_all('td', attrs={"data-stat": "yards_off", "class": "right"})
total_def = soup.find_all('td', attrs={"data-stat": "yards_def", "class": "right"})
pass_yards_off = soup.find_all('td', attrs={"data-stat": "pass_yds_off", "class": "right"})
pass_yards_def = soup.find_all('td', attrs={"data-stat": "pass_yds_def", "class": "right"})
rush_yards_off = soup.find_all('td', attrs={"data-stat": "rush_yds_off", "class": "right"})
rush_yards_def = soup.find_all('td', attrs={"data-stat": "rush_yds_def", "class": "right"})
team_score = soup.find_all('td', attrs={"data-stat": "pts_off", "class": "right"})
opp_score = soup.find_all('td', attrs={"data-stat": "pts_def", "class": "right"})

try:
    print('Week Number: ' + week_num[week].text.strip(),
            'Total Off: ' + total_off[week].text.strip(),
            'Total Def: ' + total_def[week].text.strip(),
            'Passing Yards Off: ' + pass_yards_off[week].text.strip(),
            'Passing Yards Def: ' + pass_yards_def[week].text.strip(),
            'Rushing Yards Off: ' + rush_yards_off[week].text.strip(),
            'Rushing Yards Def: ' + rush_yards_def[week].text.strip(), '\n')
except Exception as e:
    print(e)

Output for crd and 2 (the lists are zero-indexed, so index 2 returns the third row, which is week 3):

Week Number: 3 Total Off: 248 Total Def: 413 Passing Yards Off: 127 Passing Yards Def: 240 Rushing Yards Off: 121 Rushing Yards Def: 173
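
For anyone who wants to check the comparison offline, here is a minimal sketch that also looks the row up by the text of its week cell instead of by list position. The HTML is a made-up, simplified stand-in for the real schedule table (the numbers are invented), and stats_for_week is a hypothetical helper name, not something from the question.

```python
from bs4 import BeautifulSoup

# Made-up, simplified stand-in for the schedule table; all values are invented.
SAMPLE_HTML = """
<table id="games"><tbody>
<tr><th data-stat="week_num" class="right" scope="row">1</th>
    <td data-stat="yards_off" class="right">300</td>
    <td data-stat="yards_def" class="right">250</td></tr>
<tr><th data-stat="week_num" class="right" scope="row">2</th>
    <td data-stat="yards_off" class="right">310</td>
    <td data-stat="yards_def" class="right">280</td></tr>
<tr><th data-stat="week_num" class="right" scope="row">3</th>
    <td data-stat="yards_off" class="right">290</td>
    <td data-stat="yards_def" class="right">305</td></tr>
</tbody></table>
"""


def stats_for_week(soup, week):
    """Return {data-stat: text} for the row whose week cell equals `week`, else None."""
    for header in soup.find_all('th', attrs={'data-stat': 'week_num', 'scope': 'row'}):
        # Compare text with text: the cell holds a string, `week` may be an int.
        if header.get_text(strip=True) == str(week):
            row = header.find_parent('tr')
            return {td['data-stat']: td.get_text(strip=True)
                    for td in row.find_all('td') if td.has_attr('data-stat')}
    return None


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
week_cells = soup.find_all('th', attrs={'data-stat': 'week_num'})

# An int is never equal to a Tag, so this membership test is always False.
print(2 in week_cells)

# Matching on the cell text finds the row regardless of its position.
print(stats_for_week(soup, 2))
```

Because the lookup goes through the row that contains the matching week cell, it should not depend on every column list having the same length or order. How the real page lays out rows without a game is not covered by this sample.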
0 votes
/ August 5, 2020

We could actually build the list of teams dynamically from the table. You can also use pandas to read the table and then filter by week number instead of iterating.

* Note: you will need to pip install choice

import pandas as pd
import requests
from bs4 import BeautifulSoup
import choice

url= 'https://www.pro-football-reference.com/teams/'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
teams = soup.find_all('th')

# Get the links to the teams in the table
teams_dict = {}
for each in teams:
    if each.find('a'):
        teams_dict[each.text] = each.find('a')['href']

   
team_choice = choice.Menu(teams_dict.keys()).ask()
week = input('What week are you looking for? ')

url = 'https://www.pro-football-reference.com{team_url}2019.htm'.format(team_url=teams_dict[team_choice])
df = pd.read_html(url,attrs={'id':'games'})[0]

new_col_names = [col[-1] if 'Unnamed' in col[0] else '_'.join(col) for col in df.columns]

# for loop equivalent to the list comprehension above
#new_col_names = []
#for col in df.columns:
#    if 'Unnamed' in col[0]:
#        new_col_names.append(col[-1])
#    else:
#        new_col_names.append('_'.join(col))

# List comprehension equivalent to the loop above
#new_col_names = [col[-1] if 'Unnamed' in col[0] else '_'.join(col) for col in df.columns]

df.columns = new_col_names
df['Week'] = df['Week'].astype(str)
week_stats = df[df['Week']==week]

cols = ['Week','Offense_TotYd','Defense_TotYd','Offense_PassY','Defense_PassY','Offense_RushY','Defense_RushY']
print (week_stats[cols].to_string())

Output for NE, week 6:

  Week  Offense_TotYd  Defense_TotYd  Offense_PassY  Defense_PassY  Offense_RushY  Defense_RushY
5    6          427.0          213.0          313.0          161.0          114.0           52.0
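
To try the column flattening and the week filter without hitting the site, here is a small sketch on a hand-built DataFrame. The two-level header only loosely imitates what pd.read_html returns for a table with grouped headings, the numbers are invented, and flatten_columns and week_rows are hypothetical helper names.

```python
import pandas as pd

# Invented data with a two-level header; the 'Unnamed: ...' label mimics the
# placeholder pandas uses when the top header cell is empty.
columns = pd.MultiIndex.from_tuples([
    ('Unnamed: 0_level_0', 'Week'),
    ('Offense', 'TotYd'),
    ('Offense', 'PassY'),
    ('Defense', 'TotYd'),
])
df = pd.DataFrame(
    [[1, 300, 200, 250],
     [2, 310, 220, 280],
     [3, 290, 180, 305]],
    columns=columns,
)


def flatten_columns(frame):
    """Keep the last level for unnamed groups, otherwise join levels with '_'."""
    return [col[-1] if 'Unnamed' in col[0] else '_'.join(col)
            for col in frame.columns]


def week_rows(frame, week):
    """Return the rows whose Week value matches `week`, compared as strings."""
    return frame[frame['Week'] == str(week).strip()]


df.columns = flatten_columns(df)
# input() returns a string, so compare the Week column as strings too.
df['Week'] = df['Week'].astype(str)

print(week_rows(df, 2).to_string(index=False))
```

Comparing as strings keeps the filter working with whatever input() returns, and it avoids a conversion error if the Week column ever contains an entry that is not a number.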
...