Python: reading and analyzing CSV files - PullRequest
0 votes
/ December 6, 2018

I have a CSV file with student names and their averages in 8 subjects. I need to work out which students made the honour roll (an average of 80 or higher) and which students won a subject award (the highest mark in each subject). I have done the honour roll part and it works, but I can't get the subject award part to work. How would I get it working? I can't figure it out!

Here is my code:

import csv

with open('C:/Users/rohan/Desktop/Google Drive/honourCSVreader/honour.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=",")

    # Honour Roll
    print('The honour roll students are:')
    for col in csv_reader:
        if not col[0] or col[1]:
            for row in csv_reader:
                if (int(row[2]) + int(row[3]) + int(row[4]) + int(row[5]) +
                        int(row[6]) + int(row[7]) + int(row[8]) + int(row[9])) / 8 >= 80:
                    print(row[1] + " " + row[0])
    # Subject Awards
    print('The subject award winners are:')
    for col in csv_reader:
        if not col[0] and not col[1]:
            name = []
            maximum_grade = 0
            subject = []
            for col[2:] in csv_reader:
                if col > maximum_grade:
                    subject = row
                    maximum_grade = col
                    name = [col[1], col[0]]
                    print(str(name) + ' - ' + str(subject))

And here is the 'honour' file (the list of students): https://1drv.ms/x/s!AhndVfox8v67iggaLRaK7LTpxBQt

Thanks!

Answers [ 3 ]

0 votes
/ December 6, 2018

I worked out a nicer way to do this, so that the code is clean, modular, and easy to follow.

https://paiza.io/projects/e/35So9iUPfMdIORGzJTb2NQ

First, read the student data as dictionaries.

import csv

with open('data.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=",")
    for line in csv_reader:
        print(line)

Output:

{'History': '39', 'Last': 'Agalawatte', 'Science': '68', 'Gym': '88', 'Music': '84', 'English': '97', 'Art': '89', 'First': 'Matthew', 'Math': '79', 'Geography': '73'}
{'History': '95', 'Last': 'Agorhom', 'Science': '95', 'Gym': '80', 'Music': '93', 'English': '95', 'Art': '72', 'First': 'Devin', 'Math': '60', 'Geography': '80'}
{'History': '84', 'Last': 'Ahn', 'Science': '98', 'Gym': '71', 'Music': '95', 'English': '91', 'Art': '56', 'First': 'Jevon', 'Math': '95', 'Geography': '83'}
{'History': '97', 'Last': 'Ajagu', 'Science': '69', 'Gym': '82', 'Music': '87', 'English': '60', 'Art': '74', 'First': 'Darion', 'Math': '72', 'Geography': '99'}
{'History': '74', 'Last': 'Akahira', 'Science': '90', 'Gym': '71', 'Music': '79', 'English': '94', 'Art': '86', 'First': 'Chandler', 'Math': '89', 'Geography': '77'}

Much nicer to work with, right?
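
One thing to keep in mind with the output above: every grade comes back as a string. Here is a minimal sketch of converting the grades to integers once, while reading. It uses two rows copied from the output above as an in-memory sample. The column order in the header and the `read_students` helper are assumptions for illustration, not part of the original answer.

```python
import csv
import io

# Two rows copied from the output above; the column order is an assumption.
SAMPLE = """Last,First,English,Math,Geography,Science,Gym,History,Art,Music
Agalawatte,Matthew,97,79,73,68,88,39,89,84
Agorhom,Devin,95,60,80,95,80,95,72,93
"""

NAME_COLUMNS = ('Last', 'First')


def read_students(csv_file):
    """Yield one dict per student with the grade columns converted to int."""
    for row in csv.DictReader(csv_file):
        yield {key: value if key in NAME_COLUMNS else int(value)
               for key, value in row.items()}


students = list(read_students(io.StringIO(SAMPLE)))
# Numeric addition now, not string concatenation.
print(students[0]['English'] + students[1]['English'])
```

With the real file, `io.StringIO(SAMPLE)` would be replaced by the object returned from `open(...)`.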

Now think of each row as a student, and then write two functions that test whether the student qualifies for either list.

Decide how you are going to keep track of the results. Here I use some nested dictionaries:

import csv
import json

roles = {}
roles['honor role'] = []
subjects = ['History', 'Science','Gym', 'Music', 'English', 'Art', 'Math', 'Geography']
for subject in subjects:
    roles[subject] = {'highest grade':0, 'students':[]}


def isHonorRole(student):
    ''' Test to see if this student has earned the honor role'''
    return False

def isSubjectAward(subject, student):
    ''' Test to see if this student has earned the subject award'''
    return False

with open('data.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file, delimiter=",")
    for student in csv_reader:

        if isHonorRole(student):
            pass  # add to the honor role

        for subject in subjects:
            if isSubjectAward(subject, student):
                pass  # add to the subject award list

OK, now we need to implement the logic that decides who gets the subject awards.

def isSubjectAward(subject, student):
    ''' Test to see if this student has earned the subject award'''
    grade    = float(student[subject])
    highest  = roles[subject]['highest grade']
    students = roles[subject]['students']

    student = (student['First'], student['Last'])

    # is this grade higher than the current highest?
    if grade > highest:
        # we have a new highest!
        # clear the list
        students = []
        students.append(student)

        # set new highest
        highest = grade
    elif grade == highest:
        # add to list of students
        students.append(student)
    else:
        return

    # There were changes to the list
    roles[subject]['highest grade'] = grade
    roles[subject]['students'] = students

print(json.dumps(roles, sort_keys=True, indent=4))

Now we have the subject award winners:

{
    "Art": {
        "highest grade": 100.0, 
        "students": [
            [
                "Nathan", 
                "Bryson"
            ], 
            [
                "Chase", 
                "Putnam"
            ]
        ]
    }, 
    "English": {
        "highest grade": 99.0, 
        "students": [
            [
                "Josiah", 
                "Gower"
            ]
        ]
    }, 
    "Geography": {
        "highest grade": 100.0, 
        "students": [
            [
                "Ismaila", 
                "LeBlanc"
            ]
        ]
    }, 
    "Gym": {
        "highest grade": 100.0, 
        "students": [
            [
                "Woo Taek (James)", 
                "Irvine"
            ]
        ]
    }, 
    "History": {
        "highest grade": 100.0, 
        "students": [
            [
                "Tami", 
                "Easterbrook"
            ]
        ]
    }, 
    "Math": {
        "highest grade": 99.0, 
        "students": [
            [
                "Carson", 
                "Whicher"
            ]
        ]
    }, 
    "Music": {
        "highest grade": 100.0, 
        "students": [
            [
                "Jamie", 
                "Bates"
            ], 
            [
                "Michael", 
                "Giroux"
            ]
        ]
    }, 
    "Science": {
        "highest grade": 100.0, 
        "students": [
            [
                "Jonathan", 
                "Emes"
            ], 
            [
                "Jack", 
                "Hudspeth"
            ]
        ]
    }, 
    "honor role": []
}
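
The tie handling above can be tried without the CSV file. This is a minimal sketch that keeps every student who shares the top grade. The `update_award` helper and the sample grades are made up for illustration; neither comes from the original answer.

```python
def update_award(award, name, grade):
    """Update one subject's award record in place.

    `award` is shaped like the entries above:
    {'highest grade': <number>, 'students': [<names>]}.
    """
    if grade > award['highest grade']:
        # New top grade: the previous holders are replaced.
        award['highest grade'] = grade
        award['students'] = [name]
    elif grade == award['highest grade']:
        # Tie: keep everyone who shares the top grade.
        award['students'].append(name)


# Made-up grades, for illustration only.
art_grades = [('Ann', 90), ('Bob', 100), ('Cy', 85), ('Dee', 100)]

art = {'highest grade': 0, 'students': []}
for name, grade in art_grades:
    update_award(art, name, grade)

print(art)
```

Both students with 100 end up in the list, which mirrors the two-name entries in the output above.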

Finding the honor roll students should be trivial, especially if we have a few helper functions:

def getOverallAverage(student):
    ''' Returns the average of all the student's subject grades '''
    total = sum([float(student[subject]) for subject in subjects])
    return total/len(subjects)

def getName(student):
    '''Returns the student's first and last name as one string'''
    return ' '.join((student['First'], student['Last']))

def isHonorRole(student):
    ''' Test to see if this student has earned the honor role'''
    cutoff = 80

    if getOverallAverage(student) >= cutoff:
        roles['honor role'].append(getName(student))

    return False

Honor roll:

"honor role": [
        "Devin Agorhom", 
        "Jevon Ahn", 
        "Darion Ajagu", 
        "Chandler Akahira", 
        "Stas Al-Turki", 
        "Bryce Allison", 
        "Tucker Allison", 
        "Eric Andrews", 
        "Henry Angeletti", 
        "Harry Apps", 
        "Jesse Arnold", 
        "Benjamin Aucoin", 
        "Matthew Bainbridge", 
        "Geordie Ball", 
        "Sean Barbe", 
        "Dwayne Barida", 
        "Jamie Bates", 
        "Bradley Baverstock", 
        "Adam Beckman", 
        "Michael Becq", 
        "Joshua Berezny", 
        "Aaron Best", 
        "Doug Bolsonello", 
        "Richard Bolton", 
        "Trevor Bolton", 
        "Travis Bonellos", 
        "Daniel Boulet", 
        "Nicholas Bowman", 
        "Connor Brent", 
        "Michael Britnell", 
        "Shu Brooks", 
        "Cody Brown", 
        "Dylan Brown", 
        "Mark Brown", 
        "Xinkai (Kevin) Brown", 
        "Daniel Bryce", 
        "Nathan Bryson", 
        "Greg Bull", 
        "Eric Burnham", 
        "Kevin Burns", 
        "Rhys Caldwell", 
        "Evan Campbell", 
        "Jeremiah Carroll", 
        "Ian Cass", 
        "Robert Cassidy", 
        "Matt Catleugh", 
        "Garin Chalmers", 
        "Matthew Chan", 
        "Ryan Cheeseman", 
        "Jack Chen", 
        "Phillipe Chester", 
        "Cameron Choi", 
        "Jason Clare", 
        "Brandon Clarke", 
        "Justin Clarke", 
        "Reid Clarke", 
        "Brendan Cleland", 
        "Andrew Clemens", 
        "Matthew Clemens", 
        "Pete Conly", 
        "Marc Coombs", 
        "Leif Coughlin", 
        "Michael Cox", 
        "Michael Creighton", 
        "Raymond Croke", 
        "Andrew Cummins", 
        "William Cupillari", 
        "James Davidson", 
        "Maxim Davis", 
        "Peter Davis", 
        "Daniel Dearham", 
        "Michael Deaville", 
        "Andrew Decker", 
        "Alex Del Peral", 
        "Kobe Dick", 
        "Alec Dion", 
        "Gaelan Domej", 
        "Harrison Dudas", 
        "Ted Duncan", 
        "Andrew Dunkin", 
        "Micah Dupuy", 
        "Cameron Dziedzic", 
        "Tami Easterbrook", 
        "Ethan Ellis", 
        "Jonathan Emes", 
        "Kevin Ernst", 
        "Taylor Evans", 
        "Jack Everett", 
        "Andrew Fabbri", 
        "Les Fawns", 
        "Cameron Faya", 
        "Patrick Feaver", 
        "Josh Ferrando", 
        "Aidan Flett", 
        "Tommy Flowers", 
        "Gregory Friberg", 
        "Craig Friesen", 
        "Keegan Friesen", 
        "Ryan Fullerton", 
        "Jason Gainer", 
        "Adam Gall", 
        "Ryan Gallant", 
        "Michael Gasparotto", 
        "Scott Gerald", 
        "Michael Giroux", 
        "Ramanand Gleeson", 
        "Jack Goldblatt", 
        "Daniel Gonzalez-Stewart", 
        "Christopher Got", 
        "Josiah Gower", 
        "Zachary Grannum", 
        "Stuart Gray", 
        "Gonzalo Grift-White", 
        "Aris Grosvenor", 
        "Eric Hager", 
        "I\u00c3\u00b1igo Hamel", 
        "Davin Hamilton", 
        "Matthew Hanafy", 
        "Christopher Harpur", 
        "Tomas Hart", 
        "Gage Haslam", 
        "Ross Hayward", 
        "Sean Heath", 
        "Ryan Hess", 
        "Matthew Hessey", 
        "Stephen Hewis", 
        "Michael Hill", 
        "Edward Holbrook", 
        "Gavin Holenski", 
        "Brendan Holmes", 
        "Gregory Houston", 
        "Douglas Howarth", 
        "Conor Hoyle", 
        "Agustin Huang", 
        "Jack Hudspeth", 
        "James Humfries", 
        "David Hunchak", 
        "Jesse Im", 
        "Steve Inglis", 
        "Woo Taek (James) Irvine", 
        "Kenny James", 
        "Eric Jang", 
        "Erik Jeong", 
        "Michael Jervis", 
        "Brett Johnson", 
        "Adam Johnston", 
        "Ben Johnstone", 
        "Taylor Jones", 
        "Braedon Journeay", 
        "Neil Karakatsanis", 
        "David Karrys", 
        "Ryan Keane", 
        "Josh Kear", 
        "Alexander Kee", 
        "Joshua Khan", 
        "Matthew Kim", 
        "David Kimbell Boddy", 
        "Daniel King", 
        "Tristan Knappett", 
        "Timothy Koornneef", 
        "Michael Krikorian", 
        "George Kronberg", 
        "Danny Kwiatkowski", 
        "Chris Lackey", 
        "Spenser LaMarre", 
        "Matthew Lampi", 
        "Craig Landerville", 
        "Dallas Lane", 
        "Matthew Lanselle", 
        "Allen Lapko", 
        "Cory Latimer", 
        "Ben Lawrence", 
        "Matthew Lebel", 
        "Ismaila LeBlanc", 
        "Christopher Lee", 
        "Bailey Legiehn", 
        "Andy Lennox", 
        "Samuel Leonard", 
        "Sam Lockner", 
        "Jeffrey MacPherson", 
        "Simon Mahoney", 
        "Lucas Maier", 
        "Trent Manley", 
        "Jeremy Manoukas", 
        "Nathanial Marsh", 
        "Alastair Marshall", 
        "Connor Mattucci", 
        "Samuel McCormick", 
        "Cameron McCuaig", 
        "Ronan Mcewan", 
        "John McGuire", 
        "Brian McNaughton", 
        "Christopher McPherson", 
        "Alistair McRae", 
        "Andrew Medlock", 
        "Trevor Meipoom", 
        "Justin Metcalfe", 
        "Chieh (Jack) Miller", 
        "Graham Miller", 
        "Josh Miller", 
        "Salvador Miller", 
        "Max Missiuna", 
        "Jack Mitchell", 
        "Michael Morris", 
        "Paul Morrison", 
        "Morgan Moszczynski", 
        "Curtis Muir", 
        "Christopher Murphy", 
        "Mark Murphy", 
        "Hiroki Nakajima", 
        "Michael Neary", 
        "James Nelson", 
        "John Nicholson", 
        "Stephen Nishida", 
        "Michael Nowlan", 
        "Jason O'Brien", 
        "Manny O'Brien", 
        "James O'Donnell", 
        "Spencer Olubala Paynter", 
        "Daniel Ortiz", 
        "Jihwan Ottenhof", 
        "Joel Ottenhof", 
        "Roger Owen", 
        "Jason Ozark", 
        "Brent Pardhan", 
        "Bernard Park", 
        "Jason Parker", 
        "Alistair Pasechnyk", 
        "James Patrick", 
        "Hunter Pellow", 
        "Jason Pennings", 
        "Brant Perras", 
        "Michael Petersen", 
        "Jordan Petrov", 
        "Don Philp", 
        "Adam Piil", 
        "Ryan Pirhonen", 
        "Alex Pollard", 
        "Daniel Postlethwaite", 
        "John-Michael Potter", 
        "Tim Powell", 
        "Chad Power", 
        "Jack Pratt", 
        "Alexander Price", 
        "Tyler Purdie", 
        "Andrew Purvis", 
        "Colin Purvis", 
        "Chase Putnam", 
        "Kael Radonicich", 
        "Curtis Ravensdale", 
        "Brett Ray", 
        "Forrest Reid", 
        "Aiden Ren", 
        "Tyler Rennicks", 
        "Alden Revell", 
        "Joshua Robinson", 
        "Richard Roffey", 
        "Michael Rose", 
        "Nicholas Roy", 
        "Christopher Samuel", 
        "Chris Sandilands", 
        "Christopher Sarbutt", 
        "David Saun", 
        "David Scharman", 
        "Adam Schoenmaker", 
        "Derek Schultz", 
        "Rocky Scuralli", 
        "Turner Seale", 
        "Bryan Senn", 
        "Alexander Serena", 
        "Seth Shaubel", 
        "Alex Shaw", 
        "Denroy Shaw", 
        "William Sibbald", 
        "Curtis Simao", 
        "Greg Simm", 
        "Nicholas Simon", 
        "Stuart Simons", 
        "Michael Skarsten", 
        "Matthew Skorbinski", 
        "Greg Slogan", 
        "Lucas Smith", 
        "Andrew South", 
        "Benjamin Sprowl", 
        "Jackson Staley", 
        "Reid Stencill-Hohn", 
        "Matthew Stevens", 
        "Jason Sula", 
        "Edward Sunderland", 
        "James Suppa", 
        "Jason Talbot", 
        "Tony Tan", 
        "Stuart Tang", 
        "Alex Temple", 
        "Leonard Theaker", 
        "Parker Thomas", 
        "Matthew Tisi", 
        "Scott Toda", 
        "Michael Toth", 
        "Zachary Trotter", 
        "Matthew Underwood", 
        "David Ure", 
        "Michael Utts", 
        "Joey Van Dyk", 
        "Jonathan Van Gaal", 
        "Chris Vandervies", 
        "Ryan Vickery", 
        "Dustin Wain", 
        "Brian Walker", 
        "Young-Jun Walsh", 
        "Brad Walton", 
        "Zachary Waugh", 
        "Matthew Webster", 
        "Samuel Welsh", 
        "Coleman West", 
        "Alexander Westendorp", 
        "Carson Whicher", 
        "David Whitney", 
        "Samuel Wilkinson", 
        "Kevin Williams", 
        "Aedan Williamson", 
        "Jason Wilson", 
        "William Wilson", 
        "David Wilton", 
        "Isaac Windeler", 
        "Liam Winter", 
        "Timothy Wong", 
        "Vladimir Wong", 
        "Robert Workman", 
        "Brian Yang", 
        "Owen Yates", 
        "Devin Young", 
        "Paul Young", 
        "Joshua Zhao"
    ]
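
The honor roll check comes down to an average and a cutoff. The sketch below is a variation rather than the answer's own code: the predicate only returns True or False and the caller builds the list. It assumes the two students copied from the DictReader output earlier in this answer, and the names `overall_average` and `is_honor_roll` are hypothetical.

```python
SUBJECTS = ['History', 'Science', 'Gym', 'Music',
            'English', 'Art', 'Math', 'Geography']
CUTOFF = 80  # "80 or higher", per the question

# Two students copied from the DictReader output earlier in this answer.
students = [
    {'First': 'Matthew', 'Last': 'Agalawatte', 'History': '39',
     'Science': '68', 'Gym': '88', 'Music': '84', 'English': '97',
     'Art': '89', 'Math': '79', 'Geography': '73'},
    {'First': 'Devin', 'Last': 'Agorhom', 'History': '95',
     'Science': '95', 'Gym': '80', 'Music': '93', 'English': '95',
     'Art': '72', 'Math': '60', 'Geography': '80'},
]


def overall_average(student):
    """Average of all subject grades (the grades are stored as strings)."""
    return sum(float(student[s]) for s in SUBJECTS) / len(SUBJECTS)


def is_honor_roll(student):
    """True when the student's average reaches the cutoff."""
    return overall_average(student) >= CUTOFF


honor_roll = [' '.join((s['First'], s['Last']))
              for s in students if is_honor_roll(s)]
print(honor_roll)
```

Keeping the test free of side effects makes it easy to reuse, for example in a list comprehension as shown.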

DONE

0 votes
/ December 6, 2018

My two cents:

Do both calculations in a single loop. Although using max and lambda looks pretty cool and readable, and it is still O(n), it will also be 9 times slower than the following implementation, which uses one loop for both calculations (Honour Roll and Subject Awards):

#!/usr/bin/env python
import csv

with open('/Users/edil3508/Downloads/honours.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=",")
    next(csv_reader, None)  # skip the headers

    subjects = ['English', 'Math', 'Geography', 'Science', 'Gym', 'History', 'Art', 'Music']
    award_winners = [['', 0], ['', 0], ['', 0], ['', 0], ['', 0], ['', 0], ['', 0], ['', 0]]
    # Honour Roll
    print('The honour roll students are:')
    print("-" * 80)
    for row in csv_reader:
        subtotal = 0
        for i in range(2, 8 + 2):
            subtotal += int(row[i])
            if int(row[i]) > award_winners[i-2][1]:
                award_winners[i - 2][0] = row[1] + " " + row[0]
                award_winners[i - 2][1] = int(row[i])
        avg = subtotal / 8
        if avg >= 80:  # the question asks for 80 or higher
            print(row[1] + " " + row[0], avg)
    # Subject Awards
    print("-" * 80)
    print('The subject award winners are:')
    print("-" * 80)
    for ix, student_grade in enumerate(award_winners):
        print('{}: {} with {}'.format(subjects[ix], student_grade[0], student_grade[1]))

Output:

The honour roll students are:
----------------------------------------------------------------------
Devin Agorhom 83.75 
Jevon Ahn 84.125 
Chandler Akahira 82.5
Stas Al-Turki 84.25
...

-----------------------------------------------------------------------
The subject award winners are:
-----------------------------------------------------------------------
English: Josiah Gower with 99
Math: Carson Whicher with 99
Geography: Ismaila LeBlanc with 100
Science: Jonathan Emes with 100
Gym: Woo Taek (James) Irvine with 100
History: Tami Easterbrook with 100
Art: Nathan Bryson with 100
Music: Jamie Bates with 100
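
To check the speed claim on your own machine, the two approaches can be placed side by side. This is a sketch under assumptions: the three rows are taken from the first answer's output and rearranged into the column layout the code above expects, and the function names are hypothetical. It first confirms that both approaches pick the same winners, then times them with `timeit`. No particular ratio is asserted here.

```python
import timeit

# Layout assumed by the code above: Last, First, then 8 grades.
rows = [
    ['Agalawatte', 'Matthew', '97', '79', '73', '68', '88', '39', '89', '84'],
    ['Agorhom', 'Devin', '95', '60', '80', '95', '80', '95', '72', '93'],
    ['Ahn', 'Jevon', '91', '95', '83', '98', '71', '84', '56', '95'],
]


def winners_single_pass(rows):
    """One loop over the rows, tracking the best grade per column."""
    best = [('', -1)] * 8
    for row in rows:
        for i in range(8):
            grade = int(row[i + 2])
            if grade > best[i][1]:
                best[i] = (row[1] + ' ' + row[0], grade)
    return best


def winners_max(rows):
    """One max() call per subject column."""
    result = []
    for i in range(8):
        top = max(rows, key=lambda row: int(row[i + 2]))
        result.append((top[1] + ' ' + top[0], int(top[i + 2])))
    return result


# Both approaches keep the first student who reaches the top grade.
assert winners_single_pass(rows) == winners_max(rows)

big_rows = rows * 200  # a larger input so the timing is measurable
print('single pass:', timeit.timeit(lambda: winners_single_pass(big_rows), number=5))
print('max per col:', timeit.timeit(lambda: winners_max(big_rows), number=5))
```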

0 votes
/ December 6, 2018

[EDIT] In collaboration with @edilio I made a more efficient version that keeps track of ties. There are a lot of them, so this is a fairly important distinction. The code is long, so I am posting it in a gist.

https://gist.github.com/SamyBencherif/fde7c3bca702545dd22739dd8caf796a


There is no need for any for loops. In fact, the syntax in your second for loop was completely broken.

import csv

with open('C:/Users/rohan/Desktop/Google Drive/honourCSVreader/honour.csv') as csv_file:
    # Materialize the rows so they can be scanned more than once; [1:] skips the header
    csv_list = list(csv.reader(csv_file, delimiter=","))[1:]

    # Subject Awards
    print('The subject award winners are:')
    # Grades are read as strings, so convert to int before comparing
    print('English', max(csv_list, key=lambda row: int(row[2])))
    print('Math', max(csv_list, key=lambda row: int(row[3])))
    print('Geography', max(csv_list, key=lambda row: int(row[4])))

and so on.
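
When max is applied to rows from csv.reader, the key has to convert the grade to a number, because strings compare character by character and '99' sorts above '100'. The short sketch below uses made-up rows to show the difference. The `top_students` helper is hypothetical and is included to show how ties can be reported as well.

```python
# Made-up rows: Last, First, English grade.
rows = [
    ['Able', 'Ann', '99'],
    ['Baker', 'Bob', '100'],
    ['Clark', 'Cy', '100'],
]

# Comparing the raw strings picks the wrong row ...
by_string = max(rows, key=lambda row: row[2])
# ... converting to int first gives the numeric maximum.
by_number = max(rows, key=lambda row: int(row[2]))


def top_students(rows, column):
    """Return (best grade, every row that reached it)."""
    best = max(int(row[column]) for row in rows)
    return best, [row for row in rows if int(row[column]) == best]


print(by_string[1], by_number[1])
print(top_students(rows, 2))
```

Note that max on its own returns only the first row with the top grade, so a second pass like the one in `top_students` is needed when ties matter.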
