Question

I am using try/except to catch problems while reading a file line by line. The try block contains a series of manipulations, the last of which is usually the cause of the exception. Surprisingly, I noticed that all the preceding manipulations in the try block are executed even when an exception is raised. This is a problem when I try to turn the dictionary I built into a DataFrame, because the lists end up with unequal lengths.

This code creates the problem:

d = {'dates':[],'states':[], 'longitude':[], 'latitude':[], 'tweet_ids':[], 'user_ids':[], 'source':[]}
for file in f:
    print("Processing file "+file)
    t1 = file.split('/')[-1].split("_")
    date = t1[0]
    state_code = t1[1]
    state = list(states_ref.loc[states_ref.code==state_code]['abbr'])[0]

    collection = JsonCollection(file)
    counter = 0
    for tweet in collection.get_iterator():
        counter += 1
        try:

            d['dates'].append(date)
            d['states'].append(state)
            t2 = tweet_parser.get_entity_field('geo', tweet)
            if t2 == None:
                d['longitude'].append(t2)
                d['latitude'].append(t2)
            else:
                d['longitude'].append(t2['coordinates'][1])
                d['latitude'].append(t2['coordinates'][0])


            # note: the 3 lines below are the ones that can raise an exception
            temp = tweet_parser.get_entity_field('source', tweet)
            t5 =  re.findall(r'>(.*?)<', temp)[0]
            d['source'].append(t5)

        except:
            c += 1
            print("Tweet {} in file {} had a problem and got skipped".format(counter, file))
            print("This is a total  of {} tweets I am missing from the {} archive I process.".format(c, sys.argv[1]))
            next

tab = pd.DataFrame.from_dict(d)

I fixed the problem by moving the error-prone manipulations to the top, but I would like to better understand why try/except behaves this way. Any ideas?

This code works:

d = {'dates':[],'states':[], 'longitude':[], 'latitude':[], 'tweet_ids':[], 'user_ids':[], 'source':[]}
for file in f:
    print("Processing file "+file)
    t1 = file.split('/')[-1].split("_")
    date = t1[0]
    state_code = t1[1]
    state = list(states_ref.loc[states_ref.code==state_code]['abbr'])[0]

    collection = JsonCollection(file)
    counter = 0
    for tweet in collection.get_iterator():
        counter += 1
        try:
            # note: the 3 lines below are the ones that can raise an exception
            temp = tweet_parser.get_entity_field('source', tweet)
            t5 =  re.findall(r'>(.*?)<', temp)[0]
            d['source'].append(t5)

            d['dates'].append(date)
            d['states'].append(state)
            t2 = tweet_parser.get_entity_field('geo', tweet)
            if t2 == None:
                d['longitude'].append(t2)
                d['latitude'].append(t2)
            else:
                d['longitude'].append(t2['coordinates'][1])
                d['latitude'].append(t2['coordinates'][0])
        except:
            c += 1
            print("Tweet {} in file {} had a problem and got skipped".format(counter, file))
            print("This is a total  of {} tweets I am missing from the {} archive I process.".format(c, sys.argv[1]))
            next

tab = pd.DataFrame.from_dict(d)
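
The behavior in the first snippet can be reproduced in isolation. Below is a minimal sketch, assuming a hypothetical helper extract_source that stands in for the tweet_parser and re.findall step (it is not part of the original code). It suggests that a try block is not a transaction: statements that ran before the exception keep their effects, so the earlier appends stay in the lists.

```python
import re


def extract_source(html):
    # Hypothetical stand-in for the source-parsing step.
    # Raises IndexError when the string has no ">...<" section.
    return re.findall(r'>(.*?)<', html)[0]


d = {'dates': [], 'source': []}

for raw in ['<a href="x">Twitter Web</a>', 'no markup here']:
    try:
        d['dates'].append('2019-02-23')          # runs before the failing line
        d['source'].append(extract_source(raw))  # fails for the second item
    except IndexError:
        # The append to d['dates'] has already happened and is not undone.
        pass

print(len(d['dates']), len(d['source']))
```

The two lists end up with different lengths, which is the same mismatch that breaks the DataFrame construction.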

Daniel R. Vasquez Montes · Answer 1 · February 23, 2019

You can always use a temporary object to hold the output of your functions before appending to the target object. That way, if something fails, the exception is raised before any data is sent to the target object.

try:
    # Put all data into a temporary dictionary
    # Can raise an exception here
    temp = tweet_parser.get_entity_field('source', tweet)
    t2 = tweet_parser.get_entity_field('geo', tweet)
    tempDictionary = {
        "source"    : re.findall(r'>(.*?)<', temp)[0],
        # Same index order as in the question: [0] is latitude, [1] is longitude
        "latitude"  : None if (t2 is None) else t2['coordinates'][0],
        "longitude" : None if (t2 is None) else t2['coordinates'][1]
    }
    # Append data from the temporary dictionary
    d['source'].append(tempDictionary['source'])
    d['latitude'].append(tempDictionary['latitude'])
    d['longitude'].append(tempDictionary['longitude'])
    d['dates'].append(date)
    d['states'].append(state)
except:
    c += 1
    print("Tweet {} in file {} had a problem and got skipped".format(counter, file))
    print("This is a total  of {} tweets I am missing from the {} archive I process.".format(c, sys.argv[1]))
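
The same idea can be tried without the Twitter data. This is only a sketch under assumptions: build_row and the samples list are hypothetical names invented for illustration, and the geo dictionaries mimic the shape used in the question. Everything that can fail runs inside build_row, and the lists are touched only after it returns.

```python
import re


def build_row(date, state, source_html, geo):
    # Hypothetical helper: every step that can raise happens here,
    # before anything is stored in the target dictionary.
    source = re.findall(r'>(.*?)<', source_html)[0]
    latitude = None if geo is None else geo['coordinates'][0]
    longitude = None if geo is None else geo['coordinates'][1]
    return {'dates': date, 'states': state, 'latitude': latitude,
            'longitude': longitude, 'source': source}


d = {key: [] for key in ('dates', 'states', 'latitude', 'longitude', 'source')}
skipped = 0

# Made-up sample inputs; the second one has no ">...<" section and fails.
samples = [
    ('<a href="x">Twitter Web</a>', {'coordinates': [40.7, -74.0]}),
    ('no markup here', None),
    ('<a>iPhone</a>', None),
]

for source_html, geo in samples:
    try:
        row = build_row('2019-02-23', 'NY', source_html, geo)
    except (IndexError, KeyError, TypeError):
        skipped += 1
        continue  # nothing was appended for this item
    # Commit step: plain appends that are not expected to raise.
    for key, value in row.items():
        d[key].append(value)

print(skipped, [len(values) for values in d.values()])
```

Catching specific exception types instead of a bare except is a choice made for this sketch, so that unrelated errors such as a typo in a variable name are not silently counted as skipped tweets.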

Try block gives output even if the exception is raised by the last command (but not the first)
