Optimizing a nested for loop that goes through every value in a dataset to perform a series of calculations - PullRequest
0 votes
/ July 9, 2019

I am developing some features that need to go through every date in a dataset, find all the speed restrictions that were in place on that date, and then build a table with their profiles. It takes forever to run, and I'm sure there is a better way to do this.

I tried wrapping this in a function, but I don't think that would make much difference, because the function would still need the for loop:


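    import pandas as pd

    # speed_restr, unique_dates, unique_lines, unique_kms and log_progress are defined elsewhere
    output = []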

    for unique_date in log_progress(unique_dates):
        # filter speed_restr by dates
        temp_speed_restr = speed_restr[speed_restr['source_update_datetime'].dt.date < pd.to_datetime(unique_date).date()]

        # get max version of speed_restrs
        max_speed_restr_ver = temp_speed_restr[['tpps_tc_id', 'tpps_tc_version']].groupby('tpps_tc_id').max().reset_index()
        latest_speed_restr = max_speed_restr_ver.merge(temp_speed_restr, how = 'left', on = ['tpps_tc_id', 'tpps_tc_version'])

        # get not lifted speed_restrs
        latest_speed_restr = latest_speed_restr[latest_speed_restr['is_lifted'] == 0]

        for unique_line in unique_lines:
            line_speed_restr = latest_speed_restr[latest_speed_restr['start_track_type'] == unique_line]

            for unique_km in log_progress(unique_kms):
                # 2 km prior
                speed_restr_2km_prior = line_speed_restr[((line_speed_restr['start_kilometre_track'] >= unique_km - 2) & (line_speed_restr['start_kilometre_track'] < unique_km))
                          | ((line_speed_restr['end_kilometre_track'] >= unique_km - 2) & (line_speed_restr['end_kilometre_track'] < unique_km))
                          | ((line_speed_restr['start_kilometre_track'] <= unique_km - 2) & (line_speed_restr['end_kilometre_track'] > unique_km))]

                speed_restr_2km_prior_count = speed_restr_2km_prior.shape[0]
                speed_restr_2km_prior_mean = speed_restr_2km_prior['speed'].mean()
                speed_restr_2km_prior_max = speed_restr_2km_prior['speed'].max()
                speed_restr_2km_prior_min = speed_restr_2km_prior['speed'].min()

                # 5 km prior
                speed_restr_5km_prior = line_speed_restr[((line_speed_restr['start_kilometre_track'] >= unique_km - 5) & (line_speed_restr['start_kilometre_track'] < unique_km - 2))
                          | ((line_speed_restr['end_kilometre_track'] >= unique_km - 5) & (line_speed_restr['end_kilometre_track'] < unique_km - 2))
                          | ((line_speed_restr['start_kilometre_track'] <= unique_km - 5) & (line_speed_restr['end_kilometre_track'] > unique_km - 2))]

                speed_restr_5km_prior_count = speed_restr_5km_prior.shape[0]
                speed_restr_5km_prior_mean = speed_restr_5km_prior['speed'].mean()
                speed_restr_5km_prior_max = speed_restr_5km_prior['speed'].max()
                speed_restr_5km_prior_min = speed_restr_5km_prior['speed'].min()

                # 2 km post
                speed_restr_2km_post = line_speed_restr[((line_speed_restr['start_kilometre_track'] >= unique_km) & (line_speed_restr['start_kilometre_track'] < unique_km + 2))
                          | ((line_speed_restr['end_kilometre_track'] >= unique_km) & (line_speed_restr['end_kilometre_track'] < unique_km + 2))
                          | ((line_speed_restr['start_kilometre_track'] <= unique_km) & (line_speed_restr['end_kilometre_track'] > unique_km + 2))]

                speed_restr_2km_post_count = speed_restr_2km_post.shape[0]
                speed_restr_2km_post_mean = speed_restr_2km_post['speed'].mean()
                speed_restr_2km_post_max = speed_restr_2km_post['speed'].max()
                speed_restr_2km_post_min = speed_restr_2km_post['speed'].min()

                # 5 km post
                speed_restr_5km_post = line_speed_restr[((line_speed_restr['start_kilometre_track'] >= unique_km + 2) & (line_speed_restr['start_kilometre_track'] < unique_km + 5))
                          | ((line_speed_restr['end_kilometre_track'] >= unique_km + 2) & (line_speed_restr['end_kilometre_track'] < unique_km + 5))
                          | ((line_speed_restr['start_kilometre_track'] <= unique_km + 2) & (line_speed_restr['end_kilometre_track'] > unique_km + 5))]

                speed_restr_5km_post_count = speed_restr_5km_post.shape[0]
                speed_restr_5km_post_mean = speed_restr_5km_post['speed'].mean()
                speed_restr_5km_post_max = speed_restr_5km_post['speed'].max()
                speed_restr_5km_post_min = speed_restr_5km_post['speed'].min()

                # populate data
                output.append([unique_date, unique_line, unique_km, 
                               speed_restr_2km_prior_count, speed_restr_2km_prior_mean, speed_restr_2km_prior_max, speed_restr_2km_prior_min,
                              speed_restr_5km_prior_count, speed_restr_5km_prior_mean, speed_restr_5km_prior_max, speed_restr_5km_prior_min,
                              speed_restr_2km_post_count, speed_restr_2km_post_mean, speed_restr_2km_post_max, speed_restr_2km_post_min,
                              speed_restr_5km_post_count, speed_restr_5km_post_mean, speed_restr_5km_post_max, speed_restr_5km_post_min])
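
One possible direction, offered only as a sketch and not as a tested replacement for the code above: the innermost `unique_km` loop applies the same three overlap conditions to four different windows, so it could be replaced by a single broadcast comparison per window. The helper name `window_stats` and its parameters `lo_offset` and `hi_offset` are hypothetical. The sketch assumes that `speed` has no missing values and that a boolean array of shape (number of kilometres × number of restrictions) fits in memory.

```python
import numpy as np
import pandas as pd


def window_stats(line_speed_restr, kms, lo_offset, hi_offset):
    """Stats of 'speed' for restrictions touching [km + lo_offset, km + hi_offset).

    Hypothetical helper: evaluates every km at once instead of looping over them.
    """
    # Row vectors: one column per restriction
    start = line_speed_restr['start_kilometre_track'].to_numpy(dtype=float)[None, :]
    end = line_speed_restr['end_kilometre_track'].to_numpy(dtype=float)[None, :]
    speed = line_speed_restr['speed'].to_numpy(dtype=float)[None, :]

    # Column vectors: one row per km
    km = np.asarray(kms, dtype=float)[:, None]
    lo = km + lo_offset
    hi = km + hi_offset

    # Same three conditions as the loop: start inside the window,
    # end inside the window, or the restriction spans the whole window
    mask = (((start >= lo) & (start < hi))
            | ((end >= lo) & (end < hi))
            | ((start <= lo) & (end > hi)))

    count = mask.sum(axis=1)
    total = np.where(mask, speed, 0.0).sum(axis=1)

    # Mean is NaN where no restriction matches, like Series.mean() on an empty selection
    mean = np.divide(total, count, out=np.full(count.shape, np.nan), where=count > 0)

    # `initial` keeps max/min well defined even when there are no restrictions at all
    max_speed = np.max(np.where(mask, speed, -np.inf), axis=1, initial=-np.inf)
    min_speed = np.min(np.where(mask, speed, np.inf), axis=1, initial=np.inf)
    max_speed = np.where(count > 0, max_speed, np.nan)
    min_speed = np.where(count > 0, min_speed, np.nan)

    return pd.DataFrame({'km': np.asarray(kms), 'count': count,
                         'mean': mean, 'max': max_speed, 'min': min_speed})


# Made-up data, only to show the call
demo = pd.DataFrame({
    'start_kilometre_track': [1.0, 4.5, 10.0],
    'end_kilometre_track': [1.5, 6.0, 12.0],
    'speed': [40, 60, 80],
})
stats_2km_prior = window_stats(demo, [3, 6, 20], -2, 0)
```

Inside the existing `for unique_line` loop, the helper would be called four times per line, with the offsets `(-2, 0)`, `(-5, -2)`, `(0, 2)` and `(2, 5)`, and the four results joined on `km`. The loops over dates and lines stay as they are in this sketch.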