Encountering errors after modifying the ring_road example
0 votes
/ 25 September 2019

I tried to run the ring_road example after making some changes to RLController. I added some constraints to the acceleration function in BaseController, which RLController inherits from. I used 5 CPUs for the run. The program gets stuck as shown below. Can anyone offer some ideas?
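For reference, the constraint I added is roughly the following sketch (illustrative only; the bound values and the helper name `constrain_accel` are placeholders for my local edit to the acceleration path in `flow/controllers/base_controller.py`). The full console output follows.

    # Simplified sketch of the kind of constraint added to the acceleration
    # function used by BaseController (illustrative, not the exact diff).
    import numpy as np

    MAX_ACCEL = 1.0   # m/s^2, illustrative upper bound
    MAX_DECEL = -1.0  # m/s^2, illustrative lower bound

    def constrain_accel(accel):
        """Clip the commanded acceleration before it is applied."""
        if accel is None:
            # RLController can return None before the policy supplies actions
            return None
        return float(np.clip(accel, MAX_DECEL, MAX_ACCEL))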

(base) hao@Hao:~/projects/flow/examples/rllib$ source activate flow
(flow) hao@Hao:~/projects/flow/examples/rllib$ python ring_hao.py 
Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-09-25_10-59-33_27001/logs.
Waiting for redis server at 127.0.0.1:42173 to respond...
Waiting for redis server at 127.0.0.1:42369 to respond...
Starting the Plasma object store with 6.698305126 GB memory using /dev/shm.
I0925 10:59:34.167969 27017 store.cc:994] Allowing the Plasma store to use up to 6.69831GB of memory.
I0925 10:59:34.168114 27017 store.cc:1024] Starting object store with directory /dev/shm and huge page support disabled
Failed to start the UI, you may need to run 'pip install jupyter'.
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 0/5 CPUs, 0/0 GPUs
Memory usage on this node: 14.7/16.7 GB

Created LogSyncer for /home/hao/ray_results/stabilizing_the_ring/PPO_WaveAttenuationPOEnv-v0_0_2019-09-25_10-59-34wp890kdw -> 
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 5/5 CPUs, 0/0 GPUs
Memory usage on this node: 14.7/16.7 GB
Result logdir: /home/hao/ray_results/stabilizing_the_ring
RUNNING trials:
 - PPO_WaveAttenuationPOEnv-v0_0:   RUNNING

2019-09-25 10:59:38,236 WARNING ppo.py:137 -- By default, observations will be normalized with MeanStdFilter
Loading configuration... done.
Success.
 Starting SUMO on port 34305
Loading configuration... done.
2019-09-25 10:59:39,706 INFO policy_evaluator.py:262 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
2019-09-25 10:59:39.706769: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2019-09-25 10:59:40,421 INFO multi_gpu_optimizer.py:74 -- LocalMultiGPUOptimizer devices ['/cpu:0']
Loading configuration... done.
Success.
 Starting SUMO on port 50757
Loading configuration... Loading configuration... done.
done.
Loading configuration... done.
Success.
Success.
 Starting SUMO on port 46977
 Starting SUMO on port 59675
Loading configuration... Loading configuration... done.
done.
Loading configuration... done.
Success.
 Starting SUMO on port 44953
Loading configuration... done.
2019-09-25 10:59:44,881 INFO policy_evaluator.py:262 -- Creating policy evaluation worker 1 on CPU (please ignore any CUDA init errors)
2019-09-25 10:59:44.882358: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-09-25 10:59:44,899 INFO policy_evaluator.py:262 -- Creating policy evaluation worker 4 on CPU (please ignore any CUDA init errors)
2019-09-25 10:59:44.900531: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-09-25 10:59:44,908 INFO policy_evaluator.py:262 -- Creating policy evaluation worker 2 on CPU (please ignore any CUDA init errors)
2019-09-25 10:59:44.909385: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-09-25 10:59:44,943 INFO policy_evaluator.py:262 -- Creating policy evaluation worker 3 on CPU (please ignore any CUDA init errors)
2019-09-25 10:59:44.944574: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "

-----------------------
ring length: 224
v_max: 3.4281086136538996
-----------------------

-----------------------
ring length: 221
v_max: 3.285334164169417
-----------------------

-----------------------
ring length: 242
v_max: 4.2844067680020546
-----------------------

-----------------------
ring length: 247
v_max: 4.522125250095652
-----------------------
Loading configuration... Loading configuration... done.
done.
Loading configuration... Loading configuration... done.
done.
Success.
Success.
 Starting SUMO on port 50757
 Starting SUMO on port 46977
Success.
Success.
 Starting SUMO on port 59675
 Starting SUMO on port 44953
Loading configuration... done.
Loading configuration... done.
Loading configuration... done.
Loading configuration... done.
construct a RLController using BaseController
construct a RLController using BaseController
construct a RLController using BaseController
construct a RLController using BaseController
Error processing event.
Traceback (most recent call last):
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/trial_runner.py", line 261, in _process_events
    result = self.trial_executor.fetch_result(trial)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/ray_trial_executor.py", line 211, in fetch_result
    result = ray.get(trial_future[0])
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/worker.py", line 2386, in get
    raise value
ray.worker.RayTaskError: ray_PPOAgent:train() (pid=27024, host=Hao)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/agents/agent.py", line 279, in train
    result = Trainable.train(self)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/trainable.py", line 146, in train
    result = self._train()
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/agents/ppo/ppo.py", line 101, in _train
    fetches = self.optimizer.step()
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/optimizers/multi_gpu_optimizer.py", line 125, in step
    self.num_envs_per_worker, self.train_batch_size)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/optimizers/rollout.py", line 28, in collect_samples
    next_sample = ray.get(fut_sample)
ray.worker.RayTaskError: ray_PolicyEvaluator:sample() (pid=27027, host=Hao)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/evaluation/policy_evaluator.py", line 368, in sample
    batches = [self.input_reader.next()]
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/offline/input_reader.py", line 25, in next
    batches = [self.sampler.get_data()]
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 64, in get_data
    item = next(self.rollout_provider)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 261, in _env_runner
    async_vector_env.poll()
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/env/async_vector_env.py", line 228, in poll
    self.new_obs = self.vector_env.vector_reset()
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/env/vector_env.py", line 79, in vector_reset
    return [e.reset() for e in self.envs]
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/env/vector_env.py", line 79, in <listcomp>
    return [e.reset() for e in self.envs]
  File "/home/hao/flow/flow/envs/ring/wave_attenuation.py", line 210, in reset
    observation = super().reset()
  File "/home/hao/flow/flow/envs/base.py", line 536, in reset
    observation, _, _, _ = self.step(rl_actions=None)
  File "/home/hao/flow/flow/envs/base.py", line 379, in step
    states = self.get_state()
  File "/home/hao/flow/flow/envs/ring/wave_attenuation.py", line 255, in get_state
    rl_id = self.k.vehicle.get_rl_ids()[0]
IndexError: list index out of range


Worker ip unknown, skipping log sync for /home/hao/ray_results/stabilizing_the_ring/PPO_WaveAttenuationPOEnv-v0_0_2019-09-25_10-59-34wp890kdw
Attempting to recover trial state from last checkpoint.
I0925 10:59:48.275229 27017 store.cc:599] Disconnecting client on fd 11
I0925 10:59:48.275298 27017 store.cc:599] Disconnecting client on fd 15
I0925 10:59:48.275626 27017 store.cc:599] Disconnecting client on fd 12
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 5/5 CPUs, 0/0 GPUs
Memory usage on this node: 15.5/16.7 GB: ***LOW MEMORY*** less than 10% of the memory on this node is available for use. This can cause unexpected crashes. Consider reducing the memory used by your application or reducing the Ray object store size by setting `object_store_memory` when calling `ray.init`.
Result logdir: /home/hao/ray_results/stabilizing_the_ring
RUNNING trials:
 - PPO_WaveAttenuationPOEnv-v0_0:   RUNNING, 1 failures: /home/hao/ray_results/stabilizing_the_ring/PPO_WaveAttenuationPOEnv-v0_0_2019-09-25_10-59-34wp890kdw/error_2019-09-25_10-59-48.txt

I0925 10:59:48.453184 27017 store.cc:599] Disconnecting client on fd 14
I0925 10:59:48.500077 27017 store.cc:599] Disconnecting client on fd 13
2019-09-25 10:59:51,950 WARNING ppo.py:137 -- By default, observations will be normalized with MeanStdFilter
Loading configuration... done.
Success.
 Starting SUMO on port 58821
Loading configuration... done.
2019-09-25 10:59:53,010 INFO policy_evaluator.py:262 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
2019-09-25 10:59:53.011323: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2019-09-25 10:59:53,664 INFO multi_gpu_optimizer.py:74 -- LocalMultiGPUOptimizer devices ['/cpu:0']
Loading configuration... done.
Loading configuration... done.
Success.
 Starting SUMO on port 38249
Success.
 Starting SUMO on port 54287
Loading configuration... done.
Loading configuration... done.
Loading configuration... done.
Success.
 Starting SUMO on port 48283
Loading configuration... done.
2019-09-25 10:59:54,856 INFO policy_evaluator.py:262 -- Creating policy evaluation worker 1 on CPU (please ignore any CUDA init errors)
2019-09-25 10:59:54.857009: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-09-25 10:59:54,861 INFO policy_evaluator.py:262 -- Creating policy evaluation worker 2 on CPU (please ignore any CUDA init errors)
2019-09-25 10:59:54.862636: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-09-25 10:59:54,882 INFO policy_evaluator.py:262 -- Creating policy evaluation worker 3 on CPU (please ignore any CUDA init errors)
2019-09-25 10:59:54.883449: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Loading configuration... done.
Success.
 Starting SUMO on port 53451
Loading configuration... done.
2019-09-25 10:59:58,307 INFO policy_evaluator.py:262 -- Creating policy evaluation worker 4 on CPU (please ignore any CUDA init errors)
2019-09-25 10:59:58.308767: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "

-----------------------
ring length: 238
v_max: 4.094180836186086
-----------------------

-----------------------
ring length: 251
v_max: 4.71224180704279
-----------------------

-----------------------
ring length: 238
v_max: 4.094180836186086
-----------------------

-----------------------
ring length: 269
v_max: 5.566938458220921
-----------------------
Loading configuration... Loading configuration... done.
done.
Loading configuration... done.
Loading configuration... done.
Success.
Success.
Success.
Success.
 Starting SUMO on port 54287
 Starting SUMO on port 48283
 Starting SUMO on port 38249
 Starting SUMO on port 53451
Loading configuration... Loading configuration... Loading configuration... done.
done.
done.
Loading configuration... done.
construct a RLController using BaseController
construct a RLController using BaseController
construct a RLController using BaseController
construct a RLController using BaseController
Error processing event.
Traceback (most recent call last):
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/trial_runner.py", line 261, in _process_events
    result = self.trial_executor.fetch_result(trial)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/ray_trial_executor.py", line 211, in fetch_result
    result = ray.get(trial_future[0])
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/worker.py", line 2386, in get
    raise value
ray.worker.RayTaskError: ray_PPOAgent:train() (pid=27214, host=Hao)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/agents/agent.py", line 279, in train
    result = Trainable.train(self)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/trainable.py", line 146, in train
    result = self._train()
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/agents/ppo/ppo.py", line 101, in _train
    fetches = self.optimizer.step()
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/optimizers/multi_gpu_optimizer.py", line 125, in step
    self.num_envs_per_worker, self.train_batch_size)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/optimizers/rollout.py", line 28, in collect_samples
    next_sample = ray.get(fut_sample)
ray.worker.RayTaskError: ray_PolicyEvaluator:sample() (pid=27266, host=Hao)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/evaluation/policy_evaluator.py", line 368, in sample
    batches = [self.input_reader.next()]
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/offline/input_reader.py", line 25, in next
    batches = [self.sampler.get_data()]
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 64, in get_data
    item = next(self.rollout_provider)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/evaluation/sampler.py", line 261, in _env_runner
    async_vector_env.poll()
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/env/async_vector_env.py", line 228, in poll
    self.new_obs = self.vector_env.vector_reset()
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/env/vector_env.py", line 79, in vector_reset
    return [e.reset() for e in self.envs]
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/rllib/env/vector_env.py", line 79, in <listcomp>
    return [e.reset() for e in self.envs]
  File "/home/hao/flow/flow/envs/ring/wave_attenuation.py", line 210, in reset
    observation = super().reset()
  File "/home/hao/flow/flow/envs/base.py", line 536, in reset
    observation, _, _, _ = self.step(rl_actions=None)
  File "/home/hao/flow/flow/envs/base.py", line 379, in step
    states = self.get_state()
  File "/home/hao/flow/flow/envs/ring/wave_attenuation.py", line 255, in get_state
    rl_id = self.k.vehicle.get_rl_ids()[0]
IndexError: list index out of range


Worker ip unknown, skipping log sync for /home/hao/ray_results/stabilizing_the_ring/PPO_WaveAttenuationPOEnv-v0_0_2019-09-25_10-59-34wp890kdw
Attempting to recover trial state from last checkpoint.
I0925 11:00:01.452862 27017 store.cc:599] Disconnecting client on fd 11
I0925 11:00:01.452919 27017 store.cc:599] Disconnecting client on fd 16
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 5/5 CPUs, 0/0 GPUs
Memory usage on this node: 15.7/16.7 GB: ***LOW MEMORY*** less than 10% of the memory on this node is available for use. This can cause unexpected crashes. Consider reducing the memory used by your application or reducing the Ray object store size by setting `object_store_memory` when calling `ray.init`.
Result logdir: /home/hao/ray_results/stabilizing_the_ring
RUNNING trials:
 - PPO_WaveAttenuationPOEnv-v0_0:   RUNNING, 2 failures: /home/hao/ray_results/stabilizing_the_ring/PPO_WaveAttenuationPOEnv-v0_0_2019-09-25_10-59-34wp890kdw/error_2019-09-25_11-00-01.txt

...

Worker ip unknown, skipping log sync for /home/hao/ray_results/stabilizing_the_ring/PPO_WaveAttenuationPOEnv-v0_0_2019-09-25_10-59-34wp890kdw
Attempting to recover trial state from last checkpoint.
I0925 11:00:11.351241 27017 store.cc:599] Disconnecting client on fd 20
Error processing event.
Traceback (most recent call last):
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/trial_runner.py", line 261, in _process_events
    result = self.trial_executor.fetch_result(trial)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/ray_trial_executor.py", line 211, in fetch_result
    result = ray.get(trial_future[0])
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/worker.py", line 2386, in get
    raise value
ray.worker.RayTaskError: ray_PPOAgent:train() (pid=27446, host=Hao)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/memory_monitor.py", line 78, in raise_if_low_memory
    self.error_threshold))
ray.memory_monitor.RayOutOfMemoryError: More than 95% of the memory on node Hao is used (16.04 / 16.75 GB). The top 5 memory consumers are:

PID MEM COMMAND
5826    1.1GB   /opt/google/chrome/chrome --type=renderer --field-trial-handle=13673121439687080380,1652662612341263
6161    1.03GB  /usr/bin/nautilus --gapplication-service
1827    0.87GB  /usr/bin/gnome-shell
29430   0.75GB  /snap/pycharm-community/150/jbr/bin/java -classpath /snap/pycharm-community/150/lib/bootstrap.jar:/s
2684    0.56GB  /opt/google/chrome/chrome --type=gpu-process --field-trial-handle=13673121439687080380,1652662612341

In addition, ~0.69 GB of shared memory is currently being used by the Ray object store. You can set the object store size with the `object_store_memory` parameter when starting Ray, and the max Redis size with `redis_max_memory`.

Worker ip unknown, skipping log sync for /home/hao/ray_results/stabilizing_the_ring/PPO_WaveAttenuationPOEnv-v0_0_2019-09-25_10-59-34wp890kdw
Attempting to recover trial state from last checkpoint.
I0925 11:00:11.423305 27017 store.cc:599] Disconnecting client on fd 19
Error processing event.
Traceback (most recent call last):
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/trial_runner.py", line 261, in _process_events
    result = self.trial_executor.fetch_result(trial)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/tune/ray_trial_executor.py", line 211, in fetch_result
    result = ray.get(trial_future[0])
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/worker.py", line 2386, in get
    raise value
ray.worker.RayTaskError: ray_PPOAgent:train() (pid=27452, host=Hao)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/_common.py", line 342, in wrapper
    ret = self._cache[fun]
AttributeError: _cache

During handling of the above exception, another exception occurred:

ray_PPOAgent:train() (pid=27452, host=Hao)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/_pslinux.py", line 1513, in wrapper
    return fun(self, *args, **kwargs)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/_common.py", line 345, in wrapper
    return fun(self)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/_pslinux.py", line 1559, in _parse_stat_file
    with open_binary("%s/%s/stat" % (self._procfs_path, self.pid)) as f:
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/_common.py", line 587, in open_binary
    return open(fname, "rb", **kwargs)
FileNotFoundError: [Errno 2] No such file or directory: '/proc/27449/stat'

During handling of the above exception, another exception occurred:

ray_PPOAgent:train() (pid=27452, host=Hao)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/__init__.py", line 473, in _init
    self.create_time()
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/__init__.py", line 823, in create_time
    self._create_time = self._proc.create_time()
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/_pslinux.py", line 1513, in wrapper
    return fun(self, *args, **kwargs)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/_pslinux.py", line 1723, in create_time
    ctime = float(self._parse_stat_file()['create_time'])
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/_pslinux.py", line 1524, in wrapper
    raise NoSuchProcess(self.pid, self._name)
psutil.NoSuchProcess: psutil.NoSuchProcess process no longer exists (pid=27449)

During handling of the above exception, another exception occurred:

ray_PPOAgent:train() (pid=27452, host=Hao)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/memory_monitor.py", line 78, in raise_if_low_memory
    self.error_threshold))
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/ray/memory_monitor.py", line 26, in get_message
    proc = psutil.Process(pid)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/__init__.py", line 446, in __init__
    self._init(pid)
  File "/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/psutil/__init__.py", line 486, in _init
    raise NoSuchProcess(pid, None, msg)
psutil.NoSuchProcess: psutil.NoSuchProcess no process found with pid 27449

Worker ip unknown, skipping log sync for /home/hao/ray_results/stabilizing_the_ring/PPO_WaveAttenuationPOEnv-v0_0_2019-09-25_10-59-34wp890kdw
I0925 11:00:11.492957 27017 store.cc:599] Disconnecting client on fd 16
Attempting to recover trial state from last checkpoint.
2019-09-25 11:00:15,188 WARNING ppo.py:137 -- By default, observations will be normalized with MeanStdFilter
Loading configuration... done.
Success.
 Starting SUMO on port 55505
Loading configuration... done.
2019-09-25 11:00:16,304 INFO policy_evaluator.py:262 -- Creating policy evaluation worker 0 on CPU (please ignore any CUDA init errors)
2019-09-25 11:00:16.305477: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
/home/hao/anaconda3/envs/flow/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2019-09-25 11:00:16,948 INFO multi_gpu_optimizer.py:74 -- LocalMultiGPUOptimizer devices ['/cpu:0']
...