How to fix the error "Offset commit failed on partition com.application.iot.measure.stage-0 at offset 1053078427: The request timed out."
0 votes
/ 04 June 2019

I have a Kafka consumer application that reads IoT data from a Kafka topic. However, I keep getting the following errors and warnings erratically.

logs

   2019-06-04T23:21:03.58+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:51:03.583 ERROR 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Offset commit failed on partition com.newton.forwarding.application.iot.measure.stage-0 at offset 1053164658: The coordinator is not aware of this member.
   2019-06-04T23:21:03.58+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:51:03.583  WARN 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Asynchronous auto-commit of offsets {com.newton.forwarding.application.iot.measure.stage-0=OffsetAndMetadata{offset=1053164658, leaderEpoch=null, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
   2019-06-04T23:21:03.58+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:51:03.583  WARN 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Synchronous auto-commit of offsets {com.newton.forwarding.application.iot.measure.stage-0=OffsetAndMetadata{offset=1053167516, leaderEpoch=null, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.

I have already tried several combinations of the max.poll.records and max.poll.interval.ms settings. I even tried increasing request.timeout.ms, but the errors and warnings do not stop.

Note: I have no control over the broker, so I cannot try changing session.timeout.ms, because it must fall within the range defined by the broker's group.min.session.timeout.ms and group.max.session.timeout.ms settings.

application.yml

spring:
  kafka:
    consumer:
      group-id: iot
      auto-offset-reset: earliest
      properties:
        fetch.max.wait.ms: 10000
        fetch.min.bytes: 30000000
        retry.backoff.ms: 1000
        max.poll.records: 4000000
        max.poll.interval.ms: 720000
        request.timeout.ms: 900000

Currently the behavior is erratic, as shown below.

logs

   2019-06-04T23:08:03.82+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:38:03.827 ERROR 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Offset commit failed on partition com.newton.forwarding.application.iot.measure.stage-0 at offset 1053064069: The coordinator is not aware of this member.
   2019-06-04T23:08:03.82+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:38:03.827  WARN 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Asynchronous auto-commit of offsets {com.newton.forwarding.application.iot.measure.stage-0=OffsetAndMetadata{offset=1053064069, leaderEpoch=null, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
   2019-06-04T23:08:03.82+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:38:03.827  WARN 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Synchronous auto-commit of offsets {com.newton.forwarding.application.iot.measure.stage-0=OffsetAndMetadata{offset=1053066926, leaderEpoch=null, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
   2019-06-04T23:09:43.04+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:39:43.044  WARN 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Synchronous auto-commit of offsets {com.newton.forwarding.application.iot.measure.stage-0=OffsetAndMetadata{offset=1053064069, leaderEpoch=null, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
   2019-06-04T23:10:00.13+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:40:00.130  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 119. No Of measures: 2857
   2019-06-04T23:10:12.90+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:40:12.909  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 125. No Of measures: 2893
   2019-06-04T23:10:22.94+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:40:22.948  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 74. No Of measures: 2880
   2019-06-04T23:10:34.44+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:40:34.445  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 73. No Of measures: 2862
   2019-06-04T23:10:50.50+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:40:50.501  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 73. No Of measures: 2866
   2019-06-04T23:10:56.08+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:40:56.086 ERROR 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Offset commit failed on partition com.newton.forwarding.application.iot.measure.stage-0 at offset 1053075561: The coordinator is not aware of this member.
   2019-06-04T23:10:56.08+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:40:56.086  WARN 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Asynchronous auto-commit of offsets {com.newton.forwarding.application.iot.measure.stage-0=OffsetAndMetadata{offset=1053075561, leaderEpoch=null, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
   2019-06-04T23:10:56.08+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:40:56.086  WARN 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Synchronous auto-commit of offsets {com.newton.forwarding.application.iot.measure.stage-0=OffsetAndMetadata{offset=1053078427, leaderEpoch=null, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
   2019-06-04T23:11:03.86+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:41:03.867 ERROR 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Offset commit failed on partition com.newton.forwarding.application.iot.measure.stage-0 at offset 1053078427: The coordinator is not aware of this member.
   2019-06-04T23:11:05.50+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:41:05.506  WARN 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Asynchronous auto-commit of offsets {com.newton.forwarding.application.iot.measure.stage-0=OffsetAndMetadata{offset=1053078427, leaderEpoch=null, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
   2019-06-04T23:11:33.74+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:41:33.743  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 79. No Of measures: 2862
   2019-06-04T23:11:45.66+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:41:45.664  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 109. No Of measures: 2866
   2019-06-04T23:11:56.49+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:41:56.492  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 75. No Of measures: 2880
   2019-06-04T23:12:08.39+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:42:08.390  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 90. No Of measures: 2889
   2019-06-04T23:12:15.71+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:42:15.716 ERROR 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Offset commit failed on partition com.newton.forwarding.application.iot.measure.stage-0 at offset 1053078427: The request timed out.
   2019-06-04T23:12:25.00+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:42:25.001  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 80. No Of measures: 2880
   2019-06-04T23:12:43.71+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:42:43.714  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 97. No Of measures: 2870
   2019-06-04T23:13:02.37+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:43:02.374  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 121. No Of measures: 2868
   2019-06-04T23:13:21.72+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:43:21.724  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 99. No Of measures: 2867
   2019-06-04T23:13:42.36+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:43:42.368  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 101. No Of measures: 2860
   2019-06-04T23:14:01.73+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:44:01.737  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 145. No Of measures: 2862
   2019-06-04T23:14:19.28+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:44:19.287  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 118. No Of measures: 2873
   2019-06-04T23:14:37.63+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:44:37.630  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 104. No Of measures: 2866
   2019-06-04T23:14:55.88+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:44:55.889  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 117. No Of measures: 2880
   2019-06-04T23:15:12.29+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:45:12.298  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 203. No Of measures: 2880
   2019-06-04T23:15:31.48+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:45:31.480  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 105. No Of measures: 2880
   2019-06-04T23:15:51.25+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:45:51.251  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 176. No Of measures: 2880
   2019-06-04T23:16:06.69+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:46:06.692  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 157. No Of measures: 2880
   2019-06-04T23:16:23.27+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:46:23.271  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 110. No Of measures: 2880
   2019-06-04T23:16:39.18+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:46:39.184  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 88. No Of measures: 2880
   2019-06-04T23:16:58.28+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:46:58.285  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 108. No Of measures: 2880
   2019-06-04T23:17:17.67+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:47:17.676  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 141. No Of measures: 2885
   2019-06-04T23:17:36.67+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:47:36.669  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 107. No Of measures: 2880
   2019-06-04T23:17:53.78+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:47:53.783  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 344. No Of measures: 2855
   2019-06-04T23:18:12.35+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:48:12.351  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 67. No Of measures: 2880
   2019-06-04T23:18:29.12+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:48:29.129  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 109. No Of measures: 2895
   2019-06-04T23:18:46.31+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:48:46.313  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 131. No Of measures: 2861
   2019-06-04T23:19:03.72+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:49:03.729  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 116. No Of measures: 2880
   2019-06-04T23:19:22.91+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:49:22.913  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 113. No Of measures: 2867
   2019-06-04T23:19:40.83+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:49:40.832  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 118. No Of measures: 2859
   2019-06-04T23:19:58.58+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:49:58.587  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 106. No Of measures: 2880
   2019-06-04T23:20:16.08+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:50:16.086  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 89. No Of measures: 2880
   2019-06-04T23:20:35.23+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:50:35.239  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 163. No Of measures: 2854
   2019-06-04T23:20:55.44+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:50:55.446  INFO 12 --- [ntainer#0-0-C-1] c.s.n.f.a.s.impl.ConsumerServiceImpl     : Time taken(ms) 214. No Of measures: 2858
   2019-06-04T23:21:03.58+0530 [APP/PROC/WEB/0] OUT 2019-06-04 17:51:03.583 ERROR 12 --- [ntainer#0-0-C-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-2, groupId=newton] Offset commit failed on partition com.newton.forwarding.application.iot.measure.stage-0 at offset 1053164658: The coordinator is not aware of this member.

Any advice on solving this problem is much appreciated.

PS: Should I consider isolating the listener/consumer thread from the data processing by handing the processing off to worker threads, as suggested in this question?
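The hand-off idea in the PS could look roughly like the sketch below. It uses only java.util.concurrent and no Kafka classes; the class WorkerHandoffSketch and its methods are hypothetical names made up for illustration, and strings stand in for consumer records. The point it shows is the bounded queue: when tryHandOff returns false the workers are behind, and a real poll loop would then pause its partitions (KafkaConsumer has pause and resume for this) and keep calling poll() so that max.poll.interval.ms is not exceeded. It assumes offsets would be committed only after a worker has finished a batch, which the sketch does not implement.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: decouples the polling thread from record processing.
public class WorkerHandoffSketch {
    // Bounded queue: a full queue tells the polling thread that workers are behind.
    private final BlockingQueue<List<String>> pending;
    private final AtomicLong processed = new AtomicLong();

    public WorkerHandoffSketch(int capacity) {
        this.pending = new ArrayBlockingQueue<>(capacity);
    }

    // Polling thread: never blocks; false means "queue full, try again later".
    public boolean tryHandOff(List<String> batch) {
        return pending.offer(batch);
    }

    // Worker thread: waits up to timeoutMs for a batch and processes it.
    public boolean processNext(long timeoutMs) {
        try {
            List<String> batch = pending.poll(timeoutMs, TimeUnit.MILLISECONDS);
            if (batch == null) {
                return false;
            }
            processed.addAndGet(batch.size()); // stand-in for real measure handling
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public boolean hasPending() {
        return !pending.isEmpty();
    }

    public long processedCount() {
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        WorkerHandoffSketch handoff = new WorkerHandoffSketch(2);
        AtomicBoolean done = new AtomicBoolean(false);
        ExecutorService worker = Executors.newSingleThreadExecutor();
        worker.execute(() -> {
            // Keep working until the producer is done and the queue is drained.
            while (!done.get() || handoff.hasPending()) {
                handoff.processNext(50);
            }
        });
        for (int i = 0; i < 5; i++) {
            List<String> batch = Arrays.asList("m" + i + "a", "m" + i + "b");
            while (!handoff.tryHandOff(batch)) {
                Thread.sleep(10); // a real loop would pause partitions and poll here
            }
        }
        done.set(true);
        worker.shutdown();
        worker.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("processed=" + handoff.processedCount()); // prints processed=10
    }
}
```

The trade-off is that the consumer no longer knows a record was processed just because poll() returned it, so commit handling has to move to wherever the workers report completion.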

1 Answer

0 votes
/ 06 June 2019

You need to determine how long it takes to process 4000000 records and set max.poll.interval.ms to a suitable value.
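As a rough way to make that estimate, the sketch below scales the timings from the question's own logs (about 2880 measures in roughly 100 ms, and a slowest sample of 344 ms for 2855 measures) up to a full batch of 4000000 records. It assumes processing time grows linearly with batch size and that the logged "Time taken" covers all of the work done per batch; neither is confirmed by the question. The class name, method names, and the 2x headroom factor are illustrative choices, not recommendations from Kafka.

```java
// Hypothetical helper: extrapolates a measured sample to a full poll batch.
public class PollIntervalEstimate {

    // Scales (sampleRecords, sampleMillis) linearly to maxPollRecords, rounding up.
    public static long estimateBatchMillis(long maxPollRecords, long sampleRecords, long sampleMillis) {
        if (sampleRecords <= 0) {
            throw new IllegalArgumentException("sampleRecords must be positive");
        }
        return (maxPollRecords * sampleMillis + sampleRecords - 1) / sampleRecords;
    }

    // Adds headroom on top of the estimate; the factor is an arbitrary safety margin.
    public static long withHeadroom(long batchMillis, double safetyFactor) {
        return (long) Math.ceil(batchMillis * safetyFactor);
    }

    public static void main(String[] args) {
        long typical = estimateBatchMillis(4_000_000L, 2880L, 100L);
        long slowest = estimateBatchMillis(4_000_000L, 2855L, 344L);
        System.out.println("typical estimate (ms): " + typical);   // 138889
        System.out.println("slowest estimate (ms): " + slowest);   // 481962
        System.out.println("with 2x headroom (ms): " + withHeadroom(slowest, 2.0)); // 963924
    }
}
```

Under these assumptions the slowest sample with 2x headroom comes out above the configured max.poll.interval.ms of 720000, so either the interval would need to grow or max.poll.records would need to shrink.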

It is bad design to have only one partition for such a large number of records.
