Flume restarts and removes file channel files after a sink failure
October 24, 2018

I use Flume with a file channel to collect logs and sink them to an MQ.

Sometimes Flume restarts automatically because of a connection timeout exception between Flume and the MQ, as shown in the log excerpt.

For brevity, duplicate and similar entries have been removed from the excerpt. Is this normal?

24 十月 2018 20:40:23,192 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start:62)  - Configuration provider starting
24 十月 2018 20:40:23,200 INFO  [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:134)  - Reloading configuration file:/flume/conf/odps.conf
24 十月 2018 20:40:23,210 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016)  - Processing:k1
24 十月 2018 20:40:23,210 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016)  - Processing:k2
24 十月 2018 20:40:23,210 INFO  [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016)  - Processing:k3
24 十月 2018 20:40:23,310 INFO  [conf-file-poller-0] (org.apache.flume.channel.DefaultChannelFactory.create:42)  - Creating instance of channel c1 type file
24 十月 2018 20:40:23,310 INFO  [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:201)  - Created channel c1
24 十月 2018 20:40:23,311 INFO  [conf-file-poller-0] (org.apache.flume.source.DefaultSourceFactory.create:41)  - Creating instance of source r1, type exec
24 十月 2018 20:40:23,339 INFO  [conf-file-poller-0] (org.apache.flume.sink.DefaultSinkFactory.create:42)  - Creating instance of sink: k1, type: com.aliyun.datahub.flume.sink.DatahubSink
24 十月 2018 20:40:23,350 INFO  [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.getConfiguration:116)  - Channel c1 connected to [r1, k1]
24 十月 2018 20:40:23,373 INFO  [lifecycleSupervisor-1-3] (org.apache.flume.channel.file.Log.<init>:344)  - Encryption is not enabled
24 十月 2018 20:40:23,375 INFO  [lifecycleSupervisor-1-3] (org.apache.flume.channel.file.Log.replay:393)  - Replay started
24 十月 2018 20:40:23,384 INFO  [lifecycleSupervisor-1-6] (org.apache.flume.channel.file.Log.replay:405)  - Found NextFileID 9, from [/flume/data/c7/log-8, /flume/data/c1/log-9]
24 十月 2018 20:40:23,391 INFO  [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.EventQueueBackingStoreFileV3.<init>:53)  - Starting up with /flume/checkpoint/c1/checkpoint and /flume/checkpoint/c11/checkpoint.meta
24 十月 2018 20:40:23,393 INFO  [lifecycleSupervisor-1-9] (org.apache.flume.channel.file.EventQueueBackingStoreFileV3.<init>:57)  - Reading checkpoint metadata from /flume/checkpoint/c9/checkpoint.meta
24 十月 2018 20:40:23,540 INFO  [lifecycleSupervisor-1-7] (org.apache.flume.channel.file.FlumeEventQueue.<init>:115)  - QueueSet population inserting 0 took 0
24 十月 2018 20:40:23,547 INFO  [lifecycleSupervisor-1-3] (org.apache.flume.channel.file.Log.replay:444)  - Last Checkpoint Wed Oct 24 20:35:12 CST 2018, queue depth = 0
24 十月 2018 20:40:23,599 INFO  [lifecycleSupervisor-1-8] (org.apache.flume.channel.file.Log.doReplay:529)  - Replaying logs with v2 replay logic

However, I also found "Removing old file" messages in the Flume log. The "old files" are in the channel's data directory, so it looks as though the events in the file channel have been lost. That does not seem reasonable to me.

So, are there any "fail-safe" configuration options?

24 十月 2018 20:42:52,657 INFO  [Log-BackgroundWorker-c3] (org.apache.flume.channel.file.LogFile$RandomReader.close:520)  - Closing RandomReader /flume/data/c3/log-13
24 十月 2018 20:42:52,659 INFO  [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.writeCheckpoint:1052)  - Updated checkpoint for file: /flume/data/c11/log-14 position: 4864 logWriteOrderID: 1540387389295
24 十月 2018 20:42:52,659 INFO  [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108)  - Removing old file: /flume/data/c11/log-10
24 十月 2018 20:42:52,659 INFO  [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108)  - Removing old file: /flume/data/c11/log-10.meta
24 十月 2018 20:42:52,660 INFO  [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108)  - Removing old file: /flume/data/c11/log-11
24 十月 2018 20:42:52,660 INFO  [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:252)  - Updating checkpoint metadata: logWriteOrderID: 1540387389297, queueSize: 161908, queueHead: 71092
24 十月 2018 20:42:52,660 INFO  [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108)  - Removing old file: /flume/data/c11/log-11.meta
24 十月 2018 20:42:52,660 INFO  [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108)  - Removing old file: /flume/data/c11/log-12
24 十月 2018 20:42:52,660 INFO  [Log-BackgroundWorker-c11] (org.apache.flume.channel.file.Log.removeOldLogs:1108)  - Removing old file: /flume/data/c11/log-12.meta
24 十月 2018 20:42:52,665 INFO  [Log-BackgroundWorker-c8] (org.apache.flume.channel.file.Log.writeCheckpoint:1052)  - Updated checkpoint for file: /flume/data/c8/log-14 position: 3003 logWriteOrderID: 1540387389296
24 十月 2018 20:42:52,666 INFO  [Log-BackgroundWorker-c8] (org.apache.flume.channel.file.LogFile$RandomReader.close:520)  - Closing RandomReader /flume/data/c8/log-11
24 十月 2018 20:42:52,671 INFO  [Log-BackgroundWorker-c8] (org.apache.flume.channel.file.LogFile$RandomReader.close:520)  - Closing RandomReader /flume/data/c8/log-12
24 十月 2018 20:42:52,672 INFO  [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.Log.writeCheckpoint:1052)  - Updated checkpoint for file: /flume/data/c7/log-15 position: 4658633 logWriteOrderID: 1540387389297
24 十月 2018 20:42:52,672 INFO  [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.LogFile$RandomReader.close:520)  - Closing RandomReader /flume/data/c7/log-11
24 十月 2018 20:42:52,679 INFO  [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.LogFile$RandomReader.close:520)  - Closing RandomReader /flume/data/c7/log-12
24 十月 2018 20:42:52,689 INFO  [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.LogFile$RandomReader.close:520)  - Closing RandomReader /flume/data/c7/log-13
24 十月 2018 20:42:52,698 INFO  [Log-BackgroundWorker-c7] (org.apache.flume.channel.file.LogFile$RandomReader.close:520)  - Closing RandomReader /flume/data/c7/log-14
24 十月 2018 20:42:53,362 INFO  [Log-BackgroundWorker-c2] (org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint:227)  - Start checkpoint for /flume/checkpoint/c2/checkpoint, elements to sync = 16
24 十月 2018 20:42:53,370 INFO  [Log-BackgroundWorker-c2] (org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:252)  - Updating checkpoint metadata: logWriteOrderID: 1540387389766, queueSize: 70, queueHead: 4238
...
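
On the question of fail-safe options, the fragment below sketches the file channel settings that, according to the Flume User Guide, relate to durability and recovery. It is an illustration under assumptions, not a confirmed fix: the agent name `a1` and the backup checkpoint path are hypothetical, while `c1`, `/flume/checkpoint/c1`, and `/flume/data/c1` are taken from the log excerpts above.

```properties
# Hypothetical agent name "a1"; channel name and paths follow the log excerpts.
a1.channels = c1
a1.channels.c1.type = file

# Where the checkpoint and the event data files live.
a1.channels.c1.checkpointDir = /flume/checkpoint/c1
a1.channels.c1.dataDirs = /flume/data/c1

# Keep a second copy of the checkpoint so a corrupted primary checkpoint
# does not force a full replay. The backup path is an assumption and must
# differ from checkpointDir and dataDirs.
a1.channels.c1.useDualCheckpoints = true
a1.channels.c1.backupCheckpointDir = /flume/checkpoint-backup/c1

# Interval between checkpoints in milliseconds (documented default).
a1.channels.c1.checkpointInterval = 30000
```

As far as I understand, the file channel deletes a data file only after a checkpoint, once no event still in the queue refers to that file, so a "Removing old file" message does not by itself show that events were lost.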