When I try to run a simple command (in this case a /system routerboard print from the router) against a thousand hosts, I get significant hang time between consecutive serial batches. The reason I resorted to serial in the first place was that, with the commands running asynchronously, the second task that pulls the useful info out of the previous one would eventually slow to a crawl. I figured running in batches of 200 would meet my needs. Below are the profile times for my tasks, my current ansible config file, my playbook structure, and everything except my inventory file, which contains too much sensitive information.
Task profile times
PLAY [Open22]
TASK [Test : print info]
Thursday 27 February 2020 16:01:36 +0000 (0:00:00.051) 0:00:00.051
TASK [Test : check the task]
Thursday 27 February 2020 16:01:47 +0000 (0:00:10.686) 0:00:10.738
PLAY [Open22]
TASK [Test : print info]
Thursday 27 February 2020 16:03:41 +0000 (0:01:54.000) 0:02:04.739
TASK [Test : check the task]
Thursday 27 February 2020 16:03:52 +0000 (0:00:10.675) 0:02:15.414
PLAY [Open22]
TASK [Test : print info]
Thursday 27 February 2020 16:04:32 +0000 (0:00:00.052) 0:00:00.052
TASK [Test : check the task]
Thursday 27 February 2020 16:04:43 +0000 (0:00:10.996) 0:00:11.048
PLAY [Open22]
TASK [Test : print info]
Thursday 27 February 2020 16:06:36 +0000 (0:01:53.780) 0:02:04.829
TASK [Test : check the task]
Thursday 27 February 2020 16:06:47 +0000 (0:00:10.839) 0:02:15.669
PLAY [Open22]
TASK [Test : print info]
Thursday 27 February 2020 16:08:33 +0000 (0:01:45.977) 0:04:01.646
TASK [Test : check the task]
Thursday 27 February 2020 16:08:44 +0000 (0:00:10.708) 0:04:12.354
PLAY [Open22]
TASK [Test : print info]
Thursday 27 February 2020 16:09:12 +0000 (0:00:27.695) 0:04:40.050
TASK [Test : check the task]
Thursday 27 February 2020 16:09:23 +0000 (0:00:10.975) 0:04:51.025
PLAY [Open22]
TASK [Test : print info]
Thursday 27 February 2020 16:09:44 +0000 (0:00:20.888) 0:05:11.914
TASK [Test : check the task]
Thursday 27 February 2020 16:09:55 +0000 (0:00:11.035) 0:05:22.949
Thursday 27 February 2020 16:10:16 +0000 (0:00:21.639) 0:05:44.589
===============================================================================
Test : check the task ------------------------------------------------------------------------ 21.64s
Test : print info ---------------------------------------------------------------------------- 11.04s
Playbook run took 0 days, 0 hours, 5 minutes, 44 seconds
ansible.cfg file below
[defaults]
forks = 100
poll_interval = 5
host_key_checking = False
callback_whitelist = profile_tasks, timer
log_path = /etc/ansible/log.txt
deprecation_warnings = False
[inventory]
[privilege_escalation]
[paramiko_connection]
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=180000 -o PreferredAuthentications=publickey
control_path = %(directory)s/ansible-ssh-%%h-%%p-%%r
pipelining = True
[persistent_connection]
[accelerate]
[selinux]
[colors]
[diff]
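The only config-side change I can think of here is tightening the persistent-connection timeouts, on the assumption that the 30-second defaults are what I sit through for every dead box. Just a sketch; the 10-second values are my own guesses:
[persistent_connection]
# connect_timeout and command_timeout both default to 30 seconds;
# the values below are guesses, not tested numbers
connect_timeout = 10
command_timeout = 10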
Test-Targets.yml
---
- name: Open22
  gather_facts: False
  serial: 200
  hosts:
    - 106W87thNewYorkNY
    - 130WestKentMT
    - 1411WillowLouisvilleKY
    - 16CrosbyStNewYorkNY
    - 180WestCarrboroNC
    - 200EastAptsDurhamNC
    - 2211ThirdAvenueNYC
    - 23HundredAtRidgeviewPlanoTX
    - 2500HudsonTerraceFortLeeNJ
    - 25SuttonPlaceNewYorkNY
    - 300East62ndStNewYorkNY
    - 314HamiltonSaginawMI
    - 3401RedRiverAustinTX
    - 3800HamptonStLouisMO
    - 380RiversideNewYorkNY
    - 40HydeParkAustinTx
    - 4650HamptonStLouisMO
    - 5258MarcyAveBrooklynNY
    - 53348thAvenueAstoriaNY
    - 555EastAptsWinstonSalemNC
    - 9615ShoreRoadNewYork
    - AbbotsgateLoftsPowellOH
    - AberdeenAcresSanAntonioTX
    - AlexanderVillageCharlotteNC
    - AlonAtCastleHillsSanAntonioTX
  roles:
    - Test
main.yml
- name: print info
  routeros_command:
    commands: /system routerboard print
  async: 10000
  poll: 0
  register: yum_sleeper

- name: check the task
  async_status:
    jid: "{{ yum_sleeper.ansible_job_id }}"
  register: job_result
  until: job_result.finished
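For what it's worth, a variant of the polling task I've been experimenting with gives async_status an explicit retry budget, since an until loop otherwise only retries 3 times. The retries/delay numbers here are my own guesses, not tested values:
- name: check the task
  async_status:
    jid: "{{ yum_sleeper.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 60   # without this, until loops give up after 3 retries
  delay: 5      # seconds between polls; both numbers are guesses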
Playbook structure
└── MikroTikMajorPush3
├── InventoryList
├── roles
│ └── Test
│ └── tasks
│ └── main.yml
└── Test-Targets.yml
This does actually get the job done; it's just that the hangs are killing me with how much time they add to the overall play. Dropping serial lets the first task finish quickly, but then the second task gradually slows to a crawl.
Edit: after some debugging it looks like this may be tied to failed connections; Ansible doesn't like it when it can't talk to something. Any ideas on how to tell Ansible not to dwell on those? Below is a pile of examples of the various text blurbs complaining about these issues, and after them a sketch of the kind of play-level tweak I'm considering.
10692 1582838291.00956: host 180west-h01 is done iterating, returning
10692 1582838291.00963: getting the next task for host 180west-h02
socket_path issue category in Network Debug and Troubleshooting Guide
10692 1582838291.12410: got an error while closing persistent connection: socket_path does not exist or cannot be found.
t task for host 180west-s02
10692 1582838291.18927: done getting next task for host 180west-s02
10692 1582838291.18935: ^ task is: TASK: meta (flush_handlers)
10692 1582838291.18941: ^ state is: HOST STATE: block=1, task=1, rescue=0, always=0, run_state=ITERATING_TASKS, fail_state=FAILED_NONE, pending_setup=False, tasks child state? (None), rescue child state? (None), always child state? (None), did rescue? False, did start at task? False
10692 1582838291.17193: ^ failed state is now: HOST STATE: block=0, task=0, rescue=0, always=0, run_state=ITERATING_COMPLETE, fail_state=FAILED_SETUP, pending_setup=False, tasks child state? (None), rescue child state? (None), always child state? (None), did rescue? False, did start at task? False
getting the next task for host 2211-2l
10692 1582838291.30568: done getting next task for host 2211-2l
10692 1582838291.30578: ^ task is: TASK: meta (flush_handlers)
10692 1582838291.30585: ^ state is: HOST STATE: block=1, task=1, rescue=0, always=0, run_state=ITERATING_TASKS, fail_state=FAILED_NONE, pending_setup=False, tasks child state? (None), rescue child state? (None), always child state? (None), did rescue? False, did start at task? False
10692 1582838291.30593: getting the next task for host 2211-2n
10692 1582838291.30601: done getting next task for host 2211-2n
10692 1582838291.30608: ^ task is: TASK: meta (flush_handlers)
10692 1582838291.30615: ^ state is: HOST STATE: block=1, task=1, rescue=0, always=0, run_state=ITERATING_TASKS, fail_state=FAILED_NONE, pending_setup=False, tasks child state? (None), rescue child state? (None), always child state? (None), did rescue? False, did start at task? False
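For reference, this is the kind of play-level tweak I've been considering to keep Ansible from dwelling on hosts it can't reach. ignore_unreachable needs Ansible 2.7+, and the 10-second command timeout is only a guess on my part:
- name: Open22
  gather_facts: False
  serial: 200
  ignore_unreachable: True        # keep going on hosts Ansible can't reach instead of failing them out of the batch
  vars:
    ansible_command_timeout: 10   # timeout for the persistent network_cli connection (default 30s); value is a guess
  hosts:
    - 106W87thNewYorkNY
    # ...same host list as above
  roles:
    - Test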