Pacemaker does not start the jboss and pgsql resources
0 votes
/ December 26, 2018

I am testing Pacemaker on two servers.

Both nodes run CentOS 7 x64 with:

    jdk-7u80-linux-x64
    JBoss 7.1.1 Final
    Pgsql (PostgreSQL) 9.2.24

    pcs --version
    0.9.165

Three resources are configured. IPaddr2 works without problems, but jboss and pgsql do not.

If I start them by hand with the commands

    /bin/sh /usr/lib/ocf/resource.d/heartbeat/pgsql start
    /bin/sh /usr/lib/ocf/resource.d/heartbeat/jboss start

they work, but Pacemaker does not see them.

    [root@centos-test1 heartbeat]# pcs status --all
    Cluster name: test
    Stack: corosync
    Current DC: centos-test1 (version 1.1.19-8.el7_6.2-c3c624ea3d) - partition with quorum
    Last updated: Wed Dec 26 06:58:21 2018
    Last change: Wed Dec 26 06:07:27 2018 by root via cibadmin on centos-test1

    2 nodes configured
    3 resources configured

    Online: [centos-test1 centos-test2]

    Full list of resources:

    virtual_ip (ocf::heartbeat:IPaddr2): Started centos-test1
    jboss (ocf::heartbeat:jboss): Stopped
    pgsql (ocf::heartbeat:pgsql): Stopped

    Failed Actions:
    * jboss_start_0 on centos-test1 'unknown error' (1): call=18, status=Timed Out, exitreason='',
        last-rc-change='Wed Dec 26 06:08:16 2018', queued=0ms, exec=20002ms
    * pgsql_start_0 on centos-test1 'not configured' (6): call=15, status=complete, exitreason='',
        last-rc-change='Wed Dec 26 06:07:56 2018', queued=0ms, exec=115ms
    * jboss_start_0 on centos-test2 'unknown error' (1): call=14, status=Timed Out, exitreason='',
        last-rc-change='Wed Dec 26 13:07:04 2018', queued=0ms, exec=20002ms

    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled

The ocf::heartbeat agents failed with environment variable errors, so I had to set the paths explicitly in the agent files:

     # Initialization:

     : /usr/lib/ocf/lib/heartbeat
     . /usr/lib/ocf/lib/heartbeat/ocf-shellfuncs

     #: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
     # ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs
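
The commented-out lines derive OCF_FUNCTIONS_DIR from OCF_ROOT. Pacemaker exports OCF_ROOT when it runs an agent, but a shell started by hand does not have it, which is a likely source of the errors. A possible alternative to editing the agent files is to export the variable before calling the agent. This is only a sketch: the helper agent_path is a made-up name, and /usr/lib/ocf is assumed from the paths used above.

```shell
#!/bin/sh
# Sketch: run an OCF agent by hand with the environment Pacemaker would provide.
# agent_path is an illustrative helper; /usr/lib/ocf is assumed from the paths above.

OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}"
export OCF_ROOT

agent_path() {
    # $1 = provider (e.g. heartbeat), $2 = agent name (e.g. pgsql)
    echo "$OCF_ROOT/resource.d/$1/$2"
}

agent="$(agent_path heartbeat pgsql)"
if [ -x "$agent" ]; then
    # Resource parameters would be passed as OCF_RESKEY_<name> variables;
    # none are set here, so the agent falls back to its defaults.
    "$agent" validate-all
    echo "validate-all exit code: $?"
else
    echo "agent not found: $agent"
fi
```

An exit code of 6 from validate-all would match the 'not configured' result that the cluster reports for pgsql_start_0.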

corosync.log

     Dec 26 14:19:21 [27771] centos-test1    pengine:     info: common_print:        virtual_ip      (ocf::heartbeat:IPaddr2):    Started centos-test1
     Dec 26 14:19:21 [27771] centos-test1    pengine:     info: common_print:        jboss   (ocf::heartbeat:jboss): FAILED centos-test1
     Dec 26 14:19:21 [27771] centos-test1    pengine:     info: common_print:        pgsql   (ocf::heartbeat:pgsql): Stopped
     Dec 26 14:19:21 [27771] centos-test1    pengine:     info: pe_get_failcount:    jboss has failed INFINITY times on centos-test1
     Dec 26 14:19:21 [27771] centos-test1    pengine:  warning: check_migration_threshold:   Forcing jboss away from centos-test1 after 1000000 failures (max=1000000)
     Dec 26 14:19:21 [27771] centos-test1    pengine:     info: pe_get_failcount:    pgsql has failed INFINITY times on centos-test1
     Dec 26 14:19:21 [27771] centos-test1    pengine:  warning: check_migration_threshold:   Forcing pgsql away from centos-test1 after 1000000 failures (max=1000000)
     Dec 26 14:19:21 [27771] centos-test1    pengine:     info: pe_get_failcount:    jboss has failed INFINITY times on centos-test2
     Dec 26 14:19:21 [27771] centos-test1    pengine:  warning: check_migration_threshold:   Forcing jboss away from centos-test2 after 1000000 failures (max=1000000)
     Dec 26 14:19:21 [27771] centos-test1    pengine:     info: native_color:        Resource jboss cannot run anywhere
     Dec 26 14:19:21 [27771] centos-test1    pengine:     info: native_color:        Resource pgsql cannot run anywhere
     Dec 26 14:19:21 [27771] centos-test1    pengine:     info: LogActions:  Leave   virtual_ip      (Started centos-test1)
     Dec 26 14:19:21 [27771] centos-test1    pengine:   notice: LogAction:    * Stop       jboss          (                 centos-test1 )   due to node availability
     Dec 26 14:19:21 [27771] centos-test1    pengine:     info: LogActions:  Leave   pgsql   (Stopped)
     Dec 26 14:19:21 [27771] centos-test1    pengine:   notice: process_pe_message:  Calculated transition 5, saving inputs in /var/lib/pacemaker/pengine/pe-input-266.bz2
     Dec 26 14:19:21 [27772] centos-test1       crmd:     info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response
     Dec 26 14:19:21 [27772] centos-test1       crmd:     info: do_te_invoke:        Processing graph 5 (ref=pe_calc-dc-1545823161-30) derived from /var/lib/pacemaker/pengine/pe-input-266.bz2
     Dec 26 14:19:21 [27772] centos-test1       crmd:   notice: te_rsc_command:      Initiating stop operation jboss_stop_0 locally on centos-test1 | action 2
     Dec 26 14:19:21 [27772] centos-test1       crmd:     info: do_lrm_rsc_op:       Performing key=2:5:0:19594a89-d772-4748-8c9a-5a7888a82914 op=jboss_stop_0
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/54)
     Dec 26 14:19:21 [27769] centos-test1       lrmd:     info: log_execute: executing - rsc:jboss action:stop call_id:18
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_perform_op:      Diff: --- 0.15.35 2
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_perform_op:      Diff: +++ 0.15.36 (null)
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_perform_op:      +  /cib:  @num_updates=36
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_perform_op:      +  /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='jboss']/lrm_rsc_op[@id='jboss_last_0']:  @operation_key=jboss_stop_0, @operation=stop, @transition-key=2:5:0:19594a89-d772-4748-8c9a-5a7888a82914, @transition-magic=-1:193;2:5:0:19594a89-d772-4748-8c9a-5a7888a82914, @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1545823161, @last-rc-change=1545823161, @exec-time=0
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=centos-test1/crmd/54, version=0.15.36)
     Dec 26 14:19:21  jboss(jboss)[28346]:    INFO: JBoss[jboss] is already stopped.
     Dec 26 14:19:21 [27769] centos-test1       lrmd:     info: log_finished:        finished - rsc:jboss action:stop call_id:18 pid:28346 exit-code:0 exec-time:21ms queue-time:0ms
     Dec 26 14:19:21 [27772] centos-test1       crmd:   notice: process_lrm_event:   Result of stop operation for jboss on centos-test1: 0 (ok) | call=18 key=jboss_stop_0 confirmed=true cib-update=55
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/55)
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_perform_op:      Diff: --- 0.15.36 2
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_perform_op:      Diff: +++ 0.15.37 (null)
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_perform_op:      +  /cib:  @num_updates=37
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_perform_op:      +  /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='jboss']/lrm_rsc_op[@id='jboss_last_0']:  @transition-magic=0:0;2:5:0:19594a89-d772-4748-8c9a-5a7888a82914, @call-id=18, @rc-code=0, @op-status=0, @exec-time=21
     Dec 26 14:19:21 [27767] centos-test1        cib:     info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=centos-test1/crmd/55, version=0.15.37)
     Dec 26 14:19:21 [27772] centos-test1       crmd:     info: match_graph_event:   Action jboss_stop_0 (2) confirmed on centos-test1 (rc=0)
     Dec 26 14:19:21 [27772] centos-test1       crmd:   notice: run_graph:   Transition 5 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-266.bz2): Complete
     Dec 26 14:19:21 [27772] centos-test1       crmd:     info: do_log:      Input I_TE_SUCCESS received in state S_TRANSITION_ENGINE from notify_crmd
     Dec 26 14:19:21 [27772] centos-test1       crmd:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd
     Dec 26 14:19:26 [27767] centos-test1        cib:     info: cib_process_ping:    Reporting our current digest to centos-test1: 1441d742a8ffbf1c1f45b9d38dd1a776 for 0.15.37 (0x55a05cd6c580 0)

1 Answer

0 votes
/ January 24, 2019

Both resources have reached the maximum number of failed start attempts, so the cluster has given up on them. The log reports a fail count of INFINITY and "cannot run anywhere" for both.

crm_resource can reset the fail count and trigger a new start of the resource on a node.

You need to clean up the resources on each node to trigger a restart.

On the command line, run "crm_resource --help" and scroll to the end for an example of crm_resource --cleanup.
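
The cleanup described above could look roughly like this. It is a sketch that assumes the resource and node names from the question; the function cleanup_all and the DRY_RUN switch are made up for illustration. By default it only prints the commands:

```shell
#!/bin/sh
# Sketch: clear the failed-start history of each resource on each node.
# cleanup_all and DRY_RUN are illustrative names; resource and node names
# are taken from the pcs status output in the question.

DRY_RUN="${DRY_RUN:-1}"
RESOURCES="jboss pgsql"
NODES="centos-test1 centos-test2"

cleanup_all() {
    for rsc in $RESOURCES; do
        for node in $NODES; do
            cmd="crm_resource --cleanup --resource $rsc --node $node"
            if [ "$DRY_RUN" = "1" ]; then
                echo "$cmd"     # dry run: only show the command
            else
                $cmd            # really run it (needs root on a cluster node)
            fi
        done
    done
}

cleanup_all
```

With pcs, "pcs resource cleanup jboss" should have a similar effect. Note that a cleanup only resets the fail count: if the cause of the start failure is still there, the resources will fail again.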

...