Как правильно настроить кластерный Кварц с MariaDB - PullRequest
0 голосов
/ 10 мая 2018

У нас есть кластерная кварцевая конфигурация с 8 географически распределенными узлами, все из которых поддерживаются одним экземпляром MariaDB.Мы неоднократно наблюдали следующую ошибку:

2018-04-26 00:14:01,186 [censored_QuartzSchedulerThread] ERROR org.quartz.core.ErrorLogger - An error occurred while firing triggers '[Trigger 'DEFAULT.a563a68e-b30d-4dab-b24f-540c5fa0cef8':  triggerClass: 'org.quartz.impl.triggers.CronTriggerImpl calendar: 'null' misfireInstruction: 0 nextFireTime: Thu Apr 26 00:11:00 EDT 2018]'
org.quartz.impl.jdbcjobstore.LockException: Failure obtaining db row lock: (conn:83) Lock wait timeout exceeded; try restarting transaction
Query is: SELECT * FROM QRTZ_LOCKS WHERE SCHED_NAME = 'censored' AND LOCK_NAME = ? FOR UPDATE, parameters ['TRIGGER_ACCESS']
        at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:157)
        at org.quartz.impl.jdbcjobstore.DBSemaphore.obtainLock(DBSemaphore.java:113)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3792)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.triggersFired(JobStoreSupport.java:2912)
        at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:336)
Caused by: java.sql.SQLException: (conn:83) Lock wait timeout exceeded; try restarting transaction
Query is: SELECT * FROM QRTZ_LOCKS WHERE SCHED_NAME = 'censored' AND LOCK_NAME = ? FOR UPDATE, parameters ['TRIGGER_ACCESS']
        at org.mariadb.jdbc.internal.util.ExceptionMapper.get(ExceptionMapper.java:150)
        at org.mariadb.jdbc.internal.util.ExceptionMapper.getException(ExceptionMapper.java:101)
        at org.mariadb.jdbc.internal.util.ExceptionMapper.throwAndLogException(ExceptionMapper.java:77)
        at org.mariadb.jdbc.MariaDbStatement.executeQueryEpilog(MariaDbStatement.java:226)
        at org.mariadb.jdbc.MariaDbServerPreparedStatement.executeInternal(MariaDbServerPreparedStatement.java:413)
        at org.mariadb.jdbc.MariaDbServerPreparedStatement.execute(MariaDbServerPreparedStatement.java:362)
        at org.mariadb.jdbc.MariaDbServerPreparedStatement.executeQuery(MariaDbServerPreparedStatement.java:343)
        at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
        at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
        at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
        at org.quartz.impl.jdbcjobstore.StdRowLockSemaphore.executeSQL(StdRowLockSemaphore.java:96)
        ... 4 common frames omitted

Другие узлы также имеют похожие ошибки:

2018-04-26 00:01:05,106 [censored_QuartzSchedulerThread] ERROR org.quartz.core.ErrorLogger - An error occurred while scanning for the next triggers to fire.
org.quartz.JobPersistenceException: Couldn't acquire next trigger: (conn:75) Lock wait timeout exceeded; try restarting transaction
Query is: UPDATE QRTZ_TRIGGERS SET TRIGGER_STATE = ? WHERE SCHED_NAME = 'censored' AND TRIGGER_NAME = ? AND TRIGGER_GROUP = ? AND TRIGGER_STATE = ?, parameters ['ACQUIRED','12d649da-56fb-47bf-874e-96dca451291f','DEFAULT','WAITING']
      at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTrigger(JobStoreSupport.java:2860)
      at org.quartz.impl.jdbcjobstore.JobStoreSupport$40.execute(JobStoreSupport.java:2759)
      at org.quartz.impl.jdbcjobstore.JobStoreSupport$40.execute(JobStoreSupport.java:2757)
      at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3799)
      at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2756)
      at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:272)
Caused by: java.sql.SQLException: (conn:75) Lock wait timeout exceeded; try restarting transaction

И

2018-05-06 06:27:10,438 [QuartzScheduler_censored-schedulerName_ClusterManager] ERROR o.q.impl.jdbcjobstore.JobStoreTX - ClusterManager: Error managing cluster: Failure updating scheduler state when checking-in: (conn:52) Deadlock found when trying to get lock; try restarting transaction
Query is: INSERT INTO QRTZ_SCHEDULER_STATE (SCHED_NAME, INSTANCE_NAME, LAST_CHECKIN_TIME, CHECKIN_INTERVAL) VALUES('censored', ?, ?, ?), parameters ['schedulerName',1525602430435,15000]
org.quartz.JobPersistenceException: Failure updating scheduler state when checking-in: (conn:52) Deadlock found when trying to get lock; try restarting transaction
Query is: INSERT INTO QRTZ_SCHEDULER_STATE (SCHED_NAME, INSTANCE_NAME, LAST_CHECKIN_TIME, CHECKIN_INTERVAL) VALUES('censored', ?, ?, ?), parameters ['schedulerName',1525602430435,15000]
      at org.quartz.impl.jdbcjobstore.JobStoreSupport.clusterCheckIn(JobStoreSupport.java:3418)
      at org.quartz.impl.jdbcjobstore.JobStoreSupport.doCheckin(JobStoreSupport.java:3265)
      at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.manage(JobStoreSupport.java:3870)
      at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.run(JobStoreSupport.java:3907)
Caused by: java.sql.SQLTransactionRollbackException: (conn:52) Deadlock found when trying to get lock; try restarting transaction
Query is: INSERT INTO QRTZ_SCHEDULER_STATE (SCHED_NAME, INSTANCE_NAME, LAST_CHECKIN_TIME, CHECKIN_INTERVAL) VALUES('censored', ?, ?, ?), parameters ['schedulerName',1525602430435,15000]
      at org.mariadb.jdbc.internal.util.ExceptionMapper.get(ExceptionMapper.java:141)
      at org.mariadb.jdbc.internal.util.ExceptionMapper.getException(ExceptionMapper.java:101)
      at org.mariadb.jdbc.internal.util.ExceptionMapper.throwAndLogException(ExceptionMapper.java:77)
      at org.mariadb.jdbc.MariaDbStatement.executeQueryEpilog(MariaDbStatement.java:226)
      at org.mariadb.jdbc.MariaDbServerPreparedStatement.executeInternal(MariaDbServerPreparedStatement.java:413)
      at org.mariadb.jdbc.MariaDbServerPreparedStatement.execute(MariaDbServerPreparedStatement.java:362)
      at org.mariadb.jdbc.MariaDbServerPreparedStatement.executeUpdate(MariaDbServerPreparedStatement.java:351)
      at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
      at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
      at org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:105)
      at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.insertSchedulerState(StdJDBCDelegate.java:2948)
      at org.quartz.impl.jdbcjobstore.JobStoreSupport.clusterCheckIn(JobStoreSupport.java:3413)
      ... 3 common frames omitted
Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Deadlock found when trying to get lock; try restarting transaction
Query is: INSERT INTO QRTZ_SCHEDULER_STATE (SCHED_NAME, INSTANCE_NAME, LAST_CHECKIN_TIME, CHECKIN_INTERVAL) VALUES('censored', ?, ?, ?), parameters ['schedulerName',1525602430435,15000]
      at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readErrorPacket(AbstractQueryProtocol.java:1144)
      at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readPacket(AbstractQueryProtocol.java:1076)
      at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.getResult(AbstractQueryProtocol.java:1031)
      at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.executePreparedQuery(AbstractQueryProtocol.java:617)
      at org.mariadb.jdbc.MariaDbServerPreparedStatement.executeInternal(MariaDbServerPreparedStatement.java:401)
      ... 10 common frames omitted

Наша Кварцевая конфигурация:

org.quartz.scheduler.instanceName=censored
org.quartz.scheduler.instanceId=AUTO

org.quartz.threadPool.class=org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount=10

org.quartz.jobListener.NAME.class=our.JobListenerImplementation
org.quartz.jobListener.NAME.jobListenerName=JobListener

org.quartz.triggerListener.NAME.class=our.TriggerListenerImplementation
org.quartz.triggerListener.NAME.triggerName=SchedulerTriggerListener


org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.tablePrefix=QRTZ_
org.quartz.jobStore.dataSource=quartzDataSource
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.clusterCheckinInterval=15000
org.quartz.jobStore.maxMisfiresToHandleAtATime=20
org.quartz.jobStore.acquireTriggersWithinLock=true


org.quartz.dataSource.quartzDataSource.connectionProvider.class=our.ConnectionProviderImplementation

org.quartz.dataSource.quartzDataSource.driver=org.mariadb.jdbc.Driver
org.quartz.dataSource.quartzDataSource.url=
org.quartz.dataSource.quartzDataSource.user=
org.quartz.dataSource.quartzDataSource.password=

org.quartz.dataSource.quartzDataSource.initialPoolSize=3
org.quartz.dataSource.quartzDataSource.maxActive=50
org.quartz.dataSource.quartzDataSource.maxIdle=20
org.quartz.dataSource.quartzDataSource.minIdle=2
org.quartz.dataSource.quartzDataSource.maxWait=180000
org.quartz.dataSource.quartzDataSource.setMaxOpenPreparedStatements=10
org.quartz.dataSource.quartzDataSource.removeAbandoned=true
org.quartz.dataSource.quartzDataSource.removeAbandonedTimeout=300
org.quartz.dataSource.quartzDataSource.logAbandoned=true
org.quartz.dataSource.quartzDataSource.testWhileIdle=true
org.quartz.dataSource.quartzDataSource.testOnBorrow=true
org.quartz.dataSource.quartzDataSource.testOnReturn=true
org.quartz.dataSource.quartzDataSource.validationQuery=select 1 from dual
org.quartz.dataSource.quartzDataSource.validationQueryTimeout=1

В результате этих проблем с получением блокировки вызов для создания нового расписания часто (но не всегда) не завершается, и триггеры не запускаются должным образом.Все это происходит при небольшой нагрузке, при которой требуется не более пары запросов для создания расписания в минуту.

Мы также видим следующее предупреждение, которое может или не может быть связано:

2018-05-06 06:27:10,434 [QuartzScheduler_censored-schedulerName_ClusterManager] WARN  o.q.impl.jdbcjobstore.JobStoreTX - This scheduler instance (schedulerName) is still active but was recovered by another instance in the cluster.  This may cause inconsistent behavior.

Кто-нибудь испытывал эту проблему при запуске кластерного кварца с MariaDB?Предложения будут с благодарностью.

Добро пожаловать на сайт PullRequest, где вы можете задавать вопросы и получать ответы от других членов сообщества.
...