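Notes on Problems Encountered Using Quartz
Overview
This article records problems encountered when using Quartz in production in a legacy project. Please note that some of them remain unresolved.
Background: although the project was not started particularly early (2018), it does not use a distributed task scheduling system such as the open-source xxl-job or elastic-job. Instead it uses Quartz, arguably the oldest tool of its kind. Quartz can do distributed scheduling, but it relies on database tables to do so.
Distributed scheduling here means that when an application is deployed across multiple nodes, a given task is not run redundantly on several nodes.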
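To make the "relies on database tables" point concrete, here is a minimal sketch of the underlying idea: every node tries to move a trigger from a waiting state to an acquired state, and only the node whose conditional update succeeds runs the job. The class and method names are hypothetical, and the in-memory map is only a stand-in for the trigger table; this is not how Quartz itself is implemented.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class TriggerTableSketch {
    // Stand-in for the trigger table: trigger name -> current state.
    private final ConcurrentMap<String, String> states = new ConcurrentHashMap<>();

    void addTrigger(String name) {
        states.put(name, "WAITING");
    }

    /** Returns true only for the single caller that moves the trigger from WAITING to ACQUIRED. */
    boolean tryAcquire(String name) {
        // Atomic compare-and-set, analogous in spirit to a conditional SQL update:
        // UPDATE ... SET state = 'ACQUIRED' WHERE name = ? AND state = 'WAITING'
        return states.replace(name, "WAITING", "ACQUIRED");
    }
}
```

If two nodes call tryAcquire for the same trigger, the first call succeeds and the second returns false, so the job runs on one node only.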
Problems
Configuration file
The quartz.properties configuration file:
org.quartz.scheduler.instanceId=AUTO
org.quartz.scheduler.makeSchedulerThreadDaemon=true
org.quartz.threadPool.class=org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.makeThreadsDaemons=true
org.quartz.threadPool.threadCount=20
org.quartz.threadPool.threadPriority=5
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.tablePrefix=QRTZ_
org.quartz.jobStore.isClustered=true
org.quartz.jobStore.misfireThreshold=25000
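Note the isClustered setting above.
Old and new clusters coexisting
The project was originally deployed on virtual machines: Jenkins builds the project, and a script then replaces the war package under Tomcat. We are now moving it onto the company's container cloud deployment platform, and during the migration both deployments still use the same database. As a result, the old and new clusters are running at the same time.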
Errors recorded on one node of the new server cluster (tags.HOST_IP=11.55.44.66):
[SchedulerFactory_QuartzSchedulerThread] ERROR o.s.s.quartz.LocalDataSourceJobStore - Error retrieving job, setting trigger state to ERROR.
org.quartz.JobPersistenceException: Couldn't retrieve job because a required class was not found: com.aaa.channelcore.business.job.strategyjob.AddAdsetToGuoJi
at org.quartz.impl.jdbcjobstore.JobStoreSupport.retrieveJob(JobStoreSupport.java:1388)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTrigger(JobStoreSupport.java:2818)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$40.execute(JobStoreSupport.java:2759)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$40.execute(JobStoreSupport.java:2757)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3803)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2756)
at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:272)
Caused by: java.lang.ClassNotFoundException: com.aaa.channelcore.business.job.strategyjob.AddAdsetToGuoJi
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:94)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.springframework.scheduling.quartz.ResourceLoaderClassLoadHelper.loadClass(ResourceLoaderClassLoadHelper.java:76)
at org.springframework.scheduling.quartz.ResourceLoaderClassLoadHelper.loadClass(ResourceLoaderClassLoadHelper.java:81)
at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.selectJobDetail(StdJDBCDelegate.java:852)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.retrieveJob(JobStoreSupport.java:1385)
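The error above means that the class a job depends on at execution time could not be found. The error below means that storing the job information for a class failed because a job with the same name already exists: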
ERROR o.s.boot.SpringApplication - Application startup failed
java.lang.IllegalStateException: Failed to execute ApplicationRunner
at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:770)
at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:757)
at org.springframework.boot.SpringApplication.afterRefresh(SpringApplication.java:747)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:315)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1162)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1151)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
Caused by: org.quartz.ObjectAlreadyExistsException: Unable to store Job : 'facebook.facebook_14', because one already exists with this identification.
at org.quartz.impl.jdbcjobstore.JobStoreSupport.storeJob(JobStoreSupport.java:1108)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$2.executeVoid(JobStoreSupport.java:1062)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$VoidTransactionCallback.execute(JobStoreSupport.java:3719)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$VoidTransactionCallback.execute(JobStoreSupport.java:3717)
at org.quartz.impl.jdbcjobstore.JobStoreCMT.executeInLock(JobStoreCMT.java:245)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.storeJobAndTrigger(JobStoreSupport.java:1058)
at org.quartz.core.QuartzScheduler.scheduleJob(QuartzScheduler.java:886)
at org.quartz.impl.StdScheduler.scheduleJob(StdScheduler.java:249)
at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:767)
Errors recorded on the node with tags.HOST_IP=111.222.222.111, from the old site's logs:
[SchedulerFactory_QuartzSchedulerThread] ERROR o.s.s.quartz.LocalDataSourceJobStore - Error retrieving job, setting trigger state to ERROR.
org.quartz.JobPersistenceException: Couldn't retrieve job because a required class was not found: com.aaa.cbd.platform.job.google.getAccountSpendDataJob
at org.quartz.impl.jdbcjobstore.JobStoreSupport.retrieveJob(JobStoreSupport.java:1388)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTrigger(JobStoreSupport.java:2818)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$40.execute(JobStoreSupport.java:2759)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$40.execute(JobStoreSupport.java:2757)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3803)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2756)
at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:272)
Caused by: java.lang.ClassNotFoundException: com.aaa.cbd.platform.job.google.getAccountSpendDataJob
at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1282)
at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1116)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.springframework.util.ClassUtils.forName(ClassUtils.java:275)
at org.springframework.scheduling.quartz.ResourceLoaderClassLoadHelper.loadClass(ResourceLoaderClassLoadHelper.java:81)
at org.springframework.scheduling.quartz.ResourceLoaderClassLoadHelper.loadClass(ResourceLoaderClassLoadHelper.java:86)
at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.selectJobDetail(StdJDBCDelegate.java:852)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.retrieveJob(JobStoreSupport.java:1385)
Not found:
Caused by: org.quartz.ObjectAlreadyExistsException: Unable to store Job : 'facebook.facebook_14', because one already exists with this identification.
java.lang.InterruptedException: null
java.lang.InterruptedException: null
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1475)
at org.eclipse.jetty.util.thread.ExecutorThreadPool.join(ExecutorThreadPool.java:182)
at org.eclipse.jetty.server.Server.join(Server.java:617)
at com.aaa.job.core.rpc.netcom.jetty.server.JettyServer$1.run(JettyServer.java:55)
at java.lang.Thread.run(Thread.java:748)
2021-08-26 14:02:02,980 [QuartzScheduler_SchedulerFactory-PPC-02020001181629957363873_ClusterManager] WARN o.s.s.quartz.LocalDataSourceJobStore- This scheduler instance (PPC-02020001181629957363873) is still active but was recovered by another instance in the cluster. This may cause inconsistent behavior.
job class
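As for taking the old cluster offline and releasing the new version of the application: on startup, Quartz automatically clears the tables and re-inserts the data, chiefly the table that records the mapping between jobs and their classes.
Question: which field indicates that the old cluster is still online?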
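One possible place to look, offered as a hint rather than a confirmed answer: in the standard Quartz schema, each clustered instance checks in to the QRTZ_SCHEDULER_STATE table (columns INSTANCE_NAME, LAST_CHECKIN_TIME, CHECKIN_INTERVAL). With instanceId=AUTO the instance name appears to be the host name followed by a startup timestamp, as in PPC-02020001181629957363873 in the log above, so old and new nodes could probably be told apart by host name. The sketch below assumes those column values have already been read from the table. The class name, method name, and grace period are hypothetical, and the check is a simplification of whatever Quartz does internally.

```java
final class InstanceLivenessSketch {
    /** Hypothetical tolerance for clock skew and slow check-ins, in milliseconds. */
    static final long GRACE_MILLIS = 7_500L;

    /**
     * Returns true when an instance's last check-in is recent enough
     * that the instance is probably still running.
     */
    static boolean looksAlive(long lastCheckinMillis, long checkinIntervalMillis, long nowMillis) {
        long deadline = lastCheckinMillis + checkinIntervalMillis + GRACE_MILLIS;
        return nowMillis <= deadline;
    }
}
```

Rows whose instance name carries an old host name and that still pass this check would suggest that an old node is still checking in.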
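Testing jobs locally
In local and test environments, set the following property to false: org.quartz.jobStore.isClustered=false
Otherwise, the developer machines and test machines form a single cluster and a node is picked at random to run each task. If you then try to debug with breakpoints in a local unit test, you will find that execution never reaches the job logic.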
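A minimal sketch of applying that override per environment, assuming the Quartz properties are loaded into a java.util.Properties object before the scheduler is created. The class name, the method name, and the "prod" environment label are hypothetical.

```java
import java.util.Properties;

final class QuartzLocalOverrides {
    /**
     * Returns a copy of the base Quartz properties, with clustering disabled
     * unless the (hypothetical) environment name is "prod".
     */
    static Properties forEnvironment(Properties base, String env) {
        Properties effective = new Properties();
        effective.putAll(base);
        if (!"prod".equals(env)) {
            // Local and test runs should not join the shared cluster.
            effective.setProperty("org.quartz.jobStore.isClustered", "false");
        }
        return effective;
    }
}
```

One caveat worth checking against the Quartz documentation: it cautions against starting a non-clustered instance against the same tables that another running instance uses, so this override is probably safest when the local environment also points at its own database.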
SchedulerException: Job threw an unhandled exception
Error log:
[schedulerFactoryBean_Worker-78] ERROR org.quartz.core.ErrorLogger - Job (DEFAULT.1568 threw an exception.
org.quartz.SchedulerException: Job threw an unhandled exception.
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Caused by: java.lang.NullPointerException: null
The qrtz_cron_triggers table contains a row with trigger_name='1568'.
However, the row in the business table that 1568 corresponds to has already been logically deleted.
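If the NullPointerException does come from the job looking up a business row that has been logically deleted, one defensive option is to check for that case and skip the run. This is a minimal sketch under that assumption: the Task type, the deleted flag, and the map standing in for the business table are all hypothetical.

```java
import java.util.Map;
import java.util.Optional;

final class GuardedJobSketch {
    /** Hypothetical business record; `deleted` models the logical-delete flag. */
    static final class Task {
        final String name;
        final boolean deleted;

        Task(String name, boolean deleted) {
            this.name = name;
            this.deleted = deleted;
        }
    }

    /** Runs the job for a trigger name; returns false when the run is skipped. */
    static boolean run(Map<String, Task> businessTable, String triggerName) {
        Optional<Task> task = Optional.ofNullable(businessTable.get(triggerName))
                .filter(t -> !t.deleted);
        if (!task.isPresent()) {
            // Row missing or logically deleted: skip instead of dereferencing null.
            return false;
        }
        // ... the real job logic would use task.get() here ...
        return true;
    }
}
```

A skipped run is also a natural place to log the orphaned trigger so that it can be removed from the scheduler.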
Couldn’t acquire next trigger: Deadlock found when trying to get lock; try restarting transaction
Error log:
[schedulerFactoryBean_QuartzSchedulerThread] ERROR org.quartz.core.ErrorLogger - An error occurred while scanning for the next triggers to fire.
org.quartz.JobPersistenceException: Couldn't acquire next trigger: Deadlock found when trying to get lock; try restarting transaction
at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTrigger(JobStoreSupport.java:2864)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$40.execute(JobStoreSupport.java:2759)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$40.execute(JobStoreSupport.java:2757)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3803)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2756)
at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:272)
Caused by: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:123)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1092)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1040)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeLargeUpdate(ClientPreparedStatement.java:1347)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdate(ClientPreparedStatement.java:1025)
at com.alibaba.druid.filter.FilterChainImpl.preparedStatement_executeUpdate(FilterChainImpl.java:2723)
at com.alibaba.druid.filter.FilterAdapter.preparedStatement_executeUpdate(FilterAdapter.java:1069)
at com.alibaba.druid.filter.FilterEventAdapter.preparedStatement_executeUpdate(FilterEventAdapter.java:491)
at com.alibaba.druid.filter.FilterChainImpl.preparedStatement_executeUpdate(FilterChainImpl.java:2721)
at com.alibaba.druid.filter.FilterAdapter.preparedStatement_executeUpdate(FilterAdapter.java:1069)
at com.alibaba.druid.filter.FilterEventAdapter.preparedStatement_executeUpdate(FilterEventAdapter.java:491)
at com.alibaba.druid.filter.FilterChainImpl.preparedStatement_executeUpdate(FilterChainImpl.java:2721)
at com.alibaba.druid.proxy.jdbc.PreparedStatementProxyImpl.executeUpdate(PreparedStatementProxyImpl.java:158)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.executeUpdate(DruidPooledPreparedStatement.java:253)
at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.updateTriggerStateFromOtherState(StdJDBCDelegate.java:1439)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTrigger(JobStoreSupport.java:2842)
... 5 common frames omitted
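Jobs not executing
The job is configured to run at 9:40 every day.
It was then found that the job had not run: the execution log has no entries from the 12th through the 14th.
Users have reported production incidents like this more than once. The deadlock problem described above is the suspected cause.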
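Since these gaps were only noticed after users reported them, a simple check over the execution log could surface them earlier. This is a minimal sketch, assuming the dates on which the job ran can be extracted from the execution log; the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class MissedRunSketch {
    /** Lists every date in [from, to] that has no entry in the execution log. */
    static List<LocalDate> missingDays(Set<LocalDate> executedDays, LocalDate from, LocalDate to) {
        List<LocalDate> missing = new ArrayList<>();
        for (LocalDate day = from; !day.isAfter(to); day = day.plusDays(1)) {
            if (!executedDays.contains(day)) {
                missing.add(day);
            }
        }
        return missing;
    }
}
```

For a daily job, any non-empty result is a missed run worth alerting on.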
SQLSyntaxErrorException: Table ‘shit.QRTZ_LOCKS’ doesn’t exist
The application starts normally, but the console is flooded with the following error:
ERROR sql.Statement: {conn-10001, pstmt-20003} execute error. SELECT * FROM QRTZ_LOCKS WHERE SCHED_NAME = 'schedulerFactoryBean' AND LOCK_NAME = ? FOR UPDATE
java.sql.SQLSyntaxErrorException: Table 'shit.QRTZ_LOCKS' doesn't exist
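Configuration: org.quartz.jobStore.tablePrefix = QRTZ_
There is only one copy of this configuration file.
The cause lies in a MySQL configuration variable, lower_case_table_names:
show variables like '%lower_case_table_names%';
On Windows, MySQL defaults to 1, meaning table names are case-insensitive.
On Linux, MySQL defaults to 0, meaning table names are case-sensitive.
Fix: rename the tables. With IDEA's keyboard shortcuts this is quick: select the table, press Shift + F6 to rename it, and Ctrl + Shift + U to toggle case.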
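When there are many tables, the renames can also be scripted. This is a minimal sketch, assuming the Quartz tables were created with lowercase names and should match the uppercase QRTZ_ prefix. The class and method names are hypothetical, and the generated statements should be reviewed before being run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class QuartzTableRename {
    /** Builds one MySQL RENAME TABLE statement per table whose name is not already uppercase. */
    static List<String> renameToUpperCase(List<String> tableNames) {
        List<String> statements = new ArrayList<>();
        for (String name : tableNames) {
            String upper = name.toUpperCase(Locale.ROOT);
            if (!upper.equals(name)) {
                statements.add("RENAME TABLE `" + name + "` TO `" + upper + "`;");
            }
        }
        return statements;
    }
}
```

Tables that are already uppercase are left out of the result.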