Scheduled jobs can be re-triggered if there was a loss of connection between nodes in a cluster
Shitian "Shelton" Zhang November 3, 2016 at 12:47 AM
PASSED Manual Testing using the following steps:
Set up a cluster environment.
Put the jsp in webapps/ROOT.
In the first node, access the jsp.
Assert that the job is triggered in the first node.
Wait and check the time.
Shut down the first node.
Assert that the job is triggered in the second node.
Check the time.
Reproduced on:
Tomcat 8.0.32 + MySQL 5.6. Portal ee-7.0.x GIT ID: 9c2b1cf2e9988f282b15662fbb447f618801ab32.
The job gets fired immediately.
Fixed on:
Tomcat 8.0.32 + MySQL 5.6. Portal ee-7.0.x GIT ID: 6a1002bf5a0b1908908e6bb764d2a37ed1b3dc87.
The job does not get fired until the expected time.
Fixed
Details
Assignee: Shitian "Shelton" Zhang (Deactivated)
Reporter: Mariano Alvaro
Branch Version/s: 7.0.x, 6.2.x
Backported to Branch: Committed
Fix Priority: 3
7.0 Fix Pack Version: 2
Story Points: 1.5
Priority: Medium
Created June 28, 2016 at 12:06 AM
Updated June 26, 2023 at 12:06 AM
Resolved August 19, 2016 at 9:59 PM
Steps to Reproduce:
In a clustered environment, start up the first node (for now, the master node).
Start up the second node (the slave node).
Deploy on both nodes a simple scheduled-job test portlet set to trigger every 10 minutes.
Set the log level to DEBUG for com.liferay.portal.scheduler.ClusterSchedulerEngine so you can detect which node is currently executing jobs.
Wait 10 minutes to check that the job gets correctly fired in the master node (optional; this step only verifies that the configuration is correct).
Before the job's next execution, simulate a loss of connection between both nodes.
Check in the logs that the slave node is now also master and is going to execute jobs.
Before the job's next execution, restore the connection between both nodes.
After a while, one of the nodes will be established as slave and will indicate in the logs that it is no longer going to execute jobs.
Wait until the master executes the job (for the first or second time, depending on whether you skipped the optional step).
Before the job is executed again, simulate a loss of connection between both nodes.
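The DEBUG logging mentioned in the steps above can be enabled with a Log4j category override; in Liferay 6.2/7.0 this typically goes in a portal-log4j-ext.xml file (a sketch — the exact file location depends on your setup):

```xml
<?xml version="1.0"?>
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
	<!-- Log which node currently owns scheduled job execution. -->
	<category name="com.liferay.portal.scheduler.ClusterSchedulerEngine">
		<priority value="DEBUG" />
	</category>
</log4j:configuration>
```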
Expected Results:
No job should be executed at this point until the expected time.
Actual Results:
The job is immediately fired on one of the nodes (the one that was the slave before the connection was broken for the second time), regardless of the expected time.
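The expected behavior amounts to correct misfire handling: when a node takes over the scheduler, the next fire time should be derived from the trigger's schedule rather than firing the job immediately. A minimal sketch of that calculation using only java.time (a hypothetical helper for illustration, not Liferay's actual implementation):

```java
import java.time.Duration;
import java.time.Instant;

public class NextFireTime {

	// Returns the first scheduled fire time strictly after "now",
	// stepping forward from the last execution by the trigger interval.
	// A node that takes over should wait until this instant instead of
	// firing immediately (the buggy behavior reported here).
	static Instant nextFireTime(
		Instant lastFire, Duration interval, Instant now) {

		Instant next = lastFire.plus(interval);

		while (!next.isAfter(now)) {
			next = next.plus(interval);
		}

		return next;
	}

	public static void main(String[] args) {
		Instant lastFire = Instant.parse("2016-06-28T00:00:00Z");
		Duration interval = Duration.ofMinutes(10);

		// The new master takes over 12 minutes after the last execution.
		Instant now = Instant.parse("2016-06-28T00:12:00Z");

		// Correct next fire time is 00:20, not "now".
		System.out.println(nextFireTime(lastFire, interval, now));
		// -> 2016-06-28T00:20:00Z
	}
}
```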