Scheduled jobs can be re-triggered if there was a loss of connection between nodes in a cluster

Description

Steps to Reproduce:

  1. In a clustered environment, start up the first node (for now, the master node).

  2. Start up the second node (the slave node).

  3. Deploy to both nodes a simple scheduled-job test portlet set to trigger every 10 minutes.

  4. Set the log level to DEBUG for com.liferay.portal.scheduler.ClusterSchedulerEngine so that the logs show which node is currently executing jobs.

  5. Wait 10 minutes to check that the job fires correctly on the master node (optional; this step only verifies that the configuration is correct).

  6. Before the job's next execution, simulate a loss of connection between the two nodes.

  7. Check in the logs that the slave node is now also a master and is going to execute jobs.

  8. Before the job's next execution, restore the connection between the two nodes.

  9. After a while, one of the two nodes will be established as the slave and will indicate in the logs that it is no longer going to execute jobs.

  10. Wait until the master executes the job (for the first or second time, depending on whether you skipped the optional step).

  11. Before the job is executed again, simulate a loss of connection between the two nodes.

Expected Results:
No job should be executed at this point; the job should not fire until its scheduled time.

Actual Results:
Jobs are fired immediately on one of the nodes (the one that was the slave before the connection was broken for the second time), regardless of the scheduled time.
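
To make the expected behaviour concrete, here is a minimal sketch of how a node that has just become master could work out when a periodic job is next due, instead of firing it at once. It is only an illustration and is not taken from the Liferay code base or from the actual fix: the class NextFireTime, the method nextFireTime, and the timestamps are all hypothetical, and it assumes a fixed-interval job whose first fire time is known.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; not Liferay's ClusterSchedulerEngine.
public class NextFireTime {

    // Returns the first scheduled fire time strictly after `now`, keeping the
    // original cadence anchored at `firstFire` instead of restarting at `now`.
    public static Instant nextFireTime(
            Instant firstFire, Duration interval, Instant now) {

        long intervalMillis = interval.toMillis();

        if (intervalMillis <= 0) {
            throw new IllegalArgumentException(
                "interval must be at least one millisecond");
        }

        // The job has never been due yet, so the first fire time still applies.
        if (now.isBefore(firstFire)) {
            return firstFire;
        }

        // Count the whole intervals that have already elapsed, then step one
        // interval past the last of them.
        long elapsedMillis = Duration.between(firstFire, now).toMillis();
        long completedIntervals = elapsedMillis / intervalMillis;

        return firstFire.plus(interval.multipliedBy(completedIntervals + 1));
    }

    public static void main(String[] args) {
        // Assumed values: a 10-minute job first due at 00:10, and a node that
        // becomes master at 00:23, three minutes after the 00:20 execution.
        Instant firstFire = Instant.parse("2016-06-28T00:10:00Z");
        Duration interval = Duration.ofMinutes(10);
        Instant promotedAt = Instant.parse("2016-06-28T00:23:00Z");

        Instant next = nextFireTime(firstFire, interval, promotedAt);

        System.out.println("Became master at: " + promotedAt);
        System.out.println("Next fire time:   " + next);
    }
}
```

With these assumed values, the node that becomes master at 00:23 would wait until 00:30 for the next execution. The behaviour reported above corresponds to firing at 00:23 instead.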

Attachments

1

Activity

Shitian "Shelton" Zhang November 3, 2016 at 12:47 AM

PASSED Manual Testing using the following steps:

  1. Set up a cluster environment.

  2. Put the JSP in webapps/ROOT.

  3. On the first node, access the JSP.

  4. Assert that the job is triggered on the first node.

  5. Wait and check the time.

  6. Shut down the first node.

  7. Assert that the job is triggered on the second node.

  8. Check the time.

Reproduced on:

Tomcat 8.0.32 + MySQL 5.6. Portal ee-7.0.x GIT ID: 9c2b1cf2e9988f282b15662fbb447f618801ab32.

The job gets fired immediately.

Fixed on:
Tomcat 8.0.32 + MySQL 5.6. Portal ee-7.0.x GIT ID: 6a1002bf5a0b1908908e6bb764d2a37ed1b3dc87.

The job does not get fired until the expected time.

Fixed

Details

Assignee

Reporter

Branch Version/s

7.0.x
6.2.x

Backported to Branch

Committed

Fix Priority

3

Git Pull Request

7.0 Fix Pack Version

2

Story Points

Components

Priority

Zendesk Support

Created June 28, 2016 at 12:06 AM
Updated June 26, 2023 at 12:06 AM
Resolved August 19, 2016 at 9:59 PM