Details

    • Type: Bug Bug
    • Status: Closed
    • Resolution: No Longer Reproducible
    • Affects Version/s: 6.0.11 EE, 6.1.0 CE GA1
    • Fix Version/s: 6.2.0 CE M4
    • Environment:
    • Epic/Theme:
    • Fix Priority:
      3
    • Similar Issues:
      Show 3 results 

      Description

      Users attempt to login but never authenticate and remain on the sign-in page with the browser indicating it is waiting for a response. Only certain users are affected and typically call into support to notify us.

      User steps to reproduce the problem:

      1. Navigate via the browser to the sign in page.
      2. Input login credentials in to sign-in form.
      3. Click sign-in.

      Expected: Successful authentication and login.
      Actual: 'Waiting for xxx.server.xxx...' in browser status pane.

      This was attempted on various different browsers including Firefox 4.0.1, Chrome 12.0.742.100 and Internet Explorer 8.0.6001.18702. All browsers running on Windows 32-bit.

      While attempting to sign-in as one of the affected users, we inserted a debug breakpoint in our custom authenticator code and discovered that the thread ends up waiting for data from Postgres. Just for some background, we developed a custom authenticator and applied it to the 'pre' pipeline. We mirrored our implementation as per the LDAP flavour that comes prepackaged with Liferay. One of the things we do is import/merge user groups and the user itself into liferay upon successful authentication (exactly as LDAP) to keep Liferay in-sync with our master of record repository of users.

      When we paused the thread to determine what exactly was going on, we found the thread was blocked doing an I/O read through a socket connecting to Postgres. The stack trace indicated this was happening during the Hibernate session flush (we're assuming this is when the JDBC connection is being committed). This block was indefinite. Following the stack trace backwards, we found that the last method invocation in our custom code was to:

      UserLocalServiceUtil.updateUser(long userId, java.lang.String oldPassword, java.lang.String newPassword1, java.lang.String newPassword2, boolean passwordReset, java.lang.String reminderQueryQuestion, java.lang.String reminderQueryAnswer, java.lang.String screenName, java.lang.String emailAddress, long facebookId,
      java.lang.String openId, java.lang.String languageId, java.lang.String timeZoneId, java.lang.String greeting,
      java.lang.String comments, java.lang.String firstName, java.lang.String middleName, java.lang.String lastName, int prefixId, int suffixId, boolean male, int birthdayMonth, int birthdayDay, int birthdayYear, java.lang.String smsSn, java.lang.String aimSn, java.lang.String facebookSn, java.lang.String icqSn,
      java.lang.String jabberSn, java.lang.String msnSn, java.lang.String mySpaceSn, java.lang.String skypeSn,
      java.lang.String twitterSn, java.lang.String ymSn, java.lang.String jobTitle, long[] groupIds, long[] organizationIds, long[] roleIds, java.util.List<com.liferay.portal.model.UserGroupRole> userGroupRoles,
      long[] userGroupIds, com.liferay.portal.service.ServiceContext serviceContext) throws com.liferay.portal.kernel.exception.PortalException, com.liferay.portal.kernel.exception.SystemException

      Via the stack strace we inspected the JDBC statements executed and found the thread blocking occurred at the following SQL was executed:

      update User_ set uuid_=$1, companyId=$2, createDate=$3, modifiedDate=$4, defaultUser=$5, contactId=$6, password_=$7, passwordEncrypted=$8, passwordReset=$9, passwordModifiedDate=$10, digest=$11, reminderQueryQuestion=$12, reminderQueryAnswer=$13, graceLoginCount=$14, screenName=$15, emailAddress=$16, facebookId=$17, openId=$18, portraitId=$19, languageId=$20, timeZoneId=$21, greeting=$22, comments=$23, firstName=$24, middleName=$25, lastName=$26, jobTitle=$27, loginDate=$28, loginIP=$29, lastLoginDate=$30, lastLoginIP=$31, lastFailedLoginDate=$32, failedLoginAttempts=$33, lockout=$34, lockoutDate=$35, agreedToTermsOfUse=$36, active_=$37 where userId=$38 (where $x are parameter placeholders).

      While this happening we also executed various Postgres system based queries to get a sense as to what connections were open and from whom. We discovered that for each 'update' corresponding to one of the above API calls, there were approximately 9 database level table/row locks associated with the "user_", "users_groups", "role_", "group_", "user_", "users_roles", "users_usergroups", "usergroup", "user_" relations (1 lock per relation). After reading some Postgres literature we found there were some conflicting locks on the user_ table (one user_ lock had an 'ExclusiveLock' and another user_ lock had a 'RowExclusiveLock'). According to http://www.postgresql.org/docs/8.2/interactive/explicit-locking.html these two locks conflict with each other. These locks all related to one transaction involving the above mentioned update statement. This seemed to indicate that a possible database deadlock had occurred. Many of these transactions had been waiting for at least two days.

      We were able to determine the process id for each of these deadlocked updates and manually did a UNIX kill on them and found that the affected users could now authenticate and login. The interesting to note here is that if we killed the so called 'primordial' or first one or two deadlocked processes, the remaining ones unblocked and completed.

      Any help would be greatly appreciated.

      Thank you,

      Michael Melo
      Intelliware Development Inc.

        Activity

        Hide
        Hitoshi Ozawa added a comment -

        Thank you for your report. Have you had the opportunity to discuss this issue on the community forums? If so, were there other users who were also experiencing this problem? Next time, please include exact steps so we will be able to reproduce your error. Thanks.

        Show
        Hitoshi Ozawa added a comment - Thank you for your report. Have you had the opportunity to discuss this issue on the community forums? If so, were there other users who were also experiencing this problem? Next time, please include exact steps so we will be able to reproduce your error. Thanks.
        Hide
        Michael Melo added a comment -

        Hi Hotoshi,

        The users steps to reproduce are given in the first few lines of the description of the issue:

        "User steps to reproduce the problem:

        1. Navigate via the browser to the sign in page.
        2. Input login credentials in to sign-in form.
        3. Click sign-in.".

        This is a relatively non-deterministic issue so reproducing it is rather difficult. I provided a substantial amount of detail of background and context in this bug report, including narrowing down exactly where in the Liferay code the I/O block is occurring.

        Please read the ticket in more detail.

        Thank you,
        Michael

        Show
        Michael Melo added a comment - Hi Hotoshi, The users steps to reproduce are given in the first few lines of the description of the issue: "User steps to reproduce the problem: 1. Navigate via the browser to the sign in page. 2. Input login credentials in to sign-in form. 3. Click sign-in.". This is a relatively non-deterministic issue so reproducing it is rather difficult. I provided a substantial amount of detail of background and context in this bug report, including narrowing down exactly where in the Liferay code the I/O block is occurring. Please read the ticket in more detail. Thank you, Michael
        Hide
        Michael Melo added a comment -

        And to answer your question about whether I had an opportunity to discuss this issue on the community forums, I did not because we were given direction by our Liferay contact (we are a partner) to post this directly on this forum.

        Show
        Michael Melo added a comment - And to answer your question about whether I had an opportunity to discuss this issue on the community forums, I did not because we were given direction by our Liferay contact (we are a partner) to post this directly on this forum.
        Hide
        Michael Saechang added a comment -

        Hi Michael,
        I've noticed that you are using Liferay EE. If you are still experiencing this issue please contact your Liferay support account manager. They will be able to take care of you.

        Show
        Michael Saechang added a comment - Hi Michael, I've noticed that you are using Liferay EE. If you are still experiencing this issue please contact your Liferay support account manager. They will be able to take care of you.

          People

          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:
              Days since last comment:
              3 years, 22 weeks, 6 days ago

              Development

                Structure Helper Panel