  1. PUBLIC - Liferay Portal Community Edition
  2. LPS-133127

Export performance impacted by ZipWriter strategy

    Details

    • Type: Bug
    • Status: Closed
    • Resolution: Fixed
    • Affects Version/s: 7.3.10 DXP GA1, 7.3.6 CE GA7, 7.4.0 CE GA1, Master
    • Fix Version/s: Master
    • Labels:
      None

      Description

      Liferay's export functionality creates a zip file (called a LAR) containing the selected content.
      Zip manipulation is delegated to the ZipWriterImpl class, which uses built-in zip support.

      The export process calls two methods of ZipWriterImpl:

      • "addEntry(String name, InputStream inputStream)", which adds the inputStream to the zip immediately
      • "addEntry(String name, byte[] bytes)", which collects the parameter pair into a list; the collected entries are added in a single step when the "getFile()" method is called.

      Because elements must be added to the zip under a specific path, this is done through the "Java File System Provider" capability. The zip is managed as a virtual file system, and standard file utilities are used to create folders and files.

      Every time "addEntry(String name, InputStream inputStream)" is called, the zip is mounted as a FileSystem, the stream is written, and the FileSystem is disposed.
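
      The open, write, dispose cycle described above can be sketched with the JDK zip file system provider. This is a minimal illustration, not the actual ZipWriterImpl code: the class name PerEntryZipSketch, the helper addEntry(Path, String, InputStream) and the entry names are hypothetical, and Java 9+ is assumed for Map.of.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;

// Hypothetical sketch: one FileSystem session per entry.
class PerEntryZipSketch {

    // Mounts the zip as a FileSystem, writes a single entry, then disposes it.
    static void addEntry(Path zipFile, String name, InputStream inputStream)
            throws IOException {
        URI uri = URI.create("jar:" + zipFile.toUri());

        // "create" = "true" lets the provider create the zip if it is missing.
        try (FileSystem fileSystem =
                FileSystems.newFileSystem(uri, Map.of("create", "true"))) {
            Path entryPath = fileSystem.getPath("/", name);
            Path parent = entryPath.getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            Files.copy(inputStream, entryPath, StandardCopyOption.REPLACE_EXISTING);
        } // closing the FileSystem persists the change to the zip on disk
    }

    public static void main(String[] args) throws IOException {
        // The zip path must not exist yet as an empty file, so resolve it
        // inside a fresh temporary directory.
        Path zipFile = Files.createTempDirectory("lar-sketch").resolve("export.lar");

        addEntry(zipFile, "documents/a.txt",
            new ByteArrayInputStream("alpha".getBytes(StandardCharsets.UTF_8)));
        addEntry(zipFile, "documents/b.txt",
            new ByteArrayInputStream("beta".getBytes(StandardCharsets.UTF_8)));

        System.out.println("zip size in bytes: " + Files.size(zipFile));
    }
}
```

      Each call pays for a full mount and dispose, which is where the per-entry cost in this report comes from.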

       

      So where is the bad part?

       

      Take a fresh Liferay bundle and upload 100 files of variable size, for a total of 200MB, into DML. Preferably use images and PDFs so that Liferay creates previews and thumbnails.

      Now export the site, leaving all the checkboxes at their default values.
      Before starting the export, open a terminal in the Tomcat temp folder and execute "watch ls -lai" (to get feedback on what happens in the folder).

      When you start the export, a zip file with a UUID as its name is created.
      While this file grows, you'll notice files named "zipfstmp*.tmp" appear and disappear with different sizes.

       

      Every time "addEntry()" adds a file to the zip (using the virtual file system), the entire zip content is copied into a temporary file to produce a new version of the zip that includes the new content.

      Imagine the export zip is already 2GB in size. To add 3 files of 20 bytes each:

      • 2GB + 20 bytes are copied as "zipfstmp" and then the file is renamed as final zip
      • 2GB + 40 bytes are copied as "zipfstmp" and then the file is renamed as final zip
      • 2GB + 60 bytes are copied as "zipfstmp" and then the file is renamed as final zip

      To add 60 bytes we have processed "6GB + 120 bytes".
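
      The arithmetic can be checked with a small model. This is a simplified sketch: it assumes 1GB = 1,000,000,000 bytes, ignores zip headers and compression, and the class and method names are hypothetical. Under this model the three additions copy 3 × 2GB plus 20 + 40 + 60 = 120 bytes.

```java
// Hypothetical model of the "rewrite the whole zip per entry" cost.
class RewriteCostSketch {

    // Sums the bytes copied when every added entry triggers a full copy
    // of the zip, including the entry that was just added.
    static long totalBytesCopied(long initialZipBytes, long entryBytes, int entryCount) {
        long total = 0;
        long zipBytes = initialZipBytes;
        for (int i = 0; i < entryCount; i++) {
            zipBytes += entryBytes; // the zip grows by one entry
            total += zipBytes;      // and is copied in full to the temporary file
        }
        return total;
    }

    public static void main(String[] args) {
        long twoGb = 2_000_000_000L; // assumption: decimal gigabytes
        System.out.println(totalBytesCopied(twoGb, 20, 3)); // prints 6000000120
    }
}
```

      The copied volume grows with the number of entries times the current zip size, so the cost per added file rises as the export gets larger.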

      This makes the system wait for the disk i/o.

      This behaviour especially affects binary content (such as Documents and Media). Other types of content are appended to a list and written all together in a single round.
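
      For contrast, the "collect into a list, write in a single round" path can be sketched as follows. This is an illustration under assumptions, not the ZipWriterImpl source: the class BatchedZipSketch, its constructor and the entry names are hypothetical, and Java 9+ is assumed for Map.of.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: entries are buffered and written in one FileSystem session.
class BatchedZipSketch {

    private final Path zipFile;
    private final Map<String, byte[]> pendingEntries = new LinkedHashMap<>();

    BatchedZipSketch(Path zipFile) {
        this.zipFile = zipFile;
    }

    // Only records the pair; nothing touches the disk here.
    void addEntry(String name, byte[] bytes) {
        pendingEntries.put(name, bytes);
    }

    // Mounts the zip once, writes every pending entry, then disposes it.
    Path getFile() throws IOException {
        URI uri = URI.create("jar:" + zipFile.toUri());
        try (FileSystem fileSystem =
                FileSystems.newFileSystem(uri, Map.of("create", "true"))) {
            for (Map.Entry<String, byte[]> entry : pendingEntries.entrySet()) {
                Path entryPath = fileSystem.getPath("/", entry.getKey());
                Path parent = entryPath.getParent();
                if (parent != null) {
                    Files.createDirectories(parent);
                }
                Files.write(entryPath, entry.getValue());
            }
        }
        pendingEntries.clear();
        return zipFile;
    }

    public static void main(String[] args) throws IOException {
        Path zipFile = Files.createTempDirectory("lar-batch").resolve("export.lar");
        BatchedZipSketch writer = new BatchedZipSketch(zipFile);
        writer.addEntry("journal/a.xml", "<a/>".getBytes(StandardCharsets.UTF_8));
        writer.addEntry("journal/b.xml", "<b/>".getBytes(StandardCharsets.UTF_8));
        System.out.println("written to: " + writer.getFile());
    }
}
```

      The trade-off in this sketch is memory: every buffered byte[] stays on the heap until getFile() runs, which is less suitable for large binary streams than for small text entries.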

       

              People

              Assignee:
              support-lep@liferay.com SE Support
              Reporter:
              maumar Mauro Mariuzzo
              Participants of an Issue:
              Recent user:
              Tamas Molnar
                  Packages

                  Version Package
                  Master