Author:Eduardo de Freitas
Email:efreitasrj at gmail.com
Subject:RE: Gzip compress ratio on binaries
Category:Development
Message:

Hi Brendon,

I found some old servlet code and adapted it to work with Excel files (the code is very specific about the response type, although it could easily be adapted to other file types). It uses the java.util.zip package and was compiled under jdk1.5.0_08. When I access my XLS file through this servlet, the gzip-compressed file served is 290kb, as opposed to the 1MB original file and the 900kb XLS file compressed by the Puakma web booster.
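
For comparing ratios outside of either server, the same bytes can be gzipped in memory with java.util.zip. This is only a minimal sketch: the GzipRatio class and its method names are made up for illustration and come from neither the servlet nor Puakma, and it uses try-with-resources, so it needs a newer JDK than 1.5. It compresses a highly repetitive buffer and a random one to show how far the ratio depends on the data itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper (not part of the servlet): measures how much gzip shrinks a byte array.
class GzipRatio {

    // Compress the whole array in memory and return the gzip bytes.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(data, 0, data.length);
        } catch (IOException e) {
            // In-memory streams should not fail; rethrow unchecked just in case.
            throw new UncheckedIOException(e);
        }
        // gz is closed here, so the gzip trailer has been written.
        return out.toByteArray();
    }

    // Compressed size divided by original size (smaller is better).
    static double ratio(byte[] data) {
        return (double) gzip(data).length / data.length;
    }

    public static void main(String[] args) {
        byte[] repetitive = new byte[1 << 20];   // 1 MB of zero bytes
        byte[] random = new byte[1 << 20];       // 1 MB of pseudo-random bytes
        new Random(42).nextBytes(random);

        System.out.println("repetitive ratio: " + ratio(repetitive));
        System.out.println("random ratio:     " + ratio(random));
    }
}
```

Reading the real XLS file into a byte array and passing it to ratio() should show whether the 290kb figure comes from plain gzip or from something specific to the servlet.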

I'd love to use Puakma, but it would have to be good at binaries too. What I don't understand is why Puakma is great at compressing JavaScript and HTML yet performs so poorly when it comes to binaries...

Here is the servlet's source code. Do you see any particular difference in it that would make it behave so differently from Puakma's compression?

********************************************** 

import lotus.domino.*;

import java.io.*;
import java.util.*;
import java.util.zip.*;
import java.lang.*;
import javax.servlet.*;
import javax.servlet.http.*;

/**
 * A servlet example of a method by which
 * attachments are always downloaded by
 * the browser
 *
 * Creation date: (4/6/2002 1:25:45 PM)
 * @author: Jon LeDrew
 */
public class GzipAttach extends javax.servlet.http.HttpServlet {
   
/**
 * Process incoming HTTP GET requests
 *
 * @param request Object that encapsulates the request to the servlet
 * @param response Object that encapsulates the response from the servlet
 */
public void doGet(javax.servlet.http.HttpServletRequest request,javax.servlet.http.HttpServletResponse response) throws javax.servlet.ServletException, java.io.IOException {

    ServletOutputStream op;
    InputStream is;
    BufferedInputStream bis;
    String unid;
    String filename;
    String dbpath;

    try {

    //request parameters Notes UNID, Attachment Filename and the Database Path.
        unid        = request.getParameter("unid");
        filename    = request.getParameter("filename");
        dbpath      = request.getParameter("dbpath");

        /***************************************************************
        getAttachment returns a BufferedInputStream over the attachment
        if the db, doc and attachment are found, or null otherwise
        ***************************************************************/
        bis = getAttachment(unid, filename, dbpath);

        if (bis != null) {

            byte [] attachment = new byte[8192];
            //fixed-size copy buffer; available() is only an estimate and can return 0,
            //so it should not be used to size the array

            response.setContentType("application/vnd.ms-excel");
            response.setHeader("Content-Encoding","gzip");
           
            op = response.getOutputStream();
            GZIPOutputStream gz=new GZIPOutputStream(op);
            //set the response details

            //read the BufferedInputStream into the byte array
            //and write each chunk to the GZIPOutputStream
            while (true) {
                int bytesRead = bis.read(attachment, 0, attachment.length);
                if (bytesRead < 0)
                    break;
                gz.write(attachment, 0, bytesRead);
            }

            //clean up
            bis.close();
            gz.close(); //writes the gzip trailer, then flushes and closes op
        } else {
            //Send an error if getAttachment returns null.
            response.sendError( HttpServletResponse.SC_INTERNAL_SERVER_ERROR,
                    "Could not find the specified attachment - <a href=\"/"
                        + dbpath + "/0/" + unid + "/$file/" + filename + "\">" + filename + "</a>" );

            /**************************************************************
            If custom error messages are required.
            response.setContentType("text/html");
            pw = response.getWriter();
            pw.println("Error");
            **************************************************************/
        }
   
    } catch (IOException e) {
        response.sendError( HttpServletResponse.SC_INTERNAL_SERVER_ERROR,
                    "An I/O error occurred while trying to retrieve the file" );
        //e.printStackTrace();
    }
}
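/**
 * Opens the database, finds the document by UNID and returns a stream
 * over the named attachment.
 *
 * Creation date: (4/6/2002 1:30:20 PM)
 * @return BufferedInputStream over the attachment, or null if it was not found
 */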
public BufferedInputStream getAttachment(String unid, String filename, String dbpath) {

    InputStream is = null;
    BufferedInputStream bis = null;
    NotesThread nThread = new NotesThread();
    //New instance NotesThread

    try {
        nThread.sinitThread();
        Session sess = NotesFactory.createSession();

        /*************************************************************************
        This method creates a Notes Session.  This method uses the local user config if the server is 'local'
        and uses the server permissions if the database is referenced on a server.
        Use Session sess = NotesFactory.createSession((String) null,ID,Password);
        where ID and Password are valid values for a users internet name and password
        *************************************************************************/

        Database dwnldsrvlt = sess.getDatabase(null, dbpath);

        /*************************************************************************
        Set dwnldsrvlt using Session's getDatabase method.  getDatabase(String server, String dbpath)
        null indicates that the session environment is to be used.
        *************************************************************************/

        //open the database if the session has not already done so
        if (!dwnldsrvlt.isOpen()) {
            dwnldsrvlt.open();
        }
        Document doc = dwnldsrvlt.getDocumentByUNID(unid);
        if (doc != null) {
            EmbeddedObject em = doc.getAttachment(filename);
            if (em != null) {
                is = em.getInputStream();
                bis = new BufferedInputStream(is);
                sess.recycle();
            }
        }
    } catch (NotesException ne) {
        /****************************************************************
        Exception caught if the database or document aren't found.
        bis stays null; the finally clause still runs and terminates the NotesThread.
        Use pw.println(ne.id + " " + ne.text); for custom messages on failure to find the db, doc or
        the attachment.
        ****************************************************************/
        ne.printStackTrace();

    } catch (Exception e) {
        e.printStackTrace();
               
    } finally {
        nThread.stermThread();
        //Terminate the thread.
    }
    //returned outside finally so the finally block cannot mask an error; null if nothing was found
    return bis;
}//end getAttachment
}
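
The copy loop in doGet can also be exercised without Domino or a servlet container by swapping in in-memory streams. The following is a minimal sketch under those assumptions: the GzipCopy class, its method names and the 8 KB buffer size are hypothetical choices for illustration, and ByteArrayInputStream/ByteArrayOutputStream stand in for the attachment stream and the ServletOutputStream. It gzips a stream chunk by chunk and then decompresses the result to confirm the bytes survive the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical test harness; class and method names are illustrative only.
class GzipCopy {

    // Gzip everything readable from 'in' into 'out' using a fixed-size buffer.
    static void gzipCopy(InputStream in, OutputStream out) throws IOException {
        GZIPOutputStream gz = new GZIPOutputStream(out);
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            gz.write(buffer, 0, n);
        }
        gz.finish(); // writes the gzip trailer but leaves 'out' open for the caller
    }

    // Decompress gzip bytes back into the original content.
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = gz.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return out.toByteArray();
    }

    // Compress then decompress; returns true if the bytes come back unchanged.
    static boolean roundTrips(byte[] original) {
        try {
            ByteArrayOutputStream compressed = new ByteArrayOutputStream();
            gzipCopy(new ByteArrayInputStream(original), compressed);
            return Arrays.equals(original, gunzip(compressed.toByteArray()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for an attachment: 100,000 bytes of a repeating pattern.
        byte[] sample = new byte[100000];
        for (int i = 0; i < sample.length; i++) {
            sample[i] = (byte) (i % 97);
        }
        System.out.println("round trip ok: " + roundTrips(sample));
    }
}
```

Because the buffer size is fixed, the loop does not depend on knowing the attachment size up front.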
 


Threads:
Gzip compress ratio on binaries   Eduardo de Freitas 26.May.07
    RE: Gzip compress ratio on binaries   Brendon Upson 29.May.07
        RE: Gzip compress ratio on binaries   Eduardo de Freitas 30.May.07
            RE: Gzip compress ratio on binaries   Brendon Upson 30.May.07
                RE: Gzip compress ratio on binaries   Eduardo de Freitas 31.May.07
                    RE: Gzip compress ratio on binaries   Eduardo de Freitas 31.May.07