Additional "2000" String ([32 30 30 30] bytes) at the beginning of a file

I have a really strange issue and I cannot find the solution.

I have a simple test servlet that streams a small PDF file in the response:

public class TestPdf extends HttpServlet implements Servlet {

    private static final long serialVersionUID = 1L;

    public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {

        File file = new File(getServletContext().getRealPath("/lorem.pdf"));

        response.setContentType("application/pdf");

        ServletOutputStream out = response.getOutputStream();

        InputStream in = new FileInputStream(file);

        byte[] bytes = new byte[10000];

        int count = -1;

        while ((count = in.read(bytes)) != -1) {
            out.write(bytes, 0, count);
        }

        in.close();

        out.flush();
        out.close();

    }

}

If I call the servlet URL with a browser, curl, or wget, everything is fine, but when I call it with a simple Tcl script like this:

#!/usr/bin/tclsh8.5

package require http;

set testUrl "http://localhost:8080/test/pdf"
set httpResponse [http::geturl "$testUrl" -channel stdout]

the resulting file has a "2000" string at the beginning that corrupts the PDF.

The issue does not seem to be related to the Tomcat or JDK version, since I am able to reproduce it on my development environment (Ubuntu 16.04) with both JDK 1.5.0_22 and Tomcat 5.5.36, and JDK 1.8.0_74 and Tomcat 8.5.15.
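One way to confirm the symptom is to look at the first bytes of the saved output. The following is only a minimal sketch: the class name `HeaderCheckSketch` and the sample byte arrays are made up for illustration, and the "bad" sample assumes the prefix is followed by CRLF and then the real PDF header, which the question does not show.

```java
import java.nio.charset.StandardCharsets;

public class HeaderCheckSketch {

    // Formats the first n bytes as space-separated hex, e.g. "32 30 30 30".
    static String hexPrefix(byte[] data, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(n, data.length); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    // A PDF file starts with the ASCII marker "%PDF-".
    static boolean looksLikePdf(byte[] data) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical samples: a clean header and one with the stray "2000" prefix.
        byte[] good = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        byte[] bad = "2000\r\n%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(hexPrefix(good, 4) + " -> " + looksLikePdf(good));
        System.out.println(hexPrefix(bad, 4) + " -> " + looksLikePdf(bad));
    }
}
```

In practice the byte array would come from the file written by the Tcl script rather than from a literal.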

There are 2 answers

zappee On

I have never used Tcl, but this is how you can write a general file download servlet:

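import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class DownloadServlet extends HttpServlet {
    private static final int BUFFER_SIZE = 10000;

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws ServletException, IOException {

        String filename = "test.pdf";
        String pathToFile = "..../" + filename;

        resp.setContentType("application/pdf");
        resp.setHeader("Content-Disposition", "attachment; filename=\"" + filename + "\"");

        // try-with-resources closes both streams even if copying fails
        try (InputStream in = req.getServletContext().getResourceAsStream(pathToFile);
             OutputStream out = resp.getOutputStream()) {

            byte[] buffer = new byte[BUFFER_SIZE];
            int numBytesRead;

            // read() returns -1 at end of stream
            while ((numBytesRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, numBytesRead);
            }
        }
    }
}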

I hope this piece of code helps.

mrcalvin On

What you see is the start of a chunk, as pointed out by others: under HTTP chunked transfer encoding, each chunk is preceded by the number of octets it contains, written in hexadecimal, so "2000" announces a chunk of 0x2000 = 8192 bytes. To handle this on the Tcl client side (rather than by turning off chunked transfer encoding on the Tomcat side), you need to omit the -channel option to http::geturl:

package require http;

set testUrl "http://localhost:8080/test/pdf"
set httpResponse [http::geturl "$testUrl"]
fconfigure stdout -translation binary; # turn off auto-encoding on the way out
puts -nonewline stdout [http::data $httpResponse]

This should properly transmogrify the chunked content into one piece. The background is that handling of chunked content did not work with the -channel option when I last checked.
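To illustrate what "one piece" means here, the following is a rough Java sketch of decoding a chunked body by hand. It is not how the Tcl http package does it; the class name `ChunkedDecodeSketch` and the sample input are invented for illustration, and trailer headers after the last chunk are ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedDecodeSketch {

    // Reads one CRLF-terminated line and returns it without the CRLF.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r') {
                in.read(); // consume the '\n' that follows
                break;
            }
            line.write(b);
        }
        return new String(line.toByteArray(), StandardCharsets.US_ASCII);
    }

    // Strips the chunk framing and returns the concatenated chunk data.
    static byte[] decode(byte[] raw) throws IOException {
        InputStream in = new ByteArrayInputStream(raw);
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semicolon = sizeLine.indexOf(';'); // drop optional chunk extensions
            if (semicolon >= 0) {
                sizeLine = sizeLine.substring(0, semicolon);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16); // size is hexadecimal
            if (size == 0) {
                break; // a zero-size chunk ends the body
            }
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(chunk, off, size - off);
                if (n == -1) {
                    throw new IOException("truncated chunk");
                }
                off += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // consume the CRLF that follows the chunk data
        }
        return body.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical chunked body carrying "Wikipedia" in two chunks.
        byte[] raw = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(decode(raw), StandardCharsets.US_ASCII));
        System.out.println(Integer.parseInt("2000", 16)); // the size line from the question
    }
}
```

With -channel, the size lines apparently end up in the output untouched, which would explain the leading "2000" in the file.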