PDF to ByteArray Conversion

Question

Asked by Manoj Kh on 09 June 2015 at 13:39 (1.3k views)

We have written Java code that tries to convert a PDF to a byte array.

The problem is that when we print the converted output, we get only 8 to 10 characters. Why is that? When I convert the whole PDF, the result should be a large number of characters.

Here is my code:

public static void main(String[] args) throws IOException
{
    FileInputStream in = new FileInputStream(new File("C:\\test\\P12.pdf"));
    FileOutputStream out = new FileOutputStream(new File("C:\\test\\pdfoutput.xml"));

    // Copy the file into memory, 1024 bytes at a time.
    byte[] buffer = new byte[1024];
    ByteArrayOutputStream bs = new ByteArrayOutputStream();
    int bytesRead;
    while ((bytesRead = in.read(buffer)) != -1)
    {
        bs.write(buffer, 0, bytesRead);
    }
    System.out.println(in);
    byte[] bytes = bs.toByteArray();

    // Print the buffered bytes decoded as text, then write them out.
    System.out.println(bs.toString());
    out.write(bytes);
}

Original Q&A

There are 2 answers

**user207421** · Answer 1 · 2015-06-09T14:22:17+00:00

> We have written Java code that tries to convert a PDF to a byte array.

No you haven't. You have written code that reads a file, without conversion, into a byte array. This is a bitwise copy operation, not a conversion.
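To illustrate the "copy, not conversion" point, here is a minimal sketch that reads a file into a byte array with the standard `Files.readAllBytes` call. The class name `PdfBytes` and the helper `readAll` are made up for this example, and the default path is simply the one from the question.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original post.
class PdfBytes {
    // Reads the whole file into memory; the bytes are copied unchanged.
    static byte[] readAll(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    public static void main(String[] args) throws IOException {
        // Path taken from the question unless one is given on the command line.
        Path pdf = Paths.get(args.length > 0 ? args[0] : "C:\\test\\P12.pdf");
        byte[] bytes = readAll(pdf);
        // The array length matches the file size: nothing was converted.
        System.out.println("Read " + bytes.length + " bytes; file size is " + Files.size(pdf));
    }
}
```

The array holds exactly the bytes that are on disk, so its length is the honest measure of how much was read, not the number of characters a console happens to display.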

> But the problem is when we try to convert

There is no conversion here, other than the almost certainly invalid conversion of the ByteArrayOutputStream to a String.

> and try to print the converted output we get only 8 to 10 characters

You get junk. Binary junk. You get the original, unconverted, PDF, with all its binary characters, probably including lots of CR and BS characters. It isn't a valid operation. Solution: don't do it.
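If the goal is only to inspect what was read, one option is to print the bytes as hexadecimal instead of decoding them as text. The sketch below is an illustration under that assumption; `HexPreview` and `hexPreview` are hypothetical names, not something from the original code.

```java
// Hypothetical helper class, not part of the original post.
class HexPreview {
    // Formats up to maxBytes bytes as two-digit hex values separated by spaces.
    static String hexPreview(byte[] data, int maxBytes) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(data.length, maxBytes);
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            // Mask with 0xff so negative byte values print as 00..ff.
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // '%', 'P', 'D', 'F', then a backspace and a carriage return.
        byte[] sample = {0x25, 0x50, 0x44, 0x46, 0x08, 0x0d};
        System.out.println(hexPreview(sample, 4)); // prints "25 50 44 46"
    }
}
```

Unlike `toString()`, this output cannot be mangled by control characters such as CR or BS, because every byte is rendered as printable digits.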

> Why is it so?

Because you haven't converted anything.

> When I convert the whole PDF it has to be a large number of characters

No doubt, but you haven't converted anything yet. If you want to see the text, use a PDF viewer, or write some code that uses a library like iText.

You have not yet begun to fight.

**Joop Eggen** · Answer 2 · 2015-06-09T13:49:25+00:00

A PDF is binary data, so a toString will probably just output the so-called PDF signature: %PDF- plus the version, followed by some intentionally non-ASCII characters.

Treating it as XML is even less likely to work.

For reading a PDF there is, for instance, the iText library.
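A quick way to confirm that the bytes read really are a PDF is to check that they begin with the `%PDF-` marker. This is a minimal sketch of such a check; `PdfSignature` and `looksLikePdf` are hypothetical names, and the check says nothing about whether the rest of the file is valid.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original post.
class PdfSignature {
    // The ASCII bytes every PDF header starts with.
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true if data begins with the "%PDF-" marker.
    static boolean looksLikePdf(byte[] data) {
        if (data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] header = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikePdf(header)); // prints "true"
    }
}
```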

BTW in.close() would be a good idea too.
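One way to make sure the streams are closed is a try-with-resources block, which closes both streams even if the copy fails partway. The sketch below reuses the read loop from the question; the class name `CopyWithClose` and the method `copy` are made up for this example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class, not part of the original post.
class CopyWithClose {
    // Copies source to target unchanged and returns the number of bytes copied.
    static long copy(File source, File target) throws IOException {
        long total = 0;
        // Both streams are closed automatically when the block exits.
        try (InputStream in = new FileInputStream(source);
             OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[1024];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead);
                total += bytesRead;
            }
        }
        return total;
    }
}
```

Calling `CopyWithClose.copy(new File("C:\\test\\P12.pdf"), new File("C:\\test\\copy.pdf"))` would produce a byte-for-byte copy; the second path is only a placeholder.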
