Converting EBCDIC to ASCII in java

Question

Converting EBCDIC to ASCII in java

42.2k views Asked by Robin-Hoodie At 03 December 2013 at 11:09

I am supposed to convert an EBCDIC file to ASCII by using Java. So far I have this code:

public class Migration {
    InputStreamReader reader;
    StringBuilder builder;

    public Migration(){
        try {
            reader = new InputStreamReader(new FileInputStream("C:\\TI3\\Legacy Systemen\\Week 3\\Oefening 3\\inputfile.dat"),
                   java.nio.charset.Charset.forName("ibm500") );
        } catch(FileNotFoundException e){
            e.printStackTrace();
        }
        builder = new StringBuilder();
    }

    public void read() throws IOException {
        int theInt;
        while((theInt = reader.read()) != -1){
            char theChar = (char) theInt;
            builder.append(theChar);

        }

        reader.close();
    }

    @Override
    public String toString(){
        return builder.toString();
    }
}

The file description is the following:

 02 KDGEX.
      05 B1-LENGTH PIC S9(04) USAGE IS COMP.
      05 B1-CODE PIC S9(04) USAGE IS COMP.
      05 B1-NUMBER PIC X(08).
      05 B1-PPR-NAME PIC X(06).
      05 B1-PPR-FED PIC 9(03).
      05 B1-PPR-RNR PIC S9(08) USAGE IS COMP.
      05 B1-DATA.
        10 B1-VBOND PIC 9(02).
        10 B1-KONST.
          20 B1-AFDEL PIC 9(03).
          20 B1-KASSIER PIC 9(03).
          20 B1-DATZIT-DM PIC 9(04).
        10 B1-BETWYZ PIC X(01).
        10 B1-RNR PIC X(13).
        10 B1-BETKOD PIC 9(02).
        10 B1-VOLGNR-INF PIC 9(02).
        10 B1-QUAL-PREST PIC 9(03).
        10 B1-REKNUM PIC 9(12).
        10 B1-REKNR REDEFINES B1-REKNUM.
          20 B1-REKNR-PART1 PIC 9(03).
          20 B1-REKNR-PART2 PIC 9(07).
          20 B1-REKNR-PART3 PIC 9(02).
        10 B1-VOLGNR-M30 PIC 9(03).
        10 B1-OMSCHR.
          15 B1-OMSCHR1 PIC X(14).
          15 B1-OMSCHR2 PIC X(14).
        10 B1-OMSCHR-INF REDEFINES B1-OMSCHR.
          15 B1-AANT-PREST PIC 9(02).
          15 B1-VERSTR PIC 9(01).
          15 B1-LASTDATE PIC 9(06).
          15 B1-HONOR PIC 9(06).
          15 B1-RIJKN PIC X(13).
        10 FILLER--1 PIC 9(02).
        10 B1-INFOREK PIC 9(01).
        10 B1-BEDRAG-EUR PIC 9(08).
        10 B1-BEDRAG-DV PIC X(01).
        10 B1-BEDRAG-RMG-DV REDEFINES B1-BEDRAG-DV PIC X(01).
      05 FILLER PIC X(5).

We can ignore the first 2 bytes on every line. The problem is the bytes where there's a USAGE IS COMP since the reader is not converting them properly, I think I am supposed to read these as bytes or something, though I have no idea how.

Original Q&A

There are 2 answers

Murali On 08 November 2016 at 17:29

I also faced same issue like converting EBCDIC to ASCII string. Please find the code below to convert a single EBCDIC to ASCII string.

public class EbcdicConverter
{
    public static void main(String[] args) 
        throws Exception
    {
        String ebcdicString =<your EBCDIC string>;
        // convert String into InputStream
        InputStream is = new ByteArrayInputStream(ebcdicString.getBytes());
        ByteArrayOutputStream baos=new ByteArrayOutputStream();

        int line;
         while((line = is.read()) != -1) {
             baos.write((char)line);
         }
         String str = baos.toString("Cp500");
         System.out.println(str);
    }
}

**McDowell** · Accepted Answer · 2013-12-03T12:13:01+00:00

If I am interpreting this format correctly you have a binary file format with fixed-length records. Some of these records are not character data (COBOL computational fields?)

So, you will have to read the records using a more low-level approach processing individual fields of each record:

import java.io.*;

public class Record {
  private byte[] kdgex = new byte[2]; // COMP
  private byte[] b1code = new byte[2]; // COMP
  private byte[] b1number = new byte[8]; // DISPLAY
  // other fields

  public void read(DataInput data) throws IOException {
    data.readFully(kdgex);
    data.readFully(b1code);
    data.readFully(b1number);
    // other fields
  }

  public void write(DataOutput out) throws IOException {
    out.write(kdgex);
    out.write(b1code);
    out.write(b1number);
    // other fields
  }
}

Here I've used byte arrays for the first three fields of the record but you could use other more suitable types where appropriate (like a short for the first field with readShort.) Note: my interpretation of the field widths is likely wrong; it is just an example.

The DataInputStream is generally used as a DataInput implementation.

Since all characters in the source and target encodings use a one-octet-per code point you should be able to transcode the character data fields using a method like this:

public static byte[] transcodeField(byte[] source, Charset from, Charset to) {
  byte[] result = new String(source, from).getBytes(to);
  if (result.length != source.length) {
    throw new AssertionError(result.length + "!=" + source.length);
  }
  return result;
}

I suggest tagging your question with COBOL (assuming that is the source of this format) so that someone else can speak with more authority on the format of the data source.

TechQA.

Converting EBCDIC to ASCII in java

There are 2 answers

Related Questions in JAVA

Related Questions in ASCII

Related Questions in INPUTSTREAM

Related Questions in COBOL

Related Questions in EBCDIC

Popular Questions

Trending Questions