How to read a CSV file, filtering by line, with BeanIO?

I want to read a CSV file with BeanIO, keeping only the lines that start with "CA" and skipping the rest. I need the values "0", "1", "2" and "3", "4", "5" from the "CA" lines.

AA123
BA456
CA789
CA012
CA345
DA678
EA901

BeanIO uses an XML mapping file:

<stream name="InfoCSV" format="csv">
  <record name="info" class="com.example.Info" minOccurs="0" maxOccurs="unbounded">
    <field name="digit1" />
    <field name="digit2" />
    <field name="digit3" />
  </record>
</stream>

How do I filter the lines? I don't know how to configure this in the XML mapping.

There is 1 answer

nicoschl On BEST ANSWER

First, judging from the data you have shown, you must use the fixedlength format rather than csv, because the values are not separated by a delimiter:

<stream name="InfoCSV" format="fixedlength" />

Streams have a configuration setting called ignoreUnidentifiedRecords (see Appendix A, par. 7), which you need in order to ignore the records/lines that don't start with "CA".

You also need to tell the parser how to identify the records/lines you are interested in. Section 4.2.1 explains how record identification works with rid="true" and the literal attribute. Assuming the first 2 characters identify the records/lines you are interested in, we have:

<field name="id" position="0" length="2" rid="true" literal="CA" />
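To make the identification rule concrete, here is a rough plain-Java sketch of the check that this field describes, combined with the effect of ignoreUnidentifiedRecords. It does not use BeanIO at all; the class and method names (RecordIdSketch, isIdentified, keepIdentified) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; this is not BeanIO code.
class RecordIdSketch {
    // Mirrors position="0" length="2" literal="CA":
    // the first two characters of the line must equal "CA".
    static boolean isIdentified(String line) {
        return line.length() >= 2 && line.substring(0, 2).equals("CA");
    }

    // Mirrors ignoreUnidentifiedRecords="true":
    // lines that do not match are skipped instead of causing an error.
    static List<String> keepIdentified(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (isIdentified(line)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "AA123", "BA456", "CA789", "CA012", "CA345", "DA678", "EA901");
        System.out.println(keepIdentified(input)); // prints [CA789, CA012, CA345]
    }
}
```

With the sample data from the question, only the three "CA" lines survive the filter.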

Putting it all together:

<stream name="InfoCSV" format="fixedlength" ignoreUnidentifiedRecords="true">
  <record name="info" class="com.example.Info" minOccurs="0" maxOccurs="unbounded">
    <field name="id" position="0" length="2" rid="true" literal="CA"/>
    <field name="digit1" position="2" length="1" />
    <field name="digit2" position="3" length="1" />
    <field name="digit3" position="4" length="1" />
  </record>
</stream>
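As a sanity check on the position and length values, this plain-Java sketch shows what the mapping above should extract from each line, assuming zero-based positions as used in the mapping. Again, this is not BeanIO code, and FixedLengthSketch, field and parse are hypothetical names used only for illustration.

```java
import java.util.Arrays;

// Plain-Java illustration only; this is not BeanIO code.
class FixedLengthSketch {
    // Equivalent of one fixed-length field: zero-based position plus length.
    static String field(String line, int position, int length) {
        return line.substring(position, position + length);
    }

    // Returns {digit1, digit2, digit3} for a "CA" line, or null for any other line.
    static String[] parse(String line) {
        if (line.length() < 5 || !field(line, 0, 2).equals("CA")) {
            return null; // unidentified record: ignored
        }
        return new String[] { field(line, 2, 1), field(line, 3, 1), field(line, 4, 1) };
    }

    public static void main(String[] args) {
        String[] lines = { "AA123", "BA456", "CA789", "CA012", "CA345", "DA678", "EA901" };
        for (String line : lines) {
            String[] digits = parse(line);
            if (digits != null) {
                System.out.println(Arrays.toString(digits));
            }
        }
    }
}
```

For the sample data this prints one row per "CA" line: [7, 8, 9], then [0, 1, 2], then [3, 4, 5].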