Python/Django- Save csv file data with messy rows

Question

Python/Django- Save csv file data with messy rows

442 views Asked by arrt_ At 14 December 2016 at 10:04

I have a csv file with the following structure:

A, 10
B, 11
C, 8
D, 12
A, 21
B, 7
D, 22
D, 15
C, 111
D, 50
A, 41
B, 32
C, 19
D, 11

I want to read the entire file, and save the data on the second column, if the row is like A, B, C, D format. I have a list:

my_list = [A, B, C, D]

And I check each 4 row, if it is in my_list format, then read and save to the database like:

with open('csv_file.csv', 'rb') as csvfile:
        the_file = csv.reader(csvfile.read().splitlines())

        for idx, row in enumerate(islice(zip(*[the_file]*4), 100)):
            my_model = Django_model()
            if row[0][0] == my_list[0]:
                if row[0][0] == my_list[0] and row[1][0] == my_list[1] and \
                    row[2][0] == my_list[2] and row[3][0] == my_list[3]:
                    my_model.a_field = row[0][1]
                    my_model.b_field = row[1][1]
                    my_model.c_field = row[2][1]
                    my_model.d_field = row[3][1]
                    my_model.save()

The fact is that it is working if and ONLY if the structure of the row is the same as my_list. But when it gets to the messy part (A, B, D, D, C, D) it will not read the rows properly, and therefore many rows are skipped.

The question is that, how can I jump to the next interesting row (which follows my_list format) and read it? Meanwhile save the the skipped rows in another list?

I have heard that Pandas could help, but I have been through the documentation, and I couldn't find a way to solve this case.

Original Q&A

There are 1 answers

**Mohammad Yusuf** · Accepted Answer · 2016-12-14T10:39:16+00:00

You can extract the pattern and corresponding values like this:

import pandas as pd
import re

df = pd.read_csv('/home/yusuf/Desktop/c1', header=None, names=['c1','c2'])
l1=[]
for a in re.finditer('ABCD', ''.join(df.c1.tolist())):
    l1.append(range(a.start(),a.end()))
l2 = [b for a in l1 for b in a]
print df[df.index.isin(l2)], '\n'
print df[~df.index.isin(l2)]

Output:

TechQA.

Python/Django- Save csv file data with messy rows

There are 1 answers

Related Questions in PYTHON

Related Questions in DJANGO

Related Questions in PANDAS

Related Questions in DJANGO-MODELS

Related Questions in CSV-IMPORT

Popular Questions

Popular Tags

Trending Questions