I have 7z files that I want to convert to CSV so I can preprocess the data with pandas. I am using Python 2.7.
I tried this:
import pandas as pd
data = pd.read_csv('train_2011_2012_2013.7z.002', header = None)
print data
I got this error:
CParserError Traceback (most recent call last)
<ipython-input-9-74098fd0c476> in <module>()
1
----> 2 data = pd.read_csv('train_2011_2012_2013.7z.001', header = None)
3 print data
/root/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision)
560 skip_blank_lines=skip_blank_lines)
561
--> 562 return _read(filepath_or_buffer, kwds)
CParserError: Error tokenizing data. C error: Expected 1 fields in line 17, saw 2
What is going wrong here?
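pandas.read_csv cannot open a 7z archive directly; it is parsing the compressed bytes as text, which is what produces the tokenizing error. Extract the archive first. Install pyunpack and patool: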
pip install pyunpack
pip install patool
After that, run the following code:
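A minimal sketch of the extraction step, using pyunpack's `Archive(...).extractall(...)`. The wrapper name `extract_archive` and the paths are placeholders for this example, not part of any library. patool only drives an external program, so this also assumes a 7z-capable tool (for example p7zip) is installed on the machine.

```python
import os


def extract_archive(archive_path, output_dir):
    """Extract archive_path into output_dir; return the sorted entries found there.

    `extract_archive` is a hypothetical helper name used for illustration.
    """
    # Imported inside the function so a missing install fails at call time
    # with a plain ImportError.
    from pyunpack import Archive

    # extractall expects the target directory to exist already
    # (unless auto_create_dir=True is passed), so create it first.
    if not os.path.isdir(output_dir):
        os.makedirs(output_dir)

    Archive(archive_path).extractall(output_dir)
    return sorted(os.listdir(output_dir))


# Example call (placeholder output path):
# extract_archive('train_2011_2012_2013.7z.001', 'extracted')
```

With a split archive (.7z.001, .7z.002, ...), 7-Zip itself is normally pointed at the .001 part with the other parts in the same folder. I am assuming patool passes that through unchanged, so treat that part as untested.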
In the output path you will find the extracted folder containing your files.
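Once the files are extracted, loading them is ordinary pandas. A rough sketch, assuming the archive contains at least one .csv file; `find_csv_files` and `load_first_csv` are helper names made up for this example, and the separator and header layout of your data are things you will need to check yourself.

```python
import os


def find_csv_files(output_dir):
    """Return the .csv files found anywhere under output_dir, sorted by path."""
    matches = []
    for root, _dirs, files in os.walk(output_dir):
        for name in files:
            # Case-insensitive match so DATA.CSV is picked up too.
            if name.lower().endswith('.csv'):
                matches.append(os.path.join(root, name))
    return sorted(matches)


def load_first_csv(output_dir, **read_csv_kwargs):
    """Read the first CSV under output_dir into a DataFrame.

    Extra keyword arguments (sep, header, ...) go straight to pandas.read_csv.
    """
    import pandas as pd

    csv_files = find_csv_files(output_dir)
    if not csv_files:
        raise IOError('no .csv file found under %s' % output_dir)
    return pd.read_csv(csv_files[0], **read_csv_kwargs)
```

For example, `load_first_csv('extracted', header=None)` mirrors the original call; add `sep=';'` or similar if the columns do not split the way you expect.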