I have a text file that looks like the example below. It has 4 rows, and the strings in each row are separated by a comma (,).
"India1,India2,myIndia "
"Where,Here,Here "
"Here,Where,India,uyete"
"AFD,TTT"
The file can be downloaded from here:
https://gist.github.com/anonymous/cee79db7029a7d4e46cc4a7e92c59c50
I want to extract all unique cells across all rows. The expected output is:
India1
India2
myIndia
Where
Here
India
uyete
AFD
TTT
I tried to read the file line by line and print it, calling my data df:
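# Read every line of the file into a list
with open("df.txt") as myfile:
    lines = myfile.readlines()
# Print each line (this does not yet split the cells or remove duplicates)
for line in lines:
    print(line)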
Option 1: .csv, .txt Files
Native Python is unable to read .xls files. If you convert your file(s) to .csv or .txt, you can use the csv module within the Standard Library.
Option 2: .xls, .xlsx Files
If you want to retain the original .xls format, you have to install a third-party module to handle Excel files. Install xlrd from the command prompt (pip install xlrd), then read the workbook in Python. Note that xlrd 2.0 and later reads only .xls files, so .xlsx files need a different module such as openpyxl.
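For Option 1, the csv approach could look roughly like the sketch below. It is not the original answer's code: the function name unique_cells is hypothetical, and it assumes that a whole line may be wrapped in double quotes (as in the sample rows), in which case the csv module returns the line as a single field that has to be split on commas again.

```python
import csv


def unique_cells(path):
    """Return the unique cells of a comma-separated text file, in first-seen order."""
    seen = set()
    ordered = []
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            for field in row:
                # A line wrapped whole in quotes arrives as one field,
                # so split each field on commas once more.
                for cell in field.split(","):
                    cell = cell.strip()  # drop stray spaces such as "myIndia "
                    if cell and cell not in seen:
                        seen.add(cell)
                        ordered.append(cell)
    return ordered
```

Called on the four sample rows, unique_cells("df.txt") should return India1, India2, myIndia, Where, Here, India, uyete, AFD, TTT, in that order. A set tracks what has been seen, while a separate list keeps the order of first appearance.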
Option 3: DataFrames
You can handle csv and text files with pandas DataFrames. See documentation for other formats.
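One way the pandas route might look is sketched below. The function name unique_cells_frame and the column name "cell" are hypothetical, and the sketch reads the file as raw lines instead of using pd.read_csv, on the assumption that the rows have different numbers of cells and may be wrapped in double quotes, as in the sample.

```python
import pandas as pd


def unique_cells_frame(path):
    """Return a one-column DataFrame of unique cells, in first-seen order."""
    with open(path) as handle:
        rows = pd.Series(handle.read().splitlines(), dtype="object")
    cells = (
        rows.str.strip()
        .str.strip('"')   # remove quotes wrapping a whole line
        .str.split(",")   # one list of cells per row
        .explode()        # one cell per entry
        .str.strip()      # drop stray spaces around each cell
    )
    cells = cells[cells.notna() & (cells != "")]
    return cells.drop_duplicates().reset_index(drop=True).to_frame(name="cell")
```

The resulting frame can be written out with frame.to_csv("output.txt", index=False, header=False) if one cell per line is wanted.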
DataFrame Output
Save as Files
Note: Results from options 1 & 2 can also be converted to unordered pandas columnar objects with pd.Series(list(items)).
Finally: As a Script
Save any of the three options above in a function (stack) within a file (named restack.py). Save this script to a directory.
From its working directory, run the script from the command line and answer the prompts.
Your results should print in your console and optionally save to a file, output.txt. Adjust any parameters to suit your interests.
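The script described above could be sketched along these lines. Only the names stack, restack.py and output.txt come from the text. The prompt wording, the main function and its ask parameter are assumptions made for illustration, and the body of stack uses plain string handling, not any specific one of the three options.

```python
def stack(path):
    """Collect unique comma-separated cells from a text file, in first-seen order."""
    cells = []
    with open(path) as handle:
        for line in handle:
            # Drop the newline and any quotes wrapping the whole line.
            line = line.strip().strip('"')
            cells.extend(cell.strip() for cell in line.split(","))
    # dict.fromkeys keeps the first occurrence of each key (Python 3.7+).
    return [cell for cell in dict.fromkeys(cells) if cell]


def main(ask=input):
    """Prompt for a file name, print the unique cells, and optionally save them."""
    path = ask("File to read: ")
    items = stack(path)
    print("\n".join(items))
    if ask("Save to output.txt? (y/n): ").strip().lower() == "y":
        with open("output.txt", "w") as handle:
            handle.write("\n".join(items) + "\n")
    return items


# To run restack.py from the command line, end the file with:
# if __name__ == "__main__":
#     main()
```

Passing the prompt function in as ask keeps main easy to exercise without a keyboard. From the command line, python restack.py would then ask the two questions in turn.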