Input files:
file1.txt
danial,23,janitor
adam,42,waiter
katherine,21,teacher
file2.txt
danial,5,broadway street
brooke,4,hughway street
adam,3,new street
Desired output:
danial,23,janitor,5,broadway street
adam,42,waiter,3,new street
katherine,21,teacher
brooke,4,hughway street
My current code:
with open('C:\\Users\\user\\Desktop\\Dap\\job.txt') as f1, open('C:\\Users\\user\\Desktop\\Dap\\address.txt') as f2:
    # name -> (age, job) from the first file
    jobs = {}
    for line in f1:
        name, age, job = line.split(',')
        jobs[name] = age, job
    # name -> (number, street) from the second file
    addresses = {}
    for line in f2:
        name2, num, address = line.split(',')
        addresses[name2] = num, address
# only the names that appear in both files
common = set(jobs.keys()) & set(addresses.keys())
with open('C:\\Users\\Izz\\Desktop\\Data\\output.txt', 'w') as f:
    for i in common:
        f.write("%s\t%s\t%s\n" % (i, jobs[i], addresses[i]))
Edit:
With this code I build two dictionaries that use the first column as the key, but I only manage to print the entries whose keys appear in both files. The names that appear in just one file are missing from the output.
This seems to do what you want:
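A minimal sketch consistent with the description below. The function name merge_files and its three path parameters are assumptions made for illustration, not part of the original post:

```python
from collections import defaultdict
from itertools import chain


def merge_files(path1, path2, out_path):
    """Merge two comma-separated files on their first column (hypothetical helper)."""
    tmp = defaultdict(list)  # name -> list of attributes collected so far
    with open(path1) as f1, open(path2) as f2:
        for l in chain(f1, f2):       # every line of f1, then every line of f2
            l = l.strip()             # drop the trailing newline
            if not l:                 # skip blank lines
                continue
            name, a, b = l.split(',')
            tmp[name] += (a, b)

    with open(out_path, 'w') as out:
        lines = [','.join((k, *v)) + '\n' for k, v in tmp.items()]
        out.writelines(lines)
```

It could be called as merge_files('file1.txt', 'file2.txt', 'output.txt'). Names keep the order in which they are first seen, so people who appear only in the second file end up at the bottom.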
Description:
We first create tmp, a defaultdict, to store the various attributes (age, occupation, ...) that each person might have. The defaultdict creates an empty list for us whenever we access a key for the first time. This allows us to do tmp[name] += (a, b) without first checking whether name already exists (and creating a new list if it does not), which improves readability.
Have a look at the itertools.chain(l1, l2, ...) documentation for an explanation of that function, as the example provided there is pretty concise.
Iterating through f1 and f2 will yield each line of the file, including any newlines, so we first use l = l.strip() to strip those off before continuing.
If your input file has blank lines, then if not l: continue checks whether l is the empty string, '' (which evaluates to False), and skips the line if it is. We could alternatively have nested the rest of the loop body under the opposite check, if l:. However, this is slightly worse form: writing your code as if everything goes as planned, and adding if statements only to handle the exceptional cases, improves its readability.
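To make these pieces concrete, here is a small sketch that uses in-memory lists as stand-ins for the two open files. The sample data and the names seen and seen_nested are assumptions for illustration only:

```python
from collections import defaultdict
from itertools import chain

# Stand-ins for f1 and f2: iterating a text file yields lines like these.
f1 = ["danial,23,janitor\n", "\n", "adam,42,waiter\n"]
f2 = ["danial,5,broadway street\n"]

# defaultdict(list): a missing key gets an empty list on first access,
# so += works without checking whether the key exists.
tmp = defaultdict(list)
tmp["danial"] += ("23", "janitor")

# Guard-clause style: handle the exceptional case first, then carry on.
seen = []
for l in chain(f1, f2):   # all items of f1, then all items of f2
    l = l.strip()
    if not l:
        continue
    seen.append(l)

# Nested style: same result, but the main path sits one level deeper.
seen_nested = []
for l in chain(f1, f2):
    l = l.strip()
    if l:
        seen_nested.append(l)
```

Both loops collect the same three non-blank lines. The only difference is where the main work sits relative to the check.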
We now split each line into its three components with l.split(',') and unpack the result into the variables name, a and b, assuming that the format of your input file will always be the person's name followed by two arbitrary attributes, delimited by commas. (If you're unsure how tuple unpacking works, this seems to provide a good introduction to tuples in general, including unpacking.)
Since a list can be extended in place with +=, we then append our person's attributes a and b to tmp[name] by doing tmp[name] += (a, b).
The last step, now that the tmp dictionary has been constructed with everyone's names and attributes, is to write it to our output file.
Here we use a list comprehension to format our output (if you're unsure of this too, have a look at the linked documentation). If you're unfamiliar with the * operator, it is used here to unpack v, the list of attributes for the person named k (see the linked documentation).
Then ','.join(lst) combines the strings in lst (in this case (k, *v)) into one string, with each value separated by ','.
Finally, we add a newline to the end of each line, since out.writelines(lines) doesn't add them for us, and we write our lines to the file with writelines().
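The individual steps above can be tried in isolation. A short sketch, where the sample line and the variable names attrs and row are assumptions for illustration:

```python
line = "danial,23,janitor"

# Unpacking: split gives a list of three strings, assigned in order.
name, a, b = line.split(',')

# += extends a list with the items of any iterable, here a tuple.
attrs = []
attrs += (a, b)

# Pretend the second file contributed two more attributes.
k, v = name, attrs + ["5", "broadway street"]

# (k, *v) unpacks the list v into a single tuple after k;
# join then glues the strings together with commas.
row = ','.join((k, *v)) + '\n'
```

Note that join only accepts strings, which holds here because every field comes straight from split.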