I'm very new to Python and I'm having trouble with an assignment that basically goes like this:
# Read a WARC file line by line to identify string1.
# When string1 is found, add part of the string as a key to a dictionary.
# Then continue reading the file to identify string2, and add part of string2 as the value for the previous key.
# Keep going through the file and doing the same to build the dictionary.
I can't import anything, which is causing me a bit of trouble, especially with adding the key, leaving the value empty, and then continuing through the file to find string2 to use as the value.
I've started thinking about saving the key to an intermediate variable, then going on to identify the value, saving it to another intermediate variable, and finally building the dictionary.
def main():
    # Open the file, decode it, and split it into lines
    file = open("warc_file.warc", "rb")
    filetxt = file.read().decode('ascii', 'ignore')
    file.close()
    filedata = filetxt.split("\r\n")
    dictionary = dict()
    for line in filedata:
        if "WARC-Type: response" in line:
            break
    for line in filedata:
        if "WARC-Target-URI: " in line:
            # Slice off the header name; str.strip() removes a set of
            # characters from both ends, not a prefix
            urlkey = line[len("WARC-Target-URI: "):]
Your idea of storing the key in an intermediate variable is good.
I also suggest using the following snippet to iterate over the lines.
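As a rough sketch of what iterating over the lines could look like without any imports: the helper name decoded_lines is hypothetical, and the ASCII decoding with errors ignored simply mirrors the question's code. It accepts any iterable of byte strings, so an open binary file can be passed in directly.

```python
def decoded_lines(raw_lines):
    """Yield each raw byte line as text, without its trailing line ending."""
    for raw_line in raw_lines:
        # Ignore undecodable bytes, as the question's code does,
        # then drop the trailing "\r\n" (or "\n") of the line
        yield raw_line.decode("ascii", "ignore").rstrip("\r\n")


# Usage with a real file (file name taken from the question):
# with open("warc_file.warc", "rb") as warc_file:
#     for line in decoded_lines(warc_file):
#         print(line)
```

Iterating over the file object reads one line at a time, so the whole file does not have to be loaded with read() first.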
To create dictionary entries in Python, the dict.update() method can be used. It allows you to create new keys or update values if the key already exists.
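Putting the intermediate key variable and dict.update() together might look like the sketch below. The question does not say what string2 is, so "Content-Length: " is used purely as a stand-in assumption, and the names build_dictionary, URI_PREFIX and VALUE_PREFIX are hypothetical.

```python
URI_PREFIX = "WARC-Target-URI: "   # string1, taken from the question
VALUE_PREFIX = "Content-Length: "  # assumption: placeholder for string2


def build_dictionary(lines):
    """Pair each key line with the next value line that follows it."""
    dictionary = {}
    current_key = None  # intermediate variable: a key still waiting for its value
    for line in lines:
        if line.startswith(URI_PREFIX):
            # Remember the key, but do not store anything yet
            current_key = line[len(URI_PREFIX):]
        elif current_key is not None and line.startswith(VALUE_PREFIX):
            # The value has arrived: create (or overwrite) the entry
            dictionary.update({current_key: line[len(VALUE_PREFIX):]})
            current_key = None  # wait for the next key
    return dictionary


sample = [
    "WARC-Type: response",
    "WARC-Target-URI: http://example.com/",
    "Content-Length: 42",
    "",
    "WARC-Type: response",
    "WARC-Target-URI: http://example.net/page",
    "Content-Length: 7",
]
print(build_dictionary(sample))
# {'http://example.com/': '42', 'http://example.net/page': '7'}
```

For a single entry, dictionary[current_key] = value has the same effect as the update() call. Resetting current_key to None after each match means a value line is only used once, for the key that came right before it.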