Problem with end-of-line character and .rstrip() in Python


I am doing a Python exercise where I have to write a program that randomly generates a password according to the following criteria:

  1. It has to be formed from two randomly chosen words, taken from a huge text file with one word on each line.

  2. Each word must have at least 3 letters and the password must be either 8, 9, or 10 letters in total.

  3. The words are concatenated (with no space) and the first letter of each is capitalised.

My code is below:

# Randomly generate a two-word password from a text file of words, subject to various conditions

from sys import exit
from random import choice

print()

try:
    inf=open("words.txt","r")
except FileNotFoundError:
    print("Error: words.txt not found.")
    print("Quitting...")
    exit()

words=inf.readlines()
for word in words:
    word=word.rstrip()
    word=word.lower()

first=choice(words)
words.remove(first)
second=choice(words)

while len(first)<=2 or len(second)<=2 or len(first)+len(second) not in [8,9,10]:
    first=choice(words)
    words.remove(first)
    second=choice(words)

first=first[0].upper()+first[1:]
second=second[0].upper()+second[1:]
password=first+second

print(password)

Whilst attempting this exercise, I observed that each of the words in the password seems to have one extra character at the end, and that this is the end-of-line character '\n'. However, I have included the line word=word.rstrip() and have read that this should remove all trailing spaces, tabs and end-of-line characters. Yet the variables first and second both still have '\n' at the end, even though they are randomly chosen from the list words after .rstrip() has been applied. What is going on here? I'm sure I've missed something.


There are 2 answers

manelcosio On BEST ANSWER
words=inf.readlines()
for word in words:
    word=word.rstrip()
    word=word.lower()

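The issue is that, in the loop I quoted above, you are rebinding the loop variable word rather than modifying the list words. Strings are immutable, so word.rstrip() returns a new string and the list still holds the original lines, newline included. A simple solution, in my opinion, is a list comprehension:

words = [word.rstrip().lower() for word in words]

This way the new list holds the stripped, lowercased words with no trailing whitespace.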
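To see the difference concretely, here is a minimal sketch that uses a small in-memory list in place of words.txt (the sample words are made up for illustration). Rebinding the loop variable leaves the list untouched, whereas the list comprehension builds a new, cleaned list:

```python
# Hypothetical sample standing in for the result of inf.readlines()
lines = ["Apple\n", "banana\n", "Cherry\n"]

# Rebinding the loop variable only changes the local name `word`;
# the strings stored in `lines` are left exactly as they were.
for word in lines:
    word = word.rstrip().lower()

unchanged = list(lines)

# A list comprehension builds a new list of cleaned strings.
cleaned = [word.rstrip().lower() for word in lines]

print(unchanged)  # ['Apple\n', 'banana\n', 'Cherry\n']
print(cleaned)    # ['apple', 'banana', 'cherry']
```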

SIGHUP On

One of the simplest ways to consume a line-oriented file without trailing whitespace is to apply str.rstrip to each line with map().
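As a small illustration of that idea, the sketch below uses io.StringIO as a stand-in for an open text file, so it runs without words.txt (the contents are invented for the example):

```python
import io

# io.StringIO stands in for an open text file; the contents are made up.
fake_file = io.StringIO("alpha\nbeta  \ngamma\n")

# Iterating over a file yields lines that still end in "\n";
# map(str.rstrip, ...) strips trailing whitespace lazily, line by line.
stripped = list(map(str.rstrip, fake_file))

print(stripped)  # ['alpha', 'beta', 'gamma']
```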

In order to fulfil the output criteria, you have a while loop that could, theoretically, run forever depending on the file's contents.

A better approach is to build a dictionary keyed on word length, where each value is a list of the words of that length. This removes the need for the retry loop, making the program more reliable and, almost certainly, faster.

from random import randint, choice
from collections import defaultdict

WORDSFILE = "words.txt"

db = defaultdict(list)

with open(WORDSFILE) as words:
    # load the dictionary with words of length 3 to 7
    for word in map(str.rstrip, words):
        lw = len(word)
        if 3 <= lw <= 7:
            db[lw].append(word)
    # check every length from 3 to 7, including lengths absent from the file
    for k in range(3, 8):
        db[k] = list(set(db[k]))
        if len(db[k]) < 2:
            raise Exception("There must be at least 2 unique words of each length")
    # choose the total length
    full_length = randint(8, 10)
    # choose a length for the first word
    first_len = randint(3, full_length-3)
    # calculate length of second word
    second_len = full_length - first_len
    # random choice of the first word
    first_word = choice(db[first_len])
    # ensure that the chosen word cannot be duplicated
    if first_len == second_len:
        db[first_len].remove(first_word)
    # random choice of the second word
    second_word = choice(db[second_len])
    password = first_word.capitalize() + second_word.capitalize()
    print(password)
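For experimenting without a words.txt file, the same idea can be wrapped in a function that accepts any iterable of lines. This is only a sketch and a variation on the code above, not a drop-in replacement: the name make_password and its parameters are hypothetical, it lowercases the words as the question does, and instead of requiring two words of every length it only raises when no valid pair of lengths exists at all.

```python
from collections import defaultdict
from random import choice


def make_password(lines, min_len=3, total_lengths=(8, 9, 10)):
    """Build a two-word password from an iterable of lines (hypothetical helper)."""
    # Group the unique, lowercased words by their length.
    by_length = defaultdict(set)
    for word in map(str.rstrip, lines):
        if len(word) >= min_len:
            by_length[len(word)].add(word.lower())

    # Collect every (first_len, second_len) pair that can actually be filled.
    pairs = []
    for a in by_length:
        for b in by_length:
            if a + b in total_lengths:
                # Two distinct words are needed when both lengths are equal.
                if a != b or len(by_length[a]) >= 2:
                    pairs.append((a, b))
    if not pairs:
        raise ValueError("no pair of words satisfies the length criteria")

    first_len, second_len = choice(pairs)
    first = choice(sorted(by_length[first_len]))
    # Exclude the first word so the same word is never used twice.
    second = choice(sorted(by_length[second_len] - {first}))
    return first.capitalize() + second.capitalize()


# Made-up sample lines, each ending in a newline as readlines() would return.
sample = ["cat\n", "horse\n", "tiger\n", "sun\n", "planet\n"]
password = make_password(sample)
print(password)
```

Because only fillable length pairs are considered, there is no loop that could run indefinitely; the trade-off is that the choice is uniform over length pairs rather than over all valid word pairs.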