How can I change the output of this w-shingling function to be all lower-case?


I am trying to write a Python function that returns the w-shingling of a file for a given shingle width, w, and I would like the strings in the shingled list to be all lower case.

I have tried putting [c.lower() for c in inputFile] inside the function, and similar things, but the output does not change.

import io

sample_text = io.StringIO("This is a sample text. It is a ordinary string but simulated to act as the contents of a file")


def wShingleOneFile(inputFile, w): 
    for line in inputFile:
      words = line.split() 
      [c.lower() for c in inputFile]
      return [words[i:i + w] for i in range(len(words) - w + 1)]

print(wShingleOneFile(sample_text, 3))

This is the output when printed:


[['This', 'is', 'a'], ['is', 'a', 'sample'], ['a', 'sample', 'text.'], ['sample', 'text.', 'It'], ['text.', 'It', 'is'], ['It', 'is', 'a'], ['is', 'a', 'ordinary'], ['a', 'ordinary', 'string'], ['ordinary', 'string', 'but'], ['string', 'but', 'simulated'], ['but', 'simulated', 'to'], ['simulated', 'to', 'act'], ['to', 'act', 'as'], ['act', 'as', 'the'], ['as', 'the', 'contents'], ['the', 'contents', 'of'], ['contents', 'of', 'a'], ['of', 'a', 'file']]

But I would like all of these letters to be lowercase.

There is 1 answer

mossherder answered:

Change line.split() to line.lower().split(), so that each line is lowercased before it is split into words.
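For illustration, here is a minimal sketch of that one-line change applied to the function from the question. One thing beyond the answer is assumed: the original returns inside the for loop, so it only ever processes the first line. This sketch collects the shingles from every line instead, on the guess that this is what was intended. The shortened sample string is only there to keep the output readable.

```python
import io


def wShingleOneFile(inputFile, w):
    shingles = []
    for line in inputFile:
        # Lowercase the whole line first, then split it into words.
        words = line.lower().split()
        # Assumption: gather shingles from every line rather than
        # returning after the first one, as the original code does.
        shingles.extend(words[i:i + w] for i in range(len(words) - w + 1))
    return shingles


sample_text = io.StringIO("This is a sample text. It is a ordinary string")
print(wShingleOneFile(sample_text, 3)[:2])
# → [['this', 'is', 'a'], ['is', 'a', 'sample']]
```

For a single-line input such as the one in the question, this gives the same shingles as before, only lowercased.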

Also note that strings in Python are immutable: c.lower() returns a new string rather than changing c, and a list comprehension whose result is not assigned to anything is simply discarded. To make the approach from your sample work, you need to assign the result of [c.lower() ... inputFile] to a name (for example back to inputFile), and do that before the for loop rather than inside it.
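The immutability point can be sketched as follows. The function name wShingleLowerFirst and the variable lines are hypothetical, chosen only for this example. As a further assumption, the shingles are collected across all lines rather than returned after the first.

```python
import io

text = "This Is"
text.lower()             # returns a new string; the result is discarded
assert text == "This Is"
text = text.lower()      # rebinding the name keeps the lowercased copy
assert text == "this is"


def wShingleLowerFirst(inputFile, w):  # hypothetical name for this sketch
    # Lowercase every line up front and keep the result in a variable,
    # before the loop that builds the shingles.
    lines = [line.lower() for line in inputFile]
    shingles = []
    for line in lines:
        words = line.split()
        shingles.extend(words[i:i + w] for i in range(len(words) - w + 1))
    return shingles


print(wShingleLowerFirst(io.StringIO("It Is A File"), 2))
# → [['it', 'is'], ['is', 'a'], ['a', 'file']]
```

Note that iterating over a file object consumes it. That is another reason to run the comprehension once, before the loop, and then loop over the stored list.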