As a research case I have a literary novel with three main characters, each of whom has their own chapters. That is: the first chapter is for character X (Aaron), the second for character Y (Sigerius), the third for character Z (Joni), the fourth for character X, the fifth for character Y, the sixth for character Z, and so on. I want to count the number of words in all the chapters dedicated to character X, to character Y and to character Z.
This is the Python code I am currently working on with regards to the chapters of one specific character (Aaron):
from itertools import islice

with open(textfile, 'rt', encoding='utf-8') as f:
    # Computes the total word count of the file
    text = f.read()
    words = text.split()
    wordCount = len(words)
    print("The total word count is:", wordCount)

    # Aaron's chapters
    chapterAaron1 = islice(f, 0, 123)
    chapterAaron4 = islice(f, 223, 326)
    chapterAaron6 = islice(f, 639, 772)
    chapterAaron10 = islice(f, 1125, 1249)
    chapterAaron12 = islice(f, 1370, 1455)
    chapterAaron15 = islice(f, 1657, 1717)
    chapterAaron19 = islice(f, 2088, 2138)
    chaptersAaron = (chapterAaron1, chapterAaron4, chapterAaron6, chapterAaron10, chapterAaron12, chapterAaron12, chapterAaron15, chapterAaron19)

    # Computes the total word count of Aaron's chapters (does not work)
    wordsAaron = chaptersAaron.split()
    wordCountAaron = len(wordsAaron)
    print("The total word count of Aaron's chapters is:", wordCountAaron)
I have manually determined on which lines of the txt file the chapters of each character begin and end. I use islice to split the txt file into chapters (each contained between specific line numbers) so that I can count the words between those line numbers. However, I can't find the right way to use islice for this purpose. I get this error: AttributeError: 'tuple' object has no attribute 'split'. What I want is to store all chapters of a specific character in one variable (e.g. chaptersAaron), so that I can work with it, e.g. count the total number of words and search for occurrences of specific words in it.
- Does anyone have a suggestion with regards to the correct usage of islice for my purposes? Alternative options to split the text into chapters are also very welcome.
The solution should be:
The problem with your code example is that you mix iterators, lists and tuples.
islice(f, 1125, 1249)
is an iterator,
chaptersAaron = (chapterAaron1, ...)
is a tuple, and you want to use both as a list. The idea in my solution is to start with an empty list,
chaptersAaron = []
then transform each iterator into a list with
[elem for elem in islice(f, 0, 123)]
and concatenate the lists with
chaptersAaron += chapterAaron1
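Put together, the idea could look roughly like the sketch below. It is only an illustration under a few assumptions: the helper names collect_chapters and count_words and the tiny sample_lines list are made up for the example, and the lines are assumed to have been read into a list first. That last point matters because in the question f.read() has already consumed the file, so any islice(f, ...) taken afterwards would yield nothing, and successive islice calls on the same open file would count from the current position rather than from the top of the file.

```python
from itertools import islice


def collect_chapters(lines, ranges):
    """Return one flat list holding the lines of every (start, stop) range."""
    collected = []
    for start, stop in ranges:
        # islice over a list starts from the beginning on every call, so
        # start/stop are absolute 0-based line indices (stop is excluded).
        collected += list(islice(lines, start, stop))
    return collected


def count_words(lines):
    """Count whitespace-separated words across a list of lines."""
    return sum(len(line.split()) for line in lines)


# Tiny in-memory stand-in for the novel (hypothetical data)
sample_lines = [
    "Aaron walks home.\n",
    "Sigerius reads.\n",
    "Joni calls Aaron.\n",
    "Aaron sleeps late today.\n",
]
aaron_ranges = [(0, 1), (3, 4)]  # hypothetical line ranges for Aaron

chapters_aaron = collect_chapters(sample_lines, aaron_ranges)
word_count_aaron = count_words(chapters_aaron)
print("The total word count of Aaron's chapters is:", word_count_aaron)
```

For the real file, the list of lines could come from f.readlines() inside the with block, and the ranges would be the pairs from the question, e.g. (0, 123), (223, 326) and so on. Whether those numbers are 0-based and whether the end line is meant to be included is something to check against the text file, since islice excludes the stop index.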