I have a Python script that counts the occurrences of every substring of length k in a very large text. After collecting and deduplicating the substrings, it counts them like this:
counts = {}
for s in all_substrings:
    # str.count scans the whole text once per substring
    counts[s] = full_text.count(s)
I was surprised to see that this script uses only 4% CPU on average. I have a 4-core, 8-thread CPU, but no core goes above single-digit utilization. I would have expected the script to use 100% of one core, since it doesn't do any I/O.
Why does it use so little computing power, and how can I improve that?
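For comparison, here is a minimal sketch of one alternative way to do the counting: a single pass over the text with `collections.Counter`, instead of one full scan per substring. The function name `count_k_substrings` and the sample input are hypothetical, and this does not explain the low CPU usage by itself. Note that it is not a drop-in replacement: it counts overlapping occurrences, whereas `str.count` counts non-overlapping ones.

```python
from collections import Counter


def count_k_substrings(full_text: str, k: int) -> Counter:
    """Count every length-k window of full_text in a single pass.

    Hypothetical helper for illustration; overlapping occurrences
    are counted, unlike str.count.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Slide a window of width k across the text, one position at a time.
    # If k is longer than the text, the range is empty and so is the result.
    return Counter(full_text[i:i + k] for i in range(len(full_text) - k + 1))


counts = count_k_substrings("abababa", 3)
print(counts["aba"])              # overlapping count: 3
print("abababa".count("aba"))     # non-overlapping count: 2
```

Whether the overlapping or the non-overlapping count is the right one depends on what the counts are used for, so the two approaches should be compared on a small input first.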