The code works, but the output differs by about 4.5% from the statistically expected value


I'm trying to write a program from a tutorial; details below:

The code is supposed to simulate 100 coin tosses by generating a random sequence of T and H characters. It should then count how many streaks of identical consecutive tosses occur in that sequence. In this specific example I am checking for 6 identical tosses in a row.

import random

seq_count = 0
for _ in range(1000000):

    random_set = str()

    for toss in range(100):
        if random.randint(0, 1) == 0:
            random_set += "H"
        else:
            random_set += "T"

    count = 1
    last_toss = random_set[0]
    block = False

    for toss in random_set[1:]:
        if toss == "H":
            if last_toss == "H":
                if not block:
                    count += 1
            else:
                count = 1
                block = False
            last_toss = toss
        else:
            if last_toss == "T":
                if not block:
                    count += 1
            else:
                count = 1
                block = False
            last_toss = toss

        if count < 6 or block:
            continue
        else:
            seq_count += 1
            block = True

print(seq_count)
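For comparison, the counting step can also be sketched with `itertools.groupby`. This is only a minimal cross-check, not the tutorial's method; the helper name `count_streaks` and its `min_len` parameter are made up for illustration. Like the loop above, it counts each maximal run of identical tosses once if it is at least 6 long.

```python
from itertools import groupby


def count_streaks(tosses, min_len=6):
    """Count maximal runs of identical tosses that are at least min_len long.

    `count_streaks` is a hypothetical helper, not part of the tutorial.
    """
    # groupby splits the sequence into consecutive runs of equal characters.
    return sum(1 for _, run in groupby(tosses) if len(list(run)) >= min_len)


# A run of 12 heads counts once, the same way the `block` flag handles it above.
print(count_streaks("HHHHHHHHHHHH"))
print(count_streaks("HHHHHHTTTTTTHT"))
```

Running both versions on the same `random_set` should give the same count for each sequence, which would help rule out a bug in the counting loop.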

I have tested it with various test samples and it seems to be fine. However, statistically the rate is supposed to be 1/64 = 0.0156, but in my case it consistently produces a result of 0.0149.
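A rough check of the expected value, under one assumption: that the 0.0149 figure is `seq_count` divided by the total number of tosses (1,000,000 × 100). A streak of 6 or more can start at the first toss with probability 1/32, or at any later position with probability 1/64 (the previous toss must differ and the next five must match), but it cannot start in the last five positions. The sketch below adds these up exactly; `expected_streaks` is a hypothetical helper name.

```python
from fractions import Fraction


def expected_streaks(n, k):
    """Expected number of maximal runs of length >= k in n fair coin tosses.

    Hypothetical helper for illustration; assumes independent fair tosses.
    """
    # A run starting at toss 1 only needs the next k-1 tosses to match it.
    first = Fraction(1, 2 ** (k - 1))
    # A run starting at toss 2 .. n-k+1 also needs the previous toss to differ.
    later = (n - k) * Fraction(1, 2 ** k)
    return first + later


per_sequence = expected_streaks(100, 6)
per_toss = per_sequence / 100
print(per_sequence)      # 3/2
print(float(per_toss))   # 0.015
```

Under that assumption the expected rate would be 0.015 per toss rather than 1/64 ≈ 0.0156, which is close to the observed 0.0149 and may account for most of the gap.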

Any hint would be appreciated.
