While building the LM binary to create a scorer for a DeepSpeech model, I kept getting the following error:

subprocess.CalledProcessError: Command '['/content/kenlm/build/bin/build_binary', '-a', '255', '-q', '8', '-v', 'trie', '/content/lm_filtered.arpa', '/content/lm.binary']' returned non-zero exit status 1.

The command I was using is:

!python /content/DeepSpeech/data/lm/generate_lm.py \
--input_txt /content/transcripts.txt \
--output_dir /content/scorer/ \
--top_k 50000 \
--kenlm_bins /content/kenlm/build/bin/ \
--arpa_order 5 --max_arpa_memory "95%" --arpa_prune "0|0|1" \
--binary_a_bits 255 --binary_q_bits 8 --binary_type trie
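
`CalledProcessError` only reports the exit status; the actual reason is whatever `build_binary` wrote to stderr. A minimal sketch for surfacing that message by rerunning the failing command on its own is shown here. `run_and_capture` is a hypothetical helper name, not part of DeepSpeech or KenLM.

```python
import subprocess


def run_and_capture(cmd):
    """Run cmd and return (exit_code, stderr_text) instead of raising.

    Hypothetical helper for debugging; not part of DeepSpeech or KenLM.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
    )
    return result.returncode, result.stderr
```

Calling it with the list from the traceback, for example `run_and_capture(['/content/kenlm/build/bin/build_binary', '-a', '255', '-q', '8', '-v', 'trie', '/content/lm_filtered.arpa', '/content/lm.binary'])`, and printing the second element should show the message that the exit status alone hides.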

There is 1 answer

Danish Bansal On BEST ANSWER

The following worked for me. Go to

DeepSpeech -> data -> lm -> generate_lm.py

Now find the following block of code inside it:

subprocess.check_call(
    [
        os.path.join(args.kenlm_bins, "build_binary"),
        "-a",
        str(args.binary_a_bits),
        "-q",
        str(args.binary_q_bits),
        "-v",
        args.binary_type,
        filtered_path,
        binary_path,
    ]
)

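Tweak the code by adding the "-s" flag to the argument list, as below. In KenLM's build_binary, "-s" allows a model to be built even if it does not contain <s> and </s>.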

subprocess.check_call(
    [
        os.path.join(args.kenlm_bins, "build_binary"),
        "-a",
        str(args.binary_a_bits),
        "-q",
        str(args.binary_q_bits),
        "-v",
        args.binary_type,
        filtered_path,
        binary_path,
        "-s",
    ]
)

Now your command should run fine.
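
Since "-s" exists for models that lack the sentence markers <s> and </s>, one way to check whether that applies to your data is to look for them in the unigram section of the filtered ARPA file. The thread does not show the stderr output, so treating missing markers as the cause is an assumption. The sketch below is a rough check with a hypothetical function name, assuming a standard ARPA layout with a `\1-grams:` section.

```python
def missing_sentence_markers(arpa_path):
    """Return which of "<s>" and "</s>" are absent from the unigram section.

    Hypothetical helper; assumes a standard ARPA file where each unigram
    line looks like: log_prob <whitespace> word [<whitespace> backoff].
    """
    wanted = {"<s>", "</s>"}
    seen = set()
    in_unigrams = False
    with open(arpa_path, encoding="utf-8") as f:
        for raw in f:
            line = raw.strip()
            if line == "\\1-grams:":
                in_unigrams = True
                continue
            if in_unigrams:
                if line.startswith("\\"):
                    # Reached the next section (\2-grams: or \end\).
                    break
                fields = line.split()
                if len(fields) >= 2 and fields[1] in wanted:
                    seen.add(fields[1])
    return wanted - seen
```

If `missing_sentence_markers("/content/lm_filtered.arpa")` returns a non-empty set, the transcripts probably produced a model without those markers, which is the situation "-s" is meant for.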