weird awk outputs in reading/writing file

Question

74 views Asked by chuhui chen At 07 December 2020 at 18:55

I'm working on a Kaldi project based on the existing example that uses the Tedlium dataset. Every step works well until the clean-up stage, where I hit a length mismatch. After examining all the scripts, I found that the issue is in lattice_oracle_align.sh.

Reference: https://github.com/kaldi-asr/kaldi/blob/master/egs/wsj/s5/steps/cleanup/lattice_oracle_align.sh

I believe the issue is line 142.

  awk '{if ($2 == "#csid") print $1" "($4+$5+$6)}' $dir/analysis/per_utt_details.txt > $dir/edits.txt

The above line should read per_utt_details.txt line by line, and every time it reads a #csid line it should write a line to edits.txt. The text in per_utt_details.txt looks like this:

     ref
     hyp
     op
     #csid 0 0 0 0
     ...repeat the above 4 lines.

There are 1073046 lines in per_utt_details.txt. I expect 268262 lines in edits.txt. However, only 48746 lines exist in edits.txt.
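One way to narrow this down is to count, for each field position, how many lines carry the `#csid` tag. The sketch below runs on a small made-up sample (the utterance IDs, words, and counts are hypothetical, not taken from the real file). Pointing the same awk command at `$dir/analysis/per_utt_details.txt` should show whether the tag always sits in the field that the script tests.

```shell
# Hypothetical sample: two utterances with an ID in field 1,
# and one block whose lines have no utterance ID at all.
sample=$(mktemp)
cat > "$sample" <<'EOF'
utt1 ref a b c
utt1 hyp a b d
utt1 op C C S
utt1 #csid 2 1 0 0
utt2 ref e f
utt2 hyp e
utt2 op C D
utt2 #csid 1 0 0 1
ref g h
hyp g h
op C C
#csid 2 0 0 0
EOF

# For every line, record which field (if any) equals "#csid",
# then print one count per field position.
field_counts=$(awk '{
    for (i = 1; i <= NF; i++)
        if ($i == "#csid") { pos[i]++; break }
}
END {
    for (i in pos) print "field " i ": " pos[i]
}' "$sample" | sort)

echo "$field_counts"
rm -f "$sample"
```

If the counts are split across more than one field position, a test on a single fixed field such as `$2` will silently skip some of the `#csid` lines.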

There is 1 answer

**RavinderSingh13** · Accepted Answer · 2020-12-07T19:04:11+00:00

Judging by your samples, I believe you want to compare the 1st field, not the 2nd field (which is what your code does). If that is the case, try the following, where I have changed $2 to $1 so that the comparison is against the 1st field.

awk '($1 == "#csid"){print $1,($4+$5+$6)}' per_utt_details.txt > edits.txt
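As a quick check of this suggestion, the sketch below applies a first-field match to a made-up file in the layout shown in the question (the words and counts are hypothetical). It assumes the four numbers after `#csid` are in the order the tag name suggests: correct, substitutions, insertions, deletions. Under that assumption, when the tag is in field 1 the counts sit in `$2` through `$5`, so the edit total would be `$3+$4+$5` rather than `$4+$5+$6`. This is worth verifying against your own file.

```shell
# Hypothetical sample in the question's layout (no utterance ID column).
sample=$(mktemp)
cat > "$sample" <<'EOF'
ref a b c
hyp a x c d
op C S C I
#csid 2 1 1 0
ref e f
hyp e
op C D
#csid 1 0 0 1
EOF

# Tag in field 1: c=$2, s=$3, i=$4, d=$5, so edits = s + i + d.
edits=$(awk '$1 == "#csid" { print $1, ($3 + $4 + $5) }' "$sample")

echo "$edits"
rm -f "$sample"
```

Each `#csid` line in the sample produces exactly one output line, so the number of output lines should match the number of four-line blocks in the input.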
