I am trying to parse selected data from a file based on certain keywords using Tcl. For example, I have a file like this:
...
...
..
...
data_start
30 abc1 xyz
90 abc2 xyz
214 abc3 xyz
data_end
...
...
...
How do I catch only the 30, 90 and 214 between "data_start" and "data_end"? This is what I have so far (I am a Tcl newbie):
proc get_data_value{ data_file } {
    set lindex 0
    set fp [open $data_file r]
    set filecontent [read $fp]
    while {[gets $filecontent line] >= 0} {
        if { [string match "data_start" ]} {
            #Capture only the first number?
            #Use regex? or something else?
            if { [string match "data_end" ] } {
                break
            } else {
                ##Do Nothing?
            }
        }
    }
    close $fp
}
If your file is small, you can use the read command to slurp the whole file into a variable and then apply regexp to extract the required information.
input.txt
extractNumbers.tcl
Output :
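A minimal sketch of the read-plus-regexp approach (not necessarily the original extractNumbers.tcl) might look like the following. The proc name extract_numbers_slurp is hypothetical, and the sketch assumes that the data_start and data_end markers each sit on their own line and that the wanted number is the first field of each line between them.

```tcl
# Hypothetical helper: return the leading number of every line found
# between the data_start and data_end markers, reading the file in one go.
proc extract_numbers_slurp {data_file} {
    set fp [open $data_file r]
    set content [read $fp]
    close $fp

    # Isolate the text between the two marker lines. Without -line,
    # "." also matches newlines, so (.*?) can span several lines.
    if {![regexp {data_start\n(.*?)\ndata_end} $content -> block]} {
        return {}
    }

    # -line lets ^ match at the start of each line; -inline returns the
    # matches as a flat list of {whole-match captured-number} pairs.
    set numbers {}
    foreach {whole num} [regexp -all -inline -line {^[ \t]*(\d+)} $block] {
        lappend numbers $num
    }
    return $numbers
}
```

For the sample file shown in the question, puts [extract_numbers_slurp input.txt] would print 30 90 214.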
Update :
Reading a large file's content into a single variable can crash the program, depending on how much memory your PC has. When I tried to read an 890 MB file with the read command on a Win7 laptop with 8 GB of RAM, I got an "unable to realloc 531631112 bytes" error message and tclsh crashed. After some benchmarking, I found that it was able to read a file of 500,015,901 bytes. But the program will then consume 500 MB of memory, since it has to hold the data. Also, keeping this much data in one variable is not efficient when it comes to extracting the information via regexp. So, in such cases, it is better to read the content line by line. Read more about this here.
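The line-by-line approach can be sketched as follows. This is only an illustration under the same assumptions as the question's sample (markers on their own lines, number as the first field); the proc name extract_numbers_by_line is hypothetical.

```tcl
# Hypothetical helper: collect the leading number of each line between
# data_start and data_end, holding only one line in memory at a time.
proc extract_numbers_by_line {data_file} {
    set fp [open $data_file r]
    set in_block 0
    set numbers {}

    # gets reads from the channel itself, not from a string of file content.
    while {[gets $fp line] >= 0} {
        set line [string trim $line]
        if {$line eq "data_start"} {
            set in_block 1
        } elseif {$in_block && $line eq "data_end"} {
            # Stop early; the rest of the file is never read.
            break
        } elseif {$in_block && [regexp {^\d+} $line num]} {
            lappend numbers $num
        }
    }
    close $fp
    return $numbers
}
```

Because the loop breaks at data_end, memory use stays roughly constant regardless of file size, and lines after the block are never scanned.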