Convert DOS od (dump) file output to txt


I have a .DAT file which is used by a DOS D-Fend Reloaded application. I know that the program somehow converts these characters into dates. Here is a selected part (taken from different lines) of the output of od -cv INFILE.DAT:

006  \0 002  \0   {  \a 024  \0  \b  \0 312  \a  \a  \0  \t  \0 312  \a   
 \a  \0 004  \0 222  \a 037  \0  \b  \0 312  \a 020  \0  \t  \0 312  \a  
 \r  \0 004  \0 250  \a  \a  \0  \t  \0 334  \a 023  \0  \t  \0 334  \a

but I have no idea how to turn this back into plain text.


There is 1 answer

Answer by Mark Setchell

Your data is an octal dump from the od utility. You can create a dump in the same format, of all the possible values of a byte, from 0..255, using this:

perl -e 'foreach $i (0..255) { print chr($i) }' | od -cv -An > octal.txt

That looks like this:

  \0 001 002 003 004 005 006  \a  \b  \t  \n  \v  \f  \r 016 017
 020 021 022 023 024 025 026 027 030 031 032 033 034 035 036 037
       !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
   0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
   @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
   `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
   p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~ 177
 200 201 202 203 204 205 206 207 210 211 212 213 214 215 216 217
 220 221 222 223 224 225 226 227 230 231 232 233 234 235 236 237
 240 241 242 243 244 245 246 247 250 251 252 253 254 255 256 257
 260 261 262 263 264 265 266 267 270 271 272 273 274 275 276 277
 300 301 302 303 304 305 306 307 310 311 312 313 314 315 316 317
 320 321 322 323 324 325 326 327 330 331 332 333 334 335 336 337
 340 341 342 343 344 345 346 347 350 351 352 353 354 355 356 357
 360 361 362 363 364 365 366 367 370 371 372 373 374 375 376 377
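
The table follows three rules, and that is what makes the dump reversible: a few control bytes get C-style backslash escapes, printable ASCII bytes appear as themselves, and everything else is a 3-digit octal number, each right-aligned in a 4-character field. As a minimal Python sketch of that mapping, assuming the C-locale behaviour shown in the table (the names `od_c_token` and `od_c_row` are made up for illustration):

```python
# Control bytes that od -c prints as backslash escapes.
ESCAPES = {0: r"\0", 7: r"\a", 8: r"\b", 9: r"\t",
           10: r"\n", 11: r"\v", 12: r"\f", 13: r"\r"}

def od_c_token(byte: int) -> str:
    """Return the 4-character field used for one byte (assumes the C locale)."""
    if byte in ESCAPES:
        text = ESCAPES[byte]          # e.g. \0, \a, \t
    elif 0x20 <= byte <= 0x7E:
        text = chr(byte)              # printable ASCII, including the space itself
    else:
        text = f"{byte:03o}"          # everything else as 3 octal digits
    return text.rjust(4)

def od_c_row(data: bytes) -> str:
    """Join the fields for a run of bytes, as on one line of `od -c -An`."""
    return "".join(od_c_token(b) for b in data)

if __name__ == "__main__":
    # Reproduce the 16-column table, one row per 16 byte values.
    for start in range(0, 256, 16):
        print(od_c_row(bytes(range(start, start + 16))))
```

Note that the space character (byte 32) becomes a field of four blanks, which matters when splitting the dump back into fields.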

If you save the following gawk script as "func.awk", you can re-use it in any script to convert a single field of the dump (an escape, a plain character, or 3 octal digits) to a 2-digit hex value:

BEGIN { 
    # Initialise lookup tables

    # s[] is for values with backslashes, e.g. \0, \a, \t
    s["0"]="00"; s["a"]="07"; s["b"]="08"; s["t"]="09"; s["n"]="0a"; s["v"]="0b"; s["f"]="0c"; s["r"]="0d"

    # o[] is for values with 3 octal digits, e.g. 001, 002, 177, 200, 201
    for(i=1;i<=255;i++){
       key = sprintf("%03o",i); val = sprintf("%02x",i); o[key]=val
    }

    # ord[] allows us to lookup regular single characters
    for(n=0;n<256;n++){
       key = sprintf("%c",n); val = sprintf("%02x",n); ord[key] = val
    }
}

function oct2hex(c,    m) {                 # m is declared as a local variable
  gsub(/[ \t]+/, "", c)                     # trim any spaces
  m = length(c)

  if (m==0) { return "20" }                 # it was a space and has been trimmed away
  if (m==1) { return ord[c] }               # it is a regular, single letter
  if (m==2) { c=substr(c,2); return s[c] }  # it is a single back-slashed letter, e.g. \t, \a
  if (m==3) { return o[c] }                 # it is 3 octal digits
  print "Cannot decode field: " c > "/dev/stderr"   # keep diagnostics out of the hex output
  exit 1
}
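
For readers without gawk, the same per-field logic can be sketched in Python. This is an illustrative equivalent, not part of func.awk; the name `token_to_byte` is hypothetical, and unlike the awk version it raises an error for anything it does not recognise instead of returning an empty string:

```python
# Backslash escapes used by od -c, keyed by the letter after the backslash.
ESCAPES = {"0": 0x00, "a": 0x07, "b": 0x08, "t": 0x09,
           "n": 0x0A, "v": 0x0B, "f": 0x0C, "r": 0x0D}

def token_to_byte(field: str) -> int:
    """Convert one 4-character od -c field to its byte value."""
    tok = field.strip(" ")
    if tok == "":
        return 0x20                              # all blanks: the space character
    if len(tok) == 1:
        return ord(tok)                          # a regular single character
    if len(tok) == 2 and tok[0] == "\\" and tok[1] in ESCAPES:
        return ESCAPES[tok[1]]                   # a backslash escape such as \t
    if len(tok) == 3 and all(ch in "01234567" for ch in tok):
        value = int(tok, 8)                      # 3 octal digits
        if value <= 0o377:
            return value
    raise ValueError(f"cannot decode field {field!r}")

if __name__ == "__main__":
    for field in ["  \\0", " 312", "   {", "    "]:
        print(repr(field), "->", format(token_to_byte(field), "02x"))
```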

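You can now use gawk to process your octal file. The fields are separated by spaces, so ordinary whitespace splitting would lose the space character itself (byte value 32, which od prints as a blank field). To avoid that, I am using FIELDWIDTHS to split each line into 16 fixed-width fields of 4 characters: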

gawk -f func.awk -v FIELDWIDTHS="4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4" -e '
  { for(i=1;i<=NF;i++) print oct2hex($i) } ' YOURFILE
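
The reason for fixed-width fields is easier to see on a tiny example. The following Python sketch (helper names are made up for illustration) splits the same line both ways; the line encodes the three bytes 'A', space, 'B' as `od -c -An` would print them:

```python
def split_whitespace(line):
    """Split on runs of blanks, the way awk does by default."""
    return line.split()

def split_fixed(line, width=4):
    """Cut the line into consecutive fields of `width` characters."""
    return [line[i:i + width] for i in range(0, len(line), width)]

# 'A', a space, 'B': the middle field consists only of blanks.
line = "   A" + "    " + "   B"

print(split_whitespace(line))  # ['A', 'B']  (the space byte is lost)
print(split_fixed(line))       # ['   A', '    ', '   B']
```

Two caveats apply to any fixed-width approach: if the dump was produced without -An, each line starts with an offset column that has to be removed first, and if trailing blanks were stripped from a line (for example by copy and paste), a space byte at the end of that line cannot be recovered.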

That gives you your file as hex, one 2-digit value per line, which you can pipe into xxd -r -p to reconstruct your original binary file:

gawk -f func.awk -v FIELDWIDTHS="4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4" -e '
  { for(i=1;i<=NF;i++) print oct2hex($i) } ' YOURFILE | xxd -r -p > RECOVERED.BIN
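
If xxd is not available, the final step is simple to reproduce: the hex stream only needs its whitespace removed and each pair of digits packed into one byte. A minimal Python sketch, assuming the input is the one-value-per-line hex text produced above (the function name is hypothetical):

```python
def hex_lines_to_bytes(text: str) -> bytes:
    """Pack whitespace-separated 2-digit hex values into raw bytes."""
    return bytes.fromhex("".join(text.split()))

if __name__ == "__main__":
    print(hex_lines_to_bytes("41\n20\n42\n"))  # b'A B'
```

The result can then be written out with `open("RECOVERED.BIN", "wb").write(...)`.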

So, if I take the full 0..255 possible values in octal dump format from the very beginning of the answer, I can reconstruct it to binary like this:

gawk -f func.awk -v FIELDWIDTHS="4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4" -e '
  { for(i=1;i<=NF;i++) print oct2hex($i) } ' octal.txt | xxd -r -p > RECOVERED.BIN

If I then use xxd to dump that, you can see all the values are recovered:

xxd RECOVERED.BIN

00000000: 0001 0203 0405 0607 0809 0a0b 0c0d 0e0f  ................
00000010: 1011 1213 1415 1617 1819 1a1b 1c1d 1e1f  ................
00000020: 2021 2223 2425 2627 2829 2a2b 2c2d 2e2f   !"#$%&'()*+,-./
00000030: 3031 3233 3435 3637 3839 3a3b 3c3d 3e3f  0123456789:;<=>?
00000040: 4041 4243 4445 4647 4849 4a4b 4c4d 4e4f  @ABCDEFGHIJKLMNO
00000050: 5051 5253 5455 5657 5859 5a5b 5c5d 5e5f  PQRSTUVWXYZ[\]^_
00000060: 6061 6263 6465 6667 6869 6a6b 6c6d 6e6f  `abcdefghijklmno
00000070: 7071 7273 7475 7677 7879 7a7b 7c7d 7e7f  pqrstuvwxyz{|}~.
00000080: 8081 8283 8485 8687 8889 8a8b 8c8d 8e8f  ................
00000090: 9091 9293 9495 9697 9899 9a9b 9c9d 9e9f  ................
000000a0: a0a1 a2a3 a4a5 a6a7 a8a9 aaab acad aeaf  ................
000000b0: b0b1 b2b3 b4b5 b6b7 b8b9 babb bcbd bebf  ................
000000c0: c0c1 c2c3 c4c5 c6c7 c8c9 cacb cccd cecf  ................
000000d0: d0d1 d2d3 d4d5 d6d7 d8d9 dadb dcdd dedf  ................
000000e0: e0e1 e2e3 e4e5 e6e7 e8e9 eaeb eced eeef  ................
000000f0: f0f1 f2f3 f4f5 f6f7 f8f9 fafb fcfd feff  ................
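
Once the bytes are recovered, they still have to be interpreted as dates. One reading that fits the sample in the question is that each date is stored as three little-endian 16-bit integers: day, month, year. For example, `006 \0 002 \0 { \a` is the bytes 06 00 02 00 7b 07, which gives day 6, month 2 and year 0x077b = 1915. This is only an inference from the sample bytes, not something confirmed by the program, and since the sample was taken from different lines, the alignment inside the real file is unknown. A hedged Python sketch under that assumption (`decode_dates` is a made-up name):

```python
import struct
from datetime import date

def decode_dates(raw: bytes):
    """Read consecutive (day, month, year) triples of little-endian uint16.

    ASSUMPTION: this layout is guessed from the sample bytes in the question.
    """
    usable = len(raw) - len(raw) % 6          # ignore an incomplete trailing record
    return [date(year, month, day)
            for day, month, year in struct.iter_unpack("<3H", raw[:usable])]

# First sample line from the question, converted to raw bytes.
sample = bytes([0x06, 0x00, 0x02, 0x00, 0x7B, 0x07,
                0x14, 0x00, 0x08, 0x00, 0xCA, 0x07,
                0x07, 0x00, 0x09, 0x00, 0xCA, 0x07])

if __name__ == "__main__":
    for d in decode_dates(sample):
        print(d.isoformat())
    # 1915-02-06
    # 1994-08-20
    # 1994-09-07
```

`date()` raises a ValueError for impossible values such as month 13, which is a useful signal that the assumed layout or starting offset is wrong for a given part of the file.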