Dynamic continuous numbering in bash


I have a text file that acts as a database for my script. One of its columns is an "ID".

Each record has the format UID:Item Name:Quantity:Price:Date Added

cat FirstDB.txt

output:

0001:Fried Tarantula:45:100:2017-08-03
0002:Wasp Crackers:18:25:2017-08-04
0003:Century Egg:19:50:2017-08-05
0004:Haggis Flesh:20:90:2017-08-06
0005:Balut (Egg):85:15:2017-08-07
0006:Bear Claw:31:550:2017-08-08
0007:Durian Fruit:70:120:2017-08-09
0008:Live Cobra heart:20:375:2017-08-10
0009:Monkey Brains:30:200:2017-08-11
0010:Casu Marzu:25:1030:2017-08-12

The feature I'm creating lets a user add new entries to the text file in the same format (I have already implemented this). The real trick is that the user can also delete an item. For example, if the user deletes Century Egg, the file becomes:

0001:Fried Tarantula:45:100:2017-08-03
0002:Wasp Crackers:18:25:2017-08-04
0004:Haggis Flesh:20:90:2017-08-06
0005:Balut (Egg):85:15:2017-08-07
0006:Bear Claw:31:550:2017-08-08
0007:Durian Fruit:70:120:2017-08-09
0008:Live Cobra heart:20:375:2017-08-10
0009:Monkey Brains:30:200:2017-08-11
0010:Casu Marzu:25:1030:2017-08-12

If the user then adds an item, I would like it to take UID 0003, since that number is now free. How do I go about achieving this? I'm stuck at the moment. I believe awk could be useful here, but I'm open to other options. I'm pretty new to scripting and not very good with awk yet, so if your solution uses awk, please walk me through it as well. Thank you very much!


There are 2 answers

karakfa (best answer)

awk to the rescue!

Assuming the records are no longer in order after edits,

awk -F: '{a[$1+0]} END{for(i=1;i<=NR;i++) if(!(i in a)) print i}'

prints every number in 1..NR that is missing from the first column (the field is assumed to be numeric), so with a single gap it returns the first free number. The result is not zero-padded.

test

Create a shuffled list of zero-padded sequence numbers with "0003" missing:

awk 'BEGIN{for(i=1;i<=10;i++) printf "%04d\n",i}' | shuf | awk '$1!=3'

0009
0001
0006
0004
0002
0005
0008
0010
0007

pipe to the script

... | awk -F: '{a[$1+0]} END{for(i=1;i<=NR;i++) if(!(i in a)) print i}'

returns as expected

3

However, this prints nothing if your list has no gaps. To handle that case, you need to return the largest number + 1. With this modification, the test case and script become:

$ awk 'BEGIN{for(i=1;i<=10;i++) printf "%04d\n",i}' | 
  shuf | 
  awk -F: '{a[$1+0]} $1>max{max=$1} 
       END {for(i=1;i<=NR;i++) if(!(i in a)) {print i; exit} 
            print max+1}'

11
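The results above are unpadded (3, 11), while the UIDs in the file are four digits wide. A minimal sketch of the same idea that prints a padded UID and reads the file directly is shown here. The function name next_uid and the fixed width of 4 are assumptions for illustration, and it assumes every line is a record with a unique UID in the first field.

```shell
# next_uid FILE: print the lowest unused UID in FILE, zero-padded to 4 digits.
# Hypothetical helper; the name and the 4-digit width are assumptions.
next_uid() {
  awk -F: '
    { seen[$1 + 0] }                     # record each UID as a number (0003 -> 3)
    END {
      for (i = 1; i <= NR; i++)          # with unique UIDs, a gap must lie within 1..NR
        if (!(i in seen)) { printf "%04d\n", i; exit }
      printf "%04d\n", NR + 1            # no gap found: take the next number
    }' "$1"
}
```

With the FirstDB.txt from the question after Century Egg has been deleted, next_uid FirstDB.txt should print 0003. Because unique UIDs with no gap in 1..NR must be exactly 1..NR, the fallback can use NR + 1 and does not need to track the maximum.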

Note that if you sort the file after each record insertion, you can avoid much of this complexity.
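As a rough sketch of that idea, the record can be appended and the file re-sorted numerically on the first field. The helper name insert_sorted is hypothetical, and the record is assumed to be already formatted as UID:Item Name:Quantity:Price:Date Added.

```shell
# insert_sorted FILE RECORD: append RECORD to FILE, then re-sort FILE by UID.
# Hypothetical helper; RECORD is assumed to be a complete, colon-separated line.
insert_sorted() {
  printf '%s\n' "$2" >> "$1"
  # -t: splits on colons, -k1,1n sorts numerically on the UID field only;
  # -o lets sort write back to the file it is reading.
  sort -t: -k1,1n -o "$1" "$1"
}
```

For example, insert_sorted FirstDB.txt '0003:Century Egg:19:50:2017-08-05' would put the record back between 0002 and 0004. Once the file is always sorted, a free UID is simply the first line whose UID differs from its line number.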

Marc Lambrichs

If I understand the question correctly, you're looking for the first "free" number starting from the top. Something like:

$ awk -F: '{s=sprintf("%04d",NR)} s!=$1{print s; exit}' FirstDB.txt

could do what you want. I'm assuming here that no two clients add or delete at the same time, and that the file is kept sorted by UID. Note that it prints nothing when the file has no gap.

This can even be shortened to:

$ awk -F: '(s=sprintf("%04d",NR))!=$1{print s; exit}' FirstDB.txt
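Both one-liners print nothing when there is no gap, so a caller still has to handle the "append at the end" case. A possible sketch that falls back to the next number is below. It assumes the file is sorted by UID and that UIDs are 4 digits wide; first_free_uid is a hypothetical name.

```shell
# first_free_uid FILE: print the first free UID in a FILE sorted by UID.
# Hypothetical helper; assumes 4-digit, zero-padded UIDs in the first field.
first_free_uid() {
  awk -F: '
    # First line whose UID is not its own line number marks the gap.
    (s = sprintf("%04d", NR)) != $1 { print s; found = 1; exit }
    # exit still runs END, so only print the fallback when no gap was found.
    END { if (!found) printf "%04d\n", NR + 1 }
  ' "$1"
}
```

The found flag is needed because awk runs the END block even after exit is called in a main rule.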