I am dealing with a CRON job that places a text file with 9000 lines device names.
The job recreates the file every day with an updated list from a network crawler in our domain.
What I was running into is when I have the following worker running my import into my database the db.[name].id
kept growing with this method below
scheduler.py
# -*- coding: utf-8 -*-
from gluon.scheduler import Scheduler
def demo1():
db(db.asdf.id>0).delete()
db.commit()
with open('c:\(project)\devices.list') as f:
content = f.readlines()
for line in content:
db.asdf.insert(asdf = line)
db.commit()
mysched = Scheduler(db, tasks = dict(demo1 = demo1) )
default.py
(initial kickoff)
@auth.requires_membership('!Group-IS_MASTER')
def rgroup():
mysched.queue_task('demo1',start_time=request.now,stop_time = None,prevent_drift=True,repeats=0,period=86400)
return 'you are member of a group!'
So the next time the job kicked off it would start at db.[name].id = 9001
. So every day the ID number would grow by 9000 or so depending on the crawler's return. It just looked sloppy and I didn't want to run into issues years down the road with database limitations that I don't know about.
(I'm a DB newb (I know, I don't know stuff))
SOOOOOOO.....
This is what I came up with and I don't know if this is the best practice or not. And an issue that I ran into when using db.[name].drop()
in the same function that is creating entries is the db tables didn't exist and my job status went to 'FAILED'. So I defined the table in the job. see below:
scheduler.py
from gluon.scheduler import Scheduler
def demo1():
db.asdf.drop() #<=====Kill db.asdf
db.commit() #<=====Commit Kill
db.define_table('asdf',Field('asdf'),auth.signature ) #<==== Phoenix Rebirth!!!
with open('c:\(project)\devices.list') as f:
content = f.readlines()
for line in content:
db.asdf.insert(asdf = line)
db.commit() #<=========== Magic
mysched = Scheduler(db, tasks = dict(demo1 = demo1) )
In the line of Phoenix Rebirth in the comments of code above. Is that the best way to achieve my goal?
It starts my ID back at 1
and that's what I want but is that how I should be going about it?
Thanks!
P.S. Forgive my example with windows dir structure as my current non-prod sandbox is my windows workstation. :(
Why wouldn't you check if the line is present prior to inserting its corresponding record ?
Got a similar process that takes few seconds to complete with 2k-3k entries. Yours should not take longer than half a minute.