ActiveRecord#create does not allow to set the id

111 views Asked by At

I need to batch insert > 100.000 records. The id will not be created by the DB and I have to use a a given UUID: Doing this in a loop using mymodel.new assigning the ID, then save the record will work but is way too slow (appr. 20min)

When I create an array 'records' and use mymodel.create(records) I run into the 'cannot mass assign id' problem. I've tried all solutions I could find:

  • 'attr_acccessible :id, ...' for the model. works for all but id.
  • (re)define 'def self.attributes_protected_by_default [] end' - no effect
  • one advice was to use 'create' with ':without_protection => true', but create does not take more than one argument.

.So neither of these solutions helped. What else can I do?

1

There are 1 answers

0
arcasys On BEST ANSWER

Finally, I found a solution which might not be elegant in a Rails way but it solves my performance problem:

At first I tried what @Albin suggested only to find that create(records) does not work much faster (still > 15min).

My solution now is:

  • Create a temporary CSV file

    db_tmp = File.open("tmp_file", "w")
    records = ""
    @data_records.each do |row|
      records << "#{row['id']},#{row['id']},#{field_1},#{row['field_2']}, ... \n"
    end
    db_tmp.write(records)
    db_tmp.close
    
  • Execute sql with a load data command

    sql = "load data infile 'tmp_file' into table my_table 
      fields optionally enclosed by '\"' terminated by ','
      (id,field_1,field_2, ... )"
    ActiveRecord::Base.connection.execute(sql)
    

The whole process now lasts less than 1 (!) minute, including getting the data over the network and parsing the original json message into a hash.

I'm aware that this does not clarify how create could be tricked into allowing ID assignment but the performance problem is solved.

Another point is that my solution bypasses any validation defined for the model. This is not a problem because in this case I know I can rely on integrity of the data I'm receiving - and if there's a problem load would fail and execute would raise an exception.