Customize Rails find_in_batches
Today I ran into an issue while reading a fairly large MySQL table. I needed to fetch the records in batches, so the good old find_in_batches method in Rails ActiveRecord came into the picture.
Then the actual problem surfaced: the lookup table had no ID column of any kind, so find_in_batches kept throwing an Invalid Statement error. After some debugging it came to light that find_in_batches by default only works with integer primary key fields, because it uses order to sort the records by the integer primary key column.
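To make the ordering requirement concrete, here is a rough in-memory sketch of primary-key batching. It is not the Rails source, only a simplified model of the idea: sort by the key, take a batch, then continue from the last key seen. The helper name each_batch_by_key and the sample rows are made up for illustration.

```ruby
# Simplified in-memory model of primary-key batching; not the Rails source.
# each_batch_by_key is a hypothetical helper name used only for this sketch.
def each_batch_by_key(rows, key:, batch_size:)
  # fetch raises KeyError if a row has no such column, loosely mirroring
  # a table that has no primary key column to order by
  sorted = rows.sort_by { |row| row.fetch(key) }
  last_key = nil
  loop do
    # Roughly: WHERE key > last_key ORDER BY key LIMIT batch_size
    remaining = sorted.select { |row| last_key.nil? || row.fetch(key) > last_key }
    batch = remaining.first(batch_size)
    break if batch.empty?

    yield batch
    last_key = batch.last.fetch(key)
    break if batch.size < batch_size
  end
end

rows = [5, 1, 4, 2, 3].map { |id| { id: id } }
ids_per_batch = []
each_batch_by_key(rows, key: :id, batch_size: 2) do |batch|
  ids_per_batch << batch.map { |row| row[:id] }
end
p ids_per_batch # prints [[1, 2], [3, 4], [5]]
```

Without a key column there is nothing to sort by or resume from, which is roughly the situation my lookup table was in.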
Thanks to http://monkeyandcrow.com/blog/reading_rails_how_do_batched_queries_work/
So I came up with this little hack that seemed to work for me:
module CustomFindInBatches
  def custom_find_in_batches(options = {})
    return unless block_given?

    # :start is treated as a row offset here, not a primary key value
    offset = options[:start] || 0
    batch_size = options[:batch_size] || 1000
    # order by the custom primary key set on the model
    relation = reorder(primary_key).limit(batch_size)
    records = relation.offset(offset).to_a
    while records.any?
      yield(records)
      break if records.size < batch_size
      # advance from the starting offset so :start is respected on every batch
      offset += batch_size
      records = relation.offset(offset).to_a
    end
  end
end
This works by setting Model.primary_key on the model and then reordering the relation by that primary key, which can now be of any data type (varchar in my case).
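To try the batching logic without a database, the sketch below runs it against a small in-memory stand-in. FakeRelation is a hypothetical class that mimics only the calls the module relies on (primary_key, reorder, limit, offset, to_a), and the "code" column and its values are made up. The module is repeated (with :start treated as a row offset) so the snippet runs on its own.

```ruby
# Same batching logic as the module in this post, repeated so the sketch is standalone.
module CustomFindInBatches
  def custom_find_in_batches(options = {})
    return unless block_given?

    offset = options[:start] || 0
    batch_size = options[:batch_size] || 1000
    relation = reorder(primary_key).limit(batch_size)
    records = relation.offset(offset).to_a
    while records.any?
      yield(records)
      break if records.size < batch_size
      offset += batch_size
      records = relation.offset(offset).to_a
    end
  end
end

# Hypothetical in-memory stand-in for an ActiveRecord relation (an assumption
# for illustration; it is not part of Rails).
class FakeRelation
  include CustomFindInBatches

  attr_reader :primary_key

  def initialize(rows, primary_key, order: nil, limit: nil, offset: 0)
    @rows = rows
    @primary_key = primary_key
    @order = order
    @limit = limit
    @offset = offset
  end

  def reorder(column)
    copy(order: column)
  end

  def limit(count)
    copy(limit: count)
  end

  def offset(count)
    copy(offset: count)
  end

  # Apply ordering, then offset, then limit, like the generated SQL would.
  def to_a
    rows = @order ? @rows.sort_by { |row| row[@order] } : @rows
    rows = rows.drop(@offset)
    @limit ? rows.first(@limit) : rows
  end

  private

  def copy(overrides)
    settings = { order: @order, limit: @limit, offset: @offset }.merge(overrides)
    FakeRelation.new(@rows, @primary_key, **settings)
  end
end

rows = %w[delta alpha echo bravo charlie].map { |code| { "code" => code } }
lookup = FakeRelation.new(rows, "code")

batches = []
lookup.custom_find_in_batches(batch_size: 2) do |batch|
  batches << batch.map { |row| row["code"] }
end
p batches # prints [["alpha", "bravo"], ["charlie", "delta"], ["echo"]]
```

In a real app the equivalent would presumably be to set the primary key on the model and extend the model class with CustomFindInBatches, then call custom_find_in_batches on it with a block.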