We have an in-memory grid platform that bootstraps its data from a database. We currently have 8 nodes and need to distribute the data evenly among them, so we used the % operator as below:
select * from LargeTable where id % 8 = 0
select * from LargeTable where id % 8 = 1
select * from LargeTable where id % 8 = 2
select * from LargeTable where id % 8 = 3
select * from LargeTable where id % 8 = 4
select * from LargeTable where id % 8 = 5
select * from LargeTable where id % 8 = 6
select * from LargeTable where id % 8 = 7
What we have observed is that each query does a full table scan instead of using the index, and the table runs into millions of records. Is there any way I can force it to use the index for performance? Any other suggestions are appreciated.
It won't use the index because you have an operation (the modulo) applied to the indexed column, which makes the predicate non-sargable.
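A minimal illustration of the difference (the range values here are only placeholders, assuming id is the indexed column):

-- non-sargable: the expression on id hides the column from the index, so the optimizer scans
select * from LargeTable where id % 8 = 1;

-- sargable: a bare comparison on id lets the optimizer seek on the index
select * from LargeTable where id between 1000000 and 1999999;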
If it's a clustered index you could use range predicates instead (but do you know the points where the data splits roughly eight ways? It seems like you're assuming modulo gives an even split). You could do the same with a non-clustered index, but it may not read quite as fast, because with a clustered index the data itself is stored in key order at the leaf level.
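One way to find those split points is a one-off query like the sketch below (assumes SQL Server's NTILE is available; it scans the table once, but only when you recompute the boundaries):

-- split the key space into 8 roughly equal buckets and report each bucket's bounds
select bucket, min(id) as range_start, max(id) as range_end
from (
    select id, ntile(8) over (order by id) as bucket
    from LargeTable
) t
group by bucket;

-- each node then reads its own slice with a sargable range predicate
-- (@range_start / @range_end stand in for the bounds found above)
select * from LargeTable where id >= @range_start and id <= @range_end;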
You could also add a materialised (computed, persisted) column holding id % 8 and index it. That's extra work on every insert, because it's an extra column and an extra index to maintain, but if that overhead is acceptable you'd find it works, and you could run the 8 reads in parallel, each one seeking straight to its own data.
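Something along these lines, assuming SQL Server syntax (the column and index names are illustrative):

-- persisted computed column holding the modulo, plus an index on it
alter table LargeTable add id_mod_8 as (id % 8) persisted;
create nonclustered index IX_LargeTable_id_mod_8 on LargeTable (id_mod_8);

-- each node filters on the indexed column directly, so no expression blocks the seek
select * from LargeTable where id_mod_8 = 1;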
You could perhaps use table partitioning too, though it's not something I know much about; the aim would be to partition the table on id ranges, and if you're using bcp you can bcp out from an individual partition.
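Very roughly, and only as a sketch of the idea (SQL Server syntax; the boundary values and names are placeholders, not recommendations):

-- 7 boundary values with RANGE LEFT give 8 partitions
create partition function pf_LargeTable (int)
    as range left for values (1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000);
create partition scheme ps_LargeTable
    as partition pf_LargeTable all to ([PRIMARY]);
-- the table (or its clustered index) would then be created on ps_LargeTable(id),
-- and each node could read, or bcp out, just its own partition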