sql performance when filtering ipv4 ranges

55 views Asked by At

I have a database(InnoDB) with Ipv4 ranges represented as 32bit int, both are unsigned int and represent a certain IP range (start-ip to end-ip) with data in other columns. I have indexed both columns individually (BTree index type).

id|start-ip|end-ip
 4|16777216|16777471
 5|16777472|16778239
 6|16778240|16779263
 7|16779264|16781311
 8|16781312|16785407
 9|16785408|16793599
10|16793600|16809983
11|16809984|16810018
12|16810019|16810019

When selecting a certain IP I can easily find one as it fits within a single row. But when I need to find a range of ips that need to fit within multiple rows the only way I figured was

select * from `ips` where 
(`start-ip` <= min and `end-ip` >= min) 
or 
(`start-ip` >= min and `end-ip` <= max) 
or 
(`start-ip` <= max and `end-ip` >= max)

for example, trying to select from a range of 16777300(min) - 16779200(max) the output would be

 4|16777216|16777471
 5|16777472|16778239
 6|16778240|16779263

Essentially the start-ip must be smaller or equal to the 'min' while the end-ip must be larger or equal to the 'max'. This can be across multiple rows.

Currently, I have roughly 10 mil rows with ip ranges for the full spectrum of ipv4 and this selects all the rows that fit within the range, however, performance-wise it takes a few seconds to extract the data. How can I improve performance?

2

There are 2 answers

1
Gordon Linoff On BEST ANSWER

You can use this logic:

select *
from `ips`
where min <= `end-ip` and
      max >= `start-ip`

That is, one range start before the other ends . . . and vice versa.

1
Rick James On

MySQL/MariaDB does not know that your ranges might be overlapping or not. Hence performance sucks for any obvious formulation, including Gordon's

Here is an approach that is performant: http://mysql.rjweb.org/doc.php/ipranges