Why does the output of EXPLAIN change after each SHOW index?

383 views Asked by At

I was trying to improve performance on some queries through indexes using EXPLAIN and I noticed each time I used SHOW index FROM TableB; the output of the rows colums in the EXPLAIN of a query changed

Ex:

mysql> EXPLAIN Select A.id
     From TableA A
     Inner join TableB B
         On A.address = B.address And A.code = B.code
     Group by A.id
     Having count(distinct B.id) = 1;
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                                   | rows  | Extra                                        |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | B     | index  | test_index    | PRIMARY | 518     | NULL                                  | 10561 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | A     | eq_ref | PRIMARY       | PRIMARY | 514     | db.B.address,db.B.code                |     1 |                                              |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
2 rows in set (0.00 sec)

mysql> show index from TableB;
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table     | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| TableB    |          0 | PRIMARY      |            1 | id          | A         |           7 |     NULL | NULL   |      | BTREE      |         |
| TableB    |          0 | PRIMARY      |            2 | address     | A         |          21 |     NULL | NULL   |      | BTREE      |         |
| TableB    |          0 | PRIMARY      |            3 | code        | A         |       10402 |     NULL | NULL   |      | BTREE      |         |
| TableB    |          1 | test_index   |            1 | address     | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| TableB    |          1 | test_index   |            2 | code        | A         |       10402 |     NULL | NULL   |      | BTREE      |         |
| TableB    |          1 | test_index   |            3 | id          | A         |       10402 |     NULL | NULL   |      | BTREE      |         |
+-----------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
6 rows in set (0.03 sec)

and...

mysql> EXPLAIN Select A.id
        From TableA A
        Inner join TableB B
           On A.address = B.address And A.code = B.code Group by A.id
        Having count(distinct B.id) = 1;
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                                   | rows  | Extra                                        |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | B     | index  | test_index    | PRIMARY | 518     | NULL                                  | 9800  | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | A     | eq_ref | PRIMARY       | PRIMARY | 514     | db.B.address,db.B.code                |     1 |                                              |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------------------+-------+----------------------------------------------+
2 rows in set (0.00 sec)

Why does this happen?

2

There are 2 answers

5
Bill Karwin On BEST ANSWER

The rows column should be taken as a rough estimate only. It's not a precise number.

It's based on statistical estimates of how many rows will be examined during a query. The actual number of rows cannot be known until you actually execute the query.

The statistics are based on samples read from the table periodically. These samples are re-read occasionally, for example after you run ANALYZE TABLE or certain INFORMATION_SCHEMA queries, or certain SHOW statements.

0
Rick James On

I don't find 20% variation in statistics to be a big deal. In many situations, think of the graph being like an upturned parabola, and you need to know which side of the minimum point you are on. In complex queries, where the Optimizer is likely to goof, it need a lot more than simple stats, such as Histograms of MariaDB 10.0 / 10.1. (I don't have enough experience with such to say whether that makes much headway.)

Your particular query is probably going to be performed in only one way, regardless of the statistics. An example of a complicated query would be a JOIN with WHERE clauses filtering each table. The optimizer has to decide which table to start with. Another case is a single table with a WHERE and ORDER BY and they cannot both be handled by a single index -- should it use an index to filter, but then have to sort? or should it use an index for ORDER BY, but then have to filter on the fly?