当前位置:网站首页>[summary series] technical system of Internet server: high performance database index

[summary series] technical system of Internet server: high performance database index

2020-11-08 08:02:26 Qin Shui Yu

Introduction

Establishing the optimal database index is an important means to improve the query performance of database . This paper summarizes the knowledge and practice of database index .

See you at the main entrance : “ Internet application server common technical ideas and mechanism outline ”

Basic knowledge

InnoDB The data in the table is stored in the order of primary key .InnoDB The first non empty index defined by the table ( In the order of index definition ) A primary key . Indexes ( stay MySQL in ) It's implemented by the storage engine . The main types of index are sequential index and Hashi index . The underlying structure of a sequential index is B+Tree , The underlying structure of a hash index is a hash table .

Index is space for time , Reduces the amount of data to scan 、 Avoid sorting 、 Will be random IO Into order IO. The price of using an index is : More space 、 Insert and update costs more . Sequential index can support : Full match 、 Left most order matching 、 Column prefix matching 、 Range match 、 Match sequence exactly and range match a column 、 Queries that only access the index 、 Index scan sort . Hash index can support : Full match .

Sequential index

InnoDB The ordinal index of is to build the list of primary keys into a tree B+ Trees . Internal nodes store primary key values , The leaf node stores the row data of the whole table . such , The node can store as many primary key values as possible , To lower the height of the tree .B+ A tree is an ordered search balanced tree , The height is usually at 2-4 Between , Because we should reduce the number of disk read and write as much as possible .B+ The insertion of the tree is done when the key number of nodes is full , It will split into two child nodes . understand B+ Trees are critical to understanding sequential indexes .

Sequential index can be divided into clustered index and non clustered index .

  • Cluster index : Saved in the leaf node B-Tree Index and data rows . Put the index column on the inner node , And put the row data on the leaf node . Clustering index can greatly improve IO Intensive performance . A table can only have one clustered index , The primary key column is usually used . The best insertion order of cluster index is according to the order of primary key value . If it's random insertion , It is expensive to update the clustered index : More search operations 、 Frequent “ Page splitting ” The problem of 、 Move a lot of data 、 Produce debris .
  • Nonclustered index : The inner node of a non clustered index stores the values of the non clustered index columns , The leaf node stores the primary key value of the corresponding data row . therefore , According to the nonclustered index, two index lookups are needed . First find the primary key value from the leaf node , Then find the data row in the cluster index according to the primary key value . Nonclustered indexes because they don't store information about rows of data , So it takes less space than a clustered index .

Hash index

Use hash principle to implement , A high performance , It can only be matched equivalently , Match whole columns by index 、 Range lookup... Is not supported 、 Cannot be used for sorting . Hash function can choose crc32 perhaps md5 Part of . Hash index to avoid a large number of conflicts, and do not take up too much space . The selectivity of hash index depends on the conflict degree of hash column value of the column .Memory The engine supports hash indexing , Also support B+Tree Indexes . It can be a long string ( such as URL) Create hash index , The hash value condition and column value condition must be carried in the condition .where url = xxx and hashed_url = yyy .

InnoDB For some very frequent index values in B+ Create another hash index in memory , It's called adaptive hash index .


Development matters

Columns suitable for indexing

Selective plateau is . If all rows are on the column “ Number of distinct values / Number of all values ” The higher the ratio of , The higher the selectivity , The more suitable for indexing . Column selectivity :count(distinct(col)) / count(col) . The only index selectivity is 1. Use show index from tablename ,Cardinality The value of shows the estimated value of the non repeating value of the index column . It can be used to determine whether the index is appropriate . If Cardinality The value of is close to the total number of records in the table , It's highly selective .

Be careful , In a single column index , This value corresponds to... Of the specified index column Cardinality value , And in the union index , This value corresponds to the Cardinality value . As shown below : sid_index The value of is 41659 , tid_index The value of is 101 , sid_index The selectivity of is higher than tid_index ; stc_id_index.t_id The value of is 3443139 , Refer to (s_id, t_id) The value of the union index , higher than sid_index Selectivity of single column index .

How to find highly selective columns ?

  • qualitative analysis : Values tend to be unique , It's highly selective ; And the value range is in a finite set , It's low selectivity . such as ID Values are usually highly selective , and age Values are low selectivity .
  • Measurement analysis : Use count(distinct(col)) / count(col) To calculate , The closer the value is to 1 It's highly selective . Measurement analysis is usually used to verify or negate .
mysql> show index from student_courses;

+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| Table           | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |

+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

| student_courses |          0 | PRIMARY      |            1 | id          | A         |     7764823 |     NULL | NULL   |      | BTREE      |         |               |

| student_courses |          1 | stc_id_index |            1 | s_id        | A         |       40417 |     NULL | NULL   |      | BTREE      |         |               |

| student_courses |          1 | stc_id_index |            2 | t_id        | A         |     3443139 |     NULL | NULL   |      | BTREE      |         |               |

| student_courses |          1 | stc_id_index |            3 | c_id        | A         |     7764823 |     NULL | NULL   |      | BTREE      |         |               |

| student_courses |          1 | sid_index    |            1 | s_id        | A         |       41659 |     NULL | NULL   |      | BTREE      |         |               |

| student_courses |          1 | tid_index    |            1 | t_id        | A         |         101 |     NULL | NULL   |      | BTREE      |         |               |

+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

6 rows in set (0.00 sec)



Build index

List all possible search statements first , Find the columns that appear , Put the most selective column on the far left , The columns with range lookup are placed as far to the right as possible . From the left, add columns one by one to the federated index , Cover all search statements as much as possible . Multiple federated indexes may need to be built to cover . Last , Consider the columns of selection and sort statements , Use index overlay as much as possible to get column data , Use index scan to sort .

Joint index

The union index is also a B+ Trees , The keyword is a tuple . Multi level search like index , Gradually and substantially reduce the number of lines that need to be scanned and matched . Federated index search follows the left most matching principle . The union index needs to establish the optimal index column order . Be careful , Create a single column index on each column that needs to be searched , It's not a federated index ( When searching, you can only search in a single column , Then index merge to merge the results ).

Union index matching follows the leftmost matching principle . Match termination condition : Reorder the search criteria in the order of the federated index , Encountered equivalent query ( Include IN Inquire about ) continue , Range query encountered 、BETWEEN、LIKE The query is terminated . When the index cannot be used : stay where In the condition , Index columns in expressions or using functions on indexed columns .

In practice , We need to use the same column but different order of union index to meet different query requirements .

Prefix index

Index long strings . Index using the prefix of a string of specified length . about BLOB, TEXT, Very long VARCHAR Column , Must use prefix index . The prefix index should choose a suitable length : The selectivity is close to the selectivity of the whole column , At the same time, it doesn't take up too much space . Prefix index cannot be used GROUP BY and ORDER BY, You can't scan it . If the string suffix or part is more selective , You can also do some preprocessing to convert it into a prefix index . The thoughts are the same .

Steps to find the best length of prefix index :

STEP1 - First find all the values of the column TOPN, have access to count as c, col from table group by col order by c desc limit N sentence ;

STEP2 - From a more appropriate value ( such as 3) Start , Test selectivity , until TOPN Most of them are listed c The quantity and TOPN Of c Close to .


Overlay index

The column that covers the index contains all the columns that need to be queried , Can reduce a lot of disk read , Significantly improved performance . If a list is in select cols There is a high frequency of , You can also consider putting it in a federated index , Use overlay index to optimize performance . Delayed correlation techniques can use the ability to overlay indexes .

Index scan sort

Only if the column order of the index is the same as ORDER BY The order of words and sentences is exactly the same , And all columns are sorted in the same direction , To sort the results using the index . There is one exception , That is, the leading column condition is specified as a constant . such as (date, fans_id) about where date = 'xxx' order by fans_id desc You can also use index scan to sort .

Index tips

have access to FORCE INDEX(a) Force the designation of SQL The index to be used by the statement .

MRR

Multi-Range Read. Optimization for range queries .MRR The secondary key found will be put into the cache , Then sort by primary key ( Will be random IO Convert to order IO, Can reduce page replacement ), Then according to the sorted primary key to access the actual data in order . Apply to range, ref, eq_ref Query for .

MRR Default on . Use optimizer_switch To control whether to use MRR. Set up mrr=on when , Means to enable MRR Optimize .
SET @@optimizer_switch='mrr=on,mrr_cost_based=on';


“ System account number ” problem

A value in an index column appears a lot . Avoid using the system account value in the query statement .

Index experiment

preparation

Preparation form

Suppose there's a student schedule . As shown below :

## executed using root account
## mysql -uroot -p < /path/to/project.sql

DROP USER 'test'@'localhost';
drop database if exists test;

CREATE USER 'test'@'localhost' IDENTIFIED BY 'test';
create database test ;
grant all privileges on test.* to 'test'@'localhost' identified by 'test';

use test
drop table if exists student_courses;
create table student_courses (
    id int(10) UNSIGNED not null primary key AUTO_INCREMENT comment 'AUTO_INCREMENT ID',
    s_id varchar(64) not null comment 'student ID',
    t_id varchar(64) not null comment 'teacher ID',
    room varchar(64) not null comment 'room name',
    c_id varchar(32) not null comment 'course ID',
    c_time int(10) not null comment 'course time',
    extra varchar(256) default '' comment 'extra info',
    gmt_create datetime DEFAULT CURRENT_TIMESTAMP,
      gmt_modified datetime DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Prepare the data

Write a groovy Script generation 800w Course selection data . Batch insertion is more efficient . Single insert , Every time I refresh , Thousands of inserts ; Batch insert , Every time I refresh ,20w Insertion .

package cc.lovesq.study.data

class StudentCoursesDataGenerator {

    private static final STUDENT_PREFIX = "STU";
    private static final TEACHER_PREFIX = "TCH";
    private static final ROOM_PREFIX = "ROOM";
    private static final COURSE_PREFIX = "CRE";

    static Random random = new Random(47);
    static int THREE_MONTH = 3 * 60 * 60 * 24 * 30;

    static void main(args) {

        def filePath = "./sql/stu_courses.sql"
        File file = new File(filePath)
        def batchSize = 50

        file.withWriter { writer ->
            for (int i=0; i< 8000000/batchSize; i++) {
                def insertSql = "insert into student_courses(s_id, t_id, room, c_id, c_time) values "
                for (int j=0; j< batchSize; j++) {
                    def sId = STUDENT_PREFIX + "_" + random.nextInt(40000)
                    def tId = TEACHER_PREFIX + random.nextInt(100)
                    def room = ROOM_PREFIX + random.nextInt(50)
                    def cId = COURSE_PREFIX + random.nextInt(60)
                    def cTime = Math.floor((System.currentTimeMillis() - random.nextInt(THREE_MONTH)) / 1000)
                    insertSql += "('$sId', '$tId', '$room', '$cId', $cTime),"
                }
                insertSql = insertSql.substring(0, insertSql.length()-1) + ";\n"
                //print(insertSql)
                writer.write(insertSql)
            }
        }
    }
}

The sample data generated is as follows :

insert into student_courses(s_id, t_id, room, c_id, c_time) values ('STU_29258', 'TCH55', 'ROOM43', 'CRE41', 1.604717694E9),('STU_429', 'TCH68', 'ROOM0', 'CRE42', 1.604714673E9),('STU_38288', 'TCH28', 'ROOM1', 'CRE49', 1.604719218E9),('STU_7278', 'TCH98', 'ROOM11', 'CRE20', 1.604712414E9),('STU_8916', 'TCH40', 'ROOM11', 'CRE42', 1.604715357E9),('STU_17383', 'TCH6', 'ROOM25', 'CRE10', 1.604718551E9),('STU_27674', 'TCH4', 'ROOM0', 'CRE6', 1.604714485E9),('STU_30896', 'TCH33', 'ROOM34', 'CRE4', 1.604716917E9),('STU_28303', 'TCH41', 'ROOM38', 'CRE52', 1.604716827E9),('STU_8689', 'TCH85', 'ROOM42', 'CRE46', 1.604713881E9),('STU_2447', 'TCH68', 'ROOM4', 'CRE35', 1.604713422E9),('STU_10354', 'TCH16', 'ROOM22', 'CRE36', 1.604713187E9),('STU_29257', 'TCH34', 'ROOM2', 'CRE17', 1.604717763E9),('STU_17242', 'TCH80', 'ROOM48', 'CRE1', 1.60471313E9),('STU_17052', 'TCH65', 'ROOM4', 'CRE9', 1.604711894E9),('STU_12209', 'TCH58', 'ROOM8', 'CRE43', 1.604712827E9),('STU_1246', 'TCH94', 'ROOM20', 'CRE4', 1.604715802E9),('STU_33533', 'TCH61', 'ROOM8', 'CRE8', 1.604718404E9),('STU_14367', 'TCH79', 'ROOM5', 'CRE42', 1.604714165E9),('STU_28037', 'TCH99', 'ROOM21', 'CRE13', 1.604718321E9),('STU_31909', 'TCH28', 'ROOM3', 'CRE36', 1.604718883E9),('STU_16994', 'TCH1', 'ROOM19', 'CRE3', 1.604719329E9),('STU_25382', 'TCH34', 'ROOM12', 'CRE26', 1.604714293E9),('STU_21718', 'TCH55', 'ROOM15', 'CRE40', 1.604715585E9),('STU_36228', 'TCH17', 'ROOM1', 'CRE17', 1.604716797E9),('STU_24146', 'TCH62', 'ROOM2', 'CRE12', 1.604714202E9),('STU_36499', 'TCH11', 'ROOM42', 'CRE14', 1.604718307E9),('STU_30843', 'TCH16', 'ROOM35', 'CRE6', 1.604717656E9),('STU_32930', 'TCH15', 'ROOM23', 'CRE33', 1.604718313E9),('STU_12921', 'TCH3', 'ROOM13', 'CRE35', 1.604711955E9),('STU_16669', 'TCH83', 'ROOM20', 'CRE58', 1.604717105E9),('STU_10225', 'TCH1', 'ROOM26', 'CRE5', 1.60471344E9),('STU_9399', 'TCH98', 'ROOM31', 'CRE45', 1.604714572E9),('STU_17332', 'TCH25', 'ROOM10', 'CRE31', 1.604713764E9),('STU_38771', 'TCH10', 'ROOM10', 'CRE11', 1.604716834E9),('STU_9529', 'TCH16', 'ROOM30', 'CRE10', 1.604718969E9),('STU_32513', 'TCH36', 'ROOM40', 'CRE44', 1.604714399E9),('STU_38907', 'TCH34', 'ROOM31', 'CRE33', 1.604716016E9),('STU_31551', 'TCH13', 'ROOM35', 'CRE28', 1.604716906E9),('STU_39883', 'TCH39', 'ROOM46', 'CRE23', 1.604719006E9),('STU_34965', 'TCH47', 'ROOM45', 'CRE10', 1.604713917E9),('STU_12265', 'TCH85', 'ROOM46', 'CRE11', 1.604714663E9),('STU_9348', 'TCH22', 'ROOM4', 'CRE14', 1.604712076E9),('STU_38391', 'TCH35', 'ROOM29', 'CRE37', 1.60471538E9),('STU_25424', 'TCH78', 'ROOM23', 'CRE3', 1.604717869E9),('STU_39334', 'TCH25', 'ROOM14', 'CRE48', 1.604717478E9),('STU_26085', 'TCH17', 'ROOM16', 'CRE23', 1.604718913E9),('STU_35483', 'TCH16', 'ROOM6', 'CRE5', 1.604712875E9),('STU_28009', 'TCH77', 'ROOM47', 'CRE39', 1.604716687E9),('STU_15094', 'TCH71', 'ROOM23', 'CRE18', 1.604712238E9);


You can view the table space size :

mysql> select CONCAT(ROUND(SUM(DATA_LENGTH) / (1024 * 1024 * 1024),3),' GB') as TABLE_SIZE from information_schema.TABLES where information_schema.TABLES.TABLE_NAME='student_courses'\G

*************************** 1. row ***************************

TABLE_SIZE: 0.538 GB



Start the test

Add index to bare table

Suppose you don't build any index , A naked watch , adopt s_id Search needs 2.94s; add to sid_index After index , The same search can't find 0.01s .

select * from student_courses where s_id = 'STU_17242';

194 rows in set (2.94 sec)




ALTER TABLE `student_courses` ADD INDEX sid_index ( `s_id` );

select * from student_courses where s_id = 'STU_17242';


194 rows in set (0.01 sec)

Use explain Explain :

  • select_type: Query type , SIMPLE It's a simple SELECT Inquire about ;
  • type: The connection type of the table . const Represents a match of up to one line , It's usually based on the primary key ;ref Indicates the use of a non primary key / Unique index matches a small number of rows ; range Indicates range query ,<>, >, <, <=, >=, IN, BETWEEN, LIKE ; index Scan the index tree , But the number is too large , It's equivalent to a full table scan ; full Full table scan .
  • possible_keys and key : The possible index and the actual index used .
  • ref: about key Given column , Which columns or constants are used to compare .
  • rows: Expected number of lines to scan .
  • filtered: The proportion of rows filtered .
  • Extra: Additional information used by the index . Using Where Need to use where To filter records by using sentence conditions ; Using Index The column information to be obtained can be obtained from the index tree ; Using filesort File sorting ; Using MRR Have you used MRR Optimize range queries .
mysql> explain select * from student_courses where id = 5;

+----+-------------+-----------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+

| id | select_type | table           | partitions | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |

+----+-------------+-----------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+

|  1 | SIMPLE      | student_courses | NULL       | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |

+----+-------------+-----------------+------------+-------+---------------+---------+---------+-------+------+----------+-------+


mysql> explain select * from student_courses where s_id = 'STU_17242';

+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------+

| id | select_type | table           | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra |

+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------+

|  1 | SIMPLE      | student_courses | NULL       | ref  | sid_index     | sid_index | 194     | const |  194 |   100.00 | NULL  |

+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------+


mysql> explain select count(id) from student_courses;

+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+

| id | select_type | table           | partitions | type  | possible_keys | key      | key_len | ref  | rows    | filtered | Extra       |

+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+

|  1 | SIMPLE      | student_courses | NULL       | index | NULL          | tc_index | 292     | NULL | 7785655 |   100.00 | Using index |

+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+


A function is used on the index column, and the index will not be used :

select * from student_courses where REPLACE(s_id,"STU_","") = '17242';



mysql> explain select * from student_courses where REPLACE(s_id,"STU_","") = '17242';

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+

| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+

|  1 | SIMPLE      | student_courses | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 7764823 |   100.00 | Using where |

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+



Suppose there are the following statements , You can see that the index is used sid_index , from 194 Filter to the end 3 strip . because sid_index Most of the records have been filtered , So add t_id The index doesn't seem necessary .

select * from student_courses where t_id = 'TCH86' and s_id = 'STU_17242';



mysql> explain select * from student_courses where t_id = 'TCH86' and s_id = 'STU_17242';

+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------------+

| id | select_type | table           | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra       |

+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------------+

|  1 | SIMPLE      | student_courses | NULL       | ref  | sid_index     | sid_index | 194     | const |  194 |    10.00 | Using where |

+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+------+----------+-------------+

Now delete sid_index Indexes , add to tid_index Indexes . Look at the situation . because t_id Low selectivity , add to tid_index After filtration, there is still 8w+ Bar record , Two search statements take time 0.4s about .

ALTER TABLE student_courses drop index sid_index;


ALTER TABLE student_courses add index tid_index(t_id);


select * from student_courses where t_id = 'TCH86';

80195 rows in set (0.45 sec)


select * from student_courses where t_id = 'TCH86' and s_id = 'STU_17242';

3 rows in set (0.40 sec)


mysql> explain select * from student_courses where t_id = 'TCH86' and s_id = 'STU_17242';

+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+--------+----------+-------------+

| id | select_type | table           | partitions | type | possible_keys | key       | key_len | ref   | rows   | filtered | Extra       |

+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+--------+----------+-------------+

|  1 | SIMPLE      | student_courses | NULL       | ref  | tid_index     | tid_index | 194     | const | 151664 |    10.00 | Using where |

+----+-------------+-----------------+------------+------+---------------+-----------+---------+-------+--------+----------+-------------+


mysql> select count(distinct s_id) / count(*) as s_id_selectivity, count(distinct t_id) / count(*) as t_id_selectivity  from student_courses;

+------------------+------------------+

| s_id_selectivity | t_id_selectivity |

+------------------+------------------+

|           0.0050 |           0.0000 |

+------------------+------------------+

1 row in set (10.11 sec)

This explanation : Add selective indexes , Better performance improvement .

Joint index

Consider the following statement . Still use tid_index , Time consuming 0.4s . If you use a federated index (tid_index, cid_index) , It takes time 0.03s . It's equivalent to doing two index searches , Faster than once, of course . The price is , Indexes take up more space .

select * from student_courses where t_id = 'TCH86' and c_id = 'CRE33';


1423 rows in set (0.41 sec)


ALTER TABLE student_courses add index tid_cid_index(t_id, c_id);

select * from student_courses where t_id = 'TCH86' and c_id = 'CRE33';
1423 rows in set (0.03 sec)

Combined with case one , There are usually multiple businesses ID Build a joint index (s_id, t_id, c_id) , such ,(s_id), (s_id, t_id), (s_id, t_id, c_id) Can be applied to this index . because s_id It's very selective , You can create a separate index ( Save index space ); and (t_id, c_id) You need to build a federated index , because (s_id, t_id, c_id) Can't match t_id and c_id Joint queries . According to the leftmost matching principle ,s_id Must appear .

ALTER TABLE student_courses add index stc_id_index(s_id,t_id,c_id);   perhaps  ALTER TABLE student_courses add index sid_index(s_id)

ALTER TABLE student_courses add index stc_id_index(t_id, c_id);

Federated index is the key to improve the performance of multi conditional queries . The leftmost matching principle is one of the most important principles for applying federated indexes . The query conditions are defined in the order of the union index (a,b,c,d,e) Rearrange , Compare one by one :

  • If the query conditions are equivalent queries , It doesn't matter in which order it appears , Reorder according to the order of the union index definition . such as a=1 and b=2 And b=2 and a=1 Is the same . The order can be different , But there has to be . If b=2 and c=3 You can't apply a federated index (a,b,c,d,e) 了 , because a Didn't show up .
  • If the column does not appear in the union index , Then the match ends here . such as b=2 and a=1 and d = 4 Can only be applied (a,b), because c Didn't show up .
  • If a range matching column appears in the federated index , Match to the end of the column , The latter condition cannot be indexed . such as b=2 and a=1 and d=4 and c in (2,3) Can only be applied (a,b,c) , because c There's scope matching .

stay explain In command , You can see ref , filter To determine which indexes are applied . If there is no index applied to a column , You can also delete the corresponding query conditions , use explain Ordered ref and rows To see if there is any change . If it only applies to an index , be Extra = Using index condition . Let's assume that we've only established (s_id, t_id, c_id) Joint index . It can be used show index from student_courses; See what indexes have been established .

mysql> show index from student_courses;
+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| student_courses |          0 | PRIMARY      |            1 | id          | A         |     7764823 |     NULL | NULL   |      | BTREE      |         |               |
| student_courses |          1 | stc_id_index |            1 | s_id        | A         |       40417 |     NULL | NULL   |      | BTREE      |         |               |
| student_courses |          1 | stc_id_index |            2 | t_id        | A         |     3443139 |     NULL | NULL   |      | BTREE      |         |               |
| student_courses |          1 | stc_id_index |            3 | c_id        | A         |     7764823 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Here are the statements and how the federated index is applied :

 //  Sweep the whole watch , Unable to apply federated index 
mysql> explain select * from student_courses where c_id = 'CRE3' and t_id = 'TCH21'; 
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | student_courses | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 7764823 |     1.00 | Using where |
+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+

//  Applied  (s_id, t_id, c_id) , All of them are equivalent because of the query , It doesn't matter in the order in which the query statements appear 
mysql> explain select * from student_courses where s_id = 'STU_18528' and c_id = 'CRE3' and t_id = 'TCH21';

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+

| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref               | rows | filtered | Extra |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+

|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 486     | const,const,const |    1 |   100.00 | NULL  |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+


//  application  (s_id, t_id) , therefore  ref = const, const
mysql> explain select * from student_courses where s_id = 'STU_18528' and t_id = 'TCH21';  

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+-------+

| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref         | rows | filtered | Extra |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+-------+

|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 388     | const,const |    2 |   100.00 | NULL  |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+-------+



//  Apply only  (s_id) , because  t_id  Didn't show up 
mysql> explain select * from student_courses where s_id = 'STU_18528' and c_id = 'CRE3';

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+-----------------------+

| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra                 |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+-----------------------+

|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 194     | const |  195 |    10.00 | Using index condition |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+-----------------------+



//  The first application  (s_id, t_id, c_id)  It is estimated that  in ( 'TCH21')  Optimized for equivalent queries ;  The second one applies  (s_id, t_id).
mysql> explain select * from student_courses where s_id = 'STU_18528' and c_id = 'CRE3' and t_id in ( 'TCH21');

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+

| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref               | rows | filtered | Extra |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+

|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 486     | const,const,const |    1 |   100.00 | NULL  |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+

1 row in set, 1 warning (0.00 sec)



mysql> explain select * from student_courses where s_id = 'STU_18528' and c_id = 'CRE3' and t_id > 'TCH21';

+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+

| id | select_type | table           | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra                 |

+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+

|  1 | SIMPLE      | student_courses | NULL       | range | stc_id_index  | stc_id_index | 388     | NULL |  171 |    10.00 | Using index condition |

+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-----------------------+


Index overlay

Index overlay refers to select The columns in are all in the federated index columns . Here are two sentences , The latter statement applies index coverage ,Extra = Using index , The column data can be obtained directly from the index , You don't have to read the disk .

mysql> explain select * from student_courses where s_id = 'STU_18528' and c_id = 'CRE3' and t_id= 'TCH21';
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+
| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref               | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 486     | const,const,const |    1 |   100.00 | NULL  |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------+

mysql> explain select s_id, t_id from student_courses where s_id = 'STU_18528' and c_id = 'CRE3' and t_id= 'TCH21';
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------------+
| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref               | rows | filtered | Extra       |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------------+
|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 486     | const,const,const |    1 |   100.00 | Using index |
+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------------+------+----------+-------------+

Index scan sort

The index can be used to sort , To reduce the randomness IO, Improve sorting performance . Index sort can be applied in three cases :

  • The order of the index columns is the same as ORDER BY When the order of clauses is exactly the same , And all columns are sorted in the same direction ; If you want to associate multiple tables , be ORDER BY When the referenced sorting fields are all fields of the first table ;
  • If the leading column is an equivalent query , Follow up ORDER BY The order of the fields in the clause is the same as the order of the index columns .

If index sorting is used , be type = index ; If the index sort cannot be referenced , that Extra Will prompt Using filesort .

//  Apply index sort :ORDER BY  All the columns of the sentence are in the same order as the index columns , And the sorting direction is the same 
mysql> explain select * from student_courses order by s_id desc, t_id desc, c_id desc limit 10;

+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-------+

| id | select_type | table           | partitions | type  | possible_keys | key          | key_len | ref  | rows | filtered | Extra |

+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-------+

|  1 | SIMPLE      | student_courses | NULL       | index | NULL          | stc_id_index | 486     | NULL |   10 |   100.00 | NULL  |

+----+-------------+-----------------+------------+-------+---------------+--------------+---------+------+------+----------+-------+


//  No index sort applied :ORDER BY  All the columns of the sentence are in the same order as the index columns , But the sort direction is not consistent 
mysql> explain select * from student_courses order by s_id asc, t_id desc limit 10;

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+

| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra          |

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+

|  1 | SIMPLE      | student_courses | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 7764823 |   100.00 | Using filesort |

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+



//  No index sort applied :ORDER BY  The order of all the columns of a sentence  (t_id, s_id)  And index column order  (s_id, t_id, c_id)  atypism 
mysql> explain select * from student_courses order by t_id desc, s_id desc limit 10;

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+

| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra          |

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+

|  1 | SIMPLE      | student_courses | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 7764823 |   100.00 | Using filesort |

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+----------------+



//  Apply index sort : The lead is listed as  s_id  And  t_id  union , Same as the order of index column definition 
mysql> explain select s_id, t_id from student_courses where s_id = 'STU_18528' order by t_id;

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+--------------------------+

| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra                    |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+--------------------------+

|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 194     | const |  195 |   100.00 | Using where; Using index |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+--------------------------+



//  Apply index sort : The lead is listed as  s_id, t_id  And  c_id  union , Same as the order of index column definition 
mysql> explain select s_id, t_id from student_courses where s_id = 'STU_18528' and t_id = 'TCH21' order by c_id desc;

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+--------------------------+

| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref         | rows | filtered | Extra                    |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+--------------------------+

|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 388     | const,const |    2 |   100.00 | Using where; Using index |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------------+------+----------+--------------------------+



//  Failed to apply index sort : The lead is listed as  s_id  And  c_id  union , Inconsistent with the order of index column definition 
mysql> explain select s_id, t_id from student_courses where s_id = 'STU_18528' order by c_id desc;

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+------------------------------------------+

| id | select_type | table           | partitions | type | possible_keys | key          | key_len | ref   | rows | filtered | Extra                                    |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+------------------------------------------+

|  1 | SIMPLE      | student_courses | NULL       | ref  | stc_id_index  | stc_id_index | 194     | const |  195 |   100.00 | Using where; Using index; Using filesort |

+----+-------------+-----------------+------------+------+---------------+--------------+---------+-------+------+----------+------------------------------------------+



MRR

If you use MRR The cost is too high , It won't turn on MRR. here , You can use mandatory indexing , Or set it on anyway MRR. As shown below ,t_id < 'T24' Will turn on MRR, but t_id < 'T32' It doesn't turn on . here , You can force the use of indexes tc_index, such , Will use MRR.

mysql> explain select * from student_courses where t_id >= 'TCH21' and t_id < 'TCH24';

+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+--------+----------+----------------------------------+

| id | select_type | table           | partitions | type  | possible_keys | key      | key_len | ref  | rows   | filtered | Extra                            |

+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+--------+----------+----------------------------------+

|  1 | SIMPLE      | student_courses | NULL       | range | tc_index      | tc_index | 194     | NULL | 508500 |   100.00 | Using index condition; Using MRR |

+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+--------+----------+----------------------------------+

1 row in set, 1 warning (0.00 sec)





mysql> explain select * from student_courses where t_id >= 'TCH21' and t_id < 'TCH32';

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+

| id | select_type | table           | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+

|  1 | SIMPLE      | student_courses | NULL       | ALL  | tc_index      | NULL | NULL    | NULL | 7785655 |    27.09 | Using where |

+----+-------------+-----------------+------------+------+---------------+------+---------+------+---------+----------+-------------+

1 row in set, 1 warning (0.00 sec)





mysql> explain select * from student_courses FORCE INDEX(tc_index) where t_id >= 'TCH21' and t_id < 'TCH32';

+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+----------------------------------+

| id | select_type | table           | partitions | type  | possible_keys | key      | key_len | ref  | rows    | filtered | Extra                            |

+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+----------------------------------+

|  1 | SIMPLE      | student_courses | NULL       | range | tc_index      | tc_index | 194     | NULL | 2109100 |   100.00 | Using index condition; Using MRR |

+----+-------------+-----------------+------------+-------+---------------+----------+---------+------+---------+----------+----------------------------------+





Summary

Database is the software that developers often deal with , And index is the most important thing to access database efficiently . In depth understanding of the principles of indexing , Design the index of adaptive query reasonably , It's necessary to work hard .

Index Basics :

  • Create efficient indexes based on query criteria ;
  • Understand the leftmost matching principle and define the optimal union index ;
  • Use overlay index and index scan as well as possible .

Reference material

版权声明
本文为[Qin Shui Yu]所创,转载请带上原文链接,感谢