MySQL派生条件下推优化导致自定义变量结果错误问题分析
1. 问题描述
现场 MySQL 8.0.25 版本运行中,发现带有用户变量的SQL执行结果集与预期不符。复现场景如下:- -- 起锅烧油,连接 MySQL 8.0.25 数据库
- mysql -uroot -S /var/lib/mysql/mysql-8.0.25-linux-x86_64/mysql-test/var/tmp/mysqld.1.sock -p
-
- -- 执行以下SQL
- SELECT version();
- CREATE database testdb;
- use testdb
- CREATE TABLE t1(c1 int);
- INSERT INTO t1 VALUES(100),(200),(300),(400),(500);
- SELECT * FROM (
- SELECT c1, (@rownum_r:=@rownum_r+1) AS r
- FROM t1, (SELECT @rownum_r:=0) AS b
- ) AS q WHERE q.c1=300; -- 这里查询到的结果是300 1,实际应该是300 3
复制代码 分析上述的执行结果中的 rownum 返回1,可以推断派生表返回的条数少于预期,导致 rownum 值偏小。由此想到看下优化器改写后的SQL- mysql> explain SELECT * FROM (SELECT c1, (@rownum_r := @rownum_r + 1) AS r FROM t1, (SELECT @rownum_r := 0) AS b) AS q WHERE q.c1 = 300;
- +----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+
- | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
- +----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+
- | 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | NULL |
- | 2 | DERIVED | <derived3> | NULL | system | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL |
- | 2 | DERIVED | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 5 | 20.00 | Using where |
- | 3 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
- +----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+
- 4 rows in set, 3 warnings (0.01 sec)
- mysql> SHOW warnings;
- +---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
- | Level | Code | Message |
- +---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
- .......
- | Note | 1003 | /* select#1 */ select `q`.`c1` AS `c1`,`q`.`r` AS `r` from (/* select#2 */ select `testdb`.`t1`.`c1` AS `c1`,(@rownum_r:=((@`rownum_r`) + 1)) AS `r` from `testdb`.`t1` where (`testdb`.`t1`.`c1` = 300)) `q` |
- +---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
复制代码 印证了上面的推断,优化器将 WHERE 条件下推到派生表中,减少了派生表返回行数,以此加快 SQL 执行速度。但是引起了 rownum 递增减少问题。
由此可知这么来看,属于优化器的 Bug,而在 Bug home 中也找到了现象匹配的问题
- Derived condition pushdown rewrite ignores user variables https://bugs.mysql.com/bug.php?id=104918
2. 问题分析
2.1 什么是派生条件下推
引自 10.2.2.5 Derived Condition Pushdown Optimization 的相关描述:
对于 SELECT * FROM (SELECT i, j FROM t1) AS dt WHERE i > constant 这种子查询语句,大多情况下可以将外层的where条件下推到派生表内部,改写为 SELECT * FROM (SELECT i, j FROM t1 WHERE i > constant) AS dt,来减少派生表返回条数,达到提升执行效率的作用。
比如以下场景,就很适用 WHERE 条件下推:
- 当派生表不使用聚合或窗口函数时,可以直接向下推送外部WHERE条件。这包括具有多个与AND、OR或两者结合的谓词的WHERE条件。比如 SELECT * FROM (SELECT f1, f2 FROM t1) AS dt WHERE f1 < 3 AND f2 > 11 改写为 SELECT f1, f2 FROM (SELECT f1, f2 FROM t1 WHERE f1 < 3 AND f2 > 11) AS dt
- 当派生表具有 GROUP BY 并且不使用窗口函数时,引用一个或多个不属于GROUP BY的列的外部WHERE条件可以作为HAVING条件下推到派生表中。比如 SELECT * FROM (SELECT i, j, SUM(k) AS sum FROM t1 GROUP BY i, j) AS dt WHERE sum > 100 被改写为SELECT * FROM (SELECT i, j, SUM(k) AS sum FROM t1 GROUP BY i, j HAVING sum > 100) AS dt.
- 当派生表使用GROUP BY并且外部WHERE条件中的列是GROUP BY列时,引用这些列的WHERE条件可以直接下推到派生表中。比如 SELECT * FROM (SELECT i,j, SUM(k) AS sum FROM t1 GROUP BY i,j) AS dt WHERE i > 10 被改写为 SELECT * FROM (SELECT i,j, SUM(k) AS sum FROM t1 WHERE i > 10 GROUP BY i,j) AS dt.
derived_condition_pushdown 开关在 optimizer_switch 参数中设置,也可以使用相关hint控制在语句级生效。方法如下:- -- 开启:
- SET optimizer_switch="derived_condition_pushdown=on";
- SELECT /*+ DERIVED_CONDITION_PUSHDOWN() */ ... FROM ...
- -- 关闭:
- SET optimizer_switch="derived_condition_pushdown=off";
- SELECT /*+ NO_DERIVED_CONDITION_PUSHDOWN() */ ... FROM ...
复制代码 2.2 Bug 影响范围有多大
经过查阅 releasenote,可知条件下推是在 8.0.22 中引入的,修复的版本是 8.0.28- -- 8.0.22的发布日志:
- MySQL now implements derived condition pushdown for eligible queries. What this means is that, for a query such as SELECT * FROM (SELECT i, j FROM t1) AS dt WHERE i > constant, it is now possible in many cases to push the outer WHERE condition down to the derived table, in this case resulting in SELECT * FROM (SELECT i, j FROM t1 WHERE i > constant) AS dt. Previously, if the derived table was materialized and not merged, MySQL materialized the entire table—in this case t1—then qualified the rows with the WHERE condition.
- -- 8.0.28的修复日志:
- Documented fix as follows in the MySQL 8.0.28 changelog:
- When a condition was pushed down, the result of evaluating
- assignments to user variables in the SELECT list of the subquery
- were sometimes affected. For this reason, we now prevent
- condition pushdown for statements with assignments to user
- variables.
复制代码 3. 业务规避措施
- 升级到MySQL 8.0.28及以上版本,或者GreatSQL分支版本 https://gitee.com/GreatSQL/GreatSQL/releases/tag/GreatSQL-8.0.32-27。
- 数据库实例级别关闭该优化,set optimizer_switch="derived_condition_pushdown=off";
- 在涉及的SQL语句上添加 hint /*+ NO_DERIVED_CONDITION_PUSHDOWN() */ ,语句级别禁用derived_condition_pushdown 优化,可以规避 MySQL Bug 104918。
参考文章
- MySQL 8.0.22 releasenote:https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-22.html
- Derived condition pushdown rewrite ignores user variables https://bugs.mysql.com/bug.php?id=104918
- Derived Condition Pushdown Optimization https://dev.mysql.com/doc/refman/8.4/en/derived-condition-pushdown-optimization.html
- GreatSQL分支版本 https://gitee.com/GreatSQL/GreatSQL/releases/tag/GreatSQL-8.0.32-27
Enjoy GreatSQL
来源:程序园用户自行投稿发布,如果侵权,请联系站长删除
免责声明:如果侵犯了您的权益,请联系站长,我们会及时删除侵权内容,谢谢合作! |