原因
- MySQL8.0.22版本
- 线上有一张task表的users字段因历史原因
- 存放了以逗号分隔的用户id列表
- 程序中会匹配用户id进行查询
- 用到的SQL如下:
# 查询
select users from task limit 2;
| users |
|300511164303031, 310406164883350, 151134164673502, 330203164377115, 310633164035316, 310408164888300, 170515164003106, 150636164603618, 310510164335822, 151336164653174, 310508164331806, 301115164423156 |
|310406164883350,181138164432020,1000130,330312164322768,170515164003106,300608164825431,331015164472774,150304164442136,331108164613233,1000164,301113164430265,171016164003026,300333164732303,151134164673502,1000143,331034164487883,181033164253337,310633164035316,150304164442101,1000136,330312164636073,310508164331806,330302164334267,181017164275220,301115164423156,330203164377336,310303164733465,330312164322726,330203164377115,310408164888300,311116164231848,1000123,310214164825778,301317164618388,300333164732155,151013164628330,300511164303031,1000138,1000185,150636164603618,300415164783624,310237164871433,310510164335822,151336164653174,330210164387154 |
-- 数据和表名,列名已做掩码转换。非真实数据
# 示例
select * from task where MATCH(users) AGAINST('19323422341234' );
- 表的数据量不多40多万条记录
- 但是频繁的出现慢查询(超过500毫秒)
优化思路
- 定位到全文索引慢的时候
- 第一反应是拆了这个全文索引查询
- 业务方将逗号字段拆表的改动量大暂时不考虑
- 折中办法是将这个字段换成json类型
- 然后用json的索引来替换全文索引
- 我在想这个方案的时候
- 给忠哥的预估是性能会提升3-10倍
- 当时没做测试
- 靠的是经验和信口开河
- 一通怂恿说服了研发同事
- 开始拉群开整
验证和测试
# 加json字段
alter table task add users_list json ;
# 填值
update task set users_list = concat('[' ,TRIM(BOTH ',' FROM users),']') where users is not null and users != '';
# 加索引1
ALTER Table task ADD INDEX idx_task_dba ( ( CAST( users_list -> '$[*]' as unsigned ARRAY) );
- 对应 where 19323422341234 MEMBER OF(users_list->'$[*]')
# 加索引2
#ALTER Table abc ADD INDEX idx_abc(( CAST(`users_list` as bigint(20) unsigned ARRAY) ));
- 对应:where 19323422341234 MEMBER OF (users_list)
# 这里用的加索引是用了方法1,方法2经验证,会经常跳过索引走全表扫描
- 测试和验证结果是:和全文索引比效果差不多
- 没有显著提升
- 每次查询耗时的稳定性提升了一些,但是总的来说效果有限
- 最终经过多次验证和反复讨论决定:
- 不优化了,先这样。
>> Home