彻底搞懂PostgreSQL的模糊查询:LIKE和ILIKE

如果我们使用的是PostgreSQL数据库，那么我们可以使用LIKE和ILIKE做模糊查询，LIKE语法是SQL标准而ILIKE是PostgreSQL的一个扩展。

构造数据

先创建一张表，然后插入一些数据；

create table test(
  id serial, 
  name varchar(16)
  );

my_test_db=#select * from test;
 id |  name   
----+---------
  1 | One
  2 | Two
  3 | One_Two
  4 | One/Two
  5 | oNE
  6 | 99%
(6 rows)

基本使用

在使用 LIKE/ILIKE 时，有两个通配符：百分号 (%) 和下划线 (_)

% - 匹配任何零个或多个字符
_ - 匹配单个任意字符

先看一下这个的查询：

my_test_db=#select * from test where name like 'O%';
 id |  name   
----+---------
  1 | One
  3 | One_Two
  4 | One/Two
(3 rows)

这个语句匹配所有以 O 打头的数据。

下面查询使用通配符 _

my_test_db=#select * from test where name like '_n_';
 id | name 
----+------
  1 | One
(1 row)

上面都是使用LIKE做模糊匹配，现在我们用一下ILIKE（除了不区分大小写外，和LIKE一样）。

my_test_db=#select * from test where name ilike 'O%';
 id |  name   
----+---------
  1 | One
  3 | One_Two
  4 | One/Two
  5 | oNE
(4 rows)

使用 lower() 函数和 LIKE 也能实现上面的效果

1	my_test_db=#select * from test where lower(name) like 'o%';

结果和上面sql的执行结果一样。

我推荐使用lower like，因为它的性能比ILIKE高出15%以上，下面做个测试：

首先创建一个表，然后通过脚本插入1000000行随机数据：

require 'securerandom'
inserts = []
1000000.times do |i|
        inserts << "(1, 'fake', '#{SecureRandom.urlsafe_base64(64)}')"
end
sql = "insert into books (user_id, title, description) values #{inserts.join(', ')}"
ActiveRecord::Base.connection.execute(sql)

验证数据行数：

my_test_db=# select count(id) from books ;
  count  
---------
 1000009

做个查询，看一下返回数据

my_test_db=# SELECT "books".* FROM "books" WHERE "books"."published" = 'f'
my_test_db=# and (LOWER(description) LIKE '%abcde%') ;
   id    | user_id | title |                                      description                                       | published 
---------+---------+-------+----------------------------------------------------------------------------------------+------
 1232322 |       1 | fake  | 5WRGr7oCKABcdehqPKsUqV8ji61rsNGS1TX6pW5LJKrspOI_ttLNbaSyRz1BwTGQxp3OaxW7Xl6fzVpCu9y3fA | f
 1487103 |       1 | fake  | J6q0VkZ8-UlxIMZ_MFU_wsz_8MP3ZBQvkUo8-2INiDIp7yCZYoXqRyp1Lg7JyOwfsIVdpPIKNt1uLeaBCdelPQ | f
 1817819 |       1 | fake  | YubxlSkJOvmQo1hkk5pA1q2mMK6T7cOdcU3ADUKZO8s3otEAbCdEcmm72IOxiBdaXSrw20Nq2Lb383lq230wYg | f

分别查看LOWER LIKE和ILIKE的执行计划

LOWER LIKE结果

my_test_db=# EXPLAIN ANALYZE SELECT "books".* FROM "books" WHERE "books"."published" = 'f' and (LOWER(description) LIKE '%abcde%') ;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Seq Scan on books  (cost=0.00..32420.14 rows=1600 width=117) (actual time=938.627..4114.038 rows=3 loops=1)
   Filter: ((NOT published) AND (lower(description) ~~ '%abcde%'::text))
   Rows Removed by Filter: 1000006
 Total runtime: 4114.098 ms

ILIKE 结果

 my_test_db=# EXPLAIN ANALYZE SELECT "books".* FROM "books" WHERE "books"."published" = 'f' and (description iLIKE '%abcde%') ;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Seq Scan on books  (cost=0.00..29920.11 rows=100 width=117) (actual time=1147.612..4986.771 rows=3 loops=1)
   Filter: ((NOT published) AND (description ~~* '%abcde%'::text))
   Rows Removed by Filter: 1000006
 Total runtime: 4986.831 ms

从结果可以看到LOWER LIKE比ILIKE快了差不多17%，好了言归正传，继续聊LIKE和ILIKE。

如何匹配 % 或 _ 本身

使用默认的转义字符（反斜杠）来转义

1	select * from test where name like '%\%';

上面SQL语句查询以%结尾的数据。

1	select * from test where name like '%\_%';

这个SQL语句查询含有 _ 的数据。

1	Notice：~~ 和 LIKE、~~* 和 ILIKE、!~~ 和 NOT LIKE、!~~* 和 NOT ILIKE是可以互换的。

索引

索引用于加快搜索速度。 PostgreSQL自动为主键、唯一键等列创建索引。或者我们可以显式创建索引。

如果某列有可用的索引(指定text_pattern_ops或varchar_pattern_ops)，如果不以 % 或 _ 开头，则 LIKE 会使用该索引（name LIKE ‘one%’ 会走索引，而 name like ‘%one’ 不会走索引）。

对于ILIKE，当且仅当以非字母字符（不受大小写转换影响的字符）开头时，才会走索引。

Mirrored Indexes

如果在使用LIKE的时候，查询通配符以 % 或 _ 开头，有没有什么办法走索引？

可以通过一下两步操作，让它走索引：

在该列上创建一个 reverse() 函数索引。
用reversed模式查询

举个例子，下面的查询不走索引

1	select * from test where name like '%wo';

那么，我们在name列上创建个索引

1	create index rev_idx on test(reverse(name));

将上面的查询SQL变成下面的SQL语句

1	select * from test where reverse(name) like reverse('%wo');

那么现在，就会走索引。