Return better results using full-text search in MySQL

ah
2007-06-22
by Contributor Melonfire | Mar 16, 2006

Takeaway: This document will show you how to perform full-text searches in MySQL.

Many Web applications offer full-text search capabilities, wherein the user can locate matching records by using a word or word fragment as the query term. Behind the scenes, these applications are often performing this search by using a LIKE clause in a SELECT query. While this certainly works, it is an extremely inefficient way of performing a full-text search, especially with large volumes of data.

MySQL offers a solution to this problem, in the form of built-in full-text search. Here, the developer need simply mark certain fields as needing full-text search, and then use special MySQL functions to run searches on those fields. This is not only more performance-efficient (because MySQL indexes these fields differently to optimize the search) but also results in higher-quality searches, because MySQL intelligently ranks the results using a natural-language algorithm to remove irrelevant items.

This document will show you how to perform full-text searches in MySQL.
1. Setting up the base table

Begin by creating a sample table, using the SQL commands below:
mysql> CREATE TABLE reviews (id INT(5) PRIMARY KEY NOT NULL AUTO_INCREMENT, data TEXT);

The command above creates a single table for music album reviews (which are essentially paragraphs of text). Populate this table with some sample records:
mysql> INSERT INTO `reviews` (`id`, `data`) VALUES (1, 'Gingerboy has a new single out called Throwing Rocks. It\'s great!'); mysql> INSERT INTO `reviews` (`id`, `data`) VALUES (2, 'Hello all, I really like the new Madonna single. One of the hottest tracks currently playing...I\'ve been listening to it all day'); mysql> INSERT INTO `reviews` (`id`, `data`) VALUES (3, 'Have you heard the new band Hotter Than Hell They have five members and they burn their instruments when they play in concerts. These guys totally rock! Like, awesome, dude!');

Verify that the data has been correctly entered:
mysql> SELECT * FROM reviews;
+----+--------------------------------------------+
| id | data                           |
+----+--------------------------------------------+
| 1 | Gingerboy has a new single out called ... |
| 2 | Hello all, I really like the new Madon ... |
| 3 | Have you heard the new band Hotter Than... |
+----+--------------------------------------------+
3 rows in set (0.00 sec)
2. Define the full-text search fields
Next, define the field you intend to run searches on as a full-text index:
接下来,定义您要作为全文搜索索引的字段
mysql> ALTER TABLE reviews ADD FULLTEXT INDEX (data);
Query OK, 3 rows affected (0.21 sec)
Records: 3 Duplicates: 0 Warnings: 0

Check that the index has been added with a quick SHOW INDEXES command:
mysql> SHOW INDEXES FROM reviews;
+---------+---------------+--------+------+------------+---------+
| Table   | Column_name   | Packed | Null | Index_type | Comment |
----------+---------------+--------+------+------------+---------+
| reviews | id       | NULL   |     | BTREE     |       |
| reviews | data       | NULL   | YES | FULLTEXT   |       |
+---------+---------------+--------+------+------------+---------+
2 rows in set (0.01 sec)
3. Perform a full-text search

Once you've got your data in and indexed, it's time to put MySQL's full-text search through its paces. The simplest form of full-text search is a SELECT query with a MATCH...AGAINST clause. Here's an example which locates records containing the word "single":
mysql> SELECT id FROM reviews WHERE MATCH (data) AGAINST ('single');+----+
| id |
+----+
| 1 |
| 2 |
+----+
2 rows in set (0.00 sec)

Here,MATCH() looks in the field passed to it as argument for text matching the argument passed to AGAINST(). Matches, if any, are returned in the normal way. Note that you can pass MATCH() more than one field to look in―simply separate the field list with commas.

When MySQL receives a full-text search request, it internally ranks each record with a numerical score. Records without a match receive a zero score, while records with a "more relevant" match receive a relatively higher score than those with a "less relevant" match. Relevance is determined by MySQL using a variety of different criteria; look at the MySQL manual for more information on how this works.

To see how each record is ranked, simply return the MATCH() function as part of your result set, as below:
在此,MATCH()将作为参数传递给它的字段中的文字与传递给AGAINST()的参数进行比较,如果有匹配的,那就按照正常的方式返回。注意您可以传递不止一个字段用MATCH()来查看­-只需用逗号来分割字段列表。

当MySQL收到了一个全文搜索的请求,它就在内部对每个记录进行评分,不匹配的记录得分为零,而“更相关”的记录会得到比“不太相关”的记录相对更高的分数。相关性是由MySQL的一系列区分标准来决定的,查看MySQL的用户手册可以得到更多的信息。

想看到每个记录的评分如何,只需要返回MATCH()方法作为结果集的一部分,如下所示:
mysql> SELECT id, MATCH (data) AGAINST ('rock') FROM reviews;
+----+-------------------------------+
| id | MATCH (data) AGAINST ('rock') |
+----+-------------------------------+
| 1 |                   0 |
| 2 |                   0 |
| 3 |           1.3862514533815 |
+----+-------------------------------+
3 rows in set (0.00 sec)
4. Use Boolean search modifiers

You can also use Boolean search modifiers for a more precise search, by adding the special IN BOOLEAN MODE modifier in the AGAINST clause. Here's an example which finds records containing the word "single" but not the word "Madonna":
4、使用逻辑搜索修饰符(Boolean search modifiers)

您还可以使用逻辑搜索修饰符来进行更精确的搜索,这通过在AGAINST语句中添加特殊的IN BOOLEAN MODE修饰符来实现,在以下的例子中,将查找含有单词“single”但是没有“Madonna”的记录:
mysql> SELECT id FROM reviews WHERE MATCH (data) AGAINST ('+single -madonna' IN BOOLEAN MODE);
+----+
| id |
+----+
| 1 |
+----+
1 row in set (0.00 sec)

The most common application of this search feature is to have MySQL search for word fragments (instead of complete words) by using the IN BOOLEAN MODE clause with the * (asterisk) operator. Here's an example, which finds all records containing the word fragment "hot":
这一搜索特性通常用于搜索单词片断(而不是完整的词语),这可以通过在IN BOOLEAN MODE语句中的*(星号)操作符来实现,以下的例子展示了如何查找单词中含有“hot”的记录:
mysql> SELECT id FROM reviews WHERE MATCH (data) AGAINST ('hot*' IN BOOLEAN MODE);+----+
| id |
+----+
| 3 |
| 2 |
+----+
2 rows in set (0.00 sec)

You can also use this to search for records containing at least one of the arguments passed to AGAINST. The following example looks for records containing at least one of the words "hell" and "rocks":
您还可以使用这种方法来查找至少一个传递到AGAINST的参数中,以下的例子查找了至少包含单词“hell”和“rocks”中的一个的记录:
mysql> SELECT id FROM reviews WHERE MATCH (data) AGAINST ('hell rocks' IN BOOLEAN MODE);
+----+
| id |
+----+
| 1 |
| 3 |
+----+
2 rows in set (0.00 sec)

As the examples above illustrate, full-text search provides an efficient alternative to the traditional SELECT...LIKE statement. Try it for yourself the next time you sit down to write a search interface to your MySQL database!
[ 本文最后由 ah 于 2007-11-07 21:19:05 编辑 ]
版权声明: 一起游平台版权声明 侵权联系:邮箱:163828@qq.com
4712000