FULLTEXT in MySQL
When querying your database of cats (as one does) using MySQL, sometimes a simple SELECT * FROM cats WHERE bio LIKE '%boxes%';
is either not strong enough (this query would obviously return many cats), or just inefficient and expensive, as it results in a full table scan. Instead, a FULLTEXT
search is more nuanced, allowing you to run more explicit and multi-worded queries on your database.
There is one thing to first think about in regards to FULLTEXT
. First, depending on the version of MySQL you are using, FULLTEXT
may only work on a MyISAM type table (as of 5.6, InnoDB also supports FULLTEXT
). I could write a whole post about the differences between MyISAM and InnoDB, but for now, this Stack Overflow post sums it up nicely instead.
Once you know your engine type and its FULLTEXT
search capabilities, the first step is to make sure that the columns you want to query are indexed for FULLTEXT
. If your table is already created, the ALTER
command is necessary to change it.
In my table Cats, there are the following columns: cat_id
, name
, age
, gender
, breed
, location
, and bio
. Let’s say I want to be able to run a FULLTEXT
search on cats’ biographies:
ALTER TABLE cats ADD FULLTEXT (bio);
Running a FULLTEXT
search is simple with the use of a WHERE
clause, with a MATCH...AGAINST
added in:
SELECT name, bio FROM cats WHERE MATCH (bio) AGAINST ('boxes Japan');
(In my example, this would return Maru.)
FULLTEXT
has some important rules. Keywords that are less than four characters will be ignored; popular words (stopwords, like ‘and’, ‘the’, etc.) are also ignored; and, if more than half of the record match the keywords, the query won’t return anything at all. This last one is important and should be remembered, as you can become puzzled when your search returns nothing. This is because FULLTEXT
wants to return relevant, specific results.
There are also some useful parameters that can help querying even further. Add IN BOOLEAN MODE
to the AGAINST
clause, and you can further specify if a word must be included (‘+boxes’) or excluded (‘-California’), as well as a word and its variations (the wildcard: ‘box*’ can return ‘box’ and ‘boxes’). It should be noted if you’re using boolean mode, the results will not be ordered or sorted by relevance. An ORDER BY
clause after the AGAINST
clause with the parameters you need will do this. Borrowing from my previous example, let’s say I’m looking for cats who like boxes, but I’m most interested in those who also live in Japan:
1 2 3 4 5 6 |
|
FULLTEXT
is a great feature of MyISAM/newer InnoDB tables, but there are some drawbacks. If you’re strictly InnoDB and using an older version of MySQL, or you have heavy traffic, for example, you may need other options, like a strictly fulltext search server like Sphinx.