MySQL Full-Text Search Tuning: Boolean Mode, Relevance Ranking, and InnoDB

MySQL’s built-in FULLTEXT index on InnoDB tables provides fast, relevance-ranked text search without an external engine. The default natural language mode ranks results by TF-IDF frequency; Boolean mode adds operator-based filtering (+must -exclude "exact phrase" word*). Understanding how to tune the minimum word length, stopwords, and the query expansion modes turns MATCH()...AGAINST() from a blunt tool into a serious search feature.

Problem: A WordPress site with a large post archive enables MySQL FULLTEXT search but gets irrelevant results — common words score too high, phrase searches don't work as expected, and the relevance ranking doesn't match user intent.

Solution: Use MATCH AGAINST in IN BOOLEAN MODE for precise control: prefix terms with + (required) or - (excluded), wrap phrases in double quotes, and use * for wildcards. Tune ft_min_word_len and the stopword list in my.cnf. Use IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION to broaden narrow queries.


The SQL below creates a fulltext index on wp_posts, demonstrates all three search modes, shows how to combine fulltext relevance scoring with a recency boost, and covers tuning variables for InnoDB fulltext.


-- 1. Add a FULLTEXT index to wp_posts (safe on InnoDB with online DDL)
ALTER TABLE wp_posts
    ADD FULLTEXT INDEX ft_post_search (post_title, post_content);

-- 2. Natural language mode — relevance ranked automatically
SELECT ID, post_title,
       MATCH(post_title, post_content) AGAINST ('wordpress performance' IN NATURAL LANGUAGE MODE)
           AS relevance
FROM   wp_posts
WHERE  post_status = 'publish'
  AND  MATCH(post_title, post_content) AGAINST ('wordpress performance' IN NATURAL LANGUAGE MODE)
ORDER  BY relevance DESC
LIMIT  10;

-- 3. Boolean mode — fine-grained operator control
--    +word  : must include
--    -word  : must exclude
--    "phrase": exact phrase
--    word*  : prefix wildcard
SELECT ID, post_title
FROM   wp_posts
WHERE  post_status = 'publish'
  AND  MATCH(post_title, post_content)
       AGAINST ('+wordpress +performance -plugin "cache warmup" optim*' IN BOOLEAN MODE)
LIMIT  20;

-- 4. Query expansion mode — adds related terms from top results (two-pass)
SELECT ID, post_title
FROM   wp_posts
WHERE  post_status = 'publish'
  AND  MATCH(post_title, post_content)
       AGAINST ('block editor' WITH QUERY EXPANSION)
LIMIT  10;

-- 5. Relevance + recency composite score
SELECT ID, post_title,
       (
           MATCH(post_title, post_content) AGAINST ('gutenberg blocks' IN NATURAL LANGUAGE MODE)
           * 1.5                              -- relevance weight
           + ( 1 / ( DATEDIFF(NOW(), post_date) + 1 ) ) * 0.5   -- recency boost
       ) AS score
FROM   wp_posts
WHERE  post_status = 'publish'
ORDER  BY score DESC
LIMIT  10;


-- 6. Tune InnoDB fulltext settings (add to my.cnf then restart MySQL)
-- innodb_ft_min_token_size = 2    -- default 3; allows 2-char words like "js"
-- innodb_ft_stopword_table = 'mydb/my_stopwords'  -- custom stopword table
-- innodb_ft_enable_stopword = ON

-- Rebuild the index after changing min_token_size:
ALTER TABLE wp_posts DROP INDEX ft_post_search;
ALTER TABLE wp_posts ADD FULLTEXT INDEX ft_post_search (post_title, post_content);


NOTE: Words that appear in more than 50% of rows are considered too common and are silently ignored in natural language mode even if they are not in the stopword table; use Boolean mode (+word) if you need to force-include a high-frequency term like "wordpress".