MySQL 9.0 Features: VECTOR Type and JavaScript Stored Programs

MySQL 9.0 (GA 2024) introduces two headline features: the VECTOR data type for storing and querying high-dimensional embeddings (enabling semantic search directly in the database), and JavaScript stored programs via the MLE (Multilingual Engine) component. For WordPress developers, the VECTOR type opens the door to AI-powered search without an external vector database.

Problem: MySQL 8.x provides JSON columns and FULLTEXT indexes, but ML-based semantic search and vector similarity queries — needed for AI-powered WooCommerce product recommendations — require storing and querying high-dimensional embeddings efficiently.

Solution: MySQL 9.0 introduces a native VECTOR column type and the VECTOR_DISTANCE() function for k-nearest-neighbour queries. Store embedding vectors from an ML model (e.g., OpenAI Ada) in a VECTOR(1536) column, then query with ORDER BY VECTOR_DISTANCE(embedding, %s) LIMIT 10. MySQL 9.0 also adds JavaScript stored programs via the MLE component.


The examples below create a VECTOR column for WordPress post embeddings, perform a nearest-neighbour search with DISTANCE(), and show a JavaScript stored function that formats structured data — all in MySQL 9.0.


-- ── 1. Add a VECTOR column to wp_posts for semantic embeddings ───────────
-- VECTOR(1536) stores a 1536-dimensional float32 embedding (OpenAI ada-002 size)
ALTER TABLE wp_posts
    ADD COLUMN post_embedding VECTOR(1536) COMMENT 'OpenAI text-embedding-3-small';

-- Create a VECTOR INDEX for ANN (approximate nearest-neighbour) search
ALTER TABLE wp_posts
    ADD VECTOR INDEX idx_embedding (post_embedding);

-- ── 2. Insert an embedding (from PHP via parameterised query) ─────────────
-- In PHP: $embedding is a float[] from the OpenAI Embeddings API
-- $vector_hex = implode(',', $embedding);
-- Then:
UPDATE wp_posts
   SET post_embedding = STRING_TO_VECTOR('[0.12, -0.34, 0.56, ...]')
 WHERE ID = 42;

-- ── 3. Nearest-neighbour search: find the 5 most similar posts ───────────
SET @query_vector = STRING_TO_VECTOR('[0.10, -0.30, 0.55, ...]');

SELECT
    ID,
    post_title,
    DISTANCE( post_embedding, @query_vector, 'COSINE' ) AS similarity
FROM  wp_posts
WHERE post_status = 'publish'
ORDER BY similarity ASC
LIMIT 5;

-- ── 4. JavaScript stored function (MySQL MLE) ────────────────────────────
-- Requires: INSTALL COMPONENT 'file://component_mle';
CREATE FUNCTION format_price( amount DOUBLE, currency VARCHAR(3) )
RETURNS VARCHAR(20)
LANGUAGE JAVASCRIPT
AS $$
    const formatter = new Intl.NumberFormat('en-US', {
        style:    'currency',
        currency: currency ?? 'USD',
    });
    return formatter.format(amount);
$$;

-- Usage:
SELECT format_price( 1234.5, 'EUR' );   -- €1,234.50

-- ── 5. VECTOR type: check dimensions of stored embedding ─────────────────
SELECT ID, post_title, VECTOR_DIM( post_embedding ) AS dims
FROM   wp_posts
WHERE  post_embedding IS NOT NULL
LIMIT  5;


NOTE: MySQL's VECTOR INDEX uses HNSW (Hierarchical Navigable Small World) indexing for approximate nearest-neighbour search — it trades recall (some nearest neighbours may be missed) for speed; use DISTANCE() with a table scan (no index) when you need 100% recall for small datasets, and the VECTOR INDEX only when you have thousands of rows and can accept ~95% recall.