Why Keyword Search Is Broken for E-Commerce (and How We Fixed It with Vector Search)

Keyword search is built on an assumption: that the customer knows exactly what to type to find what they want. In practice this assumption breaks constantly. A customer who wants a drill for DIY jobs around the house types “drill for home DIY”, and keyword search returns nothing useful, because no product description contains all three of those words (drill, home, DIY) together. They leave. The sale is lost. The product they needed was in the catalogue the whole time.

The Fundamental Limitation of Keyword Matching

Traditional e-commerce search operates on inverted indexes: it matches query terms against product titles, descriptions, and attribute fields. This works well when customers use the exact terminology a product manager chose when writing the description. It fails when customers describe what they want in natural language, use synonyms, search by use case rather than product category, or phrase the query in a way that differs from how the product is classified internally.
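To make the failure mode concrete, here is a minimal sketch of an inverted index with strict all-terms matching. The three-product catalogue, the whitespace tokenizer, and the function names are assumptions made for illustration. Production engines typically add stemming, stop-word handling, and ranked scoring, so treat this as a simplified model of the behaviour described above rather than how any particular engine works.

```python
from collections import defaultdict

# Hypothetical three-product catalogue, for illustration only.
CATALOGUE = {
    "p1": "Compact cordless power drill 18V",
    "p2": "Corded hammer drill for masonry",
    "p3": "Garden hose reel",
}


def tokenize(text):
    """Lowercase and split on whitespace (no stemming, no stop words)."""
    return text.lower().split()


def build_inverted_index(catalogue):
    """Map each term to the set of product ids whose text contains it."""
    index = defaultdict(set)
    for product_id, text in catalogue.items():
        for term in tokenize(text):
            index[term].add(product_id)
    return index


def keyword_search(index, query):
    """Return ids of products that contain every query term (strict AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


INDEX = build_inverted_index(CATALOGUE)
print(keyword_search(INDEX, "cordless drill"))
print(keyword_search(INDEX, "drill for home diy"))
```

The first query matches the cordless drill because both terms appear in its text. The second returns an empty set: "home" and "diy" appear nowhere in the catalogue, even though two drills are in stock.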

Adding synonyms, typo correction, and faceted filtering helps at the margins, but it is fundamentally additive patching on a model that does not understand meaning. What we built instead was a system that understands the intent behind a query — not just the words in it.

How Vector Search Works

Vector search is built on a different principle. Both the product catalogue and the search query are converted into numerical vectors — high-dimensional representations of meaning generated by an embedding model. Similar meanings produce vectors that are mathematically close to each other. When a user searches, the system converts their query into a vector and finds the products whose vectors are nearest — regardless of whether they share any common keywords.

In practice: “powerful drill for home DIY” and “compact cordless power drill” produce vectors that are close together, because the embedding model has learned that these phrases describe the same kind of thing. The search returns the right product even though no keyword matched.
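The idea of "nearest vectors" can be sketched with cosine similarity and a brute-force scan. The three-dimensional vectors below are invented for illustration and do not come from a real embedding model, which would produce hundreds or thousands of dimensions. A vector database would also replace the linear scan with an approximate nearest-neighbour index.

```python
import math

# Toy 3-dimensional vectors invented for illustration; they are not the
# output of any real embedding model.
PRODUCT_VECTORS = {
    "compact cordless power drill": [0.9, 0.1, 0.2],
    "garden hose reel": [0.1, 0.9, 0.1],
    "stainless steel saucepan": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_products(query_vector, product_vectors, top_k=2):
    """Score every product against the query and return the top_k matches."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in product_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical vector standing in for the query "powerful drill for home DIY".
QUERY_VECTOR = [0.8, 0.2, 0.1]
RESULTS = nearest_products(QUERY_VECTOR, PRODUCT_VECTORS)
for name, score in RESULTS:
    print(f"{name}: {score:.3f}")
```

With these toy values the drill ranks first by a wide margin, although the query and the product name share only the word "drill". Ranking depends on vector geometry, not on token overlap.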

The System We Built

  • Azure OpenAI embeddings — product descriptions, titles, and attributes are embedded into vectors at index time; user queries are embedded at search time using the same model for consistent semantic representation
  • Qdrant vector database — the product vector index is stored and queried in Qdrant, which handles approximate nearest-neighbour search efficiently at catalogue scale
  • Hybrid search layer — vector similarity search is combined with traditional keyword search in a weighted hybrid. Keyword results handle exact-match cases (product codes, brand names) where semantic search adds noise; vector results handle natural language queries where keyword search fails
  • Re-ranking — results from both layers are re-ranked using a cross-encoder model that evaluates relevance more precisely than either search method does alone
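One way to picture the weighted hybrid layer is as a blend of two normalized score lists. The sketch below is an assumption about how such a blend could work, not a description of the production system: the min-max normalization, the 0.7 vector weight, the function names, and the example scores are all hypothetical. The cross-encoder re-ranking step is left out.

```python
def min_max_normalize(scores):
    """Rescale a {product_id: score} dict to the 0..1 range."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {pid: 1.0 for pid in scores}
    return {pid: (s - low) / (high - low) for pid, s in scores.items()}


def hybrid_rank(keyword_scores, vector_scores, vector_weight=0.7):
    """Blend keyword and vector scores; return product ids, best first.

    vector_weight is a hypothetical tuning knob: 1.0 means pure vector
    search, 0.0 means pure keyword search.
    """
    kw = min_max_normalize(keyword_scores)
    vec = min_max_normalize(vector_scores)
    combined = {}
    for pid in set(kw) | set(vec):
        combined[pid] = (
            (1 - vector_weight) * kw.get(pid, 0.0)
            + vector_weight * vec.get(pid, 0.0)
        )
    return sorted(combined, key=lambda pid: combined[pid], reverse=True)


# Hypothetical raw scores from each layer for the same query.
KEYWORD_SCORES = {"p1": 2.0, "p2": 8.0}
VECTOR_SCORES = {"p1": 0.91, "p3": 0.70, "p2": 0.40}

print(hybrid_rank(KEYWORD_SCORES, VECTOR_SCORES))
print(hybrid_rank(KEYWORD_SCORES, VECTOR_SCORES, vector_weight=0.0))
```

Normalizing first matters because keyword scores and similarity scores live on different scales. In a setup like this, the weight could be shifted toward the keyword side for queries that look like product codes or brand names, where exact matching is the more reliable signal.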

A Concrete Example

User query: cheap drill for home use

Keyword search result: nothing relevant is returned if no product title contains all of the query terms together.

Vector search result: the system interprets the query as a request for an affordable, entry-level power drill suitable for non-professional use, and returns a ranked list of cordless drills in the budget and mid-range price categories — sorted by semantic relevance to the expressed intent.

Business Impact

  • Improved product discovery for natural-language and long-tail queries that keyword search consistently fails on
  • Reduced zero-result searches — one of the clearest indicators of search failure and missed revenue
  • Better product recommendations based on intent-based similarity rather than co-purchase correlation alone
  • Foundation for conversational search — users can describe what they need across multiple turns and get progressively refined results

Vector search has applications well beyond e-commerce. Internal knowledge retrieval, document discovery, compliance research, and financial data exploration are all domains where semantic understanding outperforms keyword matching. This work aligns with TechZiel’s capability in data platforms, vector databases, and AI-powered information systems. If you are evaluating how vector search could improve search or discovery in your environment, speak with us.
