
Semantic Transaction Search: Find Transactions by Meaning, Not Just Keywords

Clarity uses sentence-transformers (all-MiniLM-L6-v2) embeddings and cosine similarity to let you search transactions with natural language — find 'that expensive dinner in December' or 'subscription I forgot about' without knowing the exact merchant name.

You remember the transaction but not the exact merchant name. You want “that expensive dinner in December” but you'd have to scroll through weeks of data to find it. Clarity's semantic search understands what you mean — not just what you typed. It uses sentence-transformers embeddings to match your query against transaction descriptions by meaning, so natural language searches return the right results even when the words don't match exactly.

The Problem With Keyword Search

Traditional transaction search is substring matching. You type “coffee” and it finds every transaction with the string “coffee” in the merchant name. That works fine when you remember the exact merchant name — but most of the time you don't. You remember the context, the occasion, the category, or the approximate amount and timeframe. Keyword search can't bridge that gap.
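To make the limitation concrete, here is a minimal sketch of substring matching. The `Transaction` class, the `keyword_search` helper, and the sample merchants are illustrative assumptions, not Clarity's actual code.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    merchant: str
    amount: float


def keyword_search(transactions, term):
    """Return transactions whose merchant name contains `term` (case-insensitive)."""
    needle = term.lower()
    return [t for t in transactions if needle in t.merchant.lower()]


# Made-up sample data for illustration only.
transactions = [
    Transaction("Blue Bottle Coffee", 5.25),
    Transaction("STARBUCKS #1234", 4.80),
    Transaction("NETFLIX.COM", 15.49),
]

coffee_hits = keyword_search(transactions, "coffee")
forgotten_hits = keyword_search(transactions, "subscription I forgot about")
```

In this toy data, "coffee" finds only the merchant with that literal word in its name and misses the other coffee chain, while the natural-language query matches nothing at all.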

Consider a search like “subscription I forgot about”. A keyword search returns nothing — there's no transaction with those words in it. But a semantic search understands that you're looking for a recurring charge you don't recognize, weights transactions that look like subscriptions and have infrequent views, and surfaces them. The query has meaning even when it has no matching keywords.

The same gap shows up constantly in real financial searching. “Large purchase last month” doesn't match anything literally, but it maps naturally to high-value transactions in the previous 30 days. “Coffee shops near work” isn't about geography in the database — it's about understanding that certain merchant names imply coffee shops, and that your commute-time transactions cluster in a certain way. Semantic search handles all of these.

How the Embeddings Work

Clarity uses all-MiniLM-L6-v2 from the sentence-transformers library to generate embeddings for your transactions. When a new transaction arrives, its description is converted into a 384-dimensional vector that captures its semantic meaning — not just the words, but the relationships between concepts those words represent. These embeddings are stored alongside each transaction.
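The embedding step could look roughly like the sketch below. `SentenceTransformer` and its `encode` method come from the sentence-transformers library; the helper names (`load_minilm_encoder`, `embed_transactions`) and the dictionary-based storage are assumptions made for illustration, not a description of Clarity's pipeline. The encoder is passed in as a plain callable so the storage logic can be exercised without downloading the model.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Encoder = Callable[[Sequence[str]], Sequence[Sequence[float]]]


def load_minilm_encoder() -> Encoder:
    """Build an encoder backed by all-MiniLM-L6-v2 (requires sentence-transformers)."""
    # Imported lazily so the rest of this sketch runs without the library.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    return lambda texts: model.encode(list(texts))


def embed_transactions(descriptions: Dict[str, str], encode: Encoder) -> Dict[str, Vector]:
    """Map each (hypothetical) transaction id to the embedding of its description."""
    ids = list(descriptions)
    vectors = encode([descriptions[i] for i in ids])
    return {i: [float(x) for x in vec] for i, vec in zip(ids, vectors)}
```

With the library installed, `embed_transactions(descriptions, load_minilm_encoder())` should return one 384-element vector per transaction id.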

When you run a search query, the same model converts your query text into a vector in the same 384-dimensional space. The search engine then ranks transactions by cosine similarity: the cosine of the angle between the query vector and each transaction vector, which approaches 1 as that angle approaches zero. Transactions whose meaning is close to your query rank at the top, regardless of literal word overlap.
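The ranking step can be sketched in a few lines of plain Python. Cosine similarity is the dot product of two vectors divided by the product of their lengths. The function names and the three-dimensional toy vectors below are assumptions for illustration; real embeddings from this model have 384 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, embeddings, top_k=5):
    """Return (transaction_id, score) pairs sorted from most to least similar."""
    scored = [(tid, cosine_similarity(query_vec, vec)) for tid, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors stand in for real 384-dimensional embeddings.
toy_embeddings = {
    "dinner": [0.9, 0.1, 0.0],
    "groceries": [0.5, 0.5, 0.0],
    "fuel": [0.0, 0.1, 0.9],
}
ranking = rank_by_similarity([1.0, 0.0, 0.0], toy_embeddings, top_k=2)
```

Here the query vector points almost exactly along the "dinner" vector, so that transaction ranks first, followed by "groceries".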

The all-MiniLM-L6-v2 model was chosen for a specific reason: it's a distilled model that produces high-quality semantic representations in a compact, fast package. At roughly 22 million parameters, it runs quickly enough for real-time search without the infrastructure overhead of a larger model. For transaction search where you're expecting instant results as you type, latency matters as much as accuracy, and this model delivers both.

What You Can Search For

Semantic search understands financial intent. Here are examples of queries that return accurate results where keyword search would fail:

  • “Coffee shops near work” — surfaces transactions from coffee chains and independent cafes, weighted toward your typical commute-time spending patterns
  • “That expensive dinner in December” — finds high-value restaurant transactions in December without knowing the restaurant name
  • “Subscription I forgot about” — surfaces recurring charges that don't appear in your regular review flow
  • “Large purchase last month” — maps to high-value transactions in the previous billing cycle
  • “Grocery run before the holiday” — finds supermarket transactions in the days leading up to a major holiday
  • “Gas station on a road trip” — identifies fuel purchases that occurred in unfamiliar locations outside your typical geography

The search also handles approximate matches for merchant names. If you know you bought something from “that streaming service — the one with movies”, semantic search understands that description well enough to surface Netflix, Hulu, and similar services rather than returning nothing.

Works Alongside Filters and Keyword Search

Semantic search isn't a replacement for existing search and filter capabilities — it's an additional layer. Clarity still supports exact keyword matching, date range filters, category filters, amount filters, and account filters. You can combine semantic search with these: search semantically for “dining out” while also filtering to a specific date range and minimum amount. The semantic ranking applies within the filtered set.
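One way to picture "semantic ranking within the filtered set" is to apply the exact filters first and sort only the survivors by similarity. The sketch below assumes a hypothetical `Transaction` record that carries its own embedding, uses made-up merchants, and substitutes two-dimensional toy vectors for real embeddings.

```python
import math
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Transaction:
    description: str
    amount: float
    posted: date
    embedding: List[float]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def search(transactions, query_vec, start=None, end=None, min_amount=None):
    """Apply exact filters first, then rank the remaining transactions by similarity."""
    candidates = [
        t for t in transactions
        if (start is None or t.posted >= start)
        and (end is None or t.posted <= end)
        and (min_amount is None or t.amount >= min_amount)
    ]
    return sorted(candidates, key=lambda t: cosine(query_vec, t.embedding), reverse=True)


# Made-up transactions; the first vector component loosely means "dining".
txns = [
    Transaction("Bistro Lumiere", 180.0, date(2025, 12, 14), [0.9, 0.1]),
    Transaction("Corner Cafe", 12.0, date(2025, 12, 15), [0.8, 0.2]),
    Transaction("Shell Station", 60.0, date(2025, 12, 16), [0.1, 0.9]),
    Transaction("Trattoria Roma", 95.0, date(2025, 11, 2), [0.95, 0.05]),
]
results = search(txns, [1.0, 0.0], start=date(2025, 12, 1),
                 end=date(2025, 12, 31), min_amount=50.0)
```

The November restaurant and the small cafe purchase are excluded by the filters before any ranking happens, so the closest semantic match overall never appears if it falls outside the date range.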

The results page shows both semantic matches and their confidence scores, so you can see why a particular transaction ranked where it did. If the top result isn't what you were looking for, the ranking gives you enough signal to refine the query. “Expensive dinner” might surface a business lunch — changing the query to “anniversary dinner restaurant” shifts the semantic weighting toward celebratory occasions.

For power users who want precise control, the traditional filter interface remains unchanged. Semantic search is the starting point for conversational-style lookups; exact filters are there when you need surgical precision. Most real-world searches start fuzzy and get refined — semantic search supports that workflow naturally.

Privacy and Performance

Transaction embeddings are computed server-side and stored in your account's database alongside the transaction data. The embedding model runs on our infrastructure — your transaction descriptions are processed by our servers, not sent to a third-party embedding API. This keeps your financial data within the same trust boundary as the rest of Clarity.

Search results are instant. The cosine similarity calculation across your transaction history is fast enough that results appear as you type, with no perceptible delay even on accounts with thousands of transactions. The 384-dimension vectors used by all-MiniLM-L6-v2 are compact enough that the similarity computation stays well within the time budget for a responsive search experience.
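A back-of-the-envelope calculation shows why 384-dimensional vectors stay cheap. The sketch assumes each component is stored as a 32-bit float and that search is a brute-force scan over every transaction; neither detail is confirmed above, and the 10,000-transaction account is a made-up example.

```python
DIMENSIONS = 384          # output size of all-MiniLM-L6-v2
BYTES_PER_FLOAT32 = 4     # assumption: vectors stored as 32-bit floats


def index_cost(num_transactions):
    """Return (bytes of vector storage, multiply-adds per brute-force query)."""
    storage_bytes = num_transactions * DIMENSIONS * BYTES_PER_FLOAT32
    multiply_adds = num_transactions * DIMENSIONS
    return storage_bytes, multiply_adds


storage, ops = index_cost(10_000)
print(f"{storage / 1_000_000:.2f} MB of vectors, {ops:,} multiply-adds per query")
```

Under these assumptions, an account with 10,000 transactions needs about 15 MB of vector storage and under four million multiply-adds per query.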

New transactions are embedded automatically when they sync, so there's no manual re-indexing step and the search index stays current with your latest data. Historical transactions imported or connected for the first time are batch-embedded in the background, so search covers your entire transaction history without any action on your part.
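The two indexing paths, embed-on-sync and background backfill, might be organized along these lines. `EmbeddingIndex`, its method names, the batch size, and the in-memory dictionary are all hypothetical stand-ins for whatever storage and job system is actually used.

```python
class EmbeddingIndex:
    """In-memory stand-in for embeddings stored alongside transactions."""

    def __init__(self, encode, batch_size=64):
        self.encode = encode          # callable: list of texts -> list of vectors
        self.batch_size = batch_size
        self.vectors = {}             # transaction id -> embedding

    def on_sync(self, txn_id, description):
        """Embed a single newly synced transaction right away."""
        self.vectors[txn_id] = list(self.encode([description])[0])

    def backfill(self, history):
        """Embed historical (id, description) pairs in batches, skipping known ids."""
        pending = [(i, d) for i, d in history if i not in self.vectors]
        for start in range(0, len(pending), self.batch_size):
            batch = pending[start:start + self.batch_size]
            vectors = self.encode([d for _, d in batch])
            for (txn_id, _), vec in zip(batch, vectors):
                self.vectors[txn_id] = list(vec)
        return len(pending)
```

Batching the backfill keeps the number of model calls small for large imports, and skipping ids that are already present means a repeated backfill does no extra work.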


Frequently Asked Questions

How is semantic search different from keyword search?

Keyword search matches literal strings, so you need to know the words that appear in the merchant name. Semantic search converts your query and all transaction descriptions into embedding vectors and ranks results by cosine similarity, so natural phrases like 'that streaming service with movies' surface the right transactions even without exact word matches.

Which embedding model does Clarity use for semantic search?

Clarity uses all-MiniLM-L6-v2 from the sentence-transformers library, which produces 384-dimensional embeddings. It was chosen for its balance of semantic quality and inference speed — fast enough for real-time search as you type, without sacrificing meaningful accuracy.
