Vector Embedding Targeting
Use this guide to understand how the platform uses vector embeddings to power content-aware link personalization.
When to use
- You want to understand how the platform recommends or groups links by similarity.
- You are building a page or profile and want to know how link ordering and relevance work.
- You want to optimize your link metadata (titles, tags, categories) for better personalization.
What are link embeddings?
The platform generates lightweight vector embeddings for each link in a handle's corpus. These embeddings capture the semantic content of a link based on its metadata, enabling:
- Similarity-based link grouping — Links with similar content cluster together.
- Content-aware ordering — Pages can order links by relevance to a visitor's observed interests.
- Personalization signals — Embeddings feed into the targeting pipeline alongside behavioral signals.
How it works
1. Source text extraction
For each link, the platform builds a source text string from:
| Field | Example |
|---|---|
| title | "Shop Now" |
| tags | "fashion", "sale" |
| ogDescription | "Browse our latest collection" |
| categories | "e-commerce" |
These fields are concatenated (space-separated) and truncated to 256 characters.
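The extraction step can be sketched as follows. This is a minimal illustration, not the platform's implementation; the record shape and helper name (`build_source_text`) are assumptions based on the field table above.

```python
def build_source_text(link: dict, max_len: int = 256) -> str:
    """Concatenate link metadata (space-separated), truncated to max_len chars."""
    parts = [
        link.get("title", ""),
        " ".join(link.get("tags", [])),
        link.get("ogDescription", ""),
        " ".join(link.get("categories", [])),
    ]
    text = " ".join(p for p in parts if p)
    return text[:max_len]

link = {
    "title": "Shop Now",
    "tags": ["fashion", "sale"],
    "ogDescription": "Browse our latest collection",
    "categories": ["e-commerce"],
}
source = build_source_text(link)
# → "Shop Now fashion sale Browse our latest collection e-commerce"
```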
2. Tokenization
The source text is:
- Lowercased.
- Split on non-alphanumeric characters.
- Filtered to remove single-character tokens.
- Filtered to remove common stopwords (a, the, is, in, for, etc.).
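A sketch of these four filtering steps, assuming a small illustrative stopword set (the platform's full stopword list is not documented here):

```python
import re

STOPWORDS = {"a", "the", "is", "in", "for", "of", "and", "to", "on"}  # illustrative subset

def tokenize(text: str) -> list[str]:
    # Lowercase, split on non-alphanumeric runs, then drop
    # single-character tokens and common stopwords.
    tokens = re.split(r"[^a-z0-9]+", text.lower())
    return [t for t in tokens if len(t) > 1 and t not in STOPWORDS]

tokenize("Shop Now: the latest in fashion & sale!")
# → ["shop", "now", "latest", "fashion", "sale"]
```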
3. TF-IDF weighting
Each token is weighted using Term Frequency - Inverse Document Frequency (TF-IDF):
- TF (Term Frequency) — How often the token appears in this link's text.
- IDF (Inverse Document Frequency) — How rare the token is across all links in the handle. Rare terms are weighted higher.
This means distinctive terms (like a product name) contribute more to the embedding than common terms.
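The weighting can be sketched like this. The platform's exact TF and IDF formulas are not specified; this uses a common variant (relative term frequency times the natural-log IDF) purely to show why rare terms dominate.

```python
import math
from collections import Counter

def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Per-document TF-IDF weights; docs are pre-tokenized token lists."""
    n = len(docs)
    # Document frequency: in how many docs each token appears.
    df = Counter(t for doc in docs for t in set(doc))
    out = []
    for doc in docs:
        tf = Counter(doc)
        out.append({
            t: (count / len(doc)) * math.log(n / df[t])
            for t, count in tf.items()
        })
    return out

docs = [
    ["shop", "now", "fashion", "sale"],
    ["shop", "now", "rings", "jewelry"],
    ["blog", "post", "fashion", "tips"],
]
weights = tfidf(docs)
# "rings" (1 of 3 docs) is weighted higher than "shop" (2 of 3 docs)
```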
4. Random projection
The TF-IDF vector is compressed to 32 dimensions using sparse random projection (Achlioptas 2003). This technique:
- Preserves relative distances between embeddings.
- Uses a deterministic seed so embeddings are reproducible across renders.
- Is extremely fast (<5 ms for 20 links, zero external dependencies).
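A sketch of the Achlioptas-style projection for a sparse TF-IDF vector. Entries of the projection matrix are +√3, 0, or −√3 with probabilities 1/6, 2/3, 1/6, as in the paper; seeding each token's row from the token name (so the matrix never needs to be stored) is an assumption about how determinism could be achieved, not the platform's documented scheme.

```python
import random

def project(vec: dict[str, float], dim: int = 32, seed: int = 42) -> list[float]:
    """Sparse random projection (Achlioptas 2003) of a sparse TF-IDF vector."""
    out = [0.0] * dim
    for token, weight in vec.items():
        # Each token deterministically seeds its own projection row,
        # so results are reproducible across renders.
        rng = random.Random(f"{seed}:{token}")
        for j in range(dim):
            r = rng.random()
            # +sqrt(3) with p=1/6, -sqrt(3) with p=1/6, 0 with p=2/3
            if r < 1 / 6:
                out[j] += weight * 3 ** 0.5
            elif r < 2 / 6:
                out[j] -= weight * 3 ** 0.5
    return out
```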
5. L2 normalization
The final embedding is L2-normalized (unit length) and rounded to 4 decimal places for storage efficiency.
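The final step, sketched directly from the description above (the zero-vector guard is an assumption for links with empty metadata):

```python
def l2_normalize(vec: list[float], precision: int = 4) -> list[float]:
    """Scale to unit length, then round each component to 4 decimal places."""
    norm = sum(x * x for x in vec) ** 0.5
    if norm == 0:
        return [0.0] * len(vec)
    return [round(x / norm, precision) for x in vec]

l2_normalize([3.0, 4.0])  # → [0.6, 0.8]
```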
Minimum corpus size
Embeddings require at least 3 links in a handle's corpus to be meaningful. With fewer than 3 links, the IDF component collapses (there is not enough contrast between documents), and the platform returns empty embeddings.
How embeddings are used
Page rendering
When a profile page or link collection is rendered, embeddings are computed for all links in the handle. The rendering pipeline can then:
- Cluster similar links together in the UI.
- Rank links by relevance to visitor signals or segments.
- Power "related links" sections on individual link pages.
Targeting pipeline integration
Embeddings complement the chain signals system:
- Visitor signals capture behavioral intent (what the visitor clicked, viewed, or searched for).
- Link embeddings capture content semantics (what each link is about).
- The targeting pipeline can match visitor intent signals against link embeddings to surface the most relevant content.
This enables scenarios like: a visitor who has signaled interest in "rings" sees jewelry-related links ranked higher in their personalized view.
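Because embeddings are unit length, matching visitor intent against link content reduces to a dot product. The sketch below assumes a visitor-interest vector already exists in the same embedding space; how the platform derives that vector from behavioral signals is not specified here, and the `rank_links` helper is hypothetical.

```python
def cosine(a: list[float], b: list[float]) -> float:
    # Embeddings are L2-normalized, so cosine similarity is just a dot product.
    return sum(x * y for x, y in zip(a, b))

def rank_links(visitor_vec: list[float], links: list[dict]) -> list[dict]:
    """Order links by similarity to a visitor-interest vector."""
    return sorted(links, key=lambda l: cosine(visitor_vec, l["embedding"]), reverse=True)

visitor = [1.0, 0.0]  # toy 2-dim interest vector (real embeddings are 32-dim)
links = [
    {"title": "Gold rings", "embedding": [1.0, 0.0]},
    {"title": "Blog post", "embedding": [0.0, 1.0]},
]
rank_links(visitor, links)[0]["title"]  # → "Gold rings"
```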
Optimizing for embeddings
To get the most out of embedding-based targeting:
| Do | Why |
|---|---|
| Write descriptive link titles | Titles are the primary source text |
| Add relevant tags | Tags add distinctive keywords the title may lack |
| Fill in ogDescription | Adds additional context for the embedding |
| Use categories consistently | Categories help links cluster into meaningful groups |
| Keep metadata specific | Generic terms ("best", "great") add noise, not signal |
| Avoid | Why |
|---|---|
| Identical titles across links | Identical source text produces identical embeddings |
| Empty metadata fields | Less text means a weaker, less distinctive embedding |
| Excessive stopwords in titles | Stopwords are filtered out and contribute nothing |
Technical details
| Property | Value |
|---|---|
| Embedding dimension | 32 |
| Algorithm | TF-IDF + sparse random projection |
| Normalization | L2 (unit length) |
| Precision | 4 decimal places |
| Minimum corpus | 3 links |
| Source text max length | 256 characters |
| Projection seed | Deterministic (default 42) |
| Performance | <5 ms for 20 links |
| External dependencies | None |
UI path
Embeddings are computed automatically during page rendering. There is no separate UI to configure them. To influence embedding quality:
- Open https://app.{PUBLIC_DOMAIN}.
- Navigate to your handle and open Links.
- For each link, fill in the title, tags, categories, and social preview description.
- Save changes — embeddings are recomputed on the next page render.
Required auth
Embeddings are computed internally during page rendering. No separate auth is needed. Link metadata updates require CREATOR-level JWT or PAT with links.write scope.
Related: