SKILL #GIES

embedding-strategies

Optimize your embedding models fast

Select and optimize embedding models for semantic search and RAG applications. Use when choosing embedding models, implementing chunking strategies, or optimizing embedding quality for specific domains.

↗ github · ★ 37k·src: wshobson/agents

the manual

Embedding Strategies

Guide to selecting and optimizing embedding models for vector search applications.

When to Use This Skill

Choosing embedding models for RAG
Optimizing chunking strategies
Fine-tuning embeddings for domains
Comparing embedding model performance
Reducing embedding dimensions
Handling multilingual content

Core Concepts

1. Embedding Model Comparison (2026)

Model	Dimensions	Max Tokens	Best For
voyage-3-large	1024	32000	Claude apps (Anthropic recommended)
voyage-3	1024	32000	Claude apps, cost-effective
voyage-code-3	1024	32000	Code search
voyage-finance-2	1024	32000	Financial documents
voyage-law-2	1024	32000	Legal documents
text-embedding-3-large	3072	8191	OpenAI apps, high accuracy
text-embedding-3-small	1536	8191	OpenAI apps, cost-effective
bge-large-en-v1.5	1024	512	Open source, local deployment
all-MiniLM-L6-v2	384	256	Fast, lightweight
multilingual-e5-large	1024	512	Multi-language

2. Embedding Pipeline

Document → Chunking → Preprocessing → Embedding Model → Vector
                ↓
        [Overlap, Size]  [Clean, Normalize]  [API/Local]

Templates and detailed worked examples

Full template library and detailed worked examples live in references/details.md. Read that file when you need the concrete templates.

Best Practices

Do's

Match model to use case: Code vs prose vs multilingual
Chunk thoughtfully: Preserve semantic boundaries
Normalize embeddings: For cosine similarity search
Batch requests: More efficient than one-by-one
Cache embeddings: Avoid recomputing for static content
Use Voyage AI for Claude apps: Recommended by Anthropic

Don'ts

Don't ignore token limits: Truncation loses information
Don't mix embedding models: Incompatible vector spaces
Don't skip preprocessing: Garbage in, garbage out
Don't over-chunk: Lose important context
Don't forget metadata: Essential for filtering and debugging

more research

scroll →

Boost search results with hybrid methods

Research

HOT

Boost search results with hybrid methods

hybrid-search-implementation

♥ 0@ 0★ 37k

Research

HOT

Build smarter AI with RAG systems

rag-implementation

♥ 0@ 0★ 37k

Research

HOT

Optimize vector search performance fast

vector-index-tuning

♥ 0@ 0★ 37k

Research

HOT

Boost search speed with smart indexing

similarity-search-patterns

♥ 0@ 0★ 37k

Research

HOT

protocolsio-integration

transformer-lens-interpretability

♥ 0@ 0★ 28k