
pgvector vs Pinecone for AI Agent Memory: Cost, Performance & n8n/Make.com/Zapier Guide

By retainr team · 10 min read · Updated Mar 10, 2026

The vector database market exploded over the past two years. Pinecone, Weaviate, Qdrant, Chroma, Milvus — there are more options than most teams can evaluate.

For AI agent memory specifically, the comparison that comes up most often is pgvector vs Pinecone. One is a Postgres extension. The other is a purpose-built vector database with a $100M+ funding pedigree.

This post cuts through the marketing and answers the actual question: which one should you use for AI agent memory, at what scale, and what does it cost? It also covers what this means for automation builders using n8n, Make.com, and Zapier.

What We're Actually Comparing

Before the comparison, a clarification: pgvector is a Postgres extension, not a standalone product. You run it on a Postgres instance — self-hosted, on RDS, on Supabase, or on Neon. It adds vector column types and similarity search operators to standard SQL.
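Concretely, that looks like the following minimal schema sketch (table and column names are illustrative; the 1536 dimensions match the embedding size used in the cost examples below):

```sql
-- Enable the extension, then use `vector` as an ordinary column type
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE memories (
  id         bigserial PRIMARY KEY,
  user_id    text NOT NULL,
  content    text NOT NULL,
  embedding  vector(1536),          -- e.g. a 1536-dimension embedding
  created_at timestamptz DEFAULT NOW()
);
```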

Pinecone is a managed vector database service. You call their API to upsert vectors and query them. There's no SQL, no schemas, no joins.

This is a meaningful distinction because the choice isn't just about vector search performance — it's about your entire data architecture.

Performance: The Real Numbers

Let's start with what actually matters for AI agent memory workloads.

Query Latency

Typical workload: a user sends a message, you search their memory for relevant context, you inject it into the prompt. The memory search needs to complete in under 200ms to not meaningfully impact response time.

pgvector (HNSW index, self-hosted on 4-core machine):

  • 10,000 vectors: ~5-10ms
  • 100,000 vectors: ~20-50ms
  • 1,000,000 vectors: ~50-120ms

Pinecone (s1 pod, standard tier):

  • 10,000 vectors: ~30-60ms (network included)
  • 100,000 vectors: ~30-80ms
  • 1,000,000 vectors: ~40-100ms

At the scales relevant to most AI agent deployments (under 1M vectors), pgvector is as fast as Pinecone — sometimes faster, because there's no network hop to an external service.

💡

pgvector's HNSW index is the key to performance. Make sure you create it explicitly: CREATE INDEX ON memories USING hnsw (embedding vector_cosine_ops). Without it, pgvector falls back to a full table scan, which is slow at any scale.
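A quick way to confirm the index is actually in use is to inspect the query plan (a sketch; assumes a memories table with an embedding column, with a toy 3-dimension literal for readability — the literal's dimension must match the column's declared dimension):

```sql
-- If the plan shows a sequential scan rather than an index scan,
-- the HNSW index is missing or not being picked up
EXPLAIN ANALYZE
SELECT content FROM memories
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'::vector
LIMIT 5;
```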

Write Throughput

For agent memory, writes happen after each interaction. This is typically 1-10 writes per second for most deployments.

Both pgvector and Pinecone handle this trivially. Write throughput only becomes a concern above 1,000 writes/second.
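In pgvector, each memory write is a plain INSERT (a sketch assuming an illustrative memories table; the 3-dimension literal is for readability and must match the column's declared dimension in a real schema):

```sql
-- One write per agent interaction
INSERT INTO memories (user_id, content, embedding)
VALUES ('user_123', 'Prefers async status updates', '[0.1, 0.2, 0.3]'::vector);
```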

Cost Analysis

For Small Deployments (under 100k vectors)

pgvector (on a $10/month VPS):

  • Compute: $10-30/month
  • Total: $10-30/month

Pinecone (free tier: 100k vectors, 1 namespace):

  • Total: $0/month

Pinecone wins for small deployments. Their free tier is genuinely generous.

For Medium Deployments (100k - 10M vectors)

pgvector (on a 4-core/8GB VPS):

  • Compute + storage: $50-150/month

Pinecone (serverless):

  • ~$0.04/1M read units + $2/GB storage
  • 10M vectors at 1536 dims = ~24GB → ~$48/month storage
  • Plus read operations: at 100k searches/month = $4
  • Total: ~$52/month

Cost parity at this range. The difference is operational complexity.

For Large Deployments (10M+ vectors)

pgvector (dedicated 8-core/32GB Postgres):

  • Compute: $200-500/month
  • Storage: $50-100/month
  • Total: $250-600/month

Pinecone (serverless, heavy usage):

  • Storage: $480/month (240GB for 100M vectors)
  • Read operations: 1M searches/month = $40
  • Total: $520+/month

At large scale, costs are roughly comparable, but pgvector gives you more control.

⚠️

These are rough estimates. Pinecone pricing depends heavily on your dimension count, operation frequency, and replication settings.

Complexity Comparison

pgvector

What you manage:

  • Postgres instance (backups, scaling, updates)
  • HNSW index tuning (m, ef_construction parameters)
  • Schema design (vectors live alongside your other data)
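Beyond the build-time m and ef_construction parameters listed above, pgvector also exposes a query-time tuning knob:

```sql
-- hnsw.ef_search (default 40) trades recall for latency at query time;
-- raise it if approximate search misses relevant memories
SET hnsw.ef_search = 100;
```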

What you get:

  • SQL joins between vectors and your regular data
  • ACID transactions
  • Row-level security (filter vectors by user_id with RLS policies)
  • No additional API to integrate
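The row-level security point deserves a sketch. One common pattern (names are illustrative; assumes the application sets app.current_user_id per connection or transaction):

```sql
ALTER TABLE memories ENABLE ROW LEVEL SECURITY;

-- Every query against memories is transparently filtered to the
-- current user; the database enforces the boundary, not the app code
CREATE POLICY memories_isolation ON memories
  USING (user_id = current_setting('app.current_user_id'));
```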

Pinecone

What you manage:

  • Pinecone API integration
  • Metadata sync (your business data lives elsewhere)
  • Namespace management

What you get:

  • Zero infrastructure management
  • Simple upsert/query API
  • Managed scaling

The Data Fragmentation Problem with Pinecone

Your AI agent memory exists alongside other data: user records, subscription status, conversation metadata, timestamps, tags. With pgvector, all of this lives in one database. A single SQL query can filter memories by user, join to the subscription table, and return only memories from the last 30 days.

-- query_embedding is a placeholder for a bound parameter:
-- the embedding of the user's current message
SELECT m.content, m.created_at, m.tags
FROM memories m
JOIN users u ON m.user_id = u.id
WHERE m.user_id = 'user_123'
  AND u.plan = 'pro'
  AND m.created_at > NOW() - INTERVAL '30 days'
ORDER BY m.embedding <=> query_embedding
LIMIT 5;

With Pinecone, you'd need to:

  1. Query Pinecone for similar vectors
  2. Filter by user_id in Pinecone metadata
  3. Take the returned IDs
  4. Query your main database for full record details
  5. Join the results yourself

Every query becomes a multi-step operation across two systems, and the two stores inevitably drift out of sync.

💡

If your AI agent needs to filter memories by user attributes (plan, signup date, region), pgvector is significantly simpler. Pinecone's metadata filtering is limited compared to SQL WHERE clauses.

What This Means for n8n, Make.com & Zapier Users

If you're building AI agents on automation platforms, you don't need to choose between pgvector and Pinecone — you need a memory API that:

  1. Works with your platform via simple HTTP calls
  2. Handles the vector infrastructure for you
  3. Provides fast semantic search

For n8n users, the choice between pgvector and Pinecone is irrelevant — what matters is having a memory service that integrates as a community node.

The retainr n8n community node gives you pgvector-backed semantic search without managing any database:

Settings → Community Nodes → Install → n8n-nodes-retainr

Under the hood, retainr uses pgvector with HNSW indexing, user-level isolation via RLS policies, and automatic embedding generation. You get all the pgvector benefits with none of the setup.

For teams who need to build their own: if you're self-hosting n8n and want to manage your own vector database, pgvector is the better choice because it integrates with your existing Postgres instance.

When to Actually Use Pinecone

pgvector is better for most AI agent workloads, but Pinecone is genuinely the right choice in specific situations:

Use Pinecone when:

  • You need to store 100M+ vectors and don't want to manage Postgres scaling
  • Your team has zero database operations experience
  • You need multi-region replication for vectors specifically
  • You're early-stage and the free tier covers your current scale

Use pgvector when:

  • Your vectors need to join with relational data (almost always true for user memory)
  • You already run Postgres
  • You want ACID transactions spanning vector and non-vector data
  • You need row-level security
  • You want to keep infrastructure simple

Why retainr Uses pgvector

retainr is built on pgvector for three reasons:

1. User isolation via RLS. Each workspace's memories are scoped with PostgreSQL row-level security policies. A bug in the API layer cannot leak User A's memories to User B — the database enforces the boundary.

2. Transactional integrity. When you store a memory and update your usage counter in the same request, it happens atomically. Either both succeed or neither does.

3. SQL flexibility. Complex analytics queries on memories — who's storing the most, which tags appear together, what percentage of memories are being recalled — are native SQL.
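For instance, a tag-frequency report is one GROUP BY away (a sketch assuming tags is a text[] column, as in the earlier query example):

```sql
-- Which tags appear most often across all stored memories
SELECT tag, COUNT(*) AS uses
FROM memories, unnest(tags) AS tag
GROUP BY tag
ORDER BY uses DESC
LIMIT 10;
```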

Benchmark: pgvector HNSW vs. Flat Scan

Without HNSW index:

-- Full table scan — O(n), slow at scale
-- 100k rows: ~800ms
SELECT content FROM memories
ORDER BY embedding <=> query_embedding
LIMIT 5;

With HNSW index:

CREATE INDEX ON memories USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);
-- Same query
-- 100k rows: ~20ms

The HNSW index reduces query time by 40-50x. It uses approximate nearest-neighbor search — for AI agent memory retrieval, the approximate results are indistinguishable from exact results in practice.
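The <=> operator used throughout is cosine distance (1 minus cosine similarity), which you can verify directly:

```sql
SELECT '[1,0]'::vector <=> '[1,0]'::vector AS identical,   -- 0 (same direction)
       '[1,0]'::vector <=> '[0,1]'::vector AS orthogonal;  -- 1 (no similarity)
```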

Give your AI agents a real memory

Free plan includes 1,000 memory operations/month. No credit card required.

Use pgvector-backed memory without the infrastructure

Summary

Factor                            pgvector           Pinecone
Performance at 100k vectors       Fast (20-50ms)     Fast (30-80ms)
Performance at 1M vectors         Fast (50-120ms)    Fast (40-100ms)
Cost at small scale               $10-30/mo          Free tier
Cost at 10M vectors               $50-150/mo         ~$52/mo
SQL joins                         Yes                No
ACID transactions                 Yes                No
Managed service                   Self-managed       Fully managed
User isolation (RLS)              Yes (native)       Via metadata filter
Works with n8n/Make.com/Zapier    Via retainr API    Via Pinecone API

For AI agent memory with user-scoped data that needs to join with other user data — pgvector is the better architecture. Pinecone is simpler to get started with and appropriate when you don't need relational capabilities.

If you want the pgvector benefits without managing the infrastructure, that's exactly what retainr provides — and it integrates directly with n8n, Make.com, and Zapier.

Frequently Asked Questions

Do n8n, Make.com, and Zapier users need to know about vector databases? No. If you use retainr, the vector database is handled for you. This article is for teams building custom memory infrastructure for AI agents.

Can I migrate from Pinecone to pgvector later? Yes, with effort. You can export the stored vectors from Pinecone and re-import them into pgvector (or re-embed from the source content). Plan your storage choice early — migrations at large scale are painful.

Is Supabase pgvector the same as running pgvector yourself? Supabase hosts Postgres with pgvector enabled. It's a managed pgvector option — you get pgvector capabilities with Supabase handling the infrastructure. A good middle ground between self-hosted and Pinecone.
