Getting Started

Postgres Vector (pgvector) is a PostgreSQL extension that adds vector similarity search to your database. It lets you store and query high-dimensional embeddings, such as those produced by OpenAI, Cohere, or Sentence Transformers, directly alongside your relational data. pgvector is available as an open web service in Eyevinn Open Source Cloud (OSC) and is widely used for AI-powered search, semantic recommendations, and retrieval-augmented generation (RAG) applications. This tutorial walks you through getting started.

Prerequisites

Step 1: Create a Postgres Vector instance

Navigate to the Postgres Vector service in the Eyevinn OSC web console. Click Create pgvector and fill in:

Field            | Description
Name             | Short alphanumeric name for your instance
PostgresPassword | Password for the postgres superuser (required)
PostgresUser     | Superuser username (default: postgres)
PostgresDb       | Default database name (default: same as user)

Once the instance status turns green (running), click the instance card. Note the IP address and port shown; you will need them to build the connection string.

Step 2: Connect to the database

Based on the IP and port, the connection URL for your database is:

postgres://<user>:<password>@<IP>:<PORT>/<db>

For example, using the defaults:

postgres://postgres:mypassword@<IP>:<PORT>/postgres

Test the connection with psql:

psql "postgres://postgres:mypassword@<IP>:<PORT>/postgres"
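If the password contains characters such as @, : or /, the URL will not parse as intended unless those characters are percent-encoded. The following is a minimal sketch of building the URL in Python; build_postgres_url is a hypothetical helper name, and the host and port are placeholder values rather than real OSC endpoints.

```python
from urllib.parse import quote


def build_postgres_url(user, password, host, port, db):
    """Build a postgres:// URL, percent-encoding user, password and db name.

    build_postgres_url is a hypothetical helper used for illustration only.
    """
    # safe="" makes quote() also encode "/", which it leaves alone by default
    return "postgres://{}:{}@{}:{}/{}".format(
        quote(user, safe=""),
        quote(password, safe=""),
        host,
        port,
        quote(db, safe=""),
    )


# 203.0.113.10 and 10501 are placeholders for the IP and port of your instance
url = build_postgres_url("postgres", "p@ss:word/1", "203.0.113.10", 10501, "postgres")
print(url)
```

The resulting string can be passed to psql or to a driver such as psycopg2 in place of the hand-written URL.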

Step 3: Enable the pgvector extension

Once connected, enable the extension in each database where you want to use vector search:

CREATE EXTENSION IF NOT EXISTS vector;

Step 4: Create a vector column and insert embeddings

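-- Create a table with a vector column.
-- This tutorial uses 3 dimensions so that the example vectors stay short;
-- in practice, declare the dimension of your embedding model
-- (e.g. 384 for all-MiniLM-L6-v2).
CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  content TEXT,
  embedding vector(3)
);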

-- Insert a document with its embedding (example with 3 dimensions)
INSERT INTO documents (content, embedding)
VALUES ('Hello world', '[0.1, 0.2, 0.3]');
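pgvector rejects an insert whose vector length differs from the dimension declared on the column. A small client-side check can surface that mistake before the statement reaches the database. The sketch below is an illustration only: to_vector_literal is a hypothetical helper name, and the finite-value check reflects the assumption that the vector type does not accept NaN or infinite elements.

```python
import math


def to_vector_literal(values, expected_dim):
    """Format numbers as a pgvector text literal such as '[0.1,0.2,0.3]'.

    to_vector_literal is a hypothetical helper; expected_dim should match
    the dimension declared on the target vector column.
    """
    values = [float(v) for v in values]
    if len(values) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(values)}")
    # Assumption: the vector type does not accept NaN or infinite elements
    if not all(math.isfinite(v) for v in values):
        raise ValueError("vector elements must be finite numbers")
    return "[" + ",".join(repr(v) for v in values) + "]"


print(to_vector_literal([0.1, 0.2, 0.3], expected_dim=3))
```

The returned string can be bound as a query parameter and cast with ::vector on the SQL side.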

Step 5: Query by similarity

-- Find the 5 most similar documents to a query vector (cosine distance)
SELECT id, content, 1 - (embedding <=> '[0.1, 0.15, 0.25]') AS similarity
FROM documents
ORDER BY embedding <=> '[0.1, 0.15, 0.25]'
LIMIT 5;

Supported distance operators:

Operator | Distance metric
<->      | L2 (Euclidean) distance
<#>      | Negative inner product
<=>      | Cosine distance
<+>      | L1 (Manhattan) distance
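To make the four metrics concrete, here is a plain-Python sketch of what each operator computes for two vectors of equal length. The function names are illustrative and are not part of pgvector; the database evaluates these metrics itself.

```python
import math


def l2_distance(a, b):
    """Euclidean distance, the metric behind <->."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Negated dot product, the metric behind <#>."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """One minus cosine similarity, the metric behind <=>."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


def l1_distance(a, b):
    """Sum of absolute differences, the metric behind <+>."""
    return sum(abs(x - y) for x, y in zip(a, b))


a = [1.0, 2.0, 2.0]
b = [2.0, 1.0, 2.0]
for metric in (l2_distance, negative_inner_product, cosine_distance, l1_distance):
    print(metric.__name__, metric(a, b))
```

Because the inner product is negated, smaller values mean closer vectors for all four operators, which is why ORDER BY with ascending order returns the nearest rows first.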

For large datasets, create an HNSW or IVFFlat index to speed up queries:

-- HNSW index (recommended, no training required)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- IVFFlat index (build it after the table contains data, because the lists are derived from existing rows)
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
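The lists value for IVFFlat depends on table size. As a starting point, the pgvector documentation suggests rows / 1000 for up to one million rows and sqrt(rows) above that, with sqrt(lists) as an initial value for the ivfflat.probes query setting. The sketch below encodes that rule of thumb; the function names are hypothetical, and the results are starting values to tune, not guarantees of recall or speed.

```python
import math


def suggested_ivfflat_lists(row_count):
    """Starting value for the IVFFlat lists parameter (rule of thumb only)."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return math.isqrt(row_count)


def suggested_probes(lists):
    """Starting value for ivfflat.probes, roughly the square root of lists."""
    return max(1, math.isqrt(lists))


for rows in (100_000, 4_000_000):
    lists = suggested_ivfflat_lists(rows)
    print(rows, lists, suggested_probes(lists))
```

Under this heuristic, the lists = 100 used in the index statement corresponds to a table of roughly 100,000 rows.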

Application Usage Example

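import openai
import psycopg2

# The OpenAI client reads its API key from the OPENAI_API_KEY environment variable
conn = psycopg2.connect("postgres://postgres:mypassword@<IP>:<PORT>/postgres")
cur = conn.cursor()

# Generate an embedding with OpenAI.
# text-embedding-3-small returns 1536 dimensions by default, so the
# documents.embedding column must be declared as vector(1536) to match.
response = openai.embeddings.create(input="What is OSC?", model="text-embedding-3-small")
embedding = response.data[0].embedding

# Query for similar documents; str(embedding) yields '[0.1, 0.2, ...]',
# which PostgreSQL casts to the vector type
cur.execute(
    "SELECT content FROM documents ORDER BY embedding <=> %s::vector LIMIT 5",
    (str(embedding),)
)
results = cur.fetchall()

# Release the cursor and connection when done
cur.close()
conn.close()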
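The query above assumes the table already holds documents with embeddings. One possible way to load them through the same kind of cursor is sketched below. insert_documents is a hypothetical helper, it assumes the documents table from Step 4, and it accepts any DB-API cursor that supports the %s parameter style (psycopg2's does).

```python
def insert_documents(cur, rows):
    """Insert (content, embedding) pairs and return how many were sent.

    insert_documents is a hypothetical helper for illustration; each
    embedding is a sequence of floats whose length matches the column.
    """
    # str(list) produces text such as '[0.1, 0.2, 0.3]', which the
    # ::vector cast in the statement converts to the vector type
    params = [(content, str(list(embedding))) for content, embedding in rows]
    cur.executemany(
        "INSERT INTO documents (content, embedding) VALUES (%s, %s::vector)",
        params,
    )
    return len(params)


# Usage with psycopg2 (requires a live connection, so it is shown as comments):
#   insert_documents(cur, [("Hello world", [0.1, 0.2, 0.3])])
#   conn.commit()  # psycopg2 does not autocommit by default
```

For large batches, a bulk-loading mechanism such as COPY is likely to be faster than row-by-row inserts; this sketch favors clarity over throughput.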

Using the CLI

osc create pgvector-pgvector myvectordb \
  -o PostgresPassword="mypassword" \
  -o PostgresDb="vectors"

Resources