
Vector Search

Find semantically similar nodes using nearest-neighbor search over embeddings.

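Vector search returns the nodes whose embeddings are closest to a query vector. Omnigraph builds an IVF-HNSW index over a Vector field and uses it for approximate nearest-neighbor (ANN) lookup, so results are fast to compute but not guaranteed to be the exact nearest neighbors.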

Schema setup

Declare a Vector(N) property with @index to enable vector search. N is the embedding dimension.

node Person {
    name:      String @key
    bio:       String @index
    embedding: Vector(1536) @index
}

node Document {
    title:     String @key
    content:   String
    embedding: Vector(768) @index
}

The @index annotation on a Vector field causes Omnigraph to build an IVF-HNSW index. Without it, nearest() queries on that field are rejected at compile time.

If you want embeddings to be auto-generated from another field, use the @embed annotation:

node Person {
    name:      String @key
    bio:       String @index
    embedding: Vector(1536) @embed(source: bio)
}

nearest() -- vector similarity ranking

nearest(field, vector) appears in the order clause and ranks results by cosine similarity to the query vector. It returns the closest matches first.

query similar_people($vec: Vector) {
    match {
        $p: Person
    }
    order nearest($p.embedding, $vec)
    return { $p.name, $p.bio }
}
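
To make the ranking concrete, the following is a minimal Python sketch of ordering by cosine similarity. It scores every node exhaustively rather than using an approximate index, so it illustrates the ordering that nearest() describes, not how Omnigraph computes it. The function names and the toy 3-dimensional data are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def rank_by_similarity(nodes, query_vec):
    """Return nodes sorted from most to least similar to query_vec."""
    return sorted(
        nodes,
        key=lambda node: cosine_similarity(node["embedding"], query_vec),
        reverse=True,
    )

# Hypothetical toy data; real embeddings have hundreds of dimensions.
people = [
    {"name": "Ada", "embedding": [1.0, 0.0, 0.0]},
    {"name": "Grace", "embedding": [0.0, 1.0, 0.0]},
    {"name": "Linus", "embedding": [0.7, 0.7, 0.0]},
]
ranked = rank_by_similarity(people, [1.0, 0.1, 0.0])
print([p["name"] for p in ranked])  # closest match first
```

Cosine similarity depends only on the direction of the vectors, so scaling an embedding by a positive constant does not change its rank.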

Passing vector parameters

Vectors are passed as JSON arrays in the --params argument:

omnigraph read ./my-graph \
    --query queries.gq \
    --name similar_people \
    --params '{"vec": [0.021, -0.003, 0.118, ...]}'

In practice, you generate the query vector from an embedding model (e.g., OpenAI text-embedding-3-small) and pass it as a parameter.
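
As a rough sketch of that step, the Python below serializes a query vector into the JSON object expected by --params. The helper name build_params and the client-side dimension check are assumptions added for illustration; the all-zero vector is a placeholder for the output of a real embedding model.

```python
import json

def build_params(vec, expected_dim):
    """Serialize a query vector as the JSON object passed to --params.

    The dimension check is a client-side convenience (an assumption of
    this sketch); it catches a mismatched model before the query is sent.
    """
    if len(vec) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(vec)}")
    return json.dumps({"vec": [float(x) for x in vec]})

# Placeholder standing in for a real embedding-model call.
query_vec = [0.0] * 1536
params_json = build_params(query_vec, expected_dim=1536)
```

The resulting string can be passed as the value of --params in the command above.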

Via the HTTP API:

curl -X POST http://localhost:4000/read \
    -H "Content-Type: application/json" \
    -d '{
        "uri": "./my-graph",
        "query": "queries.gq",
        "name": "similar_people",
        "params": {"vec": [0.021, -0.003, 0.118]}
    }'
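
The same request can be assembled from Python with only the standard library. This sketch mirrors the curl call above field for field; the helper name build_read_request is hypothetical, and the commented-out lines assume the server replies with a JSON body, which this page does not specify.

```python
import json
import urllib.request

def build_read_request(vec, base_url="http://localhost:4000"):
    """Build (but do not send) a POST to the /read endpoint."""
    body = {
        "uri": "./my-graph",
        "query": "queries.gq",
        "name": "similar_people",
        "params": {"vec": vec},
    }
    return urllib.request.Request(
        f"{base_url}/read",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_read_request([0.021, -0.003, 0.118])

# To send it against a running server (response format is an assumption):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```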

Filtering before ranking

Combine a filter in the match clause with nearest() in the order clause to narrow the candidate set before ranking. The vector index is only searched over nodes that pass the filter.

query similar_active_people($vec: Vector) {
    match {
        $p: Person { status: "active" }
    }
    order nearest($p.embedding, $vec)
    return { $p.name, $p.bio }
}

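This first filters to active people, then ranks the filtered set by vector similarity. The example assumes that Person also declares a status property, which the schema shown earlier omits. Pre-filtering shrinks the search space, which speeds up ranking, and keeps nodes that fail the filter out of the results no matter how similar they are.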
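
The filter-then-rank order can be sketched in plain Python. This is an exhaustive illustration with hypothetical names and toy 2-dimensional data, not Omnigraph's index-backed implementation. Note that the node closest to the query overall is excluded because it fails the filter.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def filter_then_rank(nodes, predicate, query_vec):
    """Drop nodes that fail the predicate, then rank the rest."""
    candidates = [node for node in nodes if predicate(node)]
    return sorted(
        candidates,
        key=lambda node: cosine_similarity(node["embedding"], query_vec),
        reverse=True,
    )

# Hypothetical toy data.
people = [
    {"name": "Ada", "status": "active", "embedding": [0.0, 1.0]},
    {"name": "Grace", "status": "inactive", "embedding": [1.0, 0.0]},
    {"name": "Linus", "status": "active", "embedding": [1.0, 1.0]},
]
active_ranked = filter_then_rank(
    people,
    lambda node: node["status"] == "active",
    [1.0, 0.0],
)
print([p["name"] for p in active_ranked])
```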

Combining with traversal

Vector search composes with graph traversal. You can use edge patterns in the match clause to restrict candidates to nodes connected to related entities, then rank those candidates by similarity:

query similar_people_at_company($vec: Vector, $company: String) {
    match {
        $p: Person
        $p worksAt $c: Company { name: $company }
    }
    order nearest($p.embedding, $vec)
    return { $p.name, $p.bio, $c.name }
}

This finds people who work at a specific company, ranked by how similar their embedding is to the query vector. The graph traversal constrains the candidate set; the vector ranking orders what remains.

Limiting results

Use the limit clause to cap the number of results returned:

query top_similar($vec: Vector) {
    match {
        $p: Person
    }
    order nearest($p.embedding, $vec)
    limit 10
    return { $p.name, $p.bio }
}

For vector search, a limit is especially important: every candidate has some similarity score, so an unbounded query can return thousands of results, most of them only weakly related to the query vector.
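
Conceptually, ranking with a limit is a top-k selection. The sketch below uses heapq.nlargest from the Python standard library, which keeps only the k best candidates instead of sorting all of them. The function name and the toy data are hypothetical, and the code does not describe how Omnigraph evaluates limit internally.

```python
import heapq
import math

def top_k_similar(nodes, query_vec, k):
    """Return the k nodes most similar to query_vec, best first."""
    def score(node):
        emb = node["embedding"]
        dot = sum(x * y for x, y in zip(emb, query_vec))
        return dot / (math.hypot(*emb) * math.hypot(*query_vec))
    return heapq.nlargest(k, nodes, key=score)

# Hypothetical toy data with 2-dimensional embeddings.
nodes = [
    {"name": "a", "embedding": [1.0, 0.0]},
    {"name": "b", "embedding": [1.0, 1.0]},
    {"name": "c", "embedding": [0.0, 1.0]},
    {"name": "d", "embedding": [-1.0, 0.0]},
    {"name": "e", "embedding": [2.0, 1.0]},
]
top_two = top_k_similar(nodes, [1.0, 0.0], k=2)
print([n["name"] for n in top_two])
```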
