Research Knowledge Graph

Build a living knowledge base where agents curate, connect, and evolve research — from papers and notes to structured insight graphs.

Research teams drown in information. Papers pile up. Notes fragment across tools. Connections between ideas exist only in someone's head. When that person leaves, the connections leave too.

A knowledge graph makes connections explicit, discoverable, and persistent. Agents continuously enrich it. Humans curate and validate. The knowledge compounds.

The ontology

interface Searchable {
    title: String @index
    body: String @index
    embedding: Vector(768) @index
}

interface Timestamped {
    created_at: DateTime
    updated_at: DateTime
}

node Paper implements Searchable, Timestamped {
    doi: String @key
    authors: [String]
    venue: String?
    published: Date
    status: enum(unread, reading, read, annotated, archived)
    relevance: F64?
}

node Note implements Searchable, Timestamped {
    slug: String @key
    author: String
    source: enum(reading, experiment, meeting, conversation, synthesis)
    confidence: enum(speculative, plausible, established)
}

node Concept {
    slug: String @key
    name: String @index
    domain: enum(ml, systems, product, strategy, operations)
    maturity: enum(emerging, growing, established, declining)
}

node Question {
    slug: String @key
    text: String @index
    status: enum(open, investigating, answered, parked)
    priority: enum(critical, high, medium, low)
    asked_by: String
    asked_at: DateTime
}

node Claim {
    slug: String @key
    statement: String @index
    confidence: F64
    verified: Bool
    embedding: Vector(768) @index
}

// Knowledge edges
edge Cites: Paper -> Paper
edge Covers: Paper -> Concept
edge Supports: Paper -> Claim
edge Contradicts: Claim -> Claim { @unique(src, dst) }
edge Supersedes: Claim -> Claim { @unique(src, dst) }
edge Answers: Note -> Question
edge DerivedFrom: Note -> Paper
edge Connects: Note -> Concept
edge Raises: Paper -> Question
edge RelatedTo: Concept -> Concept { @unique(src, dst) }

This ontology captures not just what you know, but how you know it. Papers support claims. Claims can contradict each other. Notes derive from papers. Questions get raised and answered. Concepts relate to each other.

What agents can do with this

Literature discovery

A scanning agent reads new papers and places them in the graph:

query add_paper($doi: String, $title: String, $body: String, $authors: String, $venue: String, $published: Date, $embedding: Vector(768)) {
    insert Paper {
        doi: $doi,
        title: $title,
        body: $body,
        authors: [$authors],
        venue: $venue,
        published: $published,
        status: "unread",
        embedding: $embedding,
        created_at: "2026-03-30T00:00:00",
        updated_at: "2026-03-30T00:00:00",
    }
}

Then it detects which concepts the paper covers and creates edges:

query link_paper_to_concept($doi: String, $concept: String) {
    insert Covers {
        src: $doi,
        dst: $concept,
    }
}

Research gap detection

What questions remain unanswered?

query open_questions_by_priority() {
    match {
        $q: Question { status: "open" }
        not { $_ answers $q }
    }
    return {
        $q.text, $q.priority, $q.asked_by, $q.asked_at
    }
    order { $q.priority asc }
}

[
  {
    "text": "Does retrieval-augmented generation degrade on multi-hop reasoning?",
    "priority": "critical",
    "asked_by": "research-lead",
    "asked_at": "2026-03-15T09:00:00"
  },
  {
    "text": "What's the latency ceiling for hybrid search at 10M documents?",
    "priority": "high",
    "asked_by": "systems-eng",
    "asked_at": "2026-03-20T14:00:00"
  }
]
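
The sample result lists critical before high, which is the intended order. Whether `order { $q.priority asc }` produces that order for all four values depends on how the engine compares enum values, which the schema above does not state. If they were compared as plain strings, `low` would sort ahead of `medium`. A minimal Python sketch of re-ranking result rows on the client with an explicit rank table follows; `PRIORITY_RANK` and `sort_by_priority` are hypothetical names for illustration, not part of the query language.

```python
# Explicit rank for each value of Question.priority, most urgent first.
# Assumption: this mirrors the declaration order used in the schema.
PRIORITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def sort_by_priority(rows):
    """Return query result rows ordered from most to least urgent.

    Ties keep their input order because sorted() is stable.
    """
    return sorted(rows, key=lambda row: PRIORITY_RANK[row["priority"]])


# Hypothetical rows shaped like the query result above.
rows = [
    {"text": "q-low", "priority": "low"},
    {"text": "q-critical", "priority": "critical"},
    {"text": "q-medium", "priority": "medium"},
    {"text": "q-high", "priority": "high"},
]

ranked = [row["priority"] for row in sort_by_priority(rows)]
# Plain string sorting, shown only for comparison.
lexicographic = sorted(row["priority"] for row in rows)
```

Here `ranked` follows the urgency order, while `lexicographic` places `low` before `medium`.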

Citation graph traversal

Follow the citation chain to find foundational work:

query citation_chain($doi: String) {
    match {
        $p: Paper { doi: $doi }
        $p cites {1, 3} $ancestor
    }
    return {
        $ancestor.title, $ancestor.doi,
        $ancestor.published, $ancestor.venue
    }
    order { $ancestor.published asc }
}

Traversing between one and three hops into the citation graph surfaces the foundational papers that your paper of interest builds on, directly or indirectly.
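
To make the hop bound concrete, here is a minimal Python sketch of a bounded breadth-first traversal over citation edges. It is an illustration of the idea, not the engine's implementation: the function name `citation_ancestors`, the adjacency-dict input, and the choice to report each paper once at its shortest distance are all assumptions.

```python
from collections import deque


def citation_ancestors(cites, start, min_hops=1, max_hops=3):
    """Return {doi: hops} for papers reachable from `start` via Cites edges.

    `cites` maps a DOI to the list of DOIs it cites. Each paper is reported
    once, at its shortest hop distance, if that distance lies within
    [min_hops, max_hops].
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for cited in cites.get(current, []):
            if cited not in distance:  # also guards against cycles
                distance[cited] = distance[current] + 1
                queue.append(cited)
    return {doi: hops for doi, hops in distance.items() if hops >= min_hops}


# Hypothetical citation data: p0 -> p1 -> p3 -> p4 -> p5, plus p0 -> p2 -> p0.
cites = {
    "p0": ["p1", "p2"],
    "p1": ["p3"],
    "p2": ["p0"],
    "p3": ["p4"],
    "p4": ["p5"],
}
ancestors = citation_ancestors(cites, "p0")
```

In this example `p5` is four hops away, so it is excluded, and the starting paper is never reported as its own ancestor.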

Contradiction detection

A verification agent scans for conflicting claims:

query unresolved_contradictions() {
    match {
        $c1: Claim { verified: true }
        $c1 contradicts $c2
        $c2.verified = true
    }
    return {
        $c1.statement, $c1.confidence,
        $c2.statement, $c2.confidence
    }
}

[
  {
    "c1.statement": "Fine-tuning on domain data consistently outperforms RAG for factual QA",
    "c1.confidence": 0.75,
    "c2.statement": "RAG with structured retrieval matches fine-tuned models on domain QA benchmarks",
    "c2.confidence": 0.82
  }
]

Two verified claims that disagree. A resolution agent can examine the supporting papers, compare methodologies, and either update confidence scores or mark one as superseded.
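
Contradiction is symmetric, but the Contradicts edge is directed, and `@unique(src, dst)` only prevents duplicate edges in the same direction. If two agents each record the conflict from their own side, the query could return the same pair twice. A minimal Python sketch of collapsing mirrored rows before handing them to a resolution agent follows; `unique_contradictions` is a hypothetical helper, and the row keys are taken from the sample result above.

```python
def unique_contradictions(rows):
    """Collapse (A, B) and (B, A) rows into one entry per unordered pair."""
    seen = set()
    unique = []
    for row in rows:
        # frozenset ignores order, so both directions map to the same key.
        pair = frozenset((row["c1.statement"], row["c2.statement"]))
        if pair not in seen:
            seen.add(pair)
            unique.append(row)
    return unique


# Hypothetical rows: the same conflict recorded in both directions.
rows = [
    {"c1.statement": "A", "c1.confidence": 0.75,
     "c2.statement": "B", "c2.confidence": 0.82},
    {"c1.statement": "B", "c1.confidence": 0.82,
     "c2.statement": "A", "c2.confidence": 0.75},
]
deduplicated = unique_contradictions(rows)
```

The first row seen for each pair is kept, so the output order follows the query result.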

Semantic research discovery

Find papers related to a question using vector search, scoped to a concept:

query find_relevant_papers($concept: String, $question: Vector(768)) {
    match {
        $c: Concept { slug: $concept }
        $p covers $c
        $p.status != "archived"
    }
    return { $p.doi, $p.title, $p.published, $p.relevance }
    order { nearest($p.embedding, $question) }
    limit 10
}

Structure narrows (only papers covering this concept). Vectors rank (by semantic similarity to the question). Combined, you get precise, relevant results.
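
The filter-then-rank pattern can be sketched in plain Python. This is an in-memory illustration under stated assumptions, not the engine's implementation: the distance metric behind `nearest` is not specified on this page, so cosine distance is assumed here, embeddings are assumed to be non-zero, and the data layout (`papers` as dicts, `covers` as a DOI-to-concept-set map) is hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def find_relevant_papers(papers, covers, concept, question, limit=10):
    # Structural filter: papers that cover the concept and are not archived.
    candidates = [
        p for p in papers
        if concept in covers.get(p["doi"], set()) and p["status"] != "archived"
    ]
    # Vector ranking: nearest embeddings first, then truncate.
    candidates.sort(key=lambda p: cosine_distance(p["embedding"], question))
    return [p["doi"] for p in candidates[:limit]]


# Hypothetical data with 2-dimensional embeddings to keep the example short.
papers = [
    {"doi": "10.0/a", "status": "read", "embedding": [1.0, 0.0]},
    {"doi": "10.0/b", "status": "unread", "embedding": [0.0, 1.0]},
    {"doi": "10.0/c", "status": "archived", "embedding": [1.0, 0.0]},
    {"doi": "10.0/d", "status": "read", "embedding": [1.0, 0.1]},
]
covers = {
    "10.0/a": {"rag"},
    "10.0/b": {"rag"},
    "10.0/c": {"rag"},
    "10.0/d": {"fine-tuning"},
}
result = find_relevant_papers(papers, covers, "rag", [0.9, 0.1])
```

Paper `10.0/d` is close to the question vector but is never ranked, because the structural filter removes it before any distance is computed.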

Knowledge synthesis

A synthesis agent reads across related concepts and writes connecting notes:

query concept_neighborhood($concept: String) {
    match {
        $c: Concept { slug: $concept }
        $c relatedTo $related
        $p covers $c
        $p covers $related
    }
    return {
        $c.name, $related.name, $related.maturity,
        $p.title, $p.doi
    }
}

"What papers sit at the intersection of two related concepts?" This surfaces cross-domain connections that humans miss because they're reading within one domain at a time.

Multi-agent research pipeline

Scanner agent       →  ingests new papers, extracts concepts, writes Claims
                       (branch: scan/arxiv-20260330)

Linker agent        →  reads new papers, finds citation links and concept connections
                       (branch: link/batch-42)

Verification agent  →  checks new Claims against existing ones, flags contradictions
                       (branch: verify/batch-42)

Synthesis agent     →  reads concept neighborhoods, writes connecting Notes
                       (branch: synthesis/weekly-13)

Question agent      →  reads open Questions, searches for relevant papers/claims
                       that might answer them (branch: answers/weekly-13)

Each agent merges independently. The scanner runs daily. The linker runs after each scan merge. Verification runs after linking. Synthesis runs weekly. The graph accumulates structured understanding over time.
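
The run order described above is a small dependency graph. A minimal Python sketch using the standard library's `graphlib` shows one way a scheduler could derive a valid order. The agent keys and the `run_order` helper are hypothetical, and treating the synthesis and question agents as independent jobs is an assumption, since only the scan, link, verify chain is stated.

```python
from graphlib import TopologicalSorter

# Each agent maps to the set of agents whose merge it waits for.
# Assumption: only the scanner -> linker -> verification chain is stated;
# synthesis and question answering are modeled as independent jobs.
DEPENDS_ON = {
    "scanner": set(),
    "linker": {"scanner"},
    "verification": {"linker"},
    "synthesis": set(),
    "question": set(),
}


def run_order(depends_on):
    """Return one valid execution order that respects the dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


order = run_order(DEPENDS_ON)
```

Any order returned places the scanner before the linker and the linker before verification. `TopologicalSorter` raises `graphlib.CycleError` if the dependencies ever form a loop.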

What you can build from here

Reading list prioritization — Score unread papers by citation count, concept relevance to open questions, recency, and semantic similarity to the team's current focus.
Expertise mapping — Add Researcher nodes and AuthoredBy / ExpertIn edges. Query "who on the team has the deepest knowledge of concept X?"
Trend detection — Track concept maturity over time. An agent that reads new papers and updates Concept.maturity based on publication velocity can flag emerging topics.
Literature review generation — Given a question, traverse to relevant papers, claims, and notes. Feed the structured context to an LLM that writes a review with proper citations.
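
As a starting point for reading list prioritization, the four signals named above can be combined into a weighted score. The sketch below is illustrative only: the weights, the exponential recency decay, and the input fields (`citation_count`, `question_relevance`, `focus_similarity`) are assumptions, and each would need to be computed from the graph by a separate query.

```python
from datetime import date

# Hypothetical weights; tune them for your team.
WEIGHTS = {"citations": 0.3, "questions": 0.3, "recency": 0.2, "similarity": 0.2}


def reading_score(paper, today, max_citations, half_life_days=365):
    """Combine four signals, each scaled to [0, 1], into one score."""
    age_days = max((today - paper["published"]).days, 0)
    signals = {
        # Citation count relative to the most-cited candidate.
        "citations": paper["citation_count"] / max_citations if max_citations else 0.0,
        # Precomputed relevance to open questions, assumed to be in [0, 1].
        "questions": paper["question_relevance"],
        # Exponential decay: the value halves every `half_life_days`.
        "recency": 0.5 ** (age_days / half_life_days),
        # Cosine similarity to the team's focus vector, clipped at 0.
        "similarity": max(0.0, paper["focus_similarity"]),
    }
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)


# Hypothetical unread papers.
today = date(2026, 3, 30)
unread = [
    {"doi": "10.0/b", "citation_count": 80, "question_relevance": 0.1,
     "published": date(2022, 3, 30), "focus_similarity": 0.2},
    {"doi": "10.0/a", "citation_count": 40, "question_relevance": 0.9,
     "published": date(2025, 3, 30), "focus_similarity": 0.8},
]
max_citations = max(p["citation_count"] for p in unread)
ranked = sorted(
    unread,
    key=lambda p: reading_score(p, today, max_citations),
    reverse=True,
)
reading_list = [p["doi"] for p in ranked]
```

With these weights, the recent paper that bears on open questions ranks above the older, more heavily cited one.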
