What is a knowledge graph, explained simply?
What exactly is a knowledge graph?
A knowledge graph is a store of knowledge built as a graph: a set of nodes and the connections between them. The nodes are entities: a person, a city, a company, an event, a concept. The connections, called edges, say how these entities relate to one another. "Marie Curie" has an edge "received" to "Nobel Prize" and an edge "born in" to "Warsaw". The knowledge sits not in the individual nodes but in exactly these named connections.
The difference from an ordinary table is large. A table stores values in rows and columns and says little about how those values relate. A graph stores the relationship from the start. You don't ask it "show me row 47" but "what is connected to Marie Curie, and how?". That lets you walk through the knowledge from entity to entity, along the edges.
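As a minimal sketch of that question, the two Marie Curie edges from the text can be stored as plain tuples and queried by entity. The function name `connections` and the "(inverse)" label for incoming edges are illustrative choices, not part of any particular graph system.

```python
# Minimal sketch: a graph as a list of (source, relation, target) edges.
# Both edges are the examples from the text.
EDGES = [
    ("Marie Curie", "received", "Nobel Prize"),
    ("Marie Curie", "born in", "Warsaw"),
]

def connections(entity, edges):
    """Return every (relation, neighbor) pair touching the entity, in either direction."""
    found = []
    for source, relation, target in edges:
        if source == entity:
            found.append((relation, target))
        elif target == entity:
            # Incoming edge: mark the direction so the meaning stays readable.
            found.append((f"(inverse) {relation}", source))
    return found

# "What is connected to Marie Curie, and how?"
print(connections("Marie Curie", EDGES))
```

The answer is a list of named connections rather than a row of values, which is the point of the comparison with a table.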
This is exactly where the knowledge graph meets the core of the model: it is the technical form of the very picture we use to look at situations anyway, a network of entities and relations. What otherwise stays a way of thinking becomes a data structure here, one that a computer can traverse. The graph isn't the knowledge itself but a map of the connections that carry the knowledge.
How is a knowledge graph built?
Inside, a knowledge graph consists of tiny statements, often called triples: subject, relationship, object. One example is "Berlin, is capital of, Germany". Each triple is small and checkable on its own, yet thousands of them together form a dense net. So you don't file away large texts but many small, clear connections that support one another.
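A small sketch can make the triple idea concrete. The `Triple` type and the `match` helper below are hypothetical names chosen for illustration; real triple stores offer far richer query languages. The sketch only shows that each statement is a three-part record and that a query is a pattern with gaps.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str

# The statements are the examples used in the text.
TRIPLES = [
    Triple("Berlin", "is capital of", "Germany"),
    Triple("Marie Curie", "born in", "Warsaw"),
    Triple("Marie Curie", "received", "Nobel Prize"),
]

def match(triples, subject=None, relation=None, obj=None):
    """Return triples that agree with every given field; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (relation is None or t.relation == relation)
        and (obj is None or t.obj == obj)
    ]

# Everything the store says about Marie Curie as a subject.
for triple in match(TRIPLES, subject="Marie Curie"):
    print(triple)
```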
To keep the graph from becoming arbitrary, there is usually a schema or ontology: an agreement about which kinds of entities and which kinds of edges are allowed. That way the system knows that "born in" connects a person to a place and not two numbers. This schema is the quiet frame that keeps the many connections comparable.
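How such a schema check might look is sketched below. The type names, the `SCHEMA` and `ENTITY_TYPES` tables, and the `is_valid` function are assumptions made for this example; production ontologies are usually written in dedicated schema languages rather than Python dictionaries.

```python
# Hypothetical schema: relation name -> (allowed subject type, allowed object type).
SCHEMA = {
    "born in": ("Person", "Place"),
    "is capital of": ("Place", "Country"),
}

# Hypothetical type assignments for the entities used below.
ENTITY_TYPES = {
    "Marie Curie": "Person",
    "Warsaw": "Place",
    "Berlin": "Place",
    "Germany": "Country",
}

def is_valid(subject, relation, obj):
    """True if the relation is known and both ends have the types the schema expects."""
    if relation not in SCHEMA:
        return False
    subject_type, object_type = SCHEMA[relation]
    return (ENTITY_TYPES.get(subject) == subject_type
            and ENTITY_TYPES.get(obj) == object_type)

print(is_valid("Marie Curie", "born in", "Warsaw"))  # a person and a place: accepted
print(is_valid("Berlin", "born in", "Germany"))      # a place cannot be "born in": rejected
```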
Seen through the model, the graph can be zoomed freely. A node "Germany" is, from outside, a single entity; go inside and it breaks down into further entities and relations: federal states, cities, laws. This fractal nesting is no accident but the basic principle: every entity is made of further entities and relations. A good knowledge graph uses exactly this by grouping knowledge at the fitting level instead of laying everything out flat, side by side.
What are knowledge graphs used for?
The best-known knowledge graph runs in your search engine. Search for "Albert Einstein" and a compact box appears on the right with birth date, works, and related people. This info box comes from a graph that knows Einstein is a person, which relations he has, and which entities hang off them. The search engine thus understands "Einstein" not as a string of characters but as a node with neighbors.
Outside of search, graphs are widespread too. Companies connect customers, products, and orders to give recommendations. Banks lay out accounts and transfers as a graph to find suspicious patterns. In IT security, devices, users, and accesses can be drawn as a network — an unusual path between nodes then stands out more than a single line in a log.
The common thread is always the same: as soon as the question is "how does this connect?" instead of "what value is here?", a graph plays to its strength. It makes the connections themselves the searchable object. That's the point where stored data turns into knowledge you can draw conclusions from.
What does a knowledge graph have to do with AI and LLMs?
Large language models, such as those behind ChatGPT, learn language from huge amounts of text. They are strong at phrasing but have a problem: they can make things up, because they hold no fixed, checked knowledge, only probabilities for the next word. A knowledge graph brings exactly what the model lacks: clear, checkable connections between entities.
That's why the two are often combined. The language model phrases fluently and understands the question; the graph supplies the solid facts to go with it. Before the model answers, it looks up in the graph which relations an entity really has and grounds its answer on that. This lowers the risk that it freely invents connections. The language model provides the language; the graph provides the scaffolding.
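One common way to wire the two together is to fetch the graph's facts first and place them in front of the question. The sketch below only builds that prompt text and calls no language model; `facts_about` and `build_grounded_prompt` are hypothetical helper names, and the prompt wording is an assumption, not a recommended template.

```python
# Example statements taken from the text.
TRIPLES = [
    ("Marie Curie", "received", "Nobel Prize"),
    ("Marie Curie", "born in", "Warsaw"),
    ("Berlin", "is capital of", "Germany"),
]

def facts_about(entity, triples):
    """Collect every triple that mentions the entity, as short readable sentences."""
    return [f"{s} {r} {o}." for s, r, o in triples if entity in (s, o)]

def build_grounded_prompt(question, entity, triples):
    """Put the graph's facts in front of the question so the answer can rest on them."""
    facts = facts_about(entity, triples)
    if facts:
        context = "\n".join(facts)
    else:
        # Nothing in the graph: say so explicitly instead of leaving a silent gap.
        context = "No facts are known about this entity."
    return (
        f"Known facts:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the known facts."
    )

print(build_grounded_prompt("Where was Marie Curie born?", "Marie Curie", TRIPLES))
```

The resulting text would then be handed to whatever language model is in use. The graph lookup narrows what the model has to rely on, though it does not guarantee that the model sticks to the facts.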
Seen through the model, two networks complement each other here. The language model has learned countless connections, but many of them are blurry and not traceable. The graph, by contrast, holds selected relations that are cleanly named. Route the question through the graph first and a checked connection becomes active before the language model starts to guess: energy is steered onto the solid spot, not onto the merely plausible one.
How does it differ from an ordinary database?
A classic database thinks in tables. If you want to know which colleagues are linked to a person across three steps, you have to join several tables laboriously, and with each step the query typically gets slower. A knowledge graph thinks in connections from the start: you simply follow the edges, step by step, from node to node. Such path questions are exactly its strength.
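The three-step question can be sketched as a breadth-first walk over an adjacency map. The colleague names and the `within_steps` function are made up for illustration; graph databases provide their own path queries for this.

```python
from collections import deque

# Hypothetical "works with" edges between made-up colleagues, stored as an adjacency map.
WORKS_WITH = {
    "Ana": ["Ben"],
    "Ben": ["Ana", "Cem"],
    "Cem": ["Ben", "Dua"],
    "Dua": ["Cem", "Eli"],
    "Eli": ["Dua"],
}

def within_steps(start, max_steps, graph):
    """Breadth-first search: map each reachable person to their distance from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_steps:
            continue  # Do not expand beyond the step limit.
        for neighbor in graph.get(person, []):
            if neighbor not in distance:
                distance[neighbor] = distance[person] + 1
                queue.append(neighbor)
    del distance[start]  # The starting person is not their own colleague.
    return distance

print(within_steps("Ana", 3, WORKS_WITH))  # {'Ben': 1, 'Cem': 2, 'Dua': 3}
```

Each extra step is just another round of following edges from the nodes already reached, with no table joins to assemble.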
Alongside it stands the vector database, important for AI systems. It stores content by similarity and finds what is thematically close, but it knows no named relationships. It knows that two texts sound similar, not that one refutes the other. The graph can express exactly that, because its edges carry a clear meaning. Often you use both side by side.
So it isn't about better or worse but about the fitting view. A table is ideal when you count clean, uniform values. A graph is ideal when the connections themselves are the real question. The question you ask decides which network you look at, not the other way round.
Where do knowledge graphs reach their limits?
A knowledge graph is only as good as the connections someone put into it. If an edge is missing, the graph simply doesn't know the relationship, and to the graph a relationship nobody entered looks just like one that doesn't exist. The model calls such a never-activated connection empty: it could exist but has never been made. Often that is exactly where the interesting answers sit, the ones the graph can't yet give.
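The following sketch shows the effect in its simplest form. The store deliberately leaves out one true statement; `has_edge` is a hypothetical helper that does nothing more than a membership check.

```python
# A deliberately incomplete store: the Nobel Prize edge has not been entered.
TRIPLES = {
    ("Berlin", "is capital of", "Germany"),
    ("Marie Curie", "born in", "Warsaw"),
}

def has_edge(subject, relation, obj, triples):
    """True only if exactly this statement has been stored."""
    return (subject, relation, obj) in triples

# True in the world, but never entered into the graph.
missing_but_true = has_edge("Marie Curie", "received", "Nobel Prize", TRIPLES)
# Not true in the world, and therefore also not in the graph.
simply_false = has_edge("Berlin", "is capital of", "France", TRIPLES)

print(missing_but_true, simply_false)  # False False
```

Both lookups return the same result, so a caller is safer reading `False` as "not recorded" rather than "not true".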
Then there's the effort. Cleanly breaking knowledge down into entities and named relations is work, and the world keeps changing. A graph that isn't maintained ages quietly: a company is renamed, an office changes hands, and the old edge no longer holds. Handling ambiguity also stays demanding: are there two people with the same name?
That's why a knowledge graph is a tool, not an oracle. It makes connections visible and searchable, but it doesn't think for you and guarantees no truth. Its value lies in ordering the net of entities and relations you carry in your head anyway — cleaner, larger, and walkable for a machine. It doesn't want to be more than that, and that is already a lot.
Seen through the model
Imagine you type "Who wrote Harry Potter?" into a search engine. Without a graph, "Harry Potter" would just be a string of characters appearing somewhere in texts. With a graph, "Harry Potter" is a node, an entity of type book series. Attached to it is an edge "written by" that leads to another node: "J. K. Rowling". The answer sits not in a text but in this one named connection.
See it as a small network. "Harry Potter" is connected via "written by" to "J. K. Rowling", via "set in" to "Hogwarts", via "genre" to "fantasy". Ask a follow-up question, "what else did the same author write?", and the machine simply walks the edges: from "Harry Potter" to "J. K. Rowling" and from there back across all "written by" edges to further books. No new text is read; the net is just followed.
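The follow-up question can be sketched as two hops over the same triples: forward along "written by" to the author, then backward along "written by" to other works. The entry for "The Casual Vacancy" is added here for illustration and is not part of the example in the text; `other_works_by_same_author` is a hypothetical helper name.

```python
TRIPLES = [
    ("Harry Potter", "written by", "J. K. Rowling"),
    ("Harry Potter", "set in", "Hogwarts"),
    ("Harry Potter", "genre", "fantasy"),
    ("The Casual Vacancy", "written by", "J. K. Rowling"),  # added for illustration
]

def other_works_by_same_author(work, triples):
    """Walk 'written by' forward to the author(s), then backward to their other works."""
    # Hop 1: from the work to its author(s).
    authors = {o for s, r, o in triples if s == work and r == "written by"}
    # Hop 2: from each author back across all "written by" edges.
    works = {s for s, r, o in triples
             if r == "written by" and o in authors and s != work}
    return sorted(works)

print(other_works_by_same_author("Harry Potter", TRIPLES))  # ['The Casual Vacancy']
```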
This is where the model shows in pure form. The entities are the nodes, the named relationships are the active relations, and a question activates the path that answers it. If an edge is missing, say to a book nobody has entered yet, that relation stays empty, and the graph stays silent at that spot. This makes visible that its knowledge is nothing other than a maintained net of connections.
Frequently asked
What is a knowledge graph in simple words?
A knowledge graph is a body of knowledge that stores not isolated things but their connections. Every entity — a person, a place, a concept — is a node, and every relationship between them is an edge. Instead of filing away "Berlin" and "Germany" separately, the graph records: Berlin is the capital of Germany. From many such small connections a searchable network emerges that search engines and AI systems use to understand how things relate.
What does Google use a knowledge graph for?
Google uses its knowledge graph to understand search queries better and display compact info boxes. Search for a person and a box appears on the right with their date of birth, profession, and related people — that data comes from the graph. This lets the search engine treat "Einstein" not as a mere string of characters but as a node with known relationships. It can then answer questions whose answers appear nowhere verbatim on a page but follow from the connections.
What is the difference between a knowledge graph and a database?
A classic database stores values in tables and knows little about how they relate — relationships have to be reconstructed through laborious joins. A knowledge graph stores the relationships from the start as named edges between nodes. That lets it answer questions like "how do these two things connect?" directly, by following the connections. For uniform bulk data a table is still ideal; for networked questions, the graph.
How do knowledge graphs help against AI hallucinations?
Language models generate text by probability and can invent facts because they hold no verified knowledge, only likelihoods for the next word. A knowledge graph supplies exactly that verified knowledge in the form of clear, named connections. When a model looks up in the graph which relations an entity actually has before it answers, it grounds its response in solid facts rather than mere plausibility. That reduces the risk of it freely confabulating connections, even if it is no complete guarantee.
Is a knowledge graph the same as a vector database?
No, the two work differently. A vector database stores content by similarity and finds what is thematically close — it knows that two texts sound alike, but not how they relate in meaning. A knowledge graph stores named relationships with clear semantics, such as "is capital of" or "contradicts". In AI systems the two are often used together: the vector database for finding similar content, the graph for precise, traceable connections.
How do you build a knowledge graph?
You start with a schema or ontology that defines which kinds of entities and relationships are allowed. Then you break knowledge down into small statements called triples — subject, relationship, object — such as "Berlin — is capital of — Germany". These triples can be added by hand, imported from existing databases, or extracted automatically from text. Maintenance matters: a graph grows stale if nobody adds new connections and corrects ones that are no longer accurate.
Keep thinking
Related terms: Entity, Relation, Network level, Zoom in / zoom out, The three states: empty, active, passive