How and Why Netflix Built a Real-Time Distributed Graph
Ingesting and Processing Data Streams at Internet Scale, Building a Scalable Storage Layer
The Netflix product experience historically consisted of a single core offering: streaming video on demand. But the evolution of its business has created a new class of problems where member interactions with the app have to be analyzed across different business verticals.
In a traditional data warehouse, these events would land in at least two different tables and may be processed at different cadences. But in a graph system, they become connected almost instantly. Ultimately, analyzing member interactions in the app across domains empowers Netflix to create more personalized and engaging experiences.
The data engineering team recognized a solution to process and store swathes of interconnected data while enabling fast querying to discover insights is needed.
Although they could have structured the data in various ways, they ultimately settled on a graph representation.
Graph offers key advantages, specifically:
* Relationship-Centric Queries
* Flexibility as Relationships Grow
* Pattern and Anomaly Detection
This is why they set out to build a Real-Time Distributed Graph, or “RDG” for short.
Three main layers in the system power the RDG:
* Ingestion and Processing — receive events from disparate upstream data sources and use them to generate graph nodes and edges.
* Storage — write nodes and edges to persistent data stores.
* Serving — expose ways for internal clients to query graph nodes and edges.
The team built the ingestion and processing pipeline using Apache Flink to transform streaming events into graph primitives.
But the critical question is - once billions of nodes and edges are created from member interactions, how do you actually store them?
The RDG is a property graph consisting of:
* Nodes: Entities including member accounts, titles (such as shows/movies), devices, and games. Each node has a unique identifier and a set of properties containing additional metadata.
* Edges: Relationships between nodes, such as “started watching,” “logged in from,” or “plays.” Edges also have unique identifiers and properties, such as timestamps.
In evaluating different storage options, Netflix explored traditional graph datastores. While they do provide feature-rich capabilities around things like native-graph query support and data models to represent different types of graphs, they also pose a mix of scalability, workload, and ecosystem challenges.
They ultimately decided that the options evaluated wouldn’t meet requirements at Netflix’s scale. So they turned instead to an internal platform specifically designed for this type of challenge: the Data Gateway Platform. More specifically, its Key-Value Data Abstraction Layer (KVDAL).
For the RDG, Netflix provisions a separate namespace for every node type and edge type in the graph. This also makes it straightforward to extend the RDG with new types of nodes and edges.
By Adrian Taruc, James Dalton, Luis Medina, Ajit Koti
How and Why Netflix Built a Real-Time Distributed Graph: Part 1 — Ingesting and Processing Data Streams at Internet Scale
netflixtechblog.com/how-and-…
How and Why Netflix Built a Real-Time Distributed Graph: Part 2 — Building a Scalable Storage Layer
netflixtechblog.medium.com/h…
#EmergingTech #DataEngineering #ConnectedData #RealTimeProcessing
--
📩 The Year of the Graph Winter 2025-2026 newsletter issue is out!
The Ontology issue: From knowledge to graphs and back again 👇
yearofthegraph.xyz/newslette…
All things
#KnowledgeGraph,
#GraphDB, Graph
#Analytics /
#DataScience /
#AI and
#SemTech.
Subscribe and follow to be in the know. Reach out if you'd like to be featured