Why Union-Find fails for entity resolution at scale, and how weighted graph clustering with safeguards and incremental updates works better in production.
#entityresolution #graphalgorithms...
We just launched Ask Senzing. It’s connected to our MCP Server so you’ll get a conversational interface to deep #EntityResolution knowledge.
Ask it difficult questions or ask it for a free eval license!
hubs.li/Q04kMCdD0 (lower right corner)
🚨 Tuesday! Catch Dr. Gurpinder Dhillon at #BigDataSummit Toronto — Day 1 | 4:30PM | Track 2 🎤 "Confident and Wrong: Why Identity Resolution Is the Missing Foundation for #AgenticAI." Find Tyler & Nastassia at Booth 7! 🍁
#EntityResolution #IdentityIntelligence
How to Keep Your AI Agent's Knowledge Graph Clean
Most tutorials skip the part that actually keeps a graph usable as it grows: separating naming from identity. Paul Iusztin built unified memory layers on top of knowledge graphs and kept running into the same reader question: how do you handle entity resolution and deduplication without corrupting the graph?
The answer is a 5-step pipeline:
1. LLM extraction reads the text and emits typed entity/relationship triplets anchored to a POLE+O ontology.
2. Entity resolution normalizes the name against existing nodes using exact, fuzzy, and semantic matching in a short-circuit chain. No merges yet, just canonical naming.
3. Full-context embedding captures the entity's name, type, and attributes for a richer identity signal.
4. Deduplication compares that embedding against existing nodes and routes to one of three outcomes: auto-merge (>=0.95), human review (0.85-0.95), or a new node (<0.85).
5. A nightly "dream pass" re-runs deduplication on recently ingested nodes to catch duplicates that were processed in parallel and never compared.
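The three-way routing in the deduplication step can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the thresholds 0.95 and 0.85 come from the post, while cosine similarity as the comparison measure, the function names `cosine` and `route`, and the plain-dict node store are all assumptions made for the example.

```python
import math

# Thresholds quoted in the post; everything else here is illustrative.
AUTO_MERGE = 0.95
REVIEW = 0.85


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(candidate, existing):
    """Compare a candidate embedding with existing node embeddings.

    `existing` is a hypothetical {node_id: embedding} dict.
    Returns (decision, node_id), where node_id is the best match or None.
    """
    best_id, best_score = None, -1.0
    for node_id, vec in existing.items():
        score = cosine(candidate, vec)
        if score > best_score:
            best_id, best_score = node_id, score

    if best_score >= AUTO_MERGE:
        return "auto_merge", best_id
    if best_score >= REVIEW:
        return "human_review", best_id
    return "new_node", None


nodes = {"n1": [1.0, 0.0]}
print(route([1.0, 0.0], nodes))  # identical vector: auto-merge
print(route([1.0, 0.5], nodes))  # similarity of about 0.89: human review
print(route([0.0, 1.0], nodes))  # orthogonal vector: new node
```

Only the top band merges without a human in the loop, which matches the post's point that irreversible operations should have to earn their way in.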
The key insight: entity resolution and deduplication are two distinct decisions. Resolution asks "what should we call this?" Deduplication asks "is this the same real-world entity?" Conflating them is what silently corrupts graphs.
Jensen Huang the NVIDIA CEO and a same-named doctor in Taipei have the same name and the same entity type. Resolution cannot tell them apart. Only full-context deduplication can.
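Why name-level resolution cannot separate the two Jensen Huangs shows up in a small sketch of a short-circuit matching chain. This is an assumption-laden illustration rather than the pipeline's real code: `resolve_name`, the 0.85 fuzzy threshold, and the pluggable `semantic_match` callable are hypothetical, and the fuzzy step uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever matcher the author uses.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def resolve_name(name, canonical_names, fuzzy_threshold=0.85, semantic_match=None):
    """Return (canonical_name, stage), stopping at the first stage that matches.

    `semantic_match` is an optional hypothetical callable
    (name, canonical_names) -> canonical name or None.
    """
    key = normalize(name)

    # Stage 1: exact match on the normalized form.
    for canonical in canonical_names:
        if normalize(canonical) == key:
            return canonical, "exact"

    # Stage 2: fuzzy match, keeping the closest candidate above the threshold.
    best, best_ratio = None, 0.0
    for canonical in canonical_names:
        ratio = SequenceMatcher(None, key, normalize(canonical)).ratio()
        if ratio > best_ratio:
            best, best_ratio = canonical, ratio
    if best is not None and best_ratio >= fuzzy_threshold:
        return best, "fuzzy"

    # Stage 3: semantic match, only reached if the cheaper stages failed.
    if semantic_match is not None:
        hit = semantic_match(name, canonical_names)
        if hit is not None:
            return hit, "semantic"

    # No match: the incoming name becomes its own canonical name.
    return name, "new"


names = ["Jensen Huang"]
print(resolve_name("jensen  huang", names))  # exact
print(resolve_name("Jenson Huang", names))   # fuzzy
print(resolve_name("Lisa Su", names))        # new
```

The function sees only a name, so a mention of the CEO and a mention of the doctor both come back as "Jensen Huang". Deciding whether they are the same real-world entity needs the extra context that the deduplication step brings in.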
False merges are invisible until they are expensive to undo. The pipeline is designed to make irreversible operations earn their way in.
By Paul Iusztin
decodingai.com/p/keep-knowle… #KnowledgeGraphs #GraphAI #AgentMemory #EntityResolution #AIEngineering
--
Connected Data London 2026 | 11–12 November | Leonardo Royal Hotel London Tower Bridge
🎤 Share your work with the world's most passionate data community. The Call for Submissions is open.
connected-data.london/2026-c…
🎟 Tickets on sale now. Early bird discounts up to 30%. 2026.connected-data.london
📺 Sponsorship opportunities available. Contact info@connected-data.london for details.
#KnowledgeGraph #GraphRAG #Ontology #Graph #AI #DataScience #GraphDB #SemTech
AI-driven discovery is evolving from keyword matching to ontology-driven, intent- and semantics-based systems.
Ontology defines meaning structures. Cognition processes intelligence. Entity modeling represents real-world concepts. Relational logic connects entities across contexts. Industrial applications depend on consistent semantic frameworks.
As digital markets become increasingly multilingual, native-script recognition strengthens discoverability, interoperability, and cross-market identity—especially in humanoid robotics, where terminology consistency matters.
Core concepts such as HUMANOID(S), ROBOT(S), and HUMANOID ROBOT are globally recognized yet scarce as exact-match digital assets across languages and scripts. This scarcity elevates the strategic value of internationally aligned naming systems and native-script variants.
PITN.ai holds a portfolio of exact-match digital assets spanning 30 languages in this category, supporting semantic discovery, entity resolution, and branding identity layers across global AI ecosystems.
#Ontology #Cognition #EntityModeling #RelationalLogic #SemanticAI #KnowledgeGraph #IndustrialAI #HumanoidRobotics #AIInfrastructure #MultilingualAI #EntityResolution #DigitalIdentity #BrandingStrategy #AIInnovation
Splink Experience: Seeking User Insights 📊
I'm currently conducting independent research to understand how data analysts and scientists are utilising Splink for record linkage and deduplication.
The goal is to map out common friction points and identify where the community sees the most opportunity for workflow improvements.
I’d love to chat with 4-6 users for a quick 20-minute Zoom session on Feb 17th or 18th.
@RobinLinacre, since you lead Splink, I’d love to share the synthesised findings with you once the research is complete, if you’re interested!
Are you a Splink user? Drop a comment or DM me if you’re open to a brief chat. I'm looking for a mix of beginners and power users.
#Splink #DataScience #EntityResolution #OpenSource #DataEngineering