
RDF Triple Stores vs. Property Graphs : How to Attach Properties to Relationships

Time for the opening shot of a series about Semantic Technology, and in particular contrasting-and-comparing the opposing (but perhaps ultimately complementary) camps of:
  1.  RDF Triple Stores, aka Triples-Based Graphs (for example, Blazegraph or Apache Jena)
  2.  (Labeled) Property Graphs (for example, Neo4j or Blazegraph)

(For this article, I'll assume that you have at least a passing acquaintance with both.  Here is background info on Triplestores and Property Graphs.)

It’s my opinion that modeling in terms of Subject/Predicate/Object triples (aka RDF) might be appealing to mathematicians or philosophers for its minimalist foundation (though a lot of baroque add-ons quickly come out of the closet!).

Modeling in terms of (Labeled) Property Graphs might be appealing to computer scientists, because such graphs appear more usable and less clunky once you start actually doing something with them.

Perhaps because I straddle both the Math and CS camps, I’m currently on the fence about which model I like best.  I wear many hats: as someone involved in a research project on Knowledge Management (BrainAnnex.org), I'm very interested in contrasting and comparing the two models; by contrast, in my role as someone trying to do Bioinformatics and other types of Content Management, I really don't care about the "war of the models" and simply want whatever is best for doing serious work!

N-ary relationships : attaching properties to relationships

There are many aspects to compare, but in this blog entry I’ll just discuss one: N-ary relationships, aka how to attach properties to relationships.

Here's an example from a helpful tutorial by Neo4j, RDF Triple Stores vs. Labeled Property Graphs: What’s the Difference?

With binary relationships, triples are simple and intuitive. To express that there is “a direct flight” (predicate) from NYC (entity) to SF (entity), you can have a subject/predicate/object triple such as:
NYC   hasDirectFlight  SF

That's just the infix form of a mathematical predicate with 2 variables:  P(x, y).   In graph form, it’s an edge (the Predicate) connecting the 2 nodes of the cities.
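
To make the P(x, y) reading concrete, here's a minimal sketch in plain Python. It is not tied to any real triplestore: the `triples` set and the helper names `holds` and `has_direct_flight` are made up for illustration.

```python
# A tiny in-memory "triple store": each fact is a (subject, predicate, object) tuple.
triples = {
    ("NYC", "hasDirectFlight", "SF"),
}

def holds(subject: str, predicate: str, obj: str) -> bool:
    """Return True if the given triple is asserted in the store."""
    return (subject, predicate, obj) in triples

def has_direct_flight(x: str, y: str) -> bool:
    """The 2-variable predicate P(x, y), specialized to 'hasDirectFlight'."""
    return holds(x, "hasDirectFlight", y)

print(has_direct_flight("NYC", "SF"))   # True
print(has_direct_flight("SF", "NYC"))   # False: the edge is directed
```

Note that there is nowhere in a 3-tuple to hang extra data about the flight itself, which is exactly the issue discussed next.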
 
But what if you want to model the fact that the direct flight has distance and price values attached to it? With Property Graph models such as Neo4j's, that’s trivially done: relationships can take as many property/value pairs as you want. End of story!
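
As a rough illustration of the Property Graph idea, here's a sketch in plain Python. This is not the Neo4j API: the `Node` and `Relationship` classes are hypothetical stand-ins, and the distance and price figures are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)

@dataclass
class Relationship:
    start: Node
    rel_type: str
    end: Node
    # The key point: the relationship itself carries property/value pairs.
    properties: dict = field(default_factory=dict)

nyc = Node("City", {"name": "NYC"})
sf = Node("City", {"name": "SF"})

# Illustrative values only, not real flight data.
flight = Relationship(nyc, "HAS_DIRECT_FLIGHT", sf,
                      {"distance_km": 4100, "price_usd": 300})

print(flight.properties["distance_km"])
print(flight.properties["price_usd"])
```

No extra node is needed: the distance and price live directly on the edge.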

With Triple-based Graphs, it’s a somewhat clunky process of adding an extra node that represents the city/city pair as an entity, which can then have properties such as distance and price. This operation is called reification; it's vaguely reminiscent of those annoying “junction tables” needed in relational databases to represent many-to-many relationships.
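
Here's what the extra node looks like in a toy triple set, again in plain Python. The node name `flight1`, the predicate names, and the numeric values are all assumptions made up for this sketch.

```python
# The city/city pair becomes its own node ("flight1"), much like a row
# in a junction table; distance and price hang off that node.
reified_triples = {
    ("flight1", "from", "NYC"),
    ("flight1", "to", "SF"),
    ("flight1", "distance_km", 4100),   # illustrative value
    ("flight1", "price_usd", 300),      # illustrative value
}

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is asserted."""
    return {o for (s, p, o) in reified_triples if s == subject and p == predicate}

def flights_between(origin, destination):
    """Flight nodes whose 'from' is origin and whose 'to' is destination."""
    starts = {s for (s, p, o) in reified_triples if p == "from" and o == origin}
    ends = {s for (s, p, o) in reified_triples if p == "to" and o == destination}
    return starts & ends

for f in flights_between("NYC", "SF"):
    print(f, objects(f, "distance_km"), objects(f, "price_usd"))
```

What was one edge with two properties in the Property Graph version is now four triples plus a two-step lookup.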

Reification

The above flight example shows the extra "reification" node as a diagram.  Again, it's basically a crutch to attach properties to a relationship.

How does one actually add that extra node?  Here's a recipe example from Kno.e.sis that shows two ways to actually add a node to the set of triples. (The triples are shown written as “Turtle code”, which is a concise way to avoid repeating the same subject, etc.  Don’t worry about the prefixes such as rdf: , as they’re just namespaces.)
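
For readers who can't see that recipe, here's a sketch of one well-known approach, the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). I'm not claiming it matches the Kno.e.sis recipe. The `ex:` namespace, the node name `ex:flight1`, the property names, and the values are hypothetical, and the helpers `reify` and `to_turtle` are made up for this example. The output also shows how Turtle's `;` avoids repeating the subject.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/"   # hypothetical namespace for this sketch

def reify(statement_id, subject, predicate, obj, extra):
    """Describe the statement (subject, predicate, obj) as a node of its own,
    then append extra (predicate, object) pairs about that statement."""
    pairs = [
        ("rdf:type", "rdf:Statement"),
        ("rdf:subject", subject),
        ("rdf:predicate", predicate),
        ("rdf:object", obj),
    ]
    pairs.extend(extra)
    return statement_id, pairs

def to_turtle(subject, pairs):
    """Serialize one subject with many predicate/object pairs as Turtle text,
    using ';' so the subject is written only once."""
    lines = [f"@prefix rdf: <{RDF_NS}> .", f"@prefix ex: <{EX_NS}> .", ""]
    body = " ;\n    ".join(f"{p} {o}" for p, o in pairs)
    lines.append(f"{subject} {body} .")
    return "\n".join(lines)

node, pairs = reify(
    "ex:flight1", "ex:NYC", "ex:hasDirectFlight", "ex:SF",
    [("ex:distanceKm", "4100"), ("ex:priceUsd", "300")],  # illustrative values
)
print(to_turtle(node, pairs))
```

Six predicate/object pairs to say what a Property Graph says with one annotated edge: that's the clunkiness in a nutshell.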


I’m not trying to say that Property Graphs are overall better than Triple-based graphs.  I'm just pointing out that N-ary relationships, while eminently doable in RDF, are clunkier and a little less intuitive there.


My Tentative Conclusions

At present, I lean heavily in favor of the open-source Neo4j, though for some purposes triplestores do just fine (for example, in the Knowledge-Representation and Media Management open source project Brain Annex, I make use of ARC2, a PHP library for working with RDF.)

I've also had promising initial results with Blazegraph.

By contrast, I advise against Virtuoso, a triplestore I found to be clunky, bloated, buggy, and with an inadequate ecosystem.  (At a past job, we squandered a lot of time trying to use it, both versions 6 and 7.)


Beyond RDF

I'll just mention in passing that there is an extension of RDF called RDF* (with a matching extension of its query language SPARQL, called SPARQL*), which somewhat simplifies/streamlines the clunky reification process of RDF.  The open-source Blazegraph supports RDF*, above and beyond supporting RDF (both triples and quads) and also providing a Property Graph modality.
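
The core idea of RDF* is that a whole triple can itself be the subject of another triple. Here's a toy sketch of that idea with nested Python tuples. It is not a real RDF* parser or the Blazegraph API: the helper names `annotations` and `term` are made up, the values are illustrative, and the printed text only approximates the `<< ... >>` notation used for quoted triples.

```python
# The plain edge, exactly as in ordinary RDF.
edge = ("NYC", "hasDirectFlight", "SF")

# In RDF*, the edge itself may appear in subject position.
star_triples = [
    edge,
    (edge, "distance_km", 4100),   # illustrative value
    (edge, "price_usd", 300),      # illustrative value
]

def annotations(quoted):
    """Properties attached directly to a quoted triple."""
    return {p: o for (s, p, o) in star_triples if s == quoted}

def term(t):
    """Render a term; nested triples are wrapped in << ... >>."""
    if isinstance(t, tuple):
        return "<< " + " ".join(term(x) for x in t) + " >>"
    return f":{t}" if isinstance(t, str) else str(t)

for s, p, o in star_triples:
    print(term(s), term(p), term(o), ".")

print(annotations(edge))
```

Compared with the reification approach, no intermediate node has to be invented: the properties attach to the edge itself, which is much closer in spirit to a Property Graph.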
