Interactomics + Super (or Quantum) Computers + Machine Learning : the Future of Medicine?

[Updated Mar. 2021]

Interactomics today bears a certain resemblance to genomics in the  1990s...  Big gaps in knowledge, but an explosively-growing field of great promise.

If you're unfamiliar with the terms, genomics is about deciphering the gene sequence of an organism, while interactomics is about describing all the relevant bio-molecules and their web of interactions.

A Detective Story

Think of a good police-detective story; typically there is a multitude of characters, and an impossible-to-remember number of relationships: A hates B, who loves C, who had a crush on D, who always steers clear of E, who was best friends with A until D arrived...

Yes, just like those detective stories, things get very complex with our biological story!  Examples of webs of interactions, familiar to many who took intro biology, are the Krebs cycle for metabolism or the Calvin cycle to fix carbon into sugars in plant photosynthesis.

Now, imagine vastly expanding those cycles of reactions - the bane of biology students who need to memorize them - to cover all the cellular functions, in all cell types, at various points in time, in various organisms.  Oh, and add quantitative information, such as concentration (a function of location and time), and reaction parameters...

Welcome to Interactomics :)
[We choose to go to the Moon in this decade and do the other things,] not because they are easy, but because they are hard;  [...] because that challenge is one that we are willing to accept, one we are unwilling to postpone, and one we intend to win  -J. F. Kennedy's speech

The Characters   

Back to the detective-story analogy, who are the "characters"?  Well, the genome (DNA) is well-described.  The proteins not as well.  The number of proteins in humans (the "proteome") is some 20 to 30 thousand.  The Human Proteome Map (HPM) project mentions about 30,000 proteins - and that's without counting proteolysis events (protein breakdowns), and other post-translational modifications ("translation" is one of the major steps in the generation of new proteins.)  In another blog entry, I provide a primer on the complexities of proteins.

In addition to DNA and proteins, the "cast of actors" of course also includes a variety of other biomolecules, such as RNA, lipids and ATP (the molecule widely used for energy storage), not to mention various small molecules.

The Interactions

Just as our detective story would get dull if the characters remained in isolation, the story of Life gets interesting when the biomolecules start interacting with one another.  In principle, with 30,000 proteins, one could have about 450 million pairwise interactions!  Fortunately, proteins tend to be specific in their interactions, so many of the conceptual pairings don't actually occur.
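
For the curious, here is a quick check of that "450 million" figure: a minimal sketch that simply counts unordered pairs of distinct proteins, taking the 30,000 estimate at face value.  The variable names are mine, and self-interactions (a protein binding copies of itself) are left out of this count.

```python
from math import comb

# Upper-end estimate of the human proteome size quoted above
n_proteins = 30_000

# Unordered pairs of distinct proteins: n * (n - 1) / 2
possible_pairs = comb(n_proteins, 2)

print(f"{possible_pairs:,} conceivable protein-protein pairings")
```

The count grows with the square of the number of proteins, which is why the specificity of real proteins matters so much: it prunes most of those conceivable pairings.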

Still, the numbers are large.  And that's just the protein-protein interactions!  Needless to say, protein-DNA interactions are equally vital (in particular, to regulate gene expression), and other biomolecules cannot be left out, either...

So, we have large numbers of "actors" (bio-molecules) and dizzying numbers of "relationships" (reactions.)

The Missing Parts

To further complicate matters, not all "actors" have been characterized yet.  And that's even more true for the "relationships."  Projects such as the REACTOME have been hard at work to round up all the known interactions.  I think it's fascinating to take a peek at their interactive Pathway Browser : to me, it feels like being let in past the curtains to peek at the Inner Workings of the life force!

But what to do with the unknown parts?  Generally speaking, they can be explored experimentally or with computer simulations ("molecular dynamics" simulations.)

Molecular Dynamics Computing

"Molecular dynamics" simulations are very complex, even with powerful computers.  In brief, that's because of the large number of atomic nuclei and electrons in biomolecules, interacting with all other atomic nuclei and electrons (less so with those farther away, but still a large number interactions.)

Hence the Supercomputers and Quantum Computers mentioned in the title.  Supercomputers have been riding the recent revolution in GPU performance.  "For the first time in history, most of the flops added to the TOP500 [supercomputer] list came from GPUs instead of CPUs" (June 2018 article.)

Very recently, at the end of 2020, a breakthrough Machine-Learning approach, the AlphaFold 2 project by Google's DeepMind company, has been able to find patterns in known protein shapes, to the point of fairly accurately predicting shapes of other proteins (5-min summary, slightly more detailed intro, more depth, and a Nov. 2020 article in the journal Nature.)

And quantum computers are expected to be especially helpful for simulating molecular dynamics.  A topic for future blog entries!  For now, let me just mention a 12-minute PBS video with one of the best intros to qubits and quantum computing (especially its underlying math), and an interesting, though a little dated (2018) online course:  The Building Blocks of a Quantum Computer

Systems Biology : Quantitative Modeling

Let's try to put it all together.  We have a relatively good set of "actors" and a rather incomplete set of relationships.  What's next?  Quantitative modeling!  But how do we do that, given our rather limited knowledge of "initial conditions" (for example, concentrations in each of the grid partitions introduced for modeling the cell), and given our wobbly knowledge of reaction parameters?

Well...  unknown initial conditions...  partially known "weights"...  that's a job for Machine-Learning style optimization techniques!  Perhaps a mix of gradient descent and genetic algorithms (i.e. artificial Directed Evolution, one of my research areas in Theoretical Neuroscience.)  
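
Here is a deliberately tiny sketch of the gradient-descent half of that idea.  It assumes a made-up, single-reaction "system" (a first-order decay, A(t) = A0 * exp(-k * t)) where both the initial concentration A0 and the rate constant k are unknown, and recovers them from synthetic observations.  The reaction, the numbers and the function names are my own illustrative choices; a real cell model would have vastly more parameters.

```python
import math

# Synthetic "observations" of a first-order decay A(t) = A0 * exp(-k * t)
TRUE_A0, TRUE_K = 2.0, 0.7             # hidden values the fit should recover
times = [0.5 * i for i in range(11)]   # t = 0, 0.5, ..., 5
observed = [TRUE_A0 * math.exp(-TRUE_K * t) for t in times]

def loss_and_gradient(a0, k):
    """Mean squared error and its partial derivatives w.r.t. a0 and k."""
    n = len(times)
    loss = grad_a0 = grad_k = 0.0
    for t, obs in zip(times, observed):
        decay = math.exp(-k * t)
        residual = a0 * decay - obs
        loss += residual ** 2 / n
        grad_a0 += 2 * residual * decay / n
        grad_k += 2 * residual * (-a0 * t * decay) / n
    return loss, grad_a0, grad_k

def fit(a0=1.0, k=0.2, learning_rate=0.05, steps=20_000):
    """Plain gradient descent from a rough initial guess."""
    for _ in range(steps):
        _, grad_a0, grad_k = loss_and_gradient(a0, k)
        a0 -= learning_rate * grad_a0
        k -= learning_rate * grad_k
    return a0, k

a0_fit, k_fit = fit()
print(f"Estimated A0 = {a0_fit:.3f}, k = {k_fit:.3f}")
```

Note that the unknown initial condition and the unknown reaction parameter are treated in exactly the same way: both are just numbers to nudge until the simulated values match the data.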

But what's the counterpart of the "loss function", aka "fitness function" (that is, a gauge of how well the system is performing)?  That seems hard to define, but a simulated cell that can divide appropriately, and interact with simulated environments in ways that mimic real cells - i.e. exhibit appropriate phenotypes - could correspond to better performance scores.  In the words of this 2015 article in Trends in Cell Biology, Why build whole-cell models? : "quantify variation in how individual cells in a population express a set of genes in response to an environmental signal."
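
And here is an equally small sketch of the genetic-algorithm side, with an explicit fitness function.  As a stand-in for a "phenotype", I assume a handful of hypothetical concentration readouts from a first-order decay; candidates whose simulated readouts come closer to the observed ones score higher.  The simulator, the mutation scheme and all the numbers are assumptions for illustration only, nothing like a real whole-cell model.

```python
import math
import random

TIMES = [0.0, 1.0, 2.0, 3.0, 4.0]
# Synthetic "observed phenotype": readouts generated with A0 = 2.0, k = 0.7
OBSERVED = [2.0 * math.exp(-0.7 * t) for t in TIMES]

def simulate(a0, k):
    """Toy simulator: concentration readouts of a first-order decay."""
    return [a0 * math.exp(-k * t) for t in TIMES]

def fitness(individual):
    """Higher is better: negative squared mismatch with the observations."""
    a0, k = individual
    return -sum((p - o) ** 2 for p, o in zip(simulate(a0, k), OBSERVED))

def evolve(generations=200, pop_size=40, sigma=0.1, seed=0):
    rng = random.Random(seed)
    population = [(rng.uniform(0, 5), rng.uniform(0, 2)) for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        survivors = population[: pop_size // 2]      # the fitter half lives on
        children = [                                 # mutated copies of survivors
            (a0 + rng.gauss(0, sigma), k + rng.gauss(0, sigma))
            for a0, k in survivors
        ]
        population = survivors + children
    population.sort(key=fitness, reverse=True)
    return population[0], best_per_generation

best, history = evolve()
print(f"Best candidate: A0 = {best[0]:.3f}, k = {best[1]:.3f}")
print(f"Best fitness went from {history[0]:.4f} to {fitness(best):.6f}")
```

Because the fitter half of the population is always carried over unchanged, the best score can never get worse from one generation to the next.  The hard part, as noted above, is not the algorithm but deciding what a meaningful fitness score is for something as rich as a living cell.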

Machine Learning approaches are also discussed in this 2012 article in the Proceedings of the National Academy of Sciences, Computational design of genomic transcriptional networks with adaptation to varying environments.  Of course, Machine Learning has many more immediate uses in medicine, such as finding cancerous patterns in medical images (here's a very recent, 2/2019 article on AI in Cancer Imaging), but the focus of this blog entry is quantitative modeling of the cell.

Yes, it's a tall order.  A good place to start is probably the simplest of organisms.  For example, here's a fascinating 2012 article in Cell, where whole-cell simulations are applied to Mycoplasma genitalium, one of the simplest bacteria known, with just 525 genes in its genome.  In that article, the authors' computer simulations provided insight into that bacterium's protein-DNA association, and into its replication.

Envisioning the Future

A possible sequence of events that could profoundly shape Medicine in the 21st century is quantitative modeling of prokaryote (bacterial) cells, followed by quantitative modeling of eukaryote cells (complex cells with a nucleus, including human cells), then of tissues, and finally of whole systems/organisms.

How will all that unfold?  Among the key players, I envision institutions or companies that are fluent in bringing together the best available biological datasets (such as the REACTOME) and their frequent updates, that add good analytics, a web interface and an API for usability...  and that then work closely with academia to add quantitative modeling and machine learning.  A mix of open-source/open-data and licensed offerings might be especially good - to work tightly with academia and public institutions, while at the same time raising money for operations and research.

