
Life123 : Quantitative Modeling of Biological Systems

(UPDATED 8/2022) - Are we ready to embark on next-generation, detailed quantitative modeling of complex biological systems, including whole-cell simulations? 

A jump in computing power may be imminent from Photonics computers (which I discuss here), and GPUs are rapidly gaining power as well...  Are we in a ready state to put existing - and upcoming - power to good use?

This is a manifesto, and a call to action

What's Life123?

It's about detailed quantitative modeling of biological systems in 1-D, 2-D and full 3-D, as well as a multi-faceted software platform for doing so.

What's (pseudo-)1D?  For now, let's say it's like the inside of a long, thin tube - with no interactions with the tube.  Likewise, (pseudo-)2D can be thought of as a Petri dish, with no interactions with the lid or the bottom.

Website: https://life123.science

A purposeful decision to also utilize 1D and 2D

But why?  Yes, it's in part about "walk before you run"...  but, more specifically, it's about the freedom to sometimes choose to avoid the distraction of higher dimensions, and focus on the essence of the features... and focus on incorporating good habits very early on, before the complexity, the long runs and the difficult visualizations of 3D come into being.

What good habits?  Just to mention a few:

COMPUTING:  GPU-assisted computing, parallelized computations utilizing multiple CPU cores and/or multiple computers.

MODELING:  variable spatial and temporal resolutions.  A modular approach of coarse-to-fine models as needed.  From very early on, the project will model reaction rates, chemical diffusion, membranes and cellular compartments.  Manage both "normal" concentrations and extremely low ones (such as macro-molecules that are few in number).

INTERACTIVE VISUALIZATION:  plots, graphs, heatmaps, etc, with interactive adjustable controls.  Extremely personalizable to deal with things like membranes and compartments.

UNIT TESTING: "an ounce of prevention is worth a pound of cure"

MODULARITY: being very disciplined in tackling a large software and data science project.

INFRASTRUCTURE / PLATFORM: a tight alliance between the tools of large-scale software engineering (such as IDEs), the tools of data science (such as JupyterLab) and of data engineering (such as the Neo4j graph database).

MULTIPLE AUDIENCES: address, in the platform and in the documentation, the background and needs of different classes of people, such as programmers, data scientists, chemists, biologists.

 

EXAMPLE.  In the modeling category, take variable spatial and temporal resolutions.  Let's say we have several chemical species that are in near-equilibrium, both in terms of diffusion and reactivity...  and then we have other species that - in some locations and some time periods - have rapidly changing concentrations, perhaps because they're produced or consumed in reactions...

Do we really want to waste a lot of high resolution and computing power on the species, locations, time periods that are near-equilibrium?  Conversely, do we want to only coarsely simulate the highly dynamic species/locations/time periods?

Biology informs us that there's quite a range of time scales: for example, impulse firing in neurons is of the order of milliseconds...  while DNA replication is of the order of an hour.  Many orders of magnitude!

How do we best model variable spatial and temporal resolutions for some of the chemical species?  Well, Life123 is a great environment to tackle those design decisions, without being immediately slammed with the intricacies of full 3D!
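
To make the idea of variable temporal resolution concrete, here is a minimal sketch in plain Python.  It is NOT the Life123 implementation: the reversible reaction A <-> B, the rate constants, the thresholds and the function name simulate_adaptive are all assumptions made up for illustration.  The time step is halved whenever the concentrations would change too much in one step, and doubled again when the system is quiet.

```python
def simulate_adaptive(a0, b0, k_fwd, k_rev, t_end,
                      dt_init=0.1, max_rel_change=0.05, dt_max=0.2):
    """Forward-Euler integration of A <-> B with a self-adjusting time step.

    Hypothetical helper, for illustration only.  dt_max must stay within
    the stability limit of forward Euler, i.e. below 2 / (k_fwd + k_rev).
    """
    t, a, b, dt = 0.0, float(a0), float(b0), dt_init
    history = [(t, a, b, dt)]
    while t < t_end:
        dt = min(dt, t_end - t)                 # do not overshoot the end time
        delta = (k_fwd * a - k_rev * b) * dt    # net amount of A turned into B
        rel = abs(delta) / max(a, b, 1e-12)     # change relative to the larger pool
        if rel > max_rel_change:
            dt *= 0.5                           # too dynamic: retry with a finer step
            continue
        a, b, t = a - delta, b + delta, t + dt  # accept the step
        history.append((t, a, b, dt))
        if rel < max_rel_change / 4:
            dt = min(dt * 2.0, dt_max)          # near equilibrium: coarsen the step
    return history


# Made-up numbers: all of the material starts out as A
history = simulate_adaptive(a0=10.0, b0=0.0, k_fwd=3.0, k_rev=1.0, t_end=10.0)
final_a, final_b = history[-1][1], history[-1][2]
steps = [h[3] for h in history[1:]]             # time steps actually taken
print(f"{len(steps)} steps, smallest {min(steps):.4f}, largest {max(steps):.4f}")
print(f"final A = {final_a:.4f}, final B = {final_b:.4f}")
```

In this toy run the early, fast-changing phase gets small steps, while the near-equilibrium phase is covered with much larger ones.  The same trade-off could in principle be applied per species or per location, which is exactly the kind of design decision discussed above.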

Fundamental Goals 

  1. Detailed, quantitative biological simulations, including whole prokaryotic cells (bacteria), and later eukaryotic cells

  2. Deeper quantitative insight into human tissue/organ/system physiology, for the advancement of medicine

  3. A very integrative approach that is ultimately conducive to body-wide insights, with an eye to Longevity Science

  4. Explore the minimalist essence of life-like dynamical systems, including their evolution under "genetic algorithms" and other machine-learning approaches.  Also, explore chaotic states

  5. Investigate potential paths for the emergence of life on Earth and on Exoplanets

  6.  A community effort bringing together biologists, systems biologists, programmers, machine-learning specialists, biochemists, power-computing engineers, doctors, chemical engineers, data scientists, visualization/UX experts, members of the public & institutions willing to share computing resources, etc.

What Life123 is *NOT*

  • A tool primarily for educational purposes
  • A computer game
  • An art project
  • A Reaction-Diffusion exercise as an end in itself
  • Molecular Dynamics (the reaction rates are assumed already known, or at least with interim estimated values)
  • Detailed modeling of the biophysics of membranes, etc.
  • A mathematical exercise in dynamical systems as an end in itself
  • Wolfram-style "cellular automata"
  • An exploration of hypothetical Physics in one or two dimensions

Note that "as an end in itself" is the operative phrase here; some of those categories do overlap with Life123

Broad Strategies ("Guiding Principles")

  • Aim big, for a simulation scale that may be impractical at the present... but attain a ready state to pounce on the latest advances in computing capabilities - in particular GPU computing, and possibly the upcoming Photonics Computing GPU accelerators

  • Not attempting to create a description and simulation of what happens at the molecular, or near-molecular level.  It's going to be a much-coarser model that's still accurate enough to capture the essence of the cell's (or system's) behavior: for example, how it responds to inputs and what outputs it produces - what it absorbs, what it secretes, what it does with its internal state, including replication, etc.

    Excessive simulation detail, with an ultra-fine spatial/temporal grid, could result in excessive information content about the cell (or system): "excessive", in the sense that it may be beyond the information content required to describe the system's "computational capability" (broadly defined; see discussion on dynamical systems and theoretical aspects).

  • A mix of "bottom up" and "top down" approaches:

    "bottom up", as in starting with minimalist scenarios with fictional molecules and rates, even in 2D or 1D, and gradually advancing to full 3D, real molecules and plausible diffusion/reaction rates, concentrations, etc...

    "top down", as in always setting the target on the final goal of realistic biological systems.  That will be the internal compass always guiding this project.

What does pseudo-1D/-2D mean?

PSEUDO is the operative word!  A true 1D or 2D world would be a speculative exercise in the Foundations of Physics...  In 1D, it'd be molecules on a string...  Would they be able to "pass" each other?  Fermions that can't occupy the same space?  Quantum tunneling to "slide over each other"?  How would force transmit across 1D?  (In 3D, force fields dilute with the square of distance... in 2D, with the distance... but in 1D?)
 
NO!  Life123 isn't about any of that!  If any physicist wants to chat on the side on fascinating concepts of actual 1D or 2D, by all means - but that's completely outside the scope and goals of this project.

Let's think of 1D more like a very thin, long tube of aqueous solution - minus the tube!  In particular, for example, no capillary effects!  Likewise, in 2D, think of the Petri-dish minus the actual dish!
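
As a minimal sketch of what "a thin tube minus the tube" can look like in code: a row of bins holding concentrations, with diffusion between neighboring bins and nothing leaking out at the two ends.  This uses plain NumPy rather than the Life123 API; the function name diffuse_1d and all the numbers are assumptions for illustration only.

```python
import numpy as np


def diffuse_1d(conc, diff_coeff, dx, dt, n_steps):
    """Explicit finite-difference diffusion along a row of bins (no-flux ends).

    Hypothetical helper, for illustration only.
    """
    k = diff_coeff * dt / dx**2
    if k > 0.5:
        # Standard stability limit of the explicit scheme in 1D
        raise ValueError("time step too large for stable explicit diffusion")
    c = np.asarray(conc, dtype=float).copy()
    for _ in range(n_steps):
        # Repeating the end bins makes the flux through each end zero
        padded = np.pad(c, 1, mode="edge")
        c = c + k * (padded[:-2] - 2.0 * c + padded[2:])
    return c


# Made-up scenario: all the material starts in the leftmost bin
initial = np.zeros(10)
initial[0] = 100.0
final = diffuse_1d(initial, diff_coeff=1.0, dx=1.0, dt=0.2, n_steps=2000)
print(final.round(3))   # the material spreads out evenly along the "tube"
print(final.sum())      # the total amount is conserved
```

There are no walls, no capillary effects and no interaction with a container: the only "physics" here is the exchange between adjacent bins.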

How could 1D or 2D simulations ever be realistic?

They can't - and aren't meant to.  Remember, simulations in 1D and 2D are all about setting up good practices in computing, data modeling, and chemical and biological modeling, as a stepping stone to full 3D simulations.

Computing 

  • Python
  • NumPy
  • JupyterLab
  • Custom visualizations with D3.js and Vue.js (a great alliance I discuss here)
  • Network visualizations with cytoscape.js
  • GPU acceleration
  • Multi-core computing, perhaps making use of Dask
  • Distributed computing
  • Neo4j Graph databases (a lot of full-stack infrastructure for that is available thru the open-source sister project Brain Annex)
  • TensorFlow machine learning
  • Unit testing with PyTest

Chemistry 

  • Chemical diffusion
  • Chemical reaction rates
  • Coupled reactions, together with diffusion 
  • Diffusion across membranes (passive and active transport)
  • Temperature effects
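
To illustrate how the first few items in this list fit together, here is a small sketch (plain NumPy, not the Life123 API) that couples a reversible reaction A <-> B with diffusion in pseudo-1D.  Each time step is split into a diffusion part and a reaction part.  The splitting scheme, the function names and all the numbers are assumptions chosen for illustration only.

```python
import numpy as np


def laplacian_no_flux(c):
    """Discrete second difference along the bins, with no flux through the ends."""
    padded = np.pad(c, 1, mode="edge")
    return padded[:-2] - 2.0 * c + padded[2:]


def reaction_diffusion_step(a, b, d_a, d_b, k_fwd, k_rev, dx, dt):
    """One explicit step: first diffuse each species, then react within each bin."""
    a = a + d_a * dt / dx**2 * laplacian_no_flux(a)
    b = b + d_b * dt / dx**2 * laplacian_no_flux(b)
    converted = (k_fwd * a - k_rev * b) * dt   # net A -> B conversion in each bin
    return a - converted, b + converted


# Made-up scenario: 10 bins, all the material starts as A in the leftmost bin
conc_a = np.zeros(10)
conc_a[0] = 30.0
conc_b = np.zeros(10)

for _ in range(5000):
    conc_a, conc_b = reaction_diffusion_step(
        conc_a, conc_b, d_a=1.0, d_b=0.2, k_fwd=2.0, k_rev=1.0, dx=1.0, dt=0.1
    )

print(conc_a.round(4))   # approaches a uniform value
print(conc_b.round(4))   # approaches k_fwd / k_rev times the value of A
```

With these values the system settles into a uniform state where the ratio of B to A equals k_fwd / k_rev, and the combined amount of A and B stays constant throughout.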

Biology

  • Cellular compartments
  • Macro molecules
  • Transcription/Translation
  • Replication

Isn't this overly ambitious?

To paraphrase former president J. F. Kennedy's famous speech about going to the Moon:

We choose to pursue quantitative modeling of complex biological systems not because they are easy, but because they are hard;  because that challenge is one that we are willing to accept, one we are unwilling to postpone, and one we intend to win!

Where does the project currently stand?

[8/2022 update in box below!]  

As of today, with the propitious date (in U.S. format) of  3.14, I am officially releasing to open source the early Beta version of the Life123 platform: GitHub repository

Accompanying the software at this relatively early stage is:

  • This project's "manifesto"
  • A "call to arms" to the community, to be possibly followed by a Discord channel to coordinate. For now, ALL DISCUSSION will take place here
  • A new website: https://life123.science

This project originates from years of discussions I've had with colleagues in the pharmaceutical industry, researchers, and academicians...  as well as papers such as A Whole-Cell Computational Model Predicts Phenotype from Genotype...  and from my own professional background in computing, machine learning, molecular biology, bioinformatics and systems biology.  
 
The current platform stems from early research I carried out in 2020, and active development of the alpha version in late 2021 thru the present.

The underlying philosophical framework is detailed in this entry I wrote in 2019.

August 2022 UPDATE: the platform (repo) is shaping up nicely, and entering a late Beta stage.

In particular, the infrastructure to create interactive visualizations, possibly custom ones (with plotly, Vue.js and D3.js) is getting more polished and streamlined. 
The cytoscape.js library has been brought in for network visualizations.

The website is much more substantial, and existing notebooks and their visualizations are just a click away.
 
A live demo of all JupyterLab notebooks is also one click away!  
 
Ways for you to get involved are getting spelled out in more detail

 
Micro-blogging, ready to later turn into an early discussion platform, was added.

With the recently released Beta 12 version (change-log), Life123 has by now established a substantial amount of the infrastructure for data science, software management and visualization - for diffusion and (single binding site) reactions.

The really interesting stuff is now just starting!  Upcoming releases will be introducing elements that have been on the table from the very inception of the project:

- membranes
- macro-molecules (with multiple binding sites)
- passive/active transport across membranes

How to scale up?

This is a community project meant to bring together a wide variety of skill sets

In the early fall,  as the foundations get more solid, I'll start publicizing the project, and actively seeking collaborators.

Down the line, some non-profits and companies may opt to get involved.  Perhaps philanthropists and/or investors as well.

A Call to the Community

This is an open call to researchers, academicians, computer scientists, students, colleagues in the pharmaceutical industry, doctors, philanthropists, members of the public.

In the Longevity-Science community, which I've been active in - and at times working in - for a number of years, I hear a lot of "how can I help?"  Well, here's a way!

Do you have skills in:

  • Biology
  • Python programming
  • CUDA programming, or other ways to utilize GPU computing
  • TensorFlow and/or other machine learning 
  • Plotly or D3.js visualization
  • UX 
  • Chemical engineering 
  • Biochemistry
  • Biophysics
  • Systems biology
  • Bioinformatics
  • Medicine
  • Web design
  • Technical writing
  • QA / DevOps

Or do you have access to:

  • Computing resources - such as your own gaming PC, or your company's computing (if you're authorized)
  • Funds for research projects, philanthropy or investments
  • Ways to spread the word (social media, etc)

We need you in all of the above scenarios, and more!  Help us design, implement, test, refine and run the simulations and the platform to scale.
 
I expect to later start a Discord channel.  For now, all discussions (as well as micro-blogging) will take place here.  Also, you're welcome to reach me on LinkedIn or on my professional Facebook account.

The website has a page detailing areas of potential involvement for people with particular skillsets.
