An anticipated up-jump in computing power may be imminent from Photonics computers (which I discuss here), and GPU's are rapidly gaining power as well... Are we in ready state to put existing - and upcoming - power to good use?
This is a manifest, and a call to action
What's Life123?
It's about detailed quantitative modeling of biological systems in 1-D, 2-D and full 3-D, as well as a multi-faceted software platform for doing so.What's (pseudo-)1D? For now, let's say it's like the inside of a long, thin tube - with no interactions with the tube. Likewise, (pseudo-)2D can be thought of as a Petri dish, with no interactions with the lid or the bottom.
Website: https://life123.science
A purposeful decision to also utilize 1D and 2D
But why? Yes, it's in part about "walk before you run"... but, more specifically, it's about the freedom to sometimes choose to avoid the distraction of higher dimensions, and focus on the essence of the features... and focus on incorporating good habits very early on, before the complexity, the long runs and the difficult visualizations of 3D come into being.
What good habits? Just to mention a few:
COMPUTING: GPU-assisted computing, parallelized computations utilizing multiple CPU cores and/or multiple computers.
MODELING:
variable spacial and temporal resolutions. A modular approach of
coarse-to-fine models as needed. From very early on, the project will model reaction rates, chemical
diffusion, membranes and cellular compartments. Manage both "normal" concentrations and extremely-low ones (such as macro-molecules that are few in numbers.)
INTERACTIVE VISUALIZATION: plots, graphs, heatmaps, etc, with interactive adjustable controls. Extremely personalizable to deal with things like membranes and compartments.
UNIT TESTING: "an ounce of prevention is worth a pound of cure"
MODULARITY: being very disciplined in tackling a large software and data science project.
INFRASTRUCTURE / PLATFORM: a tight alliance between the tools of large-scale software engineering (such as IDE's), the tools of data science (such as JupyterLab) and of data engineering (such as Neo4j graph database.)
MULTIPLE AUDIENCES: address, in the platform and in the documentation, the background and needs of different classes of people, such as programmers, data scientists, chemists, biologists.
EXAMPLE. In the modeling category, take variable spacial and temporal resolutions. Let's say we have several chemical species that are in near-equilibrium, both in terms of diffusion and reactivity... and then we have other species that - in some locations and some time periods - have highly variable concentration rates, perhaps because they're produced or consumed in reactions...
Do we really want to waste a lot of high resolution and computing power on the species, locations, time periods that are near-equilibrium? Conversely, do we want to only coarsely simulate the highly dynamic species/locations/time periods?
Biology informs us that there's quite a range of time scales: for
example, impulse firing in neurons is of the order of milliseconds...
while DNA replication is of the order of a hour. Many orders of magnitude!
How do we best model variable spacial and temporal resolutions for some of the chemical species? Well, Life123 is a great environment to tackle those design decisions, without being immediately slammed with the intricacies of full 3D!
Fundamental Goals
- Detailed, quantitative biological simulations, including whole prokaryotic cells (bacteria), and later eukaryotic cells
- Deeper quantitative insight into human tissue/organ/system physiology, for the advancement of medicine
- A very integrative approach that is ultimate conducive to body-wide insights,
with an eye to Longevity Science
- Explore the minimalist essence of life-like dynamical systems, including their evolution under "genetic algorithms" and other machine-learning approaches. Also, explore chaotic states
- Investigate potential paths for the emergence of life on Earth and on Exoplanets
- A community effort bringing together biologists, system biologists, programmers, machine-learning specialists, biochemists, power-computing engineers, doctors, chemical engineers, data scientists, visualization/UX experts, members of the public & institutions willing to share computing resources, etc.
What Life123 is *NOT*
- A tool primarily for educational purposes
- A computer game
- An art project
- A Reaction-Diffusion exercise as an end to itself
- Molecular Dynamics (the reactions rates are assumed already known, or at least with interim estimated values)
- Detailed modeling of the biophysics of membranes, etc.
- A mathematical exercise in dynamical systems as an end to itself
- Wolfram-style "cellular automata"
- An exploration of hypothetical Physics in one or two dimensions
Note that "as an end to itself" is the operative phrase here; some of those categories do overlap with Life123
Broad Strategies ("Guiding Principles")
- Aim big, for a simulation scale that may be impractical at the present... but attain a ready state
to pounce on the latest advances in computing capabilities - in particular GPU computing, and possibly
the upcoming Photonics Computing GPU accelerators
- Not attempting to create a description and simulation of what happens at the molecular, or near-molecular level. It's going to be much a much-coarser model that's still accurate enough to capture the essence of the cell's (or system's) behavior: for example, how it respond to inputs and what outputs it produces - what it absorbs, what it secretes, what it does with its internal state, including replication, etc.
Excessive simulation detail with an ultra-fine spatial/temporal grid, could result in a possibly-excessive information content about the cell (or system): "excessive", in the sense that it may be beyond the information content required to describe the system's "computational capability." (broadly defined; see discussion on dynamical systems and theoretical aspects ) - A mix of "bottom up" and "top down" approaches:
"bottom up", as in starting with a minimalist scenarios with fictional molecules and rates, even in 2D or 1D, and gradually advancing to full 3D, real molecules and plausible diffusion/reactions rates, concentrations, etc...
"top down", as in always setting the target on the final goal of realistic biological systems. That will be the internal compass always guiding this project.
What does pseudo-1D/-2D mean?
Let's think of 1D more like a very thin, long tube of aqueous solution - minus the tube! In particular, for example, no capillary effects! Likewise, in 2D, think of the Petri-dish minus the actual dish!
How could 1D or 2D simulations ever be realistic?
Computing
- Python
- NumPy
- JupyterLab
- Custom visualizations with D3.js and Vue.js (a great alliance I discuss here)
- Network visualizations with cytoscape.js
- GPU acceleration
- Multi-core computing, perhaps making use of Dask
- Distributed computing
- Neo4j Graph databases (a lot of full-stack infrastructure for that is available thru the open-source sister project Brain Annex)
- TensorFlow machine learning
- Unit testing with PyTest
Chemistry
- Chemical diffusion
- Chemical reaction rates
- Coupled reactions, together with diffusion
- Diffusion across membranes (passive and active transport)
- Temperature effects
Biology
- Cellular compartments
- Macro molecules
- Transcription/Translation
- Replication
Isn't this overly ambitious?
To paraphrase former president J. F. Kennedy's famous speech about going to the Moon:
We choose to purse quantitative dynamical modeling of complex biological systems not because they are easy, but because they are hard; because that challenge is one that we are willing to accept, one we are unwilling to postpone, and one we intend to win!
Where does the project currently stand?
[8/2022 update in box below!]
As of today, with the propitious date (in U.S. format) of 3.14, I am officially releasing to open source the early Beta version of the Life123 platform: GitHub repository
Accompanying the software at this relatively early stage is:
- This project's "manifesto"
- A "call to arms" to the community, to be possibly followed by a Discord channel to coordinate. For now, ALL DISCUSSION will take place here
- A new website: https://life123.science
In particular, the infrastructure to create interactive visualizations, possibly custom ones (with plotly, Vue.js and D3.js) is getting more polished and streamlined. The cytoscape.js library has been brought in for network visualizations.
The really interesting stuff is now just starting! Upcoming releases will be introducing elements that have been on the table from the very inception of the project:
- membranes
- macro-molecules (with multiple binding sites)
- passive/active transport across membranes
How to scale up?
This is a community project meant to bring together a wide variety of skill sets.
In the early fall, as the foundations get more solid, I'll start publicizing the project, and actively seeking collaborators.
Down the line, some non-profits and companies may opt to get involved. Perhaps philanthropists and/or investors as well.
A Call to the Community
This is an open call to researchers, academicians, computer scientists, students, colleagues in the pharmaceutical industry, doctors, philanthropists, members of the public.
In the Longevity-Science community, which I've been active in - and at times working in - for a number of years, I hear a lot of "how can I help?" Well, here's a way!
Do you have skills in:
- Biology
- Python programming
- CUDA programming, or other ways to utilize GPU computing
- TensorFlow and/or other machine learning
- Plotly or D3.js visualization
- UX
- Chemical engineering
- Biochemistry
- Biophysics
- Systems biology
- Bioinformatics
- Medicine
- Web design
- Technical writing
- QA / DevOps
- Computing resources - such as your own gaming PC, or your company's computing (if you're authorized)
- Funds for research projects, philanthropy or investments
- Ways to spread the word (social media, etc)
Comments
Post a Comment