[DRAFT IN-PROGRESS]
This write-up was inspired by talks I recently gave at the "Bio-IT World 2025" conference in Boston and at the "Aging & Gerontology 2025" conference in San Francisco.
Overview
Life123 is an open-source platform for quantitative interactomics. And it has a slogan (more on this later): “AI and Big Data aren't enough! We also need Dynamical modeling.”
Let's start with an overview of the Life123 project. It's an open-source engine for quantitative reactomics. But what does that mean? It means that we perform dynamical modeling of biological systems. And the initial focus is on eventually creating whole-cell simulations.
The platform consists of Python libraries, which allow us to do in-silico experiments. Early this year, this project left the beta stage, after a few years of development.
Motivation
What's the motivation for doing all this? There's a practical side – and a conceptual side.
The practical side is a simple one to state: about 90 percent of clinical-drug development fails. That’s appalling! And all our current AI and Big Data seem unable to do much about it.
So, maybe we ought to be much more quantitative before we intervene?
Some motivations are more specific to Longevity Science and Healthspan: no one will disagree that fighting cancer, etc., will take great skill and sophistication... And I think it’s fair to say that intervening in Longevity, especially “setting the clock back”, will take GRAND MASTERY of fine quantitative control over our biology.
And then, moving on to the conceptual motivation, we encounter questions such as how life emerges from a network of reactions… diffusion, macromolecules and barriers, membranes, and so on.
The Tragedy of the Qualitative
Let's address the elephant in the room here. Let’s consider a very simple network of hypothetical reactions. Chemical A reacts to produce X, which is our product of interest… and in a separate reaction, A produces B. B, in turn, produces X and C through separate reactions.
And finally, in red here, C inhibits X through yet another reaction. This is all very schematic, of course.
Now, let's say, hypothetically, that reaction 5, the one shown in red, the inhibitory one, is pretty insignificant in mice, but major in humans. So, what's the net result? Let’s say that our goal is to enhance X, because the direct reaction from A doesn’t produce enough of it.
Our experiments in mice would show us that enhancing reaction 2, which produces B, which in turn produces X, would be a good thing because the side branch that leads to C, and then inhibition of X, is not significant in mice.
So, our experiment in mice concludes that's one way to increase the production of X - our goal. But then we move on to humans… and, there, the inhibition path is significant! So, lo and behold, there's a side effect: X does not get increased after all; maybe it even gets decreased. Millions of dollars of development and clinical trials, down the toilet!
This is just a contrived example, but even in a minimalist example like this one, it’s pretty easy to come up with some scenario that could go in very different ways - and we don't know which way it goes because we're being qualitative.
So, that's what I'm calling the tragedy of the qualitative. We have no clue about the actual evolution of a dynamical system!
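To make this less abstract, here's a minimal Python sketch of the five-reaction network above, simulated with fixed-step forward Euler. It's only an illustration, under assumptions that are mine and not part of the original example: A is held at a constant concentration, every reaction follows simple mass-action kinetics, reaction 5 is modeled as C-driven removal of X, and all the rate constants are made-up numbers. It doesn't use the Life123 libraries.

```python
def simulate_x(k2, k5, k1=0.2, k3=1.0, k4=1.0, a=1.0, t_end=10.0, dt=0.001):
    """Return the concentration of X at time t_end, using fixed-step forward Euler.

    Assumptions (illustrative only): A is held constant at `a`, all rates are
    mass-action, and reaction 5 is modeled as removal of X at rate k5 * C * X.
    """
    b = c = x = 0.0
    for _ in range(int(round(t_end / dt))):
        db = k2 * a - (k3 + k4) * b          # reaction 2 makes B; reactions 3 and 4 consume it
        dc = k4 * b                          # reaction 4: B -> C
        dx = k1 * a + k3 * b - k5 * c * x    # reactions 1 and 3 make X; reaction 5 removes it
        b, c, x = b + db * dt, c + dc * dt, x + dx * dt
    return x


K2_BASELINE, K2_ENHANCED = 0.2, 1.0          # "enhancing reaction 2" (made-up values)
K5_MOUSE, K5_HUMAN = 0.001, 1.0              # inhibition: negligible vs. major (made-up values)

results = {
    species: (simulate_x(K2_BASELINE, k5), simulate_x(K2_ENHANCED, k5))
    for species, k5 in (("mouse", K5_MOUSE), ("human", K5_HUMAN))
}
for species, (baseline, enhanced) in results.items():
    print(f"{species}: X at baseline = {baseline:.3f}, X with enhanced reaction 2 = {enhanced:.3f}")
```

With these made-up numbers, enhancing reaction 2 raises X when the inhibition is negligible, and lowers it when the inhibition is strong. It's the same intervention with opposite outcomes, and only the numbers tell them apart.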
Our Reactome
This screenshot is from reactome.org, an organization that builds a database of all the known interactions among molecules in humans. Which is amazing! To me, it feels like parting the curtains of life, and being afforded a peek at the inner workings of nature! But we still cannot predict what happens, or how to change the physiology in the way that we want it to change… and now, imagine if we had all this AND we had numbers AND a sophisticated dynamical simulator! Imagine what kind of power that could generate, for us to understand and manipulate biology.
So, here we get to the fundamental point: that, in the path to the advancement of medicine, there's a lot of work being done on Big Data and AI - which is awesome – BUT, at the same time, dynamical modeling is getting short shrift. That's the premise of this project.
What Life123 Is
Now, going back to the nuts and bolts of what Life123 is, in some detail. We said it's quantitative modeling for biological systems. In particular, the focus for now is on whole cells, initially prokaryotes, the simpler ones.
And it's a software platform on which to perform in-silico experiments. Some of its features involve the ability to simulate networks of reactions, diffusion, membranes and macromolecules - among other things.
Life123 is open source. And it is integrative... Extremely integrative. That's a key feature: many disciplines coming together. And it aims big.
The idea is to be in a ready state for advances in computing power… as well as for advances in knowledge of chemical reaction parameters, such as kinetic parameters.
Another way to state it is that we want to do quantitative reactomics with spatial awareness, including compartments, transport and diffusion.
And what about the name? 123? The name has to do with simulations being available in one-dimensional scenarios, as well as 2D and 3D. And also in what you could call zero dimensions, which would be a uniform compartment, where everything is always perfectly stirred up.
The fundamental idea is that work in lower dimensions paves the way, in terms of being a stepping stone to get started more simply and quickly, and then proceeding onto higher dimensions.
What Life123 Is NOT
Equally important is to talk about what Life123 is NOT. And one thing it's definitely not – and it's often an assumption many fall for – it is NOT Molecular Dynamics. It does not aim to compute what the kinetics of a particular reaction are. It ASSUMES that the reaction rates, binding constants, diffusion rates, and so on, are either known or estimated… from measurements, or from AI models, or from Molecular Dynamics.
And something else it's not: it's not just reaction-diffusion as an end in itself!
Life123 is not a tool primarily for educational purposes - though that's a great side goal. It's not a computer game. It’s not an art project, though it can generate beautiful images.
And it's not a mathematical exercise in dynamical systems.
It's not cellular automata, in the style of Wolfram. I don't have time to elaborate on that.
And it is not an exploration of hypothetical physics in one or two dimensions. Here, when we talk about one or two dimensions in Life123, we’re really talking about idealized scenarios: 1D is like a long pipet, while 2D is like a Petri dish.
Fundamental Goals
Let's talk about fundamental goals now. We aim to do detailed, quantitative simulations of biological systems.
And we would like to get deeper quantitative insight into biological cells and their interactions with the environment, including drugs administered.
And, ultimately, a deeper quantitative insight into human tissue, organ, and system physiology.
It's a very integrative approach that could ultimately be conducive to insight, at a body-wide level, for health, aging, and lifespan.
Those are ultimate foundational goals!
Other goals include the exploration of what is the minimalist essence of a “lifelike dynamical system.” In other words, what's the minimum you need for a dynamical system to have some kind of “life-like quality.”
Along those lines, there could be potential interest in investigating paths for the emergence of Life on Earth or on exoplanets.
Finally, a fundamental goal of Life123 is to be a community effort that brings together people from many disciplines.
Open Source
So why open source, you may wonder? Well, if I had to use just 3 words, I would say: long attention span… Just as a few examples, if you look at some of the familiar names in open-source projects like Python, NumPy and so on… their start dates go quite far back.
Open-source projects live on – while projects at companies come and go, as companies change their focus, or departments lose their funding, or companies go out of business. An example close to me is from another open-source project I lead, BrainAnnex.org. It started in 2015 - and it has long outlived work projects at all the companies where I’ve worked during this time span!
Open-source is also something that's widely available to all researchers to use, or contribute to, without being shackled to any one company.
I often see, in the promotional literature of companies, the term proprietary platform, with the implication that it’s a wonderful thing they have. But to me, when I see those words, it has the opposite implication. To me, a proprietary platform is something that's at the mercy of that particular company. If that company shifts its focus or fails, the project will likely be neglected or abandoned. And, even if it’s not neglected, it’s still something that won’t get the creative input of the rest of the world outside that company! So, “proprietary platform” is a big negative to me!
Is Life123 too ambitious? Well, it intentionally is very ambitious. That's by design. Because the goal is to be in a ready state for future advances in computing power and future interactomics data, such as binding affinities, kinetic reaction parameters and so on. So what happens when all those arrive? Are we ready to immediately use them, as they keep arriving? That’s our goal!
Psychological and Social Elements
As I see it, there are big psychological barriers - more than any concrete barriers.
For starters, the project is very ambitious: some people may be turned off by the long road ahead.
Also, it's something of a hard-to-find mix of interests and skill sets. What Life123 tries to do may be too mathematical for biologists… and too much biochemistry and biology for people with a computer science or math background.
I continue to see, first-hand, the power of psychological barriers in another field, graph databases, which is one of my areas of expertise. There's a fair bit of demand for expertise in that area, but a surprisingly small supply of experts - which is great news for me in terms of easily finding jobs in that area! But why is that? There's no good reason, as far as I can tell. Because graph databases are actually pretty easy, compared to regular, relational, databases. It appears to be a psychological barrier - I can see no other compelling reason! Graph databases seem to be the most natural thing in the world for people with a background in both CS and Math – like me and a former boss at a big pharma – but somewhat intimidating to many computer scientists or data engineers.
I’m speculating here! But the point here is that the power of psychology should never be under-estimated!
Another barrier is that of being distracted by all the “sexy” AI and all the “hot” Big Data fields. Those are taking everyone's attention!
Above is a great quote, a personal communication from a researcher. Notice the part about Systems Biology being more appealing to physics majors. (And - guess what - it so happens that I've always been passionate about physics, especially its foundations, and seriously considered it as a career!!)
When you combine the social & psychological elements, a quote comes to mind: the poem about two roads that diverged in a wood… and taking the one less traveled by.
So, this is where dynamical modeling stands, as I see it!
Whole-Cell Modeling
Talking about whole-cell modeling, which is the main initial goal of Life123 – well, it's an ambitious goal, but it's not completely new by any means.
I could easily find examples in the literature from back in 2012; for example, a great article in the journal Cell about whole-cell simulations of Mycoplasma genitalium, one of the simplest bacteria.
And there's a 2015 article in Trends in Cell Biology about why build whole-cell models… with a goal to “quantify variation in how individual cells express a set of genes in response to an environmental signal.” Well, that’s exactly what we want to be able to better quantify.
Virtual cells are a goal of many. And that includes Life123.
With a virtual cell, we would like insight into questions such as how it will respond to inputs… Or combinations of those… or inputs never seen before… or inputs while in a particular state… Or what kind of inputs to provide to attain a desired result, etc.
Also, how does the cell’s response vary when the cell undergoes epigenomic changes, which is very relevant to aging and healthspan.
As well, how to emulate a real world phenotype. Which, incidentally, could also be used as a way to estimate the missing or poorly-known parameters of the dynamical model.
That brings us to exactly… what about the parameters and the initial conditions? Every time there is a dynamical system, those are something that's needed.
What do we do when they're unknown or poorly known?
And that could be parameters like reaction rates, binding affinities, and so on… or it could be the initial conditions - like initial concentrations and so on.
Well, one approach is to use machine learning, or genetic algorithms, to estimate those parameters. With this approach, the counterpart of a loss function, in other words something we're trying to optimize, could be the simulated cell’s resemblance to real known phenotypes.
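As a toy illustration of that idea, here's a minimal Python sketch that recovers an unknown rate constant by minimizing a loss function. Everything in it is an assumption made for the example: the "phenotype" is just the time course of B in a single irreversible reaction A -> B, the "measurements" are synthetic (generated from a known rate constant), and the optimizer is a bare-bones mutation-and-selection search rather than a full genetic algorithm (there's no crossover). It doesn't use the Life123 libraries.

```python
import math
import random


def simulate_b(k, times, a0=10.0):
    """Concentration of B over time for the irreversible reaction A -> B,
    with first-order rate constant k (closed-form solution)."""
    return [a0 * (1.0 - math.exp(-k * t)) for t in times]


def loss(k, times, observed):
    """Sum of squared differences between simulated and observed B."""
    return sum((s - o) ** 2 for s, o in zip(simulate_b(k, times), observed))


def estimate_k(times, observed, generations=60, pop_size=20, seed=0):
    """Mutation-and-selection search for the k that minimizes the loss."""
    rng = random.Random(seed)
    population = [rng.uniform(0.01, 5.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the best quarter, then refill with mutated copies of the survivors
        population.sort(key=lambda k: loss(k, times, observed))
        survivors = population[: pop_size // 4]
        children = [abs(rng.choice(survivors) * rng.gauss(1.0, 0.1)) + 1e-9
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return min(population, key=lambda k: loss(k, times, observed))


# Synthetic "measurements": generated here from a known k, purely for illustration
K_TRUE = 0.7
times = [0.5 * i for i in range(1, 11)]
observed = simulate_b(K_TRUE, times)

k_estimate = estimate_k(times, observed)
print(f"true k = {K_TRUE}, estimated k = {k_estimate:.3f}")
```

In a real setting, the simulated quantity would come from a full dynamical model, and the loss would measure resemblance to a measured phenotype; the structure of the search would stay the same.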
Something to keep in mind is that we’ve been talking about individual virtual cells so far - but in the grand scheme of things, exciting as individual cells are, our interest, as far as medicine goes, is really more about tissues.
So, one question that could greatly help is, do we really need to have impeccable simulations of each and every cell?
I suspect the answer is probably no, if what we really care about is their behavior in the aggregate. Unless there's a systematic error in each virtual-cell simulation, I think it’s conceivable that some inaccuracy in individual cells could become less important when we look at the aggregate behavior. That’s a situation often seen in Statistics: there’s much less variation in averages than in each component. To be clear: I don’t have a proof of this; it’s a guiding intuition.
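Here's a small numerical illustration of that statistical intuition, as a Python sketch. The setup is entirely hypothetical: each simulated cell's output is assumed to be off by independent Gaussian noise, optionally plus a systematic bias shared by all cells. It says nothing about whether real whole-cell simulation errors behave this way.

```python
import random
import statistics


def aggregate_error(n_cells, noise_sd, bias, rng):
    """Error of the tissue-level average when each simulated cell's output
    is off by a shared systematic bias plus independent random noise."""
    errors = [bias + rng.gauss(0.0, noise_sd) for _ in range(n_cells)]
    return statistics.fmean(errors)


rng = random.Random(42)
NOISE_SD, N_CELLS, N_TRIALS = 1.0, 100, 200   # made-up values

# Independent errors only: the error of the average shrinks
# roughly as noise_sd / sqrt(n_cells)
unbiased = [aggregate_error(N_CELLS, NOISE_SD, 0.0, rng) for _ in range(N_TRIALS)]
spread_of_average = statistics.stdev(unbiased)

# With a systematic error in every cell: averaging does not remove it
biased = [aggregate_error(N_CELLS, NOISE_SD, 0.5, rng) for _ in range(N_TRIALS)]
mean_biased_error = statistics.fmean(biased)

print(f"per-cell error spread:      {NOISE_SD:.3f}")
print(f"spread of 100-cell average: {spread_of_average:.3f}")
print(f"average error with bias:    {mean_biased_error:.3f}")
```

The independent part of the error largely washes out in the average, while the shared bias carries through untouched, which is why the caveat about systematic errors matters.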
Life as an Emergent Property of Dynamical Systems
Let's talk a bit more abstractly about dynamical systems, and where life starts.
If we have a system with just reactions… It's kind of boring. I mean, it goes to equilibrium after a while – end of story. Likewise, diffusion goes to equilibrium.
But with diffusion together with reactions, some dynamics can be quite interesting: you can find videos on YouTube about fascinating two-dimensional bubbles that change over time, in carefully crafted homemade experiments.
And if we also introduce compartments, we start getting some very interesting dynamical systems.
But, at the same time, we could say that the weather is indeed a very interesting dynamical system. It definitely is - but we probably wouldn't call it “alive.” Soooo, what's missing? Well, when we take all the above elements, and also we bring in something that works in a digital manner, or partially digital manner, analogous to digital computers – namely macromolecules - then we start moving towards a system capable of computations for complex simulations.
And if that system is able to compute a model of some aspects of the real world around it - so that it can react to it, for example as a way to preserve its own existence... Then we start getting something we could start thinking about as being life.
I don't have time to elaborate further but here's an article where I elaborate.
In brief – in the form of a slogan – “Life as an emergent property of specialized dynamical systems”
Life123 Architecture and Technology
Let's get more down to earth! What’s the architecture of Life123? A very coarse description is in the form of spatial modules, separately for 1, 2, and 3D – on top of a module for networks of reactions within a small, uniform compartment.
The spatial modules take care of diffusion, membranes, transport across them, and the coordination with the common module below.
What kind of Technology are we using for Life123? These are some of the technologies currently being used, and some in anticipated usage.
As well as technologies that aren’t part of Life123 but have been used in its development: ChatGPT, for example, turned out to be very adept at making good suggestions about visualizations and biophysical modeling… And Mathematica has been handy to provide some validation.
For visualization, the project is currently using Plotly, which has turned out to be an excellent tool.
Possibly supplemented by D3.js and Vue.js. And for network visualizations, Cytoscape.js.
As far as math algorithms, “Forward Euler” is what's currently being used for the systems of ordinary and partial differential equations. This method is like the “duct tape” of solving differential equations: not the most sophisticated, but very versatile. Currently, multiple norms are being used; norms are values computed in particular manners – and they’re used to gauge the onset of instability in the solution of the differential equations.
And various heuristics are used about how to best choose the variable time steps in the simulations.
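To give a flavor of how forward Euler, a norm, and a variable time step can fit together, here's a minimal Python sketch for a single reversible reaction A <-> B. The norm used (the largest absolute concentration change in a step) and the two thresholds, `abort_above` and `grow_below`, are hypothetical choices made for this illustration; they are not the norms or heuristics actually used inside Life123.

```python
def react_variable_step(conc, k_fwd, k_rev, duration,
                        initial_step=0.1, abort_above=0.5, grow_below=0.1):
    """Forward Euler for the reversible reaction A <-> B, with a variable time step.

    `abort_above` and `grow_below` are hypothetical thresholds on a simple norm:
    the largest absolute concentration change in one step.
    Returns a list of (time, A, B) tuples.
    """
    a, b = conc
    t, dt = 0.0, initial_step
    history = [(t, a, b)]
    while t < duration - 1e-12:
        dt = min(dt, duration - t)            # don't overshoot the end time
        rate = k_fwd * a - k_rev * b          # net forward rate (mass action)
        delta_a, delta_b = -rate * dt, rate * dt
        norm = max(abs(delta_a), abs(delta_b))
        if norm > abort_above:
            dt /= 2.0                         # step too aggressive: retry with a smaller one
            continue
        a, b, t = a + delta_a, b + delta_b, t + dt
        history.append((t, a, b))
        if norm < grow_below:
            dt *= 2.0                         # system is changing slowly: speed up
    return history


history = react_variable_step(conc=(50.0, 10.0), k_fwd=3.0, k_rev=2.0, duration=3.0)
t_end, a_end, b_end = history[-1]
print(f"{len(history) - 1} steps; at t = {t_end:.2f}: A = {a_end:.3f}, B = {b_end:.3f}")
```

The simulation takes small steps while concentrations are changing fast, and progressively larger ones as the system settles, so the computational effort goes where the action is.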
The system is very modular: there is an element of drop-in replacement for math modules. As well as drop-in replacement for biochemistry or biophysics models.
For example, one can start with coarser, more approximate models, just to get going – and then gradually replace them with finer models, as needed.
History
Quantitative modeling of prokaryote, and then eukaryote, cells has been on my mind since graduate school at UC Berkeley - but back then I was focusing on Molecular Dynamics, though I eventually stopped pursuing that.
Inspired by my later exposure to the Reactome project, as well as newer technologies and some papers I came across on whole-cell modeling, I re-visited the idea with a "manifest / call to action" in 2019. Then Covid arrived, I lost my job - and had some time on hand, as well as a desire to put those dark Covid times to good use: that led to an early prototype in 2020, and eventually a first public beta release in early 2022 (announcement).
After that, a long series of beta releases, the last of which was at the end of 2024.
It’s now a series of release candidates. The terminology reflects the fact that the platform has been becoming a lot more stable - and adding more features.
And then, in April 2025, this project had its public debut with a talk I was invited to give at the "Bio-IT World 2025" conference in Boston, followed by another talk in Sept. at the "Aging & Gerontology 2025" conference in San Francisco.
Capabilities and Challenges
If you were to ask, can I see a “hello world” minimalist example… Well, on the website, life123.science, under experiments, there is a filter you can use - and it shows you different Jupyter notebooks for different available experiments; in particular, the “quick-start” ones. Here's a very minimalist one where we import life123. We create a uniform compartment. We add a reaction. We set initial concentrations.
And then we react over a particular time span.
And afterwards we can look at the history, and plot it.
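For readers without the notebook at hand, the sequence of steps just described can be mimicked with a tiny self-contained stand-in. To be very clear: `ToyCompartment` and its methods are hypothetical names made up for this sketch, NOT the Life123 API, and the numbers are arbitrary; the real calls are in the “quick-start” notebooks on the website.

```python
class ToyCompartment:
    """Hypothetical stand-in for a well-mixed compartment; NOT the Life123 API."""

    def __init__(self):
        self.reactions = []     # list of (reactant, product, k_fwd, k_rev)
        self.conc = {}          # current concentrations, by chemical name
        self.time = 0.0
        self.history = []       # snapshots of time and concentrations

    def add_reaction(self, reactant, product, k_fwd, k_rev):
        self.reactions.append((reactant, product, k_fwd, k_rev))

    def set_conc(self, conc):
        self.conc = dict(conc)
        self.history = [{"time": self.time, **self.conc}]

    def react(self, duration, step):
        """Advance the system with fixed-step forward Euler."""
        for _ in range(int(round(duration / step))):
            delta = {name: 0.0 for name in self.conc}
            for reactant, product, k_fwd, k_rev in self.reactions:
                rate = k_fwd * self.conc[reactant] - k_rev * self.conc[product]
                delta[reactant] -= rate * step
                delta[product] += rate * step
            for name in self.conc:
                self.conc[name] += delta[name]
            self.time += step
            self.history.append({"time": self.time, **self.conc})


compartment = ToyCompartment()                              # create a uniform compartment
compartment.add_reaction("A", "B", k_fwd=3.0, k_rev=2.0)    # add a reaction, A <-> B
compartment.set_conc({"A": 50.0, "B": 10.0})                # set initial concentrations
compartment.react(duration=2.0, step=0.01)                  # react over a time span
for row in compartment.history[::50]:                       # look at the history
    print(row)
```

The history is a plain list of dictionaries, so it could be handed to a plotting library to draw the concentration curves.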
So, what's currently available, as features of the platform? Well, reaction kinetics of a network of reactions.
Diffusion in 1 and 2D, including reaction-diffusion.
Support of enzymatic reactions.
Compartments in 1D, including passive transport across membranes.
Compartments in 2D are imminent, and also there's an early implementation of macromolecules.
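As a rough idea of what diffusion in 1D involves numerically, here's a minimal Python sketch of explicit (forward Euler) finite-difference diffusion along a row of bins. The function name, the no-flux boundary treatment and all the numbers are assumptions made for this illustration; this is not the Life123 implementation.

```python
def diffuse_1d(conc, diff_coeff, dx, dt, n_steps):
    """Explicit (forward Euler) finite-difference diffusion on a row of bins,
    with no-flux boundaries at both ends. Returns a new list of concentrations."""
    factor = diff_coeff * dt / dx ** 2
    if factor > 0.5:
        raise ValueError("unstable: need diff_coeff * dt / dx**2 <= 0.5")
    c = list(conc)
    for _ in range(n_steps):
        # Mirror the end bins, so that nothing flows through the walls
        padded = [c[0]] + c + [c[-1]]
        c = [padded[i] + factor * (padded[i - 1] - 2 * padded[i] + padded[i + 1])
             for i in range(1, len(padded) - 1)]
    return c


initial = [0.0] * 10
initial[0] = 100.0                       # all the solute starts in the leftmost bin
final = diffuse_1d(initial, diff_coeff=1.0, dx=1.0, dt=0.2, n_steps=2000)
print([round(v, 2) for v in final])
```

The check on `diff_coeff * dt / dx**2` is the classic stability limit for this explicit scheme: take too large a time step relative to the bin size, and the solution oscillates and blows up instead of smoothing out.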
The features I talked about are really just “the tip of the iceberg”! So, what’s the bottom of the iceberg? Well, it's a battery of tools.
Because there's a lot that needs to be dealt with, including numerical instability... throttling up and down the computational effort… visualization… Analysis of results… Diagnostics… Using high-performance computing...
Those are things that we like to have in the platform. So that individual researchers, who are exploring some aspect of systems biology, don't have to deal with all of this.
What about some challenges?
Well, computing resources, for one.
Modeling particular elements of biochemistry and biophysics, and so on.
Working on different scales of space and time.
The estimation of the parameters and initial conditions, as discussed before.
Integration and coordination of efforts.
Staying funded and active while remaining open source.
Those are some of the challenges.
Let me wrap up with future directions. They include, at least in the shorter term, things like scaling up, parallelizing, and a full implementation of the features I mentioned before, like compartments and macromolecules.
Better dealing with enzymatic reactions, given limited knowledge of their parameters.
Better Math algorithms and heuristics.
Integration and comparison with other open-source software.
And also exploring using the Life123 platform to create a version of existing books & papers on systems biology, such as the one shown on this page… whose author, incidentally, has a background in... surprise surprise: physics!
And also moving on to 3D simulations.
And that element of machine learning to estimate missing parameters, as discussed earlier.
As well as community involvement, from a wide variety of disciplines. On the website, there's a “participate” page that includes boxes like the one shown on this slide, about some current needs, and skillsets to help in those areas. And anyone with that particular skillset is welcome to join in the effort.
We would also like to be able to have Community uploads of in-silico experiments, along the lines of Kaggle for data science.
Contributors
I'm the founder and project lead. I tend to alternate between working in Big Pharma and in nimble smaller companies in Personalized (P4) Medicine and Healthspan/Longevity. My background includes math, computer science, and molecular biology.
I also lead an older, more mature, open-source project, BrainAnnex.org, parts of which are in current use at some companies.
The first scientific advisor to this project is Kevin Perrott, who is a professor of Biochemistry, and the owner of OpenCures, and also works at the Buck Institute, on healthspan.
Project's Website
On our website, life123.science, there's a 3-minute intro video. As well as interactive in-silico experiments, in the form of JupyterLab notebooks.
And there's a one-click hosted version.
As well as guides, microblogging, and sign-up for community participation.