Postdocs in the theory of cell programming

14 JUN 2020

The London Institute is hiring Postdoctoral Research Fellows in the theory of cell programming, with backgrounds in physics, mathematics or machine learning. This follows a recent partnership with cell coding company Bit Bio on a moonshot mission to create every human cell type for use in biomedical research.


The postdoc positions are for one to three years, starting as soon as possible, with a gross salary of £42,000 per year.


The London Institute is assembling a team of theorists to decode the dynamics of cell identity. Postdocs will interact with Thomas Fink, Ton Coolen and a senior theorist who is being recruited simultaneously, as well as Bit Bio founder Mark Kotter. Postdocs will play a role in determining the theoretical lines of attack and in influencing experimental innovations at Bit Bio.

Candidates should have a PhD in physics or mathematics with experience in statistical physics, applied mathematics or theoretical machine learning. They will have a promising track record of research. Candidates should value collegiality and intellectual adventure and should write well.

The London Institute for Mathematical Sciences is a private academic institute for curiosity-driven research in physics, mathematics and the theoretical sciences. Funded by research agencies, foundations and firms, it gives scientists the freedom and support to make fundamental discoveries full-time.


Plausible approaches include, but are not limited to, the following:

Iterative data analysis and machine learning

In the collaboration between LIMS and Bit Bio, theory and experiment will be tightly coupled. The types and amount of data generated will be iteratively shaped by emerging theoretical insights. Neural networks provide a coarse tool for early insight into Bit Bio’s transcription factor perturbation experiments. Their success will depend on tailoring the learning algorithm to the details of the experiments. Faced with partial information about which genes communicate with which others, network inference and community detection can suggest candidate genes for more focused experiments. As we gain insight into the structure of cell programming, these heuristic approaches will set the stage for more mathematically tractable lines of attack.

High-dimensional and causal inference

Classical statistics was developed for settings in which the number of variables being estimated is fixed. But in genetic regulatory systems, the number of variables being estimated tends to grow with the amount of data. Inference in this high-dimensional regime can confuse noise with real patterns, a breakdown known as overfitting. Statistical physics can be used fruitfully here, by predicting the degree of uncertainty in the dangerous overfitting regime. Crucially, this provides a means to discriminate between reliable and spurious inferences. To reconstruct directionality, techniques from the emerging field of causal inference will also play a role.
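
The danger of the high-dimensional regime can be illustrated with a toy experiment. The sketch below is not the statistical-physics method referred to above; it is only a minimal demonstration, under assumed settings (20 samples, Gaussian noise, and the hypothetical function names `pearson` and `strongest_spurious_correlation`), of how pure noise starts to look like signal as the number of candidate variables grows while the number of samples stays fixed.

```python
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_samples, n_variables, seed=0):
    """Return a list whose k-th entry is the largest |correlation| found
    between a pure-noise target and the first k+1 pure-noise variables.

    Every variable is independent of the target by construction, so any
    correlation reported here is spurious.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    running_max = 0.0
    curve = []
    for _ in range(n_variables):
        variable = [rng.gauss(0, 1) for _ in range(n_samples)]
        running_max = max(running_max, abs(pearson(variable, target)))
        curve.append(running_max)
    return curve


if __name__ == "__main__":
    curve = strongest_spurious_correlation(n_samples=20, n_variables=1000)
    for k in (5, 50, 1000):
        print(f"best of {k:4d} noise variables: |r| = {curve[k - 1]:.2f}")
```

Because the maximum is taken over an ever larger set of variables, the strongest apparent correlation can only rise as more variables are screened, even though none of them carries any information about the target. Telling such artefacts apart from genuine regulatory signals is the problem the paragraph above describes.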

Cell regulatory networks and neural networks

Bit Bio has successfully rewired cells by switching on a handful of carefully selected transcription factors. If one writes down the equations governing the entire system of transcription factors, they turn out to be curiously similar to the equations describing interactions between neurons in the brain. In this analogy, stable neuron firing patterns are the equivalent of stable cell types. And we have in recent decades learned a lot about how to “reprogram” brain models by the clever modification of local variables. Adapting this mathematical insight to transcription, and combining it with Bit Bio’s experimental innovations, will offer a strategic head start in the quest to reprogram cells.
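
The analogy between stable firing patterns and stable cell types can be made concrete with a Hopfield-style associative memory. This is an illustrative assumption: the text above does not name a specific brain model, and the eight-unit "expression patterns", the names `CELL_TYPE_A` and `CELL_TYPE_B`, and the functions `hebbian_weights` and `recall` are all hypothetical. In this minimal sketch, each unit stands for a transcription factor that is either on (+1) or off (-1), and the stored patterns play the role of cell types.

```python
def hebbian_weights(patterns):
    """Build symmetric couplings from stored +1/-1 patterns (Hebb rule)."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-coupling
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def recall(weights, state, max_sweeps=10):
    """Update units one at a time until no unit changes (a stable state)."""
    s = list(state)
    n = len(s)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * s[j] for j in range(n))
            if field == 0:
                continue  # no net input: leave the unit as it is
            new_value = 1 if field > 0 else -1
            if new_value != s[i]:
                s[i] = new_value
                changed = True
        if not changed:
            break
    return s


# Two hypothetical "cell types", chosen to be orthogonal to each other.
CELL_TYPE_A = [1, 1, 1, 1, -1, -1, -1, -1]
CELL_TYPE_B = [1, -1, 1, -1, 1, -1, 1, -1]
WEIGHTS = hebbian_weights([CELL_TYPE_A, CELL_TYPE_B])

if __name__ == "__main__":
    perturbed = list(CELL_TYPE_A)
    perturbed[0] = -perturbed[0]  # switch one factor off
    print(recall(WEIGHTS, perturbed) == CELL_TYPE_A)
```

In this toy model a state that differs from a stored pattern in a single unit relaxes back to that pattern, which is the sense in which stored patterns behave like stable identities. Moving the system from one stored pattern to another would require flipping enough units to leave the first pattern's basin of attraction, a loose counterpart to switching on a handful of selected transcription factors.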

Cell subroutines and combinatorial innovation

There is mounting evidence that cells possess interoperable subroutines that can be combined to perform a variety of tasks. Combining modular subroutines in different ways is a powerful shortcut for realizing new functionality fast. For an analogy, think of a library of actual software modules. These are not self-contained pieces of code; rather, each module calls on other modules to perform its task. So new subroutines both call on, and can be called upon by, other subroutines. Dynamics in these expanding spaces are path-dependent and defy traditional notions of equilibrium. Tractable models of combinatorial innovation will help unravel the architecture of genetic regulatory networks in the cell and suggest mechanisms for disrupting pathological behaviours.