Scalable model building

2019-06-17T05:45:04+00:00

Despite the importance of predictive models, we do not yet have comprehensive and accurate models that can guide medicine and bioengineering. Model building is one of the major bottlenecks: it is still a labor-intensive, ad hoc process. This inhibits researchers from building comprehensive models, and it produces models that are hard to understand, reuse, and extend.

We are developing the first tools for systematically, scalably, and reproducibly building models of intracellular pathways. These include tools for scalably aggregating data for modeling, organizing this data for model design, and systematically designing models from this data. These tools will make model building reproducible by tracking the data sources and assumptions used to build models.

We are using cell models as a test bed for developing broadly applicable methods for scalable model construction, which allows us to test our ideas concretely. However, our tools will be modular and extensible to enable future support for additional domains.

To ensure our tools advance biomodeling, we are developing them in conjunction with several driving projects that aim to build whole-cell models of bacteria and human cells.

We anticipate that our tools will help researchers build more predictive models, and that these models will help advance science, medicine, and bioengineering.


Kinetic Datanator

Kinetic Datanator is a tool for discovering the data needed to build, calibrate, and validate whole-cell models. Kinetic Datanator is composed of an integrated database of experimental data for whole-cell modeling and tools for identifying [...]

Random WC model generator

The random whole-cell model generator is a tool for generating WC models that represent user-specified numbers of genes, RNA, proteins, and reactions of an archetypal bacterium. The models generated by the model generator represent the [...]
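To make the idea of generating a model from user-specified component counts concrete, here is a minimal Python sketch. It is not the generator's actual code or API: the `RandomModel` class, the `generate_random_model` function, the one-RNA-and-one-protein-per-gene rule, and the `k_cat` range are all assumptions chosen only for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class RandomModel:
    """Hypothetical container; not the generator's real data structure."""
    genes: list = field(default_factory=list)
    rnas: list = field(default_factory=list)
    proteins: list = field(default_factory=list)
    reactions: list = field(default_factory=list)


def generate_random_model(num_genes, num_reactions, seed=None):
    """Build a toy model with the requested numbers of components.

    Assumption: each gene yields exactly one RNA and one protein.
    """
    if num_reactions > 0 and num_genes < 1:
        raise ValueError("reactions need at least one protein to catalyze them")
    rng = random.Random(seed)  # seeded so the same model can be regenerated
    model = RandomModel()
    for i in range(num_genes):
        model.genes.append(f"gene_{i}")
        model.rnas.append(f"rna_{i}")
        model.proteins.append(f"prot_{i}")
    for j in range(num_reactions):
        # Each toy reaction is catalyzed by one randomly chosen protein
        # and gets a random rate constant in arbitrary units.
        model.reactions.append({
            "id": f"rxn_{j}",
            "enzyme": rng.choice(model.proteins),
            "k_cat": rng.uniform(0.1, 10.0),
        })
    return model


model = generate_random_model(num_genes=5, num_reactions=8, seed=0)
print(len(model.genes), len(model.rnas), len(model.proteins), len(model.reactions))
```

Seeding the random number generator is a deliberate choice in this sketch: a model built from the same counts and seed can be regenerated exactly, which keeps test models reproducible.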


WC-KB

WC-KB is a data model for describing the experimental data needed to build, calibrate, and validate a whole-cell model. WC-KB includes a command line program and a Python API.


WC-KB-Gen

WC-KB-Gen is a Python framework for programmatically constructing knowledge bases for whole-cell models. WC-KB-Gen helps modelers retrieve data from external sources, organize this data, and record the provenance of this data. In turn, WC-KB-Gen helps [...]
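Recording provenance can be pictured as storing each value together with its source and any assumption made when adopting it. The Python sketch below illustrates that idea only. It does not use or reproduce the WC-KB or WC-KB-Gen APIs: the `Observation` and `KnowledgeBase` classes, their fields, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One measured value plus where it came from (hypothetical schema)."""
    entity: str            # what was measured, e.g. a reaction or protein ID
    attribute: str         # which property, e.g. "k_cat"
    value: float
    units: str
    source: str            # database or publication the value was taken from
    assumption: str = ""   # any assumption made when adopting the value


class KnowledgeBase:
    """Toy store that keeps every observation with its provenance."""

    def __init__(self):
        self._observations = []

    def add(self, observation):
        self._observations.append(observation)

    def provenance(self, entity, attribute):
        """Return every (source, assumption) pair behind one attribute."""
        return [(o.source, o.assumption) for o in self._observations
                if o.entity == entity and o.attribute == attribute]


kb = KnowledgeBase()
kb.add(Observation("rxn_1", "k_cat", 12.5, "1/s",
                   source="example-database (placeholder)",
                   assumption="value measured in a related species"))
print(kb.provenance("rxn_1", "k_cat"))
```

Keeping the source and assumption on the observation itself, rather than in separate notes, means any value used in a model can be traced back to its origin with a single lookup.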


WC-Lang

WC-Lang is a data model and a file format for describing composite, multi-algorithmic whole-cell models. WC-Lang includes a command line interface and a Python API.


WC-Model-Gen

WC-Model-Gen is a Python framework for programmatically constructing whole-cell models from large datasets described with WC-KB. WC-Model-Gen helps modelers retrieve data from a WC-KB and use the data to scalably design species and reactions. In [...]


WC-Rules

WC-Rules is a formalism for describing composite, mixed-grained, multi-algorithmic WC models. WC-Rules provides modelers a high-level, biologically intuitive language for describing models in terms of patterns of metabolite, DNA, RNA, and protein species and rules for [...]


Publications

  • A whole-cell computational model predicts phenotype from genotype
    Karr JR, Sanghvi JC, Macklin DN, Gutschow MV, Jacobs JM, Bolival B, Assad-Garcia N, Glass JI and Covert MW
    Cell 150, 2: 389-401 (2012)
  • Guidelines for reproducibly building and simulating systems biology models
    Medley JK, Goldberg AP and Karr JR
    IEEE Trans Biomed Eng (2016)
  • Summary of the DREAM8 Parameter Estimation Challenge: Toward Parameter Identification for Whole-Cell Models
    Karr JR, Williams AH, Zucker JD, Raue A, Steiert B, Timmer J, Kreutz C, DREAM8 Parameter Estimation Challenge Consortium, Wilkinson S, Allgood BA and others
    PLoS Comput Biol 11, 5: e1004096 (2015)
  • The principles of whole-cell modeling
    Karr JR, Takahashi K and Funahashi A
    Curr Opin Microbiol 27: 18-24 (2015)
  • Toward scalable whole-cell modeling of human cells
    Goldberg AP, Chew YH and Karr JR
    Principles of Advanced Discrete Simulation (2016)
  • WholeCellKB: model organism databases for comprehensive whole-cell models
    Karr JR, Sanghvi JC, Macklin DN, Arora A and Covert MW
    Nucleic Acids Res 41, Database issue: D787-792 (2012)

Collaborative projects

Service projects

Alzheimer modeling

Jean-Marie Bouteiller, Assistant Professor, Department of Biomedical Engineering, University of Southern California, Los Angeles, CA, USA


Arthur Goldberg, Co-Investigator
Associate Professor, Icahn School of Medicine at Mount Sinai
Jonathan Karr, Project Director
Fellow, Icahn School of Medicine at Mount Sinai
Yosef Roth
Research Assistant, Icahn School of Medicine at Mount Sinai
Herbert Sauro, Co-Investigator
Associate Professor, University of Washington
Balazs Szigeti
Postdoctoral Scholar, Icahn School of Medicine at Mount Sinai