Our goal is to help laboratory scientists effectively use data, statistics, machine learning and computation in their day-to-day research to accelerate scientific discovery. We don’t simply offer off-the-shelf solutions. Instead, we’ll work with you to deliver a customized, data-driven program that fully incorporates your domain knowledge, assumptions and existing data. We believe that all scientific endeavor can be expedited through the smart use of data science and machine learning, and we’re excited to collaborate with you to demonstrate how.
Our solutions aren’t one-size-fits-all, black-box algorithms. They’re active collaborations designed to keep the scientist and their domain knowledge in the loop.
If you do laboratory science, chances are we can help. Read on to learn about our core expertise and potential ways we can collaborate, or contact us to get the ball rolling.
What we can do
In the past, we have added value to our collaborators’ research in several ways. Here’s a brief list of what we can do:
- Prior knowledge formation – We work with you to identify and encode prior knowledge about your problem. This knowledge could be obtained through literature review, preliminary data or your general domain knowledge.
- Knowledge management – Encoding your knowledge is important in any data-driven solution. We can help codify your knowledge and update it rigorously as you receive more data.
- Identify and quantify uncertainty – Understanding the several types of uncertainty that may arise in your problem helps you account for deviations, and is a necessary step in our process. We work with you to identify what types of uncertainty you’re dealing with, and then quantify and model this uncertainty in order to make robust experimental decisions.
- Robust experimental design – Using your knowledge and uncertainty about your problem, we can help develop an experimental design strategy that incorporates the newest research in decision making under uncertainty. Armed with this strategy, we can help you navigate the combinatorially large space of potential experiments to run, and even help you minimize costs and time.
- Research risk analysis – Using simulations and statistical models, we can estimate the risks and costs of pursuing a particular line of research, subject to budget and equipment constraints.
- Physical modeling and implementation of models – We have helped co-develop physical models, and provided numerical implementations of these models.
- Run customized computations – We have used our computational resources to run numerical simulations for and with collaborators.
- Statistical analysis of your data – Starting from a data set, we can extract meaning and insight through the use of statistics and data-mining techniques.
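To make the research risk analysis item above more concrete, here is a minimal Monte Carlo sketch. It assumes a deliberately simple toy model that is not taken from any real project: each experiment succeeds independently with a fixed probability, every experiment costs the same, and the campaign stops at the first success or when the budget runs out. The function names and all numbers are hypothetical.

```python
import random


def simulate_campaign_cost(success_prob, cost_per_experiment, max_experiments, rng):
    """Run experiments until the first success or until max_experiments is reached.

    Returns (total_cost, succeeded). Toy assumption: independent, identical trials.
    """
    for n in range(1, max_experiments + 1):
        if rng.random() < success_prob:
            return n * cost_per_experiment, True
    return max_experiments * cost_per_experiment, False


def estimate_risk(success_prob, cost_per_experiment, budget, n_trials=10000, seed=0):
    """Estimate the chance of success within budget and the mean amount spent."""
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    max_experiments = int(budget // cost_per_experiment)
    total_cost = 0.0
    successes = 0
    for _ in range(n_trials):
        cost, succeeded = simulate_campaign_cost(
            success_prob, cost_per_experiment, max_experiments, rng
        )
        total_cost += cost
        successes += succeeded
    return {
        "p_success_within_budget": successes / n_trials,
        "mean_cost": total_cost / n_trials,
    }


if __name__ == "__main__":
    # Hypothetical inputs: 20% success chance per experiment, 1000 per run, budget 5000.
    print(estimate_risk(success_prob=0.2, cost_per_experiment=1000.0, budget=5000.0))
```

For this toy model the success probability also has a closed form, 1 - (1 - p)^N for N affordable experiments, which gives a quick sanity check on the simulation. A real analysis would replace the fixed success probability with a statistical model informed by your prior knowledge and data.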
Avenues for collaboration
We’re open to helping you in any way we can. Below, we’ve listed some potential avenues for collaboration – broader projects incorporating several of the points above – but this list is far from exhaustive. When you reach out to us with your problem, we’ll help craft a customized solution that meets your exact needs.
The space of possible experiments is combinatorially large, which means brute-force iteration over all potential experiments is impossible. When working with new materials or experimental procedures, finding the optimal settings for experimental control variables such as temperature, material flux, and concentrations is hampered by uncertainty about the underlying physics of the system in question.
Optimal learning is a stochastic optimization technique that guides you through experiment space in order to quickly achieve your experimental objectives. By modeling your domain knowledge and assumptions in a rigorous way using Bayesian statistics, we can manage knowledge and balance the uncertainty and utility of a potential experiment’s outcome. Through sequential experimentation, the optimal learning algorithm locates the optimal experiment faster than ad hoc or random experimentation would.
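As an illustration of sequential Bayesian experimentation, the sketch below uses one well-known policy from the literature, the knowledge gradient for independent normal beliefs. It is not necessarily the algorithm we would use on your problem. It assumes a small, discrete set of candidate experiments, normally distributed measurement noise with known variance, and independent normal priors. The true response values, prior settings and function names are all hypothetical.

```python
import math
import random


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def update_belief(mean, var, observation, noise_var):
    """Conjugate normal update of one candidate's belief after one noisy measurement."""
    precision = 1.0 / var + 1.0 / noise_var
    new_mean = (mean / var + observation / noise_var) / precision
    return new_mean, 1.0 / precision


def knowledge_gradient(means, variances, noise_var):
    """Expected improvement in the best posterior mean from measuring each candidate once."""
    scores = []
    for i, (mu, var) in enumerate(zip(means, variances)):
        best_other = max(m for j, m in enumerate(means) if j != i)
        # Standard deviation of the change in this candidate's mean after one measurement.
        sigma_tilde = var / math.sqrt(var + noise_var)
        zeta = -abs(mu - best_other) / sigma_tilde
        scores.append(sigma_tilde * (zeta * normal_cdf(zeta) + normal_pdf(zeta)))
    return scores


def run_campaign(true_values, prior_mean, prior_var, noise_var, budget, seed=0):
    """Simulate a sequence of experiments chosen by the knowledge gradient policy."""
    rng = random.Random(seed)
    means = [prior_mean] * len(true_values)
    variances = [prior_var] * len(true_values)
    for _ in range(budget):
        scores = knowledge_gradient(means, variances, noise_var)
        choice = max(range(len(scores)), key=scores.__getitem__)
        # Stand-in for running the real experiment: a noisy draw around a hidden truth.
        observation = rng.gauss(true_values[choice], math.sqrt(noise_var))
        means[choice], variances[choice] = update_belief(
            means[choice], variances[choice], observation, noise_var
        )
    return means, variances


if __name__ == "__main__":
    # Hypothetical hidden yields for five candidate settings.
    means, variances = run_campaign([0.2, 0.5, 0.9, 0.4, 0.1], 0.0, 1.0, 0.25, budget=15)
    print([round(m, 2) for m in means])
```

Each step scores every candidate by how much one more measurement is expected to improve the best estimate, runs the highest-scoring experiment, and folds the result back into the belief. In practice the independent priors would be replaced by a model that encodes your domain knowledge, such as correlations between nearby settings.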
Simulation and modeling
Setting up and running simulations by yourself is a headache and takes precious time away from actual lab work. You need to make sure you have the correct and most up-to-date simulation software, purchase sufficient computational resources to ensure timely results, and be familiar with the multitude of file formats and options needed to run simulations and visualize results.
We can run your simulations for you on our computers. We can help you manage simulation data, maintaining a system of record so that you know the exact conditions, parameters and other metadata associated with each simulation run, ensuring reproducibility and accuracy. We can also help tie your simulation activities into our optimal learning framework to reduce the number of in-silico experiments you need to perform in order to learn what you need to know.
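A system of record can start out very simply. The sketch below shows one possible shape for it, assuming each run is described by a dictionary of parameters plus the name and version of the simulation software. The field names, the 16-character identifier and the JSON Lines log file are illustrative choices, not a description of our actual tooling.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_run_record(parameters, software, version, notes=""):
    """Build a metadata record whose ID depends only on what was run, not on when."""
    canonical = json.dumps(
        {"parameters": parameters, "software": software, "version": version},
        sort_keys=True,  # key order must not change the identifier
    )
    run_id = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return {
        "run_id": run_id,
        "parameters": parameters,
        "software": software,
        "version": version,
        "notes": notes,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def append_record(path, record):
    """Append one record as a line of JSON to a plain-text log file."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


if __name__ == "__main__":
    # Hypothetical molecular dynamics run; names and values are placeholders.
    record = make_run_record(
        {"temperature_K": 300, "steps": 100000, "timestep_fs": 1.0},
        software="example_md_code",
        version="1.0",
    )
    append_record("runs.jsonl", record)
    print(record["run_id"])
```

Because the identifier is a hash of the inputs, rerunning the same configuration yields the same ID, which makes duplicate or repeated runs easy to spot.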
Machine learning and knowledge discovery
We can use general statistical and machine learning techniques such as regression, clustering and classification as well as more sophisticated techniques such as deep learning to leverage your experimental data in powerful ways, such as structure/property predictions.
We’re developing a software package to help scientists navigate the vast amount of scientific literature using natural language processing. If you work with us, we can tune this service to match your specific needs, allowing you to quickly ascertain relevant historical and state-of-the-art experimental procedures and numerical data.
We’re also designing an automated learner for chemical reaction networks and their associated kinetics.
Our core expertise
Our group incorporates researchers from a broad set of backgrounds, including statistics, computer science, applied mathematics, computational materials science, and operations research.
Multiscale modeling and simulation
From fully atomistic molecular dynamics simulation to numerically solving the differential equations of numerical models, we have experience in theoretically describing nano- and micro-scale systems. Specifically, our expertise centers on the kinetics and thermodynamics of many-body systems, and on molecular dynamics and kinetic Monte Carlo simulation.
Statistics and machine learning
We have experience in the analysis and modeling of, and inference from, large data sets. We’re also adept at general machine learning techniques, from simple regression to more sophisticated methods such as deep learning.
Uncertainty is inevitable in laboratory science, and can arise from several sources. We can draw on our extensive statistical knowledge to quantify and model uncertainty, providing statistically robust solutions to your uncertain problems.
Optimization under uncertainty
How do you make decisions under uncertainty and noise in order to achieve some objective? We study how to properly model and compute optimization problems when random variables (often a large number of them) are involved. Randomness and uncertainty can come from many different sources, and can impact the solution in a myriad of ways. Members of our group have spent the last few decades researching this topic, providing solutions in transportation and logistics, energy, healthcare and laboratory sciences.