July 31, 2023
We’re excited to announce the latest release of our Alzheimer’s Disease Digital Twin Generator, AD DTG 3.1. Coming just a month after our previous AD DTG release, which introduced a novel architecture, this release adds improvements we plan to apply across disease areas in the future.
Missing data is a challenge that rears its head in multiple ways. Participants in a study will miss observations, and different studies will measure different outcomes, creating a patchwork of data in which missingness is the norm rather than the exception. We leverage as much data as possible to predict outcomes reliably, but a consequence of this breadth is that some of the variables our model uses won’t be measured in a given clinical trial.
When we make digital twins, we want the predictions to be as insensitive as possible to missing data. If a feature is important, predictions for participants who are missing that feature will be less precise. Our DTG specification sheets, available on our website, characterize this sensitivity. However, how we make predictions in the presence of missing data affects this sensitivity. For example, if we simply replace missing values with the most common value, the model will be more sensitive to missingness than if we use a more intelligent imputation strategy.
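To make the contrast concrete, here is a minimal sketch, not our production pipeline, comparing a naive constant fill (the mean is used here as the simple baseline) with a model-based imputer that conditions on a participant’s other measurements. All names and data in this toy example are illustrative:

```python
# Toy comparison: constant-fill imputation vs. context-dependent imputation.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import SimpleImputer, IterativeImputer

rng = np.random.default_rng(0)

# Two correlated baseline variables; 30% of the second one is unobserved.
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.2 * rng.normal(size=n)
X = np.column_stack([x1, x2])
X_missing = X.copy()
missing_mask = rng.random(n) < 0.3
X_missing[missing_mask, 1] = np.nan

# Naive: fill every missing x2 with a single constant, ignoring
# everything else we know about that participant.
naive = SimpleImputer(strategy="mean").fit_transform(X_missing)

# Model-based: predict each missing x2 from the observed x1,
# so the imputed value depends on the participant's context.
learned = IterativeImputer(random_state=0).fit_transform(X_missing)

print("naive imputation error:  ", np.mean((naive[missing_mask, 1] - X[missing_mask, 1]) ** 2))
print("learned imputation error:", np.mean((learned[missing_mask, 1] - X[missing_mask, 1]) ** 2))
```

Because the model-based imputer exploits the correlation between the two variables, its filled-in values sit much closer to the truth, which is exactly what reduces a downstream model’s sensitivity to missingness.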
Our latest DTG release updates how we handle missing data at baseline by swapping out a component in the architecture. In this new version, we use a neural network imputation model, co-trained as part of the Neural Boltzmann Machine, to impute the data used to create digital twins. This model can learn from the other variables to impute, in a context-dependent way, the values of missing baseline variables. This results not just in reduced sensitivity to missing data at baseline but also in better overall DTG performance. You can find the updated DTG spec sheet on our website and can expect more improvements as we continue to innovate in machine learning.
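For readers who want a picture of the general idea, here is a minimal PyTorch sketch of context-dependent neural imputation. This is not our Neural Boltzmann Machine or production code; all class and function names are hypothetical. The network sees the observed baseline variables plus a missingness mask and predicts the variables that were not measured:

```python
# Minimal sketch of context-dependent neural imputation (illustrative only).
import torch
import torch.nn as nn


class NeuralImputer(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        # Input: observed values (missing entries zeroed) concatenated with the mask.
        self.net = nn.Sequential(
            nn.Linear(2 * n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_features),
        )

    def forward(self, x: torch.Tensor, observed: torch.Tensor) -> torch.Tensor:
        # Replace NaNs so missing entries contribute nothing to the input.
        x = torch.nan_to_num(x)
        inp = torch.cat([x * observed, observed], dim=-1)
        pred = self.net(inp)
        # Keep measured values as-is; fill in only the missing ones.
        return torch.where(observed.bool(), x, pred)


def train_step(model, optimizer, x_batch, obs_batch, drop_prob=0.3):
    # Hide a random subset of the observed values and learn to reconstruct
    # them from the remaining context.
    keep = obs_batch * (torch.rand_like(obs_batch) > drop_prob).float()
    imputed = model(x_batch, keep)
    # Loss only on entries that were observed but artificially hidden.
    hidden = obs_batch * (1 - keep)
    loss = ((imputed - torch.nan_to_num(x_batch)) ** 2 * hidden).sum() / hidden.sum().clamp(min=1)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the network conditions on whatever happens to be observed for each participant, its imputations adapt to the available context, which is the property that drives the reduced sensitivity described above.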
Recently, a potential customer asked, “What should we measure at baseline in our trial to make the best use of your model?” This is the kind of question we love because it shows a deep understanding of how predictive models work. We build digital twin generators from diverse data sources, both in terms of populations and outcomes, to make strong and robust digital twins. This release takes us one step closer to our mission of eliminating trial and error in medicine.