We’re extremely happy to announce Pathorolo, the first machine learning algorithm developed to assess the likelihood of solving a genomics clinical case with currently available information.
Pathorolo scores each case, and the score is associated with a probability of solving the case. The probability is based on case features and on our automatic analysis results, which also include information from current literature curated in our knowledge graph. The model was trained on a large dataset of cases, and its performance was assessed on a cohort of 760 Baylor Genetics cases. Results will be presented at ASHG 2019 by Dr. Linyan Meng, Assistant Laboratory Director at Baylor Genetics.
With the average diagnostic yield of WES at around 30% (Retterer et al., Genetics in Medicine, 2016), there are many possible uses for a model that can predict whether your patient’s case is likely to be solved.
The benefits of reanalysis have been clearly shown over the past few years. Liu et al. demonstrated that reanalysis of 250 exomes after 6 years doubled yield. They also developed a semi-automated approach that increased yield on a 2,000-case cohort from 25.2% to 36.7% (NEJM, 2019). Basel Salmon et al. performed reanalysis on 84 probands using the Emedgene platform, and were able to establish a diagnosis in an additional 15.5% of cases (Genetics in Medicine, 2018).
Pathorolo is the cornerstone of an automated reanalysis approach. In the reanalysis of large unsolved past cohorts, the Pathorolo model will allow the geneticist to manually analyze only the cases that are likely to be solved. The backlog of past open cases will keep growing beyond any lab’s ability to re-diagnose – if we don’t incorporate automated reanalysis tools like Pathorolo.
Pathorolo makes sure geneticists aren’t wasting manual analysis time, and can focus on the cases most likely to yield meaningful outcomes.
In the Baylor Genetics test cohort, 89% of the cases with a model score of P>0.75 were truly solved by the lab, which makes reinvestigating cases with this score a highly valuable investment of lab time. We’ll enable labs to set their own reanalysis threshold, choosing a sensitivity and specificity that make sense for their operation.
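To make the threshold trade-off concrete, here is a minimal sketch of how a lab might check a candidate threshold against its own past cases. It is not Pathorolo’s implementation: the function name `threshold_metrics` and the toy scores are made up for illustration, and we assume you have a model score and a known solved/unsolved outcome for each historical case. Note that the 89% figure quoted above is the kind of number this sketch calls `ppv`, the share of flagged cases that were actually solved.

```python
def threshold_metrics(scores, solved, threshold):
    """Flag cases with score > threshold and compare against known outcomes.

    scores: model scores in [0, 1]; solved: matching booleans (True = solved).
    Both inputs are hypothetical historical data, not Pathorolo output.
    """
    tp = fp = tn = fn = 0
    for score, was_solved in zip(scores, solved):
        flagged = score > threshold
        if flagged and was_solved:
            tp += 1
        elif flagged and not was_solved:
            fp += 1
        elif not flagged and was_solved:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Avoid dividing by zero when a bucket is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of solved cases that get flagged
        "specificity": ratio(tn, tn + fp),  # share of unsolved cases left unflagged
        "ppv": ratio(tp, tp + fp),          # share of flagged cases that were solved
    }


# Toy data, invented purely for illustration.
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]
toy_solved = [True, True, False, True, False, False]
metrics = threshold_metrics(toy_scores, toy_solved, threshold=0.75)
print(metrics)
```

Raising the threshold generally trades sensitivity for a higher share of solved cases among those flagged, so running a few candidate thresholds through a check like this is one way to pick a cutoff that fits your team’s capacity.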
Should I keep working on this case?
One of the most common challenges we hear from clinical labs is knowing when to stop working on a case. The solved cases are generally easy, and take a shorter time to review. It’s the unsolved cases that can be a big drain on the interpretation team’s time.
Pathorolo can clearly flag cases that are unlikely to be solved. Of the cases with a score of P<0.25, 86% were unsolved by the lab, and of those with P<0.20, 90% were unsolved. Again, labs can choose their own threshold here, but it will be easier to identify the cases that are not likely to yield answers at present. While you’ll still be following lab protocol for these cases, it will be easier to put a stop to time spent digging for answers.
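Put together, the two cutoffs suggest a simple three-way triage. The sketch below is our illustration only: the `Case` class, the `triage` function and the bucket names are hypothetical and not part of the Emedgene platform, and the defaults of 0.75 and 0.25 are just the example thresholds discussed in this post.

```python
from dataclasses import dataclass


@dataclass
class Case:
    case_id: str
    score: float  # model-estimated probability that the case can be solved


def triage(cases, reanalyze_above=0.75, deprioritize_below=0.25):
    """Split cases into three buckets using lab-chosen score thresholds."""
    if not 0.0 <= deprioritize_below <= reanalyze_above <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= low <= high <= 1")

    buckets = {"reanalyze": [], "standard_review": [], "deprioritize": []}
    for case in cases:
        if case.score > reanalyze_above:
            buckets["reanalyze"].append(case.case_id)
        elif case.score < deprioritize_below:
            buckets["deprioritize"].append(case.case_id)
        else:
            buckets["standard_review"].append(case.case_id)
    return buckets


# Invented example cases.
example = [Case("case-A", 0.91), Case("case-B", 0.12), Case("case-C", 0.50)]
print(triage(example))
```

Cases in the "deprioritize" bucket still go through lab protocol. The bucket only signals where extra digging is least likely to pay off right now.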
We at Emedgene firmly believe that AI in medicine can’t be a black box. We want to give our users as broad a toolset as possible to evaluate our algorithm’s output.
For our automated interpretation algorithm, we display evidence from the literature and databases for every variant the algorithm selects.
Pathorolo adds another, higher level of explainability to our AI. We not only pinpoint causative variants, we’ll also tell you how likely they are to solve the case! The algorithm is essentially scoring itself, which is extremely useful in evaluating its output.
The ‘In Silico Geneticist’
A common objection to AI in medicine is that it doesn’t mimic work in the clinic. Liu et al. performed a meta-analysis of humans vs machines in medicine (The Lancet Digital Health, 2019), and found AI performance comparable to humans. However, in a commentary (The Lancet, 2019), Tessa Cook suggests that ‘AI cannot yet replicate the essence of the diagnostic process’.
Ultimately, a geneticist is analyzing a patient case, not just a patient variant, with the intent of providing him or her with answers. Pathorolo, with its holistic case view, brings us one step closer to mimicking the geneticist workflow with AI.
Here’s how the model performed on the Baylor Genetics 760-case production cohort, where it achieved an AUC of 0.83.
Looking at the cases that would be interesting to tag for reanalysis, there were 198 cases that received a model score of P>0.75. Of those, 89% were indeed solved. Checking only the cases with P>0.75 covers 50% of all solved cases in the cohort.
Turning to the cases unlikely to be solved, there were 253 cases with a model score of P<0.25, and 86% of those were indeed unsolved. We can also look at the 167 cases with P<0.20, of which 90% were unsolved.
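For readers who want rough case counts rather than percentages, the figures above can be turned into back-of-the-envelope numbers. This is arithmetic on the rounded percentages reported in this post, so every derived count is approximate. The variable names are ours, and the implied total of solved cases is an inference from the 50% figure, not a reported result.

```python
# Figures reported in this post.
COHORT_SIZE = 760
HIGH_SCORE_CASES = 198        # cases with P > 0.75
HIGH_SCORE_SOLVED_FRAC = 0.89
LOW_SCORE_CASES = 253         # cases with P < 0.25
LOW_SCORE_UNSOLVED_FRAC = 0.86
SHARE_OF_SOLVED_COVERED = 0.50  # solved cases captured by P > 0.75

# Approximate counts derived from the rounded percentages above.
high_score_solved = round(HIGH_SCORE_CASES * HIGH_SCORE_SOLVED_FRAC)   # about 176
low_score_unsolved = round(LOW_SCORE_CASES * LOW_SCORE_UNSOLVED_FRAC)  # about 218

# Cases scoring between the two thresholds.
middle_cases = COHORT_SIZE - HIGH_SCORE_CASES - LOW_SCORE_CASES        # 309

# If roughly 176 solved cases are half of all solved cases, the cohort
# holds on the order of 352 solved cases (an inference, not a reported figure).
implied_total_solved = round(high_score_solved / SHARE_OF_SOLVED_COVERED)

print(high_score_solved, low_score_unsolved, middle_cases, implied_total_solved)
```

In other words, manually reviewing about a quarter of the cohort (198 of 760 cases) would surface roughly half of the cases the lab went on to solve.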
The Pathorolo model will be incorporated in the Emedgene genomic analysis and interpretation platform over the coming few weeks.
Schedule a Demo
We’ll be demonstrating Pathorolo at ASHG next week, October 15-19, in Houston, TX, at booth #1108. You can schedule a demo to see it in action and assess its potential utility for your lab.
And if after all this you’re wondering why we named the model Pathorolo – although I hope that’s not your only question! – it means solved (pathor) or (o) not (lo) in Hebrew!