7 keys to a trustworthy AI
according to the EU guidelines
“Machines take me by surprise with great frequency.”
— Alan Turing
The term ‘artificial intelligence’ has been around since the 1950s, and nowadays it is simply everywhere, with machine learning algorithms and neural networks powering products of all kinds – from navigation platforms to recommendation systems all the way through to personalized medicine. While AI can often be simple software that finds patterns in past data, it is usually perceived as an elaborate system that mimics human cognitive functions such as learning, deducing, and even exhibiting sentiment.
Recently there have been many discussions on the transparency of AI-based solutions. This makes perfect sense; if a predictive algorithm sees a high probability of a certain disease given a patient’s past data, it is only natural that the patient (and their attending physician) want to know what that prediction is based on. Disturbingly, some of these algorithms are ‘black boxes’ – it is genuinely difficult to explain what’s under the hood. Additionally, since this technology is entirely based on past, often human-generated, data, it may sometimes reproduce various forms of bias, creating ethical problems. These ideas of fairness, ethics, transparency, and trustworthiness become critical now that AI-based solutions are everywhere in medtech products. When dealing with patients’ genetic data, the issue is even more sensitive. Before we can achieve truly explainable and trustworthy AI solutions, we first need to define these qualities in view of today’s ML systems.
One evaluation standard is the EU’s recently released guidelines for what constitutes a reliable AI technology. According to these guidelines, trustworthy AI should be, in general, lawful – respecting all applicable laws and regulations; ethical – respecting ethical principles and values; and robust – both technically and in view of the system’s social environment.
In more detail, here are the 7 key requirements for a trustworthy AI according to the EU guidelines; and here’s how our AI system at Emedgene checks all the boxes.
1. Human Agency and Oversight
“AI systems should empower human beings, allowing them to make informed decisions and fostering their fundamental rights. At the same time, proper oversight mechanisms need to be ensured, which can be achieved through human-in-the-loop, human-on-the-loop, and human-in-command approaches.”
Emedgene’s AI-backed automatic genomic interpretation system empowers human beings (geneticists, to be specific) and helps them make informed decisions. It does so by analyzing masses of genomic data and shortlisting for the geneticists the variants that may solve patients’ cases, along with evidence from the literature and databases. This saves them the time and effort spent filtering variants and performing literature reviews. Additionally, in building and maintaining our AI, we always have the oversight of a human-in-the-loop. Our in-house geneticists, who form half of our R&D team, evaluate our data and our algorithms on a daily basis.
2. Technical Robustness and Safety
“AI systems need to be resilient and secure. They need to be safe, ensuring a fall back plan in case something goes wrong, as well as being accurate, reliable and reproducible. That is the only way to ensure that also unintentional harm can be minimized and prevented.”
Emedgene is a cloud-based, highly redundant platform with high resilience and scalability. That’s a given these days. Moreover, every case solved on our platform has complete reproducibility. We keep a detailed record of the platform and algorithm versions used in analyzing the case for every organization.
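As a minimal sketch of this kind of reproducibility (the class, field names, and version strings below are invented for illustration, not Emedgene’s actual schema), the core idea is simply to freeze each analysis together with the exact platform and algorithm versions that produced it:

```python
from dataclasses import dataclass

# Hypothetical illustration: every solved case is stored with its provenance,
# so the same result can be traced and re-produced later.
@dataclass(frozen=True)
class AnalysisRecord:
    case_id: str
    platform_version: str   # version of the interpretation platform used
    algorithm_version: str  # version of the variant-ranking model used
    results: tuple          # shortlisted variants, frozen with the record

def record_analysis(case_id, platform_version, algorithm_version, results):
    """Freeze an analysis with the versions that produced it."""
    return AnalysisRecord(case_id, platform_version, algorithm_version, tuple(results))

record = record_analysis("case-001", "2.4.1", "ranker-1.7", ["BRCA1:c.68_69del"])
```

The record is immutable by design: once a case is solved, neither its results nor the version metadata behind them can be silently altered.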
3. Privacy and Data Governance
“Besides ensuring full respect for privacy and data protection, adequate data governance mechanisms must also be ensured, taking into account the quality and integrity of the data, and ensuring legitimised access to data.”
We know that sharing your patients’ clinical data (and in Emedgene’s case, genomic data) can be challenging, given multiple regulatory and internal security requirements. That’s why the data confidentiality of our customers – and their patients – is a top priority for us. We make sure that the data we work on is secure, anonymized, reliable and of high quality, which is why we’ve passed security audits by some of the most stringent health organizations in the world.
4. Transparency
“The data, system and AI business models should be transparent. Traceability mechanisms can help achieving this. Moreover, AI systems and their decisions should be explained in a manner adapted to the stakeholder concerned. Humans need to be aware that they are interacting with an AI system, and must be informed of the system’s capabilities and limitations.”
Explainability of AI-based systems is a true challenge, as many algorithms – mostly neural nets – are constructed in a way that doesn’t allow the user to understand their decision rules and feature importance. Still, transparency of ML algorithms can be achieved in many ways – from choosing an approachable, explainable model to begin with, through clearly signaling to the user that they are interacting with an AI component, to fully exposing the model’s features and decision rule. In our platform, when our AI predicts which variants are the most likely to solve the case, it always provides explanations and evidence for that decision, including links to relevant scientific papers and database entries. As a rule, we believe in choosing explainable ML models whenever possible, and in always providing as much evidence as possible to the user.
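To make the idea of an explainable decision rule concrete, here is a toy sketch (the feature names and weights are invented, not our production model): a transparent score that returns its evidence alongside the prediction, so the user can see exactly which signals contributed and by how much.

```python
# Hypothetical, illustrative weights for an interpretable variant score.
WEIGHTS = {
    "phenotype_match": 3.0,     # patient symptoms match the gene's known disease
    "known_pathogenic": 2.5,    # variant reported pathogenic in a curated database
    "rare_in_population": 1.5,  # low allele frequency in population databases
}

def score_variant(features):
    """Score a variant and report which features contributed, and by how much."""
    evidence = {name: WEIGHTS[name]
                for name, present in features.items()
                if present and name in WEIGHTS}
    return sum(evidence.values()), evidence

score, evidence = score_variant(
    {"phenotype_match": True, "known_pathogenic": True, "rare_in_population": False}
)
# The evidence dict doubles as the explanation shown to the user.
```

The design choice is the point: because the score is an additive combination of named features, the explanation falls out of the model itself rather than being reconstructed after the fact.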
5. Diversity, Non-Discrimination and Fairness
“Unfair bias must be avoided, as it could have multiple negative implications, from the marginalization of vulnerable groups, to the exacerbation of prejudice and discrimination. Fostering diversity, AI systems should be accessible to all, regardless of any disability, and involve relevant stakeholders throughout their entire life cycle.”
Datasets that are constructed by humans are always at risk of being biased; as a consequence, any AI system trained on such data may present that bias in its predictions. As a whole, our platform encounters the same biases the science of genomics does. Our R&D team is fully aware of this issue, therefore we always aim to train our models on as varied populations as possible. As our system encounters and learns from more and more genetic data, and as more population sequencing data becomes available, it will construct a more holistic view of the world and avoid being biased.
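One simple, hedged sketch of how such bias can be checked (a generic audit pattern, not Emedgene’s internal tooling): break a model’s accuracy down by population group rather than reporting a single aggregate number, so a gap between groups becomes visible.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """records: iterable of (group, predicted, actual) tuples.
    Returns per-group accuracy, exposing gaps a single average would hide."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        hits[group] += int(predicted == actual)
    return {g: hits[g] / totals[g] for g in totals}

# Toy data: the model is right on every case in group_a, half the cases in group_b.
metrics = accuracy_by_group([
    ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 1, 1),
])
```

A gap like the one in this toy example is exactly the kind of signal that would prompt retraining on a more varied population.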
6. Societal and Environmental Well-Being
“AI systems should benefit all human beings, including future generations. It must hence be ensured that they are sustainable and environmentally friendly. Moreover, they should take into account the environment, including other living beings, and their social and societal impact should be carefully considered.”
While the cost of sequencing continues to decline, interpretation remains a time-consuming and costly component of genetic testing, forming a bottleneck to the widespread adoption of genetic-based care. Our mission is to create AI tools that dramatically scale interpretation, and contribute to making genetic-based care available to wider populations.
7. Accountability
“Mechanisms should be put in place to ensure responsibility and accountability for AI systems and their outcomes. Auditability, which enables the assessment of algorithms, data and design processes plays a key role therein, especially in critical applications. Moreover, adequate and accessible redress should be ensured.”
Finally, as with any successful product, evaluation is key. We envelop each of our ML components with a collection of evaluation and assessment metrics from genomics, data science and IT perspectives, making sure that they always meet our (and our customers’) standards.
So there you have it. 7 keys to a trustworthy AI, as defined by the EU. Let’s create machines that don’t take us by surprise, but instead are reliable partners that continuously crawl, organize and make available the endless amounts of new data we’re creating.