The risk of replicating social bias in synthetic patient data – The need for human intelligence partnered with AI

Authored by: Paul Reed and Anabelle Gall

Published in Pharmafocus, February 2024

As we move forward in 2024, the healthcare industry has accepted that AI-generated synthetic patients are here to stay. The ability to use artificial patients that mimic real-world patient data addresses crucial unmet needs, and it is now difficult to envision a future healthcare industry without their widespread use and integration.

However, we must strike a delicate balance between integrating AI tools into market research and acknowledging the invaluable role of human intelligence in understanding, verifying, supervising and analysing insights. In this article, we delve into the compelling arguments surrounding the opportunities and risks associated with the utilization of AI-generated synthetic patients in healthcare market research.

The power of data and AI

The healthcare sector is responsible for an estimated 30% of the total data volume worldwide, according to a report by the International Data Corporation. Furthermore, projections indicate that by 2025 the annual growth rate of healthcare data will reach 36%, significantly outpacing other industries. This data encompasses a wide range of patient and clinical information, including electronic medical records (EMRs), disease registries, clinical trial information, diagnostic imaging and data from wearable devices. Thanks to advances in AI, particularly large language models (LLMs) such as GPT, we can now statistically analyse these data points, uncovering patterns within vast data sets to create remarkably lifelike synthetic patients that serve as synthetic replicas of reality.

The benefits for the healthcare industry are profound. Drug developers, scientists and market researchers can now tap into a synthetic patient population without requiring patient compliance approvals. This translates into major efficiencies, drastically reducing the turnaround time for product advancement. For those concerned with market research insights, there is now a virtual patient population in which to rigorously test hypotheses and form a comprehensive understanding of the market landscape before embarking on primary research.

Caution is key

Synthetic patients risk perpetuating a deeply problematic ‘bias loop’. This phenomenon arises when marginalized populations are underrepresented or inaccurately recorded in existing patient data. That flawed data is then used to create virtual patients, which drift even further from the diversity observed within real-world subpopulations. Psychologists at the University of Deusto, Spain, found that users tend to absorb biases from AI solutions and subsequently act on them. This can lead to negative healthcare outcomes, particularly when decisions are based on synthetic patients that reflect the biases present in current data sources.
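To make the ‘bias loop’ concrete, the minimal sketch below is a purely hypothetical illustration, not any real synthetic-data pipeline: a toy generator that slightly underfits a rare subgroup (the assumed `shrink` factor) is repeatedly refit on its own synthetic output, and the subgroup’s share erodes with each generation.

```python
import random

def fit_generator(records, shrink=0.9):
    """Toy 'generator': estimate the minority share, then shrink it slightly,
    mimicking a model that underfits rare subgroups (hypothetical assumption)."""
    return shrink * sum(records) / len(records)

def sample_synthetic(minority_share, n=1000):
    """Draw n synthetic patients from the fitted share (1 = minority, 0 = majority)."""
    return [1 if random.random() < minority_share else 0 for _ in range(n)]

random.seed(0)
data = [1] * 70 + [0] * 930            # 'real' data: minority already only 7%
for generation in range(5):
    share = fit_generator(data)        # fit on whatever data is available...
    data = sample_synthetic(share)     # ...then the next 'dataset' is synthetic
    print(f"generation {generation}: modelled minority share = {share:.2%}")
```

Real pipelines are far more sophisticated, but the underlying dynamic is the same: whatever a model fails to capture in one generation is missing from the data the next generation learns from.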

COVID-19 serves as a glaring reminder of the importance of understanding real patients when aspiring to create accurate synthetic ones. The global health crisis shed light on disparities in healthcare access, unveiling the unfortunate truth that certain subgroups face worse outcomes and more limited healthcare opportunities. For example, in the US, COVID-19 patient data significantly overrepresents Asian and White populations and underrepresents Black patients, potentially masking the greater mortality risk among African Americans.

The lack of diversity in available healthcare data is a global issue. A 2020 analysis by the FDA found a significant disparity between the demographics of clinical trial participants and the global population. Of more than 292,000 participants worldwide, 76% were identified as White, 11% as Asian and just 7% as Black. These figures contrast sharply with the distribution of the global population: approximately 60% of people live in Asia, 16% in Africa, 10% in Europe, 8% in North America and 8% in Latin America, according to the World Population Review.

If left unaddressed, there is a concern that synthetic patients may perpetuate the current underrepresentation of marginalized patients. It is important to question whether these synthetic patients truly reflect the diversity of real-world patients. We must identify and include patients whose voices have not been heard and take collective action to ensure inclusivity. We believe real-world patient insights will continue to be essential in validating and improving synthetic patient models.

Striking the balance

To unleash the true potential of AI-generated synthetic patients, human intelligence must be harnessed alongside real patient voices. It is important to partner with specialists who have the expertise to use AI effectively and ensure the patient voice is accurately heard, and support is needed to generate genuine patient insights that guarantee the precision and usability of synthetic patients. With the right guidance, synthetic patients can be integrated into the project process to generate initial hypotheses and research materials, and to develop and fine-tune patient treatment and healthcare solution concepts that meet the needs of all patients.

