The participating clinical partners will pool mpMRI and clinical data, both retrospective and prospective, from more than 17,000 PCa patients (including 6,000 prospective mpMRI cases), covering baseline examinations and follow-up studies, to form the ProstateNET dataset of more than 1.5 million image representations of prostate cancer. This dataset, unique worldwide, will be exploited through transfer learning: vendor-specific models will be developed by fine-tuning on local imaging data, addressing challenges related to performance and generalizability.
About 60% of the data will be retrospective (collected before M16) and evenly distributed across MRI vendors. Patient imaging exams and related clinical data will originate from 13 clinical sites throughout Europe, in order to build up a significant level of data heterogeneity and account for it in the next step of the project. The remaining 40% of the data will be collected prospectively from M16 onwards and will be used mainly (80%) for the development of vendor-specific models (see Section 1.3.4), with the rest (20%) reserved for external validation of all models. For a small percentage of the data (5%), radiologists will segment the prostate gland and the index lesion to support training of the AI models for this specific task.
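As a rough illustration of the allocation described above, the following sketch computes nominal case counts from the stated percentages. It assumes a round total of 17,000 patients and applies each percentage exactly, whereas the text gives approximate figures ("about 60%", "more than 17,000", and 6,000 prospective cases), so the results are indicative only. The function name `nominal_split` and the choice to apply the 5% segmentation share to the whole dataset are assumptions made for this example.

```python
def nominal_split(total_patients: int = 17_000) -> dict[str, int]:
    """Return nominal case counts for the data allocation (illustrative only)."""
    # Integer arithmetic avoids floating-point rounding surprises.
    retrospective = total_patients * 60 // 100          # ~60% collected before M16
    prospective = total_patients - retrospective        # remaining ~40% from M16
    vendor_model_dev = prospective * 80 // 100          # 80% of prospective data
    external_validation = prospective - vendor_model_dev  # remaining 20%
    # Assumption: the 5% segmentation share refers to the full dataset.
    segmented = total_patients * 5 // 100
    return {
        "retrospective": retrospective,
        "prospective": prospective,
        "vendor_model_dev": vendor_model_dev,
        "external_validation": external_validation,
        "segmented": segmented,
    }


split = nominal_split()
for name, count in split.items():
    print(f"{name}: {count}")
```

Under these assumptions, the nominal prospective share (6,800 cases) is somewhat higher than the 6,000 prospective cases cited earlier, which reflects the approximate nature of the percentages.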
The ProstateNET dataset, whose size, quality and diversity will exceed those of currently available PCa datasets by at least an order of magnitude, will be an integral part of the ProCAncer-I platform, enabling the development of robust and generalizable AI models.
As stated before, the project will create ProstateNET as the largest repository worldwide of high-quality mpMRI PCa images, since this is a sine qua non for advancing AI model development to address the unmet clinical needs. To enhance security, an innovative data governance model will be implemented to manage data and service requests and access. It will operate as a community-driven business model through an honest broker service (HBS), aligning the interests of data providers with incentives for data sharing and valorisation while ensuring data privacy and security. In addition, the cloud image management platform used will be ISO and CE certified.
The specified expected impact is assessed via KPIs 1-4. ProCAncer-I will contribute towards the creation of the largest global repository (>17,000 patients) of high-quality annotated multimodal PCa medical images, featuring vendor diversity (3 vendors) and a balanced geographical distribution of the associated clinical sites across 9 European countries. The ProstateNET medical imaging repository will enable the derivation of 50 innovative AI models addressing 8 concrete clinical scenarios that span the PCa care continuum.