
NIMS Hyderabad and IIITH create pathology datasets for cancer & kidney research

News in short:
The International Institute of Information Technology, Hyderabad (IIITH) and Nizam’s Institute of Medical Sciences (NIMS) have partnered to launch the India Pathology Dataset (IPD) project, which aims to digitize histopathological images and create publicly available datasets for medical research and AI development. The project is supported by the Technological Innovation Hub for Data Banks, Data Services, and Data Analytics (TiH-Data). The first datasets released are the IPD-Brain dataset, containing 547 high-resolution brain tumor slides, and a Lupus Nephritis dataset. These datasets are valuable resources for researchers developing AI models to improve diagnostic accuracy, explore regional variations in diseases, and predict molecular markers from tissue morphology. The project is unique in its focus on Indian demographics, addressing a gap in the availability of region-specific data for histopathology research. The IPD project is expected to significantly contribute to clinical research, education, and the development of AI-based diagnostic tools in India.
NIMS Hyderabad & IIITH Launch Cancer, Kidney Pathology Data

Hyderabad: In a significant step towards advancing India-centric clinical research, the International Institute of Information Technology, Hyderabad (IIITH) has partnered with Nizam’s Institute of Medical Sciences (NIMS), Hyderabad, to launch publicly available datasets of digitized histopathological images. These datasets focus on brain cancer and kidney disease (Lupus Nephritis), providing crucial resources for medical research and AI development.

The India Pathology Dataset (IPD) project is a collaborative effort involving academia, hospitals, industry, and the government. Its primary goal is to digitize tissue biopsy slides, offering benefits such as reduced risk of slide damage, faster turnaround times, improved clinical decision-making, and expanded research opportunities through AI.

The project is supported by the Technological Innovation Hub for Data Banks, Data Services, and Data Analytics (TiH-Data). As part of the initiative, IIITH has installed a whole-slide digital scanner at NIMS to facilitate the digitization process. According to Prof. Vinod P.K., who is leading the dataset curation, digitizing the slides allows the images to be viewed and shared on computers, enabling collaborative diagnoses across locations.

Brain Tumor Dataset Released

One of the first datasets released is the IPD-Brain dataset, published in Scientific Data, a Nature Portfolio journal. This open-access dataset, which includes 547 high-resolution H&E slides from 367 patients, is one of the largest of its kind in Asia. Dr. Megha Uppin, from the Department of Pathology at NIMS, emphasized the importance of precise tumor typing and grading for effective cancer management. The dataset provides a foundation for training machine learning models to improve diagnostic accuracy and explore regional variations in brain tumors.

AI has the potential to bridge gaps in brain tumor diagnosis, particularly in addressing the shortage of specialized neuropathologists. Digital pathology can also help peripheral hospitals access expertise from specialists remotely.

The project aims to expand the dataset to include other cancers, such as breast, lung, colorectal, oral, and cervical cancers. NIMS is also contributing to the lung cancer dataset.

Lupus Nephritis Dataset

In addition to cancer-related datasets, the IPD project has compiled a dataset on Lupus Nephritis, a kidney disease caused by an autoimmune response that disproportionately affects women in India. The dataset will help nephropathologists at NIMS classify the disease and recommend appropriate treatments. AI tools are expected to help with classifying disease subtypes and reducing interobserver variation.

AI and Molecular Prediction

AI is also being used to predict molecular markers from H&E slide images. Traditionally, molecular profiles are obtained through genetic testing or immunohistochemistry (IHC). However, the IPD team is exploring how tissue morphology can reflect underlying DNA alterations, potentially predicting critical markers such as IDH mutations in brain tumors.

The Importance of Histopathological Datasets

The IPD project’s open-source datasets are valuable resources for researchers developing new AI models and conducting data analysis. Prof. Vinod noted that the project is one of the first instances of open-source medical data from India. The IIITH campus now houses a second slide scanner available for research and educational use. Dental colleges and corporate hospitals are already utilizing the technology.

In addition to research, the dataset serves as an educational tool, offering MD students and pathologists a resource for studying histopathological images in depth.

Looking ahead, Prof. Vinod mentioned that additional datasets, including one on breast cancer, are in development. The IPD project is unique in its focus on Indian demographics, filling a gap in the availability of region-specific data for histopathology research, which has traditionally relied on datasets such as the U.S.-based TCGA (The Cancer Genome Atlas).

The India Pathology Dataset project is set to significantly contribute to clinical research, education, and the development of AI-based diagnostic tools in India.
