ECNU research team advances rare diseases documentation with CCRD platform


The Catalogue of Chinese Rare Diseases (CCRD) was jointly established by a research team led by Shi Tieliu and Lin Xin of ECNU. The CCRD was officially launched on Feb 29 to enhance public awareness of rare diseases and to promote their screening, diagnosis, and prevention.


"Rare diseases" is a general term for a large group of different diseases that each affect a small percentage of the population. Because of their very low prevalence, these conditions have long been neglected and were once called orphan diseases.

According to the World Health Organization (WHO) definition, diseases that affect less than 0.65% to 1% of the total population are collectively referred to as rare diseases. Previous research by Professor Shi’s group shows that there are more than 15,000 rare diseases worldwide, 80% of which are hereditary. Fewer than 10% of these diseases have approved drugs or therapeutic approaches.

The United States Food and Drug Administration (FDA) estimates that there are about 300 million rare disease patients around the world and about 20 million in China, with more than 200,000 new patients every year. An epidemiological survey in Ireland in 2020 showed that 58.6% of children who died under the age of 14 had rare diseases, which indicates that rare diseases are also a major cause of death among children.

Currently, the exact types of rare diseases in the Chinese population are unknown. In 2018 and 2023, five Chinese ministries and commissions jointly released the "First Batch Rare Disease Catalog" and the "Second Batch Rare Disease Catalog" respectively, which together include 207 rare diseases. However, the diseases listed are the most common ones among rare diseases.

Since China is a country with a large population and abundant resources, there are likely far more rare diseases than those listed in the Catalogs. Meanwhile, text mining of the face sheets of medical records shows that most rare diseases do not have a unified ICD-10 code: different hospitals usually assign different ICD-10 extended codes to the same rare disease, which makes rare disease information in medical records inconsistent between hospitals. The inconsistency of disease names and codes not only causes great confusion in the standardization of clinical information on rare diseases, but also makes diagnosis-related group (DRG) analysis results for rare diseases inaccurate, which may mislead national decision-making on rare diseases.

In addition, because these diseases are so rare, most clinicians lack adequate knowledge of them. As a result, patients often face a high rate of misdiagnosis, a lack of effective treatments, and high disability and mortality rates, which place a heavy economic burden on patients and society.

Furthermore, the research and development of drugs for rare diseases is costly and difficult, owing to the small number of patients, the lack of epidemiological data, the low awareness among clinicians, the high cost of clinical research, and the slow progress of scientific research. Patients face many difficulties throughout the whole process, from diagnosis and treatment to rehabilitation and support. Rare disease management is therefore not only a public health issue but also a social issue.

To help rare disease patients obtain better diagnosis and treatment, the ECNU research team has carried out comprehensive text mining and manual curation of rare diseases in the Chinese population in order to standardize the clinical information on these diseases. This work can support accurate diagnosis, provide patients and their families with comprehensive information about rare diseases, and guide patients towards better treatment and care services.

As Prof. Shi Tieliu from the School of Life Sciences of ECNU said, “The disease names and information included in this CCRD system are mainly extracted from our comprehensive rare disease information annotation platform, eRAM (www.unimd.org/eram/), while integrating rare disease information from other related databases such as Orphanet, OMIM, DO, HPO, and others.”

In addition, the team conducted large-scale text mining of published clinical case report papers and then manually curated and standardized the extracted information, and subsequently built the CCRD system.

The platform currently includes 4,455 rare diseases, according to Shi Tieliu, and each rare disease is supported by relevant academic literature and cases. Focusing on rare diseases in China, the CCRD system is characterized by its coverage of the distribution and standardized names of rare diseases in the Chinese population.

The database consists of five main modules. First, the Basic Information Module contains disease definitions, inheritance patterns, associated genes, disease classifications, and other disease annotation information. Second, the Disease Phenotype Module contains disease phenotypes and sentences describing the phenotypes extracted from published literature. Third, the Related Genes Module contains disease-associated genes and annotation information for variants in each gene. Fourth, the Treatment Drugs Module contains rare disease treatment drugs approved in China and whether they are included in Category A/B medical insurance. Fifth, the Case Information Module contains a case description for each rare disease and related articles about those cases.

In Shi Tieliu's view, in the big data era, a rare disease catalog platform supported by modern information technology will serve as a system for managing rare disease patients throughout their entire lifecycle. This not only promotes progress in the rare disease field but also contributes to technological advancement in global health.

"In the future, based on large-scale information integration and standardization, combined with disease AI models, we expect to provide comprehensive and systematic information on rare diseases to national health management departments, research institutions, medical centers and clinicians, as well as the related international rare disease communities," Professor Shi Tieliu said.

Shi Tieliu stated that he hopes to establish a comprehensive and efficient rare disease information management system that can greatly improve the precision of rare disease diagnosis and identification.

Source: School of Life Sciences

Editor: Xu Xincheng