One Data for Rare Diseases (ODRD) software
The Ministry of Health & Family Welfare, Government of India, formulated a ‘National Policy for Treatment of Rare Diseases’ (NPTRD) in 2017. It could not be implemented, and a new NPTRD was formulated in 2021. This history reflects the complexities, enigmas, dilemmas, and roadblocks surrounding the problem of ‘rare diseases’.
The problem starts with the very definition of ‘rare diseases’. The World Health Organisation (WHO) defines a rare disease as one with a population prevalence of less than 10/10000, but different countries have adopted different thresholds, ranging from 1/10000 (Taiwan) to 6.4/10000 (USA). There is disagreement on whether rare diseases can be defined solely on prevalence criteria. Other criteria such as severity, life-threatening nature, hereditary relations, treatability, geographic concentration, population genetics, customs, and practices have been proposed by different studies at different times, but no large-scale structured study has yet been commissioned that can support any of these with sufficient data.
Early diagnosis of rare diseases is also a challenge due to a lack of awareness among primary care physicians and a lack of adequate screening and diagnostic facilities. Traditional genetic testing, including Next-generation Sequencing and Chromosomal microarray studies, is too expensive and too rarely available to be applied in a large population context. Many doctors lack the training needed to diagnose and treat these conditions correctly and in a timely manner. It takes patients in the United States (US) an average of 7.6 years and patients in the United Kingdom (UK) an average of 5.6 years to receive an accurate diagnosis, typically involving as many as eight physicians (four primary care and four specialists). In addition, two to three misdiagnoses are typical before arriving at a final diagnosis.
Given the small patient pool for each rare disease, there is very little credible data on their pathogenesis, natural history, clinical behaviour, and treatment outcome. The extremely high level of clinical expertise and the multi-disciplinary environment required to treat such diseases are also not widely available in India and most other global south countries. About 95% of rare diseases have no approved treatment, and less than 1 in 10 patients receive disease-specific treatment. Where drugs exist, they are prohibitively expensive and hardly available (they are cynically referred to as ‘Orphan drugs’, as hardly any pharma company makes them).
In India, there has been no epidemiological data on the incidence and prevalence of ‘rare diseases’, so the burden is not even known. There is also the moral dilemma of balancing competing public health and economic interests while spending public money on rare diseases: what such spending will yield, and what can be done for the patients given the few treatment options, when the country already has much larger public health emergencies such as anemia and undernutrition, where the impact is highly visible, easier to achieve, and reaps demographic dividends.
It is evident that the problem described above is complex, not well-understood, multi-factorial in origin, and diverse across geographies, ethnic and economic groups. The sector is highly data-thin, and without data, there is no way to address the problem basket.
We now introduce our solution: a deep-learning enabled, data-driven approach to the conundrum of ‘rare diseases’ in India and other resource-constrained geographies in the global south. Based on our team's earlier experience in building similar solutions for COVID and Diabetes Mellitus, we will build the ‘One Data for Rare Diseases’ (ODRD) software, which will enable large-scale screening of target population groups, initially consisting of people in the 0-30 years age group and patients known to be suffering from rare diseases within a defined community. Apart from screening, there will be regular monitoring, earlier detection, risk categorisation, and structured early referral of at-risk/diseased individuals to a network of graded treatment centres. The software will be designed with its users, will operate in low-bandwidth conditions, and will be easy to use by community health workers in both rural and urban environments.
The data elements in the software will include demographics of surveyed population, anthropometric measurements, climate conditions, dietary practices, prior use of drugs (prescription/otherwise) and substances, nutritional status, hygiene practices and prior infections, family and past history of rare diseases, current symptoms if any, basic physical examination, etc. Certain relevant and practicable biomarkers will also be tested using frugal point-of-care innovative diagnostic devices for which suitable collaborations have been established with two of the leading technology institutes in India – IIT Kharagpur and IIT Guwahati.
From the collected data, using statistical and AI-ML models, we can infer characteristics that are associated, even weakly, with a rare disease. For example, if a rare disease is restricted to a single family, then certain specific single nucleotide polymorphisms (SNPs) may be responsible for it (this requires further investigation). We will calculate the prevalence of each of the observed rare diseases. Rare disease-specific risk scores will be developed based on the available data. A multivariate analysis, using Mahalanobis distance, will be carried out to find out whether collected dietary, climate, biomarker, proteomic, and clinical data are associated with rare diseases.
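To make the multivariate step concrete, the following is a minimal sketch of a Mahalanobis distance calculation. The proposal does not specify how the distance will be applied; this sketch assumes one common reading, namely measuring how far an individual's feature vector lies from a reference group once the spread and correlation of the features are taken into account. All function names and the four reference rows are hypothetical and are not project data. The sketch uses only the Python standard library; a production pipeline would more likely rely on a numerical library.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance_matrix(rows):
    """Sample covariance matrix (divides by n - 1)."""
    n = len(rows)
    mu = mean_vector(rows)
    p = len(mu)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        dev = [x - m for x, m in zip(row, mu)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += dev[i] * dev[j]
    return [[v / (n - 1) for v in r] for r in cov]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def mahalanobis(x, reference_rows):
    """Distance of x from the reference group: sqrt(d^T S^-1 d)."""
    mu = mean_vector(reference_rows)
    cov = covariance_matrix(reference_rows)
    d = [xi - mi for xi, mi in zip(x, mu)]
    y = solve(cov, d)  # y = S^-1 d without forming the inverse
    return math.sqrt(sum(di * yi for di, yi in zip(d, y)))

# Hypothetical reference group with two made-up features per person.
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
distance = mahalanobis((3.0, 1.0), reference)
```

Solving the linear system instead of inverting the covariance matrix keeps the sketch short, and it fails loudly when the matrix is singular, which can happen when a reference group is small relative to the number of features.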
We will also build capacity among community health workers, on a pilot scale, to carry out software-driven survey work in a target community, undertake basic physical examinations, and conduct diagnostic tests as part of the surveillance program. This work will be monitored by a specialist team of doctors, public health experts, and medical institutions providing clinical care. The data and consequent knowledge will form the backbone of a National policy on Rare Diseases.
The AI-ML algorithm driven ODRD software, combined with access to frugal technologies, can be a game changer in the early detection of rare diseases. For example, the ODRD software can better target small geographical areas identified based on prevalence, risk factors, and local genetic predispositions in the context of inborn errors of metabolism such as congenital hypothyroidism, congenital adrenal hyperplasia, and glucose-6-phosphate dehydrogenase deficiency.
In addition, infectious diseases such as leptospirosis and scrub typhus (for which there is little awareness among primary care doctors) are also expected to be identified at an earlier stage through the ODRD software.
Our proposed solution serves multiple stakeholders at multiple levels.
Patients with rare diseases – Access to treatment is a massive challenge for them, over and above the financial hardship (treatment of most rare diseases, where available, is generally highly complex and expensive). Countries like India can have only a few centres of such expertise and complexity. Our complete data-system will connect the right patient to the right treatment facility and provide guidance and access. Follow-up of such patients is also a problem, as the local doctor has very little knowledge and expertise. The community health workers, using the software's Electronic Health Record (EHR) component, can support such periodic follow-ups with full documentation and monitor progress.
Community at large – Once adopted, the combination of health workers and the software establishes a mechanism for screening of all target populations in the community.
Skilled workforce – As stated in this proposal, there is a lack of appropriately trained health workers at the community level who can be tasked with implementing screening, earlier detection, and public health programs. Non-specialist doctors (the only doctors available in peripheral areas in India) are generally unaware of rare diseases and therefore fail to detect them early, or at all. Through this proposal, we will develop a batch of skilled community health workers on a pilot scale who will be able to undertake all activities up to the pre-medical intervention stage. The AI-ML engine of the software will provide evidence-based guidance to inexperienced non-specialists, helping them to suspect possible diagnoses and make suitable graded referrals. The same software will also link rare disease problems in different geographies to the nearest referral centre.
National government – We have studied multiple countries in Asia, Western Europe, and North America, and it is encouraging to see that governments are aware of the problem of rare diseases. Some of them (including India) just have no idea about how to proceed with a strategy because of the paucity of information, lack of studies, and a totally data-thin landscape. Our data-driven approach will offer a comprehensive solution to the governments in terms of data on incidence and prevalence, etiological factors and their interrelationships (nutrition, infection, familial, climate, pollution, etc.), stages of disease among the sufferers, their lack of access to medical care, etc. This knowledge will help the government to both formulate and implement appropriate public health policies and set up tertiary centres of excellence for their treatment (India already has a few such hospitals, known as ‘Nidan Kendra’).
The team members have worked together in an immersive manner on the development of data-science technologies for health systems, with a focus on improving access to health for the underserved population. The team lead has extensive experience in developing decision support software by engaging users and the rural community, taking a bottom-up, human-centred approach that is linguistically familiar and socioculturally congruent. In software development, the ability of the health workers to navigate the UI, their familiarity with the local language, the health-seeking behaviour of the community, and the common symptoms and how people express them all played a pivotal role. As a result, using the software is within the capacity of the community health workers and fits into their daily lives.
In addition to decision-support systems for clinical care, the team has developed two (2) similar software solutions for Covid Severity Score and Diabetes Risk Score. The core team is supported by a group of specialist doctors from different disciplines with vast experience and leadership positions in their own fields.
Multi-speciality team – our team draws from a wide cross-section of specialists. They include doctors with vast community medicine experience of working in rural areas and at the interface between medical science and health technologies. The team also includes engineers and scientists working on deep-science but frugal health technologies, software engineers and data scientists, experts in statistics and mathematics, biologists, social entrepreneurs, and program managers.
Participation of globally renowned institutions – the team is drawn from some of the ‘Institutes of National Importance’ and ‘Higher Educational Institutions’ in India, such as IIT Guwahati. The mentor group of this project has Prof. Marc Madou of the University of California at Irvine, and Prof. Amitabha Ghosh, Former Director of IIT Kharagpur.
Training model – the team has experience of training a large number of rural youth in different states of India as community health workers. They are certified as per a National Occupation Standard by the Government of India. They are specially trained in digital literacy, technology usage, and financial literacy. The team also has large-scale experience in building microenterprises and entrepreneurship among rural youth and in fostering their self-belief.
Technology preparedness – the software team is well-placed to undertake the proposed development involving data analytics, machine learning algorithms, and public health analytics. Omics integration is a future task. Multiple frugal diagnostic devices have been implemented; a pipeline of new innovations is set to roll out as soon as they receive national certification.
Government support – the project is likely to receive support from various government agencies in India for its alignment with the government policy of 2021. As mentioned earlier, the Government is also looking for a solution. We are confident the Government will adopt this data-driven approach in its national program for the treatment of rare diseases 2021, if we can successfully complete the first phase of development and conduct a pilot-scale implementation model.
Thousands of health workers – a large number of rural youths, formally trained and certified, may receive employment as community health workers.
- Improve the rare disease patient diagnostic journey – reducing the time, cost, resources, and duplicative travel and testing for patients and caregivers.
- India
- Prototype: A venture or organization building and testing its product, service, or business model, but which is not yet serving anyone
Over the last several years, our team has devoted itself to creating a comprehensive software ecosystem known as 'Uday'. This system is specifically designed to support the operation of eHealth clinics, particularly in rural environments where traditional physician access may be limited. Leveraging community health workers and advanced algorithms, 'Uday' paves the way for remote, efficient, and effective medical care.
At its core, 'Uday' utilizes a sophisticated static medical algorithm that enables Community Health Workers (CHWs) to meticulously compile comprehensive patient medical histories. Using a structured set of questionnaires tailored to each of 39 commonly reported symptoms, the software guides the CHWs through physical examinations, both general and complaint-based. Upon completing the assessment, 'Uday' synthesizes a Medical History Note (MHN) and an Examination History Note (EHN). It further consolidates all gathered data, including demographic details, medical history, family history, and vitals, and forwards it to a backend doctor. After reviewing the data, the doctor interacts with the patient and the CHW and writes prescriptions. The software also has separate modules for diagnostic tests, billing, etc. LoRaWAN technology is being integrated to serve areas with no bandwidth.
The proposed addition of the ODRD module represents a significant enhancement to the 'Uday' ecosystem. This comprehensive software will be deployed for the pilot. This add-on will incorporate modules addressing rare diseases, augmenting the capabilities to cater to an even broader spectrum of health conditions. As part of this initiative, we will establish the prevalence of each observed rare disease and generate risk scores specific to each ailment based on existing data. The existing data generated by the Uday software will also help us understand the target diseases to develop a self-learning algorithm.
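As a rough illustration of the prevalence step, the sketch below converts a case count into a prevalence per 10,000 people and compares it with the cutoffs quoted earlier in this proposal for the WHO, Taiwan, and the USA. The function names are hypothetical, and the count of 45 cases is invented purely for the example; it is not a pilot result. The denominator of 300,000 matches the planned pilot population.

```python
# Cutoffs per 10,000 population, as quoted in this proposal.
THRESHOLDS_PER_10K = {"WHO": 10.0, "Taiwan": 1.0, "USA": 6.4}

def prevalence_per_10k(cases, population):
    """Point prevalence expressed per 10,000 people surveyed."""
    if population <= 0:
        raise ValueError("population must be positive")
    if cases < 0 or cases > population:
        raise ValueError("cases must lie between 0 and population")
    return 10_000 * cases / population

def rare_under(prevalence):
    """For each definition, report whether the disease counts as rare."""
    return {name: prevalence < cutoff
            for name, cutoff in THRESHOLDS_PER_10K.items()}

# Hypothetical example: 45 cases found among 300,000 people surveyed.
example_prevalence = prevalence_per_10k(45, 300_000)
example_status = rare_under(example_prevalence)
```

The same disease can be rare under one definition and not under another, which is why the sketch reports a result for each cutoff rather than a single yes/no answer.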
Utilizing advanced analytical techniques, such as multivariate analysis based on Mahalanobis distance, we aim to discern any associations between collected dietary, climate, biomarker, proteomic, and clinical data and these rare diseases. With this multifaceted approach, 'Uday' continually evolves to provide robust, sophisticated solutions for rural communities. Because ODRD builds on this working system, our solution is already a prototype rather than just a concept.
Our tasks, as explained in this proposal, include i) fully developing the One Data for Rare Diseases (ODRD) software, including deep-learning algorithms and data analytics, ii) training an initial batch of community health workers (40), and iii) undertaking a survey among a target population of 300,000 using ODRD and testing its effectiveness.
Based on the outcome of this phase of the work, we will plan a larger multi-centric study targeting a population of about 1-2 million, for which we will approach the national government, other development organisations, and CSOs active in such areas. At this stage, global collaborations will be sought with selected institutions in the USA for collaborative and iterative knowledge development, best practice sharing, etc.
We have partially de-risked the software development process through our experience in developing software for Covid Severity Risk Score and Diabetes Risk Score in other projects.
We are really looking at developing a solution for the entire global south, and in stage III, our aim will be to undertake multi-country surveillance study through the ODRD.
We are applying for the Horizon challenge requesting financial support for software development, and completing the pilot scale survey among 300,000 people through 40 trained health workers.
We can see our path to establishing the ODRD as a global solution. We are confident if we can complete the first part of this ambitious project, we will receive significant traction from government and development agencies for further progress in Stages II & III.
Born and raised in a remote village in West Bengal, our Team Lead, Sohom, possesses a profound understanding and personal connection with rural communities. His first-hand experience of the unique challenges these communities face, especially in healthcare, has laid the foundation for the inception and development of our project, 'Uday'.
Over the years, working with NGOs and academia, Sohom has actively worked with these communities, frequently engaging with community members, health workers, doctors, and leaders. This hands-on approach has deepened his grasp of the local needs, preferences, and cultural nuances that significantly influence healthcare practices in these regions. Sohom has made a significant effort to establish and nurture strong networks within these communities. This approach facilitates an ongoing dialogue and feedback loop that allows us to continually refine and adapt 'Uday' to meet the specific needs of the communities we serve. Collaborations with local health officials, non-profit organizations, and community health workers have proven instrumental in implementing our project and making necessary adjustments on the ground.
Beyond this, Sohom has emphasized the importance of ongoing learning and adaptation, integrating his extensive field experiences into our software and hardware development process. This allows us to create intuitive, context-specific solutions that are user-friendly and relevant to the people we aim to serve. As an integral part of both the software and hardware development teams, Sohom's deep understanding of the communities and their healthcare needs has been pivotal in bringing our innovative, point-of-care devices and software solutions to those who need them most.
Sohom’s connection with these communities extends beyond professional boundaries. His commitment to enhancing healthcare access and quality, coupled with his shared history and vision, fosters trust and mutual respect that serves as the cornerstone of our project's success.


Chief Operating Officer


Assistant Professor