Galen: the Medical AI Chatbot
Galen is a solution to the profound challenges faced by the global community of rare disease patients and their caregivers. Given the nature of rare diseases, patients often go through a 'diagnostic odyssey' which involves years of uncertainty, numerous tests, and visits to multiple specialists. Globally, it is estimated that rare diseases affect approximately 400 million individuals. Despite the name, the collective impact of these diseases is significant, yet the research and development for treatments are often limited due to the relatively small patient population for each disease.
A key contributing factor to this problem is the difficulty in accessing and synthesizing the vast and often siloed knowledge about rare diseases. Medical professionals may encounter a particular rare disease only once or twice in their career, if at all, leading to potential misdiagnoses or delayed diagnoses. Similarly, researchers are hampered by the lack of a centralized source of comprehensive, up-to-date information on these diseases.
Galen, a conversational AI chatbot, is designed to improve the diagnostic journey and enhance the efficiency of research related to rare diseases. By using AI to analyze the vast corpus of medical literature and clinical trial data, it can assist in identifying potential diagnoses based on reported symptoms and provide a ranked list of possibilities for further investigation by medical professionals. It can also uncover associations between symptoms, treatments, and outcomes that might not be readily apparent, thus aiding in treatment decision-making.
For researchers, the AI can collate and summarize existing literature, identify knowledge gaps, and even suggest potential hypotheses or experiments. This could significantly speed up the research process, reducing the resources (time, human, financial) needed to arrive at results and thereby reducing the environmental footprint of rare disease research. The AI can also monitor the latest published research and flag relevant articles, allowing researchers to stay updated on the latest developments.
Therefore, Galen targets a major bottleneck in the rare disease domain: the ability to efficiently harness, synthesize, and apply the wealth of existing knowledge for the benefit of patients and researchers alike.
Galen is a conversational AI chatbot, specifically developed for the domain of rare diseases. It works by utilizing advanced language models trained on a vast corpus of medical literature, clinical trial data, and other reliable sources of rare disease knowledge. It's designed to provide comprehensive insights into rare diseases, serving as a powerful and accessible tool for patients, healthcare providers, and researchers alike. You can think of it as a version of ChatGPT that graduated from medical school, then further specialized in rare diseases.
For patients and caregivers, the AI chatbot provides easy access to information on various rare diseases. The user can input symptoms or conditions, and the chatbot will deliver a ranked list of potential diagnoses, along with details on each one, including typical symptoms, standard treatments, and the latest research findings. This can help guide users to appropriate medical specialists and resources. (*Important to note: Galen will always defer to the diagnostic expertise of human physicians.)
For medical professionals, the AI chatbot acts as an instant, accessible, and up-to-date medical reference tool. It can assist in identifying potential diagnoses based on the symptoms presented and can provide a range of information about each disease, including standard treatment approaches and the latest research developments.
For researchers, the AI chatbot offers a powerful research aid. It can provide concise summaries of existing literature on a particular disease, identify knowledge gaps, and even suggest potential hypotheses based on existing data. The tool can also assist in recommending published research, alerting the researcher to relevant articles for their questions.
The technology behind Galen is a large language model, akin to GPT-4, trained specifically for the domain of rare diseases. By processing vast amounts of text data from various sources, this model has learned to generate human-like text that is contextually relevant to the input it's given. It operates by predicting the probability of a word given the previous words used in the text, enabling it to generate complete, coherent, and relevant responses to user queries.
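To make the idea of "predicting the probability of a word given the previous words" concrete, here is a deliberately tiny sketch. It is not Galen's model: it is a bigram counter that conditions on only the single previous word, whereas a large language model conditions on the whole preceding context with a neural network. The toy corpus and all function names are assumptions made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Return P(next | prev) as a dict; empty if prev was never seen."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in followers.items()}


def generate(counts, start, max_words=5):
    """Greedy generation: always append the most probable next word."""
    words = [start]
    for _ in range(max_words):
        probs = next_word_probs(counts, words[-1])
        if not probs:
            break
        words.append(max(probs, key=probs.get))
    return " ".join(words)


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    "rare diseases affect many patients",
    "rare diseases delay diagnosis",
    "rare diseases affect many families",
]

model = train_bigram(toy_corpus)
probs = next_word_probs(model, "diseases")
sentence = generate(model, "rare", max_words=3)
```

In this toy corpus, "affect" follows "diseases" in two of three sentences, so the model assigns it a probability of 2/3. A real language model does the same kind of thing at a far larger scale, with learned parameters instead of raw counts.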
Through its natural language processing capabilities, the chatbot can understand complex questions, context, and semantics, and it can generate responses that provide detailed, accurate, and useful information to the user. The entire system is designed with a user-friendly interface, ensuring that anyone, regardless of their technical ability, can use it effectively.
Galen, therefore, serves as a centralized, accessible, and scientifically-informed source of knowledge for rare diseases, significantly improving the efficiency and effectiveness of diagnosis, treatment, and research efforts in this domain.
Galen aims to serve three primary audiences through its straightforward, conversational interface: rare disease patients and their caregivers, healthcare providers, and medical researchers.
Rare Disease Patients and Caregivers: These individuals often find themselves on a convoluted journey to find a diagnosis, marked by uncertainty and frustration. The scarcity of accessible and comprehensible information on rare diseases can lead to prolonged diagnostic journeys. Galen will provide reliable and digestible information about a wide range of rare diseases, empowering patients and caregivers to pursue suitable medical consultations and treatments. By potentially shortening the diagnostic odyssey, Galen could help save these individuals time, emotional strain, and resources.
Healthcare Providers: Given the nature of rare diseases, it can be a daunting task for clinicians to stay informed about all possible rare diseases they might encounter. Galen will serve as an easily accessible and up-to-date medical reference tool, helping in both the diagnosis and treatment process. This would enable healthcare providers to offer more informed, effective care to their patients.
Medical Researchers: Researchers working in the rare disease space face the significant challenge of sifting through enormous amounts of literature to find pertinent information. Galen can greatly speed up this process by providing concise summaries of existing literature, identifying knowledge gaps, and suggesting potential research directions based on the available data.
As I continue to develop Galen, I've made a point of seeking input from these key user groups to ensure that I fully understand their needs and challenges. I have reached out to rare disease patient advocacy groups, engaged with medical professionals, and spoken with researchers in the field. Their insights, gleaned from interviews, surveys, and usability tests, are critical to my development process, helping me tailor Galen to their needs.
Galen's positive impact extends beyond these user groups. By enhancing the efficiency of rare disease research, it has the potential to expedite the development of new treatments. Moreover, by reducing the time and resources devoted to the diagnostic process and research, Galen could also help lessen the environmental footprint of the rare disease healthcare sector.
Despite being a solo innovator, I believe I am uniquely positioned to deliver this solution due to my close connection to the intersecting communities that Galen serves - the medical community, the AI and technology sector, and the communities grappling with rare diseases.
I am a physician. My experience with medical caregiving both here and abroad has given me an understanding of the healthcare landscape and patient needs, especially in the context of rare diseases. I have been a part of this community and have observed its pain points firsthand.
Here is where things get interesting. While I am indeed an MD, I am also an AI engineer. While working at Harvard Medical School on deep learning systems, I learned to combine my medical knowledge with advanced technology to address healthcare challenges.
Among the problems I worked on were rare cancers, such as bone and soft tissue sarcomas. I learned to map diagnostic and prediction-based medical problems to computational models, using AI to help solve them. This gave me a new perspective and the skills to build AI systems that could impact healthcare at a global scale. It is also where I became keenly aware of the problem the Horizon Prize is trying to solve: rare diseases face a data scarcity issue. Because they are rare by nature, it is difficult to consolidate and synthesize information on them, leaving everyone in the dark.
Moreover, my personal experience growing up in rural Ohio and my exposure to communities in Sub-Saharan Africa and the Himalayas have made me acutely aware of the access-to-care issues many face. This understanding deeply influences the design and development of Galen, keeping a sharp focus on accessibility and on the unique challenges of finding reliable information about rare diseases.
By leveraging my diverse experiences and maintaining an open channel for community feedback, I aim to ensure that Galen remains a user-centered tool, providing real value to those grappling with rare diseases. My proximity to these communities, both as a healthcare provider and an innovator, makes me well-suited to design and deliver this solution.
- Improve the rare disease patient diagnostic journey – reducing the time, cost, resources, and duplicative travel and testing for patients and caregivers.
- United States
- Prototype: A venture or organization building and testing its product, service, or business model, but which is not yet serving anyone
A couple of things:
1. The code for Galen is already written. Here is a snapshot of the code notebook being used to fine-tune a 40-billion-parameter large language model (Falcon-40B) on a corpus of biomedical dialogue:

2. The (growing) dataset on rare diseases. That "corpus of biomedical dialogue" mentioned in item #1 is an expanding list of sample prompts and responses specific to medicine. This is the kind of dataset needed to fine-tune an LLM into a conversational chatbot. The LLM learns the subject matter of the dialogue pairs (in this case, rare diseases), and the result is a new model that has "learned" about the dataset. The more samples you have, the better. I currently have around 70,000 dialogue pairs for medicine, and the number is growing.
This particular portion of the training data deals with 48,XXYY syndrome - a disease seen in boys that is caused by a nondisjunction error, resulting in an extra X and an extra Y chromosome:
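As a rough illustration of how a prompt/response pair of this kind might be turned into a single training example for fine-tuning, here is a minimal sketch. The field names (`prompt`, `response`, `text`), the question/answer template, and the example pair itself are assumptions for illustration. The pair simply restates the description above and is not taken from the actual dataset.

```python
import json

# Hypothetical template; real fine-tuning setups use many different formats.
PROMPT_TEMPLATE = "### Question:\n{prompt}\n\n### Answer:\n{response}"


def format_pair(pair):
    """Merge one prompt/response pair into a single training string."""
    return PROMPT_TEMPLATE.format(
        prompt=pair["prompt"].strip(),
        response=pair["response"].strip(),
    )


def to_jsonl(pairs):
    """Serialize pairs as JSON Lines: one {"text": ...} record per line."""
    return "\n".join(json.dumps({"text": format_pair(p)}) for p in pairs)


# Illustrative pair only, paraphrasing the description in the text above.
example_pairs = [
    {
        "prompt": "What causes 48,XXYY syndrome?",
        "response": (
            "48,XXYY syndrome is seen in boys and is caused by a "
            "nondisjunction error that results in an extra X and an "
            "extra Y chromosome."
        ),
    }
]

jsonl_text = to_jsonl(example_pairs)
```

Each line of `jsonl_text` is one self-contained training record, which is a common shape for feeding dialogue pairs to a causal language model during fine-tuning.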

Applying for the Horizon Prize holds immense value for the development and potential impact of my AI chatbot, Galen. As an individual innovator currently self-funding this project, several barriers exist that the Prize can help overcome.
Financial: The project, while promising, requires substantial resources for its continued development, maintenance, and scaling. Funds are needed for data acquisition, infrastructure upgrades, user testing, and deployment. Winning the $150,000 prize would significantly alleviate these financial constraints, allowing for robust development and enabling Galen to reach its full potential.
Technical: While I have a medical background as an MD and am enhancing my computer science skills at Carnegie Mellon University, additional technical expertise would accelerate Galen's development and fine-tuning. Access to a network of experts and fellow innovators via the Horizon and MIT Solve community would be invaluable, providing opportunities for collaboration, advice, and potentially even partnerships.
Market Barriers: As a new entrant in the healthcare tech space, getting the solution recognized and accepted by patients, healthcare providers, and researchers poses a significant challenge. I am fortunate to have brand-name value in my training pedigree, having spent time at Harvard Medical School and now pursuing a master's degree in computer science and AI at Carnegie Mellon. However, that alone is not nearly enough traction to stand out in the LLM space right now. Galen is built for an altruistic mission, and it needs ways to be discovered and to impact lives. The visibility and credibility provided by the Horizon Prize would help overcome the barriers to entry in the LLM market, boosting trust in Galen and increasing its adoption.
Cultural: Changing behaviors, especially in healthcare, is always a challenge. (If you want proof of this, consider the fact that we are the reason fax machines and pagers still exist.) A solution like Galen requires acceptance not only from patients but also from physicians and researchers who are typically used to traditional modes of information retrieval and patient care. It's my hope that the endorsement from a respected entity like MIT, when combined with a noble mission and fascinating tech, will help overcome cultural resistance to AI technology in healthcare.
Applying for the Horizon Prize offers not just the financial support needed but also the opportunity to leverage the knowledge, network, and credibility associated with MIT Solve and Horizon. With these resources, I could significantly advance and distribute Galen, overcoming the barriers that currently limit its growth and impact.
My deep connection to the communities served by Galen stems from three key aspects: my medical background, passion for AI innovation, and commitment to improving access to care.
Medicine: My personal journey has been shaped by a mission to do good and help others, especially those in need of medical care. These desires are why I went to medical school. The practice of medicine has given me a deep understanding of the human body and the challenges faced by patients and healthcare providers, especially for those dealing with rare diseases.
AI Innovation: My recent work at Harvard Medical School involved developing and researching deep learning systems for surgery and oncology. This experience revealed to me my (previously unrealized) aptitude for AI development and its immense potential to scale healthcare solutions for positive impact. Driven by the transformative potential of AI, I pivoted from my original path towards surgery, choosing instead to pursue a Master's degree in computer science and AI at Carnegie Mellon University. I now plan to use AI and other exponential technologies to apply my medical knowledge for positive change at scale.
Access to Care: Given the hi-tech nature of this whole endeavor, you might be surprised to learn that I actually grew up on a cattle farm. (In fact, that's where I am sitting as I write this!) Growing up in rural Ohio, I witnessed firsthand the barriers to healthcare access due to geographic and financial constraints. This early exposure was later compounded by experiences I had abroad. During a church missionary trip in South Korea, and later on medical mission trips to Sub-Saharan Africa and the Himalayas, I saw communities struggling with limited healthcare resources. These experiences instilled in me a deep commitment to enhancing healthcare accessibility, a commitment that drives my current work.
These three aspects all influence Galen, which I hope can help address the unique challenges associated with rare diseases. Galen leverages the power of AI to provide comprehensive, accurate, and easily accessible information on rare diseases, thereby serving as a valuable tool for patients, caregivers, healthcare providers, and researchers. By pooling data and knowledge about rare diseases, Galen aims to reduce the informational barriers that can prolong diagnostic journeys and hinder effective treatment.
I hope that Galen can contribute to eliminating the helplessness often associated with rare diseases, drawing on my experience in medicine and AI and on my commitment to better healthcare access to make a meaningful impact in the rare disease community.

Founder & Physician