Our project is dedicated to the analysis of "transparent diseases," a category of conditions characterized by a lack of visible symptoms and a difficult diagnostic process. Examples include Fibromyalgia, Endometriosis, and Meniere's Disease. These conditions differ in their symptoms, affect different organs and body systems, and can significantly impact mental well-being. Diagnosis often relies on a process of elimination, which can take years before an accurate determination is made. The limited availability of medical information further complicates the diagnosis and understanding of these conditions.
This project builds upon the foundational work of Valeria and Alon, who created a specialized questionnaire, conducted an in-depth analysis of fibromyalgia, and developed an informative website. Our initiative serves the organization "נשים, בריאות וחיים בכבוד" ("Women, Health and Life with Dignity"), founded in 2012 by Anat Horowitz, Tzvia Dei, Anna Shefer, and Dina Pozniack at the Haifa HUB. The organization advocates for women suffering from transparent diseases.
The previous team built its dictionaries by hand and extracted concepts with rules, so every new disease required starting from scratch, with no automated method to carry work over. In addition, the original website, built with Google Sites, suffered from performance and accessibility issues. Efforts to distribute the questionnaire within Arab communities also ran into difficulties, which led to low participation and representation in the research.
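To make the limitation of the manual approach concrete, the following is a minimal sketch of dictionary-based concept extraction. The dictionary entries, the concept labels, and the function name `extract_concepts` are hypothetical and chosen only for illustration; they are not the previous team's actual dictionary or code.

```python
import re

# Hypothetical hand-built dictionary for one disease: concept -> surface phrases.
# Illustrative entries only, not the previous team's actual dictionary.
FIBROMYALGIA_TERMS = {
    "PAIN": ["widespread pain", "muscle pain"],
    "FATIGUE": ["fatigue", "exhaustion"],
    "SLEEP": ["insomnia", "poor sleep"],
}


def extract_concepts(text, dictionary):
    """Return the sorted concept labels whose phrases occur in the text."""
    found = set()
    lowered = text.lower()
    for concept, phrases in dictionary.items():
        for phrase in phrases:
            # Word boundaries avoid matching inside longer words.
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                found.add(concept)
                break
    return sorted(found)


answer = "I suffer from muscle pain and constant exhaustion."
print(extract_concepts(answer, FIBROMYALGIA_TERMS))  # ['FATIGUE', 'PAIN']

# A text about a different disease matches nothing until a new
# dictionary is written by hand for that disease.
print(extract_concepts("I have vertigo and tinnitus.", FIBROMYALGIA_TERMS))  # []
```

The second call shows why this design does not scale: a dictionary written for fibromyalgia contributes nothing to Meniere's Disease, so the manual work has to be repeated for each condition.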
Our approach introduced an automated process based on few-shot learning with spacy-llm, a package that integrates large language models into spaCy pipelines. The process draws on the capabilities of a large language model, combining text from the questionnaire with insights from external sources. The website was rebuilt on the Flask framework using Python, HTML, CSS, and JavaScript, and was deployed on Google Cloud Platform for improved monitoring, security, and performance. The website and questionnaire were professionally translated into Arabic with medical accuracy, which significantly boosted outreach and participation within the Arab community.
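The few-shot idea can be illustrated with a small, library-agnostic sketch: a handful of labelled examples are placed in the prompt ahead of the new text, so the model can infer the extraction task without a hand-built dictionary. This is not spacy-llm's API and not the project's actual prompt; the example sentences, the `SYMPTOM` label, and the helper names `build_few_shot_prompt` and `parse_answer` are assumptions made for illustration, and the call to the language model itself is left out.

```python
# Hypothetical labelled examples; in practice these would come from
# questionnaire answers annotated by the team.
FEW_SHOT_EXAMPLES = [
    ("My whole body aches and I cannot sleep.", ["body aches", "cannot sleep"]),
    ("I feel dizzy and my ears ring.", ["dizzy", "ears ring"]),
]


def build_few_shot_prompt(text, examples, label="SYMPTOM"):
    """Assemble a few-shot extraction prompt as a plain string."""
    lines = [
        f"Extract every {label} mentioned in the text.",
        "Answer with a comma-separated list, or 'none'.",
        "",
    ]
    for example_text, spans in examples:
        lines.append(f"Text: {example_text}")
        lines.append(f"{label}: {', '.join(spans) if spans else 'none'}")
        lines.append("")
    # The new text goes last, with the answer slot left open for the model.
    lines.append(f"Text: {text}")
    lines.append(f"{label}:")
    return "\n".join(lines)


def parse_answer(raw_answer):
    """Turn the model's comma-separated reply into a clean list of spans."""
    cleaned = raw_answer.strip()
    if not cleaned or cleaned.lower() == "none":
        return []
    return [part.strip() for part in cleaned.split(",") if part.strip()]


prompt = build_few_shot_prompt(
    "Severe pelvic pain during my period.", FEW_SHOT_EXAMPLES
)
print(prompt)
```

Supporting a new disease under this scheme means supplying a few new examples rather than rebuilding a dictionary, which is the practical advantage over the rule-based method.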
Prompt Engineering: Used with GPT-3.5 Turbo in both the information retrieval system and the storyline generation system. The engineered prompts also shape the outputs to make them more reliable, especially in the information retrieval system.
Anonymity in Outreach: A website feature enabling women to confidentially reach project partners and associations.
Model Configuration Updates: Improving the large language model's performance through configuration adjustments. We propose expanding the current dataset and employing more advanced methods such as transfer learning or meta-learning. These approaches are expected to improve both the accuracy and the speed of the disease analysis.
Language Processing Adaptation: Extending our methodology to process native Hebrew and Arabic text more directly.