Reducing Hallucinations in Medical AI: A Knowledge Graph-Augmented Retrieval System for Evidence-Based Age-Related Macular Degeneration Information
Large language models (LLMs) have significantly advanced natural language generation but frequently produce unverified outputs, compromising their reliability in critical medical applications. We present a framework that combines structured biomedical knowledge with LLMs through retrieval-augmented generation (RAG) to address this challenge. Our system automatically extracts causal relationships from 5,000 age-related macular degeneration (AMD) abstracts, building a knowledge graph with over 43,200 validated relations. Using vector-based retrieval, the framework generates contextually relevant and verifiable responses, each linked directly to its supporting clinical evidence. We evaluated our approach across eight language models, including open-source models from 1B to 70B parameters (Llama, Mistral, Qwen, SmolLM) and GPT-5-mini, on 3,000 queries with varying question types and reasoning complexity. Smaller models (3B parameters) showed substantial improvements: SmolLM3-3B reached 95.6% accuracy on single-hop true/false questions, up from a 78.2% baseline. The medium-scale model Mistral-7B demonstrated the largest gains on complex multi-hop reasoning, improving from 45% to 76% accuracy on multiple-choice questions. Larger models (70B parameters) showed minimal improvement because their baseline performance was already high (97-98% accuracy). Our results demonstrate that knowledge graph-augmented RAG enables resource-efficient smaller models to approach or match the performance of larger models, reducing hallucinations while maintaining computational efficiency for clinical deployment. [PDF](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11298209)
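The retrieval step summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for a learned embedding model, the three triples are toy examples, and the `abstract-00N` source identifiers, function names, and prompt wording are all hypothetical. It only shows the general idea of ranking knowledge graph relations by vector similarity and passing them to a model together with their evidence links.

```python
import math
import re
from collections import Counter

# Hypothetical toy relations: (subject, relation, object, source id).
# The source ids are placeholders, not real citations.
TRIPLES = [
    ("smoking", "increases risk of",
     "age-related macular degeneration", "abstract-001"),
    ("anti-VEGF therapy", "treats",
     "neovascular age-related macular degeneration", "abstract-002"),
    ("drusen accumulation", "is associated with",
     "early age-related macular degeneration", "abstract-003"),
]


def embed(text):
    """Toy bag-of-words vector (assumption: a real system would use
    a learned sentence-embedding model instead)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=2):
    """Return up to k relations most similar to the query,
    dropping relations with zero similarity."""
    q = embed(query)
    scored = [
        (cosine(q, embed(f"{s} {r} {o}")), (s, r, o, src))
        for s, r, o, src in TRIPLES
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [triple for score, triple in scored[:k] if score > 0]


def build_prompt(query, k=2):
    """Assemble a prompt that lists retrieved relations with their sources."""
    facts = [
        f"- {s} {r} {o} [source: {src}]"
        for s, r, o, src in retrieve(query, k)
    ]
    return (
        "Answer using only the evidence below.\n"
        + "\n".join(facts)
        + f"\nQuestion: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("Does smoking increase the risk of macular degeneration?"))
```

Keeping the source identifier attached to each retrieved relation is what would let a generated answer point back to its supporting abstract, which is the verifiability property the abstract emphasizes.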