🚀 Another new preprint!
Today we share another new preprint. In this paper, we compared LLM-based information extraction tools on their suitability for extracting data from clinical documents.
We hope this work will empower researchers to make available the valuable data locked away in the PDF documents and hard copies still prevalent in health care.
We evaluated the tools in terms of usability, accuracy, robustness, and privacy. We found local multimodal LLMs like Gemma3 and NuExtract to be the best options.
Special thanks go to Aaron Yu, who worked on this project as a summer student last year.
*Thumbnail image generated using GPT-Image-1 (OpenAI)