Overview
medRxiv (pronounced 'med-archive') is a community-based, non-profit preprint server dedicated to the health sciences. Launched in 2019 and managed by Cold Spring Harbor Laboratory (CSHL) in a strategic partnership with BMJ and Yale University, it serves as a critical repository for manuscripts before they undergo formal peer review. Architecturally, medRxiv is designed to handle high volumes of medical literature, utilizing a rigorous vetting process that distinguishes it from general-purpose repositories. In the 2026 landscape, medRxiv is a foundational data source for medical AI, providing the 'raw' evidence used to train LLMs on clinical findings, pharmaceutical developments, and public health trends. The platform integrates JATS XML formatting to ensure high-fidelity machine readability, which is essential for bioinformaticians performing large-scale data mining. By assigning permanent DOIs (Digital Object Identifiers) via Crossref, medRxiv ensures that early-stage research remains citable and traceable throughout the scholarly communication lifecycle, bridging the gap between discovery and publication.
