Free data resources for student education and research projects

A foundational aspect of the NYU ACE program and the Health Care by the Numbers Curriculum is the use of authentic clinical data from both our local EMR and open data resources. We have developed two new educational clinical data tools for this project, both of which are freely and publicly available:

New York State SPARCS Data

The NYU ACE version of the NYS SPARCS database includes patient-level data on over 5 million inpatient discharges from 227 NY hospitals in 2012/2013. The site lets you explore the data by DRG code, provider, or hospital. Data extracts can be downloaded for projects and research studies.
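As a rough illustration of what a student project might do with a downloaded extract, the sketch below counts discharges and averages length of stay per DRG code. It assumes a CSV extract; the column names (`hospital`, `drg_code`, `length_of_stay`) and the sample rows are hypothetical and are not the actual SPARCS schema.

```python
import csv
import io
from collections import Counter

# Hypothetical extract: column names and rows are made up for illustration
# and do not reflect the real SPARCS file layout.
SAMPLE_EXTRACT = """hospital,drg_code,length_of_stay
Hospital A,470,3
Hospital A,291,5
Hospital B,470,2
Hospital B,470,4
Hospital C,291,6
"""

def discharges_by_drg(csv_text):
    """Count discharges per DRG code in a CSV extract."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["drg_code"] for row in reader)

def mean_stay_by_drg(csv_text):
    """Average length of stay in days per DRG code."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        days, n = totals.get(row["drg_code"], (0, 0))
        totals[row["drg_code"]] = (days + int(row["length_of_stay"]), n + 1)
    return {drg: days / n for drg, (days, n) in totals.items()}

counts = discharges_by_drg(SAMPLE_EXTRACT)
stays = mean_stay_by_drg(SAMPLE_EXTRACT)
for drg in sorted(counts):
    print(drg, counts[drg], round(stays[drg], 1))
```

With a real extract, the sample text would be replaced by the contents of the downloaded file, and the column names adjusted to match its header row.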

NYU Virtual Practice and Virtual Patient Panels

This site contains a fictitious health care group that consists of three practices:
  • Mott Community Practice – Patient population and payer mix are similar to those of a city hospital in NYC.
  • Women's Medical Group – Patient population is a mix of city and private NYC patients.
  • University Practice Associates – Patient population and payer mix are similar to those of a private practice in NYC.
The data for the providers and patients in this practice were created by aggregating problem lists and visit types from de-identified EMR data, payer and demographic data from SPARCS, and patient-level lab data and measurements from NHANES. Though they reflect real metrics, these virtual patients are not real individuals: the records have been significantly manipulated and combined with unrelated sources to create this practice data.
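The general idea of composing a fictitious patient from unrelated sources can be sketched as follows. This is only an illustration of the concept, not the method actually used to build the virtual practices; the source lists, field names, and values are all hypothetical, and each piece is drawn independently at random.

```python
import random

# All values below are made up for illustration; none are drawn from
# the EMR, SPARCS, or NHANES.
EMR_PROFILES = [
    {"problems": ["hypertension"], "visit_type": "follow-up"},
    {"problems": ["type 2 diabetes", "obesity"], "visit_type": "new patient"},
]
SPARCS_PROFILES = [
    {"payer": "Medicaid", "age_group": "50-69", "sex": "F"},
    {"payer": "Private", "age_group": "30-49", "sex": "M"},
]
NHANES_PROFILES = [
    {"hba1c": 5.6, "systolic_bp": 128},
    {"hba1c": 7.9, "systolic_bp": 142},
]

def make_virtual_patient(patient_id, rng):
    """Build one fictitious patient by drawing one record from each source."""
    patient = {"id": patient_id}
    for source in (EMR_PROFILES, SPARCS_PROFILES, NHANES_PROFILES):
        # Draws are independent, so no virtual patient maps to a real person.
        patient.update(rng.choice(source))
    return patient

rng = random.Random(0)  # fixed seed so the panel is reproducible
panel = [make_virtual_patient(i, rng) for i in range(5)]
```

Independent draws can produce clinically implausible combinations; a more realistic generator would presumably condition lab values on age, sex, and problem list.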

Freely Available Clinical Datasets: Resources for education and research projects


Looking for data from other U.S. States? Check out the State-Specific Data Sources from County Health Rankings.
Database/Resource Patient Data Provider Data Raw Data
New York State SPARCS
Millions of patient-level records for every inpatient admission in NY State 2009-13.
X X
CMS Provider Charge Data - data.gov
9.5 million records of the 100 most common inpatient services and 30 common outpatient services.
X X
CDC NHANES (National Health and Nutrition Examination Survey)
Detailed patient-level behavioral, societal, and laboratory data on a nationally representative sample of about 5,000 persons each year.
X X
CMS Timely and Effective Care measures
This data set includes provider-level data for measures of heart attack care, heart failure care, pneumonia care, surgical care, emergency department care, preventive care, and more.
X X

Clinical and Population Data Catalogs