Overview
UCSF's data ecosystem underpins our work in AI. Explore the brief collection of key resources below.
Did we miss something? Let us know.
Requesting Data
- Health System: How to get identified data for quality improvement and clinical operations
https://scuba.ucsf.edu/request-access
- Research: How to request clinical data sets for research
https://data.ucsf.edu/cdrp/research
- Research: How to get de-identified clinical data for cohort studies, pattern recognition, and more
https://data.ucsf.edu/research/deid-data
- Research: How to get de-identified data that contains real dates and zip codes
https://data.ucsf.edu/research/limited-data
- Research: How to get identified data for research or recruitment
https://data.ucsf.edu/research/id-data
Data Warehouses, Catalogs, and Analytic Tools
- SCUBA offers a single, searchable portal for UCSF's data systems and reports
- The UCSF Data Resources for Research site provides a portal to many different data-related assets for AI researchers, including de-identified EHR databases
- The Information Commons Tools Lab offers high-powered, user-friendly data exploration and computational analysis tools, including:
  - PatientExploreR - Patient search and cohort selection from de-identified UCSF EHR data
  - CTAKES As-A-Service - Natural language processing system for extracting information from clinical free text in medical records
  - EMERSE (Electronic Medical Record Search Engine) - Search of UCSF machine-redacted clinical notes through a user-friendly interface
  - UCSFPhilter - User-friendly de-identification of clinical text
  - HUE - Web-based application for visually constructing and running SQL queries on any data hosted on the Information Commons
  - Spark, SparkML, PySpark, SparkR, SparkSQL - Distributed computing versions of popular language and AI tools
  - JupyterHub - A multi-user version of Jupyter Notebook for developing and sharing documents with live code, data output, equations, visualizations, and text