Data Resources

Questions? Contact Artificial Intelligence

Overview

UCSF's data ecosystem underpins our work in AI. Explore the brief collection of key resources below. 

Did we miss something? Let us know.

Requesting Data

Data Warehouses, Catalogs, and Analytic Tools

  • SCUBA offers a single, searchable portal for UCSF's data systems and reports 
     
  • The UCSF Data Resources for Research site provides a portal to many different data-related assets for AI researchers, including de-identified EHR databases
     
  • The Information Commons Tools Lab offers high-powered, user-friendly data exploration and computational analysis tools, including:
    • PatientExploreR - Patient search and cohort selection from de-identified UCSF EHR data
    • cTAKES As-A-Service - Natural language processing system for extracting information from clinical free text in medical records
    • EMERSE (Electronic Medical Record Search Engine) - Search of UCSF machine-redacted clinical notes through a user-friendly interface
    • UCSFPhilter - User-friendly de-identification of clinical text
    • HUE - Web-based application for visually constructing and running SQL queries against any data hosted on the Information Commons
    • Spark, SparkML, PySpark, SparkR, SparkSQL - Distributed computing versions of popular language and AI tools
    • JupyterHub - A multi-user version of Jupyter Notebook for developing and sharing documents with live code, data output, equations, visualizations and text
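
Several of the tools above (PatientExploreR, HUE, SparkSQL) come down to selecting a patient cohort with a SQL query. As a rough illustration only, the sketch below runs that kind of query from Python against an in-memory SQLite database. The table name `encounters`, its columns, and the sample rows are all hypothetical stand-ins; the actual Information Commons schema is not described on this page, and SQLite is used here only so the example runs anywhere.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical encounter table.

    The schema and rows are invented for illustration; they are NOT the
    real UCSF de-identified EHR schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE encounters ("
        "patient_id TEXT, age INTEGER, diagnosis_code TEXT)"
    )
    conn.executemany(
        "INSERT INTO encounters VALUES (?, ?, ?)",
        [
            ("p1", 67, "E11.9"),
            ("p2", 45, "I10"),
            ("p3", 72, "E11.9"),
            ("p1", 67, "I10"),
        ],
    )
    return conn


def select_cohort(conn, diagnosis_code, min_age):
    """Return distinct patient IDs with the given diagnosis code and minimum age."""
    rows = conn.execute(
        "SELECT DISTINCT patient_id FROM encounters "
        "WHERE diagnosis_code = ? AND age >= ? "
        "ORDER BY patient_id",
        (diagnosis_code, min_age),  # parameters are bound, not string-formatted
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
cohort = select_cohort(conn, "E11.9", 65)
print(cohort)
```

In a JupyterHub notebook the same query text could, in principle, be pointed at whichever database connection the environment provides; only the connection setup would differ.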