N3C Education

Mission

The mission of the N3C Education Tenant is to provide educators and learners a space to develop and practice the skills needed to analyze real-world data (RWD), that is, data from outside clinical trials, such as medical records, insurance claims, patient surveys, or census or community datasets. The Education Tenant provides simulated (also known as ‘synthetic’ or notional) datasets to learn on; a series of training tutorials; the Researcher’s Guide to the N3C, a virtual textbook of the concepts and skills needed to study RWD; and access to many of the shared resources available to the broader N3C community.

Since the Educational Tenant does not include any real patient data, only simulated data, there are no restrictions on recording or sharing screen views, making it a rich venue for training programs, courses, and workshops.

Overview

Real-world datasets such as electronic health records (EHRs) provide important information needed for advancing and transforming health care. Translational studies - studies that translate lessons and evidence learned from real-world health data into new treatments and improved clinical practice and public health care - require multidisciplinary knowledge and skills. In addition to standard research skills, conducting high-quality, rigorous, translational projects requires understanding medical vocabularies, data models, data engineering, statistical knowledge, and public health. Real-world datasets are also observational, including significant “messiness”, and Good Algorithmic Practices are needed to appropriately plan and conduct studies, and to interpret and communicate results to inform and improve clinical care and public health. Hands-on training using real-world data and tools is essential; however, most databases and analysis tools require special installation or infrastructure support; cannot be shared or recorded - even for classes - in order to protect patient privacy; or are lacking in realism or suitability for teaching.

To address these challenges, the National Center for Advancing Translational Sciences (NCATS), which hosts the N3C, worked with Tufts Medical Center, which generously agreed to make over 500,000 simulated patients available for educational use. The Tufts synthetic data has gone through extensive testing to mitigate any concerns about privacy, including an expert determination from an independent entity that specializes in privacy risk.

This data contains common elements, including conditions, devices, drugs, measurements, observations, procedures, and visits, that have been preliminarily verified to be highly concordant with the original EHR data across a number of domains and applications. However, this is simulated data, created for educational purposes.

The Educational Tenant also provides access to other notional data in the form of SynPUF and Synthea tables. Each of these data resources is formatted in the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), an “open community data standard, designed to standardize the structure and content of observational data and to enable efficient analyses that can produce reliable evidence.” OMOP formatting is key to illustrating how data from different health systems can be harmonized, increasing our ability to produce large, cross-institutional, broadly representative data sets that enable increasingly powerful studies.
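
As a rough illustration of what harmonization to a common data model means, the Python sketch below maps records from two hypothetical sites, each using its own diagnosis code, onto rows shaped like the OMOP condition_occurrence table. The site names, local codes, mapping table, and the concept ID 9000001 are placeholders invented for this example, not values from the Education Tenant; real concept IDs come from the OMOP standardized vocabularies.

```python
from datetime import date

# Hypothetical mapping from (site, local code) to a standard concept ID.
# 9000001 is a placeholder, not a real OMOP concept ID.
LOCAL_TO_STANDARD = {
    ("site_a", "DM2"): 9000001,
    ("site_b", "E11.9"): 9000001,
}

def to_condition_occurrence(site, record):
    """Convert one site-specific record into an OMOP-style row."""
    return {
        "person_id": record["patient"],
        "condition_concept_id": LOCAL_TO_STANDARD[(site, record["code"])],
        "condition_start_date": record["onset"],
        "condition_source_value": record["code"],  # original code is kept
    }

# Two made-up source systems that describe the same condition differently.
site_a = [{"patient": 1, "code": "DM2", "onset": date(2020, 3, 1)}]
site_b = [{"patient": 2, "code": "E11.9", "onset": date(2021, 7, 15)}]

harmonized = (
    [to_condition_occurrence("site_a", r) for r in site_a]
    + [to_condition_occurrence("site_b", r) for r in site_b]
)

# After harmonization, a single filter covers both sites.
people_with_condition = sorted(
    row["person_id"]
    for row in harmonized
    if row["condition_concept_id"] == 9000001
)
print(people_with_condition)  # [1, 2]
```

Because both sites end up with the same column names and the same concept ID, one query can be written once and run across institutions, which is the point of using a shared format.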

As in all N3C tenants, resources such as code templates (prewritten sets of commonly used programming code) and concept sets (prewritten sets of commonly used medical codes) are sharable, allowing instructors and learners to develop material during training that can be shared and used in research projects. Users of the Education Tenant also have access to the training and support materials developed for the other N3C tenants, including

Available Data

N3C ingests and harmonizes data from multiple sources for use in multiple tenants. These data ‘streams’ primarily include Electronic Health Records (EHRs) that are then harmonized to the OMOP Common Data Model.

The following streams are available in the Educational Tenant:

  • Tufts Synthetic Data: High-quality simulated/synthetic EHR data for learning purposes. (OMOP format.)
  • Notional Patient Data: Other simulated/publicly available EHR data for learning purposes. (OMOP format.)
  • External Datasets: Publicly available data (e.g., U.S. Census and regional data) for use alongside EHR data. For details on currently available datasets, and to request ingestion of additional ones, see the External Datasets page. (Various formats.)

Using the N3C Educational Tenant

While the N3C Education Tenant does not contain real patient data, the registration and access procedures are the same as for other N3C tenants. This helps users learn the regulatory processes involved and prepares educators and learners to continue on to real-world data research in the future.

Individuals wishing to use the N3C Educational Tenant must complete the following steps. If you have questions, visit the N3C Office Hours or submit a ticket to the N3C HelpDesk.

  1. Confirm there is a current Institutional Agreement for your institution. First, check the Signatories List at https://covid.clinicalcohort.org/tenant-duas/ to confirm that your institution has a signed institutional agreement with NCATS.

    • Note: N3C registration involves affiliating you as an individual with an institution that has signed an institutional agreement. Be sure to use your institutional email address throughout the process so that the linkage is clear.
    • The Educational Tenant uses the standard Institutional Data Use Agreement (DUA) form; however, this DUA form does not provide access to any synthetic or patient data; it ensures users will abide by appropriate use of the platform and resources provided. Accessing data requires a separate form called the Data Use Request (see below), which confirms that the individual has the appropriate training and approvals to view the specific data requested.
    • If your institution does not currently have a signed institutional agreement on file, you can download the DUA form and provide it to your institution’s signing official (often someone from the business or legal office). The signing official completes the form and then e-mails it to NCATS at NCATSPartnerships@mail.nih.gov. Once submitted, it usually takes 1-2 weeks for NCATS to review, countersign, and add your institution to the Signatories List.
      • While the form is being processed, you can work on the training requirements and review resources described in the steps below.
    • If you do not have an institutional affiliation, you can sign and submit a Citizen Scientist agreement certifying that you agree to the terms of service. Download and complete the form with your information as the signatory and scientific point of contact, entering “Citizen Scientist” for the Accessing Institution, Signatory Title, and Scientific Point of Contact Title. You can download and sign the form, or apply a digital signature (e.g., Adobe Acrobat or Microsoft Word). Email the completed form to NCATSPartnerships@mail.nih.gov. NCATS will review, sign and email you a completed form and add your name to the Signatories List (3-5 business days).
  2. Submit your individual registration to access the tenant and its training resources. Once your institution is on the Signatories List, register for an individual N3C account at https://clinicalcohort.org/registration/. To register, you will need:

    • your institutional email - individual accounts are linked, or “affiliated”, with the Institutional DUA
    • an authenticated login and a smartphone or tablet with a two-factor authentication application (e.g., DUO, Google Authenticator, Microsoft Authenticator)
      • There are four authenticated login options. In addition to NIH and HHS logins, many organizations participate in InCommon for authenticated login. If your organization is not in InCommon, you can create a Login.gov authenticated login. Be sure to use the same institutional email so that your login links to your registration and to the Institutional DUA.
    • an ORCID to ensure appropriate author attributions. If you do not have an existing ORCID, you can register at https://info.orcid.org/what-is-orcid/.

    When you submit your registration, you may be prompted to create your profile, or you can click the Edit My Profile link at the bottom of the screen. N3C uses Google Workspace, GitHub, and Slack, so be sure to enter your information to be added to those workspaces.

    • Note: You can provide a Gmail address here; it does not need to be your institutional email.

    After you submit your individual registration, your account will be reviewed for coverage by an Institutional DUA; when this is complete (usually 2-5 days), you will receive a “Welcome to N3C” email instructing you to log in at https://unite.nih.gov/. When you log on, you should see a button for the Education Tenant on the N3C Home Page. Enjoy exploring the N3C and watch the orientation videos to get started setting up your workspace and folders.

  3. Complete two required trainings:

    • NIH Information Security Training. Complete the NIH Information Security and Management Refresher course at https://irtsectraining.nih.gov/publicUser. You only need to complete the refresher course, which has 6 modules: 2024 Information Security, Insider Threats, Privacy Awareness, Records Management, Emergency Preparedness, and Rules of Behavior (60-90 minutes). After the 6th module, click the Print Certificate button and save a copy of your certificate. You will need to complete this training to submit your individual registration (step 2), and annually thereafter to maintain access.
    • Human Subjects Research Protection Training. Even though the Education Tenant contains no human subjects data, completion of this training within the last 3 years is required as part of general N3C requirements. This training is frequently offered through https://citiprogram.org, or a free course is available at HHS Human Subjects Research Protection Foundational Training (3-4 hours); consult your local Human Research Protection Program (HRPP) for your institution's specific requirements.
  4. Join or submit a Data Use Request (DUR) to access the synthetic datasets. Once you are logged in, set up, and ready to begin a course or other learning activity, you can request access to the relevant synthetic data set by clicking on (a) the Join a DUR button to find an existing DUR, or (b) the Create a New DUR button to submit a new request (see instructions, below). The Data Use Request is a standard form used to govern access to data resources.

If you have questions about any of these steps, please visit the N3C Support Desk at https://covid.clinicalcohort.org/support/.

Once registered, spend some time exploring the Educational Tenant's many resources. From the Home page, we recommend clicking on the Training Portal button and reviewing the Orientation videos and the Researcher’s Guide to the N3C, which provides a textbook of the concepts and skills needed to analyze real-world data. The Educational Tenant also provides a wealth of code templates and tutorials to help you learn how to design and develop a translational project.

Data Use Request (DUR) Submission Guidance

Individuals, instructors, and training program leads can create DURs to request use of one or more of the synthetic data sets for their classes or educational projects. Each request must provide the project, course, or program’s:

  • Title and abstract describing the educational goals and focal areas of the project. This information will be posted publicly on N3C dashboards and websites.
  • Rationale, which will be evaluated by the Education Tenant’s Data Access Committee (DAC) for educational focus. This information will remain private within the DAC.
  • Attestation and Agreement to the N3C Clinical Data Use Agreement, the N3C Download Policy, and the N3C Code of Conduct.

Data Use Requests for the Educational Tenant will generally be accepted for health-related, real-world data educational purposes. Educational Tenant DURs are valid for 1 year and renewable for continued access to approved project workspaces.
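
For planning purposes, the time limits mentioned on this page (security training renewed annually, human subjects training completed within the last 3 years, DURs valid for 1 year) can be tracked with a small date calculation. The Python sketch below is only an illustration: the item names and helper functions are invented for this example, and it assumes each window runs from the completion or approval date and ends on the anniversary of that date. How N3C actually computes these dates is not stated here, so confirm with the N3C Support Desk.

```python
from datetime import date

# Validity windows in years, taken from the text on this page.
# The dictionary keys are hypothetical labels, not N3C identifiers.
VALIDITY_YEARS = {
    "information_security_training": 1,
    "human_subjects_training": 3,
    "data_use_request": 1,
}

def add_years(d, years):
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def expires_on(item, completed_on):
    """Assumed expiry date: the anniversary of completion or approval."""
    return add_years(completed_on, VALIDITY_YEARS[item])

def is_current(item, completed_on, today):
    """True while `today` is before the assumed expiry date."""
    return today < expires_on(item, completed_on)

print(expires_on("data_use_request", date(2024, 2, 29)))  # 2025-02-28
print(is_current("human_subjects_training", date(2022, 5, 1), date(2025, 4, 30)))  # True
```

Passing today's date in as an argument, rather than calling date.today() inside the function, keeps the check easy to test with fixed dates.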

Special Considerations

While most N3C governed resources prohibit screenshotting, recording video, or sharing screen views of row-level data with others without the same level of data access, these restrictions do not apply to Education Tenant governed resources. Users are free to take screenshots, record videos, and screen-share data from the Tufts Synthetic Data, Notional Patient Data, and External Datasets data streams for educational purposes, and share them inside and outside of the Education Tenant and N3C platform.

Help & Support

Visit the N3C Support Desk to review frequently asked questions, join the weekly office hours for support, or submit a request for help.