Data Management

There are four D’s related to research and data: Data Collection, Data Management, Data Analysis, and Dissemination of Results. Institutions employ professionals across many fields to help ensure the security, integrity, and provenance of clinical and translational research data.  

This introductory online module will help learners identify the goals of research data management and summarize approaches for answering a research question. The learner will be able to define data collection methodology, describe and compare database design best practices, and discuss tips and tricks for collecting data for research purposes.


Resources

Data Integrity: Making sure that all data related to a research study are complete, reliable, consistent, accurate, and processed correctly. Integrity also means that the data are relevant to the purpose for which they were collected.

Data Provenance: Managing the source data and relevant documentation so that they can be used to reproduce the same results. The findings from the data should be reproducible, both by the same research team and by other research teams.

Data Security: Covers data storage, access, and sharing. Data must be stored in a manner that limits access to only those people who need the data. This means data must be protected from destructive forces and from the unwanted actions of unauthorized users. Sensitive information that is private or can be used to identify a person must be protected.

HIPAA Identifiers: There are 18 HIPAA direct and indirect identifiers. Identifiers are the information or data that can be a) used to identify, contact, or locate a single individual, or b) used, in combination with other sources, to identify a single individual.

Protected Health Information: When one or more HIPAA identifiers are used in conjunction with a person’s physical or mental health or condition, health care, or payment for that health care, the data become protected health information, or PHI.

De-identification: In many cases, before research data are shared, the dataset needs to be fully de-identified so that the people linked to the data cannot be identified using the dataset. De-identification is the removal of all 18 HIPAA identifiers.
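
As a minimal sketch of the idea, the Python example below drops identifier fields from a single record before it is shared. The field names (`name`, `mrn`, `phone`, and so on) and the `deidentify` helper are hypothetical, and the set shown is only an illustrative subset of the 18 HIPAA identifiers. A real project would need to cover all of them, including identifiers embedded in dates or free text, which this sketch does not attempt.

```python
# Hypothetical field names for illustration only; a real dataset will use
# its own column names, and this set is NOT the full list of 18 identifiers.
IDENTIFIER_FIELDS = {"name", "street_address", "phone", "email", "ssn", "mrn"}


def deidentify(record, identifier_fields=IDENTIFIER_FIELDS):
    """Return a copy of the record without the listed identifier fields."""
    return {
        field: value
        for field, value in record.items()
        if field not in identifier_fields
    }


# Example record mixing identifiers with research measurements.
record = {
    "name": "Jane Doe",
    "mrn": "12345",
    "phone": "555-0100",
    "systolic_bp": 128,
    "diagnosis": "hypertension",
}

clean = deidentify(record)
print(clean)  # {'systolic_bp': 128, 'diagnosis': 'hypertension'}
```

The function returns a new dictionary rather than modifying the original, so the source record stays intact for the team members who are authorized to see it.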

Data Lifecycle: The steps in the data lifecycle are Plan, Collect/Create, Process, Analyze, Share/Disseminate, Preserve, and Reuse.


Interview

Interview with Ceci Chamorro
Manager of Information Systems
Duke Office of Clinical Research