Considering how much clinical data is entered by staff, generated by equipment, and shared between organizations, administrators and clinical staff can get lost in all of it. This is where data management comes in. Data management captures, maintains, and prepares data for secondary use in research and in clinical and business analytics. It is an end-to-end process encompassing every step from data creation through preparation and analysis, and it includes the following components.
Data Governance: Covered in more detail below, Data Governance is the framework, people, and systems that manage the data from various sources. Governance also includes terminology and ontology management to ensure there is a ‘single source of truth’.
Data Integration: Data in healthcare comes from various sources including claims and billing to EHRs and laboratory systems. Aggregating data from these sources requires data cleansing to ensure data quality, transforming the data into more standardized formats, and transporting the data from the source systems to data repositories and warehouses.
Data Enrichment: Data enrichment further prepares the data for analysis by blending in new data or converting raw text into meaningful, discrete information.
Data Storage: Data storage pulls together the data into computer systems required for analytics, archives the data, and purges data when needed.
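As a rough illustration of the integration step described above, the sketch below standardizes records from two source systems into one format before loading them into a repository. The field names, the code mapping, and the date formats are assumptions made for this example, not a description of any particular system.

```python
from datetime import datetime

# Hypothetical mapping from source-specific sex codes to one standard code.
SEX_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def standardize(record: dict, date_format: str) -> dict:
    """Cleanse and transform one source record into the warehouse format."""
    return {
        # Cleansing: trim stray whitespace and normalize case.
        "patient_id": record["patient_id"].strip().upper(),
        # Transforming: map free-form values to a standard code ("U" = unknown).
        "sex": SEX_CODES.get(record["sex"].strip().lower(), "U"),
        # Transforming: convert source-specific dates to ISO 8601.
        "service_date": datetime.strptime(
            record["service_date"], date_format
        ).date().isoformat(),
    }


# Illustrative rows from a claims system and an EHR (made-up data).
claims_rows = [{"patient_id": " p001 ", "sex": "Male", "service_date": "01/15/2024"}]
ehr_rows = [{"patient_id": "P001", "sex": "m", "service_date": "2024-01-15"}]

# Transporting: load both sources into one (in-memory) repository.
warehouse = [standardize(r, "%m/%d/%Y") for r in claims_rows]
warehouse += [standardize(r, "%Y-%m-%d") for r in ehr_rows]
```

Once both rows are in the same format, it becomes visible that they describe the same patient and service date, which is the kind of duplication that is hard to detect while the data remains in separate systems.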
The biggest challenges in health data management are: 1) the sheer volume of data and the high rate at which new data is created (velocity); 2) duplicate data within the enterprise; and 3) siloed healthcare data, which is kept in one system (and department) and is difficult to collect and merge with other data sources.
Imagine that your boss brings you a report on pediatric utilization generated by external consultants last year, and she asks you to create a new live dashboard of the same report. It should be easy, but your dashboard shows different numbers from the consultants’ report. ‘Pediatric’ means patients younger than 18, right? Or is it any patient treated by pediatric services, but not children treated elsewhere? Or any patient who is a dependent on their parents’ insurance policy (age < 26)? In this oversimplified example, data governance policies and definitions would have specified exactly what a pediatric encounter is, how to calculate pediatric utilization, and (sometimes) what this calculation should be used for.
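To make the discrepancy concrete, here is a minimal sketch that counts ‘pediatric’ encounters under each of the three definitions mentioned above. The `Encounter` fields and the sample data are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encounter:
    patient_age: int    # age in years at the time of the encounter
    service: str        # service line that treated the patient
    is_dependent: bool  # covered as a dependent on a parent's policy


# Three competing definitions of a "pediatric" encounter.
def by_age(e: Encounter) -> bool:
    return e.patient_age < 18


def by_service(e: Encounter) -> bool:
    return e.service == "pediatrics"


def by_dependent(e: Encounter) -> bool:
    return e.is_dependent and e.patient_age < 26


# Made-up encounters, chosen so that the definitions disagree.
encounters = [
    Encounter(7, "pediatrics", True),
    Encounter(16, "orthopedics", True),
    Encounter(17, "emergency", False),
    Encounter(19, "pediatrics", True),
    Encounter(24, "cardiology", True),
    Encounter(40, "cardiology", False),
]

definitions = {
    "age_under_18": by_age,
    "pediatric_service": by_service,
    "dependent_under_26": by_dependent,
}

# Count the same encounters under each definition.
counts = {
    name: sum(1 for e in encounters if rule(e))
    for name, rule in definitions.items()
}
print(counts)
```

The same six encounters produce three different totals (3, 2, and 4), which is why the dashboard and the consultants’ report can disagree even when both are computed correctly.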
Data Governance is an often misused term in healthcare. Sometimes rules are called ‘data governance’, sometimes a committee is called ‘data governance’, and even data entry popups and validation processes that ensure standardized entry are sometimes called data governance. All of these (and more) make up the system of data governance. From the Data Governance Institute:
“Data Governance is a system of decision rights and accountability for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods.”
Put simply, Data Governance is the system that controls the definitions and usage of data. AHIMA stresses that Data Governance is essential for the transparency, accuracy, and integrity of information shared with patients, as well as legal, regulatory, fiscal, operational, and historical requirements.
Healthcare is an environment full of policies and standards, so the lack of Data Governance maturity across the industry may surprise some. In 2017, only 44 percent of US health systems reported having any enterprise-wide data governance capability. Consider, however, that most Electronic Health Record adoption happened recently (during the past decade), and the value proposition of data warehouses and analytics is only beginning to pay off.
Even if you can find (and hire) the talent to help your Data Governance initiative, it is still a difficult task. Organizations may have very little Data Governance, or they may have vast rules, processes, and governing committees. 71 percent of health systems acknowledge data definition discrepancies between departments, with 37 percent resulting in interdepartmental conflict.
Principles of Data Governance
The following are AHIMA’s Information Governance Principles for Healthcare (IGPHC), a high-level framework for understanding data governance principles.
- Accountability – An accountable member of senior leadership, or a person of comparable authority, shall oversee the information governance program and delegate program responsibility for information management to appropriate individuals. The organization should adopt policies and procedures to guide its workforce and agents and ensure that its program can be audited.
- Transparency – An organization’s processes and activities relating to information governance should be documented in an open and verifiable manner. Documentation shall be available to the organization’s workforce and other appropriate interested parties within any legal or regulatory limitations and consistent with the organization’s business needs.
- Integrity – An information governance program shall be constructed so the information generated by, managed for, or provided to the organization has a reasonable and suitable guarantee of authenticity and reliability.
- Protection – An information governance program must ensure that the appropriate levels of protection from breach, corruption, and loss are provided for information that is private, confidential, secret, classified, essential to business continuity, or otherwise requires protection.
- Compliance – An information governance program shall be constructed to comply with applicable laws, regulations, standards, and organizational policies.
- Availability – An organization shall maintain information in a manner that ensures timely, accurate, and efficient retrieval.
- Retention – An organization shall maintain its information for an appropriate time, taking into account its legal, regulatory, fiscal, operational, and historical requirements.
- Disposition – An organization shall provide secure and appropriate disposition for information no longer required to be maintained by applicable laws and the organization’s policies.
How to Operationalize Data Governance
These principles can take many forms when operationalized as an enterprise-wide data governance system. The Advisory Board and Health Catalyst don’t agree on everything, but thanks to Dale Sanders’ influence, they operationalize a data governance system in similar ways. The following is the confluence of Dale’s perspective from each organization:
Balanced, Lean Governance
The Data Governance Committee should be a subcommittee of an existing governance structure that has authority. Governance is a business function, not an IT function. Therefore, COOs or CIOs who function across business lines might be effective at leading the data governance (sub)committee. Front-line staff who enter and use the data are essential to the committee. When in doubt, govern less and keep it small.
Data quality is considered the most important function of data governance because of the severe impact of poor-quality data. One third of quality is timeliness, so Data Governance Committees must be capable of quickly addressing issues and enforcing the changes required in source systems and workflows. The Data Governance Committee must also make the validity and completeness of data a leadership priority to complete the quality equation.
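The three quality dimensions named above (completeness, validity, and timeliness) can each be measured as a simple rate. The sketch below is one possible way to do so; the record layout, the date-of-birth rule, and the 24-hour lag threshold are assumptions chosen for the example, and a real committee would set its own.

```python
from datetime import date, datetime, timedelta

RECORDED = datetime(2024, 1, 1, 8)  # all sample records were captured at 08:00

# Hypothetical encounter records; field names are illustrative only.
records = [
    {"mrn": "A1", "dob": date(2010, 5, 1), "recorded": RECORDED, "loaded": datetime(2024, 1, 1, 9)},
    {"mrn": "A2", "dob": None, "recorded": RECORDED, "loaded": datetime(2024, 1, 1, 10)},
    {"mrn": "A3", "dob": date(2035, 1, 1), "recorded": RECORDED, "loaded": datetime(2024, 1, 1, 11)},
    {"mrn": "A4", "dob": date(1980, 3, 15), "recorded": RECORDED, "loaded": datetime(2024, 1, 3, 8)},
    {"mrn": "A5", "dob": date(1995, 7, 20), "recorded": RECORDED, "loaded": datetime(2024, 1, 1, 12)},
]


def completeness(rows, field_name):
    """Share of records in which the field is populated."""
    return sum(r[field_name] is not None for r in rows) / len(rows)


def validity(rows, field_name, rule):
    """Share of populated values that pass the rule (assumes at least one)."""
    present = [r[field_name] for r in rows if r[field_name] is not None]
    return sum(rule(v) for v in present) / len(present)


def timeliness(rows, max_lag):
    """Share of records loaded within max_lag of being recorded."""
    return sum(r["loaded"] - r["recorded"] <= max_lag for r in rows) / len(rows)


as_of = date(2024, 1, 1)
report = {
    "completeness": completeness(records, "dob"),
    # Assumed rule: a date of birth cannot be in the future.
    "validity": validity(records, "dob", lambda dob: dob <= as_of),
    # Assumed threshold: data should reach the warehouse within 24 hours.
    "timeliness": timeliness(records, timedelta(hours=24)),
}
print(report)
```

Reporting the three rates separately, rather than as a single score, makes it easier to see whether a problem should be fixed in a source system, a workflow, or a load schedule.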
Data access covers all members of the enterprise, as well as external stakeholders, community providers, and patients. The Data Governance Committee should create a productive tension in the opposite direction of data security and compliance, which tend to resist data access. It may be productive to combine the data governance and data security/compliance committees, or to share membership between them, so that data sharing and data protection are balanced.
Access to quality data is meaningless unless staff and leadership are proficient with the available information, tools, and techniques. Data literacy can be increased by:
- Providing training and education on data, and disseminating metadata (provenance), in the context of users’ roles in the organization
- Expanding access to data analysis tools
- Proliferating data-driven process improvement tools and programs
- Providing training and education on statistics and analytic techniques
- Championing data-driven decision-making and data transparency around quality and cost
It is essential to grow the types of information managed within the organization. This requires a multi-year strategy for data acquisition and data provisioning, and a constant effort to expand the data ecosystem that is available for analysis.
The Data Governance Committee should play an active role in ensuring that the data required by the organization’s strategic plan is available. Inevitably, there will be more demand for analytic services than there are resources available to meet that demand. The Data Governance Committee should balance top-down priorities with bottom-up requests from the clinical, business, and research units.
Master Data Management
The Data Governance Committee is the steward for defining master data and resolving conflicts in master data management. This role covers local data standards (e.g. facility codes, department codes, etc.), industry standards (e.g. SNOMED, LOINC, etc.), as well as calculations (e.g. LOS, provider attribution, readmission criteria, etc.).
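One way to picture a governed calculation is as a single, versioned definition that every report reuses instead of reimplementing. The sketch below does this for length of stay (LOS). The specific rule (same-day stays count as one day), the registry layout, and the version label are hypothetical choices for illustration, not an industry standard.

```python
from datetime import date


def length_of_stay_days(admit: date, discharge: date) -> int:
    """Hypothetical governed LOS rule: discharge date minus admit date,
    with same-day stays counted as one day."""
    if discharge < admit:
        raise ValueError("discharge date precedes admit date")
    return max((discharge - admit).days, 1)


# Illustrative registry of governed metrics: one definition, one steward.
GOVERNED_METRICS = {
    "LOS": {
        "steward": "Data Governance Committee",
        "version": "1.0",
        "description": "Discharge date minus admit date; minimum of 1 day.",
        "compute": length_of_stay_days,
    },
}

# Every dashboard and report looks up the same definition.
los = GOVERNED_METRICS["LOS"]["compute"]
print(los(date(2024, 1, 1), date(2024, 1, 4)))  # three-night stay
print(los(date(2024, 1, 1), date(2024, 1, 1)))  # same-day stay
```

If the committee later changes the rule, the change happens in one place and the version label records which definition a given report used.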
Provenance is the term used to describe the “data about the data”, also sometimes called metadata. The W3C defines data provenance as the “information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness.”
As healthcare data is shared with more people inside and outside of the organization, documenting the “who, what, where, and how” of the data becomes even more important if the data is to be trusted, understood, and used.
For example, before the data can be used, a provider accessing external patient data through the Health Information Exchange (HIE) may want to know this about the data:
- Who created the original health information?
- Where was the original health information created?
- When was the original health information created?
- What information has been changed?
- Why has the information been changed?
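The questions above map naturally onto a small provenance record attached to each piece of health information. The following is a minimal sketch with invented class and field names; it is not the W3C PROV data model or any HIE’s actual format.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Change:
    what: str             # what information was changed
    why: str              # why it was changed
    changed_by: str       # who made the change
    changed_at: datetime  # when the change was made


@dataclass
class ProvenanceRecord:
    created_by: str       # who created the original information
    created_where: str    # where it was created
    created_at: datetime  # when it was created
    changes: List[Change] = field(default_factory=list)

    def summary(self) -> dict:
        """Answer the five provenance questions for a reviewing provider."""
        return {
            "who": self.created_by,
            "where": self.created_where,
            "when": self.created_at.isoformat(),
            "what_changed": [c.what for c in self.changes],
            "why_changed": [c.why for c in self.changes],
        }


# Made-up example: an allergy entry that was later corrected.
record = ProvenanceRecord(
    created_by="Dr. A. Example",
    created_where="Example Community Clinic",
    created_at=datetime(2023, 3, 14, 10, 30),
)
record.changes.append(
    Change(
        what="allergy list",
        why="patient reported a new penicillin allergy",
        changed_by="Nurse B. Example",
        changed_at=datetime(2023, 6, 2, 9, 0),
    )
)
print(record.summary())
```

A provider viewing the record through the HIE could then see at a glance where the information originated and how it has changed since.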