Is Healthcare Data Different?

My calculations prove Data is Data

Sometimes it feels like everyone is whining that healthcare data is difficult to work with. Everyone, that is, except engineering-trained computer scientists, who say “data is data.” These folks don’t think healthcare data is any more difficult to work with than any other data. Which of course is true: if you look up ‘data’ in the dictionary twice, it is going to say the same thing both times. The mathletes out there even have a rule for it, called the reflexive property: let data = a, then a = a. Or something like that.

My point is, healthcare data is different because healthcare is different. More specifically, healthcare data reflects the differences found within the healthcare context. Some engineers strip the context from the data before solving problems, and context is everything with healthcare data. Look, if industry leaders like Health Catalyst complain that healthcare data is a challenge, then I would just go with it.

[table id=2 /]When scanning the table of ‘healthcare data challenges’, one might wonder whether these challenges are unique, or even all that tough: “These challenges don’t look so bad. Every one of them has already been solved somewhere, or at least could be!” Our engineer friends correctly concluded this before they even looked at the table.

If this is true (and it might be), what is the deal with health data? Is it that 1) healthcare has never attempted to fix these problems using solutions from other industries, or 2) non-healthcare solutions applied to healthcare don’t work right or don’t catch on? What do you think?

 

Data Quality

Data quality is very simply defined as a measure of whether data “is fit for its intended purpose”. Easy, right? The key challenge in assessing quality is twofold: 1) you must know the purpose, and 2) you must understand the specific characteristics of the data that meet (or don’t meet) your needs.

If you depend on data to make good decisions, you should know how to describe the data and articulate its strengths and limitations. AHIMA’s Data Quality Management (DQM) Model is a good place to start. The DQM is a blueprint for institutional data governance that is updated every few years. As such, it contains the characteristics and goals of an enterprise data model, including one of the better models for describing healthcare data quality.

Data Quality Characteristics (from DQM) [mfn]Davoudi, S., Dooling, J., Glondys, B., Jones, T., Kadlec, L., Overgaard, S., Ruben, K., & Wendicke, A. (2015). Data Quality Management Model (Updated). Journal of AHIMA, 86, 62-65.[/mfn]

[thim-icon-box line_after_title=”true” desc_content=”The data is free from identifiable errors” custom_font_weight_desc=”” link_to_icon=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-bullseye” font_awesome_icon_size=”28″ icon_color=”#dd9933″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” widget_background=”bg_color” title=”Accuracy”][thim-icon-box line_after_title=”true” desc_content=”Data is easily obtainable with strong protections” custom_font_weight_desc=”” link_to_icon=”” icon_type=”font-awesome” font_awesome_icon=”fab fa-accessible-icon” font_awesome_icon_size=”28″ icon_color=”#dd9933″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Accessibility”][thim-icon-box line_after_title=”true” desc_content=”All required data items are included” custom_font_weight_desc=”” link_to_icon=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-shapes” font_awesome_icon_size=”28″ icon_color=”#dd9933″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Comprehensive”][thim-icon-box line_after_title=”true” desc_content=”Data is reliable and the same across applications, systems, locations” custom_font_weight_desc=”” link_to_icon=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-arrows-alt” font_awesome_icon_size=”28″ icon_color=”#dd9933″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Consistency”][thim-icon-box line_after_title=”true” desc_content=”Data are up-to-date when the data value has not changed and is still current” custom_font_weight_desc=”” link_to_icon=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-clock” font_awesome_icon_size=”28″ icon_color=”#dd9933″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Currency”][thim-icon-box line_after_title=”true” desc_content=”The specific meaning of a data element fits your purpose” custom_font_weight_desc=”” 
link_to_icon=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-book” font_awesome_icon_size=”28″ icon_color=”#dd9933″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Definition”][thim-icon-box line_after_title=”true” desc_content=”The level of detail of the data fits your purpose” custom_font_weight_desc=”” link_to_icon=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-search-plus” font_awesome_icon_size=”28″ icon_color=”#dd9933″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Granularity”][thim-icon-box line_after_title=”true” desc_content=”The range of quantitative data measures and dimensions of categorical data are collected in a way that fits your purpose” custom_font_weight_desc=”” link_to_icon=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-ruler” font_awesome_icon_size=”28″ icon_color=”#dd9933″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Precision”][thim-icon-box line_after_title=”true” desc_content=”Is the data useful for the purposes for which it was collected? Relevancy may seem to be a general characteristic that encompasses all of these, but instead think of it as a dimension of quality for data not currently being used.” custom_font_weight_desc=”” link_to_icon=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-tools” font_awesome_icon_size=”28″ icon_color=”#dd9933″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Relevancy”][thim-icon-box line_after_title=”true” desc_content=”Is the data updated often enough to suit the specific application or need?” custom_font_weight_desc=”” link_to_icon=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-hourglass-half” font_awesome_icon_size=”28″ icon_color=”#dd9933″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Timeliness”]
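To make a few of these characteristics concrete, here is a minimal Python sketch that scores a tiny, made-up record set on three of them: Comprehensive (are the required items present?), Accuracy (are there identifiable errors?), and Timeliness (was the record updated recently enough?). The field names, the "plausible" A1c range, and the one-year freshness window are all assumptions chosen for illustration. They do not come from the DQM, and they are not clinical guidance.

```python
from datetime import date

# Hypothetical patient records; field names and values are illustrative only.
records = [
    {"patient_id": "A1", "birth_date": date(1980, 5, 1), "a1c": 6.9, "last_updated": date(2024, 1, 10)},
    {"patient_id": "A2", "birth_date": None, "a1c": 7.4, "last_updated": date(2021, 3, 2)},
    {"patient_id": "A3", "birth_date": date(1975, 9, 12), "a1c": 71.0, "last_updated": date(2024, 2, 20)},
]

# Assumption: the purpose at hand needs these three fields on every record.
REQUIRED_FIELDS = ("patient_id", "birth_date", "a1c")


def comprehensiveness(rows, required=REQUIRED_FIELDS):
    """Share of rows in which every required field is present."""
    complete = sum(all(r.get(f) is not None for f in required) for r in rows)
    return complete / len(rows)


def accuracy_flags(rows, low=3.0, high=20.0):
    """IDs of rows whose A1c falls outside an assumed plausible range.

    A range check only catches *identifiable* errors (71.0 is probably 7.1
    with a misplaced decimal); a wrong but plausible value passes untouched.
    """
    return [
        r["patient_id"]
        for r in rows
        if r["a1c"] is not None and not (low <= r["a1c"] <= high)
    ]


def timeliness(rows, as_of, max_age_days=365):
    """Share of rows updated within max_age_days of the as_of date."""
    fresh = sum((as_of - r["last_updated"]).days <= max_age_days for r in rows)
    return fresh / len(rows)


if __name__ == "__main__":
    print("comprehensiveness:", comprehensiveness(records))
    print("accuracy flags:", accuracy_flags(records))
    print("timeliness:", timeliness(records, as_of=date(2024, 6, 1)))
```

Notice that every threshold here depends on the purpose. A one-year window might be fine for a population report and useless for a medication list, which is the whole point of "fit for its intended purpose".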

Do You Get IT?

[h5p id=”19″]

 

Additional Measures for Data Quality

[thim-icon-box line_after_title=”true” desc_content=”Healthcare data is often described as having a large number of “missing“ values. For example, if you pulled data for all of the patients seen in a week, there might be far fewer with diabetes than expected. This mismatch between expectation and reality is due in part to how fragmented healthcare is, but also to documentation that does not always reflect reality.” custom_font_weight_desc=”” icon_type=”font-awesome” font_awesome_icon=”far fa-folder-open” font_awesome_icon_size=”20″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Missingness”][thim-icon-box line_after_title=”true” desc_content=”Many cases in healthcare are rare enough that there may be a dozen (or fewer) records in the data set. On the other hand, millions of records are also common. Both situations (big data and small data) present challenges to the data analyst that can be exaggerated with visualizations. Very small numbers preclude most inferential statistics and can result in heavy biases. 
Very large data sets can be inaccessible to some users and frequently yield statistical significance without meaningfulness.” custom_font_weight_desc=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-users” font_awesome_icon_size=”20″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Too Big and Too Small”][thim-icon-box line_after_title=”true” desc_content=”Data can meet the quality measures above, and still be unusable due to a data standard that is local, proprietary, outdated, and/or logically incompatible with your data” custom_font_weight_desc=”” link_to_icon=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-swatchbook” font_awesome_icon_size=”20″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Mis-standardized”][thim-icon-box line_after_title=”true” desc_content=”Providers often downplay uncertainty to put patients at ease, but healthcare information is rarely deterministic. Specifically, diagnostic, genetic, and laboratory data may be uncertain to a degree. Often (though not always) the degree of uncertainty is known and is constant, as is the case for laboratory testing accuracy.” custom_font_weight_desc=”” icon_type=”font-awesome” font_awesome_icon=”fas fa-dice-two” font_awesome_icon_size=”20″ layout_pos=”left” layout_text_align_sc=”text-center” layout_style_box=”image_box” title=”Uncertainty and Probability”]
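The “Too Big and Too Small” point is easier to see with numbers. Below is a rough Python sketch of a standard two-proportion z-test applied to the same tiny difference at two very different sample sizes. The readmission rates (10.0% vs 10.2%) and the group sizes are invented for illustration, and the function name is my own. The test also leans on a normal approximation, so it would not be appropriate for the "dozen records" situation at all.

```python
import math


def two_proportion_z_test(p1, p2, n_per_group):
    """Two-sided z-test for a difference in proportions, equal group sizes.

    Uses the pooled standard error and a normal approximation, so it is
    only reasonable when each group is large.
    """
    pooled = (p1 + p2) / 2
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_group)
    z = (p1 - p2) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical readmission rates for two clinics: a 0.2 percentage point gap.
rate_a, rate_b = 0.100, 0.102

z_small, p_small = two_proportion_z_test(rate_a, rate_b, n_per_group=1_000)
z_large, p_large = two_proportion_z_test(rate_a, rate_b, n_per_group=5_000_000)

if __name__ == "__main__":
    print(f"n = 1,000 per group:     z = {z_small:.2f}, p = {p_small:.3g}")
    print(f"n = 5,000,000 per group: z = {z_large:.2f}, p = {p_large:.3g}")
```

With a thousand patients per clinic the gap is nowhere near significant; with five million it is overwhelmingly significant. The gap itself never changed. Whether 0.2 percentage points matters is a question about healthcare, not about the p-value.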

Do You Get IT?

[h5p id=”20″]
