Wednesday, July 6, 2022

Best Practices for Data Quality Checks for Third-party Healthcare Data

By Gorkem Sevinc, CEO, Qualytics

Like many other industries, healthcare is experiencing exponential growth in data, often described as the new oil. With the success of the Affordable Care Act and the digitization of health records, healthcare systems are more unified than ever: EHR systems are integrated with LIMS, PACS and RIS; interoperability standards have grown significantly in both maturity and adoption; and end users are hungry to interact with more data, at the right time, in a unified manner. More data means more complex operations and more problems, specifically with the quality of the data. In this article, I’ll focus on one small but mighty part of the data quality problem in healthcare: third-party data.

At Qualytics, we work with healthcare organizations whose data operations and data quality issues vary in complexity. One theme is consistent across these organizations: the overall lack of quality in the data they receive from third parties, whether that data comes from payors, health systems, wearables or other data generators. Root causes of data quality issues can often be grouped into six high-level categories:

  1. Operational – software releases, product evolution and changes to KPI calculations over time all affect our data
  2. Human Error – data entry is inherently error-prone, especially when systems do not consistently validate inputs. Fat-fingering is a real problem
  3. Nature of Data – missing data, duplicate data, and non-conformant data with inherent flaws
  4. Technical Issues – DataOps is complex! We have hundreds, if not thousands, of disparate systems, whether SaaS or proprietary, with complex integrations
  5. Nature of third-parties – by nature, third-party data means limited control over the data, and it should not be trusted by default
  6. Governance – often inadequate and decentralized; execution lags behind strategy

The issues we see with incoming third-party data can fall into any of these categories, but especially Operational, Human Error, Nature of Data and Nature of third-parties. Clearly, we need to do something about our third-party data, but what is the best practice here?

A default line of thinking would be to bring the data in, land it in your systems, and run some QA scripts to validate it. This approach is inherently flawed for two reasons: 1. Once data has landed in operational systems, it is nearly impossible to truly delete a bad record; 2. Relying on manually developed QA scripts puts the onus on those scripts being current, relevant and broad enough to cover all use cases, which is very difficult to accomplish manually. So this approach doesn’t really work. What else can we do?

An alternative approach would be to first bring the data into a clearinghouse environment, where it is validated with QA scripts before being let into operational systems. This is a better approach, but it is not scalable. Problem #2 from above persists, since manually developed scripts still need to be managed, and it is exacerbated by the added complexity in our data ingress: a clearinghouse plus additional data pipelines and operations that need to be maintained.
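To make the maintenance burden concrete, here is a minimal sketch of what a hand-written clearinghouse gate might look like. Everything in it is an assumption for illustration: the field names (`member_id`, `service_date`, `billed_amount`), the individual checks, and the all-or-nothing release policy are hypothetical, not taken from any particular system.

```python
from datetime import date


def check_record(record: dict) -> list[str]:
    """Return the problems found in one record (hypothetical checks)."""
    problems = []
    if not record.get("member_id"):
        problems.append("missing member_id")
    service_date = record.get("service_date")
    if not isinstance(service_date, date) or service_date > date.today():
        problems.append("service_date missing or in the future")
    amount = record.get("billed_amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        problems.append("billed_amount missing or negative")
    return problems


def clearinghouse_gate(batch: list[dict]) -> tuple[bool, dict[int, list[str]]]:
    """Validate a staged batch; release it only if every record passes."""
    failures = {}
    for index, record in enumerate(batch):
        problems = check_record(record)
        if problems:
            failures[index] = problems
    return (not failures, failures)


# Illustrative staged batch: the second record has two problems.
staged = [
    {"member_id": "M001", "service_date": date(2022, 5, 1), "billed_amount": 120.0},
    {"member_id": "", "service_date": date(2022, 5, 2), "billed_amount": -5},
]
released, failures = clearinghouse_gate(staged)
```

Every new field, feed or business rule means another hand-edited branch in `check_record`, which is exactly the coverage problem described above.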

Having faced these issues many times, we have devised a better approach to this problem – and it comes in three distinct steps.

  1. Flexible 1st mile: incoming data can come in many shapes, sizes and locations. Whether it is files on object storage (think AWS S3), APIs or shared data frames, we need to assert the ruleset in place as part of the processing steps, enabling #2.
  2. Ability to infer data quality rules from historical data: historic data’s shapes and patterns can often be used with ML methods to automatically infer what the data should look like going forward. This often gets us to 60-70% coverage of the rule base from the get-go, freeing staff to focus on authoring the more complex business logic checks.
  3. Route anomalous data to an enrichment data store: when the ruleset is asserted on data in-flight, records failing assertions can be routed to a location other than the original destination (an enrichment table), where anomalous data is segregated from the good data. Staff can then focus their efforts on corrective actions for these anomalies: light transformations, quarantining, dropping records, or kicking them back to the originating third party to address.
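The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: simple profiling (null checks, min/max ranges, small value sets) stands in for the ML-based inference described in step 2, in-memory lists stand in for the real destination and the enrichment table, and the names `Rule`, `infer_rules` and `route`, along with the sample fields, are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Rule:
    field: str
    description: str
    check: Callable[[Any], bool]


def infer_rules(history: list[dict]) -> list[Rule]:
    """Derive simple per-field rules from historical records.

    Basic profiling is used here as a stand-in for ML-based inference.
    """
    rules = []
    for field in sorted({key for row in history for key in row}):
        values = [row.get(field) for row in history]
        present = [v for v in values if v is not None]
        if len(present) == len(values):
            # Field was never null historically, so expect it going forward.
            rules.append(Rule(field, "not null", lambda v: v is not None))
        if present and all(isinstance(v, (int, float)) for v in present):
            low, high = min(present), max(present)
            rules.append(Rule(
                field, f"between {low} and {high}",
                lambda v, low=low, high=high:
                    v is None or (isinstance(v, (int, float)) and low <= v <= high),
            ))
        elif present and all(isinstance(v, str) for v in present):
            allowed = frozenset(present)
            # Treat as categorical only when values clearly repeat.
            if len(allowed) <= len(present) // 2:
                rules.append(Rule(
                    field, f"one of {sorted(allowed)}",
                    lambda v, allowed=allowed: v is None or v in allowed,
                ))
    return rules


def route(records: list[dict], rules: list[Rule]) -> tuple[list[dict], list[dict]]:
    """Split in-flight records into clean data and an enrichment store."""
    clean, enrichment = [], []
    for record in records:
        failed = [f"{r.field}: {r.description}"
                  for r in rules if not r.check(record.get(r.field))]
        if failed:
            enrichment.append({"record": record, "failed_rules": failed})
        else:
            clean.append(record)
    return clean, enrichment


# Illustrative data only.
history = [
    {"source": "wearable", "heart_rate": 62},
    {"source": "wearable", "heart_rate": 75},
    {"source": "payor", "heart_rate": 88},
    {"source": "payor", "heart_rate": 70},
]
incoming = [
    {"source": "wearable", "heart_rate": 71},
    {"source": "wearable", "heart_rate": 710},  # likely fat-fingered
    {"source": None, "heart_rate": 80},
]
clean, enrichment = route(incoming, infer_rules(history))
```

Keeping the failed rule descriptions next to each quarantined record gives staff the context they need to transform, drop, or return the record to its originator.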

Third-party data will continue to be a core part of interoperability between systems, as will the need for data quality checks in complex data ingress and egress workflows.
