In the US, Healthcare Data Access Is a Scavenger Hunt

Jared Kaltwasser
SEPTEMBER 10, 2018

In the era of big data, obtaining data sets that paint a comprehensive picture of the American healthcare landscape is a big — if not impossible — challenge.

That’s not to say researchers aren’t trying.

“In a study … recently completed, we pulled together over 150 different data sources, and our data still weren’t completely representative of the U.S.,” said Joseph Dieleman, Ph.D., an assistant professor at the University of Washington’s Institute for Health Metrics and Evaluation.

His research focuses on healthcare data, the economics of healthcare and healthcare policy, all areas that rely on solid data. Yet in the U.S., it’s virtually impossible to get a robust, all-encompassing picture of Americans’ healthcare.

Trudy Krause, Dr.P.H., MBA, associate professor of management, policy and community health at UTHealth School of Public Health in Houston, said the way America’s healthcare system is constructed makes compiling data something like a scavenger hunt.

“Unlike countries with a national healthcare system, the U.S. healthcare system is fragmented by payer type,” she said.

There are the public programs and entities, like Medicare, Medicaid and the Department of Veterans Affairs (VA), and then there is a vast landscape of private insurers, selling plans on the open market or through employer-sponsored plans.

“Thus, there is no central data bank of claims data from the payers,” Krause said.

Scrambling for Health Data

Consequently, health data researchers are left with a series of calculations — literally and figuratively.

“There (are) data on Medicare beneficiaries, some data on Medicaid, private insurance and (much less data) on uninsured spending, although this is a smaller fraction of (the) health sector,” Dieleman told Healthcare Analytics News™. “Getting access to all these data is very difficult; analyzing them jointly in order to study the entire health sector is really challenging.”

Broadly speaking, data from the government programs are easier — if not easy — to obtain.

“Centers for Medicare & Medicaid Services (CMS) makes some data available to researchers through the Research Data Assistance Center,” Krause said, “but it is project limited, and there are fees.”

CMS also has the Qualified Entity Certification Program, which enables qualifying organizations to access data. However, the certification process takes time, and even with certification, researchers must pay for the data they use.

Because Medicaid varies on a state-by-state basis, access to Medicaid data is likewise hit-and-miss, Krause said.

However, even with a complete set of public-sector healthcare data, researchers would miss information on most Americans. According to the Kaiser Family Foundation, 56 percent of U.S. citizens had private insurance in 2016, either through an employer-sponsored plan or a nongroup plan. Add in the 9 percent who were uninsured, and government data cover just over a third of the population.
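The arithmetic behind "just over a third" can be checked in a few lines. This is only a sketch: it uses the two percentages quoted above and assumes the three coverage categories are mutually exclusive and add up to 100 percent, which the article does not state explicitly.

```python
# Shares for 2016 as quoted in the article (Kaiser Family Foundation figures).
private_pct = 56
uninsured_pct = 9

# Assumption: private, uninsured, and public coverage are mutually
# exclusive and together account for 100 percent of the population.
public_pct = 100 - private_pct - uninsured_pct
missing_pct = private_pct + uninsured_pct

print(f"Covered by public programs: {public_pct}%")
print(f"Absent from public-sector data: {missing_pct}%")
```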

Furthermore, because programs like Medicare, Medicaid and VA initiatives are designed for specific populations, comprehensive government data would not give a representative sample of the public, experts said.

Given the limits of public healthcare data, researchers must either use complex weighting and equations to adjust the numbers or turn to the private sector for data. The problem is, data from commercial insurers or pharmacies are even harder to come by.
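The article does not say which weighting methods researchers use. One common approach, shown here purely as an illustration, is post-stratification: each group in a sample is reweighted so that its share matches its share of the population. In the sketch below, the population shares come from the percentages quoted earlier (with the public share derived by subtraction), while the sample counts and spending figures are invented for the example and are not real data.

```python
# Population shares by coverage type, taken from the percentages quoted
# in the article (public share assumed to be the remainder).
population_share = {"private": 0.56, "public": 0.35, "uninsured": 0.09}

# HYPOTHETICAL sample in which public-program patients are overrepresented.
sample_count = {"private": 200, "public": 700, "uninsured": 100}

# HYPOTHETICAL average annual spending per person in each group (dollars).
mean_spending = {"private": 5000.0, "public": 9000.0, "uninsured": 2000.0}

total = sum(sample_count.values())

# Naive estimate: treats the skewed sample as if it were representative.
unweighted = sum(sample_count[g] * mean_spending[g] for g in sample_count) / total

# Post-stratification weight: population share divided by sample share.
weights = {g: population_share[g] / (sample_count[g] / total) for g in sample_count}

# Weighted estimate: each person counts in proportion to the group's weight.
weighted_total = sum(sample_count[g] * weights[g] * mean_spending[g] for g in sample_count)
weighted_n = sum(sample_count[g] * weights[g] for g in sample_count)
weighted = weighted_total / weighted_n

print(f"Unweighted mean spending: {unweighted:.2f}")
print(f"Weighted mean spending:   {weighted:.2f}")
```

Because the invented sample leans heavily toward the higher-spending public group, the unweighted figure overstates average spending, and the weighted figure pulls it back toward the population mix. Real adjustments of this kind are considerably more involved.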

“Most commercial carriers try very hard to protect their proprietary information that reveals contracting terms for providers, and thus (insurers) limit charge and payment information when providing data,” Krause said.

Worries about privacy add another layer of concern, causing insurers to remove identifying information from the data.

The end result, Krause said, is that most commercial insurers don’t share such data, and the few who do charge a considerable amount.

Another option, Dieleman said, is to simply conduct surveys of patients. Those data, though, “(have) many of (their) own challenges related to smaller sample size and respondents’ own reporting biases.”
