Apixio Blog

This blog is dedicated to Apixio's work in the healthcare industry and how we see its future.

Intel Blog: Why is Big Data a Big Deal for Healthcare?

Posted in Big Data

Leading up to their Healthcare Innovation Summit webcasts, Intel asked industry leaders to share some of their thoughts on the future of healthcare technology. As part of this blog series, they asked me to contribute on the topic of Big Data. You can read my guest entry on their blog, or right here:


As a healthcare technology leader, you ask, “Isn’t my data warehouse already Big Data?”


I hear this question frequently as Chief Scientist at a startup specializing in Big Data Analytics for Healthcare.

My response: “Probably not. ‘Lots of data’ is not the same thing as Big Data.”

“What is Big Data Analytics?”

Don’t get me wrong: Healthcare is generating Big Data. A typical healthcare system with 200,000 patients has 500,000 encounters each year, submits 5 million claims, and creates 3 million documents. In five years, this adds up to over 1.5 billion distinct references to medical concepts, equaling 10 terabytes of data. That is bigger than the entire print collection of the U.S. Library of Congress.

However, Big Data Analytics is really about new methods to infer knowledge directly from data, which requires three components:

1.    Scalable data storage with parallel computing capability

2.    Analytical tools, such as machine learning and statistical natural language processing, that can make sense of both structured data and clinical narrative 

3.    Instantaneous access to enough data to infer something useful in real time

“So what will I learn from Big Data Analytics?”

Here’s an example. To best care for diabetics, you enroll them in a disease management program, which means you must identify patients with diabetes accurately. You then need to compute quality measures, such as whether each patient's latest Hemoglobin A1c lab value was above 9 percent.
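
To make the quality measure concrete, here is a minimal Python sketch of the "latest Hemoglobin A1c above 9" check. The record layout, patient IDs, and values are all made up for illustration, and it assumes the lab results are already clean and consistently coded, which is exactly what the rest of this example calls into question.

```python
from datetime import date

# Hypothetical lab results: (patient_id, test_name, result_date, value_percent)
lab_results = [
    ("p1", "HbA1c", date(2012, 1, 10), 9.8),
    ("p1", "HbA1c", date(2012, 6, 2), 8.4),
    ("p2", "HbA1c", date(2012, 3, 15), 7.1),
    ("p2", "HbA1c", date(2012, 9, 1), 9.6),
    ("p3", "HbA1c", date(2012, 5, 20), 6.5),
]

def latest_hba1c(results):
    """Return {patient_id: value} using each patient's most recent HbA1c result."""
    latest = {}
    for patient_id, test, when, value in results:
        if test != "HbA1c":
            continue
        if patient_id not in latest or when > latest[patient_id][0]:
            latest[patient_id] = (when, value)
    return {pid: value for pid, (_, value) in latest.items()}

def poorly_controlled(results, threshold=9.0):
    """List patients whose latest HbA1c is above the threshold (in percent)."""
    return sorted(pid for pid, value in latest_hba1c(results).items() if value > threshold)

print(poorly_controlled(lab_results))  # ['p2']
```

Note that p1 is not flagged even though an earlier value was 9.8, because only the most recent result counts toward the measure.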

Sadly, your information system only contains structured data and you quickly find that this data is a mess. First, many patients with “diabetes” entered in their problem lists are actually not diabetic. This is called “chart lore,” and is a phenomenon endemic to electronic healthcare data. Strike one.

Next, you discover that a significant fraction of your real diabetics do not have any coded term for diabetes in their problem lists, so they are not being tracked for disease management. Strike two.

Finally, you discover that 25 percent of your patients are part of a provider group that your organization recently acquired. Although you loaded all of their coded data into your data warehouse, their lab codes are not recognized by your reporting system. Strike three.

There is no way you can actually compute the measures you need to run your business!

Here’s where Big Data Analytics comes in. With the ability to analyze unstructured data, you infer from encounter notes and consult letters which patients are truly diabetic.
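
As a rough illustration of the idea, and not of how a production system would do it, the toy Python sketch below counts affirmed versus negated mentions of diabetes in free-text notes. Everything here is an assumption made for the example: the notes, the short list of negation cues, and the rule that a cue only counts when it appears before the mention in the same sentence. Real statistical natural language processing is considerably more sophisticated than keyword matching.

```python
import re

# Toy patterns, chosen for illustration only.
MENTION = re.compile(r"\bdiabet(?:es|ics?)\b")
NEGATION = re.compile(r"\b(?:no|not|denies|negative for|ruled out|without)\b")

def classify_sentence(sentence):
    """Return 'affirmed', 'negated', or None if diabetes is not mentioned."""
    text = sentence.lower()
    match = MENTION.search(text)
    if match is None:
        return None
    # Only look for negation cues that occur before the mention.
    if NEGATION.search(text[: match.start()]):
        return "negated"
    return "affirmed"

def likely_diabetic(notes):
    """True when affirmed mentions outnumber negated ones across all notes."""
    affirmed = negated = 0
    for note in notes:
        for sentence in re.split(r"(?<=[.!?])\s+", note):
            label = classify_sentence(sentence)
            if label == "affirmed":
                affirmed += 1
            elif label == "negated":
                negated += 1
    return affirmed > negated

# Hypothetical notes for two patients.
patient_notes = {
    "p1": ["Follow-up for type 2 diabetes. Continues metformin.",
           "Consult: diabetic retinopathy screening is due."],
    "p2": ["Chart lists diabetes. Patient denies diabetes. "
           "No history of diabetes per prior records."],
}

for patient_id, notes in patient_notes.items():
    print(patient_id, likely_diabetic(notes))
```

Patient p2 is the "chart lore" case: the chart says diabetes, but the narrative contradicts it more often than it supports it.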

What about the unrecognizable lab codes? You apply machine learning to a Big Data-sized set of patient histories to infer which of these mystery codes correspond to Hemoglobin A1c measurements. You are back in business.
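
One very simple way to picture this is a nearest-centroid guess based on how each code's values are distributed. The Python sketch below is an illustration under stated assumptions, not a description of any particular product: the lab codes (`LAB-4471`, `LAB-0932`), the values, and the choice of features (mean and spread of log-transformed values) are all hypothetical. A real system would draw on far more evidence, such as units, ordering patterns, and co-occurring diagnoses.

```python
import math
from statistics import mean, pstdev

def features(values):
    """Summarize a set of lab values by the mean and spread of their logs."""
    logs = [math.log(v) for v in values]
    return (mean(logs), pstdev(logs))

# Values from codes your reporting system already recognizes (made-up data).
known_tests = {
    "HbA1c": [5.4, 6.1, 7.2, 8.8, 9.5, 6.7, 10.2],
    "Glucose": [88, 102, 140, 210, 95, 180, 125],
    "Creatinine": [0.8, 1.0, 1.3, 0.9, 1.1, 2.0, 0.7],
}

# Values recorded under unrecognized codes from the acquired group (made-up data).
mystery_codes = {
    "LAB-4471": [6.0, 7.5, 9.1, 5.8, 8.2],
    "LAB-0932": [92, 130, 160, 110, 240],
}

def guess_test(values, references):
    """Return the known test whose value profile is closest to these values."""
    target = features(values)
    return min(references, key=lambda name: math.dist(target, features(references[name])))

guesses = {code: guess_test(values, known_tests) for code, values in mystery_codes.items()}
for code, test in guesses.items():
    print(code, "->", test)
```

With this toy data, the code whose values cluster between roughly 5 and 10 lands nearest the known Hemoglobin A1c profile, while the other lands nearest glucose.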

Healthcare is at the intersection of two revolutionary events: The first is fueled by new Big Data technologies that extract valuable knowledge from huge amounts of data. The second was sparked by Meaningful Use and the electronic liberation of previously unavailable clinical data. The resulting explosion in Big Data for Healthcare will light the way for years to come.

What do you think?

»Original post


Bob Rogers brings to Apixio two decades of algorithm development and marketing experience. As Chief Scientist, Bob is responsible for the strategic direction and development of the algorithms at the core of Apixio's applications. He is also Apixio's liaison to the medical and academic communities. Bob is the founder of Counterpart Consulting, a consulting firm specializing in advanced analytics for hedge fund and medical applications. Prior to Counterpart Consulting, Bob was Senior Product Manager at Carl Zeiss Meditec, where he was responsible for the complete product lifecycle of the most commonly deployed diagnostic device in ophthalmology. Previously, Bob was founder and President of Tri-Valley Capital, a hedge fund, and co-founder and VP of R&D at Arjewel Inc, a private fund that generated returns based on proprietary algorithms he developed. Bob has also spent several years as an academic researcher, has published a number of academic papers in astrophysics and artificial neural networks, and is co-author of the book "Artificial Neural Networks: Forecasting Time Series." Bob received his undergraduate degree in physics from UC Berkeley and his PhD in physics from Harvard University.


  • Guest
    J.M. Friday, 26 October 2012

    Great article!

  • Guest
    SS Saturday, 03 November 2012

    Bob - Well put and excellent points.
