It’s a paradox: today we have more healthcare data than ever, yet we can’t seem to do meaningful work with it. Why? Because much of the new data is unstructured.
Nearly 80% of the data in the 1.2 billion clinical care documents that the U.S. produces annually is unstructured. Unstructured data includes doctors’ written notes, scanned documents, images and other free-form files. It is notoriously difficult to use because, unlike structured data, which is easy to view and use, it is unorganized, text-heavy and hostile to easy processing. Put simply: it’s way harder for a computer to understand a paragraph than it is to understand a checkbox.