Health IT, Diagnostics

Mining Internet searches yields clues to lung cancer diagnosis

Are people ever more honest about their health concerns than they are when they type health questions into Internet search engines?

Two top Microsoft health researchers think it’s a fair question, and set out to mine search logs to identify risk factors for the nearly 20 percent of cases of lung cancer found in non-smokers. Their work also could help diagnose lung cancer earlier, since 75 percent of patients are diagnosed at Stage III or IV of the disease, according to an article published online Thursday in JAMA Oncology.

This is a follow-up to a study that appeared in the August edition of the Journal of Oncology Practice. That study found that search engines — in this case, Microsoft’s Bing — could yield clues to help diagnose pancreatic adenocarcinoma.

“People tend to whisper their health concerns into search engines on a regular basis,” Dr. Eric Horvitz, managing director of Microsoft’s research laboratory, said on a Microsoft blog. “This kind of data can serve as a complement to more formal clinical information.”

This work, according to Horvitz, “shows promise for identifying new clinically relevant findings in multiple areas of healthcare.”

Horvitz and co-author Ryen W. White, CTO for health intelligence at Microsoft Research in Redmond, Washington, scanned more than 4.8 million Bing searches in the U.S. for lung carcinoma and related symptoms. They found about 5,400 likely positives and identified family history of lung cancer, age, presence of radon in the home, geographic location and occupation as the top five risk factors.

“Evidence of smoking … was important but not top-ranked, highlighting the difficulty of identifying smoking history from search terms,” they wrote in the JAMA Oncology article.

“Here, we are not just looking at the text of the queries; we also consider the locations that people are in when they issue these queries and we tie that back to contextual risk factors linked to those locations,” White explained on the Microsoft blog.

The data “allow us to discover new risk factors, things that might not have been thought of in the past that might actually be important,” White added. “We looked at air travel, for example, as one of the factors that might be tied to a higher likelihood.”

Photo: Wikimedia Commons