Most people recognize computer terms like memory and data. But few have heard the term “Big Data.”

Big Data is a collection of data sets so large and complex that they become difficult to process using on-hand database management tools or traditional data processing applications. Analyzing these data sets can uncover hidden patterns, unknown correlations and other useful information, which can give an organization a competitive advantage over its rivals and yield business benefits such as more effective marketing and increased revenue.

Two researchers at West Virginia University are also applying Big Data methods to human nutrition.

Tim Menzies, of the Lane Department of Computer Science and Electrical Engineering, has partnered with Susan Partington of the Davis College of Agriculture, Natural Resources, and Design, to see if it’s possible to design intelligent data collection strategies that significantly reduce the cost of understanding and monitoring a population.

Partington is researching the effects of food availability on obesity.

“High obesity prevalence in the United States has become a national health priority with much of the focus on obesity prevention,” said Partington. “Because environmental factors can be modified, they may be key in obesity prevention.”

One way to study those environmental factors, Partington said, is to give researchers access to a full inventory of the foods sold in all retail outlets in a given area. That type of data, however, is not readily available, and gathering it by visiting each site would be a painstaking process.

That’s where Menzies comes in.

“A repeated effect in most data sets is that a few variables and instances can serve as good exemplars for the rest,” he said. “Once we have learned these key variables, it becomes practical to sample a wider area much faster. Once we know the key variables, visiting each store is quicker, since we would be after less data, and we could visit fewer stores once we have the most exemplary stores.”

In addition to working with Menzies, Partington is working with Vasil Papakroni, a master’s student in the Lane Department, on the research. Papakroni’s research has found that collecting data from 10 percent of the stores can approximate the results found in the rest of the population. He is now exploring how to find the most representative stores. Together, the two findings (fewer variables to record per store, and fewer stores to visit) mean that the cost of monitoring patterns in a large population can be significantly decreased.
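The idea that a small share of stores can stand in for the whole population can be illustrated with a short Python sketch. This is not the researchers' method, which the article does not describe. It uses plain simple random sampling on made-up data, and the function names, the "fresh-produce share" measure, and the store counts are all assumptions chosen for illustration.

```python
import random
import statistics

def simulate_stores(n_stores, seed=0):
    """Return a synthetic fresh-produce share (0 to 1) for each store.

    The values are invented for illustration; they are not real survey data.
    """
    rng = random.Random(seed)
    return [min(1.0, max(0.0, rng.gauss(0.3, 0.1))) for _ in range(n_stores)]

def sample_estimate(values, fraction=0.10, seed=1):
    """Estimate the population mean from a simple random sample of stores."""
    rng = random.Random(seed)
    k = max(1, round(len(values) * fraction))  # number of stores to visit
    sample = rng.sample(values, k)             # sampling without replacement
    return statistics.mean(sample), k

stores = simulate_stores(1000)
full_mean = statistics.mean(stores)            # what a full survey would report
estimate, visited = sample_estimate(stores)    # what a 10 percent survey reports

print(f"visited {visited} of {len(stores)} stores")
print(f"full survey mean: {full_mean:.3f}, sample estimate: {estimate:.3f}")
```

On synthetic data like this, the two means typically land close together, which is the intuition behind visiting fewer stores. Choosing the most representative stores, rather than random ones, is the harder problem the article says Papakroni is now exploring.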

“Health patterns are constantly changing and we must always check old results,” said Partington. “Using Vasil’s methods, fewer stores will need to be revisited. And once we arrive at a particular store, by using Dr. Menzies’ results, we will only need to record a few variables.”

“The world is a big place,” Menzies said, “and with Big Data methods, we can explore more of it in less time.”




CONTACTS: Mary C. Dillon, Benjamin M. Statler College of Engineering and Mineral Resources

David Welsh, Davis College of Agriculture, Natural Resources, and Design