The University of Massachusetts Amherst

Duarte Participates in Collaborative Team to Further Develop the Foundations of Data Science as Part of $1.5-million NSF Grant

Marco Duarte

Associate Professor Marco Duarte of the Electrical and Computer Engineering (ECE) Department is part of an interdisciplinary, inter-institutional team of researchers who recently received a three-year, $1.5-million grant from the National Science Foundation (NSF) to further develop the foundations of data science. The project will create one of NSF’s TRIPODS Institutes for Theoretical Foundations of Data Science. TRIPODS stands for Transdisciplinary Research in Principles of Data Science. See the UMass News Office article: Computer Science-Math-Engineering Team Forms New NSF Institute.

The collaborative team is led by Andrew McGregor, a computer science associate professor in the College of Information and Computer Sciences.

As Duarte explains, “My role in the proposal was as a liaison between the [TRIPODS] Institute and the Department of Electrical and Computer Engineering. There are several areas of ECE that are closely linked with the data science core of the institute: signal processing, machine learning, information theory, data analytics, and more. Many faculty in ECE and the College of Engineering have research interests related to data science.”

Duarte is the head of the High-Dimensional Signal Processing Group, which researches signal and image processing, compressive sensing, dimensionality reduction, machine learning, computational imaging, distributed sensing, and sensor networks.

Unlike the “practical outcomes” focus of some big data initiatives, the focus of the TRIPODS Institute is more on theoretical, mathematical, and foundational aspects of data science, McGregor says. For example, practical data scientists “know that their methods typically work in practice,” he says, “but we don’t necessarily know why, or whether they’re consistent and reliable. In order to know the why and how, you need to mathematically analyze them. You need to show that the algorithm’s estimates will always in fact give an answer that is within a certain percentage of the true answer.”

Another aspect of the TRIPODS Institute will be to organize summer schools, speaker series, talks by experts in related technical areas, and workshops for faculty researchers in other disciplines who want to learn how big data can help them.

“In the next three years we might notice more of our colleagues attending data science workshops on campus,” McGregor notes.

Data sets in the sciences, such as genetics and physics, are growing larger every year, McGregor points out. As he observes, “In statistics the more data you get, the more accurate you can be, but in computer science the more data you get, the longer it will take you to process. That’s one reason you need computer scientists and mathematicians working together. When you double the size of the input, we need to know if the algorithm takes twice as long, four times as long, or 100 times as long? That’s important.”

The $1.5-million grant is part of the NSF’s $17.7-million support for 12 TRIPODS projects, which will bring together the statistics, mathematics, and theoretical computer science communities at 14 institutions in 11 states to promote long-term research and training activities in data science that transcend traditional disciplinary boundaries. (November 2019)