Aluru Receives Two NSF Awards for Advancing Next Generation Sequencing Research


Srinivas Aluru, a professor in the School of Computational Science and Engineering and co-executive director of the Institute for Data Engineering and Science, received two awards from the National Science Foundation (NSF) totaling over $1.3M, both to advance research in next generation sequencing (NGS) bioinformatics.

“NGS refers to a suite of technologies that enable high-throughput and inexpensive DNA sequencing. Their origin dates back to just a decade ago, but these technologies now underpin all of modern genomics research,” said Aluru.

Aluru’s group is already a pioneer in developing bioinformatics methods for NGS. Its work includes a large NSF and National Institutes of Health big data project that came to fruition during the initial round of federal investments in big data.

The two new projects are collectively dedicated to improving the quality and speed of NGS analyses. The first project explores new approximate matching algorithms that directly tackle the errors made by NGS machines. Previous methods are either too time-consuming to use on large data sets or depend on heuristic techniques with no quality or performance guarantees.

“The new research takes advantage of limited errors made by NGS machines, and develops algorithms that can tackle a bounded number of errors efficiently. It continues a line of investigation initiated through collaboration with late [College of Computing] professor Apostolico,” he said.
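To illustrate what matching with a bounded number of errors means, here is a minimal Python sketch. It is not the project's algorithm: it is a naive scan that reports every position where a short read matches a reference with at most k substitutions, and it ignores insertions and deletions. The function names and the example sequences are hypothetical and chosen only for illustration.

```python
def hamming_within(a: str, b: str, k: int) -> bool:
    """Return True if equal-length strings a and b differ in at most k positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:
                return False  # stop early once the error bound is exceeded
    return True


def find_with_mismatches(read: str, reference: str, k: int) -> list[int]:
    """Return start positions where read matches reference with at most k mismatches.

    Naive sketch: checks every alignment, so the cost grows with
    len(read) * len(reference). Only substitutions are counted.
    """
    m = len(read)
    return [
        i
        for i in range(len(reference) - m + 1)
        if hamming_within(read, reference[i : i + m], k)
    ]


# Hypothetical example data, not taken from any real sequencing run.
reference = "ACGTTGCAACGTAGCA"
read = "ACGTAG"
print(find_with_mismatches(read, reference, 0))  # exact matches only: [8]
print(find_with_mismatches(read, reference, 1))  # allow one substitution: [0, 8]
```

Allowing one error recovers the alignment at position 0, which an exact search misses. Because the bound k is small, each comparison can stop as soon as it is exceeded. That is the kind of structure a bounded-error algorithm can exploit, though efficient methods avoid checking every alignment in the first place.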

The second project will conduct comprehensive experiments on reproducibility, and will evaluate key methods and software products scientists currently use to analyze NGS data. It will benefit the research community by:

  • Assessing over fifty software products
  • Defining the state-of-the-art products in the field
  • Making research findings publicly accessible

As a precursor, a paper authored by Aluru and three CSE graduate students was recognized at Supercomputing 2016 as the first paper selected by ACM SIGHPC under the scientific repeatability, replicability, and reproducibility initiative. The new NSF project will also involve nearly 20 undergraduate students, who will learn about research integrity and reproducibility issues.

Both projects affect a wide range of applications, from medical research to evolution, and also influence work outside of biology in text matching and information retrieval.