New NVIDIA Partnership Bridges Education Gap for Data Science and Machine Learning

As demand for data science and machine learning surges worldwide, many educators and students are left behind because comprehensive learning materials are scarce or hard to access. This is where NVIDIA and Georgia Tech’s latest partnership aims to help.

Developed by School of Computational Science and Engineering (CSE) Associate Professor Polo Chau and Prairie View A&M University Assistant Professor Xishuang Dong, a teaching kit focused on data science education as part of the NVIDIA Deep Learning Institute (DLI) is now available for free to qualified educators.

Released as a multipart series, four kits are offered through NVIDIA’s DLI program in collaboration with leading researchers and professors in four research areas:

  • Deep Learning
  • Accelerated Computing
  • Robotics
  • Data Science

Each of these four DLI Teaching Kits lowers the barrier to entry for educators seeking to incorporate artificial intelligence (AI) and graphics processing unit (GPU) computing into their coursework by providing downloadable teaching materials and online courses.

The data science kit teaches students fundamental and advanced topics in accelerated data science, including the NVIDIA RAPIDS framework, GPU-accelerated machine learning, data visualization, graph analytics, and more.

“Traditional data science software libraries are mainly written for CPUs and don’t take advantage of GPUs. The RAPIDS library is NVIDIA’s effort to simplify and more easily use their GPU Python focused library,” said Chau.

The Data Science Teaching Kit contains tens of modules and labs, with content adapted from the popular course CSE 6242: Data and Visual Analytics, taught by Chau. Georgia Tech contributors on the project include Polo Club for Data Science researchers Scott Freitas, Haekyu Park, Jay Want, Jon Saad-Falcon, Kevin Li, Aiswarya Bhagavatula, and Frank Zhou.

“The new development on our side is creating the modules and figuring out how to provide interactive labs for the students to work on and new coding questions,” said Freitas, a lead researcher on the project. “We also released three papers and each one of those papers will inform a lab in the teaching kit.”

These papers introduce two new large-scale cybersecurity datasets, which are incorporated into the teaching kit to teach participants how to detect malware using new graph techniques.

According to Freitas, the two datasets are also among the largest cybersecurity and graph datasets ever released.

“The end goal is to help people learn how to use new state-of-the-art GPU accelerated techniques. NVIDIA has many advanced technologies that they are developing but it may not necessarily be accessible to people just getting into the field. So, this teaching kit aims to take all of these components and simplify them in a way that is successful and easy to use for educators,” he said.



Kristen Perez

Communications Officer