Data Science Industry Day 

Our first Industry Day was held on March 28, 2018, from 7:30 am to 5:30 pm in the Technology Square Research Building (TSRB) Banquet Hall.   

Georgia Tech's Institute for Data Engineering and Science (IDEaS) brought together members of the data science community from local industry and academia in Technology Square for its first Data Science Industry Day.

Nearly 120 people attended the event, including more than forty corporate participants along with Georgia Tech faculty, researchers, and students. The event was held to familiarize local industry with IDEaS, increase collaboration to advance research and the application of data science, and contribute to economic development.

IDEaS Co-Executive Director Srinivas Aluru

Co-Executive Director Srinivas Aluru kicked off the event, and Georgia Tech’s Executive Vice President of Research Stephen Cross welcomed the day’s participants with a view of interdisciplinary research across Georgia Tech. Several presentations by industry representatives and researchers served to share expertise, challenges, and perspectives, and to begin a conversation between participants.

The day opened with introductory presentations. Aluru addressed the structure and goals of the data science institute at Georgia Tech. “Georgia Tech historically prides itself on applied innovation and being an asset to industry,” he said. “In the area of data science, in particular, most big data problems originate in industry, making academia-industry collaboration not only preferable but essential.”

Director of Industry Partnerships Renata Rawlings-Goss provided an overview of opportunities for industry to become involved via the IDEaS Industry Alliance Program.

Director of Industry Partnerships Renata Rawlings-Goss

“Industry participation in the data science ecosystem here at Georgia Tech is truly a win-win for both sides. We have top-ranked programs, faculty, and research here at Tech, but we also strive to be good partners in the dynamic industry analytics community here in Atlanta and nationally. Industry Day is one way we listen and connect with Industry partners and local companies,” said Rawlings-Goss. 

Several themed panels took place in the late morning and afternoon.

Dana Randall, co-executive director of IDEaS, moderated a session on Centers and Educational Programs. Faculty and staff from Georgia Tech highlighted several opportunities:

  • Zsolt Kira discussed the Machine Learning Center and the Machine Learning Ph.D. Program
  • Polo Chau provided an overview of the M.S. in Analytics, of which he is also program director
  • Ellen Zegura described the summer fellowship program Data Science for Social Good
  • Rawlings-Goss summarized the many opportunities available through the NSF South Big Data Regional Innovation Hub, of which she is a co-executive director
  • Xiaoming Huo, principal investigator of the National Science Foundation Transdisciplinary Research Institute for Advancing Data Science (TRIAD), gave an overview of this program
  • Anthony Zivalich highlighted opportunities for industry collaboration in the upcoming CODA Building

David Sherrill, an IDEaS associate director, moderated the session on AI and Deep Learning. Panel participants included:

  • Vijay Reddy of Google, who discussed “Practical Machine Learning Using the Google Cloud”
  • Ashok Goel of Georgia Tech, who presented “Vera, A Virtual Research Assistant: Combining AI and Big Data to Support Scientific Modeling”
  • Christopher Yasko of Equifax, who spoke about “White Box Neural Networks for Predictive Risk Models”

Deirdre Shoemaker, an IDEaS associate director, moderated the session on Graph Analytics and Anomaly Detection:

  • Jason Riedy of Georgia Tech presented “Streaming Graph Analysis: New Models, New Architectures”
  • Ümit Çatalyürek of Georgia Tech discussed “High Performance Graph Analytics”
  • Sophia Velastegui of Microsoft spoke about “Graph Analytics at Microsoft”

Marilyn Wolf, an IDEaS associate director, moderated the session on Unstructured Data Analytics (Video, Text, Image):

  • Lan Guan of Accenture discussed “AI-Powered Advanced Customer Engagement Solution”
  • Khalifeh AlJadda of CareerBuilder presented “Search for Things not Strings: Towards Smart Information Retrieval Systems”

Between panel presentations, multiple “rapid fire” industry sessions encouraged industry guests to introduce themselves, describe their company and work, and explain how they envision interacting with Georgia Tech. The evening concluded with poster presentations from nearly thirty students.

 

Speaker Biographies

 

Ashok K. Goel, Professor, School of Interactive Computing,

Georgia Institute of Technology

Ashok Goel is a Professor of Computer Science and the Director of the Ph.D. Program in Human-Centered Computing in the School of Interactive Computing at Georgia Institute of Technology. Ashok conducts research into artificial intelligence, cognitive science, and human-centered computing, with a focus on computational design, modeling, and creativity. He is Editor-in-Chief of AAAI’s AI Magazine. Ashok serves on Georgia Tech’s Commission on Next in Education. As part of Georgia Tech’s Online Master of Science in Computer Science program, he developed a graduate-level course on AI, and as part of this class, he developed Jill Watson, an AI teaching assistant for answering questions in the online class discussion forum: https://www.youtube.com/watch?v=WbCguICyfTA

Presentation Abstract: Modeling is an essential part of scientific inquiry and discovery. Scientific modeling typically entails cycles of model construction, use, evaluation, and revision. We present Vera, a virtual ecological research assistant, for supporting citizen scientists in this cycle through conceptual and simulation modeling in ecology. Vera develops and uses AI in three ways. First, Vera uses an AI compiler to automatically spawn simulation models directly from conceptual models, thereby reducing the cognitive load on the scientist. Second, to build conceptual and simulation models, Vera gives scientists access to the Smithsonian Institution’s Encyclopedia of Life (EOL), the world’s largest database of biodiversity. Third, Vera uses IBM’s Bluemix tool to acquire ecological knowledge to enable scientists to ask questions of EOL and other data sources.

This research is supported by an NSF BigData SouthHub Spoke grant, as well as supplementary PPSR and REU grants from NSF.

 

 

Khalifeh AlJadda

CareerBuilder

AlJadda holds a Ph.D. in computer science from the University of Georgia (UGA), with a specialization in machine learning. He has experience implementing large-scale, distributed machine learning algorithms to solve challenging problems in domains ranging from bioinformatics to search and recommendation engines. He is the lead data scientist on the search data science team at CareerBuilder, one of the largest job boards in the world. He was in charge of the data science initiative to design and implement the backend of CareerBuilder’s language-agnostic semantic search engine, leveraging Apache Spark and the Hadoop ecosystem. He is now in charge of building a new AI-based recommendation engine using cutting-edge technologies in data science and big data. Khalifeh is the founder and organizer of the Southern Data Science Conference (https://www.southerndatascience.com), a major data science conference in Atlanta that aims to promote data science to Southern companies and schools. Khalifeh is also a frequent public speaker on topics related to data science, machine learning, semantic search, and big data analytics. For more information, please visit his website (www.aljadda.com).

 

 

E. Jason Riedy

Georgia Institute of Technology

Dr. E. Jason Riedy is a Senior Research Scientist in the School of Computational Science and Engineering at Georgia Tech. His current research focuses primarily on massive-scale and streaming graph analysis, extending Georgia Tech's STINGER software framework. He developed a scalable community detection code that won the 10th DIMACS Implementation Challenge's mix competition, and his work in STINGER has produced the first algorithms for multiple streaming graph analysis metrics. STINGER received a best paper award at IEEE HPEC 2012. He continues research in sparse and dense linear algebra, including development of the widely used reference BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra PACKage) libraries. He currently serves on the IEEE 754 (IEEE Standard on Floating Point Arithmetic) revision committee and has extended the proposed standard to further support extra precision and reproducible operations. Dr. Riedy has been a Georgia Tech research teaching fellow and continues to teach an undergraduate research class that monitors honeybee hives in the metro Atlanta area using IoT (Internet of Things) platforms. He was also a research fellow in the Institute for Data and High Performance Computing. He received a dual B.S. in Computer Science and Mathematics from the University of Florida in 1998 and a Ph.D. in Computer Science from the University of California, Berkeley, in 2010, working in combinatorial optimization and targeted high-precision arithmetic.

Presentation Abstract: Applications in computer network security, social media analysis, health care, and other areas rely on analyzing a changing environment. The data is rich in relationships and lends itself to graph analysis. Traditional static graph analysis cannot keep pace with network security applications analyzing nearly one million events per second. Streaming frameworks like STINGER support ingesting up to three million edge changes per second, but there are few streaming analysis kernels that keep up with these rates. We present a new computational model along with promising novel architectures to tackle massive data rates.

 

Christopher Yasko

Equifax

Chris is the Vice President, Data Science Innovation, leading the global analytics R&D team at Equifax that is chartered to pioneer disruptive solutions utilizing machine learning and artificial intelligence. Chris and his team are engaging key customers, industry leaders, and academic partners to deliver new insights using big data and advanced mathematical modeling on high performance computing platforms. Chris has over twenty years of global R&D experience and has successfully built and led high performance R&D teams in the USA, France, Scotland, Israel, and India. Chris holds a Master of Science in Computer and Information Science from the University of Florida and a Bachelor of Science in Electrical Engineering, with Distinction, from Worcester Polytechnic Institute. Equifax is a global information solutions company that uses unique data, innovative analytics, technology, and industry expertise to power organizations and individuals around the world by transforming knowledge into insights that help make more informed business and personal decisions. Headquartered in Atlanta, Georgia, Equifax operates or has investments in 24 countries in North America, Central and South America, Europe, and the Asia Pacific region. It is a member of Standard & Poor's (S&P) 500® Index, and its common stock is traded on the New York Stock Exchange (NYSE) under the symbol EFX. Equifax has 10,300 employees worldwide.

Presentation Abstract: Artificial neural networks (ANNs) are efficient machine learning models that have been used in many industries for decades. Until recently, ANNs have been absent from regulatory-compliant financial lending applications because their “black box” characteristic does not allow for explanation of their model behavior to consumers and to industry regulators. NeuroDecision® is an Equifax-exclusive, patent-pending machine learning technology that produces regulatory-compliant neural networks for risk decision applications. NeuroDecision® models are constrained in such a way that the influence of any input variable is completely understood in terms of its impact on the output variable, enabling the generation of “key factors” impacting the model, as required by industry regulation.

 

Lan Guan

Accenture

Lan Guan is a Managing Director in the Applied Intelligence practice within Accenture Digital. Lan has been with Accenture for 15 years and is the lead Management Scientist and Analytics Architect in the practice. She leads all analytics work for Accenture in the Communication, Media, High Tech industry, and also has a broad set of cross-industry experience. As an experienced analytics professional, Lan brings in-depth knowledge and hands-on experience with transforming customer data into actionable strategies and plans to maximize market opportunities. She specializes in data integration, analytics architecture, high performance computing, advanced data mining, and statistical modeling. Lan has led an array of analytics projects featuring customer segmentation and profiling, target marketing, predictive modeling, and business forecasting to enhance clients’ marketing and sales efforts and provide business and strategic decision support to improve the business bottom line.

Lan is often asked to speak at analytics events on subjects such as managing advanced analytic applications, growing analytic resources, delivering more informed information, and making analytics a part of the corporate operation. She is a repeat invited session speaker at the industry’s annual Analytics and Data Mining global conferences.

Presentation Abstract: We believe Artificial Intelligence is essentially about smart machines extending human capabilities by sensing, comprehending, acting, and learning, allowing people to achieve much more. Our Enhanced Customer Engagement solution aims to leverage Artificial Intelligence to deliver a superior experience to customers and users based on hyper-personalization and curation of real-time information. We will deliver a future vision that connects across channels, starting with customer service, and leverages clients’ existing foundational systems powered by artificial intelligence services to provide a consistent, omni-channel customer experience.

 

Ümit V. Çatalyürek

Georgia Institute of Technology

Ümit V. Çatalyürek is currently professor and associate chair of the School of Computational Science and Engineering in the College of Computing at the Georgia Institute of Technology. He received his PhD, MS and BS in Computer Engineering and Information Science from Bilkent University, in 2000, 1994 and 1992, respectively. Professor Çatalyürek is a Fellow of the IEEE, a member of the Association for Computing Machinery (ACM) and the Society for Industrial and Applied Mathematics, and the elected chair for the IEEE’s Technical Committee on Parallel Processing for 2016-2019. He is also vice-chair for the ACM’s Special Interest Group on Bioinformatics, Computational Biology and Biomedical Informatics for the 2015-2018 term. He currently serves as the editor-in-chief for Parallel Computing, as an editorial board member for IEEE Transactions on Parallel and Distributed Computing, and on the program and organizing committees of numerous international conferences. His main research areas are in parallel computing, combinatorial scientific computing and biomedical informatics. He has co-authored more than 200 peer-reviewed articles, invited book chapters and papers. More information about Dr. Çatalyürek and his research group can be found at http://cc.gatech.edu/~umit.

Presentation Abstract: Graphs have become the de facto standard for modeling complex relations and networks in computers. With an increase in the size of the graphs and the complexity of the analyses to perform on them, many software systems have been designed to leverage modern high performance computing platforms. Some of them provide a very productive programming environment for graph analysis; however, they cannot come even close to single-threaded performance. In this talk, we will briefly present some important graph analytics problems and techniques, such as centrality, pattern search, and alignment, and talk about how we achieve high performance on modern computer architectures.

 

Sophia Velastegui

Microsoft

Named one of Business Insider’s “Most Powerful Female Engineers in 2017,” Sophia recently joined Microsoft as the General Manager, Product in the Knowledge & Conversation org. She previously served as Chief Product Officer at Doppler Labs. She was the Head of Silicon / Architecture Roadmap and worked on Special Projects at Google’s Nest as it grew from a small company and scaled under Alphabet. Prior to joining Nest in 2014, Sophia led the “Think Tank” Program Management at Apple. She previously led the company’s Laptop & Special Projects Product Management group. Her experiences in Silicon Valley, as well as her prior experience at Applied Materials, Harvard Business School, Berkeley, and Georgia Tech, provide her with global perspectives on productizing technology, execution, and strategic planning. She has been awarded a number of patents and is also active in the greater tech community. Beyond her work at Microsoft, Sophia is on the advisory board of Georgia Tech’s College of Engineering, Woodruff Mechanical Engineering School, and Create X incubator. Additionally, Sophia is a board director of Elwyn.org, a nonprofit serving children and adults with disabilities. She has served as innovation advisor to South Korean President Jae In Moon’s Labor Department, from his candidacy through his presidency. She enjoys golfing and rock climbing with her family.