SLIM Group CNF

A new machine learning method could help engineers detect leaks in underground reservoirs earlier, mitigating risks associated with geological carbon storage (GCS). Further study could advance machine learning capabilities while improving the safety and efficiency of GCS.

The feasibility study by Georgia Tech researchers explores using conditional normalizing flows (CNFs) to convert seismic data points into usable information and observable images. This potential ability could make monitoring underground storage sites more practical and studying the behavior of carbon dioxide plumes easier.

The 2023 Conference on Neural Information Processing Systems (NeurIPS 2023) accepted the group’s paper for presentation. They presented their study on Dec. 16 at the conference’s workshop on Tackling Climate Change with Machine Learning.

“One area where our group excels is that we care about realism in our simulations,” said Professor Felix Herrmann. “We worked on a real-sized setting with the complexities one would experience when working in real-life scenarios to understand the dynamics of carbon dioxide plumes.”

CNFs are generative models that use data to produce images. They can also fill in the blanks, making predictions to complete an image despite missing or noisy data. This functionality is ideal for this application because data streaming from GCS reservoirs is often noisy, meaning it is incomplete, outdated, or unstructured.
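The core idea can be sketched in a few lines. The toy below is a single-layer conditional affine flow in plain Python, where `cond_net` is a hypothetical stand-in for a trained conditioning network; real CNFs stack many learned invertible layers and operate on full images rather than scalars.

```python
import math
import random

def cond_net(observation):
    """Hypothetical conditioning network: maps an observation to the
    shift and log-scale of one affine flow layer."""
    shift = 0.5 * observation
    log_scale = -abs(observation) / 10.0  # noisier observation -> wider spread
    return shift, log_scale

def sample(observation):
    """Draw one sample x = shift + exp(log_scale) * z, with z ~ N(0, 1)."""
    shift, log_scale = cond_net(observation)
    z = random.gauss(0.0, 1.0)
    return shift + math.exp(log_scale) * z

def log_prob(x, observation):
    """Exact density via the change-of-variables formula."""
    shift, log_scale = cond_net(observation)
    z = (x - shift) / math.exp(log_scale)
    log_pz = -0.5 * (z * z + math.log(2.0 * math.pi))
    return log_pz - log_scale  # subtract log|det| of the forward map

# Drawing many samples for one observation yields a distribution of
# plausible reconstructions -- the basis for uncertainty estimates.
random.seed(0)
draws = [sample(2.0) for _ in range(10_000)]
mean = sum(draws) / len(draws)
```

Sampling repeatedly for the same observation is what lets a CNF report a spread of plausible images rather than a single answer.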

In tests on 36 samples, the group found that CNFs could infer scenarios with and without leakage from seismic data. In simulations with leakage, the models generated images that were 96% similar to ground truth; in cases without leakage, the generated images were 97% similar.

This CNF-based method also improves on current techniques, which struggle to provide accurate information on the spatial extent of leakage. Conditioning CNFs on samples that change over time allows them to describe and predict the behavior of carbon dioxide plumes.

This study is part of the group’s broader effort to produce digital twins for seismic monitoring of underground storage. A digital twin is a virtual model of a physical object. Digital twins are commonplace in manufacturing, healthcare, environmental monitoring, and other industries.   

“There are very few digital twins in earth sciences, especially based on machine learning,” Herrmann explained. “This paper is just a prelude to building an uncertainty-aware digital twin for geological carbon storage.”

Herrmann holds joint appointments in the Schools of Earth and Atmospheric Sciences (EAS), Electrical and Computer Engineering, and Computational Science and Engineering (CSE).

School of EAS Ph.D. student Abhinov Prakash Gahlot is the paper’s first author. Ting-Ying (Rosen) Yu (B.S. ECE 2023) started the research as an undergraduate group member. School of CSE Ph.D. students Huseyin Tuna Erdinc, Rafael Orozco, and Ziyi (Francis) Yin co-authored the paper with Gahlot and Herrmann.

NeurIPS 2023 took place Dec. 10-16 in New Orleans. Occurring annually, it is one of the largest conferences in the world dedicated to machine learning.

Over 130 Georgia Tech researchers presented more than 60 papers and posters at NeurIPS 2023. One-third of CSE’s faculty represented the School at the conference. Along with Herrmann, these faculty included Ümit Çatalyürek, Polo Chau, Bo Dai, Srijan Kumar, Yunan Luo, Anqi Wu, and Chao Zhang.

“In the field of geophysics, inverse problems and statistical solutions of these problems are known, but no one has been able to characterize these statistics in a realistic way,” Herrmann said.

“That’s where these machine learning techniques come into play, and we can do things now that you could never do before.”

News Contact

Bryant Wine, Communications Officer
bryant.wine@cc.gatech.edu

Yunan Luo $1.8 Million NIH Grant

The National Institutes of Health (NIH) has awarded Yunan Luo a grant for more than $1.8 million to use artificial intelligence (AI) to advance protein research.

New AI models produced through the grant will lead to new methods for the design and discovery of functional proteins. This could yield novel drugs and vaccines, personalized treatments against diseases, and other advances in biomedicine.

“This project provides a new paradigm to analyze proteins’ sequence-structure-function relationships using machine learning approaches,” said Luo, an assistant professor in Georgia Tech’s School of Computational Science and Engineering (CSE).

“We will develop new, ready-to-use computational models for domain scientists, like biologists and chemists. They can use our machine learning tools to guide scientific discovery in their research.” 

Luo’s proposal improves on datasets spearheaded by AlphaFold and other recent breakthroughs. His AI algorithms would integrate these datasets and craft new models for practical application.

One of Luo’s goals is to develop machine learning methods that learn statistical representations from the data. This reveals relationships between proteins’ sequence, structure, and function. Scientists then could characterize how sequence and structure determine the function of a protein.

Next, Luo wants to make accurate and interpretable predictions about protein functions. His plan is to create biology-informed deep learning frameworks. These frameworks could make predictions about a protein’s function from knowledge of its sequence and structure. They can also account for variables like mutations.

In the end, Luo will have the data and tools to assist in the discovery of functional proteins. He will use these to build a computational platform of AI models, algorithms, and frameworks that ‘invent’ proteins. The platform determines the sequence and structure necessary to achieve a designed protein’s desired functions and characteristics.

“My students play a very important part in this research because they are the driving force behind various aspects of this project at the intersection of computational science and protein biology,” Luo said.

“I think this project provides a unique opportunity to train our students in CSE to learn the real-world challenges facing scientific and engineering problems, and how to integrate computational methods to solve those problems.”

The $1.8 million grant is funded through the Maximizing Investigators’ Research Award (MIRA). The National Institute of General Medical Sciences (NIGMS) manages the MIRA program. NIGMS is one of 27 institutes and centers under NIH.

MIRA is oriented toward launching the research endeavors of early-career faculty. The grant provides researchers with more stability and flexibility through five years of funding, enhancing scientific productivity and improving the chances for important breakthroughs.

Luo becomes the second School of CSE faculty member to receive the MIRA grant. NIH awarded the grant to Xiuwei Zhang in 2021. Zhang is the J.Z. Liang Early-Career Assistant Professor in the School of CSE.

[Related: Award-winning Computer Models Propel Research in Cellular Differentiation]

“After NIH, of course, I first thanked my students because they laid the groundwork for what we seek to achieve in our grant proposal,” said Luo.

“I would like to thank my colleague, Xiuwei Zhang, for her mentorship in preparing the proposal. I also thank our school chair, Haesun Park, for her help and support while starting my career.”


Ph.D. student Alec Helbling ManimML

Georgia Tech researchers have created a machine learning (ML) visualization tool that must be seen to be believed.

Ph.D. student Alec Helbling is the creator of ManimML, a tool that renders common ML concepts into animation. This development will enable new ML technologies by allowing designers to see and share their work in action. 

Helbling presented ManimML at IEEE VIS, the world’s highest-rated conference for visualization research and second-highest rated for computer graphics. It received so much praise at the conference that it won the venue’s prize for best poster. 

“I was quite surprised and honored to have received this award,” said Helbling, who is advised by School of Computational Science and Engineering Associate Professor Polo Chau.

“I didn't start ManimML with the intention of it becoming a research project, but because I felt like a tool for communicating ML architectures through animation needed to exist.”

[RELATED: Polo Chau is One of Three College of Computing Faculty to Receive 2023 Google Award for Inclusion Research]

ManimML uses animation to show ML developers how their algorithms work. Not only does the tool allow designers to watch their projects come to life, but they can also explain existing and new ML techniques to broad audiences, including non-experts.

ManimML is an extension of the Manim Community library, a Python tool for animating mathematical concepts. ManimML connects to the library to offer a new capability that animates ML algorithms and architectures.

Helbling chose familiar platforms like Python and Manim to make the tool accessible to large swaths of users varying in skill and experience. Enthusiasts and experts alike can find practical use in ManimML, given today’s widespread interest in and application of ML.

“We know that animation is an effective means of instruction and learning,” Helbling said. “ManimML offers that ability for ML practitioners to easily communicate how their systems work, improving public trust and awareness of machine learning.”

ManimML addresses a long-standing challenge in visualizing ML algorithms. Current techniques require developers to create custom animations for each specific algorithm, often demanding specialized software and experience.

ManimML streamlines this by producing animations of common ML architectures coded in Python, like neural networks.

A user only needs to specify a sequence of neural network layers and their respective hyperparameters. ManimML then constructs an animation of the entire network.

“To use ManimML, you simply need to specify an ML architecture in code, using a syntax familiar to most ML professionals,” Helbling said. “Then it will automatically generate an animation that communicates how the system works.”
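That declarative workflow can be mimicked in a short sketch. The classes below are hypothetical, standard-library-only simplifications of the pattern, not ManimML’s actual API; see the repository for real usage.

```python
# Illustrative stand-in for the declarative style described above: the
# user lists layers and hyperparameters, and the tool derives the whole
# visualization from that spec. These classes are simplified mock-ups,
# not ManimML's real classes.

class FeedForwardLayer:
    def __init__(self, num_units):
        self.num_units = num_units

    def describe(self):
        return f"feed-forward layer ({self.num_units} units)"

class NeuralNetwork:
    def __init__(self, layers):
        self.layers = layers

    def animation_plan(self):
        """Derive an ordered animation plan from the layer spec alone."""
        steps = [layer.describe() for layer in self.layers]
        # Animate a forward pass between each pair of adjacent layers.
        passes = [f"forward pass: layer {i} -> layer {i + 1}"
                  for i in range(len(self.layers) - 1)]
        return steps + passes

# The user's entire job: declare the architecture.
nn = NeuralNetwork([FeedForwardLayer(4), FeedForwardLayer(8), FeedForwardLayer(2)])
plan = nn.animation_plan()
```

The design point is that everything after the layer list is automatic: the tool, not the user, decides how to draw and animate the network.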

ManimML ranked as the best poster in a field of 49 presentations. IEEE VIS 2023 took place Oct. 22-27 in Melbourne, Australia, the first time IEEE held the conference in the Southern Hemisphere.

ManimML has more than 23,000 downloads and a demonstration on social media has hundreds of thousands of views.

ManimML is open source and available at: https://github.com/helblazer811/ManimML


Meet CSE Profile Rafael Orozco

The start of the fall semester can be busy for most Georgia Tech students, but this is especially true for Rafael Orozco. The Ph.D. student in Computational Science and Engineering (CSE) is part of a research group that presented at a major conference in August and is now preparing to host a research meeting in November.

We used the lull between events, research, and classes to meet with Orozco and learn more about his background and interests in this Meet CSE profile.

Student: Rafael Orozco  

Research Interests: Medical Imaging; Seismic Imaging; Generative Models; Inverse Problems; Bayesian Inference; Uncertainty Quantification 

Hometown: Sonora, Mexico 

Tell us briefly about your educational background and how you came to Georgia Tech. 
I studied in Mexico through high school. Then, I did my first two years of undergrad at the University of Arizona and transferred to Bucknell University. I was attracted to Georgia Tech’s CSE program because it is a unique combination of domain science and computer science. It feels like I am both a programmer and a scientist.  

How did you first become interested in computer science and machine learning? 

In high school, I saw a video demonstration of a genetic algorithm on the internet and became interested in the technology. My high school in Mexico did not have a computer science class, but a teacher mentored me and helped me compete at the Mexican Informatics Olympiad. When I started at Arizona, I researched the behavior of clouds from a Bayesian perspective. Since then, my research interests have always involved using Bayesian techniques to infer unknowns.  

You mentioned your background a few times. Since it is National Hispanic Heritage Month, what does this observance mean to you? 

I am quite proud to be a part of this group. In Mexico and the U.S., fellow Hispanics have supported me and my pursuits, so I know firsthand of their kindness and resourcefulness. I think that Hispanic people welcome others, celebrating the joy our culture brings, and they appreciate that our country uses the opportunity to reflect on Hispanic history. 

You study in Professor Felix Herrmann’s Seismic Laboratory for Imaging and Modeling (SLIM) group. In your own words, what does this research group do? 

We develop techniques and software for imaging Earth’s subsurface structures. These range from highly performant partial differential equation solvers to randomized numerical algebra to generative artificial intelligence (AI) models.  

One of the driving goals of each software package we develop is that it needs to scale to real-world applications. This entails imaging seismic volumes that can span cubic kilometers, typically represented by more than 100,000,000 simulation grid cells. In my medical applications, it means producing high-resolution images of human brains resolved to less than half a millimeter.

The International Meeting for Applied Geoscience and Energy (IMAGE) is a recent conference where SLIM gave nine presentations. What research did you present here? 
The challenge of applying machine learning to seismic imaging is that there are no examples of what the Earth looks like. While making high-quality reference images of human tissues for supervised machine learning is possible, no one can “cut open” the Earth to see exactly what it looks like.

To address this challenge, I presented an algorithm that combines generative AI with an unsupervised training objective. We essentially trick the generative model into outputting full earth models by making it blind to which part of the Earth we are asking for. This is like when you take an exam where only a few questions will be graded, but you don’t know which ones, so you answer all the questions just in case.  

While seismic imaging is the basis of SLIM research, there are other applications for the group’s work. Can you discuss more about this? 

The imaging techniques that the energy industry has used for decades to image Earth’s subsurface can be applied almost seamlessly to create medical images of human tissue.

Lately, we have been tackling the particularly difficult modality of using high frequency ultrasound to image through the human skull. In our recent paper, we are exploring a powerful combination between machine learning and physics-based methods that allows us to speed up imaging while adding uncertainty quantification.  
 
We presented the work at this year’s Medical Imaging with Deep Learning (MIDL) conference in July. The medical community was excited by our preliminary results and gave me valuable feedback on how to bring this technique closer to clinical viability.



A scientific machine learning (ML) expert at Georgia Tech is lending a hand in developing an app to identify and help Florida communities most at risk of flooding.

School of Computational Science and Engineering (CSE) Assistant Professor Peng Chen is co-principal investigator of a $1.5 million National Science Foundation grant to develop the CRIS-HAZARD system.

CRIS-HAZARD’s strength derives from integrating geographic information with data mined from community input, like traffic camera videos and social media posts.

This ability helps policymakers identify areas most vulnerable to flooding and address community needs. The app also predicts and assesses flooding in real time to connect victims with first responders and emergency managers.

“Successfully deploying CRIS-HAZARD will harness community knowledge through direct and indirect engagement efforts to inform decision-making,” Chen said. “It will connect individuals to policymakers and serve as a roadmap for helping the most vulnerable communities.”

Chen’s role in CRIS-HAZARD will be to develop new ML models for the app’s prediction capability. These assimilation models integrate the mined data with predictions from current hydrodynamic models.

Along with making an immediate impact in flood-prone coastal communities, Chen said these models could have broader applications in the future. These include models for improved hurricane prediction and management of water resources.

The models Chen will build for CRIS-HAZARD derive from past applications aimed at helping communities.

Chen has crafted similar models for monitoring and mitigating disease spread, including Covid-19. He has also worked on materials science projects to accelerate the design of metamaterials and self-assembly materials.

“Scientific machine learning is a very broad concept and can be applied to many different fields,” Chen said. “Our group looks at how to accelerate optimization, account for risk, and quantify uncertainty in these applications.”

Uncertainty in CRIS-HAZARD is what brings Chen to the project, which is headed by University of South Florida researchers. While the app’s novelty lies in its use of heterogeneous data, inferring predictions can be challenging since the data comes from different sources in varying formats.

To overcome this, Chen intends to build new data assimilation models from scratch powered by deep neural networks (DNNs).

Along with their ability to find connections between heterogeneous data, DNNs are scalable and inexpensive. This beats the alternative of using supercomputers to make the same calculations.

DNNs are also fast and can significantly reduce computational time. According to Chen, the efficiency of DNNs can achieve acceleration hundreds of thousands of times greater than classical models.

Low cost and speed make it possible to run DNN-based simulations many times, improving the reliability of real-time predictions once the DNNs are properly trained.

“The data may not be consistent or compatible since there are different models we’re trying to integrate, making prediction uncertain,” Chen said. “We can run these ML models many times to quantify the uncertainty and give a probability distribution or a range of predictions.”
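The repeated-run idea Chen describes can be sketched briefly. In the toy below, `flood_surrogate` is a hypothetical stand-in for a trained DNN surrogate; real inputs would be mined traffic-camera and social-media features rather than a single rainfall number.

```python
import random
import statistics

def flood_surrogate(rainfall_mm, rng):
    # Pretend prediction: flood depth grows with rainfall, with noise
    # standing in for disagreement among data sources and models.
    return 0.02 * rainfall_mm + rng.gauss(0.0, 0.1)

# Because the surrogate is cheap, it can be evaluated thousands of
# times; the spread of the outputs serves as an uncertainty estimate.
rng = random.Random(42)
runs = [flood_surrogate(150.0, rng) for _ in range(5_000)]

mean_depth = statistics.fmean(runs)
spread = statistics.stdev(runs)
# Report a range of predictions rather than a single point estimate.
low, high = mean_depth - 2 * spread, mean_depth + 2 * spread
```

Reporting the interval `[low, high]` instead of just `mean_depth` is what turns a point forecast into the “probability distribution or a range of predictions” Chen mentions.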

CRIS-HAZARD also exemplifies the power of collaboration across disciplines and universities. In this case, machine learning techniques reach across state boundaries to help people who are vulnerable to flooding and other natural disasters.

USF Professor Barnali Dixon leads the project with Associate Professor Yi Qiang, both geocomputation researchers in the School of Geosciences who incorporate data science and artificial intelligence into their work.

Subhro Guhathakurta collaborates with Chen from Georgia Tech. Along with being a professor in the School of City & Regional Planning, Guhathakurta is director of Tech’s Master of Science in Urban Analytics program and the Center for Spatial Planning Analytics and Visualization.



Advancement in technology brings about plenty of benefits for everyday life, but it also provides cyber criminals and other potential adversaries with new opportunities to cause chaos for their own benefit.

As researchers begin to shape the future of artificial intelligence in manufacturing, Georgia Tech recognizes the potential risks to this technology once it is implemented on an industrial scale. That’s why Associate Professor Saman Zonouz will begin researching ways to protect the nation’s newest investment in manufacturing.

The project is part of the $65 million grant from the U.S. Department of Commerce’s Economic Development Administration to develop the Georgia AI Manufacturing (GA-AIM) Technology Corridor. While the main purpose of the grant is to develop ways of integrating artificial intelligence into manufacturing, it will also help advance cybersecurity research, educational outreach, and workforce development.

“When introducing new capabilities, we don’t know about its cybersecurity weaknesses and landscape,” said Zonouz. “In the IT world, the potential cybersecurity vulnerabilities and corresponding mitigation are clear, but when it comes to artificial intelligence in manufacturing, the best practices are uncertain. We don’t know what all could go wrong.”

Zonouz will work alongside other Georgia Tech researchers in the new Advanced Manufacturing Pilot Facility (AMPF) to pinpoint where those inevitable attacks will come from and how they can be repelled. Along with a team of Ph.D. students, Zonouz will create a roadmap for future researchers, educators, and industry professionals to use when detecting and responding to cyberattacks.

“As we increasingly rely on computing and artificial intelligence systems to drive innovation and competitiveness, there is a growing recognition that the security of these systems is of paramount importance if we are to realize the anticipated gains,” said Michael Bailey, Inaugural Chair of the School of Cybersecurity and Privacy (SCP). “Professor Zonouz is an expert in the security of industrial control systems and will be a vital member of the new coalition as it seeks to provide leadership in manufacturing automation.”

Before coming to Georgia Tech, Zonouz worked with the School of Electrical and Computer Engineering (ECE) and the College of Engineering on protecting and studying the cyber-physical systems of manufacturing. He worked with Raheem Beyah, Dean of the College of Engineering and ECE professor, on several research papers, including two published at the 26th USENIX Security Symposium and the Network and Distributed System Security Symposium.

“As Georgia Tech continues to position itself as a leader in artificial intelligence manufacturing, interdisciplinary collaboration is not only an added benefit, it is fundamental,” said Arijit Raychowdhury, Steve W. Chaddick School Chair and Professor of ECE. “Saman’s cybersecurity expertise will play a crucial role in the overall protection and success of GA-AIM and AMPF. ECE is proud to have him representing the school on this important project.”

The research is expected to take five years, which is typical for a project of this scale. Apart from research, the GA-AIM program will include workforce development and educational outreach. The cyber testbed developed by Zonouz and his team will live in the 24,000-square-foot AMPF facility.

News Contact

JP Popham 

Communications Officer | School of Cybersecurity and Privacy

Georgia Institute of Technology

jpopham3@gatech.edu | scp.cc.gatech.edu