
A new machine learning (ML) model from Georgia Tech could protect communities from diseases, better manage electricity consumption in cities, and promote business growth, all at the same time.

Researchers from the School of Computational Science and Engineering (CSE) created the Large Pre-Trained Time-Series Model (LPTM) framework. LPTM is a single foundational model that completes forecasting tasks across a broad range of domains. 

Along with performing as well as or better than models purpose-built for their applications, LPTM requires 40% less data and 50% less training time than current baselines. In some cases, LPTM can be deployed without any training data.

The key to LPTM is that it is pre-trained on datasets from different industries like healthcare, transportation, and energy. The Georgia Tech group created an adaptive segmentation module to make effective use of these vastly different datasets.

The Georgia Tech researchers will present LPTM in Vancouver, British Columbia, Canada, at the 2024 Conference on Neural Information Processing Systems (NeurIPS 2024). NeurIPS is one of the world’s most prestigious conferences on artificial intelligence (AI) and ML research.

“The foundational model paradigm started with text and image, but people haven’t explored time-series tasks yet because those were considered too diverse across domains,” said B. Aditya Prakash, one of LPTM’s developers. 

“Our work is a pioneer in this new area of exploration, where only a few attempts have been made so far.”


Foundational models are trained with data from many different fields, making them powerful general-purpose tools. Foundational models drive GPT, DALL-E, and other popular generative AI platforms used today. LPTM is different, though, because it is geared toward time-series forecasting, not text and image generation.

The Georgia Tech researchers trained LPTM on datasets spanning epidemics, macroeconomics, power consumption, traffic and transportation, stock markets, and human motion and behavior.

After training, the group pitted LPTM against 17 other models on nine real-world benchmarks. LPTM performed best on five datasets and placed second on the other four.

The nine benchmarks contained data from real-world collections. These included the spread of influenza in the U.S. and Japan; electricity, traffic, and taxi demand in New York; and financial markets.

The competitor models were purpose-built for their fields. While each model performed well on one or two benchmarks closest to its designed purpose, the models ranked in the middle or bottom on others.

In another experiment, the Georgia Tech group tested LPTM against seven baseline models on the same nine benchmarks in zero-shot forecasting tasks. Zero-shot means the model is used out of the box, without being trained on any data from the target domain. LPTM outperformed every model across all benchmarks in this trial.

LPTM was a consistent top performer on all nine benchmarks, demonstrating the model’s potential to achieve superior forecasting results across multiple applications with less data and fewer resources.

“Our model also goes beyond forecasting and helps accomplish other tasks,” said Prakash, an associate professor in the School of CSE. 

“Classification is a useful time-series task that allows us to understand the nature of the time-series and label whether that time-series is something we understand or is new.”

One reason traditional models are custom-built to their purpose is that fields differ in reporting frequency and trends. 

For example, epidemic data is often reported weekly and goes through seasonal peaks with occasional outbreaks. Economic data is captured quarterly and typically remains consistent and monotone over time. 

LPTM’s adaptive segmentation module allows it to overcome these timing differences across datasets. When LPTM receives a dataset, the module breaks the data into segments of different sizes, scores all the candidate segmentations, and chooses the one from which useful patterns are easiest to learn.
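
The paper's actual segmentation and scoring procedure isn't spelled out here, but a minimal sketch conveys the idea. The reconstruction-error score, candidate sizes, and function names below are illustrative assumptions, not LPTM's implementation:

    import numpy as np

    def segment(series, size):
        # Split a 1-D series into consecutive windows of the given size.
        n = len(series) // size
        return series[: n * size].reshape(n, size)

    def difficulty(segments):
        # Illustrative "ease of learning" score: how poorly each segment
        # is approximated by its own mean (lower = easier to model).
        means = segments.mean(axis=1, keepdims=True)
        return float(np.mean((segments - means) ** 2))

    def best_segment_size(series, candidate_sizes=(4, 8, 16, 32)):
        # Score every candidate size and keep the easiest segmentation.
        return min(candidate_sizes, key=lambda s: difficulty(segment(series, s)))

    rng = np.random.default_rng(0)
    weekly_counts = rng.poisson(lam=50, size=256).astype(float)  # toy epidemic-like data
    print(best_segment_size(weekly_counts))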

LPTM’s performance, enhanced by the innovation of adaptive segmentation, earned the model acceptance for presentation at NeurIPS 2024. NeurIPS is one of three primary international conferences on high-impact research in AI and ML. NeurIPS 2024 runs Dec. 10-15.

Ph.D. student Harshavardhan Kamarthi partnered with Prakash, his advisor, on LPTM. The duo are among the 162 Georgia Tech researchers presenting over 80 papers at the conference. 

Prakash is one of 46 Georgia Tech faculty with research accepted at NeurIPS 2024. Nine School of CSE faculty members, nearly one-third of the School's faculty, are authors or co-authors of 17 papers accepted at the conference.

Along with sharing their research at NeurIPS 2024, Prakash and Kamarthi released an open-source library of foundational time-series modules that data scientists can use in their applications.

“Given the interest in AI from all walks of life, including business, social, and research and development sectors, a lot of work has been done and thousands of strong papers are submitted to the main AI conferences,” Prakash said. 

“Acceptance of our paper speaks to the quality of the work and its potential to advance foundational methodology, and we hope to share that with a larger audience.”

News Contact

Bryant Wine, Communications Officer
bryant.wine@cc.gatech.edu


Deven Desai and Mark Riedl have seen the signs for a while. 

In the two years since OpenAI introduced ChatGPT, dozens of lawsuits have been filed alleging that technology companies infringed copyright by using published works to train artificial intelligence (AI) models.

Academic AI research efforts could be significantly hindered if courts rule in the plaintiffs' favor. 

Desai and Riedl are Georgia Tech researchers raising awareness about how these court rulings could force academic researchers to construct new AI models with limited training data. The two collaborated on a benchmark academic paper that examines the landscape of the ethical issues surrounding AI and copyright in industry and academic spaces.

“There are scenarios where courts may overreact to having a book corpus on your computer, and you didn’t pay for it,” Riedl said. “If you trained a model for an academic paper, as my students often do, that’s not a problem right now. The courts could deem training is not fair use. That would have huge implications for academia.

“We want academics to be free to do their research without fear of repercussions in the marketplace because they’re not competing in the marketplace,” Riedl said. 

Desai is the Sue and John Stanton Professor of Business Law and Ethics at the Scheller College of Business. He researches how business interests and new technology shape privacy, intellectual property, and competition law. Riedl is a professor at the College of Computing’s School of Interactive Computing, researching human-centered AI, generative AI, explainable AI, and gaming AI. 

Their paper, “Between Copyright and Computer Science: The Law and Ethics of Generative AI,” was published in the Northwestern Journal of Technology and Intellectual Property on Monday.

Desai and Riedl say they want to offer solutions that balance the interests of various stakeholders. But that requires compromise from all sides.

Researchers should accept they may have to pay for the data they use to train AI models. Content creators, on the other hand, should receive compensation, but they may need to accept less money to ensure data remains affordable for academic researchers to acquire.

Who Benefits?

The doctrine of fair use is at the center of every copyright debate. According to the U.S. Copyright Office, fair use permits the unlicensed use of copyright-protected works in certain circumstances, such as distributing information for the public good, including teaching and research.

Fair use is often challenged when one or more parties profit from published works without compensating the authors.

Any original published content, including a personal website on the internet, is protected by copyright. However, copyrighted material is republished on websites or posted on social media innumerable times every day without the consent of the original authors. 

In most cases, it’s unlikely copyright violators gained financially from their infringement.

But Desai said business-to-business cases are different. The New York Times is one of many daily newspapers and media companies that have sued OpenAI for using its content as training data. Microsoft is also a defendant in The New York Times’ suit because it invested billions of dollars into OpenAI’s development of AI tools like ChatGPT.

“You can take a copyrighted photo and put it in your Twitter post or whatever you want,” Desai said. “That’s probably annoying to the owner. Economically, they probably wanted to be paid. But that’s not business to business. What’s happening with OpenAI and The New York Times is business to business. That’s big money.”

OpenAI started as a nonprofit dedicated to the safe development of artificial general intelligence (AGI) — AI that, in theory, can rival human thinking and possess autonomy.

These AI models would require massive amounts of data and expensive supercomputers to process that data. OpenAI could not raise enough money to afford such resources, so it created a for-profit arm controlled by its parent nonprofit.

Desai, Riedl, and many others argue that OpenAI ceased its research mission for the public good and began developing consumer products. 

“If you’re doing basic research that you’re not releasing to the world, it doesn’t matter if every so often it plagiarizes The New York Times,” Riedl said. “No one is economically benefitting from that. When they became a for-profit and produced a product, now they were making money from plagiarized text.”

OpenAI’s for-profit arm is valued at $80 billion, but content creators have not received a dime, even though the company has scraped massive amounts of copyrighted material as training data.

The New York Times has posted warnings on its sites that its content cannot be used to train AI models. Many other websites publish a robots.txt file that contains instructions for bots about which pages can and cannot be accessed.

Neither measure is legally binding, and both are often ignored.
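
For illustration, a typical robots.txt rule aimed at AI crawlers looks like the following (GPTBot is OpenAI's web crawler; the directive asks it to skip the entire site, but compliance is voluntary):

    User-agent: GPTBot
    Disallow: /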

Solutions

Desai and Riedl offer a few options for companies to show good faith in rectifying the situation.

  • Spend the money. Desai says OpenAI and Microsoft could have afforded their training data and avoided the hassle of legal consequences.

    “If you do the math on the costs to buy the books and copy them, they could have paid for them,” he said. “It would’ve been a multi-million dollar investment, but they’re a multi-billion dollar company.”
     
  • Be selective. Models can be trained on randomly selected texts from published works, allowing the model to understand the writing style without plagiarizing. 

    “I don’t need the entire text of War and Peace,” Desai said. “To capture the way authors express themselves, I might only need a hundred pages. I’ve also reduced the chance that my model will cough up entire texts.”
     
  • Leverage libraries. The authors agree libraries could serve as an ideal middle ground as a place to store published works and compensate authors for access to those works, though the amount may be less than desired.

    “Most of the objections you could raise are taken care of,” Desai said. “They are legitimate access copies that are secure. You get access to only as much as you need. Libraries at universities have already become schools of information.”

Desai and Riedl hope the legal action taken by publications like The New York Times will send a message to companies that develop AI tools to pump the brakes. If they don’t, researchers uninterested in profit could pay the steepest price.

The authors say it’s not a new problem but is reaching a boiling point.

“In the history of copyright, there are ways that society has dealt with the problem of compensating creators and technology that copies or reduces your ability to extract money from your creation,” Desai said. “We wanted to point out there’s a way to get there.”

News Contact

Nathan Deen
Communications Officer
School of Interactive Computing


The Automatic Speech Recognition (ASR) models that power voice assistants like Amazon Alexa may have difficulty transcribing English speakers with minority dialects.

A study by Georgia Tech and Stanford researchers compared the transcribing performance of leading ASR models for people using Standard American English (SAE) and three minority dialects — African American Vernacular English (AAVE), Spanglish, and Chicano English.

Interactive Computing Ph.D. student Camille Harris is the lead author of a paper accepted to the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), taking place this week in Miami.

Harris recruited people who spoke each dialect and had them read from a Spotify podcast dataset, which includes podcast audio and metadata. Harris then used three ASR models — wav2vec 2.0, HuBERT, and Whisper — to transcribe the audio and compare their performance.
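
For readers unfamiliar with this kind of evaluation, the sketch below shows how one model's output might be scored against a reference transcript using word error rate (WER). The file name, reference text, and choice of the Hugging Face pipeline are illustrative assumptions, not the study's actual pipeline:

    # pip install transformers jiwer torch
    import jiwer
    from transformers import pipeline

    # Load one of the ASR models compared in the study (Whisper here).
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-base")

    # Hypothetical inputs: one recording and its ground-truth transcript.
    hypothesis = asr("speaker_sample.wav")["text"]
    reference = "the ground-truth transcript of the recording"

    # Word error rate: lower is better. Comparing average WER across
    # dialect and gender groups surfaces the gaps the study measured.
    print("WER:", jiwer.wer(reference, hypothesis))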

For each model, Harris found SAE transcription significantly outperformed each minority dialect. The models more accurately transcribed men who spoke SAE than women who spoke SAE. Participants who spoke Spanglish and Chicano English had the least accurate transcriptions out of the test groups.

While the models transcribed SAE-speaking women less accurately than their male counterparts, that did not hold true across minority dialects. Minority men had the most inaccurate transcriptions of all demographics in the study.

“I think people would expect if women generally perform worse and minority dialects perform worse, then the combination of the two must also perform worse,” Harris said. “That’s not what we observed. 

“Sometimes minority dialect women performed better than Standard American English. We found a consistent pattern that men of color, particularly Black and Latino men, could be at the highest risk for these performance errors.”

Addressing underrepresentation

Harris said the cause of that outcome starts with the training data used to build these models. Model performance reflected the underrepresentation of minority dialects in the data sets.

AAVE performed best under the Whisper model, which Harris said had the most inclusive training data of minority dialects.

Harris also looked at whether her findings mirrored existing systems of oppression. Black men have high incarceration rates and are among the groups most targeted by police. Harris said there could be a correlation between that and the low rate of Black men enrolled in universities, which leads to less representation in technology spaces.

“Minority men performing worse than minority women doesn’t necessarily mean minority men are more oppressed,” she said. “They may be less represented than minority women in computing and the professional sector that develops these AI systems.”

Harris also had to account for a few confounding variables within AAVE, including code-switching and various regional subdialects.

Harris noted in her study there were cases of code-switching to SAE. Speakers who code-switched performed better than speakers who did not. 

Harris also tried to include different regional speakers.

“It’s interesting from a linguistic and history perspective if you look at migration patterns of Black folks — perhaps people moving from a southern state to a northern state over time creates different linguistic variations,” she said. “There are also generational variations in that older Black Americans may speak differently from younger folks. I think the variation was well represented in our data. We wanted to be sure to include that for robustness.”

TikTok barriers

Harris said she built her study on a paper she authored that examined user-design barriers and biases faced by Black content creators on TikTok. She presented that paper at the Association for Computing Machinery’s (ACM) 2023 Conference on Computer-Supported Cooperative Work and Social Computing (CSCW).

Those content creators depended on TikTok for a significant portion of their income. When providing captions for videos grew in popularity, those creators noticed the ASR tool built into the app inaccurately transcribed them. That forced the creators to manually input their captions, while SAE speakers could use the ASR feature to their benefit.

“Minority users of these technologies will have to be more aware and keep in mind that they’ll probably have to do a lot more customization because things won’t be tailored to them,” Harris said.

Harris said there are ways that designers of ASR tools could work toward being more inclusive of minority dialects, but cultural challenges could arise.

“It could be difficult to collect more minority speech data, and you have to consider consent with that,” she said. “Developers need to be more community-engaged to think about the implications of their models and whether it’s something the community would find helpful.”

News Contact

Nathan Deen
Communications Officer
School of Interactive Computing


Members of the recently victorious cybersecurity group known as Team Atlanta received recognition from one of the top technology companies in the world for their discovery of a zero-day vulnerability in the DARPA AI Cyber Challenge (AIxCC) earlier this year. 

On November 1, security researchers from Google’s Project Zero announced they had been inspired by the Georgia Tech students and alumni who discovered a flaw in SQLite, the widely used open-source database that ran the competition’s scoring algorithm.

According to a post on the project’s blog, when Google researchers saw the success of Atlantis, the large language model (LLM)-based system used in AIxCC, they deployed their own LLM to check for vulnerabilities in SQLite.

Google’s Big Sleep tool discovered a security flaw in SQLite: an exploitable stack buffer underflow. Project Zero reported the vulnerability, and it was patched almost immediately.

“We’re thrilled to see our work on LLM-based bug discovery and remediation inspiring further advancements in security research at Google,” said Hanqing Zhao, a Georgia Tech Ph.D. student. “It’s incredibly rewarding to witness the broader community recognizing and citing our contributions to AI and LLM-driven security efforts.”

Zhao led a group within Team Atlanta focused on tracking their project’s success during the competition, leading to the bug's discovery. He also wrote a technical breakdown of their findings in a blog post cited by Google’s Project Zero. 

“This achievement was entirely autonomous, without any human intervention, and we hadn’t even anticipated targeting SQLite3,” he said. “The outcome highlighted the transformative potential of generative AI in security research. Our approach is rooted in a simple yet effective philosophy: mimic the expertise of seasoned security researchers using LLMs.”

The AIxCC semi-final competition was held at DEF CON 32 in Las Vegas. Team Atlanta, which included Georgia Tech experts, was among the contest’s winners.

Team Atlanta will now compete against six other teams in the final round, which will take place at DEF CON 33 in August 2025. The finalists will use the $2 million semi-final prize to improve their AI system over the next 12 months. Team Atlanta consists of past and present Georgia Tech students and was put together with the help of School of Cybersecurity and Privacy (SCP) Professor Taesoo Kim.

The AI systems in the finals must be open-sourced and ready for immediate, real-world launch. The AIxCC final competition will award the champion a $4 million grand prize.

The team tested their cyber reasoning system (CRS), dubbed Atlantis, on software used for data management, website support, healthcare systems, supply chains, electrical grids, transportation, and other critical infrastructures.

Atlantis is a next-generation bug-finding and fixing system that can hunt bugs in multiple coding languages. The system immediately issues accurate software patches without any human intervention.

AIxCC is a Pentagon-backed initiative announced in August 2023 that will award up to $20 million in prize money throughout the competition. Team Atlanta was among the 42 teams that qualified for the semi-final competition earlier this year.

News Contact

John Popham

Communications Officer II | School of Cybersecurity and Privacy


School of Mathematics Associate Professor Molei Tao has been honored with a Sony Faculty Innovation Award for his work on the foundations of machine learning, particularly diffusion generative models. The award, which includes a $100,000 grant, is part of an international program sponsored by Sony that provides funding for cutting-edge academic research across a wide range of disciplines.

Tao is an applied and computational mathematician who designs and synergizes mathematical tools to solve practical problems. Recently, he has focused on the applications of these tools to machine learning. Tao works on multiple subareas of machine learning, including deep learning theory, probabilistic methods, generative modeling, and artificial intelligence for science (“AI4Science”). 

"Molei is doing breakthrough work on machine learning and artificial intelligence,” says Mike Wolf, chair of the School of Mathematics. “It is wonderful to see him recognized by Sony, both for his accomplishments so far and also his promise for the future. His unique perspectives, informed by an astonishing deep breadth of understanding of mathematics, have already made him one of the more prominent researchers in this extremely competitive and important field. I know that this award will fuel even more impactful works. We are just thrilled to have Molei on our faculty in the School of Mathematics."

Revolutionizing Generative AI

The award recognizes Tao’s research on the mathematical and algorithmic aspects of diffusion generative modeling, which is considered one of the foundations of modern Generative AI. Using advanced machine learning algorithms, these models have revolutionized the generation of image, video, and 3D content. 

“Exciting products such as ChatGPT, Stable Diffusion, and Sora are generative AI tools, and a good number of them are powered by diffusion models,” explains Tao. “The way the magic works is you basically give a machine learning model a collection of training data, and then the algorithm can generate more content that is similar to the training data. The ability to generate new content is called generative modeling. The diffusion model is one of the latest technologies for generative modeling.”
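
As a rough illustration of the mechanism Tao describes, here is a toy NumPy sketch of the forward noising process at the core of standard diffusion models; generating new content then amounts to training a model to reverse these steps. The schedule values are conventional textbook defaults, not specifics of Tao's work:

    import numpy as np

    # A standard noise schedule over T diffusion steps.
    T = 1000
    betas = np.linspace(1e-4, 0.02, T)
    alpha_bars = np.cumprod(1.0 - betas)

    def noisy_version(x0, t, rng):
        # Forward diffusion: blend the clean sample x0 with Gaussian noise.
        # By the final step the sample is nearly pure noise; a trained
        # model generates content by learning to undo this corruption.
        eps = rng.standard_normal(x0.shape)
        return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

    rng = np.random.default_rng(0)
    x0 = np.sin(np.linspace(0, 2 * np.pi, 64))  # toy "training sample"
    print(noisy_version(x0, T - 1, rng)[:4])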

Tao’s work aims to make diffusion models more versatile and scalable. He hopes to broaden their application and possibly create the next generation of generative modeling tools. 

“The large-scale impact of this research is to make generative AI more accessible, more creative, safer, and more trustworthy,” he adds. 

To learn more about Tao's research, visit his blog or follow him on Twitter at @MoleiTaoMath.

News Contact

Amanda Cook
Communications Officer II
College of Sciences

Editor and Contact: Lindsay C. Vidal
Assistant Director of Communications


Tech AI at Georgia Tech has appointed Tim Brown as interim director of professional education. He is also the new academic program director for AI, a joint appointment by Tech AI and Georgia Tech Professional Education (GTPE). Previously, Brown served as managing director of Georgia Tech’s Supply Chain and Logistics Institute for nearly 10 years, where he focused on program expansion and partnership development.

In his new role, Brown will work closely with the College of Lifetime Learning and Tech AI to develop innovative AI programs. He will identify industry needs and create interdisciplinary academic offerings serving a diverse range of learners, from K-12 students to executives. His initial emphasis will be on mid-career professionals seeking to upskill or reskill, equipping them with the technical skills essential for success in the AI field. He will also enhance existing programs and provide educational opportunities to companies and organizations, addressing current market demands.

Brown has more than 35 years of experience in professional education and supply chain optimization, including roles at IBM, Accenture, Chainalytics, Frito-Lay, and Tropicana. He has worked with executives in various industries, advising on supply chain management and securing $81 million in funding for AI in manufacturing through the Georgia AIM coalition.

In a statement, Brown said, “I look forward to contributing to innovative AI programs at Georgia Tech. Our goal is to create educational opportunities that meet the diverse needs of learners and equip them with the skills necessary to thrive in this evolving field.”

Brown’s leadership underscores Georgia Tech’s commitment to innovation and education in the rapidly changing landscape of AI. He is dedicated to establishing Georgia as a leader in AI and highlighting the resources and capabilities that Georgia Tech offers.


The U.S. Department of Energy (DOE) has awarded Georgia Tech researchers a $4.6 million grant to develop improved cybersecurity protection for renewable energy technologies. 

Associate Professor Saman Zonouz will lead the project, leveraging the latest artificial intelligence (AI) to create Phorensics. The new tool will anticipate cyberattacks on critical infrastructure and provide analysts with an accurate reading of what vulnerabilities were exploited.

“This grant enables us to tackle one of the crucial challenges facing national security today: our critical infrastructure resilience and post-incident diagnostics to restore normal operations in a timely manner,” said Zonouz.

“Together with our amazing team, we will focus on cyber-physical data recovery and post-mortem forensics analysis after cybersecurity incidents in emerging renewable energy systems.”

As the integration of renewable energy technology into national power grids increases, so does their vulnerability to cyberattacks. These threats put energy infrastructure at risk and pose a significant danger to public safety and economic stability. The AI behind Phorensics will allow analysts and technicians to scale security efforts to keep up with a growing power grid that is becoming more complex.

This effort is part of the Security of Engineering Systems (SES) initiative at Georgia Tech’s School of Cybersecurity and Privacy (SCP). SES has three pillars: research, education, and testbeds, with multiple ongoing large, sponsored efforts. 

“We had a successful hiring season for SES last year and will continue filling several open tenure-track faculty positions this upcoming cycle,” said Zonouz.

“With top-notch cybersecurity and engineering schools at Georgia Tech, we have begun the SES journey with a dedicated passion to pursue building real-world solutions to protect our critical infrastructures, national security, and public safety.”

Zonouz is the director of the Cyber-Physical Systems Security Laboratory (CPSec) and is jointly appointed by Georgia Tech’s School of Cybersecurity and Privacy (SCP) and the School of Electrical and Computer Engineering (ECE).

The three Georgia Tech researchers joining him on this project are Brendan Saltaformaggio, associate professor in SCP and ECE; Taesoo Kim, jointly appointed professor in SCP and the School of Computer Science; and Animesh Chhotaray, research scientist in SCP.

Katherine Davis, associate professor in the Texas A&M University Department of Electrical and Computer Engineering, has partnered with the team to develop Phorensics. The team will also collaborate with the National Renewable Energy Laboratory (NREL) and industry partners on technology transfer and commercialization initiatives.

The Energy Department defines renewable energy as energy from unlimited, naturally replenished resources, such as the sun, tides, and wind. Renewable energy can be used for electricity generation, space and water heating and cooling, and transportation.

News Contact

John Popham

Communications Officer II

College of Computing | School of Cybersecurity and Privacy


Two new assistant professors joined the School of Computational Science and Engineering (CSE) faculty this fall. Lu Mi comes to Georgia Tech from the Allen Institute for Brain Science in Seattle, where she was a Shanahan Foundation Fellow. 

We sat down with Mi to learn more about her background and to introduce her to the Georgia Tech and College of Computing communities. 

Faculty: Lu Mi, assistant professor, School of CSE

Research Interests: Computational Neuroscience, Machine Learning

Education: Ph.D. in Computer Science from the Massachusetts Institute of Technology; B.S. in Measurement, Control, and Instruments from Tsinghua University

Hometown: Sichuan, China (home of the giant pandas) 

How have your first few months at Georgia Tech gone so far?

I’ve really enjoyed my time at Georgia Tech. Developing a new course has been both challenging and rewarding. I’ve learned a lot from the process and conversations with students. My colleagues have been incredibly welcoming, and I’ve had the opportunity to work with some very smart and motivated students here at Georgia Tech.

You hit the ground running this year by teaching your CSE 8803 course on brain-inspired machine intelligence. What important concepts do you teach in this class?

This course focuses on comparing biological neural networks with artificial neural networks. We explore questions like: How does the brain encode information, perform computations, and learn? What can neuroscience and artificial intelligence (AI) learn from each other? Key topics include spiking neural networks, neural coding, and biologically plausible learning rules. By the end of the course, I expect students to have a solid understanding of neural algorithms and the emerging NeuroAI field.

When and how did you become interested in computational neuroscience in the first place?

I’ve been fascinated by how the brain works since I was young. My formal engagement with the field began during my Ph.D. research, where we developed algorithms to help neuroscientists map large-scale synaptic wiring diagrams in the brain. Since then, I’ve had the opportunity to collaborate with researchers at institutions like Harvard, the Janelia Research Campus, the Allen Institute for Brain Science, and the University of Washington on various exciting projects in this field.

What about your experience and research are you currently most proud of?

I’m particularly proud of the framework we developed to integrate black-box machine learning models with biologically realistic mechanistic models. We use advanced deep-learning techniques to infer unobserved information and combine this with prior knowledge from mechanistic models. This allows us to test hypotheses by applying different model variants. I believe this framework holds great potential to address a wide range of scientific questions, leveraging the power of AI.

What about Georgia Tech convinced you to accept a faculty position?

Georgia Tech CSE felt like a perfect fit for my background and research interests, particularly within the AI4Science initiative and the development of computational tools for biology and neuroscience. My work overlaps with several colleagues here, and I’m excited to collaborate with them. Georgia Tech also has a vibrant and impactful Neuro Next Initiative community, which is another great attraction.

What are your hobbies and interests when not researching and teaching?

I enjoy photography and love spending time with my two corgi dogs, especially taking them for walks.

What have you enjoyed most so far about living in Atlanta? 

I’ve really appreciated the peaceful, green environment with so many trees. I’m also looking forward to exploring more outdoor activities, like fishing and golfing.

News Contact

Bryant Wine, Communications Officer
bryant.wine@cc.gatech.edu


A new surgery planning tool powered by augmented reality (AR) is in development for doctors who need closer collaboration when planning heart operations. Promising results from a recent usability test have moved the platform one step closer to everyday use in hospitals worldwide.

Georgia Tech researchers partnered with medical experts from Children’s Healthcare of Atlanta (CHOA) to develop and test ARCollab. The iOS-based app leverages advanced AR technologies to let doctors collaborate and interact with a patient’s 3D heart model when planning surgeries.

The usability evaluation demonstrates the app’s effectiveness, finding that ARCollab is easy to use and understand, fosters collaboration, and improves surgical planning.

“This tool is a step toward easier collaborative surgical planning. ARCollab could reduce the reliance on physical heart models, saving hours and even days of time while maintaining the collaborative nature of surgical planning,” said M.S. student Pratham Mehta, the app’s lead researcher.

“Not only can it benefit doctors when planning for surgery, it may also serve as a teaching tool to explain heart deformities and problems to patients.”

Two cardiologists and three cardiothoracic surgeons from CHOA tested ARCollab. The two-day study ended with the doctors taking a 14-question survey assessing the app’s usability. The survey also solicited general feedback and top features.

The Georgia Tech group determined from the open-ended feedback that:

  • ARCollab enables new collaboration capabilities that are easy to use and facilitate surgical planning.
  • Anchoring the model to a physical space is important for better interaction.
  • Portability and real-time interaction are crucial for collaborative surgical planning.

Users rated each of the 14 questions on a 7-point Likert scale, with one being “strongly disagree” and seven being “strongly agree.” The 14 questions were organized into five categories: overall, multi-user, model viewing, model slicing, and saving and loading models.

The multi-user category attained the highest rating with an average of 6.65. This included a unanimous 7.0 rating that it was easy to identify who was controlling the heart model in ARCollab. The scores also showed it was easy for users to connect with devices, switch between viewing and slicing, and view other users’ interactions.

The model slicing category received the lowest average, a still-strong 5.5. These questions assessed the ease of use and understanding of finger gestures and the usefulness of toggling slice direction.
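
As a small worked example, category averages of this kind reduce to a mean over participants and questions. The ratings below are hypothetical stand-ins, not the study's raw data:

    # Hypothetical 7-point Likert ratings (rows: participants, columns: questions).
    ratings = {
        "multi-user": [[7, 7, 6], [7, 6, 7], [7, 7, 7], [6, 7, 6], [7, 6, 7]],
        "model slicing": [[5, 6], [6, 5], [5, 5], [6, 6], [5, 6]],
    }

    for category, rows in ratings.items():
        scores = [s for row in rows for s in row]
        print(category, round(sum(scores) / len(scores), 2))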

Based on feedback, the researchers will explore adding support for remote collaboration. This would assist doctors in collaborating when not in a shared physical space. Another improvement is extending the save feature to support multiple states.

“The surgeons and cardiologists found it extremely beneficial for multiple people to be able to view the model and collaboratively interact with it in real-time,” Mehta said.

The user study took place in a CHOA classroom. CHOA also provided a 3D heart model for the test using anonymous medical imaging data. Georgia Tech’s Institutional Review Board (IRB) approved the study and the group collected data in accordance with Institute policies.

The five test participants regularly perform cardiovascular surgical procedures and are employed by CHOA. 

The Georgia Tech group provided each participant with an iPad Pro with the latest iOS version and the ARCollab app installed. Using commercial devices and software meets the group’s intentions to make the tool universally available and deployable.

“We plan to continue iterating ARCollab based on the feedback from the users,” Mehta said. 

“The participants suggested the addition of a ‘distance collaboration’ mode, enabling doctors to collaborate even if they are not in the same physical environment. This would allow them to facilitate surgical planning sessions from home or elsewhere.”

The Georgia Tech researchers are presenting ARCollab and the user study results at IEEE VIS 2024, the Institute of Electrical and Electronics Engineers (IEEE) visualization conference. 

IEEE VIS is the world’s most prestigious conference for visualization research and the second-highest rated conference for computer graphics. It takes place virtually Oct. 13-18, moved from its venue in St. Pete Beach, Florida, due to Hurricane Milton.

The ARCollab research group's presentation at IEEE VIS comes months after they shared their work at the Conference on Human Factors in Computing Systems (CHI 2024).

Undergraduate student Rahul Narayanan and alumni Harsha Karanth (M.S. CS 2024) and Haoyang (Alex) Yang (CS 2022, M.S. CS 2023) co-authored the paper with Mehta. They study under Polo Chau, a professor in the School of Computational Science and Engineering.

The Georgia Tech group partnered with Dr. Timothy Slesnick and Dr. Fawwaz Shaw from CHOA on ARCollab’s development and user testing.

"I'm grateful for these opportunities since I get to showcase the team's hard work," Mehta said.

“I can meet other like-minded researchers and students who share these interests in visualization and human-computer interaction. There is no better form of learning.”

News Contact

Bryant Wine, Communications Officer
bryant.wine@cc.gatech.edu


New cybersecurity research initiatives into generative artificial intelligence (AI) tools will soon be underway at Georgia Tech, thanks to the efforts of a new assistant professor in the School of Cybersecurity and Privacy (SCP).

While some researchers seek ways to integrate AI into security practices, Teodora Baluta studies the algorithms and datasets used to train new AI tools to assess their security in theory and practice.

Specifically, she investigates whether the outputs from generative AI tools are abusing data or producing text based on stolen data. As one of Georgia Tech’s newest faculty members, Baluta is determined to build on the research she completed during her Ph.D. at the National University of Singapore.

She plans to expand on her past work by continuing to analyze existing AI technologies and researching ways to build better machine learning systems with security measures already in place.

“One thing that excites me about joining SCP is its network of experts that can weigh in on aspects that are outside of my field,” said Baluta. “I am really looking forward to building on my past works by studying the bigger security picture of AI and machine learning.” 

As a new faculty member, Baluta is looking for Ph.D. students interested in joining her in these new research initiatives.

“We’re going to be looking at topics such as the mathematical possibility of detecting deep fakes, uncovering the malicious intent behind AI use, and how to build better AI models with security and privacy safeguards,” she said. 

Baluta’s research has been recognized by Google’s Ph.D. fellowship program and Georgia Tech’s EECS Rising Stars Workshop in 2023. As a Ph.D. student, she earned the Dean’s Graduate Research Excellence Award and the President’s Graduate Fellowship at the National University of Singapore. She was also selected as a finalist for the Microsoft Research Ph.D. Fellowship, Asia-Pacific.

News Contact

John Popham

Communications Officer II

School of Cybersecurity and Privacy