Dec. 06, 2024
The surface is covered with fine ash. The lava fields stretch for miles, punctuated only by basalt mountains. But life can be found here if you look hard enough.
This barren land isn't Mars or Pluto, but volcanic deserts in Iceland. The environment is so comparable to Mars' arid landscape that researchers can use it as an analog. From Earth, they can extrapolate how planets in our galaxy and beyond could sustain life and what tools humans might need to make homes on these planets.
Georgia Tech researchers explore everywhere from Oregon's mountaintops to Arizona's deserts to better understand space — and life on this planet.
Dec. 03, 2024
Georgia Tech researchers have created a dataset that trains computer models to understand nuances in human speech during financial earnings calls. The dataset provides a new resource to study how public correspondence affects businesses and markets.
SubjECTive-QA is the first human-curated dataset of question-answer pairs from earnings call transcripts (ECTs). The dataset teaches models to identify subjective features in ECTs, like clarity and cautiousness.
The dataset lays the foundation for a new approach to identifying disinformation and misinformation caused by nuances in speech. While ECT responses can be technically true, unclear or irrelevant information can misinform stakeholders and affect their decision-making.
Tests on White House press briefings showed that the dataset applies to other sectors with frequent question-and-answer encounters, notably politics, journalism, and sports. This increases the odds of effectively informing audiences and improving transparency across public spheres.
The paper, which sits at the intersection of natural language processing and finance, was accepted to NeurIPS 2024, the 38th Annual Conference on Neural Information Processing Systems. NeurIPS is one of the world’s most prestigious conferences on artificial intelligence (AI) and machine learning (ML) research.
“SubjECTive-QA has the potential to revolutionize nowcasting predictions with enhanced clarity and relevance,” said Agam Shah, the project’s lead researcher.
“Its nuanced analysis of qualities in executive responses, like optimism and cautiousness, deepens our understanding of economic forecasts and financial transparency.”
[MICROSITE: Georgia Tech at NeurIPS 2024]
SubjECTive-QA offers a new means to evaluate financial discourse by characterizing language's subjective and multifaceted nature. This improves on traditional datasets that quantify sentiment or verify claims from financial statements.
The dataset consists of 2,747 Q&A pairs taken from 120 ECTs from companies listed on the New York Stock Exchange from 2007 to 2021. The Georgia Tech researchers annotated each response by hand based on six features for a total of 49,446 annotations.
The group evaluated answers on:
- Relevance: the speaker answered the question with appropriate details.
- Clarity: the speaker was transparent in the answer and the message conveyed.
- Optimism: the speaker answered with a positive outlook regarding future outcomes.
- Specificity: the speaker included sufficient and technical details in their answer.
- Cautiousness: the speaker answered using a conservative, risk-averse approach.
- Assertiveness: the speaker answered with certainty about the company’s events and outcomes.
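The annotation total reported above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the article. The reading that each pair-and-feature combination was labeled three times is an inference from those numbers, not something the article states.

```python
# Figures quoted in the article.
QA_PAIRS = 2_747
FEATURES = 6
TOTAL_ANNOTATIONS = 49_446

# One label for every (Q&A pair, feature) combination.
labels_per_pass = QA_PAIRS * FEATURES

# Number of complete labeling passes the reported total corresponds to.
# A remainder of zero means the total divides evenly.
passes, remainder = divmod(TOTAL_ANNOTATIONS, labels_per_pass)

print(labels_per_pass)    # 16482
print(passes, remainder)  # 3 0
```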
The Georgia Tech group validated their dataset by training eight computer models to detect and score these six features. The test models comprised three BERT-based pre-trained language models (PLMs) and five popular large language models (LLMs), including Llama and ChatGPT.
All eight models scored the highest on the relevance and clarity features. This is attributed to domain-specific pretraining that enables the models to identify pertinent and understandable material.
The PLMs achieved higher scores on clarity, optimism, specificity, and cautiousness. The LLMs scored higher on assertiveness and relevance.
In another experiment to test transferability, a PLM trained with SubjECTive-QA evaluated 65 Q&A pairs from White House press briefings and gaggles. Scores across all six features indicated that models trained on the dataset could succeed in fields outside finance.
“Building on these promising results, the next step for SubjECTive-QA is to enhance customer service technologies, like chatbots,” said Shah, a Ph.D. candidate studying machine learning.
“We want to make these platforms more responsive and accurate by integrating our analysis techniques from SubjECTive-QA.”
SubjECTive-QA is the culmination of two semesters of work through Georgia Tech’s Vertically Integrated Projects (VIP) Program. The VIP Program is an approach to higher education where undergraduate and graduate students work together on long-term project teams led by faculty.
Undergraduate students earn academic credit and receive hands-on experience through VIP projects. The extra help advances ongoing research and gives graduate students mentorship experience.
Computer science major Huzaifa Pardawala and mathematics major Siddhant Sukhani co-led the SubjECTive-QA project with Shah.
Fellow collaborators included Veer Kejriwal, Abhishek Pillai, Rohan Bhasin, Andrew DiBiasio, Tarun Mandapati, and Dhruv Adha. All six researchers are undergraduate students studying computer science.
Sudheer Chava co-advises Shah and is the faculty lead of SubjECTive-QA. Chava is a professor in the Scheller College of Business and director of the M.S. in Quantitative and Computational Finance (QCF) program.
Chava is also an adjunct faculty member in the College of Computing’s School of Computational Science and Engineering (CSE).
“Leading undergraduate students through the VIP Program taught me the powerful impact of balancing freedom with guidance,” Shah said.
“Allowing students to take the helm not only fosters their leadership skills but also enhances my own approach to mentoring, thus creating a mutually enriching educational experience.”
Presenting SubjECTive-QA at NeurIPS 2024 opens the dataset to further use and refinement. NeurIPS is one of three primary international conferences on high-impact research in AI and ML. The conference occurs Dec. 10-15.
The SubjECTive-QA team is among the 162 Georgia Tech researchers presenting over 80 papers at NeurIPS 2024. The Georgia Tech contingent includes 46 faculty members, including Chava. These faculty represent Georgia Tech’s Colleges of Business, Computing, Engineering, and Sciences, underscoring the pertinence of AI research across domains.
“Presenting SubjECTive-QA at prestigious venues like NeurIPS propels our research into the spotlight, drawing the attention of key players in finance and tech,” Shah said.
“The feedback we receive from this community of experts validates our approach and opens new avenues for future innovation, setting the stage for transformative applications in industry and academia.”
News Contact
Bryant Wine, Communications Officer
bryant.wine@cc.gatech.edu
Dec. 03, 2024
A new machine learning (ML) model from Georgia Tech could protect communities from diseases, better manage electricity consumption in cities, and promote business growth, all at the same time.
Researchers from the School of Computational Science and Engineering (CSE) created the Large Pre-Trained Time-Series Model (LPTM) framework. LPTM is a single foundational model that completes forecasting tasks across a broad range of domains.
Along with performing as well as or better than models purpose-built for their applications, LPTM requires 40% less data and 50% less training time than current baselines. In some cases, LPTM can be deployed without any training data.
The key to LPTM is that it is pre-trained on datasets from different industries like healthcare, transportation, and energy. The Georgia Tech group created an adaptive segmentation module to make effective use of these vastly different datasets.
The Georgia Tech researchers will present LPTM in Vancouver, British Columbia, Canada, at the 2024 Conference on Neural Information Processing Systems (NeurIPS 2024). NeurIPS is one of the world’s most prestigious conferences on artificial intelligence (AI) and ML research.
“The foundational model paradigm started with text and image, but people haven’t explored time-series tasks yet because those were considered too diverse across domains,” said B. Aditya Prakash, one of LPTM’s developers.
“Our work is a pioneer in this new area of exploration where only few attempts have been made so far.”
[MICROSITE: Georgia Tech at NeurIPS 2024]
Foundational models are trained with data from different fields, making them powerful tools when assigned tasks. Foundational models drive GPT, DALL-E, and other popular generative AI platforms used today. LPTM differs, though, because it is geared toward time-series data rather than text and image generation.
The Georgia Tech researchers trained LPTM on datasets spanning epidemics, macroeconomics, power consumption, traffic and transportation, stock markets, and human motion and behavior.
After training, the group pitted LPTM against 17 other models, each tasked with making forecasts as close as possible to nine real-world benchmarks. LPTM performed the best on five datasets and placed second on the other four.
The nine benchmarks contained data from real-world collections. These included the spread of influenza in the U.S. and Japan, electricity, traffic, and taxi demand in New York, and financial markets.
The competitor models were purpose-built for their fields. While each model performed well on one or two benchmarks closest to its designed purpose, the models ranked in the middle or bottom on others.
In another experiment, the Georgia Tech group tested LPTM against seven baseline models on the same nine benchmarks in zero-shot forecasting tasks. Zero-shot means the model is used out of the box, without being trained on data from the target task. LPTM outperformed every model across all benchmarks in this trial.
LPTM performed consistently as a top performer on all nine benchmarks, demonstrating the model’s potential to achieve superior forecasting results across multiple applications with less data and fewer resources.
“Our model also goes beyond forecasting and helps accomplish other tasks,” said Prakash, an associate professor in the School of CSE.
“Classification is a useful time-series task that allows us to understand the nature of the time-series and label whether that time-series is something we understand or is new.”
One reason traditional models are custom-built to their purpose is that fields differ in reporting frequency and trends.
For example, epidemic data is often reported weekly and goes through seasonal peaks with occasional outbreaks. Economic data is captured quarterly and typically remains consistent and monotone over time.
LPTM’s adaptive segmentation module allows it to overcome these timing differences across datasets. When LPTM receives a dataset, the module breaks the data into segments of different sizes. Then, it scores all possible ways to segment the data and chooses the segmentation from which useful patterns are easiest to learn.
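The idea of scoring candidate segmentations and keeping the best one can be illustrated with a small sketch. This is not LPTM's module, whose scoring is not described in the article. It is a stand-in that assumes a hypothetical score (how badly a straight line fits each segment), a hypothetical fixed penalty per segment, and hypothetical length limits, and it uses dynamic programming to search every allowed way of cutting the series.

```python
from typing import Callable, List, Sequence, Tuple


def linear_fit_error(segment: Sequence[float]) -> float:
    """Hypothetical score: squared error of a least-squares line (lower = easier)."""
    n = len(segment)
    if n < 3:
        return 0.0  # one or two points always lie on a line
    mean_x = (n - 1) / 2
    mean_y = sum(segment) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(segment))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (slope * x + intercept)) ** 2 for x, y in enumerate(segment))


def best_segmentation(
    series: Sequence[float],
    min_len: int = 2,
    max_len: int = 8,
    penalty: float = 1.0,
    score: Callable[[Sequence[float]], float] = linear_fit_error,
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs covering the series at the lowest total cost."""
    n = len(series)
    inf = float("inf")
    best = [inf] * (n + 1)  # best[i] = lowest cost of segmenting series[:i]
    prev = [-1] * (n + 1)   # prev[i] = start index of the last segment in that solution
    best[0] = 0.0
    for end in range(1, n + 1):
        for length in range(min_len, min(max_len, end) + 1):
            start = end - length
            if best[start] == inf:
                continue
            cost = best[start] + score(series[start:end]) + penalty
            if cost < best[end]:
                best[end], prev[end] = cost, start
    if best[n] == inf:
        raise ValueError("series cannot be covered with the given segment lengths")
    segments = []
    end = n
    while end > 0:  # walk back through the recorded cut points
        segments.append((prev[end], end))
        end = prev[end]
    return segments[::-1]


# A rising ramp followed by a falling ramp splits at the turning point.
series = [0, 1, 2, 3, 4, 4, 3, 2, 1, 0]
print(best_segmentation(series))  # [(0, 5), (5, 10)]
```

The per-segment penalty keeps the search from cutting the series into many tiny pieces, each of which would be trivially easy to fit.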
LPTM’s performance, enhanced through the innovation of adaptive segmentation, earned the model acceptance to NeurIPS 2024 for presentation. NeurIPS is one of three primary international conferences on high-impact research in AI and ML. NeurIPS 2024 occurs Dec. 10-15.
Ph.D. student Harshavardhan Kamarthi partnered with Prakash, his advisor, on LPTM. The duo are among the 162 Georgia Tech researchers presenting over 80 papers at the conference.
Prakash is one of 46 Georgia Tech faculty with research accepted at NeurIPS 2024. Nine School of CSE faculty members, nearly one-third of the school’s faculty, are authors or co-authors of 17 papers accepted at the conference.
Along with sharing their research at NeurIPS 2024, Prakash and Kamarthi released an open-source library of foundational time-series modules that data scientists can use in their applications.
“Given the interest in AI from all walks of life, including business, social, and research and development sectors, a lot of work has been done and thousands of strong papers are submitted to the main AI conferences,” Prakash said.
“Acceptance of our paper speaks to the quality of the work and its potential to advance foundational methodology, and we hope to share that with a larger audience.”
News Contact
Bryant Wine, Communications Officer
bryant.wine@cc.gatech.edu
Nov. 25, 2024
Cybersecurity researchers have discovered new vulnerabilities that could provide criminals with wireless access to the computer systems in automobiles, aircraft, factories, and other cyber-physical systems.
The computers used in vehicles and other cyber-physical systems rely on a specialized internal network to communicate commands between electronics. Because this communication takes place internally, it was traditionally assumed that attackers could influence the network only through physical access.
In collaboration with Hyundai, researchers from Georgia Tech’s Cyber-Physical Systems Security Research Lab (CPSec) observed that threat models used to evaluate the security of these technologies were outdated.
The team, led by Ph.D. student Zhaozhou Tang, found that vehicle technology advancements allowed attackers to launch new attacks, improve existing attacks, and circumvent current defense systems.
For example, Tang’s findings included the possibility for attackers to remotely compromise the computers used in cars and aircraft through Wi-Fi, cellular, Bluetooth, and other wireless channels.
“Our job was to thoroughly review existing information and find ways to protect against these attacks,” he said. “We found new threats and proposed a defense system that can protect against the new and old attacks.”
In response to their findings, the team developed ERACAN, the first comprehensive defense system against this new generation of attackers. Designed to detect new and old attacks, ERACAN can deploy defenses when necessary.
The system also classifies the attacks it reacts to, providing security experts with the tools for detailed analysis. It has a detection rate of 100% for all attacks launched by conventional methods and detects attacks launched under enhanced threat models 99.7% of the time.
The project received a distinguished paper award at the 2024 ACM Conference on Computer and Communications Security (CCS 24) held in Salt Lake City. Tang presented the paper at the October conference.
“This was Zhaozhou’s first paper in his Ph.D. program, and he deserves recognition for his groundbreaking work on automotive cybersecurity,” said Saman Zonouz, associate professor in the School of Cybersecurity and Privacy and the School of Electrical and Computer Engineering.
The U.S. Department of Homeland Security has designated the transportation sector as one of the nation’s 16 critical infrastructure sectors. Ensuring its security is vital to national security and public safety.
“Modern vehicles, which rely heavily on controller area networks for essential operations, are integral components of this infrastructure,” said Zonouz. “With the increasing sophistication of cyberthreats, safeguarding these systems has become critical to ensuring the resilience and security of transportation networks.”
This paper introduced to the scientific community the first comprehensive defense system to address advanced threats targeting vehicular controller area networks.
The CPSec team is putting the technology it has developed into practice in collaboration with Hyundai America Technical Center, Inc., which sponsors the work. Tang hopes ERACAN’s success will raise awareness of these new threats in the research community and industry.
“It will help them build future defenses,” he said. “We have demonstrated the best practice to defend against these attacks.”
Tang received his bachelor’s degree at Georgia Tech, where he first performed security-related work for the automobile industry. While working with Zonouz on his master’s degree, he decided to change course and pursue research initiatives like vehicle security in a Ph.D. program.
“It is interesting how it came full circle,” he said. “I will continue on this path of automobile security throughout my Ph.D.”
The paper, “ERACAN: Defending Against an Emerging CAN Threat Model,” was written by Zhaozhou Tang, Khaled Serag from the Qatar Computing Research Institute, Saman Zonouz, Berkay Celik and Dongyan Xu from Purdue University, and Raheem Beyah, professor and dean of the College of Engineering. The CPSec Lab is a collaboration between the School of Cybersecurity and Privacy and the School of Electrical and Computer Engineering.
News Contact
John Popham
Communications Officer II
School of Cybersecurity and Privacy
Dec. 01, 2024
On Nov. 12, CREATE-X hosted a panel discussion featuring Y Combinator (YC) partner Brad Flora and Georgia Tech and Startup Launch alumni. In addition to sharing experiences, panelists offered practical advice and feedback for aspiring entrepreneurs, and attendees enjoyed the opportunity to network.
Y Combinator, which has produced companies like Twitch, Reddit, Airbnb, and Coinbase, has funded over 143 Georgia Tech alumni, surpassing institutions like the University of Michigan, Duke, and Princeton. YC recruits startups four times a year and provides a $500,000 investment.
Spotlight on Founders
Flora, the event's keynote speaker, shared his journey from a YC founder to a partner, emphasizing the accelerator's commitment to supporting college-age founders. He also spoke about finding ideas, meeting co-founders, knowing when to persist and when to pivot, and more.
“A lot of people think you have to have a great startup idea before you start working on a startup,” Flora said. “The theme you find again and again for the best YC founders is that they were doing something that was interesting to them.”
Flora encouraged students to explore their interests and identify problems they are passionate about solving. He also spoke about “tar pit ideas,” or ideas that seem interesting and novel but don’t translate to a wider audience and wouldn’t be widely used. He advised them to focus on ideas with clear, demonstrable demand.
“The best way to avoid tar pit ideas is to get feedback from your users and find out if they’re actually using them,” Flora said.
Georgia Tech alumni and Greptile founders SooHoon Choi and Vaishant Kameswaran talked about the origins of their company. Choi and Daksh Gupta, their other co-founder, participated in CREATE-X Capstone and then in CREATE-X Startup Launch to develop Tabnam, which initially was an AI shopping assistant that scraped the internet to tell users what people think about their product.
The founders discussed their path from starting Tabnam in a course and moving across the country to work on it in their apartment to getting rejected by YC, pivoting the startup at a hackathon, and developing Greptile. This AI product enables large software teams to review code changes before merging, find issues in their code, understand the source of bugs, and perform other related tasks. That iteration proved successful, gaining millions in funding and hundreds of customers.
Gupta spoke about a framework that kept the co-founders open to pivots. “Startups aren’t small companies. They’re a hypothesis that asks if a company should exist in this space. That means your job is to prove or disprove that hypothesis,” he said.
For more insights, watch the video of the event.
Opportunities for Entrepreneurs
Students, faculty, researchers, and alumni interested in developing their own startups are encouraged to apply to CREATE-X's Startup Launch. The program provides $5,000 in optional seed funding, $150,000 in in-kind services, mentorship, entrepreneurial workshops, networking events, and resources to help build and scale startups. The program culminates in Demo Day, where teams present their startups to potential investors. The deadline to apply for Startup Launch is March 19, 2025. Spots are limited. Apply now for a higher chance of acceptance and early feedback.
News Contact
Breanna Durham
Marketing Strategist
Nov. 22, 2024
The Department of Commerce has granted the Semiconductor Research Corporation (SRC), its partners, and Georgia Institute of Technology $285 million to establish and operate the 18th Manufacturing USA Institute. The Semiconductor Manufacturing and Advanced Research with Twins (SMART USA) will focus on using digital twins to accelerate the development and deployment of microelectronics. SMART USA, with more than 150 expected partner entities representing industry, academia, and the full spectrum of supply chain design and manufacturing, will span more than 30 states and have combined funding totaling $1 billion.
This is the first-of-its-kind CHIPS Manufacturing USA Institute.
“Georgia Tech’s role in the SMART USA Institute amplifies our trailblazing chip and advanced packaging research and leverages the strengths of our interdisciplinary research institutes,” said Tim Lieuwen, interim executive vice president for Research. “We believe innovation thrives where disciplines and sectors intersect. And the SMART USA Institute will help us ensure that the benefits of our semiconductor and advanced packaging discoveries extend beyond our labs, positively impacting the economy and quality of life in Georgia and across the United States.”
The 3D Systems Packaging Research Center (PRC), directed by School of Electrical and Computer Engineering Dan Fielder Professor Muhannad Bakir, played an integral role in developing the winning proposal. Georgia Tech will be designated as the Digital Innovation Semiconductor Center (DISC) for the Southeastern U.S.
“We are honored to collaborate with SRC and their team on this new Manufacturing USA Institute. Our partnership with SRC spans more than two decades, and we are thrilled to continue this collaboration by leveraging the Institute’s wide range of semiconductor and advanced packaging expertise,” said Bakir.
Through the Institute for Matter and Systems’ core facilities, housed in the Marcus Nanotechnology Building, DISC will accelerate semiconductor and advanced packaging development.
“The awarding of the Digital Twin Manufacturing USA Institute is a culmination of more than three years of work with the Semiconductor Research Corporation and other valued team members who share a similar vision of advancing U.S. leadership in semiconductors and advanced packaging,” said George White, senior director for strategic partnerships at Georgia Tech.
“As a founding member of the SMART USA Institute, Georgia Tech values this long-standing partnership. Its industry and academic partners, including the HBCU CHIPS Network, stand ready to make significant contributions to realize the goals and objectives of the SMART USA Institute,” White added.
Georgia Tech also plans to capitalize on the supply chain and optimization strengths of the No. 1-ranked H. Milton Stewart School of Industrial and Systems Engineering (ISyE). ISyE experts will help develop supply-chain digital twins to optimize and streamline manufacturing and operational efficiencies.
David Henshall, SRC vice president of Business Development, said, “The SMART USA Institute will advance American digital twin technology and apply it to the full semiconductor supply chain, enabling rapid process optimization, predictive maintenance, and agile responses to chips supply chain disruptions. These efforts will strengthen U.S. global competitiveness, ensuring our country reaps the rewards of American innovation at scale.”
News Contact
Amelia Neumeister | Research Communications Program Manager
Nov. 22, 2024
Georgia Tech is days away from the Fall 2024 Idea to Prototype (I2P) Showcase, set to take place on Dec. 3 at 5 p.m. in the Exhibition Hall. This event offers students a platform to present solutions built over the semester to tackle real-world problems and compete for rewards, including a golden ticket into the CREATE-X summer startup accelerator, Startup Launch. The program offers optional seed funding, workspace, entrepreneurial education, and continued mentorship to help students turn their prototypes into viable startups. Over 50 teams will present their prototypes at the showcase.
The event is open to all Georgia Tech students, faculty, staff, and the local community. Tickets are available now but are limited, so register for the I2P Showcase today.
Each semester, students in the Idea-to-Prototype course take time out of their schedules, similar to undergraduate research, to build prototypes. Teams accepted into I2P receive a reimbursement of up to $500 for physical expenses, course credit (undergraduate students only), and mentorship from a Georgia Tech faculty member.
During the showcase, participants and judges interact with the projects and give feedback. The criteria for judging are centered on innovation and overall market and impact potential. Judges can include industry professionals, faculty members, and alumni.
Throughout I2P Showcase history, many winning projects have gone on to achieve significant success. One is CaseDocker, which provides an end-to-end workflow management system. The startup now has a user base of over 400 global clients, including Fortune 500 companies. Other winners of the showcase include Radiochain, a blockchain-based music application; Dolfin Solutions, a personal financial management platform; and NeuroChamp, an EEG monitoring device for pediatric seizure detection.
This semester, the I2P cohort includes a digital twin using individual data and AI for health screenings and early detection, an active shooter detection and tracking tool, an AR tool that turns walls into interactive canvases, a device that detects overdoses, 3D-printed circuit boards, an AI detector for digital media, and more.
Whether you're a student with a passion for entrepreneurship, a faculty member interested in the latest student innovations, or a community member looking to support local talent, the I2P Showcase is a perfect opportunity to explore student innovations, mingle, and enjoy refreshments. Register for the I2P Showcase today and join us at the Exhibition Hall for an evening of creativity and community.
Students interested in participating in I2P can do so in the spring, summer, or fall semesters. The registration process involves providing a brief description of the project, the team members involved, and the current stage of development. The deadline for applications is Jan. 6 for Spring 2025 and May 12 for Summer 2025.
News Contact
Breanna Durham
Marketing Strategist
Nov. 21, 2024
A multi-institutional team of researchers, led by Georgia Tech’s Francesca Storici, has discovered a previously unknown role for RNA. Their insights could lead to improved treatments for diseases like cancer and neurodegenerative disorders while changing our understanding of genetic health and evolution.
RNA molecules are best known as protein production messengers. They carry genetic instructions from DNA to ribosomes — the factories inside cells that turn amino acids into the proteins necessary for many cell functions. But Storici’s team found that RNA can also help cells repair a severe form of DNA damage called a double-strand break, or DSB.
A DSB means both strands of the DNA helix have been severed. Cells have the tools to make some repairs, but a DSB is significant damage — and if not properly fixed can lead to mutations, cell death, or cancer. (Interestingly, cancer treatments, like chemotherapy and radiation, can cause DSBs.)
Storici, a professor in the School of Biological Sciences, has dedicated her research to studying the molecules and mechanisms underlying the repair of damaged DNA. Ten years ago, she and collaborators discovered that RNA could serve as a template for DSB repair.
“Now we’ve learned that RNA can directly promote DSB repair mechanisms,” said Storici, whose lab teamed with mathematics experts in the lab of Nataša Jonoska from the University of South Florida. They’re all part of the Southeast Center for Mathematics and Biology based at Georgia Tech. They explain their discovery in the journal Nature Communications.
“These findings open up a new understanding of RNA's potential role in maintaining genome integrity and driving evolutionary changes,” added Storici.
The researchers used variation-distance graphs to visualize millions of DSB repair events, offering a comprehensive snapshot of sequence variations. The graphs highlighted major differences in repair patterns, depending on the DSB position.
This mathematical approach also uncovered significant differences in repair efficiency, pointing to RNA's potential in modulating DSB repair outcomes.
“These findings underscore the critical role of mathematical visualization in understanding complex biological mechanisms and could pave the way for targeted interventions in genome stability and therapeutic research,” said Jonoska.
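The article does not define variation-distance graphs, so the sketch below is only one plausible reading of the name and should not be taken as the team's method. It assumes each distinct repaired sequence becomes a node, that a node's "distance" is its edit distance from an unbroken reference sequence, and that two nodes are linked when they differ by a single substitution, insertion, or deletion. The reference and read sequences are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting substitutions, insertions, and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def build_graph(reference: str, reads: list):
    """Return (nodes, edges) for the assumed graph construction."""
    counts = Counter(reads)
    nodes = {
        seq: {"count": n, "distance": levenshtein(reference, seq)}
        for seq, n in counts.items()
    }
    # Link every pair of distinct sequences that are one edit apart.
    edges = [(a, b) for a, b in combinations(sorted(nodes), 2)
             if levenshtein(a, b) == 1]
    return nodes, edges


# Hypothetical reference and repair outcomes, for illustration only.
reference = "ACGTAC"
reads = ["ACGTAC", "ACGTAC", "ACGAC", "ACGGAC", "ACGGGAC"]
nodes, edges = build_graph(reference, reads)

for seq, info in sorted(nodes.items(), key=lambda kv: kv[1]["distance"]):
    print(seq, info)
print(len(edges), "edges")
```

Comparing every pair of sequences is only practical for small examples. A dataset with millions of repair events would need a faster way to find neighboring sequences.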
Molecular Grunt Work
When a DSB happens in DNA, it’s like a load-bearing beam in a building breaking. A careful, precise repair is needed to ensure the building’s — or the DNA’s — stability. The pieces must be rejoined accurately to prevent further damage or mutation. Repairing a damaged building requires having a reliable foreman on the job site. A DSB requires something very similar.
“A key mechanism we identified is that RNA can help position and hold the broken DNA ends in place, facilitating the repair process,” explained Storici, whose team conducted the research in both human and yeast cells.
Specifically, they found that RNA molecules and the broken section of DNA can match up like puzzle pieces. When RNA has this kind of complementarity with the DNA break site, it acts as a scaffold, or a guide, beyond its traditional coding function, showing the cellular machinery where to make repairs. Over millennia, cells have evolved complex mechanisms to fix DSBs, each functioning like a different tool from the same toolbox.
Storici’s team showed that RNA can influence which tools are used, depending on its complementarity to the broken DNA strands. This means that in addition to being the important protein production messenger, RNA acts as both a foreman and laborer when it comes to DNA repair.
A deeper understanding of RNA’s role in DNA repair could lead to new strategies for strengthening repair mechanisms in healthy cells, potentially reducing the harmful effects of treatments like chemotherapy and radiation.
“RNA has a much broader function than we knew,” Storici said. “We still have a lot of research to do into these mechanisms, but this work opens up new ways for exploring how RNA could be harnessed in healthcare, potentially leading to new treatments for cancer and other genetic diseases.”
As Storici and other researchers continue probing RNA’s effects in DNA repair, their revelations could have a lasting impact on human health and evolution. That means better gene therapies, new cancer treatments and anti-aging strategies — and also the ability to influence how organisms adapt and evolve.
CITATION: Youngkyu Jeon, Yilin Lu, Margherita Maria Ferrari, Tejasvi Channagiri, Penghao Xu, Chance Meers, Yiqi Zhang, Sathya Balachander, Vivian S. Park, Stefania Marsili, Zachary F. Pursell, Nataša Jonoska, Francesca Storici. “RNA-mediated double-strand break repair by end-joining mechanisms.” Nature Communications https://doi.org/10.1038/s41467-024-51457-9
FUNDING: NIH grants GM115927, ES028271; NSF grant MCB-1615335; Howard Hughes Medical Institute Faculty Scholar grant 55108574; Southeast Center for Mathematics and Biology NSF DMS-1764406; Simons Foundation grant 59459; NSF grants CCF-2107267 and DMS-2054321.
Nov. 21, 2024
Deven Desai and Mark Riedl have seen the signs for a while.
In the two years since OpenAI introduced ChatGPT, dozens of lawsuits have been filed alleging technology companies have infringed copyright by using published works to train artificial intelligence (AI) models.
Academic AI research efforts could be significantly hindered if courts rule in the plaintiffs' favor.
Desai and Riedl are Georgia Tech researchers raising awareness about how these court rulings could force academic researchers to construct new AI models with limited training data. The two collaborated on a benchmark academic paper that examines the landscape of the ethical issues surrounding AI and copyright in industry and academic spaces.
“There are scenarios where courts may overreact to having a book corpus on your computer, and you didn’t pay for it,” Riedl said. “If you trained a model for an academic paper, as my students often do, that’s not a problem right now. The courts could deem training is not fair use. That would have huge implications for academia.
“We want academics to be free to do their research without fear of repercussions in the marketplace because they’re not competing in the marketplace,” Riedl said.
Desai is the Sue and John Stanton Professor of Business Law and Ethics at the Scheller College of Business. He researches how business interests and new technology shape privacy, intellectual property, and competition law. Riedl is a professor at the College of Computing’s School of Interactive Computing, researching human-centered AI, generative AI, explainable AI, and gaming AI.
Their paper, Between Copyright and Computer Science: The Law and Ethics of Generative AI, was published in the Northwestern Journal of Technology and Intellectual Property on Monday.
Desai and Riedl say they want to offer solutions that balance the interests of various stakeholders. But that requires compromise from all sides.
Researchers should accept they may have to pay for the data they use to train AI models. Content creators, on the other hand, should receive compensation, but they may need to accept less money to ensure data remains affordable for academic researchers to acquire.
Who Benefits?
The doctrine of fair use is at the center of every copyright debate. According to the U.S. Copyright Office, fair use permits the unlicensed use of copyright-protected works in certain circumstances, such as distributing information for the public good, including teaching and research.
Fair use is often challenged when one or more parties profit from published works without compensating the authors.
Any original published content, including a personal website on the internet, is protected by copyright. However, copyrighted material is republished on websites or posted on social media innumerable times every day without the consent of the original authors.
In most cases, it’s unlikely copyright violators gained financially from their infringement.
But Desai said business-to-business cases are different. The New York Times is one of many daily newspapers and media companies that have sued OpenAI for using their content as training data. Microsoft is also a defendant in The New York Times’ suit because it invested billions of dollars into OpenAI’s development of AI tools like ChatGPT.
“You can take a copyrighted photo and put it in your Twitter post or whatever you want,” Desai said. “That’s probably annoying to the owner. Economically, they probably wanted to be paid. But that’s not business to business. What’s happening with OpenAI and The New York Times is business to business. That’s big money.”
OpenAI started as a nonprofit dedicated to the safe development of artificial general intelligence (AGI) — AI that, in theory, can rival human thinking and possess autonomy.
These AI models would require massive amounts of data and expensive supercomputers to process that data. OpenAI could not raise enough money to afford such resources, so it created a for-profit arm controlled by its parent nonprofit.
Desai, Riedl, and many others argue that OpenAI ceased its research mission for the public good and began developing consumer products.
“If you’re doing basic research that you’re not releasing to the world, it doesn’t matter if every so often it plagiarizes The New York Times,” Riedl said. “No one is economically benefitting from that. When they became a for-profit and produced a product, now they were making money from plagiarized text.”
OpenAI’s for-profit arm is valued at $80 billion, but content creators have not received a dime even though the company has scraped massive amounts of copyrighted material as training data.
The New York Times has posted warnings on its sites that its content cannot be used to train AI models. Many other websites offer a robots.txt file that contains instructions for bots about which pages can and cannot be accessed.
Neither of these measures is legally binding, and both are often ignored.
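As a rough illustration of how such a file works, the sketch below uses Python's standard urllib.robotparser module to read a robots.txt file and check whether a given bot is asked to stay away from a page. The file contents, the bot names (ExampleAIBot, OtherBot), and the example.com URLs are hypothetical and invented for this example; they do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer access questions."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The hypothetical AI crawler is asked to stay out of the whole site.
print(parser.can_fetch("ExampleAIBot", "https://example.com/articles/1"))

# Any other bot may read articles but not the /private/ section.
print(parser.can_fetch("OtherBot", "https://example.com/articles/1"))
print(parser.can_fetch("OtherBot", "https://example.com/private/notes"))
```

The parser only reports what the site owner has requested. Whether a crawler actually performs this check, and honors the answer, is up to the crawler's operator.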
Solutions
Desai and Riedl offer a few options for companies to show good faith in rectifying the situation.
- Spend the money. Desai says OpenAI and Microsoft could have afforded their training data and avoided the hassle of legal consequences.
“If you do the math on the costs to buy the books and copy them, they could have paid for them,” he said. “It would’ve been a multi-million dollar investment, but they’re a multi-billion dollar company.”
- Be selective. Models can be trained on randomly selected texts from published works, allowing the model to understand the writing style without plagiarizing.
“I don’t need the entire text of War and Peace,” Desai said. “To capture the way authors express themselves, I might only need a hundred pages. I’ve also reduced the chance that my model will cough up entire texts.”
- Leverage libraries. The authors agree libraries could serve as an ideal middle ground as a place to store published works and compensate authors for access to those works, though the amount may be less than desired.
“Most of the objections you could raise are taken care of,” Desai said. “They are legitimate access copies that are secure. You get access to only as much as you need. Libraries at universities have already become schools of information.”
Desai and Riedl hope the legal action taken by publications like The New York Times will send a message to companies that develop AI tools to pump the brakes. If they don’t, researchers uninterested in profit could pay the steepest price.
The authors say it’s not a new problem but is reaching a boiling point.
“In the history of copyright, there are ways that society has dealt with the problem of compensating creators and technology that copies or reduces your ability to extract money from your creation,” Desai said. “We wanted to point out there’s a way to get there.”
News Contact
Nathan Deen
Communications Officer
School of Interactive Computing
Nov. 21, 2024
A multi-institutional research initiative aims to address lymphoma survival disparities in African American and EBV-infected patients.
A new interdisciplinary initiative with researchers at Georgia Tech, Emory University, MD Anderson Cancer Center, and Weill Cornell Medicine aims to address the knowledge gap in lymphomas, particularly diffuse large B-cell lymphoma (DLBCL), the most common form of blood cancer. Survival rates for DLBCL are lower among African American patients and those with Epstein-Barr virus (EBV), which is prevalent in Latin America. The team uses immunoengineering tools to facilitate this discovery.
Tackling Health Disparities in Lymphoma Treatment
To address these health disparities, the team combines expertise in cancer biology and immunoengineering. At Georgia Tech, Ankur Singh works with oncologists and cancer biologists from partner institutions to create innovative cancer technologies, such as lab-grown, lymph node-mimicking models of DLBCL tumors. Singh is the Carl Ring Family Professor in the George W. Woodruff School of Mechanical Engineering and the Wallace H. Coulter Department of Biomedical Engineering (BME) and directs the Center for Immunoengineering. These models will mimic the tumor environments in lymphoma from African American patients and model specific mutations prevalent in these patients. Researchers will observe how various genetic changes work in concert with the immune system to impact a tumor's response to treatments.
“We want to understand the full makeup of these tumors; not just the cancer cells but the surrounding supportive cells and proteins,” said Singh, who serves as co-investigator for LLS SCOR. “This study will help us pinpoint which parts of the tumor are critical for its survival and how we can disrupt those mechanisms, including the immune cells.”
Challenges for Understanding Tumor Biology in High-Risk Groups
Diffuse large B-cell lymphoma is the most common form of blood cancer. While many patients respond well to standard therapies, a significant portion, including a disproportionate number of African Americans and individuals with EBV-related conditions, experience poorer outcomes. The reasons behind these disparities are still largely unknown. Current barriers include a lack of diverse representation in research studies and a paucity of engineered technologies dedicated to understanding cancers in patients from underrepresented backgrounds.
"Most lymphoma studies don't include nearly enough African American or Hispanic patients," said Jean Koff, lead investigator and associate professor of Hematology and Medical Oncology at Emory University’s Winship Cancer Institute. “This means we are likely missing key insights into the unique biology and treatment needs of these populations.”
A Collaboration Focused on Advancing Lymphoma Research and Care
This new initiative, funded by The Leukemia & Lymphoma Society's Specialized Center of Research (SCOR) Program, will analyze a comprehensive collection of DLBCL tumor samples that includes many cases from Black and Hispanic patients. By examining genetic differences and tumor structures, the researchers hope to identify the factors most important for improving therapy for these groups.
“This program is groundbreaking because it addresses both biological and structural barriers in treatment, leveraging the latest bioengineered technologies,” Singh noted. “We’re looking at factors that have been overlooked for too long in cancer research, especially in high-risk communities.”
To explore the composition and diversity of cells within tumors of African American patients and better understand how they grow and respond to treatments, the team leverages the expertise of Ahmet Coskun. Coskun is a Georgia Tech immunoengineer known for his innovative approaches to understanding the immune response to cancer. An assistant professor in BME, Coskun holds the Bernie Marcus Early Career Professorship. He and his team use advanced imaging techniques and engineering principles to analyze tumor microenvironments in unprecedented detail. By examining how different immune cells interact with cancer cells, they hope to uncover the complexities of tumor biology and identify factors that contribute to treatment resistance.
This five-year, multi-million-dollar LLS SCOR award is the culmination of years of collaboration among leading researchers in the field of lymphoma. Singh and colleagues Koff, Coskun, Christopher Flowers at MD Anderson Cancer Center, and Weill Cornell Medicine’s Ari Melnick, Ethel Cesarman, and Leandro Cerchietti are fostering a partnership in lymphomas and EBV-related cancers, which is instrumental in advancing research on lymphoma treatment health disparities. Their longstanding partnership reflects a commitment to addressing the complex challenges different populations face when battling deadly cancers.
"With this unique partnership, leveraging new cancer technologies, biology, and clinical expertise, we hope to make breakthroughs in lymphoma research and begin to address health disparities in lymphoma at multiscale levels,” said Melnick, a co-lead for LLS SCOR and Gebroe Family Professor of Hematology and Oncology at New York’s Weill Cornell Medicine.
The group also played a significant role in organizing, moderating, and presenting at the inaugural conference “Health Disparities in Hematologic Malignancies: From Genes to Outreach,” held in May 2023 in New York. The conference served as a vital platform for discussing the latest research, sharing best practices, and highlighting the importance of outreach initiatives aimed at improving care for underserved populations.
"The research will provide a unique window into the intricate structure of lymphomas and how these complexities influence treatment,” said Flowers, a physician-scientist and division head of Cancer Medicine at MD Anderson Cancer Center in Houston, Texas. “By studying lymphoma microenvironments in patient tissues and organoids, we can begin addressing health disparities in lymphoma, identifying why certain populations may respond differently to therapies. No other technology currently provides this level of insight or potential for tailored patient care."
This unique research collaboration is crucial, as understanding tumor heterogeneity can inform the development of more personalized treatment strategies, particularly for underserved communities that often face disparities in cancer care. By integrating engineering with oncology, the team hopes to create more effective therapies tailored to individual patient profiles, ultimately aiming to improve outcomes for all lymphoma patients. This multi-site collaboration aims to fast-track the development of therapies against lymphomas in African Americans and individuals with EBV-related conditions and eventually bring them to clinical trials.
Project Title: Translating molecular profiles into treatment approaches to target disparities in lymphoma
(Funding and award period: $5 million, October 1, 2024 - September 30, 2029)
News Contact
By: Savannah Williamson