Georgia Tech Researchers Develop New Tool to Preserve Crash Report Privacy
When a user opts to send a crash report following a program failure, the report could share personal information including usernames, passwords, and other confidential details.
This is why Georgia Tech researchers created a new tool called Desensitization that generates crash reports that preserve the original error, whether a bug or an attack, without exposing private information. The resulting reports are also less than half the size of the originals and take less than 15 seconds to process, significantly saving resources for both users and developers.
“Crash dumps can have a severe outcome, including sensitive information leaking from end users, bad publicity, and financial liability for developers if a data breach happens,” said School of Computer Science Ph.D. student Ren Ding. “We came up with the idea to desensitize crashes while keeping necessary attack payloads before sharing it to the developers.”
Desensitization removes more than 80 percent of potentially sensitive data from Linux crash reports and nearly 50 percent from Windows crash reports.
The problem with crash reports
Crash reports usually include a coredump file that contains the central processing unit context and memory of the program, or the program inputs that made it crash. Both can include sensitive data, from session tokens to personal contact information.
Previous techniques to remove this type of data weren’t effective. Relying on developers to hand-annotate the data is time-consuming and error-prone. Another method that uses a pattern-based search to identify private data, such as email addresses, doesn’t work on program-specific data. The biggest issue with both methods is how much computation they require.
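To illustrate the limitation of pattern-based search, here is a minimal Python sketch. It is not taken from any existing tool: the regular expression, the function name redact_emails, and the sample bytes are all assumptions made for illustration.

```python
import re

# Simplified email pattern for illustration; real address grammars are more complex.
EMAIL = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(dump: bytes) -> bytes:
    """Overwrite every email-shaped byte string with asterisks of equal length."""
    return EMAIL.sub(lambda m: b"*" * len(m.group()), dump)


# Hypothetical fragment of crash memory: one email and one program-specific token.
sample = b"user=alice@example.com;session=9f8e7d6c5b4a"
redacted = redact_emails(sample)
```

In this sketch the email address is masked, but the session token passes through untouched because it matches no known pattern. That is the kind of program-specific data a generic search cannot anticipate.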
How Desensitization works
In contrast, Desensitization first runs on the user side to decouple general information and crash evidence from personal data. The order is strategic, according to Ding. End users may not have as much computation power as developers, so debugging and analysis aren’t efficient at the source.
It’s imperative to minimize necessary resources such as time and memory, making the program “lightweight.” The goal, said Ding, is to extract enough general information so developers can conduct the crash diagnostics on their end. Once the tool extracts the attack information on the user’s machine, that information is sent to the server for more thorough analysis.
The researchers’ method is bug and attack oriented to ensure the tool only focuses on relevant data. With this in mind, they designed one general and four lightweight techniques.
The general technique scans memory to identify all pointers, programming language objects that store memory addresses and are where most attacks reside. The four other techniques either identify specific information or customize existing techniques for prominent bugs and attacks.
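The pointer scan described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the researchers’ implementation: it assumes a 64-bit little-endian memory image, a known list of mapped address ranges, and a policy of zeroing every aligned word that does not look like a pointer. The name scrub_non_pointers and its parameters are hypothetical.

```python
import struct
from typing import Sequence, Tuple


def scrub_non_pointers(memory: bytes, regions: Sequence[Tuple[int, int]]) -> bytes:
    """Keep aligned 8-byte words that point into a mapped region; zero the rest.

    `regions` is an assumed list of (start, end) address ranges, end exclusive.
    """
    out = bytearray(len(memory))  # starts zero-filled, so non-pointers stay zero
    for off in range(0, len(memory) - 7, 8):
        (value,) = struct.unpack_from("<Q", memory, off)
        # Treat the word as a pointer only if it lands inside a mapped range.
        if any(start <= value < end for start, end in regions):
            out[off:off + 8] = memory[off:off + 8]
    return bytes(out)
```

Given a memory image in which one word holds an address inside a mapped range and the next holds a password string, this sketch would keep the first word and blank the second. A real tool would also need to handle false positives, since ordinary data can coincidentally look like a valid address.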
“Our security model relies on the security guarantee of previous detection methods,” Ding said. “That’s why we try to make the current framework customizable by adopting the module design so that it can support more in the future.”
The modular design of Desensitization makes the tool easy to extend with future techniques, and it works with existing crash report analysis formats.
The solution is available to developers via GitHub.
Ding will present the research at the Network and Distributed System Security Symposium (NDSS) in San Diego running from Feb. 23 through 26. He co-wrote the paper, Desensitization: Privacy-Aware and Attack-Preserving Crash Reports, with fellow SCS Ph.D. students Hong Hu and Wen Xu and Associate Professor Taesoo Kim.