Watershed SHA1 collision just broke the WebKit repository, others may follow

"Please exercise care" with colliding PDFs, researchers advise software developers.

Thursday's watershed attack on the widely used SHA1 hashing function has claimed its first casualty: the version control system used by the WebKit browser engine, which became completely corrupted after someone uploaded two proof-of-concept PDF files that have identical message digests.

The bug resides in Apache Subversion, an open source version control system that WebKit and other large software development organizations use to keep track of code submitted by individual members. Often abbreviated as SVN, Subversion uses SHA1 to track and merge duplicate files. For reasons that aren't yet fully understood, SVN systems can experience a severe glitch when they encounter the two PDF files, which were published Thursday to prove that real-world collisions on SHA1 are now practical.

On Friday morning, the researchers updated their informational website to add the frequently asked question "Is SVN affected?" The answer:

Yes - please exercise care, as SHA-1 colliding files are currently breaking SVN repositories. Subversion servers use SHA-1 for deduplication and repositories become corrupted when two colliding files are committed to the repository. This has been discovered in WebKit's Subversion repository and independently confirmed by us. Due to the corruption the Subversion server will not accept further commits.
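To illustrate why deduplication keyed on a hash assumes that collisions never happen, here is a minimal Python sketch. It is not Subversion's code: `DedupStore`, `weak_digest`, and `find_collision` are hypothetical names, and because producing a real SHA1 collision is far too expensive for an example, the sketch swaps in a deliberately weak digest (the first two hex characters of SHA1) so a collision can be found by brute force.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store that keeps one blob per digest (illustrative only)."""

    def __init__(self, digest):
        self.digest = digest  # function mapping bytes -> str
        self.blobs = {}       # digest -> content of the first file seen with it
        self.paths = {}       # path -> digest

    def commit(self, path, content):
        key = self.digest(content)
        # Deduplication step: a digest that is already present is assumed
        # to mean "identical content", so the new bytes are not stored.
        self.blobs.setdefault(key, content)
        self.paths[path] = key

    def checkout(self, path):
        return self.blobs[self.paths[path]]


def weak_digest(data):
    """Stand-in for a broken hash: only the first 2 hex chars of SHA1."""
    return hashlib.sha1(data).hexdigest()[:2]


def full_digest(data):
    return hashlib.sha1(data).hexdigest()


def find_collision(digest):
    """Brute-force two different inputs that share a digest."""
    seen = {}
    i = 0
    while True:
        data = str(i).encode()
        key = digest(data)
        if key in seen:
            return seen[key], data
        seen[key] = data
        i += 1


a, b = find_collision(weak_digest)

store = DedupStore(weak_digest)
store.commit("first.pdf", a)
store.commit("second.pdf", b)
# The store now hands back the first file's bytes for the second path.
silently_wrong = store.checkout("second.pdf") == a

safe = DedupStore(full_digest)
safe.commit("first.pdf", a)
safe.commit("second.pdf", b)
```

In this toy model the second file is silently replaced by the first. The failure the researchers describe in SVN is different in detail (the repository becomes corrupted and the server rejects further commits), but the underlying assumption being violated is the same.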

Researchers connected to the world's first known collision attack on SHA1 spent much of Friday monitoring the effects of the PDFs on SVN. This group included lead researcher Marc Stevens and Google Research Scientist Luca Invernizzi, who worked on Google's cloud implementation of an online tool that detects if files are part of a collision attack. In an e-mail, Stevens wrote, "Luca from Google has independently verified this corruption and its effect. WebKit is also dealing with this. As far as I know, it is yet unknown how to fix such a corrupted SVN repo. So we should warn people about this issue."

Update 2/25/2017 8:15, California time: SVN maintainers have released a tool administrators can use to prevent the glitch. The script will reject any generated PDFs based on Thursday's SHA1 collision. It will catch all such files, not just the two PDFs released by the researchers.

"This script will reject any generated PDFs based on Google's SHA1 collision," Apache SVN developer Stefan Sperling wrote in an e-mail. "This script is the first mitigation the Subversion project has made available on short notice."
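The article does not describe how the script works. One plausible approach, sketched below in Python, relies on the assumption that files generated from a published collision share a fixed leading byte sequence, so a check on that prefix catches every variant rather than just two known files. Everything here is hypothetical: `EXAMPLE_PREFIX`, `PREFIX_LEN`, `BLOCKED_PREFIX_DIGESTS`, and `should_reject` are made-up names and values, not taken from the Subversion project's script.

```python
import hashlib

# Placeholder prefix for illustration only. A real check would use the
# leading bytes shared by the published colliding files, which are not
# reproduced here.
EXAMPLE_PREFIX = b"%PDF-EXAMPLE-COLLISION-PREFIX"
PREFIX_LEN = len(EXAMPLE_PREFIX)

# Store digests of known-bad prefixes rather than the raw bytes.
# SHA-256 is an arbitrary choice for this sketch.
BLOCKED_PREFIX_DIGESTS = {hashlib.sha256(EXAMPLE_PREFIX).hexdigest()}


def should_reject(content: bytes) -> bool:
    """Return True if the file's leading bytes match a blocked prefix."""
    prefix_digest = hashlib.sha256(content[:PREFIX_LEN]).hexdigest()
    return prefix_digest in BLOCKED_PREFIX_DIGESTS


# Two different files built on the same prefix are both rejected,
# while an unrelated file is accepted.
variant_one = EXAMPLE_PREFIX + b" ... first payload"
variant_two = EXAMPLE_PREFIX + b" ... second payload"
ordinary = b"%PDF-1.4 an ordinary document"
```

A repository hook could call a function like `should_reject` on each file in an incoming commit and refuse the commit if any file matches.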

According to a WebKit bug report, the repository became corrupted late Thursday night when someone committed the PDFs to test how the system would handle them. Almost immediately, the system experienced failures. The errors persisted into Friday and eventually prompted one user to ask, "Is it fixable, or are we just totally hosed? Are we going to need to delete all the SVN history since this commit from the server in order to avoid the hash collision?" Responses indicated that the repository remained at least partially corrupted even after the PDFs were deleted. A message on a WebKit e-mail list showed that mirroring systems still could not be updated.

The precise status of WebKit's repository isn't clear, and attempts to reach WebKit officials weren't immediately successful. It's not clear if production SVN systems used by other organizations are experiencing similar outages. This post will be updated as warranted.
