Chasing Cyber - #35 - The Pace of Change
AI watermarking, crypto standards, and government policy
Meet Google’s Random New Watermarking
Google found a new way to use random numbers: efficiently watermarking AI-generated text to prevent plagiarism.
The new approach is called “SynthID-Text”, suggesting the marketing team had the week off. Nonetheless, the technical details are interesting.
Large language models generate text by repeatedly choosing a plausible next word based on the preceding text. Google’s watermarking approach influences which words are chosen without degrading the resulting text.
Before each word is chosen, a random number is generated based on a secret watermarking key and the context of recently generated words. This random number is used to score potential word choices, which compete against each other in a tournament bracket system until only one word remains.
The reverse process is used to detect a watermark. Each word in the text is re-scored using the same key and context, and compared against the candidates that could have been chosen. If the chosen words score consistently higher than chance would allow, the text is considered watermarked.
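To make the tournament idea concrete, here is a minimal toy sketch in Python. It is not Google's implementation: the vocabulary, the uniform "model", the HMAC-based scoring function, and names such as `g_value` and `detection_score` are all assumptions made for illustration. Each candidate word gets a keyed pseudorandom bit per tournament round, and detection simply averages those bits over the words that were actually chosen.

```python
import hashlib
import hmac
import random

# Toy vocabulary standing in for a language model's word list (assumption).
VOCAB = ("the a cat dog sat ran on under mat rug big small red blue "
         "quickly slowly and then saw heard").split()
LAYERS = 3   # tournament rounds, so 2**3 = 8 candidates per word
WINDOW = 2   # how many recent words feed the random score


def g_value(key, context, word, layer):
    """Keyed pseudorandom bit for one candidate word in one tournament layer."""
    message = "|".join(context) + f"|{word}|{layer}"
    return hmac.new(key, message.encode(), hashlib.sha256).digest()[0] & 1


def tournament(key, context, candidates, rng):
    """Knock candidates out in pairs until only one word remains."""
    pool = list(candidates)
    for layer in range(LAYERS):
        winners = []
        for a, b in zip(pool[0::2], pool[1::2]):
            score_a = g_value(key, context, a, layer)
            score_b = g_value(key, context, b, layer)
            if score_a == score_b:
                winners.append(rng.choice((a, b)))  # ties are broken at random
            else:
                winners.append(a if score_a > score_b else b)
        pool = winners
    return pool[0]


def generate(key, length, rng):
    """Generate watermarked text from a toy 'model' that picks words uniformly."""
    words = ["the"]
    for _ in range(length):
        context = tuple(words[-WINDOW:])
        candidates = [rng.choice(VOCAB) for _ in range(2 ** LAYERS)]
        words.append(tournament(key, context, candidates, rng))
    return words


def detection_score(key, words):
    """Mean keyed score of the chosen words; about 0.5 suggests no watermark."""
    scores = [
        g_value(key, tuple(words[max(0, i - WINDOW):i]), words[i], layer)
        for i in range(1, len(words))
        for layer in range(LAYERS)
    ]
    return sum(scores) / len(scores)
```

In this sketch, text produced by `generate` should score well above 0.5 when checked with the same key, while checking it with a different key should give a score close to 0.5. A real detector would turn that gap into a proper statistical test rather than eyeballing the mean.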
You can read more about the details, and Google’s large-scale testing with Gemini, in their paper: https://www.nature.com/articles/s41586-024-08025-4.
Finally, It’s Banned…
If you need proof that cryptographic standards move slowly, look no further.
Thirty-one years ago, Bruce Schneier lambasted ECB mode in his seminal book, Applied Cryptography. Finally, in 2024, it’s about to be banned in the standards.
Electronic codebook (ECB) is a block-cipher mode that describes how to encrypt long messages. ECB is the simplest possible mode: the message is divided into equal-sized blocks, each block is encrypted independently with the same key, and the results are concatenated. However, this simplicity causes a lot of problems.
One major issue is that identical message blocks always encrypt to identical ciphertext blocks, so patterns in the message show through as patterns in the encrypted data. This is best demonstrated in the classic penguin image, which I’ve included in this post. There’s so much structure and repeated data in the original image that the encrypted version still looks like a penguin.
Despite this problem (and others), ECB was widely adopted, which made it very difficult to remove. Cryptographic standards have complex interdependencies, which makes it hard to take out a foundational ingredient like a cipher mode. It’s taken many years to get to the point where ECB can finally be killed.
Remember, just because something is standardised doesn’t mean it’s the best choice. Always do your research to find the best option.
Thanks, GSMA!
Here’s something post-quantum fans should bookmark.
The GSMA has compiled a list of governmental guidance on the transition to post-quantum cryptography. For each country, you get links to the official documents alongside a summary of their timeline requirements and preferred algorithms.
It was last updated in early October, so it’s reasonably fresh. It has data for Australia, Canada, China, the Czech Republic, the European Union, France, Germany, Italy, Japan, the Netherlands, New Zealand, Singapore, South Korea, Spain, the United Kingdom, and the United States.
Bookmark this link: https://www.gsma.com/newsroom/post-quantum-government-initiatives-by-country-and-region/.