cross-posted from: https://lemmy.dbzer0.com/post/19547690
After reading this thread I wondered whether it is possible to prove you have certain information without revealing who you are to others.
Yes, but the information would need to be computationally verifiable for it to be meaningful - which basically means there is a chain of signatures and/or hashes leading back to a publicly known public key.
One of the seminal early papers on zero-knowledge cryptography, from 2001, by Rivest, Shamir and Tauman (two of the three letters in RSA!), actually used leaking secrets as the main example of an application of Ring Signatures: https://link.springer.com/chapter/10.1007/3-540-45682-1_32 . Ring Signatures work as follows: there are n RSA public keys of members of a group known to the public (or the journalist). You want to prove that you have the private key corresponding to one of the public keys, without revealing which one. So you sign a message using a ring signature over the ‘ring’ made up of the n public keys, which only requires one of the n private keys. The journalist (or anyone else receiving the secret) can verify the signature, but learns nothing about which of the n private keys was used.
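To make the "one of n keys, but not which one" idea concrete, here is a minimal toy sketch. Note that it is not the RSA-based construction from the paper: it swaps in a Schnorr-style ring signature (in the spirit of the Abe-Ohkubo-Suzuki scheme), because that fits in a few lines of plain Python. The group parameters `P` and `G`, and the function names `keygen`, `ring_sign` and `ring_verify`, are assumptions made up for illustration and are far too weak for real use.

```python
import hashlib
import secrets

# Toy parameters (assumptions for illustration only, NOT secure):
P = 2**127 - 1   # a Mersenne prime, far too small for real use
G = 3            # base element
ORDER = P - 1    # exponents are reduced modulo P - 1 (Fermat's little theorem)

def keygen():
    """Return (private key x, public key y = G^x mod P)."""
    x = secrets.randbelow(ORDER - 1) + 1
    return x, pow(G, x, P)

def challenge(message: bytes, ring, commit: int) -> int:
    """Hash the message, the whole ring and one commitment to a challenge."""
    h = hashlib.sha256()
    h.update(hashlib.sha256(message).digest())
    for y in ring:
        h.update(y.to_bytes(16, "big"))
    h.update(commit.to_bytes(16, "big"))
    return int.from_bytes(h.digest(), "big") % ORDER

def ring_sign(message: bytes, ring, j: int, x_j: int):
    """Sign with the private key of ring member j; returns (c_0, [s_0..s_{n-1}])."""
    n = len(ring)
    c = [0] * n
    s = [0] * n
    k = secrets.randbelow(ORDER)
    c[(j + 1) % n] = challenge(message, ring, pow(G, k, P))
    i = (j + 1) % n
    while i != j:
        # For every other member, pick a random response and chain the challenge.
        s[i] = secrets.randbelow(ORDER)
        commit = pow(G, s[i], P) * pow(ring[i], c[i], P) % P
        c[(i + 1) % n] = challenge(message, ring, commit)
        i = (i + 1) % n
    # Only the real signer can close the ring, using the private key.
    s[j] = (k - x_j * c[j]) % ORDER
    return c[0], s

def ring_verify(message: bytes, ring, signature) -> bool:
    """Walk the ring once and check that the challenge chain closes."""
    c0, s = signature
    if len(s) != len(ring):
        return False
    c = c0
    for y, s_i in zip(ring, s):
        commit = pow(G, s_i, P) * pow(y, c, P) % P
        c = challenge(message, ring, commit)
    return c == c0

if __name__ == "__main__":
    keys = [keygen() for _ in range(4)]
    ring = [y for _, y in keys]
    sig = ring_sign(b"the leaked memo", ring, 2, keys[2][0])
    print(ring_verify(b"the leaked memo", ring, sig))
```

The verifier only ever sees the list of public keys and the signature; nothing in `ring_verify` depends on the signer's position `j`.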
However, the conditions for this might not exist. With more modern schemes, like zk-STARKs, more advanced things are possible. For example, emails these days are signed by mail servers with DKIM. Perhaps the leaker wants to prove to the journalist that they are authorised to send emails through Boeing’s staff-only mail server, without allowing the journalist, even collaborating with Boeing, to identify which Boeing staff member did the leak. The journalist could provide the leaker with a large random number r1, and the leaker could come up with a secret large random number r2. The leaker computes a hash H(r1, r2), encodes that hash in a pattern of space counts between full stops (e.g. “This is a sentence. I wrote this sentence.” encodes 3, 4 - the encoding would need to limit sentence sizes to allow encoding the hash while looking relatively natural), and sends a message that happens to contain that encoded hash - including to somewhere where it comes back to them.
Boeing’s mail servers sign the message with DKIM - but leaking that message would obviously identify the leaker. So the leaker uses zk-STARKs to prove that there exist a message m, carrying a valid DKIM signature that verifies against Boeing’s DKIM public key, and a random number r2, such that m contains the encoded form of H(r1, r2). Neither r2 nor m is revealed (that’s the zero-knowledge part); r1 is already known to the journalist. The proof might also need to show that the encoded hash occurs before “wrote:” in the body of the message, to prevent an imposter from tricking a real Boeing staff member into including the encoded hash in a quoted reply. Boeing and the journalist wouldn’t know r2, so they would struggle to find a message with the hash (which they don’t know) in it - although they might try statistical analysis to find messages with unusual distributions of spaces per sentence, if the distribution forced by the encoding is too unusual.
Great summary. Thanks for this.
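For the space-count encoding in the DKIM example above, a rough sketch of the commitment and the encode/decode step might look like the following. Everything concrete here is an assumption for illustration: SHA-256 as H, 3 bits per sentence, a minimum of 3 spaces per sentence, and the helper names. The filler text is obviously not natural-looking prose; a real message would need hand-written sentences with the same space counts. The zk-STARK proof itself is not shown.

```python
import hashlib
import secrets

BITS_PER_SENTENCE = 3          # assumption: each sentence carries one 3-bit digit
BASE = 1 << BITS_PER_SENTENCE  # digits are 0..7
MIN_SPACES = 3                 # digit d is encoded as d + MIN_SPACES spaces

def commitment(r1: bytes, r2: bytes) -> bytes:
    """H(r1, r2): here SHA-256 over length-prefixed inputs (an assumed construction)."""
    h = hashlib.sha256()
    for part in (r1, r2):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()

def digest_to_digits(digest: bytes) -> list[int]:
    """Split the digest into base-8 digits, most significant first."""
    n = int.from_bytes(digest, "big")
    count = -(-len(digest) * 8 // BITS_PER_SENTENCE)  # ceiling division
    digits = []
    for _ in range(count):
        digits.append(n % BASE)
        n //= BASE
    return digits[::-1]

def space_counts(text: str) -> list[int]:
    """Count the spaces between consecutive full stops."""
    segments = text.split(".")
    return [segment.count(" ") for segment in segments[:-1]]

def encode_as_filler(digits: list[int]) -> str:
    """Build placeholder sentences whose space counts carry the digits."""
    sentences = []
    for i, d in enumerate(digits):
        target = d + MIN_SPACES
        # The first sentence has no leading space, so it needs one extra word.
        words = target + 1 if i == 0 else target
        sentences.append(" ".join(["word"] * words) + ".")
    return " ".join(sentences)

def decode(text: str) -> list[int]:
    """Recover the digits from a text produced with this encoding."""
    return [count - MIN_SPACES for count in space_counts(text)]

if __name__ == "__main__":
    r1 = secrets.token_bytes(32)   # chosen by the journalist
    r2 = secrets.token_bytes(32)   # chosen and kept secret by the leaker
    digits = digest_to_digits(commitment(r1, r2))
    body = encode_as_filler(digits)
    print(decode(body) == digits)
    print(space_counts("This is a sentence. I wrote this sentence."))  # [3, 4]
```

With these assumed parameters a full SHA-256 digest needs 86 sentences of 3 to 10 spaces each, which shows why the encoding has to trade hash length against how natural the message looks.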
Yes, but only after you’ve met them. First meet and exchange public keys. Then use zero-knowledge proofs.
But how do you verify if that information is actually accurate?
For example, if a whistleblower says that their organization has something that can do xyz, is it possible to verify that through zero-knowledge proofs?
You rely on the whistleblower. That is the only way.
It might be possible. It would depend on the specific details of what the whistle blower is claiming.
That’s impossible in a generalized way. That would be the same as having an algorithm for truth.
In a journalistic context, a ZKP can’t prove veracity of the information.
Let’s say you have a hoax that you want to pull on a journo. You cook up something that looks legit, like the blueprints for a super secret stealth fighter or something. You find a way to apply a ZKP to that file (let’s say an elaborate cryptographic hash). You leak the file to the journo. They ask you to iterate on the ZKP a few hundred thousand times (which is on the low side for a ZKP) - easy to do, because you came up with it.
But that doesn’t mean the file’s legit. That’s a separate problem, and not one that is technological in nature.
At that point why not just use digital signatures?