Color the map of the British Isles so that no two physically bordering regions have the same color (Click to color)
Color the map of the British Isles so that no two physically bordering regions have the same color
(Click to color)
I showed this to my two kids, and we all three enjoyed it. The zero knowledge proof portion didnt really click for me, but we liked the four color map theorem stuff. This led me to download some maps for my kids to attempt coloring on paper, and also got me wondering about how this holds or doesn't on non-euclidian spaces. Turns out the maximum is four colors on a sphere, but 7 colors on a torus! More details here: https://mathworld.wolfram.com/TorusColoring.html
Thanks for leading us down this mathematical rabbit hole today.
The best analogy of zk-proofs I've heard is to suppose you have found Waldo in "Where's Waldo," and want to prove that you have done this without revealing the location.
You could take a piece of paper (much larger than the picture/book), and cut out a waldo-shaped hole it and position the paper such that he is shown in the hole. Then, when you show it to the challenger, they know that you have found him without you revealing where he is.
It took me a minute to fully get this, so I'm adding this so it's a bit more obvious for anyone else: the piece of paper is much larger than the picture/book so that it can hide the book's relative position underneath it.
Unfortunately that completely defeats the idea of the proof.
Here's an alternative procedure:
1. Get a very large sheet of paper, and cut a Waldo-shaped hole in it.
2. Get some more paper, and paint a picture of Waldo on it.
3. Paste your image of Waldo on the back of the paper you prepared in step 1.
4. Place this composite over a Where's Waldo book.
5. You've found Waldo!
If you can't tell where the book is, there is no evidence that the image of Waldo is part of the book.
if you can tell where the book is, it's not zero-knowledge anymore...
see my other comments -- the idea is that in each round you should be able to verify the construction (there's a Waldo-sized hole and the correct book and page when the pieces are separated) or the proposed solution (Waldo through the hole) but never the offset of the book and hole (because then you can deduce where Waldo is).
a cheating prover (utilizing your strategy, or just Waldo from a different book or whatever) would try to guess which one you will want but only has probability 1/2 of succeeding. through iteration the verifier can be exponentially increasingly confident that the prover knows where Waldo is.
But it's a simplification: One iteration is enough to detect lying.
In a real ZK proof the probability of the prover lying reduces after each iteration but never reached 0.
there is a probabilistic interactive component. this protocol allows the prover to fake it by just using a book where they know where Waldo is. in each iteration you choose whether to confirm it's the original book and page under the paper or see Waldo (but not both). a cheating prover has p = .5 of fooling a verifier.
but your concern is invalid to begin with. nothing in the definition of a zkp requires them to be multi-round interactive. there exist non-interactive zkp.
How do I know if it is the original "Where's Waldo" under the paper?
in each iteration you choose whether to confirm it's the original book and page under the paper or see waldo
I think that conveys what a zero knowledge proof achieves but it doesn't really correspond to any real zero knowledge proof algorithms. You can't do that over the phone, which is kind of the whole point.
no, that's not even close to the whole point. the analogy is to introduce the concept of proving something to a verifier without giving the verifier the solution
the paint-mixing analogy of diffie-hellman also can't be done over the phone, but it helps people understand how a shared secret can be established even if all communication is intercepted
It is the whole point. ZKPs are useless if you can't do them remotely.
> the analogy is to introduce the concept of proving something to a verifier without giving the verifier the solution
Yes that's what I said.
Four colours also isn't enough for some real-world country maps. Countries with enclaves (like Alaska, Kaliningrad or Sint-Maarten, though only the last actually makes 4- colouring impossible) change the topology: you can think of what shape the earth would have if there was also a tunnel connecting Alaska and New York.
Torus, of course, but how does that relate to enclaves, which are planar ?
You might want to colour the exclave the same colour as the rest of its country, even though no connection can exist between them on the plane.
It seems like you're using "enclave" and "exclave" interchangeably, which is causing confusion. What you're referring to is an exclave; an enclave is when one region is completely surrounded by another, like Lesotho or Vatican City.
You're right, my first post should say exclaves and the three examples I gave are all exclaves.
This video helped me a lot to understand the basics of zero knowledge proof: https://www.youtube.com/watch?v=fOGdb1CTu5c
Great video, thanks. My takeaways:
- The physical demonstrations at 1:16 and 3:48 were helpful
- thinking of proofs as between a prover and a verifier, not as classical deterministic proofs
- insight into the name "zero-knowledge", as in "you can already predict the answer, so you're not gaining any knowledge from that interaction"
And it turns out that on the Klein bottle, the maximum is 6 colors. More generally, this number depends directly on the number of holes of the surface [1].
Regarding the zero knowledge proof, there'd be a chance that the two random ones you reveal are the same color, if three weren't enough.
So by doing this over and over again, if the two you choose are always different colors, you approach a 100% certainty that it's legit.
You never really get a 100% proof but the more times you repeat the closer you are to being sure. At 99.999999% after repeating this enough times, you'd most likely be satisfied.
When I read the zero knowledge proof part the first time, my thought was:
The person can just always give two different colors (and never two same colors) when you reveal two post it notes, so you could never proof them false.
Only after re-reading I realized _all_ the colors are to be hidden under the post it notes _beforehand_, not at the time you choose two post it notes.
Does the torus one apply to a wrap-around 2D map then? So if the edges don't wrap, the max amount of colors needed is 4, but if the edges do wrap, you could make a map requiring 7 colors?
EDIT: yep looks like it, the leftmost picture under 'snapshots' here is a 2D map of what's on the torus and looking at e.g. the blue band, it touches all 6 others (just barely the cyan and yellow ones with a few pixels of its tips when wrapping between top/bottom): https://demonstrations.wolfram.com/SevenColoringOfATorus/
Here's my explanation of the ZKP:
First, both you and the oracle know what the uncolored map looks like, and what the 3 colors are.
Step 1: The oracle sends the entire 3-colored map, but asymmetrically encrypted. So you can't see the map, but the oracle can't "un-send" it. If no 3-coloration of the map exists, the oracle has to send you something, and because you know what the map and colors look like, the only thing they can send is a map where some two adjacent regions have the same color.
Steps 2&3: You randomly pick two adjacent regions and ask the oracle for their decryption keys, so you can see their colors. When you initially received the encrypted map, you could not know whether it had two same-colored adjacent regions. But, if you happen to randomly choose the same-colored regions, the oracle has no way of not telling you. If the oracle refuses to send the decryption keys for those regions, or sends ones that don't successfully decrypt, you'll know something is up, and can assume the regions are the same color. And the only keys that successfully decrypt the regions' colors, decrypt into their original colors; so the regions will only decrypt into different colors, if they were different colors in the encrypted map the oracle originally sent.
To further illuminate: the "random pick after the map is received" ensures that the oracle must send you a map where all adjacent regions have different colors, even though you can't see all the colors yourself. Otherwise, the oracle can't guarantee that you won't ask for the two regions with different colors (because they sent the map before you randomly pick), and if you do ask for those regions, they can't respond in a way that re-assures you the colors are different (because the only way to do so is send keys that decrypt the regions into separate colors, and since they decrypt into the same color, such keys don't exist).
Step 4: Repeat steps 1-3 an unbounded number of times. This is necessary because in a single iteration, there's a chance that the oracle sends a map with two adjacent same-colored regions, but you pick two different regions; so the map is un-3-colorable, but you don't find out. In fact, this is a very high chance if it's a large map and only a single pair of adjacent regions have the same color. But more iterations increase the probability that you do find out indefinitely; with enough iterations, the probability that one of them you get lucky and select the same-colored region is 99.999...%.
Also, each time you repeat steps 1-3, the oracle sends the map with a different coloration. Otherwise, you'd slowly reveal the colors to get the fully-colored map, so it wouldn't be a zero-knowledge proof (two colors in each of two maps with different colors doesn't give you any more information than two colors in one map, so even with unbounded iterations the original coloration isn't revealed). If the map isn't 3-colorable, the randomization doesn't affect the probability: when the oracle randomizes the map, they could choose an entirely different coloration and give two different adjacent regions the same color, but the probability of you randomly choosing those two regions stays the same, so the probability of "getting lucky" in one of enough iterations also stays the same, at 99.999...%.
same that was extremely fun but im not sure i'm going to grasp the zk proof thing
[spoiler alert]
I knew that 4 colours sufficed for any arbitrary map from back in the day when I learned this, but still I found it VERY rewarding by attempting to draw a map that needed 5 colors, and how intuitive this demo was for getting a "feel" for a thing that I knew only theoretically! Like I needed an impossible geometry to fit, either an area that stretched to a zero-width path (which would becomes a point, and thus 2 areas, so doesn't fit) or some other "impossible" geometry. Loved it, congrats on a really well executed idea!
It was really enjoyable to try. I got an optimal 4-color map on the first try, so I was over confident. My approach was something like: put a bunch of the 4-color maps next to a long Chile-like thing. When that didn't work I added borders until my device couldn't render it any longer. Very very fun.
there has to be a layman's explanation. I knew the easiest way to get four colors was to put a split square inside another split square "donut", but the reason you can't force 5 colors is that word "inside". There has to be a nice, tidy "verbal proof" that no matter what, one or more colors will be "trapped" inside at most 3 other colors.
You would think. The easier, five-color theorem proof fits in a few paragraphs but the four-color theorem really resists a simple explanation. Even the simplified 1990s version of the proof (which came ~20 years after the original proof and 100 years after the 5CT proof) required enumeration of hundreds of individual cases.
Loved the interactions and flow overall but I'm a bit lost on the zero knowledge proof example. I'm familiar with the concept but I don't follow how the example is one. E.g. "By repeating the process enough times, the probability that you never catch me becomes smaller than, say, getting struck by lightning" doesn't seem to show it's a proof? If I pick a hundred numbers it'll look like I just proved some black box function which happens to be Sin[n] + 0.999999999999 is always positive even though I'd be able to clearly show it negative with the knowledge of the function.
It feels like something that got detached from the things that make it work during simplification. Or it could be that I just have a misunderstanding/oversight in the zero knowledge proof :).
In an unrelated note: I colored the larger graph and it didn't even play along!
Very glad you enjoyed it!
For the ZK example, the math behind it is this: if there are m bordering regions and I am lying, you have a 1/m chance of catching me each time. Thus after k repetitions the chance you haven't caught me is (1-1/m)^k \approx e^{-k/m} which is extremely small for k sufficiently larger than m.
Now, you may rightfully say: hey that's still not a "proof," you could still be lying! There are two responses to this:
1. The probability can be made incredibly small, like smaller than the the chance, say, your computer got hit by a gamma ray burst that would flip bits from 0 to 1 (I really have no idea if this actually happens but people have said it to me).
2. It turns out it is mathematically impossible to get the zero knowledge property if you want true proofs (i.e., no probability of being wrong). So, there's a trade off: if you want zero knowledge, you have to accept some (small) failure probability
P.S. Adding an easter egg for coloring the larger graph is on the todo list :)
Yeah, I got tripped up by that formulation as well and it's actually something that annoys me with a lot of algorithms that have some properties proven in a limit: It's "easy" (or at least possible) to mathematically prove that in the limit of some variable, the property will hold: If you repeat the challenge increasingly often, the probability of being lied to will get arbitrarily close to zero; for sufficiently large input sizes, some algorithm runs in linear time; with sufficiently large amounts of training data and iterations, some prediction error will become arbitrarily small, etc etc.
But none of that is telling you how much is "sufficient", or even which order of magnitude we're talking about. If the quantity has a real life cost, this would result in enormous practical differences.
(With the formula you have given for the ZK proof, we're at least one step further: You can start with the desired probability, e.g. the gamma ray burst und calculate the required minimum k from that - also, it's easy to see that the color problem lends itself well to such proofs because the probability of failure drops exponentially quickly with growing k, so the actual k you choose can be relatively small. But if all you have is a proof in the limit, that's not possible)
The problem with doing this on a computer is getting us to believe you didn't just make up the colors as we tell you to reveal them (after being “dishonest” before).
That's the idea at the end about presenting the "sticky notes" as products of primes. Assuming you can't factor the primes yourself, you can be given the whole grid of those products and then interactively ask for the factors or a pair of them. The requestor can't give an alternative factorization (ie. make up a color on the spot) since each number can only be factored one possible way and its easy to verify.
I really liked this part that shows all the numbers up front, and none of them change during the reveal step.
I think it would present better if introduced as "To show that there's no cheating going on behind the scenes, we will..."
Yeah I agree it could be presented clearer. Maybe make the analogy of multiplying the factors together and "covering with a post it" more explicit
You're right. That does cover it. I was playing with my kid and I didn't get it at first.
I might use smaller factors then.
Thanks for the explanation, it seems the definition was slightly different than I assumed it to be previously and that was my missing link to it all making sense. Thanks also for the demonstration to share this info!
Looking forward to the easter egg :)
OP: Although I'm not really in the target audience for this demo (I already knew all the punchlines), it does occur to me that it might be helpful to readers like the parent-commenter — and even perhaps thought-provoking to us know-it-alls — if you provided a mode at the end of the demo where the graph was in fact not three-colorable, and the computer actually would lie about its being three-colorable. So it would generate a "three-coloring" with a flaw somewhere, and display its representation as products of primes, and you'd get to choose two adjacent products and receive their factors... and so you could see for yourself exactly how long it took for you to luck into catching the computer in one of its lies.
And the demo could also tell you the expected number of iterations to catch the lie with (50/90/99%) probability. It'd be a pretty large number even for such a small graph, I'd bet.
(Of course the computer could also lie about the factorizations, since it's unlikely a human would bother to catch it in that kind of lie; but let's assume it doesn't ever do that.)
Readers might also be interested in the https://mathworld.wolfram.com/McGregorMap.html (reported, on 1 April 1975, to require five colors!)
For me it was less about the idea of how likely you'd quickly it converges to you almost certainly outing them vs misunderstanding the idea that a zero knowledge proof is about, more or less, the "limit" of the validation behavior to an arbitrary point choosable by the tester not necessarily an actual guarantee you can finitely reach the conclusion.
Prior to this I'd only seen "proof" in math where it has meant you can absolutely guarantee there to bo no counterexample not just that it seems impossibly unlikely there could be a counterexample. E.g. the Tarry-Escott problem where we have proof there is no sets exists with n=4 and m=5 even though we haven't ever found numerical values of sets matching that description or Merten's conjecture where the smallest counterexample is estimated to be so large (~10 billion digits) we've not even been able to find the first counterexample value despite knowing it exists due to a proof. On the other side of things we have things like the Goldbach conjecture or Riemann Hypothesis where we've poured our hearts, brains, and souls into trying to find a counterexample or proof and don't claim to have either yet.
Adjusting to that definition of "proof" for the context it all makes a lot more sense now.
I've got another problem about this zero knowledge proof. The digital version doesn't make a lot of sense to me. It depends on the fact we don't have a fast integer factorization algorithm. But integer factorization is not proven to be NP-complete, and 3-coloring is NP-complete.
So isn't it possible that there is a polynomial time algorithm for integer factorization, but no polynominal time algorithm for 3-coloring, and therefore the "zero knowledge proof" actually reveals the answer?
I think you're right, and integer-factorization is often used in these examples as a process that is hard to do but easy to verify. There are plenty of other processes that could be substituted in, e.g. reversing SHA256 hashes, that would likely be even less tractable to the target audience.
However, if P = NP, there is no process that works here - there's nothing that is hard to do but easy to demonstrate, and therefore no zero knowledge proofs exist.
Actually, that's not true either. It requires the definition that all polynomial-time algorithms run quickly and all superpolynomial ones run slowly. This is not an accurate definition for all practical problem sizes and this is where the analogies all break down. Polynomial vs nonpolynomial is more interesting to complexity theorists than "how many years would this actually take with a fast computer".
> However, if P = NP, there is no process that works here - there's nothing that is hard to do but easy to demonstrate, and therefore no zero knowledge proofs exist.
IIRC technically, there are zero-knowledge proofs for all statements in P: the proof is "prove it yourself", which the verifier can do because it's in P.
> E.g. "By repeating the process enough times, the probability that you never catch me becomes smaller than, say, getting struck by lightning" doesn't seem to show it's a proof?
That's fine though, because the point isn't really to publish math papers without disclosing proofs. For example, presenting a valid digital signature is sometimes colloquially called a proof that you had the private key, even though there is 1 in gazillion chance that you didn't. For such practical tasks, very high chance tends to be good enough.
I think the answer is that each time you reveal the colours, you observe that they are within the set of three colours illustrated at the beginning of the proof. Whichever you reveal, you never find a fourth colour.
This confused me at first.
For anyone confused by this response I had edited my comment after reading https://news.ycombinator.com/item?id=40740557 but before equivalence had hit reply and now their reply is left hanging. Sorry esquivalience! To summarize the linked answer on trusting the second dot isn't just randomly assigned: keep the context as physical post-its. Barring something like a matter bending psychic you'd be able to tell the dot under the second post-it was swapped as you made your pick.
That still leaves how to rely on chance of picks for a proof though.
It's the same thing as limits in spirit.
It's not that the chances of lying are small, it's that they can be made arbitrarily small.
Let's say my standards of "proof" are that there's only 0.1% chance that you're cheating. We play that game several times, and I'm satisfied.
Next comes someone else whose standard is 0.001% chance of cheating. They simply play the game a few more times, and they're satisfied too.
If they change their mind and decide that only 0.0000001% will make them happy, they simply tack on a few more rounds.
The key here is that the probability that you can cheat for arbitrarily long is exactly zero — for the same reason that Zeno's paradox is resolvable (and limit of 1, 1/2, 1/4, 1/8, 1/16, ... is exactly zero, and not just a very small number).
Great description in that "proof" in this context is more referring to the limiting behavior and being able to get to your desired level of arbitrary happiness than necessarily providing a traditional "proof" about it being a certainty within a finite amount of estimation. Thanks.
Thanks for the feedback, glad my comment was helpful!
Reminds me of different colored swans
This project is an enhanced reader for Ycombinator Hacker News: https://news.ycombinator.com/.
The interface also allow to comment, post and interact with the original HN platform. Credentials are stored locally and are never sent to any server, you can check the source code here: https://github.com/GabrielePicco/hacker-news-rich.
For suggestions and features requests you can write me here: gabrielepicco.github.io