To FHE, or not to FHE, that is the question.
Fully Homomorphic Encryption (FHE) is creating a buzz with new funding and companies entering the space.
Here’s everything you need to know in a few pages to help navigate the segment: the technology, the history, the challenges, and the road ahead to commercialization.
The technology.
What is FHE? It’s a way to encrypt data and still run computations over the encrypted data, without ever seeing the data itself. A simple analogy: imagine you could assemble a delicate watch while blindfolded. You still get the task done (assemble the watch; compute) without seeing the exact diamonds you put in there (the data stays private).
It allows us to perform machine learning and other tasks over sensitive data.
Since generic computations are hard to work with in theoretical designs, scientists focus on supporting just two operations: addition and multiplication. That is, given two ciphertexts C1 and C2 that encrypt plaintext values M1 and M2, you want to be able to obtain an encryption C of (M1+M2) without decrypting anything (and similarly for multiplication).
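To make the C1, C2, C idea concrete, here is a minimal sketch using the Paillier cryptosystem. Big caveat: Paillier is only additively homomorphic, so this is not FHE and it is not the construction from Gentry’s thesis. The primes are toy-sized and insecure, and the function names (encrypt, decrypt, add_encrypted) are made up for this illustration.

```python
import math
import random

# Toy Paillier key with tiny primes. Insecure, for illustration only.
p, q = 1789, 1861
n = p * q
n2 = n * n
g = n + 1                                          # common simple choice of generator
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
mu = pow(lam, -1, n)                               # modular inverse (Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt m (0 <= m < n) with fresh randomness r coprime to n."""
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(c: int) -> int:
    """Recover the plaintext using the secret values lam and mu."""
    u = pow(c, lam, n2)
    return ((u - 1) // n * mu) % n


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the plaintexts underneath (mod n)."""
    return (c1 * c2) % n2


M1, M2 = 1234, 5678
C1, C2 = encrypt(M1), encrypt(M2)
C = add_encrypted(C1, C2)   # computed without ever decrypting C1 or C2
assert decrypt(C) == M1 + M2
```

Whoever runs add_encrypted never touches M1, M2, or the secret key. What this scheme cannot do is multiply two ciphertexts, and that missing half is exactly what makes "fully" homomorphic hard.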
It turns out that if you can efficiently perform those two operations over encrypted values, you can do FHE over any program! You first convert the program into a circuit consisting of only additions and multiplications, and then perform the above operations gate by gate. This step is pretty critical, and we’ll come back to it later.
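As a rough sketch of what "converting a program into a circuit" means, here is a branch and a few boolean gates rewritten using only addition and multiplication. Plain Python integers stand in for ciphertexts here (no encryption happens), and the helper names (mux, xor_gate, and_gate, not_gate) are hypothetical. Subtraction is treated as adding a value multiplied by the constant -1.

```python
def program(b: int, x: int, y: int) -> int:
    """The 'native' program: a data-dependent branch."""
    if b:
        return x
    return y


def mux(b: int, x: int, y: int) -> int:
    """Same function as an arithmetic circuit, valid for b in {0, 1}.

    b*x + (1 - b)*y uses only additions and multiplications, so it could be
    evaluated gate by gate over encrypted b, x, and y.
    """
    return b * x + (1 - b) * y


def and_gate(a: int, b: int) -> int:
    return a * b


def xor_gate(a: int, b: int) -> int:
    return a + b - 2 * a * b


def not_gate(a: int) -> int:
    return 1 - a


# The circuit agrees with the branching program on every selector value.
for b in (0, 1):
    assert mux(b, 17, 99) == program(b, 17, 99)
```

Note that the circuit always computes both b*x and (1 - b)*y. An evaluator working over encrypted b cannot know which branch is taken, so it has to do the work for both. That is one early hint of where the overheads come from.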
The history.
Ronald Rivest, Leonard Adleman, and Mike Dertouzos proposed the idea of computing over encrypted data in 1978. However, the first construction that supported general circuits (by instantiating FHE over both addition and multiplication) was shown only in 2009 by Craig Gentry in his PhD thesis. [https://crypto.stanford.edu/craig/craig-thesis.pdf]
[A side note: Craig’s work and the advances that followed inspired me to pursue a PhD in cryptography. After getting into grad school, I printed his entire thesis, put it in a binder, and set out on a mission to read it. I couldn’t understand anything in it, and still don’t. Thankfully, my advisor came along and told me to read follow-up papers instead].
The use-cases.
There are many sources for this, so I’ll just drop a bullet list here.
Secure Data Processing in the Cloud.
Privacy-preserving ML/AI.
Encrypted Search and Database Operations.
Financial Services.
Intelligence.
And last but not least, blockchains.
What is an encrypted VM?
It’s a generic virtual environment that allows you to upload smart contracts (S) and transactions (T), and compute S(T) [asset transfers, swaps, etc.] without revealing S and/or T. This is where differentiation between the VMs can take place. Some could provide contract privacy, others data privacy, or both. Of course, without extra care, an environment that offers data privacy but no contract privacy could still leak information about the data.
The challenges.
Overheads.
Breaking composability.
Breaking dev tools.
Regulatory.
Overheads: You need to convert a program into an “FHE-friendly” representation (e.g., a circuit consisting only of addition and multiplication gates). This introduces significant overheads compared to computing the program natively. Then, you apply cryptographic operations to each gate in the circuit, replacing a single operation with a sequence of complex operations over large numbers. This also adds overheads. Overall, depending on the computation, you’re adding several orders of magnitude in overheads. Does this make it impractical for all computations? No, you can optimize the functions for specific workloads to be relatively efficient. But generic compute is far from practical, as of now.
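To get a feel for the first source of overhead (the circuit representation alone, before any cryptography), here is a small sketch that counts gates for an 8-bit "a < b" comparison, which is a single instruction natively. The add and mul helpers and the counts dictionary are hypothetical instrumentation, and plain integers again stand in for ciphertexts. One simplification to keep in mind: multiplications by constants are lumped together with the rest, while in real schemes they are typically much cheaper than multiplying two ciphertexts.

```python
counts = {"add": 0, "mul": 0}


def add(a: int, b: int) -> int:
    counts["add"] += 1
    return a + b


def mul(a: int, b: int) -> int:
    counts["mul"] += 1
    return a * b


def to_bits(x: int, width: int) -> list:
    """Bits of x, most significant first."""
    return [(x >> i) & 1 for i in reversed(range(width))]


def less_than(a_bits: list, b_bits: list) -> int:
    """Return 1 if a < b else 0, using only add/mul gates on the bits."""
    lt, eq = 0, 1                              # eq: all higher bits matched so far
    for a, b in zip(a_bits, b_bits):
        not_a = add(1, mul(-1, a))             # 1 - a
        a_lt_b = mul(not_a, b)                 # 1 only when a = 0 and b = 1
        lt = add(lt, mul(eq, a_lt_b))          # counts only at the first difference
        xor = add(add(a, b), mul(-2, mul(a, b)))  # a + b - 2ab
        eq = mul(eq, add(1, mul(-1, xor)))     # stays 1 only while bits match
    return lt


result = less_than(to_bits(5, 8), to_bits(9, 8))
assert result == 1
print("gates used:", counts)
```

Each of those gates would then be replaced by operations over large ciphertexts, which is where the second layer of overhead comes in.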
The recent announcement from Intel about a hardware circuit for more efficient FHE operations is interesting https://www.eenewseurope.com/en/intel-plans-custom-accelerator-chip-model-for-encrypted-computing/. I don’t know the details or how it would function. The challenge is that the algorithms still need to evolve, and there are a lot of parameter choices to make [I participated in an effort to standardize some of these choices back in ’19 https://eprint.iacr.org/2019/939, but I don’t think we can finalize these selections even today]. This makes designing hardware circuits challenging, since fixing the circuit topology only pays off if you commit to very specific “number sizes” (fields) and operations. Would love to learn more here.
Has there been any “breakthrough” that makes FHE a reality? Not really. At this point, we’re on the path of solid but incremental improvements in algorithms and optimizations. Nothing changes the fact that you have to convert a program into some form of FHE-friendly representation (which adds overhead) and compute over larger numbers.
Breaking composability: Privacy breaks composability. You cannot compose private data the same way you can compose public data [and programs]. That is, an encrypted transaction cannot be used in smart contracts the same way a plaintext value can. For example, a program that receives an encrypted trade cannot simply call another program, say to send a deposit, if that program operates only over plaintext values. This makes it more challenging to build ecosystems where one project builds on top of another.
Breaking dev tools: Very few seem to discuss this, but you cannot have the same monitoring/debugging tools in encrypted land. For instance, suppose your customer calls and says “my transaction doesn't go through”. To debug this, you need to know whether the issue is with the data encoding, the encryption, the program evaluation, or somewhere else in the pipeline. Without visibility into the data itself, which the customer is trying to hide to begin with, this is quite challenging.
Regulatory: We have seen issues with privacy-preserving tokens and networks. You might get in trouble just for writing the software. How fully encrypted environments will be received and classified by regulators is still to be seen.
The road ahead.
It collectively took about a billion dollars to push ZK to where it is today: a lot of excitement and potential, but more production use-cases are yet to be seen. I’d expect it will take about the same amount to get FHE to the same place [we need ~900m from VCs]. This is because similar parts of the software stack need to be rewritten, and arguably more of them.
I think engineers should focus on more vertical use-cases and think of end-to-end developer experiences. It’s tempting to think about generic VM environments, of course, but they could be optimized for specific workloads instead. It remains to be seen who would pay $ (in engineering, maintenance, and execution costs) for the inherent overheads to preserve privacy.
I’m very excited to see what’s ahead. The subject is close to my heart and a result of many sleepless nights. We need more funding, more engineers, practical use-cases, and simple dev environments.