What is Encrypted Search?
For years, encrypted search has been the annoying pebble stuck in the shoe of pro-privacy tech companies. Unwilling to compromise on either privacy or convenience, engineers have long struggled to find a solution to the encrypted search problem. What’s made the problem so tricky to solve? First, some background.
What is End-to-End Encryption?
In order to understand why encrypted search has posed such a difficult challenge in cryptography, you’ll need to become familiar with end-to-end encryption.
Simply put, end-to-end encryption means only the owner of the data (and any intended recipients) has the key to encrypt and decrypt it. That means that no one else, not even the tech company providing the service (e.g., e-mail, SMS or cloud storage), is able to decrypt your data. For a more in-depth breakdown of end-to-end encryption, check out our previous blog post.
The Encrypted Search Problem: Why It's So Difficult to Search Encrypted Data
As previously discussed, encrypted data is unintelligible until it is decrypted again. In an end-to-end encrypted environment, one in which you and only you have the key to “see” the data, a search engine can’t match a query against the stored content because it can’t make sense of that content. To a computer, the gibberish of one file looks like the gibberish of another. Here’s an illustration:
Imagine a bank vault lined to the ceiling with safe deposit boxes. There are thousands of valuables within these boxes, from stock certificates to fancy brooches passed down through generations. As in an end-to-end encrypted environment, the bank doesn’t know what’s inside these boxes. That means if a customer were to call the bank and ask whether their diamond necklace is in their safe deposit box, the teller wouldn’t have a clue. Unless the teller has X-ray vision, the contents of each box are unknown to them. The only way to find out is to unlock the box, which defeats its purpose.
The same goes for trying to search through encrypted data. Since a computer, like that bank teller, can’t “see” inside your encrypted data, it can’t search for specific items. The only way to do so would be to decrypt your files first, but that renders them vulnerable to prying eyes – which is dangerous in untrusted digital environments.
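To make this concrete, here is a minimal Python sketch of why a keyword search against ciphertext comes up empty. The cipher below is a toy built from SHA-256 purely so the example runs with the standard library; it is an assumption for illustration, not a vetted algorithm, and a real system would use an authenticated cipher such as AES-GCM from a maintained library. The names `keystream`, `encrypt` and `decrypt` are hypothetical helpers.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a block counter.
    Illustration only, not a production cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with the keystream; a fresh random nonce is prepended."""
    nonce = secrets.token_bytes(16)
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce, regenerate the keystream and XOR it back out."""
    nonce, body = blob[:16], blob[16:]
    stream = keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


key = secrets.token_bytes(32)          # held only by the data owner
note = b"diamond necklace, appraised 2019"
stored = encrypt(key, note)            # what the service provider sees

# The provider searches the stored bytes and finds nothing recognizable.
server_hit = b"diamond" in stored

# The owner can search, but only after decrypting with the key.
owner_hit = b"diamond" in decrypt(key, stored)
```

In this sketch the provider's search fails (barring an astronomically unlikely coincidence in the random bytes) while the owner's search succeeds, which is the bank teller's situation in code.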
Herein lies the crux of the problem for pro-privacy tech companies: keeping stored data safe through end-to-end encryption while also allowing users to quickly and easily find their stored files.
So where does that leave data privacy advocates who want to help people protect their data without sacrificing convenience?
The Dangers of Poorly Designed Encrypted Search Engines
A simple fix is to index the data before it is encrypted. In the bank vault analogy, each customer could write a list (or an index) of what’s in their safe deposit box and use that list to quickly find what they're looking for.
On a small scale, this could work fine, but it poses a major problem: the index is itself a security risk because it contains information about the sensitive, hidden assets inside the safe deposit box. Anyone who came across your index would know just where to find your valuables. So where do you keep the index? The safest place would probably be inside the safe deposit box, but putting it there would defeat the whole purpose of having an index.
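As a rough sketch of that naive approach, the Python below builds a keyword index from the plaintext before anything is encrypted. The file names, contents and the `build_index` helper are made up for illustration.

```python
from collections import defaultdict


def build_index(files: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase keyword to the set of file IDs that contain it."""
    index = defaultdict(set)
    for file_id, text in files.items():
        for raw in text.lower().split():
            word = raw.strip(".,;:!?")
            if word:
                index[word].add(file_id)
    return dict(index)


# Hypothetical plaintext, indexed before encryption.
files = {
    "box-17": "Diamond necklace and stock certificates",
    "box-42": "Stock certificates, family brooch",
}
index = build_index(files)

# Lookups are fast and need no decryption...
hits = index.get("necklace", set())

# ...but anyone who can read the index learns every keyword
# and which file it belongs to.
leaked_keywords = sorted(index)
```

The lookup never touches the encrypted files, which is the convenience. The cost is that `leaked_keywords` is readable by whoever holds the index, which is exactly the list-outside-the-box problem described above.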
The same problem arises in cryptography. When considering encrypted search, you have to balance three factors:
- Information Leakage - How much information are you willing to leak about the encrypted data? Simple indexing is easier to implement, but it puts some data at risk of leakage. If you care about your users’ privacy, then keeping leakage to a minimum is non-negotiable.
- Scalability - How much additional complexity are you introducing into the system? Maybe you can make your encrypted search safe through client-side decryption, but that would mean users would have to download and decrypt files before being able to search them. Such an arduous process becomes impractical when storing thousands of files.
- Query Expressiveness - How accurate and flexible are your searching capabilities? Simple True/False search queries might be easier to develop and acceptable at a small scale, but they are useless at a larger scale.
If you can’t find a way to balance these three factors, then you’re putting your encrypted system at risk and hampering your users’ experience.
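One well-known middle ground that shows how the three factors pull against each other is a keyed-hash ("blind") index: the client stores an HMAC of each keyword instead of the keyword itself. The sketch below is a generic illustration of that technique under made-up file names; it is not a description of how Stealth works, and `token`, `build_blind_index` and `search` are hypothetical helpers.

```python
import hashlib
import hmac
import secrets


def token(index_key: bytes, word: str) -> str:
    """Keyed hash of a keyword; without index_key it cannot be recomputed."""
    return hmac.new(index_key, word.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_blind_index(index_key: bytes, files: dict[str, str]) -> dict[str, set[str]]:
    """Map keyword tokens (not keywords) to the file IDs that contain them."""
    index: dict[str, set[str]] = {}
    for file_id, text in files.items():
        for raw in text.split():
            word = raw.strip(".,;:!?")
            if word:
                index.setdefault(token(index_key, word), set()).add(file_id)
    return index


def search(index_key: bytes, index: dict[str, set[str]], word: str) -> set[str]:
    """Exact-match lookup: the client sends a token, never the keyword."""
    return index.get(token(index_key, word), set())


index_key = secrets.token_bytes(32)    # kept on the client
files = {
    "box-17": "Diamond necklace and stock certificates",
    "box-42": "Stock certificates, family brooch",
}
blind_index = build_blind_index(index_key, files)

exact = search(index_key, blind_index, "Necklace")   # whole-word match works
partial = search(index_key, blind_index, "neck")     # prefix match does not
```

The trade-offs are visible even at this size. Leakage drops, because the stored index holds only opaque tokens, yet it is not zero: whoever stores the index can still see which tokens are shared between files and which ones are queried repeatedly. Lookups scale well because nothing is decrypted. Expressiveness suffers, since only exact whole-word matches are possible.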
So then, how do you create an encrypted search that is not only highly secure, but also scalable and flexible enough for complex search queries? It’s a complex problem, but the Cyborg team has developed a way to balance all three of these factors.
The Holy Grail of Cryptography: Secure Encrypted Search
Cyborg has developed Stealth, a solution that not only keeps users’ data safe through end-to-end encryption, but also offers an easy way to search their data without having to decrypt it. Secure encrypted search, when designed and implemented properly, promises to be the holy grail of cryptography and cybersecurity for decades to come.
Curious about how we did it? Look out for our next blog post to learn how we solved the problem.