## Backpack Algorithms And Public-Key Cryptography Made Easy

E-commerce runs on secrets. Those secrets let you update your blog, shop at Amazon and share code on GitHub. Computer security is all about keeping your secrets known only to you and the people you choose to share them with.

We’ve been sharing secrets for centuries, but the Internet runs on a special kind of secret sharing called public-key cryptography. Most secret messages depend on a shared secret—a key or password that everyone agrees on ahead of time. Public-key cryptography shares secret messages without a shared secret key and makes technologies like SSL possible.

Cryptography is a scary word: it conjures thoughts of complex equations and floating-point arithmetic. Cryptography does have a lot of math, but it’s more about keeping and sharing secrets.

### A Simple Secret

Telling my best friends a secret is easy: I find a private place and whisper it in their ears. As long as no one is listening in, I’m totally secure. But the Internet is full of eavesdroppers, so we need codes.

We’ve all been inventing codes since we were children. I created this simple number code (actually a cipher) when I was 5:

```
a=1, b=2, c=3, d=4, e=5…
```

It fooled my friends, but not my parents. Simple substitution ciphers are based on a lack of knowledge. If you know how they work, then you can decode every message. The experts call this “security through obscurity.” Letter and number substitutions don’t work on the Internet, because anyone can look them up on Wikipedia. For computer security, we need codes that are still secure even if the bad guys, or your parents, know how they work.

The most secure code is still easy to use: a “one-time pad.” One-time pads have been used for centuries, so they don’t even need computers. They played a big part in World War II, when each pad of paper with the key numbers was used only once.

Let’s say I wanted to send you this secret message:

```
I love secrets
```

First, I’d turn the message into numbers using my simple cipher from when I was 5. (I’ve heard rumors that other people had this idea first, but I don’t believe it.)

Then I’d mash my keyboard to generate a random key string for my one-time pad.

Now I can add the two strings together. If my number is greater than 26, I would just wrap it around to the beginning. So, `i(9) + e(5) = n(14)`

, and `o(15) + t(20) = i(35 - 26 = 9)`

. The result is an encrypted string:

Decrypting the string to get the secret back is easy. We just subtract the one-time pad: `n(14) - e(5) = i(9)`

. Follow that pattern through the entire message, and you can securely share a secret. You don’t even need a computer: just work it out with a pen and paper.

This function is called a symmetric-key algorithm, or a shared-key algorithm, since it uses the same key to encrypt and decrypt the message. Modern systems can safely use the pad more than once, but the basic idea is the same.

The one-time pad is totally secure because the bad guys don’t know how we got the encoded letter. The `n`

could be `i + e`

, `j + d`

or any other combination. We can use our shared secret (the one-time pad) once to share another secret.

But there’s a fatal flaw. We need to share the one-time pad ahead of time before we can start sharing secrets. That’s a chicken-and-egg problem because we can’t share the pad without worrying that someone will snoop. If the bad guys get the one-time pad, then they would be able to read everything.

One-time pads help me share secrets with my best friends, but I can’t use them with strangers such as Amazon or Facebook. I need a way to share something publicly that doesn’t compromise my one-time pad. I need a public key.

### The Public-Key Backpack

Public-key encryption focuses on a single problem: how do I prove that I know something without saying what it is? An easy concept to help us understand this is a backpack full of weights.

I want to prove that I know which weights are in my pack, but I don’t want to tell you what they are. Instead of showing you all of the weights separately, I’ll just tell you the total. Now you can weigh the pack and see if I’m right without ever opening it.

If the pack weighs 20 kilos, then you wouldn’t know if it has one 20-kilo weight, twenty 1-kilo weights or something in between. With a large number, you can be pretty confident that I know what’s in the pack if I know the total; you don’t have to see inside. The weight of the backpack is the public part, and the individual weights are the private part.

This basic backpack enables us to share a secret without really sharing it. If we each have a backpack, then we can both share secrets.

The backpack works well enough for smaller numbers, but it isn’t useful in the real world. Backpack algorithms were a mere curiosity for decades. Then RSA changed everything.

### RSA

RSA was the first public-key encryption system that worked in the real world. Invented more than 30 years ago, it coincided with the introduction of the more powerful computers that were needed to run the big numbers. RSA is still the most popular public-key encryption system in the world.

The basic premise of RSA is that factoring large numbers is difficult. Let’s choose two prime numbers: 61 and 53. I’m using the numbers from Wikipedia’s article on “RSA” in case you want more details.

Multiply these two numbers and you get 3233:

```
61 × 53 = 3233
```

The security of RSA comes from the difficulty of getting back to 61 and 53 if you only know 3233. There’s no good way to get the factors of 3233 (i.e. the numbers that multiply to make the result) without just looking for all of them. To think of this another way, the weight of our backpack is 3233 kilos, and inside are 61 weights weighing 53 kilos each. If you make the resulting number large enough, then finding the numbers that produced it would be very difficult.

### Public And Private Keys

Unlike the one-time pad, RSA uses the public key to encrypt information and the private key to decrypt it. This works because of the special relationship between the public and private keys when they were generated, which allows you to encrypt with one and decrypt with the other.

You can share the public key with anyone and never reveal the private key. If you want to send me a secret message, just ask for my public key and use it to encrypt the message. You can then send it to anyone you want, and you’ll know that I’m the only one who can decrypt the message and read it.

I could send you a message in the same way. I would ask for your public key, encrypt the message using it and then send it to you to decrypt. The popular program Pretty Good Privacy (PGP) worked like that. We’re secure as long as we both keep our private keys private.

Exchanging keys is made even easier by special key servers that allow you to search for people and find their public keys.

Public-key encryption also works in reverse to provide digital signatures. Let’s say I want to write a message and prove that I wrote it. I just encrypt it with my private key and post it. Then anyone who wants to check can decrypt it with my public key. If the decryption works, then it means I have the private key and I wrote the message.

RSA is relatively simple: take two numbers (the private key), apply some math, and get a third number (the public key). You can write out all of the math in a few lines, and yet RSA changed the world. Business doesn’t work on the Internet without public-key encryption.

### RSA And HTTPS

We use public-key encryption every day with HTTPS. When you access Facebook, Twitter or Amazon with HTTPS, you’re using a simple encryption mechanism like the one-time pad, but you’re creating the pad with public-key encryption. Without HTTPS, anyone else at Starbucks could read your credit-card number, Facebook password or private email while sipping a latte.

Amazon has a certificate from a company named VeriSign. The certificate certifies that Amazon is Amazon, and it contains its public key. Your browser creates a special key just for that session and encrypts it using Amazon’s public key. Then it sends it over the Internet, knowing that only Amazon can decrypt the session key. Once you’ve exchanged that secret key, you can use it as the one-time pad to protect your password and credit-card number.

You could keep using public-key encryption for the whole session, but because of all the math, it’s much slower than the one-time pad.

### RSA And GitHub

Another place many of us use RSA is GitHub. Every time you push a change to GitHub or pull from a master branch, GitHub has to make sure you have permission to make the change. It gets its security through a secure command shell using RSA.

Remember when you set up your GitHub account and followed some commands to generate keys?

You used the SSH-Keygen tool to generate a new RSA private/public key pair. Then you went to your GitHub account page and entered your public key.

Now, when GitHub needs to authenticate you, it asks your computer to sign something with your private key and return the signed data. With your public key, GitHub can confirm that the signature is authentic and could only have been produced by someone who has the corresponding private key—even though GitHub itself doesn’t have that private key.

That’s better than a simple password because nobody can snoop it. And if GitHub ever gets hacked, your private key won’t be in danger because only you have it.

#### Sharing Passwords

When WordPress.org was “hacked”, it wasn’t really hacked. WordPress plugin developers, like everyone else, have accounts on other websites. They also reuse their passwords. When hackers cracked those other websites, they used the stolen passwords to log into WordPress.org and make malicious changes to plugins.

Most people use the same user name and password on multiple websites. That makes your website only as secure as everyone else’s. Public-key encryption changes that. Because you never have to share your private key, it doesn’t matter if other websites get hacked. If an attacker breaks into GitHub and gets your public key, they can’t use it to impersonate you or log in as you on other websites. Only someone with your private key can do that, which is why your private key remains safe on your computer. Using public-key cryptography makes GitHub much more secure.

#### GitHub Gets Hacked

GitHub was hacked recently, but not because the encryption failed. Real-world security breaches are caused by problems in implementation, not in math.

In this case, the hacker was able to exploit a hole and add his public key to the Ruby on Rails repository. Once the key was added, GitHub used it to verify the hacker’s identity and granted him access. We’re lucky this hacker was friendly and told GitHub about the issue.

Once the problem was fixed, you could keep using your private key because GitHub never had it to lose; it stayed on your machine. Public keys saved GitHub from serious problems.

The weakest link in GitHub’s security was in the mechanism that allowed clever users to add public keys to other projects without being authorized. The math was perfect, but the implementation wasn’t.

### Public Keys In The Wild

Knowing the fundamentals is essential (you might say the *key*) to writing secure applications. The math is complex, but the basics are simple:

- There are two main types of encryption: shared-key encryption, such as a one-time pad, and public-key encryption, which uses public and private keys.
- Shared-key encryption is faster, but sharing the keys is difficult.
- RSA is the most popular public-key encryption algorithm, but a few others are in general use, as well as some cool experimental systems.
- Public-key cryptography works best in combination with other technologies.
- Don’t ever share your private key with anyone.

When it comes time to implement public-key cryptography in your application, don’t. RSA and other algorithms are already implemented in all major languages. These libraries include extra security features such as padding and salts, and they have a lot of testing behind them.

Most security flaws come from poor implementations and misunderstanding about the libraries. You don’t have to write your own cryptography libraries, but you do have to know the fundamentals so that you can use the ones that are out there.

*Illustrations in this article were provided by Robb Perry.*

*(al) (km)*

## Oozman

May 17, 2012 3:43 ammakes more sense… thanks for the article. :)

## tonkas

May 17, 2012 4:45 amGreat explanation of public key, although, I thought the the first public-key encryption system that worked in the real world was the one invented by GCHQ in the UK a few years before RSA, but it was so secret they didn’t tell anyone.

## Zack Grossbart

May 18, 2012 2:44 amThe GCHQ algorithm from Clifford Cocks was first, but it was classified for decades. That’s why RSA got famous and the GCHQ one is just a footnote.

Thanks for sharing.

## Chris

May 17, 2012 5:24 amGreat explanation of a complex subject! Fascinating.

## Bryan Lewis

May 17, 2012 5:31 amFinally, a straightforward, non-tech speak explanation of public/private keys. This will come in handy when attempting to explain it to others. I always confuse them at some point when I try.

## Taai

May 17, 2012 6:35 amGreat explanation! Until now I knew only a half of it.

## Aaron

May 17, 2012 6:44 amI wish every article was this clear. I now (finally) understand how this works! Thank you!

## Dmitri

May 17, 2012 10:18 amAlways wanted to know this stuff. Thanks!!

## Velja

May 17, 2012 5:48 pmGreat article!

## Michael

May 18, 2012 3:11 amNice way to make a tough explanation easy for everyone to follow…. Even Management and Marketing types :-)

ps. Zack began loading DOS from a floppy disk when he was five years old. which begs the question and are you done yet? ;-/

## Zack Grossbart

May 18, 2012 5:32 amThose old versions of DOS take a while to load.

## Pedro

May 19, 2012 3:36 amThere’s a typo here:

i(9) + e(5) = n(14), and o(15) + t(20) = i(35 – 16 = 9).

It should be:

i(9) + e(5) = n(14), and o(15) + t(20) = i(35 – 26 = 9).

Great article.

## Zack Grossbart

May 19, 2012 10:19 amSharp eye Pedro! Thanks for the correction. It should be all fixed now.

## Jake Kennedy

May 28, 2012 8:34 amLearned about this in A level computing, but never really grasped it, thanks for enlightening me!

## Chris

March 12, 2013 9:16 pmHi Zack,

You use the term “one time pad” a lot in your article. Based on my understanding, a one-time-pad is a cipher that never repeats itself and is totally unique. But you are using the term in a sense where, based on my understanding, I guess it has to loop. For example:

“When you access Facebook, Twitter or Amazon with HTTPS, you’re using a simple encryption mechanism like the one-time pad, but you’re creating the pad with public-key encryption.”

Which of these is more accurate?

- It’s actually not really like a pad, just a cipher. It DOES loop, and the only reason you mentioned “pad” was by way of analogy based on the fact you mentioned the concept earlier in the article.

- Or are you saying that based on the private key, a never-repeating cipher is generated?

Thanks, hopefully you can clear up my misunderstanding on this point.