Menu Search
Jump to the content X X
Smashing Conf New York

We use ad-blockers as well, you know. We gotta keep those servers running though. Did you know that we publish useful books and run friendly conferences — crafted for pros like yourself? E.g. upcoming SmashingConf Barcelona, dedicated to smart front-end techniques and design patterns.

P Vs. NP: The Assumption That Runs The Internet

Let’s get a few things out of the way first. This isn’t your regular Smashing Magazine article. It’s not a “how to“; it won’t show you how to build a better menu or improve your project tomorrow. This article shows you how a core problem in computer science works and why we’re all pretending we know something for certain when we really have no idea.

You’re looking at Smashing Magazine right now because you’re standing on the shoulders of a giant assumption called “P versus NP”. It’s a math problem that protects governments, runs the Internet and makes online shopping possible.

You may have never heard of P versus NP, but this article will walk you through it, show you how it works and explain why it matters. There’s a little math, but don’t worry; it’s all pretty easy.

P versus NP is a mathematical question masquerading as a philosophical one. It describes the difference between solving a problem and knowing whether you’ve solved it. Let’s start with a simple example known as the “traveling salesman” problem.

Our salesman is walking through a town with houses spread across a number of streets. He needs to visit every house once and only once. The salesman wants to find the fastest route that takes him to every house but requires as little walking as possible.

Travelling salesman map
Our travelling salesman and the map of houses he has to visit.

The salesman doesn’t know whether it’s better to start from the upper right, or to walk to the middle first, or to walk around the town and start from the other side. The only way he can know for sure is to try each route and measure how long it takes. There’s no formula he can follow to figure out the fastest route; he has to find the answer with brute force. Finding the answer is the “NP” part, which we’ll define soon.

Knowing whether he’s found the answer is easy. If he’s visited each house, then he’s done the job; if he’s skipped one, then he has to start over. That’s the “P” part. All P versus NP problems are tough to solve but easy to verify.

Yeah, There Are Some Big Words Link

The “P” in P versus NP stands for polynomial time. That just means we can predict the maximum amount of time it will take to solve the problem. The classic example of polynomial time is a quick sort. Here’s a set of blocks:

Starting point of quick-sort algorithm
Starting point for a quick-sort algorithm. (Image: RolandH21)

We want to sort them in order from shortest to tallest. The easiest way to do that is by dividing the group in half, pushing the tall blocks to the right and the short blocks to the left. Then, we’d cut each half in half, and repeat.

Animation of the quick-sort algorithm
The quick-sort algorithm in action. (Image: RolandH21)

There’s an algorithm that will tell us how long this will take. It doesn’t matter how many blocks we have or how disorganized they are, we can always predict the worst-case scenario for how long it will take to sort any number of blocks. The predictable part is what makes it take only polynomial time. All of the common math we use — addition, algebra, balancing your checkbook — can be done in polynomial time.

NP stands for nondeterministic polynomial time. It’s basically the opposite of P: There’s no equation or formula to predict how long it will take to solve a problem. Normally, the only way to solve this kind of problem is to keep trying answers until we find one that works. Figuring out the best route for our salesman is an NP-complete problem.

Let’s look at a few routes our salesman could take. He could start from the top and go down the left:

The first potential travelling salesman solution
The first route our travelling salesman tries to visit all of the houses.

That’s about 75 steps. He could also start by walking to the middle and looping around counterclockwise:

The second potential travelling salesman solution
The second route our travelling salesman tries to visit all of the houses.

That one’s about 95 steps. The salesman could also turn the map around and start from the top right. Now he can avoid looping all the way around:

The third potential travelling salesman solution
The third route our travelling salesman tries to visit all of the houses.

That route’s about 80 steps.

While we can measure each route to see which is shortest, there’s no way to tell the lengths without trying them out. We can make guesses, but pretty soon we just have to start walking. The traveling salesman problem is famous because visiting every house along the shortest route is NP (very difficult), but making sure to visit every house is P (very easy).

We could figure out this map in a few minutes, but adding a few more houses would make the solution take hours. Add enough houses and the solution would take years.

Why This All Matters Link

The traveling salesman is a cute problem, but the implications are giant.

Change the salesman to a page request and the houses to servers and you’ve got Internet packet routing. When my computer in Boston wants to send requests to a computer in London, it has to figure out a path to get there by bouncing from one server to another along the way. The data could take thousands of different routes through thousands of different computers in between Boston and London. Finding the fastest route is a bigger version of our salesman problem.

Google, Facebook and Apple try to make the problem easier by building data centers near major cities. They’re trying to make the map smaller when people make requests.

If you could figure out a better way to route data on the Internet, you could make stock trades faster than everyone else and make billions of dollars3.

Another version of this problem is all about secrets.

Most of the Internet runs on secrets. I want to give my credit-card number to Amazon, but not to the guy sitting next to me at Starbucks. I don’t want to share my bank password with my neighbours, and I don’t want to let my frenemies read my email. You can shop, share and work on the Internet because of secrets, and all of those secrets are based on math.

Most of the secrets on the Internet are protected by public-key cryptography4. That cryptography is based on finding two large numbers that, multiplied together, equal a very large number. The two large numbers are factors of the very large number. (Think of a backpack full of weights. You may know that the pack weighs 10 kilograms, but you don’t know whether it has one 10-kilogram weight, 10 one-kilogram weights or anything in between.)

For example, take the number 26. It has four factors: 1, 2, 13 and 26. Every number contains 1 and the number itself, so we’ll ignore those factors; the important ones are 2 and 13.

2 × 13 = 26

You can find the factors of 26 by trying every number between 1 and 26. It’s a lot tougher with a mindbendingly large number like:

33478071698956898786044169848212690817704794983713768568912431388982883793878002287614711652531743087737814467999489

Can you find factors of that number? I’ll give you one answer.

36746043666799590428244633799627952632279158164343087642676032283815739666

and

511279233373417143396810270092798736308917

Those numbers are dizzyingly large. You could never work them out on a piece of paper, and neither could a computer. There’s no good way to write a computer program to find the factors of large numbers quickly. The best you can do is to discard the numbers that clearly don’t work, and then search for factors one by one among the trillions and trillions of numbers remaining. Finding factors is an NP problem.

I had to work hard to find those factors (well, I looked them up on Wikipedia, but someone worked hard to find them). Making sure I’m right is easy. You could write a small computer program to multiply the second and third numbers. If they produce the first number, then I’m right and they are factors. Your job is simple; this is a P-level problem.

If finding those factors were easy, then the Internet would fall over.

P Vs. NP Link

Having a secret is not very useful if I can’t prove I have it, but I can’t just tell you the secret because then it wouldn’t be a secret. I need to prove that I know the value without telling you what it is. That’s where the factors come in. The larger first number proves that I know the factors without ever telling you what those numbers are. Take a look at “Backpack Algorithms and Public-Key Cryptography Made Easy5” if you’re interested in how the factoring turns into cryptography.

If you could come up with a way to quickly find large factors, then you could steal my secrets. Robert Redford starred in a movie about this6.

Many smart people have been working on a way to find those factors, and they haven’t found one yet. We’ve based the entire security of the Internet on the “fact” that there’s no easy way to find the factors of large numbers, but it isn’t really a fact. We don’t know whether a fast way to find factors or get directions for our salesman exists. Maybe it’s out there and we just haven’t found it yet. Maybe someone will find it tomorrow. That’s what P versus NP is all about.

P ≠ NP Link

Right now we assume that P does not equal NP. This means that some problems are easy and others are hard. We think our secrets are safe, but we can’t prove it.

Mathematics is based on a lot of assumptions. Some of those assumptions last decades before being proved true or false. As long as the assumption that P doesn’t equal NP remains true, then we can keep sharing secrets, email and credit-card numbers on the Internet without any problems. If you proved that P does equal NP, then you could cause some big trouble.

P = NP Link

Some people make the philosophical argument that P just can’t equal NP. If it did, then it would mean that finding the solution to a problem has always been as easy as verifying that the solution is correct and that factoring our large numbers is easy. That would break all public-key cryptography on the Internet. SSL would stop protecting anything, and you could never give your credit card safely to anyone.

It has even larger implications. If P equals NP, then you could solve anything as easily as you could verify it. Anyone who could drive a car could build one, and anyone who could listen to a symphony could write one. This makes my head hurt, but it’s not the real problem.

If you could create a practical example of P equaling NP, then you could solve the traveling salesman problem and change the entire way that Internet routing works. And that’s just the beginning.

A practical solution proving that P equals NP would give you enormous control over information everywhere.

What Comes Next? Link

Basing the whole Internet on an assumption is scary. We want to know whether or not P equals NP. The answer has practical applications, but it also asks a larger question about how we figure things out. Do we know them first and figure them out second, or do we need to work at a solution before we can find it?

Some problems are hard and some just look hard. P versus NP gives us a framework to figure out which is which.

If you can prove P versus NP one way or the other, you’d win a million dollars7.

(da, ml, al)

Footnotes Link

  1. 1 http://commons.wikimedia.org/wiki/File:Sorting_quicksort_anim.gif
  2. 2 http://commons.wikimedia.org/wiki/File:Sorting_quicksort_anim.gif
  3. 3 http://en.wikipedia.org/wiki/Flash_Boys
  4. 4 https://www.smashingmagazine.com/2012/05/17/backpack-algorithms-and-public-key-cryptography-made-easy/
  5. 5 https://www.smashingmagazine.com/2012/05/17/backpack-algorithms-and-public-key-cryptography-made-easy/
  6. 6 http://www.imdb.com/title/tt0105435/
  7. 7 http://en.wikipedia.org/wiki/Millennium_Prize_Problems#P_versus_NP
SmashingConf New York

Hold on, Tiger! Thank you for reading the article. Did you know that we also publish printed books and run friendly conferences – crafted for pros like you? Like SmashingConf Barcelona, on October 25–26, with smart design patterns and front-end techniques.

↑ Back to top Tweet itShare on Facebook

Zack Grossbart is an engineer, designer, and author. He's a founding member of the Spiffy UI project, the architect of the WordPress Editorial Calendar, and a Consulting Engineer with NetIQ. Zack began loading DOS from a floppy disk when he was five years old. He first worked professionally with computers when he was 15 and started his first software company when he was 16. Zack lives in Cambridge, Massachusetts with his wife and daughter.

  1. 1

    Before anyone gets excited, it’s worth noting that P=NP is The Biggest™ Baddest™ unsolved problem of computer science. Thousands upon thousands of people sat and still sit on this problem daily, and can’t find a solution to it.

    So don’t get *too* excited about billions of dollars in the stock market :P

    7
    • 2

      I’ll bet someone around here is confident he’ll solve it with WordPress.

      52
    • 3

      i a 13 year old girl have solved p vs np. it is quite simple. a calulater with enough room for up to 1,000 digets coyld solve anything. Just make a calulater with room for more digets if you need to. Duh.

      2
  2. 4

    Dmitri Tcherbadji

    November 28, 2015 7:02 am

    Zack,
    Wonderful article man. Very refreshing to see insightful and important info like this here. I already know how to build decent menus anyways :p

    The fact that we as a society have been operating under unconfirmed assumptions is as old as… Anything. We’ve been sold some things as absolute truths when they aren’t – for just as long. It is natural that religions or scientific assumptions can be proven wrong or that our race might not survive another 100 years. We will never know. Yet we still are able to function (in part with help from comforting ignorance). We are able to place bets and move along because predicting and understanding every step of the way ahead of time isn’t feasible. So far so good though, still here.

    2
  3. 5

    Thanks for breaking this down! I work in front end development and often don’t get to read about these huge computer science problems that we are trying to solve and you presented it clearly and concisely. I am sure with the rate technologies are improving that one of these Gen Y kids raised on programming will figure this out!

    2
  4. 6
  5. 7

    Great article!

    I wanted to point out the following though:

    “If he’s visited each house, then he’s done the job; if he’s skipped one, then he has to start over. That’s the “P” part.”

    So this is actually somewhat misleading.

    Confirming whether you have the right solution to the traveling salesman problem is itself NP.

    Whether an NP problem’s confirmation is itself P or NP, is the distinction between NP complete and NP hard problems, respectively. The traveling salesman is an NP hard problem.

    5
  6. 10

    Internet routing is not (in general) a good example of a P != NP theorum. For the most part, there are defined and structured rules for navigating between and within autonomous administrative systems. These follow shortest path rules and are computable in polynomial time.

    BGP, in general will follow the number of networks ‘away’. Most Service Providers will be using IS-IS or OSPF, both of which us Dijkstra’s Shortest Path algorithm. In big O notation, where E is edges and V is vertexes (E = Routers, V = Paths) :: O(|E| + |V|^2) = O(|V|^2).

    5
    • 11

      Zack Grossbart

      November 29, 2015 9:57 pm

      Network routing is not the perfect example, but it’s one that everyone understands. There are many ways that we simplify the network routing problem to make it more tractable, but in a pure form network routing is NP complete.

      I recommend Computer Networks by Andrew S. Tanenbaum as a thorough introduction to computer networks and routing problems.

      0
  7. 12

    Sigurt Bladt Dinesen

    November 29, 2015 11:46 pm

    This items in this article range from wildly inaccurate to plain wrong.
    – P is _not_ the set of problems with predictable maximum running time. It is the set of problems that can be solved in polynomial time, on a deterministic Turing machine. These two definitions are _not_ equivalent.

    – NP is _not_ the set of problems that do not have predictable maximum running times.
    It is the set of problems whose solutions can be verified in polynomial time, on a non deterministic Turing machine. Again, these two definitions are not equivalent.

    – The traveling salesman problem (which, yes, is NP-hard) is _not_ equivalent to packet routing. The TSP is finding the lowest-cost path that visits all nodes once. packet routing seeks a path between two nodes, that visits the fewest nodes! As such, packet routing is an instance of the shortest path problem — which has a polynomial time solution!

    Seriously, what the hell?

    5
    • 13

      Zack Grossbart

      November 30, 2015 1:22 am

      Thank you for reading the article and a big thank you for your comment. This article is a high level view of a complex topic. There’s a lot more to dig into here. Do you have some favorite books or links you would recommend?

      Thanks,
      Zack

      7
      • 14

        Sigurt Bladt Dinesen

        November 30, 2015 11:05 am

        Thank you for responding so well to critique.
        I can see the idea in a holistic introduction to the P=NP problem, but I think nailing the definitions of the classes (P, NP, and probably NP-complete) is imperative. This is probably the hardest part of writing such an introduction, if you don’t want to get to computer-sciencey. Getting it wrong will make for an introduction that does more harm than good to those who read it.

        Not an introduction, but a nice exploration of the consequences of P=NP or PNP (or of the limitations of asymptotic performance analysis) is https://rjlipton.wordpress.com/2009/07/03/is-pnp-an-ill-posed-problem/

        I wouldn’t particularly recommend any of the introductions I’ve read on P=NP. The key to understanding the problem lies in subtleties in the definition of NP (and hence the difference between NP-complete or NP-hard). Therefore, an intuitive, holistic explanation is elusive.

        5
  8. 15

    FedEx “solved” the “unsolvable” by flying all packages to a central point and then outwards to their final destination.

    1
  9. 16

    Interesting article, but the author’s definitions of NP are a bit misleading.

    NP is simply the class of problems which can be verified in polynomial time. Importantly, this _includes_ all problems in class P, since a problem that can be solved in polynomial time can of course be verified in polynomial time too. So the class P, far from being “the opposite of NP”, as the author claims, is in fact a subset of the class NP: the open question is is it a strict subset or are the sets equal.

    Also, his point that “There’s no equation or formula to predict how long it will take to solve a [NP] problem” is simply untrue. Take, for example, the subset sum problem: we can prove that it will take on the order of 2^N time to solve it. What makes it NP is not that we don’t know how long it will take to solve – it’s that the best algorithm we have to solve it is not polynomial.

    1
  10. 17

    Interesting article.

    In the case of the travelling salesman problem, is Dijkstra’s algorithm the best way to find the shortest path between two nodes or is there a better solution available?

    0
  11. 18

    Thank you for this great read.

    Whatsoever – the internet isn’t based on this fundamental assumption. It is the very nature of problem vs distance to solution. One, that you can’t even philosophicly solve.

    But i would like to see some try :)

    0
  12. 19

    kevin carvill

    December 3, 2015 7:50 pm

    I would highly recommend listening to this podcast from BBC Radio 4 regarding this subject. Very interesting.
    http://www.bbc.co.uk/programmes/b06mtms8

    0
  13. 20

    The conclusions in this post are kind of wrong:

    if we proof that P=NP, but we also proof that the O is n^googol, then we can safely distribute our credit card number over the internet, ’cause such a polynomial is kind of long to computate anyway.

    …at opposite side, if we proof that P!=NP, but we also proof it’s O is under n^logN, then we’re screwed, ’cause our communications can be decrypted in a short time.

    BTW, any encryption is already broken ’cause of “us”: https://xkcd.com/538/

    0

↑ Back to top