Cloudflare TV

Algorithms May Be Everywhere, But the Real Trick Is In the Data Structures

Presented by Marwan Fayed
Originally aired on 

Algorithms are everywhere. Whether we're aware of them or not, algorithms dominate our lives. Never mind machines - we unknowingly design algorithms that guide everything from the way we drive, to sandwich-making. Me? I claim the real magic is in a related domain called 'data structures.' A well designed data structure can transform otherwise complex operations into the trivially simple -- and I'll prove it, too, using magic!

English

Transcript (Beta)

Hello, everybody. Welcome to this session. A bit of a strange one, I'm told. Algorithms are everywhere, but the real magic is in the data structures.

So let's start with introductions.

First, Peter largely has no idea why he's here. He's been given only the bare minimum of instructions.

I am Marwan Fayed, formerly an academic professor, now with Cloudflare Research.

My area of expertise is networking in general, protocols, architectural design.

And I want to convey to people some of the elegance in the domain in which we work at a very high level, particularly for people who are learning.

But I think the people who are experienced might actually get a kick out of some of the personal observations that I might convey.

And Peter here has graciously agreed to assist me. Peter, who are you?

Yeah. Usually I sit next to you in the office, but well, apparently not for the moment.

I'm also on your team, as you might have guessed. Usually working on protocol-related stuff.

By protocols, I mean TLS. Sometimes I look at the low levels with Wireshark, which is very useful for debugging and something I like to work on as well.

Yeah. And here I have no idea what kind of card tricks I will look at, but I guess you can tell me a bit more about that.

Okay. So Peter, you are here because I want to convince the world that I can force you to think in a particular way.

Okay. Unbeknownst to you, I want to convince the world that I've placed thoughts in your head.

Or at least can read your mind, if nothing else. And I'm going to start with this deck of cards.

And I don't know if you can see this.

It's a completely sealed deck of cards, so it's not been manipulated in any way, shape, or form.

I'll tear the plastic off of it. It says bicycle. It says bicycle, yes.

I always wondered about the history about bicycle cards, where they came from, why they were named bicycle, because they seem to be everywhere.

I'm going to set those aside.

I'm taking out the instruction cards, of course, and the jokers should be on the bottom.

Certainly they are. So I'm taking these out, and I'm left with a deck of 52 cards.

I'm going to lower the screen. I'm going to take Peter out of gallery view, and bring him back for a moment, so that he can share with you the card that he's selected.

Or rather that I've implicitly sort of beamed into his head to select.

So let's try and do this. Let's see how far we can go. Roughly about here.

Okay. And I'm going to loosely shuffle. The beauty of this trick, one of the reasons I love it, there's a lot of card tricks out there that rely on mathematics in some form.

And or remembering some quality of the deck. So no trick here.

There's no card. It's just a straight deck. Not one side is thicker than the other.

It's not a magic deck. This one, though, I think is particularly elegant, because despite the complex mathematics that are behind it, one needs to know none of it.

Okay. There's just this lovely, elegant ordering that makes it very simple.

Simple enough that I've taught this trick to six-year-olds in the past who have just gone nuts with it.

And it works like this. Okay. So I'm going to deal out 21 cards in three columns.

Can you see these, Peter? Let me just lower these a little bit.

Yeah, I can't see it. But I think you're still in the gallery side.

Yeah. Okay. So if I do this now, I should be in gallery view. Am I big? Peter?

I guess.

Go ahead. Yeah. So you can see me large on the screen, right? You see the cards.

So I'm going to keep shuffling. Sorry, keep dealing. Wow. I'm starting to think I did a poor deal.

They're all red. Oh, this isn't good. I have to add some contrast, right?

Let me shuffle a little bit more. Honestly, it doesn't matter how well you shuffle here.

But I do want to add some color to the display.

It's a really, really lovely trick.

Okay.

Let's go with whatever we've got. So three columns. Let's see. So you can see that one.

And every column, seven cards.

So four, five, six, seven. Can you see all of these cards, Peter?

Yeah, I can see all of them. The very first card doesn't look particularly random to me, but...

I promise it won't make a difference in the end.

So I'm going to turn around the card now. While you write down on a piece of paper the card that I've convinced you to choose, show it to the camera, and then let me know, and I will turn back around and put it in the gallery.

All right. Any random card, please.

Any random card. Just one of them? Just one. Any one of the 21. Okay.

Let me write it down and draw it a bit. And that's the bottom side. But it's a beautiful drawing I'm going to make.

Anyone is a better artist than I am, Peter.

Okay. Is this one readable? Okay. I showed it to the screen. You showed it to the screen.

I mean, surely you remember that's the most important part. This is just for validation what you're showing to the screen.

Can I turn around? Yeah. I removed it.

Okay. Great. So now let me ask you. Which... Well, I suppose I would ask you which column is it in, but it's going to be difficult for you to point.

So I understand this is your right.

So is it in your right, your left, or the center, the card that you selected?

It's in the center. It's in the center. Okay. That's fine.

I will gather up the cards. And I will deal again.

I want you to look for the card and notice where it ends up.

Good thing I wrote it down.

Otherwise, I would already have forgotten it. It's good to know that someone else's memory is as bad as mine.

Peter, where is your card? To your right, your left, or the center?

It's on the left. This is just like an eye test where they're asking like, oh, can you see it good?

Is it too small, too big? But yeah, it's on the left.

All right. Very good. That's great. Same thing again.

Deal this out. Watch for your card. Secretly think that you have a very, very good memory.

If only that were true, Peter, I would be, I'd have studied history instead of computer science.

Your right, your left, or the center? It's on my right this time.

It's on your right this time. Okay. Interesting.

We have had three of them now in three different columns. We assume that I've done this correctly, so we will now find out.

This time, I want you to say nothing.

My lips are sealed.

Okay, off to this. I believe I chose the seven of spades.

That's impressive. Yep. Correct. Exactly. That's correct.

All right. Here's the proof. Fabulous. Oh, I'm so excited. I'm so excited this worked.

Okay. So, while I've got this in speaker view rather than gallery view, let me tell you the magic of why this works, okay?

Or at least the underlying mechanics, and then why it's important in the context of data structures.

It turns out, if you take any 21 cards and you deal them out in this fashion, okay, there is a trick I'm sure some of our viewers will have noticed, and the trick is in the way we gather the cards.

Okay. The idea being, once it's dealt out in this manner and somebody selects a card, then there's a total of four deals.

So, the initial one, and then two more where you ask the person to tell you where their card is, so a total of three times, and then on the fourth deal, no matter what card you started with, the card that was selected, if you collect them properly, always ends up in the middle of the middle deck, of the middle column.

Okay? Yeah. Blos, you want me to do the trick one more time, Peter? I mean, it worked out.

I mean, I could say it's coincident, but it sounds very convincing. Okay.

So, the trick is this. It's actually not a trick at all. Whenever you gather the columns, the selected column where the card sits should always be between the other two.

Okay?

So, let's say, for example, you chose the eight of diamonds. Hopefully, you can see way up here.

Yeah. Then, what I would do is I would ensure that this column sits between these two columns in the pile.

Okay? In any order. So, it could be that the center's on top or the center's on the bottom, leaving the other column.

So, if this is the selected column, I will put one other column below and one other column above and then redo.

Okay? Two more times, asking where is the deck.

And on the fourth deal, the card always ends up in the center of the center column.

Now, once again, I'm going to emphasize the beauty of this trick. The beauty of this trick is that you really don't need to remember anything.

There's no need to know any math, even though it's really interesting math.

The only reason you would want to know the math, actually, is to understand curiosity.

You know, what are the underlying mechanics?

Or maybe to ask if it could be done with more than 21 cards.

And it turns out that it can be done with more than 21 cards, but it's not as obvious as you would think the different numbers that you could use, clearly divisible by three.

So, let's step back for a moment. I've prepared a couple of slides.

What I wish I really had was a whiteboard, because this all really comes back to something that is very dear to my heart, and this is what I call the elegance of computer science.

Okay? In my own life, I'm a big lover of STEM in general, science and math and engineering.

Didn't actually start in computer science, the problem being for me that I loved it all.

There was elements of elegance in every one of them.

So, the physics community, for example, out of relativity, they tell us that even if we're standing still, we're still traveling at the speed of light.

Right? But we're traveling at the speed of light in that fourth dimension, which is time.

It's actually part of the universe. So, moving, you know, it's an idea that has always blown my mind.

Or from the biology domain, we learn that the DNA in our bodies is shared, what, 97% minimum with every other species on the planet.

That common blueprint is just frighteningly beautiful. Well, in computer science, we have a little bit of that, too, when we talk about how we deal with information.

So, let me first share the screen. As much as I would have loved to have drawn this, we live in an age where, okay.

So, the question now is, Peter, can you see?

I can see your slide, yes. Great. Okay. So, for years now, there's been talk in the public sphere about teaching people algorithmic thinking, computational thinking, these types of things.

Or I get people who say, you studied computer science, that's really hard.

And I'm going to make a confession and an observation here.

The confession is, I hate computers. You know, I qualify that.

But true to form, I use pen and paper instead of typing at a keyboard whenever possible.

Email drives me nuts, as I'm sure it does many people. But what's beautiful is the way we are trained to think in this domain, okay?

So, where algorithms are concerned, anybody can do it.

The observation here is, it's not that the domain is hard, it's that it's too simple.

What it comes down to is forcing oneself to think on the level of a, no offense to the young infants in the world, on the level of a two-year-old.

So, there are things that we just implicitly know to do, patterns or blocks of things, and we don't think about the fine details.

And it's only when we interact with children and they ask the, you know, why is this?

Or what's next? And, oh, you didn't say that part, that you start to realize, actually, there's a lot we take for granted.

Same when we read. We don't, when we read text, we never read word for word, we read in blocks at a time.

I know this well from the psychology community.

So, when it comes to computer science, there are a very small number of building blocks.

From an algorithmic perspective, when we're programming, there are only three things.

Thanks. And I imagine I might get some pushback here, but let's try it and see.

So, the first building block is this idea of moving data or manipulating the data in some fashion.

And typically, this is represented by, let's see, where's my little pointer thing here?

Oops.

The assignment. So, an equal sign in most languages, in some languages, or certainly in pseudocode, you can do the left arrow, right?

Arithmetic is a great example of this.

Or moving data from one variable into another. This is a type of move or manipulation.

And I want to be clear here, if somebody would say, well, what about when you call a function?

At a high level, in a high level language, it's true that it appears to be something different.

But what's actually happening in the machine is you're just taking stock of where you are, the state of the machine, moving it, and then starting a new stream of instruction.

Okay? So, that's the first one.

It's just moving and manipulation. The second is this notion of what people generally call conditionals, some people call branching.

And this is the point at which you make a decision.

Based on the truth of something, either I'm going to do this, or I'm going to do that, or I might even do the next thing.

The structures we see are typically if statements.

Some languages have things like unless and when.

And there's a third element here, which I think just for our high-level human thinking helps to think about, which is repetition.

This is represented as a for, a while, or until in some languages.

The purists will probably contend that there is no repetition at the lowest level.

Repetition is accomplished by just doing a branching, so deciding whether to do something, and then having the equivalent of a go to where you go back somewhere else.

And that creates the repetition, if you will.

Now, a couple of things to note here is in the repetition domain, any for is equivalent to any while is equivalent to any do while.

So, the repetition structure almost doesn't matter.

True to form, I have a friend who is very experienced, way better at these things than I am, and he never uses anything other than for loops, even when he wants to do a while type structure, and he will leave parts of the for structure empty.

That's how he accomplishes it, okay?

I'm going to go one step further and say one of the reasons this is so important is because I have yet to be made aware of any example in real life that doesn't consist of these three things, okay?

Some might ask, well, if this is true, then how come we can't build robots and artificial intelligence that can do everything we want them to do?

I claim it's for two reasons.

One is we have fallible memories, okay? So, oftentimes we may or may not remember with perfect precision, usually the latter.

And the second is where machines are concerned, they're unable to evaluate gray areas, if you will, right?

Something is either a yes or a no in a computation on a machine, they can't prioritize or think about the exceptions, the way that we do.

And that's where the difference lies.

Fine. This is all well and good. But you know, for me, the real fun is in understanding data structures.

So, when we learn about these things, they typically are mixed together somehow.

And I think there tends to be a lot of confusion.

So, people say, well, writing algorithms or programs is hard. And I say, well, actually, it's because it's too simple, as you saw with the building blocks.

And data structures are the same. So, interestingly, all of the data structures that we know of, I don't want to wager my house on this one, but I'm pretty comfortable making the statement, are made up of only two things, either arrays, okay?

And this is this contiguous sequence of data that is ordered and equally spaced.

Or we get what are called lists, okay? Where we have some data, and then alongside that data, there's a little reference to where the next thing in the list lies in memory, okay?

And then there's a reference to the next thing, and so on and so forth.

And this is it. And I want to be clear, neither of these is a data structure.

So, data structures are defined as being collections of data on which there are operations.

And the way in which the structures are maintained tend to preserve order.

None of that is true here, okay? So, what you get is arrays that are fixed, but very, very fast because of this indexing, okay?

And you get lists that are incredibly slow to navigate, but they're ridiculously flexible.

So, I can manipulate these arrows any way that I choose, okay?

Why is this important? Well, let me show you how I implement that card trick in a machine, okay?

I'm going to do it using stacks.

And a really quickly, you know, quick and messy implementation of a stack that I just wrote on my own.

And the reason that I'm doing stacks is because of the ordering.

Remember, I laid out the cards in a particular fashion, and I want to preserve that ordering within every column.

And stacks are perfect for that.

So, imagine a stack of plates. The only thing I can do with a stack of plate without kind of violating the plate structure is putting things on top, which we call pushing onto the stack, or taking them off the top, which is called popping off of the stack, okay?

If I don't do this, trying to program that magic trick, I can't even imagine the mess of code that I would have to write and follow.

But abstracting away into a stack, you'll see here, I've got a push, and I've got a pop, and that's pretty much it.

But for this get the nth item, this is so I can quickly, I can cheat a little bit, and I can pull out the middle card of the middle column, okay?

But all I need to know about is a push and a pop.

And now I'm going to write my card trick. Now, please apologize. I apologize completely.

This is a first pass. It turns out that I have a version in a different language from way long ago, and if there's time, I will pull it out.

But this does get the point across.

So, I'm going to start the program here, and you'll notice it's just one kind of perform the trick.

And here's the main trick here, okay? I have a deck of 21 cards, seven cards each, so three columns, and I'm creating four independent stacks.

One in which I'm going to store all of the cards at once, and then exactly those columns that you saw, I'd like to call them stacks from here on in, the left, the middle, and the right, okay?

I'm going to take all 21 cards, I'm going to generate them, and I'm going to push them into my all cards stack here, okay?

And now here's the trick. Remember I said there's four separate rounds of dealing?

Zero through three. And this is where it gets a bit messy here. I should pull much of this out and put it into function so it's cleaner, but the idea is I'm popping a card out of the deck, okay?

And I'm putting it into the left column and printing it at the same time.

And then popping a card out of the deck, putting it into the center column, and printing it out at the same time.

And I do this for as many cards as there are in the deck, okay?

So at this stage here, all of my cards have been dealt out.

This is here, again, quick and easy. On the last deal, on the last round, I just interrupt this and I'm going to return that middle card, okay?

The fourth, three plus one, because we started zero.

And this is the beautiful part.

So I'm going to skip this and come back to this. This is the merging of the deck, okay?

And I'm going to show you the function that I wrote for the merge. And I want to be clear here, the data structures are doing so much of the complexity for me behind the scenes.

And so it's nice to know how they work for exactly this reason.

Here I've written a function. All it does is it takes three stacks and merges them together, okay?

And it does not care about the ordering, if you notice here.

I'm just creating a new stack and I'm going to put in it all of the cards from the first stack, all the cards from the second stack, and all the cards from the third stack.

Now at this point, some of you might be asking, wait a second, wait a second.

What you showed us was that the card, the selected card could be in any one of the stacks, okay?

In any one of the columns. Here, all you're seeing is first, second, third column, putting them.

So if the card happened to be in the first stack, it wouldn't end up in between the other two, if I look at this function, okay?

Now I'm going to come back to the main trick. First, second, third, right? Seeing here, first, second, third.

So let's go back to that trick. The way that I use the function here is what does the ordering for me.

So in my former life as a professor, when I would assign this, what I hadn't anticipated is some people would submit pieces of work where they had written different functions depending on which column the card stood in, okay?

And I just show you one, a single one.

And the beauty here is, because of the way functions work, I can reorder the arguments, okay?

So in the case that the card is in the left column, then I just put the left here in between the other two.

In the case that it's in the middle, then as before and in the right.

And what that comes out as is as follows.

Let's see. Can I make it somewhat bigger? I can make it bigger, yes. Thank you for saying.

Where's the, oh, it's because I'm presenting.

Can you see that, Peter? Yeah, in which column?

Okay, so in which column? Let's say I pick, I don't know, the six of hearts there in the third column, okay?

So I'll enter three. And six of hearts is now, Peter, if you see it before, I do call out the middle column.

Two. Now the six of hearts is in the first column.

Lo and behold, six of hearts in the middle of the middle column.

Oh, magic. Magic. Exactly. Magic. Oops, sorry. There we go. Have I convinced you?

Well, there's some code, so I could follow it and then try to understand it.

But, yeah. Yeah, the code is a little bit tricky. What we get across is, just as a, I'm taking advantage of the ordering that's preserved in the stack structure, right?

One of the things that I'm always afraid we get too used to is just using these structures, these building blocks, failing to recognize the elegance that's attached to them, the almost decades of development that has gone into reasoning about these things, and the power that they provide.

So never sell the structure short. Algorithms are incredibly important.

And it's probably worth stopping at this stage to see if there are any questions.

Albert, a question would be, you mentioned it would work for other sizes as well.

How far can you go?

Can you buy a whole shop empty and try it again in a reasonable amount of time?

Ah, okay. So this is entirely lovely, actually, because it touches on, people ask, you know, what is computer science?

And you must be great with computers.

And we, in the domain, we have this notion that a computer is to a computer scientist as a telescope is to an astronomer.

It's a tool by which we manage to communicate and or implement ideas.

But it spans, I think, crucially, the range of aspects from mathematics on one side, pure mathematics, to engineering on the other.

There are two ways to answer your question.

And notice I'm taking, I'm giving you this explanation, because I actually don't know the answer, but I can tell you how to find it.

The first is to master the theory, okay? And then you get a quick sense of what numbers actually, once you solve the equations that come out, you can figure out what numbers work and what don't.

But the second one is to write a program to do it.

So once you've written a program that solves this for the case you know works, now you can mess about and you can pick any multiple of three and just roll through them for as long as you want to figure out how big of a number and what that number is will work with this trick.

I want to say from memory that 33 works, but I wouldn't, I wouldn't wager more than a dollar unit of currency on that one.

It's three S and 33 backs of cards? No, three cards into three columns. Very good.

The trick is that the column size, I do recall from having learned this a long time ago, the column size has to be a prime number.

Okay. But not all primes work.

Okay, so at this stage, I think it's time to call it, it's a, if there's a next time we do this again, I'll think of some more error correcting codes is also a brilliant one.

You can demonstrate with card tricks that I can read your mind. But we'll save that for another session.

Something like you take a card and oops, I dropped it on the table and lose one.

Awesome. If only usually this is the next one would be I lay them all out in random sort of face up face down and then I ask you to flip one card over.

Oh, and I use a very simple error correcting code in order to detect which card that was.

I need to get a truly random next time. Thank you everyone for joining us.

The next show is about Cloudflare account security. It's a health check by all means tune in, especially if you're a customer.

Thank you very much for joining.