How Cloudflare Built This
Presented by: Rustam Lalkaka
Originally aired on March 3, 2021 @ 3:00 PM - 4:00 PM EST
Rustam Lalkaka, Director of Product at Cloudflare, will interview PMs and engineers on how specific products came to life.
This week's guest: Jerome Fleury, Cloudflare Director of Network Engineering
English
Interviews
Product Development
Transcript (Beta)
Welcome to this week's installment of How Cloudflare Built This. I am Rustam Lalkaka.
I'm a director of product here at Cloudflare. I look after a couple of our performance and networking focused products.
And I'm joined today by Jérôme Fleury.
Jérôme, do you want to introduce yourself? Hi, I'm Jérôme. I'm the director of network engineering at Cloudflare.
I've been in the company for the last seven years.
So I started in August 2013. And I manage a bunch of network engineers around the world.
And we make this network run 24 hours a day. Awesome. I'm excited to hear more about what goes into that and what the past couple of years have been like.
I told some folks internally that you and I were doing a show together.
And the first question I got was, are you guys going to be drinking natural wine together?
I'm not sure if that speaks well or poorly to our reputations.
A little bit poorly, yeah. Great.
Well, so you mentioned you joined Cloudflare seven years ago. I've been opening these shows by just asking folks what they did before Cloudflare.
It's always interesting hearing what folks' backgrounds are.
Where were you before joining what at the time was a very small company?
Yeah. So before joining Cloudflare, I had been a network engineer for almost 15 years.
By now it's been more than 20 years.
I was the technical manager at France-IX in Paris. France-IX is the main Internet exchange in France.
So it's comparable to DE-CIX, or AMS-IX in the Netherlands, or LINX in London, or the Equinix IXs in the US.
I was the technical manager for that company. I joined France-IX pretty early on, during the founding of the company.
So we were only like three people in the company.
I stayed two years at France-IX. Before that, I was mostly working at French ISPs.
So Internet service providers from relatively small to much larger.
So at some point I was working for the third largest Internet service provider in France, where I was also managing the network.
Got it. So sort of deep background in telecoms and IP networking. Did you always know you were going to be a sort of telecom engineer or how did that happen?
I went to an engineering school for telecoms. So pretty much as soon as I was in school, I knew what I would be doing.
But at the time, we're talking about '97.
In '97, the Internet was not obvious. If you chose a telecom career, you had many choices, and the Internet was not the obvious one.
But I made that choice pretty early on because I was doing some networking by myself very early.
I started doing some networking when I was like 13 years old.
The Internet was not even a thing. Just for fun, or to play games? What motivated you?
Just for fun. I was very curious, I guess, as a kid.
And my first modem was like very, very slow. I don't know. I was always fascinated by computers talking to each other.
I think it's insane that we've managed to do that.
So let's throw it back. What was your first computer? What was the computer you sort of started tinkering on?
You've never heard of it because it was a French company.
It's Thomson. They were pretty popular in France at the time because they were backed by the French education system.
And they actually had this bus; they could run a network on a very simple bus, which was pretty crazy.
We're talking about 1984 or something like that. It was actually a good computer.
I believe it. Just because it's French doesn't mean it's bad.
And then what was the first networking protocol you used? The thing that I recall the best is probably FidoNet.
FidoNet was an interesting concept at the time of the BBS.
The Internet wasn't a mainstream product at the time. Maybe in the US with AOL, but not really in France.
So people used to connect to BBSes. The concept of a BBS: you connect to it, and you want to connect as fast as you can, because it's very, very expensive.
So you were packing all your messages into an archive and uploading the archive as fast as you can.
Yeah. And that was pretty impressive. I had a friend running a BBS at the time; it was on the Amiga computer.
And I learned a lot about networking protocols.
I actually wrote a client myself on the Amiga to read email on FidoNet.
FidoNet actually introduced me to the Internet, because you had gateways.
So you could have an email address on the FidoNet network, and then gateways allowed you to send and receive emails on the Internet.
Got it. So was FidoNet built on IP under the covers, or was it its own thing?
No, not from what I remember. Not at all. It was a serial protocol on modem networks, a typical file transfer over modem.
No IP involved.
Interesting. I've worked with you for a long time, and I had no idea that was your background.
At the time, TCP/IP was not even in the stack on these computers.
On the Amiga, you had no TCP/IP. You had to buy the TCP/IP stack.
It was paid software. Interesting. Okay. So that's the Jérôme history lesson.
So then, it sounded like you were at a couple of telecoms, a couple of ISPs.
Cloudflare, how did you find Cloudflare and how did you decide this was the right company to join?
I think it's more like Cloudflare decided I was the right person for them.
It probably worked the other way around. A friend of mine applied at Cloudflare.
Both of us always wanted to work in Silicon Valley.
We were based in Paris, France. A friend of mine applied in something like 2011.
It wasn't the right time. I also applied myself. Same story.
There was no open position. But I was very attracted by this company offering a content service at the time.
At the time, it was mostly a CDN service. There was no fit, and I applied again two years after that, in 2013, and finally there was a match and a fit.
So, it worked. But really, what attracted me is that when I started my career, in my first job at an ISP in France, I remember setting up one of the first Akamai clusters.
This company from the US called us, and they explained that they were going to install caches for free in our data centers.
And then they flew someone over from the US, which sounded magical at the time.
And this person started racking servers in our data centers. Wow, what exactly is this service?
I started to get interested in that, and realized that this concept of a CDN distributed assets around the world.
I found it fascinating. But immediately, I realized that, oh, you have to talk to a salesperson to actually get access to that product.
And very early on, I thought that there should be a commoditized service for that.
So, when I saw Cloudflare actually allowing you to register and get your website CDN-ized, secured, fast, optimized without contacting any salesperson with a free plan, it sounded like too good to be true.
So, I wanted to be part of it because that's something I wanted to do 10 years before.
So, what's the secret?
Is it too good to be true or do we really have a free CDN that allows you to do that?
Well, it's not too good to be true. It's true. It's something the evolution of the Internet, especially the price of bandwidth, made possible.
It wouldn't have been possible in 2004. I remember in 2001, at the ISP I was working for, we're talking about tens of thousands of customers.
And we had a dedicated line between France and the US because the founder was a perfectionist and he wanted the best IP transit.
So, he bought a dedicated line from France to the US, two megabits per second.
And we were paying like 10,000 euros a month for that.
So, it was insanely expensive. Bandwidth was insanely expensive at the time.
So, you couldn't build a service like Cloudflare at the time, I assume.
And today, the cost of transit definitely makes it possible. Obviously, the price of bandwidth is a big part of what's changed over time to allow freemium models in this space to take off.
I think there's a bunch of other technical and network design decisions we made earlier that influenced that as well.
I'm excited to talk to you about those. But before we do that, in 2013, how big was Cloudflare?
What was the day-to-day like at that point?
Well, from what I can remember, I was employee number 45, I think. So there were 45 employees.
I think we were just opening a bunch of PoPs. We were up to 20 PoPs at the time.
And actually, the first day I joined, Tom Paseka was the only network engineer at the time.
He had been living out of conferences around the world for a month.
And on my first day, it was just like: I'm the only network engineer now available.
And my job was to configure the Juniper MXs for expansion around the world in the 20 PoPs.
It was pretty interesting. But the SRE team, the on-call rotation, was crazy.
We were basically firefighting all the time.
It was still very startup mode. And I was always impressed to see people having the time to innovate when I didn't have any time to innovate.
It was just firefighting, whether it was a large DDoS or something else; a lot of things were not automated at the time.
We had to build constantly. Yeah. I mean, a big part of that too is like the user growth and traffic growth at the time was literally off the charts.
Absolutely. There was one salesperson, and doing a $10,000 sale was a great deal at the time.
But beyond that, the self-serve business was thriving.
Just new signups and yeah. That's the thing. On day one, I knew that I was in a special company.
Like, what kind of company does thousands of subscriptions a day?
It was something that I've never seen. One salesperson, but acquiring thousands of customers a day.
That's pretty cool. Yeah. You mentioned the firefighting and all that, and hopefully some of that has changed over time, but has the culture changed much?
Has the feeling in the office changed? Obviously, the office, that's a whole different story, but does the company feel much different from 2013?
Not much. It feels different, obviously. Everything is multiplied 20 times.
More than that, 25 times. Things have changed, but I'm always surprised to see so many early joiners like me still around in the company.
I think there's definitely a reason for that.
It's a company that leaves you a lot of autonomy.
If you want to fix things, improve things, I don't think anyone will prevent you from doing what's right.
I think that kind of freedom is what keeps people around.
Yeah. I was talking to someone who joined recently and they were like, why haven't we built this thing or built this product or shipped that feature?
Is there any reason? It's like, nah, no reason. We just haven't gotten around to it.
I think that's what's happening. Yeah, exactly. The shovels are over there.
Yeah, that makes sense. It sounds like you joined as an individual contributor, to use the jargony term, and you were configuring routers by hand and doing whatever it took to keep things online and growing.
What do you do now? As the company grew, I started managing people at Cloudflare.
I'm still very hands-on.
I try to stay a little bit hands-on. I like to see what's going on in the team. The reason I'm managing a bunch of engineers today mostly comes from some form of technical leadership, which I think I need to maintain.
There's no other way to maintain it than staying hands-on. But the reality is that I'm slowly losing my technical skills, while at the same time increasing my soft skills.
So, yeah, it has changed. Basically, my days are scattered with meetings, and when it's not a meeting, I give comments on Jiras or in the internal tooling that we have.
It's insane how much of my day is just me giving an opinion.
Sometimes I think I'm actually being paid just for that.
I have this feeling all the time. You were tinkering with this stuff when you were 13 years old.
This was fun for me, doing exactly the kind of stuff I do today.
It needs to stay fun. That's the thing. If I stop logging on to routers, checking configs, and doing some troubleshooting, it will stop being fun.
I can think of a couple whiteboarding sessions we've had where I lay out some crazy idea, and then you're like, yeah, I think we can do that.
And then six months later, we've shipped a new product at scale, millions of users.
It's cool.
One thing that has changed is that the scope of the product that we have the freedom to launch has actually increased.
We can be much more ambitious now.
You mentioned launching a product in six months. Effectively, when I joined, we could launch new stuff in a week or two.
Now, sure, it's taking six months, but that's because this project is much bigger.
The scale and the ambition are bigger.
Yeah, that makes a lot of sense. We've talked a lot about you. When we were laying out this agenda, you were like, oh, I can talk about myself for an hour, but we're going to have to cut that short.
Let's talk about the network a little bit. You said you managed the network engineering team.
We built Cloudflare's network out.
What does the network actually do? What is the network? That's a very nebulous, amorphous concept.
It's complicated to describe the network at Cloudflare, especially because it's a company that has a tendency to describe itself as a network.
We say the network is a product. The network is the computer. We use the term network everywhere at Cloudflare.
Sometimes, it really confuses me.
I can go into a meeting and someone will say, oh, the network had a problem, but they're not actually talking about the network, or at least not the one that I manage.
They're talking about some service somewhere that had an issue.
So the term network is really hard to put a scope on. What I call the network today is, first of all, what my team works on: network hardware and interconnection, routing, a little bit of performance, some traffic engineering.
That's how I would describe the network. It's both operations, making sure that this network runs 24/7,
so there's definitely a strong operational component to the team, but also the longer-term projects, making sure that we keep improving it all the time: more efficient, more performant, fewer incidents.
So, yeah. Yeah, makes sense. So, I mean, in a lot of ways, just what you were doing prior to Cloudflare, but at a much broader scale, and obviously, a lot of things are different, but still IP networking at the end of the day.
Typically, for a network engineer, running a network means running network protocols,
whether at layer two, layer three, and above; network engineers are usually interested in exchanging packets between network devices.
And you would be surprised how little of that we do in the network team at Cloudflare.
Our goal is to push packets from servers to the Internet in the most efficient way possible.
Network engineers are sometimes surprised when they join Cloudflare to learn that, no, we don't run an expensive, very large interconnection of our routers around the world, even if we are starting to do that now; that's not the meat of our job.
The meat of our job is to have the simplest, leanest layer two and layer three, and push the packets to the Internet as best we can.
So, that's a really interesting segue into, what are some of the fundamental design principles underlying the network?
And which of those, I'm sure a lot of those are really good decisions and principles to adhere to, and some of them are probably less good.
What are some of the decisions we made a long time ago that have allowed us to scale the network as we have?
It sounds like an emphasis on what some folks would describe as software-defined networking, right?
Like doing as much stuff on servers as possible versus on routers, right? Historically, some networks, they want to build their own networks.
And when I'm talking about dedicated capacity, that's usually expensive; it was extremely expensive even 10 years ago.
But one of the decisions that Cloudflare made is to rely on the Internet, on the transits.
So, in every PoP that we have, we rely on our partners, whether they're transit providers that we pay, or peering partners with whom we have a settlement-free agreement.
But we say, okay, we are going to trust these other networks to treat our packets well and deliver them to visitors as best they can.
But at the same time, we're closely tracking them and looking at them.
And that was a very early decision, because you could decide to own everything.
And if you own everything, it's much easier to control.
When you don't control everything, one of the best decisions we made is to have the choice, which means in every one of our PoPs we have pretty much all the big, large transit providers in the world.
I'm talking about the old tier ones like NTT, GTT, Telia, all of them. Everywhere they are, we will take a port with them so that we have the choice.
And when one of them is not doing its job properly, we have automatic actions.
And our systems are smart enough to detect it automatically and use the right path and take the right decisions.
And I think that's something that was decided very early on.
And that's definitely was a good technical decision for Cloudflare. To draw the analogy here, Google popularized the concept of...
When Google started, the common practice was to buy very, very expensive, failure-proof computing hardware.
And then Google said, we're going to buy the cheapest hardware possible, and then we're going to make sure it's reliable using software.
And that's exactly what we've done in Cloudflare with regard to networking.
We said, let's use affordable transit around the world and then build software to make sure it's as reliable as a private network.
Something we do a lot is take things that people take for granted as unreliable and make them reliable.
Whether we force the vendor to improve their processes or we make the decisions ourselves, that's something we do.
Yeah.
That makes a lot of sense. And so, it's interesting because we do operate our own L2 backbone now.
Yes. We don't really communicate about it, because we still hold to the principle from the early days that we rely on the Internet.
Our service is based on the Internet.
The mission is to make the Internet better. And even though we have a backbone and we try to use it as much as we can to improve a lot of things, you could still turn off the Cloudflare backbone today and everything would work.
Right. So, we now operate a backbone, but we just treat the backbone as another unreliable network component.
Exactly. We treat it actually the same way that we treat our transit vendors.
It's just one option. Yeah. Yeah. No, it makes a lot of sense.
So, one thing I wanted to touch on, Cloudflare operates a global Anycast network.
We made a big bet on Anycast early in our company's history.
And I think at the time that was seen as controversial, right? Like a lot of people thought that you couldn't run a global Anycast network.
Still controversial.
Okay. Well, let's talk about... So, why is that controversial and why has it actually worked for us?
The Internet was designed as a Unicast network.
By design, the Internet is Unicast. When you think about it, it's single IP to single IP communication.
And the concept of Anycast wasn't even drafted in any RFC as something acceptable for a long time, until some DNS roots started using Anycast to offer diversity and reliability.
But then again, DNS is UDP; it's mostly stateless. I know I oversimplify, but...
So that was accepted as an acceptable method. But running TCP on Anycast, in the small world of network engineers, was very controversial.
The main reason is that you can be rerouted at any time.
You don't have any control on that.
And your TCP session will be rerouted to some other device. And so, it will be reset.
The reality is a little bit different. Actually, the routing table on the Internet is actually pretty stable.
And you would be surprised: the more PoPs you have, the fewer routing disruptions you will see, which works out well for Cloudflare.
That's also one of the good technical decisions we've made: to expand everywhere.
But in reality, you don't see TCP disconnections because of Anycast rerouting.
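Purely as an editor's illustration of the behavior described above, here is a toy model: every PoP announces the same prefix, and each client's packets simply follow that client's lowest-cost path, so a TCP session only resets if the best path changes mid-connection. All PoP names, client names, and costs below are invented.

```python
# Toy anycast model: one shared prefix, many PoPs, lowest-cost path wins.
ANYCAST_PREFIX = "198.51.100.0/24"  # documentation prefix (RFC 5737)

# Hypothetical per-client path costs (think AS-path length) to each PoP.
path_costs = {
    "client-paris": {"CDG": 1, "LHR": 2, "SFO": 5},
    "client-sf": {"CDG": 5, "LHR": 4, "SFO": 1},
}

def best_pop(client: str) -> str:
    """BGP-style best-path selection: lowest cost wins."""
    costs = path_costs[client]
    return min(costs, key=costs.get)

# Both clients connect to the same prefix but land on different PoPs.
print(best_pop("client-paris"))  # -> CDG
print(best_pop("client-sf"))     # -> SFO

# A reroute only happens if the best path changes; in practice the
# routing table is stable enough that this is rare.
path_costs["client-paris"]["CDG"] = 10  # simulate a path degradation
print(best_pop("client-paris"))  # -> LHR
```

The point of the sketch: nothing about the shared prefix itself breaks TCP; only an actual best-path change does, and those are rare on a stable routing table.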
That's the reality. And it's extremely stable. The other thing is that, by the very nature of HTTP, even if that happens, HTTP transactions are very small and short-lived.
So most of the time, it's all right. And one of the core services of Cloudflare is DDoS mitigation.
And Anycast, I think, is one of the reasons for the success of the product: it's a no-brainer for DDoS mitigation.
It just works. Yeah. Yeah. Until you get paged, but we won't talk about that.
That's an interesting thing to talk through, right?
Early days, you mentioned DDoS attack would turn into a page.
I can't remember the last time we were paged about a DDoS attack, right?
We as a company seem to have invested a lot in automation and automating things that historically were not automated.
Where did that come from?
Was that focused on automation driven out of necessity? We were just growing so fast that there was no possible way we could handle things otherwise?
I think it's coming from two things coming together, which was out of necessity, exactly as you say it.
At some point, you realize that you're only firefighting outages on the Internet, and we don't have control over outages on the Internet.
Surely, we can scold our providers and tell them to be more reliable, but that's not enough.
And at some point, we're only dealing with Internet outages.
And at the same time, there were some very smart people at Cloudflare who were thinking about automating the DDoS mitigation even more.
And I was like, well, if only one or two people can completely automate DDoS mitigation and make it much smarter, then why can't we do the same and make our own routing decisions much smarter?
So we started building it a little bit later, but that was definitely a reminder that we could do it.
And to do that, as you said, network automation: I don't think it's mainstream even today.
At the time, when we started the project with Mircea Ulinic, when he joined the company, the mission I gave him was: you have to build a complete network automation framework based on Salt.
And in SaltStack, there was nothing for this at the time.
The Salt people had created this concept of a proxy minion, but there was no way to communicate with a Juniper device or a Cisco device or anything.
And he pretty much created it from scratch.
And once we had this core automation, we realized that we can do so much more with that.
So that's when we started collecting probes from all over the Internet.
We started deploying a full mesh of probes, and we started getting tons of information about what's happening on the Internet.
We could detect that a certain transit provider had problems on a transatlantic link.
And that actually happened.
I wrote a public blog post myself on Cloudflare, and it was written like a post-mortem, because we had so many outages because of one of our transit providers.
And in this blog post, I described what was coming, which was this automated mitigation system.
And I had to describe it because we were actively working on it.
And I knew that this would fix that kind of problem.
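As a hedged sketch of the idea described here (not Cloudflare's actual system): full-mesh latency probes are collected per transit provider, a provider whose measurements degrade past a threshold is flagged, and traffic prefers a healthy alternative. Provider names, baselines, and the threshold are all invented for illustration.

```python
# Expected (hypothetical) baseline RTTs per transit provider, in ms.
BASELINE_MS = {"transit-A": 75.0, "transit-B": 80.0}

def median(samples: list[float]) -> float:
    s = sorted(samples)
    return s[len(s) // 2]

def degraded(provider: str, samples: list[float], factor: float = 1.5) -> bool:
    """Flag a provider whose median probe latency exceeds baseline * factor."""
    return median(samples) > BASELINE_MS[provider] * factor

def pick_transit(probes: dict[str, list[float]]) -> str:
    """Prefer a healthy provider; fall back to the least-bad one."""
    healthy = [p for p in probes if not degraded(p, probes[p])]
    if healthy:
        return healthy[0]
    return min(probes, key=lambda p: median(probes[p]))

probes = {
    "transit-A": [220.0, 230.0, 215.0],  # transatlantic link in trouble
    "transit-B": [82.0, 79.0, 85.0],     # behaving normally
}
print(pick_transit(probes))  # -> transit-B
```

The real decision logic is of course richer (per-prefix, per-PoP, automatic BGP actions), but the shape is the same: measure, compare against a baseline, steer away from the degraded path.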
Then we released it, and it actually fixed that kind of problem. So, let's step back.
I think this thing you're talking about is so fundamental to everything we do that we don't even appreciate it as much as we should.
Walk me through the life cycle of making a config change on a router in the olden days without the automation platform you're talking about.
In the old days, well, it depends.
If it was a global change, then we would create a little snippet of config, and then we would almost manually send it to the 20 PoPs.
You would SSH into router one, apply config.
With some basic scripting to make sure that we could push to all 20.
Right. You're using some NAPALM or some craziness to- Yeah.
There was no NAPALM at the time, so it was just SSH and configure and send. Okay.
So, you have some shell script that's literally piping stuff into SSH and then- And the same for when we had an Internet outage somewhere: we would connect to the PoP, do the troubleshooting, and change the routing policy to accommodate it.
Right. And so, again, this is nothing specific to Cloudflare, right?
This is just how network engineering kind of works, right? Oh, yeah.
I mean, that's what a lot of network engineers love to do. Yeah. Right. And then with all of the salt automation and provisioning automation that we've built over the years, what does that workflow look like instead?
Instead, we've switched to a mode where we use the classic DevOps principles.
We can submit a change, get it peer-reviewed so that actually multiple engineers can have a look at it and make sure that we're not going to deploy something crazy.
We have continuous integration, a CI/CD pipeline, that makes sure the change will not break the config; it can validate that, and it can also deploy it.
So, once the change is actually validated by the CI system, then it can be automatically pushed to the entire fleet, and you can specifically target the fleet.
So, if you want to target that change to only Juniper routers, we can specify that in the pull request.
So it's much more automated, and then we have feedback.
The tooling tells us whether it's been deployed successfully on all these routers, and if one of them fails, we instantly know it hasn't been deployed.
You can even roll back in that case. So we've gained much more control over the changes that we used to make manually before.
Yeah. I mean, it sounds like that's the recipe for... The real benefit of this automation was not config changes.
We didn't create it for config changes.
Usually, when people talk about network automation, they talk about config changes, how to push them and enforce them. But we built that system to create a resilient network that could automatically mitigate outages, and that's exactly how it started.
We did the automation of config changes way later.
Got it. So this is really the tool that allowed us to treat unreliable network paths as reliable...
Exactly. Remove the toil from the network engineers and give them some breathing room.
Yeah. Yeah. That makes a lot of sense.
One of the things I've heard from customers a couple times and other network engineers is like, Cloudflare, you guys do a lot of DDoS mitigation.
You must use things like BGP flow spec really heavily.
You must have all sorts of intelligence on your routers and the routers must be the most important thing at Cloudflare.
Sorry? They used to be.
Used to be. So, it sounds like that used to be the case and then what does the world look like today and why?
Yeah. They used to be, because when I joined back in the day, flowspec was an integral part of the DDoS mitigation pipeline that we had.
Flowspec, just as a reminder, is a way of distributing firewall rules to routers over BGP.
You're using the BGP protocol, but instead of distributing routes, you distribute firewall rules.
And it was pretty efficient at the time, actually, but the thing is that flowspec never really took off.
Juniper has an implementation, but the implementations from other vendors are pretty weak.
And that's not exactly how we want it to be, because flowspec is a dumb firewall.
It will drop packets or accept them, but you cannot ask flowspec to look into the packet's payload, for example, and search for a signature to decide more finely whether it should pass.
You can't do that with flowspec.
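To make the limitation concrete, here is a toy illustration (not real flowspec syntax or any vendor's implementation): a flowspec-style rule matches only on header fields such as ports and protocol, so it cannot tell attack traffic from legitimate traffic that shares those headers.

```python
from dataclasses import dataclass

@dataclass
class FlowspecRule:
    dst_port: int
    protocol: str
    action: str  # "drop" or "accept"

    def matches(self, pkt: dict) -> bool:
        # Header fields only; pkt["payload"] is invisible to the rule.
        return pkt["dst_port"] == self.dst_port and pkt["protocol"] == self.protocol

# Drop everything aimed at UDP/53 -- the kind of blunt rule flowspec allows.
rule = FlowspecRule(dst_port=53, protocol="udp", action="drop")

attack = {"dst_port": 53, "protocol": "udp", "payload": b"\x00\x00junk flood"}
legit = {"dst_port": 53, "protocol": "udp", "payload": b"real DNS query"}

# Both match, so legitimate DNS gets dropped along with the attack.
print(rule.matches(attack), rule.matches(legit))  # True True
```

Telling those two packets apart requires inspecting the payload, which is exactly the work that moved off the routers and onto the servers.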
So, at some point, we realized that even though it was working, it wasn't working well enough for what we were trying to do.
The attackers got smarter and they could sometimes bypass.
So we needed to shift the DDoS mitigation from the routers down to the servers, which at the time looked completely crazy, because we're talking about hundreds of gigabits of DDoS attacks, and some people internally at Cloudflare were telling me, well, you've got to let these packets through.
We'll take care of them. I was like, are you sure? Because any server would totally congest or crumble under the load. But they actually made it work.
So removing a lot of intelligence from the routers: we did that pretty early on.
And I shouldn't be proud of it, but I'm proud to have dumb routers at Cloudflare right now.
It actually sometimes makes my life a little bit easier.
Yeah, no, it's interesting. I think one of the reasons you've stuck around as long as you have is that you don't have the classic network engineer mindset:
my network devices are sacrosanct, they're the smartest and the best, and everyone else has no idea what they're talking about.
That's not the vibe at Cloudflare generally, and it's certainly not on the- Well, you know, as Cloudflare grew, we kept hearing that thing: give away your Legos.
If you want the company to grow, you have to give away your Legos.
And at some point, we, the network engineers, had to realize that one of our Legos was DDoS mitigation, router-based DDoS mitigation, but it wasn't good enough.
And once you accept that, you give it away to other people who are smarter than you or they have better software than you.
You just let it go. When I first learned that we actually do DDoS mitigation, and really all of the networking that customers see, on x86 servers instead of routers, I was like, that's insane.
I didn't even fully put it together. And I remember asking Marek Majkowski, one of the DDoS mitigation experts we're talking about; I was like, this doesn't make any sense.
Why wouldn't we drop traffic on the router if we can? And he was like, that's the definition of premature optimization.
If we're not going to congest the links between the router and the server, who cares?
Like, huh, that makes sense.
It was one of these very smart moves that the company did early on, I think.
Okay. We talked about a lot of things that went well. Interestingly, I was saying I'm proud to have dumb routers, but today we're actually taking the opposite step, which is that we're starting to write code on our routers, our own code.
And the landscape of the routing world has changed a little bit.
And Juniper and Cisco and other vendors are much more open to putting third-party software on their network devices.
And we have a lot of use cases now.
It would probably take a long time to elaborate, but interestingly, while we decided a few years ago to remove intelligence from the routers, we're actually doing the opposite right now.
Not for DDoS mitigation, but for other use cases.
I think that sums up Cloudflare engineering though, right? There's no dogma.
It's run the right code in the right place. Okay. So we've talked about a lot of good decisions we made and some sort of interesting choices that have paid off.
What's an example of a bad decision? It sounds like early on we decided to go multi-vendor, right?
Let's talk about that for a little bit. Yeah, multi-vendor. So when you're a startup and you need to expand, but at the same time, you can't spend too much.
So you go to every vendor and you pick the hardware that fits your needs at the best price.
And usually in the networking world, you will look at the price per port.
So if you want to buy a switch with 48 times 10 gig ports, you will look at the price per 10 gig port.
And we did that a lot in the early days. We still do that.
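The per-port math Jérôme describes can be sketched like this. The prices here are hypothetical, purely for illustration, not real vendor quotes:

```python
def price_per_port(unit_price_usd: float, ports: int) -> float:
    """First-cut metric for comparing switches: total cost divided by port count."""
    return unit_price_usd / ports

# Two made-up quotes for a 48 x 10G switch.
vendor_a = price_per_port(24_000, 48)  # $500 per 10G port
vendor_b = price_per_port(30_000, 48)  # $625 per 10G port
print(f"Vendor A: ${vendor_a:.0f}/port, Vendor B: ${vendor_b:.0f}/port")
```

Comparing per-port rather than per-box is what lets a startup shop every vendor's offer on equal terms.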
It's part of the financial duty that we have. But at some point, especially in those years, around 2013, the market was really bubbling.
So you could get a terrific offer from any vendor every six months.
So we had a strong multi-vendor strategy.
We're actually pretty proud of it. And we built our automation on top of that, compatible with Cisco, Juniper, Arista.
But the reality is a little bit different.
At some point, we probably had a few too many vendors. And I would say that after two or three, it gets really complicated to manage.
Having automation doesn't mean you escape the work: you still have to write everything once per vendor.
So two or three vendors ensures resilience, and you don't make assumptions or take a dependency on any one router manufacturer.
But then beyond that, you see diminishing returns, right?
Yes. Yes. So we slowly rolled that back and now we have fewer vendors.
It's much more streamlined. We're not single vendor, because that's not the right thing to do.
But we have two or three maximum.
That was probably one of the bad decisions that we made. Interesting. Let's switch gears a little bit.
You were the champion of an initiative that launched recently called, Is BGP Safe Yet?
Talking about something controversial. Yeah. Yeah. Right.
Yeah. Polarizing would be putting it mildly. Let's talk about this. What is isbgpsafeyet.com besides clickbait?
It's a public website that we tried to make mainstream, available to a large audience, so that a larger audience understands why a few outages occurred in the past.
Large Internet outages that had a very big impact.
And we want people to understand the basic flaw of the BGP protocol, which is that it's trust-based by design, with no authentication of the information, which means it's pretty easy on the Internet today to hijack a path and illegitimately attract traffic from other companies.
This is also what's known as a route leak, right?
Yeah. So you can say route leak or route hijack.
I like to talk about a route hijack when someone sends a more specific route than yours and starts attracting all your traffic, because it's more specific.
They're much more damaging than a route leak, but route leak can be damaging as well.
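The reason a more-specific hijack wins can be illustrated with a toy route table. This is a simplified sketch of longest-prefix-match selection, using documentation prefixes and made-up origin labels, not real announcements:

```python
import ipaddress

def best_route(dst: str, routes):
    """Pick the route for dst by longest-prefix match, as routers do:
    the most specific matching prefix wins, no matter who announced it."""
    addr = ipaddress.ip_address(dst)
    matches = [(pfx, origin) for pfx, origin in routes
               if addr in ipaddress.ip_network(pfx)]
    return max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)

routes = [
    ("203.0.112.0/22", "legitimate origin AS"),  # the real announcement
    ("203.0.113.0/24", "hijacker AS"),           # a more-specific hijack inside it
]
# The /24 beats the covering /22, so the hijacker attracts the traffic.
print(best_route("203.0.113.10", routes))
```

This is exactly the Pakistan Telecom pattern: announce a more-specific slice of someone else's space and every router that accepts it prefers your path.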
And the fact that BGP has been trust-based since inception makes it possible. So for those of our viewers that don't necessarily know or can't grok what the impact is here, can we walk through what a BGP hijack is?
Let me actually pull up isbgpsafeyet.com.
So you mean describing how hijack works? Yeah. So if you're a network operator and you connect to the Internet, you have a transit provider that gives you the full view of the Internet routing table.
Yeah, that's a very good description, very simple, but good.
And if you can operate BGP, then you can pretty much send any BGP information you want.
That's what happened a long time ago, I think in 2008, when Pakistan Telecom started announcing YouTube to the Internet.
And they did it for regulatory reasons. They had an order to censor YouTube, but they did it poorly.
They re-advertised more specific routes of YouTube to the entire Internet.
And the terrible thing is that these upstream providers, I think one of them was PCCW at the time, they actually accepted these routes, even though they were completely illegitimate.
Like they had no reason to accept these routes from Pakistan Telecom.
And once they accepted, other networks down the chain started accepting them.
So it created a global outage because everybody trusts everybody.
So should we, let's imagine that I was Pakistan Telecom in that situation and I mistakenly hijacked the YouTube IP addresses.
Let's do that.
Yeah. So the hijacker is like Pakistan Telecom and you start attracting the traffic.
So it doesn't have to go to a malicious website. It's a sinkhole.
You're just attracting the traffic and the traffic dies because there's no way you can reroute it or serve it.
Right. There's no way to complete these paths, and the packets just disappear from the users' perspective.
That makes sense. Why is this even possible?
You mentioned that BGP is trust-based. Is it really the case that anyone operating a BGP session on the Internet can just go and say- There's no real source of truth.
The sources of truth that we have as network engineers are very weak.
They're public databases that can easily be hijacked or populated with fake or wrong information, which people don't usually do, or at least not maliciously, because we're talking about the network engineers' world.
But this source of truth is not actually trusted.
You can't really trust it completely. Plus, deploying this filtering and authentication of the routes that you receive is actually pretty complicated.
There's no standard way to do it on the router, and you have to create your own pipeline and scripting to make it possible, which most network engineers don't do.
Like 99% of the network engineers don't do that. So you end up with a very weak trusted chain of BGP advertisements on the Internet.
But the thing is that the network engineers have come up with a solution for that.
Almost 10 years ago now, I think, they created the RPKI protocol, which is the equivalent of the SSL certificate chains for browsers, but adapted to the routing world for network engineers, where you can actually sign the IP allocation that is allocated to your company.
You can sign the BGP advertisement that you intend to propagate on the Internet.
So now there's a centralized source of truth that can actually be trusted, because it's signed and it's approved.
And the Is BGP Safe Yet website is there to encourage all network operators in the world to deploy RPKI.
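The origin validation that RPKI enables can be sketched roughly like this. It's a simplified illustration of the RFC 6811 outcome states, assuming ROAs are given as (prefix, max length, origin ASN) tuples, with documentation prefixes and example ASNs:

```python
import ipaddress

def rpki_validate(prefix: str, origin_asn: int, roas) -> str:
    """Classify a BGP announcement against a set of signed ROAs:
    'valid', 'invalid', or 'unknown' (no ROA covers the prefix)."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_len, roa_asn in roas:
        if net.subnet_of(ipaddress.ip_network(roa_prefix)):
            covered = True  # some ROA covers this prefix
            if origin_asn == roa_asn and net.prefixlen <= max_len:
                return "valid"
    # Covered but wrong origin or too specific: invalid, so the router can drop it.
    # Not covered at all: unknown, treated like BGP today.
    return "invalid" if covered else "unknown"

roas = [("203.0.112.0/22", 22, 64496)]  # example ROA only
print(rpki_validate("203.0.112.0/22", 64496, roas))   # the legitimate announcement
print(rpki_validate("203.0.113.0/24", 64511, roas))   # a more-specific hijack
print(rpki_validate("198.51.100.0/24", 64496, roas))  # no ROA exists
```

Dropping the "invalid" case at the edge is what turns a would-be global hijack into a non-event for validating networks.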
So practically speaking, given the way the Internet is sort of structured, it's not that...
I mean, ideally, yes, everyone in the world implements RPKI and then everyone's safe from these types of mistakes and attacks.
But in practical reality, we only need a handful of networks to implement RPKI and the vast majority of these things will be protected against.
Is that right? Exactly. Yes and no.
Yes, true. Right now, we're putting the focus on the tier one transit providers because they are at the core of the Internet, which means if you're a small ISP in Europe talking to a small content provider in the US, there's a 99% chance that your packet will go through one of the tier one transit providers.
So once you start securing this core of the Internet, something like 12 operators actually having global presence and transiting a lot of packets on the Internet, surely you reduce the blast radius of a hijack.
You make it very localized, regionalized instead of globalized.
And that's why we're putting the effort here, so that we can actually make these headlines about global outages a thing of the past.
And I think we're very close to that. The largest transit providers have implemented RPKI or are just about to implement it.
And it's possible that next year these stories completely disappear from the news, just because of that, even if hijacks still happen.
But we still have a lot of traffic happening on direct connections between ISPs and content providers, very localized or regionalized.
And if you don't secure these interconnections that happen everywhere, you haven't completely secured the Internet yet.
So the next step is to focus on smaller regionalized transits, content providers, and make sure that everyone deploys it.
Actually, most of the actors.
But that's true for any technology on the Internet. You need a consensus, and you need as many network operators as possible to adopt it to make it efficient.
Right. This is a case of the Internet's greatest strength is also its greatest weakness, right?
It's decentralized, and that's what makes it awesome.
And it's decentralized, and that's what makes it impossible to make progress.
It's the same as IPv6. It took 20 years for IPv6 to reach 25% of eyeball users, because you need everyone to deploy it.
Yeah. So you mentioned that isbgpsafeyet.com, which launched, I don't know, two months ago, a month ago, was very controversial.
Why? This sounds like a good thing, right? RPKI is better for the Internet.
Why? It's controversial because users can actually call out their ISPs.
The website detects whether your ISP has implemented RPKI origin validation or not, and there's a tweet button.
And you can tweet at your ISP saying, hey, it looks like you haven't deployed it.
Maybe you should deploy it. So some people were complaining that we were shaming or harassing ISPs.
We launched during the beginning of the pandemic crisis.
So obviously, some people were upset.
But we've also seen a lot of good feedback. So I think it's controversial, but pretty much, I think all good things in the world will be controversial anyway.
So you will always find naysayers. Yeah. Yeah. I can't have one of these shows without making a Steve Jobs reference.
So it's like when Steve Jobs removed the floppy drive from the iMac, right?
People were like- Yeah. Yeah. You're the Steve Jobs of the Internet, is what I'm saying.
I'm joking. We definitely, I think we understand that smaller ISPs may have a hard time deploying RPKI origin validation.
Even though it's become pretty straightforward, like Cloudflare has open-sourced tooling that is globally deployed by all the large transit providers right now.
It's super easy to deploy and you can activate origin validation on your edge routers in five minutes if you want.
It's not always that easy, but it's become very mainstream to deploy RPKI origin validation.
That said, we can understand that some small ISPs can be upset by it, but you don't even have to deploy it sometimes.
You can also just choose the right transit provider for you. If you default-route your packets to NTT or Telia, large tier ones that have actually deployed RPKI origin validation, you are protecting your users.
So it's not just about, oh, just deploy it, implement it.
It can be all about just choosing the right vendor.
Yeah. Yeah. So is it that we're pursuing this because Cloudflare is uniquely vulnerable to leaks or anything like that?
Or no, this is an Internet-wide problem and we want to help keep the Internet available?
We are vocal about it, for sure.
I know for a fact that one of the reasons that the largest transit providers in the world have deployed it is because of customer pressure.
And when we say customer pressure, we're talking surely about Cloudflare because we've put a lot of pressure, but we're not the only ones.
And we're talking about actors that are way bigger than us.
So the pressure is coming from very large actors who are also impacted, but they just don't talk about it.
Yeah. Yeah. Makes sense. Yeah.
I mean, even since isbgpsafeyet.com, and obviously we don't want to take total credit for it, but we've seen quite a few large networks go live with origin validation since the site went live, right?
So- Yes. Well, I'm not going to say that the website made them- Sure.
Of course not. They were definitely preparing for more than a year, but since the website was launched, we've seen GTT, Cogent, and Hurricane Electric fully deploy RPKI origin validation, which is an immense step forward for routing security.
Definitely. Yeah. And I know we've been in touch with other large networks like Comcast and others around how to do this.
Yeah.
Comcast, definitely thinking about it. And now we're talking to all the incumbent networks, such as Orange, the incumbent French telecom company, but they're tier one.
We're talking to Sparkle, the Italian one, same story, global presence; Telxius, the offspring of Telefónica.
So all the tier one providers.
The elephant in the room is Level 3. Level 3, which is now CenturyLink, is the largest tier one provider.
And once Level 3 has deployed RPKI origin validation, it will be game changing.
Is that in the cards?
Do we have any visibility there? Absolutely. They haven't committed to a date yet, but they're definitely working on it.
Cool. Let's talk about something sort of more future looking.
Obviously, is BGP safe yet is an effort to make the Internet better and safer and more reliable, and it sounds like there's progress happening there.
Going back to Cloudflare for a second, where do you see our network going?
And sort of what are you excited about that you're working on?
I'm always excited about the network expansion. It's still a large part of the story, and we still open new PoPs all the time.
There are challenging regions in the world where partnering with ISPs or serving traffic efficiently may still be challenging.
And I'm talking especially about large countries like India or Russia or Brazil.
We have presence everywhere, but expanding and making this presence even more efficient is always challenging.
And we've even realized, when we were talking earlier about good decisions like Anycast, that sometimes Anycast is not the best way to serve traffic.
We've realized that recently in some countries where our Anycast routing is not well supported by the partner ISP.
So, we're actually deploying new ways to serve traffic with system engineering teams, working together with the network and making sure that we use the best transport mechanism.
That is very exciting, definitely.
So, right now, I think we have 220 PoPs today. Someone knows the number exactly.
I'm excited to go to 300 or 400. That's still very exciting.
Interestingly, when we were talking about making the routers smarter, we're back to doing that right now.
So, in my team, we frantically started writing our own code to do some traffic engineering, bring some cool features, and embed that in our routers.
I'm very excited by that. There's not much I can say about it, but the fact that Juniper and Arista and other vendors allow you to push any of your code now is maybe game-changing.
What about the fact that you mentioned when you started at Cloudflare, Cloudflare was like an HTTP company.
There's a little bit of DNS here and there, but for the most part, every single bit in or out was HTTP.
And as the company's grown, the workload has gotten a lot more diverse.
How does that change your job?
Most of the services don't really impact my team's work, except for this new Magic Transit service that we have.
So Magic Transit, being as successful as it is, directly involves BGP routing.
So, our customers trust us to advertise their BGP presence on the Internet.
It's actually a big deal. We are becoming like a large transit provider, and anycasting BGP routes for customers is quite a challenge.
It's definitely adding workload, but it's a very interesting one: reversing the flow of the traffic, dealing with the specificities of various customers, and expanding our number of BGP routes.
It's also a challenge. So, plenty of challenges coming from this Magic Transit product.
Yeah. The other interesting thing about Magic Transit is if you worked at a different company, you're the person buying Magic Transit, right?
If you're looking for a DDoS mitigation solution, I assume that's any company that may suffer from a DDoS but operates their own infrastructure, they don't use AWS or anything like that.
They have their own data center, or they want to secure the corporate office.
And it takes only one DDoS to- Not just ruin one day, but ruin your life.
Yeah. And then your boss is like, hey, find a solution for DDoS.
And then you realize that on-premise solution will not save you because you can still congest your inbound ports.
And then you start analyzing what's on the market.
And absolutely, if I was a network engineer working in one of these companies, I would definitely evaluate Magic Transit as a solution.
Cool.
We have three minutes left. If you had some advice for a network engineer playing with computers and networking as a 13-year-old in their basement, what career advice do you have for them?
Where is the Internet going and where should they be looking?
I think what I've seen at Cloudflare is definitely the software eating the world.
And if you're in the network engineering space, you should stop looking at vendors, especially don't look at what the Juniper CLI looks like or what the next Cisco line card will forward, because the thing that's happening right now is in the Linux space: the Linux kernel, XDP, eBPF.
These are big things.
Maybe something around the P4 language that's coming around. I have a good feeling about it.
But yeah, the network engineering world has been locked in with vendors for a long time.
And you could easily declare yourself, name yourself as a Cisco expert, but really all you know is the CLI and you have to move away from that, I think.
And it's definitely happening right now. The fact is that OpenConfig, the next automation layer that is being implemented by Cisco and Juniper, is not defined by Cisco and Juniper.
It's defined by the customers. Google and Microsoft have created this network automation framework.
So the customers are taking over.
You should definitely look into that. Great advice. Thank you so much for joining, Jerome.
This was a fun conversation. You were worried that you weren't going to have enough stuff to talk about.
I think we could talk about this for another two hours.
Just in time. Yeah, that was really entertaining. I loved it. Let's do it again.
For sure. All right. Till next week. See you next week.