⚡️ Speed Week: Built for Speed: Zero Trust and Magic Transit
Presented by: Sam Rhea, Annika Garbers
Originally aired on June 15, 2023 @ 4:30 PM - 5:00 PM EDT
Join Annika Garbers (Product Manager, Cloudflare) and Sam Rhea (Director of Product Management, Cloudflare) as they discuss our newest updates to Cloudflare One and Magic Transit!
Read the blog posts:
Visit the Speed Week Hub for every announcement and CFTV episode — check back all week for more!
Transcript (Beta)
Hello everyone and welcome to Cloudflare TV. So excited to be here today. My name is Annika and I'm on the product team at Cloudflare.
Excited to be joined by my colleague Sam to tell you about some of the new announcements and exciting news we have as part of Cloudflare Speed Week, which is an innovation week that we're running all week with announcements about new products, new test statistics, and all kinds of goodies to show you how Cloudflare is building a faster Internet, which is part of our broader vision of helping build a better Internet.
Sam, will you introduce yourself?
Yeah, thanks Annika. Hi everyone. My name is Sam Rhea.
I'm also on the product team here and I have the privilege of getting to manage the products that comprise our Zero Trust suite.
So everything from our access control to secure web gateway, DNS filtering.
And I'm really excited to talk about why those are all faster with Cloudflare.
Awesome. And I manage some of the products on our network services side for Cloudflare.
So if you've heard of our magical things, that includes Magic Transit and Magic WAN, which make up about half of what we call Cloudflare One, our unified platform for customers to build their next-generation networks.
And today we published two new blog posts about why Cloudflare One is the fastest platform for customers that are trying to solve these types of problems.
But for customers or viewers who are not as familiar with Cloudflare One, Sam, would you mind just doing a quick recap?
What is this and then why does it matter that it's fast?
Yeah. Cloudflare One is our vision that everything in your organization should have a fast, reliable connection to whatever it needs to connect to.
And that's really fun because it can be your laptop connecting to an Internet service that you use to do your job and realizing that my background makes my arm disappear when I wave.
Or you're connecting to an internal application, or it's data centers connecting to other data centers.
Whatever you need to connect, that's one half of Cloudflare One.
And then the other half of Cloudflare One is making sure that we secure those connections.
And security takes a lot of different forms.
It's encrypting those connections, making sure that when your employees are in a Starbucks Wi-Fi, they have a safe path to the Internet.
It's filtering threats from coming back into your organization, or simply it's just logging, making sure you have visibility into what's happening in your organization.
So Cloudflare One takes Cloudflare's network and uses it to help you connect and uses it to secure those connections.
And all of that's got to be fast, right?
Because- It has to be fast. Yeah. Otherwise no one could get their work done.
Cool. Okay. So let's start with the post about some of our Zero Trust services that are under the Cloudflare One umbrella.
We published this one today, and it's about why Zero Trust with Cloudflare is the fastest way to solve these problems.
Let's just break this down, Sam. Tell me about why Cloudflare One and specifically Zero Trust is really fast.
So a lot of Zero Trust solutions are replacing a virtual private network.
When you're an organization and you have systems and infrastructure that you need to connect to, historically, you built a private network.
Most private networks followed your physical footprint. So if you had a large office in Austin, Texas, for example, and maybe in that office in the basement, you kept some servers that you needed to use, and you also put a VPN appliance next to it.
That worked really well when everyone was in that Austin, Texas office.
But then as people leave those offices, they need a way to connect back into Austin to reach their services.
And if you're securing their path out to the Internet and using appliances in the Austin office, well, then if I'm working out of New York, I need to go all the way back to Austin before I connect out to the Internet for security.
And that is really painful, as you can imagine.
It's painful for a lot of reasons. And we talk about the maintenance pain, the security pain, but this week is all about speed.
It's painful because it backhauls your traffic.
And so when companies move to a Zero Trust model, a cloud-delivered Zero Trust model, there are some natural improvements that can happen.
And that's fantastic.
If you're in New York, and instead of going through Austin, Texas, you go through something nearby in New York, you're just naturally going to see some improvement.
But Cloudflare takes it not just a step further, many steps further.
We aren't just satisfied with being better by virtue of removing that backhaul.
It needs to actually be better, better than something that might just be a natural breakout to the Internet.
And we do that in a number of really exciting ways.
And I know we'll have some time to talk about all of these. But before I get into those details: everything you're going to hear me talk about in the Zero Trust platform, and then, Annika, when it's your turn to talk about the Magic universe, the products themselves are fast because they are built on Cloudflare.
And not just Cloudflare as a kind of nebulous concept, but we'll talk about a lot of the individual components that we use to make these services faster.
Again, it's not just about being good enough by removing your backhaul.
It's about removing your backhaul and making you faster.
Yeah, awesome. So let's talk about one of those specific components.
One of the aspects that we break down in the blog post today is how part of being built on Cloudflare is being built on Cloudflare Workers, which is our development platform.
And so what aspects of our Zero Trust platform are built on Workers?
How does that work and why does it make them fast?
Oh, it's fantastic. As a Workers customer, we couldn't be happier. So when you think about a Zero Trust model, the name is very goofy, but what it implies is that in a traditional private network, you trust the other connections, the servers, the users on that network by default.
In a Zero Trust model, you invert that and you say, I don't trust anything else.
You have to prove to me who you are. That's a really great model because it's like if we lived in an apartment building and anyone who walked into the front door of the apartment building was able to walk into your unit or their unit or just any other apartment unit, that's not how we want security in an apartment building to work.
All of us have keys to our doors for a reason.
And so in a Zero Trust network, it works a lot like that. All of the services and applications and resources that you connect into this Zero Trust network each have their own lock on the door.
And that lock is different for every service.
Maybe you have a really sensitive service that you want to make sure users are in a particular Okta group connecting with a hard key from a company laptop from a set of approved countries.
That's a very specific rule. Maybe other services, you just need to be part of the organization and that's completely fine.
So I bring all that up because there's a lot of complexity in these rules if you want them to be complex.
And that presents a challenge for not just traditional models, but Zero Trust models, because how do you enforce and consider all of these rules without slowing someone down?
And Cloudflare operates data centers in over 250 cities around the world now, as of this week. We can talk a bit more about that.
And so what we do is, instead of having these Zero Trust decisions made in just a handful of centralized data centers, we use Workers to make these decisions.
So when you attempt to reach a resource that requires some combination of these signals, what happens is you hit one of the Cloudflare data centers near you, and a Cloudflare Worker entirely contained in that data center performs the evaluation of the signals that you're providing or the signals you aren't providing.
And that's really powerful because it moves these decisions closer to the user.
And also it makes it incredibly reliable and stable. As long as a Cloudflare data center is up and available to you, these decisions are going to get enforced.
And so that's a huge aspect of what makes our Zero Trust solution that much faster.
It's not just about the connection speed. It's about making those decisions really quickly.
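As an illustration of the per-application "lock on the door" described above, here is a minimal sketch of default-deny rule evaluation. It is not Cloudflare's implementation; every class, field, and value below (`Signals`, `Policy`, `is_allowed`, the `example.com` domain, the `finance` group) is a hypothetical name chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """What a request presents. All field names are hypothetical."""
    email_domain: str
    idp_groups: frozenset = frozenset()
    used_hard_key: bool = False
    company_laptop: bool = False
    country: str = ""


@dataclass(frozen=True)
class Policy:
    """The rule set guarding one application."""
    org_domain: str
    required_group: str = ""                    # empty: no group requirement
    require_hard_key: bool = False
    require_company_laptop: bool = False
    allowed_countries: frozenset = frozenset()  # empty: any country


def is_allowed(policy: Policy, signals: Signals) -> bool:
    """Default-deny: every requirement the policy sets must be met."""
    if signals.email_domain != policy.org_domain:
        return False
    if policy.required_group and policy.required_group not in signals.idp_groups:
        return False
    if policy.require_hard_key and not signals.used_hard_key:
        return False
    if policy.require_company_laptop and not signals.company_laptop:
        return False
    if policy.allowed_countries and signals.country not in policy.allowed_countries:
        return False
    return True


# A very specific rule for a sensitive service, and a loose one for everything else.
SENSITIVE = Policy(
    org_domain="example.com",
    required_group="finance",
    require_hard_key=True,
    require_company_laptop=True,
    allowed_countries=frozenset({"US", "PT"}),
)
WIKI = Policy(org_domain="example.com")
```

Because the check is a pure function of the policy and the signals, it can in principle run in whichever location receives the request, which is the property the discussion above relies on.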
Awesome. Yeah. And if you're a viewer interested in learning more about Cloudflare Workers, the platform itself, and why it's fast, there are also blog posts from earlier this week about improvements and news about Workers speed in general.
And so the great thing about this sort of interaction is that any improvements that happen in Workers, we automatically get for all the other Cloudflare services that we build on top of it.
So it's this kind of nice, mutually beneficial flywheel thing happening.
Cool. So Workers is one great example. Let's talk about another one.
What's next on the list? Yeah. Everything that you're going to do on the Internet has one thing in common.
You're going to start with a DNS query.
At some point in your journey to the Internet, there's a DNS query made. And Cloudflare happens to operate the world's fastest public recursive resolver, 1.1.1.1.
And like Workers, it's running in our data centers around the world, and it is the world's fastest resolver.
And Cloudflare offers a service that we call Gateway, which uses that same technology and that same performance to become the protective DNS resolver for your organization.
So when your employees make DNS queries out to the Internet, like other DNS filtering solutions, we're looking at the DNS query and asking, is what you're requesting good?
And it's not about intentionally typing in something bad or malicious. It's that I'm accidentally visiting a phishing link in an email, or I'm following a link contained in a file that was shared with me, to a destination that will infect my device with malware.
DNS filtering is a really effective way to prevent those attacks from occurring.
And it's a pretty lightweight model: you can deploy it to an office, or you can deploy it to agents and endpoints.
But again, DNS queries are your first step, or close to your first step, to getting onto the Internet.
And if those are slow, if the filtering that's being applied is slow, well, then your entire Internet experience for your employees is going to slow down.
So by building the Gateway DNS filtering solution on top of the world's fastest public DNS resolver technology, in all the same data centers around the world, we deliver a DNS filtering solution that literally speeds up your team, which is really exciting because most people associate security products with something that slows them down.
But in this case, not only do you not have to compromise speed in order to have security, you get more speed when you become safer with this DNS filtering solution.
So that's really fun. Like with Workers, it's part of the luxury of getting to build on the different components that make Cloudflare's network so special.
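To make the DNS filtering step concrete, here is a minimal sketch of the decision a filtering resolver makes before answering a query. It is an illustration only, not how Gateway works internally: the blocklist contents, the `resolve` callback, and the choice to answer blocked names with `0.0.0.0` are all assumptions made for the example.

```python
def is_blocked(hostname, blocked_domains):
    """True if the hostname or any of its parent domains is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "a.b.c", then "b.c", then "c" so subdomains of a blocked name match too.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in blocked_domains:
            return True
    return False


def filter_query(hostname, blocked_domains, resolve):
    """Answer a query: sinkhole blocked names, otherwise resolve as usual.

    `resolve` is a hypothetical callback standing in for the real resolver.
    """
    if is_blocked(hostname, blocked_domains):
        return "0.0.0.0"  # assumed sinkhole answer for this sketch
    return resolve(hostname)
```

The check itself is a handful of set lookups, so in a sketch like this the filter adds little on top of resolution; the speed of the answer is dominated by how fast the underlying resolver is.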
That's awesome. Okay, so DNS request comes in, Cloudflare operates the world's fastest recursive DNS resolver.
So right there at sort of the very first step in the request path, we're already making it faster.
But then what about what happens after that?
So we resolve the DNS query, and then it's got to go somewhere after that, right?
Cloudflare is not its last stop. Are we doing anything on that sort of next path to send the request to its final destination?
We are. We're doing three things, which are pretty special.
The first is kind of the funniest.
There is a non-trivial chance that what you're going to is on Cloudflare's network already, using our reverse proxy, our WAF, our CDN.
And there's, again, a non-trivial chance that when you make a DNS query, I'm here in our Lisbon office, and the Lisbon data center is somewhere that way.
When I make a DNS query, it hits the Lisbon data center.
And if it resolves to something that's already on Cloudflare's network, that's in the cache of that Lisbon data center, my security and the content that I'm attempting to access all happen literally in the same location.
And you can't beat that, right? Short of having a Cloudflare PoP here in my home, which I might try to petition for someday, you can't beat the idea that the security is happening in the same place where the content that I want to reach is living.
But sometimes it's not on Cloudflare. So when it's not on Cloudflare, we use the same technology that powers our Argo Smart Routing to think about how we get you to your destination, egress you out of Cloudflare's network, and then return the response back to you in a way that is highly performant, whether that uses our global backbone, which we talked about in a blog today, or we're just acting kind of like Waze for the Internet, an analogy that we use a lot.
We're finding a path that avoids congestion on the Internet to get me to my destination.
That's two things. The third thing is something I really believe is going to change the world.
Cloudflare offers a browser isolation service.
And when we talk to customers about browser isolation, most groan, and that's completely fine, because they're familiar with two models of browser isolation technology.
One is pixel pushing. So I'm going to run an isolated browser somewhere, and the value of running it somewhere else off of my laptop is, of course, the code is not executing on my device.
I can contain any potential security threats, zero-day threats, keep them off of my device, and my local browser is a window into this isolated browser.
But historically, two models. One pushes pixels down, so it's like pointing a video camera at somebody else's laptop as they browse the Internet for you.
Pretty miserable. The other is something called DOM manipulation, which takes the DOM of the web page, unpacks it, inspects it, repackages it, and then sends it down.
It hopes it repackaged it correctly. It's like airport security inspecting your bag.
Doesn't always go very well. So what we've done is we've taken a very different third approach to this, where there's a headless browser, a headless version of Chromium running in Cloudflare's edge in all of our data centers around the world, including the Lisbon one down here, and all that's getting sent to my device are the vector renderings of the web page.
So my local laptop just thinks it's browsing the Internet, because I can copy-paste text, I can print if I need to.
If you're an administrator, you can prevent me from doing those things if you want.
But because all that it's sending are the vector renderings, again, I don't need a special browser.
My laptop thinks it's just the Internet.
And it can be more lightweight than sending the entire web page with all the code that's getting executed.
And then finally, as part of that browser isolation experience, because the browser isolation technology is running in all of our data centers around the world, and to bring it back to the first point, in some cases, the web page that that isolated browser might be navigating or browsing is sitting in that same data center because of our reverse proxy.
So you mentioned a virtuous cycle or flywheel earlier, same case here. So those are the three.
I know I have kind of a shaggy-dog-story way of going about these three bullet points.
I wasn't very concise, forgive me. But those are the three things that we're doing to really make this a powerful and fast solution for our customers.
Awesome. Okay. So there are a bunch of different things in here. We talked about using Cloudflare to make Cloudflare better.
We talked about sort of the fundamental nature of our network and our interconnectivity and then the intelligence that we can layer on top of that.
And then we talked about this totally innovative way to approach browser isolation.
That means that a lot of the performance concerns that traditional solutions have had are just not a problem for the way that we approach this.
I think I want to say that that's all we talked about in the blog about what we're doing now.
Is there anything else that you want to add or stuff that you're really excited about that's maybe coming up that we'll use to sort of build on as we develop out this platform?
I do. I will preview that. We're about to make some really exciting updates to browser isolation to make it even faster.
So we'll talk about that a bit more. But the other thing I want to add and I want to talk about more is what your team is doing and what y'all are building on performance improvements.
Is it all right if we switch places at this point?
Yeah, let's do it. Okay. So you wrote a blog post today about how Cloudflare's network makes customers' networks faster.
Can you tell me more about that?
Yeah, sure. So the post is called Magic Makes Your Network Faster, sort of talking about the other half of Cloudflare One.
But one of the things that we talk about in that post is actually a shared benefit between the Zero Trust aspects and customers that are connecting their entire network to Cloudflare at sort of the IP level.
So maybe their entire data center, their entire cloud VPC, things like that.
And that difference is in the architecture change, or the sort of shift in thinking, between the classical model of network connectivity, where customers are doing this backhaul or traffic hairpin thing, and the way that we want to approach it, which is that all of the security filtering, any functions really that are being performed on that traffic, happen as close as possible to the source of the traffic.
And then we can pick the most optimal paths to get it to its destination.
And so in one of the examples that we break down in the blog post, we have this sort of test network that mirrors the architectures that we've heard about from a lot of customers where we have locations in Oregon and Los Angeles that represent sort of branch office locations for this customer.
And then they're headquartered in South Carolina. So the branches are on the West Coast, but their hub, where they have all of their security boxes, where they're doing all of their filtering, and where their actual Internet breakout is as well, the place where they're able to send traffic that's sourced from those branches out to the rest of the web, is all in that one place.
And so what that means is that for that Internet traffic, like the use case that you talked about before for the sort of the VPN architecture, that traffic has to go all the way from, let's say someone that's sitting in the Los Angeles office all the way to South Carolina before it can get out to the Internet.
Even if the place that it's going is maybe a data center, like let's say some AWS instance that's actually in Los Angeles.
So that traffic has to come all the way back.
And then for communication between those branch offices too, Oregon and Los Angeles are a heck of a lot closer to each other than Oregon and South Carolina, but traffic between those locations can't go directly between them.
It has to go all the way back through that central hub before coming back.
And so it's literally adding latency just by the additional thousands of miles that that traffic has to go over.
And there's only so much that you can do to sort of counter the speed of light.
And so in contrast to this architecture, which really when you look at it on a map, doesn't make a lot of sense.
What we want to do instead is have the traffic ingress at the closest Cloudflare location to its source.
So we talked about those 250 locations. There's one in Los Angeles.
So if I'm sitting in Los Angeles, I can just connect right there.
And then at that point of presence, all the security filtering happens, that sort of single pass inspection, all of that stuff.
And then wherever the traffic is headed next, whether that's maybe a location, the branch office in Oregon that's also connected to Cloudflare, or maybe it's out to the broader Internet, it can just go directly there.
It doesn't have to go all the way back to the hub because that filtering is happening at the location that's closest to the source.
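The hub-and-spoke penalty described above can be estimated with simple arithmetic. The sketch below compares a direct Oregon to Los Angeles path with the same traffic hairpinned through South Carolina. The coordinates are assumptions (Portland stands in for "Oregon" and Columbia for "South Carolina"), as is the rule of thumb that light in fiber covers about 200 km per millisecond; real routes are longer than great-circle paths, so these are lower bounds.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_MS = 200.0  # assumed: roughly two thirds of the speed of light in vacuum

# Approximate (latitude, longitude) pairs; the specific cities are assumptions.
SITES = {
    "los_angeles": (34.05, -118.24),
    "oregon": (45.52, -122.68),          # Portland
    "south_carolina": (34.00, -81.03),   # Columbia
}


def great_circle_km(a, b):
    """Haversine distance between two (lat, lon) points in kilometres."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def path_km(*stops):
    """Total distance of a path that visits the named sites in order."""
    return sum(great_circle_km(SITES[a], SITES[b]) for a, b in zip(stops, stops[1:]))


def min_one_way_ms(km):
    """Lower bound on one-way propagation delay over fiber."""
    return km / FIBER_KM_PER_MS


direct_km = path_km("oregon", "los_angeles")
via_hub_km = path_km("oregon", "south_carolina", "los_angeles")

print(f"direct:  {direct_km:7.0f} km, at least {min_one_way_ms(direct_km):5.1f} ms one way")
print(f"via hub: {via_hub_km:7.0f} km, at least {min_one_way_ms(via_hub_km):5.1f} ms one way")
```

Under these assumptions the hairpinned path is several times longer than the direct one, and that extra distance is propagation delay that faster hardware at the hub cannot remove.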
Some companies have tried in the past to deal with this problem by just replicating the hubs.
So like putting a little stack in different regions. So maybe- More boxes.
Yeah, exactly. More boxes. Maybe they're like, oh, we'll put a box in wherever, somewhere that's kind of central.
Put one in Arizona or something that will do these same things.
But that gets really expensive and it gets really hard to manage.
And it means that every time you want to put in a new rule, you don't just have to think about enforcing it in one place.
It has to be everywhere, and that just gets out of control.
And so most of the time, teams classically have just decided to accept that latency trade-off and create a poorer experience for their users in order to keep the security in place.
And essentially what we're doing with this architecture difference is removing that trade-off.
You can have both.
So when customers come to the team to use this service, it's primarily motivated by security.
Are they then surprised by the performance benefits?
How do they react when that happens too? Yeah. I think a lot of the time it is a big surprise.
I mean, customers are thinking specifically about our Magic Transit product, whose primary use case is protecting external-facing networks from DDoS attacks.
Basically what we do is we sit in front of customer networks.
We absorb all the traffic that's destined for their network. We drop DDoS attacks and then we send the good traffic back to them.
And customers' experience with solutions like this in the past is that they add significant latency, because what needs to happen in a lot of these competitive solutions is the traffic will go to one or a couple of dedicated locations that are called scrubbing centers.
So if we take our same sort of example customer network use case, maybe there's someone in like Ontario, Canada, that's trying to access the resource that maybe is hosted in that South Carolina data center.
With the traditional solution, maybe the closest scrubbing center to Ontario is in like San Francisco.
And so again, that traffic is doing that backhaul thing to get back to the data center.
And so the way that customers have combated this in the past is using DDoS protection or solutions like this in what's called an on-demand format.
So only when their network is under attack, and an attack severe enough that the latency trade-off doesn't matter at that point, because it's all about stopping the bleeding and getting your network back online.
Only at that point do they turn it on.
But that means potential for outages, time between when the attack starts and when you can turn it on and just overall not a great experience.
And so with Magic Transit, the vast majority of our customers decide to use it always on because they're pleasantly surprised when they turn it on, it doesn't impact their traffic.
Or in many cases, their traffic actually gets faster going through Cloudflare's network versus just sort of the public Internet.
Which is- So yeah, it's surprising and exciting.
It's like a little cherry on top.
Yeah, yeah. Very pleasant surprise. And reading your blog post, a lot of what makes it so much faster, beyond just the distribution of our data centers, includes some things that Cloudflare does that are pretty unique.
Anycast, CNI, can you tell us more about that?
Yeah, sure. So one of the things about our network and the way that we've architected it over the past 10, 11, 12 years is that we've always cared a lot about this, even when our primary products and the main focus of our business was on our CDN and application services.
We cared a lot about being really close to users, to end-user eyeballs, because that's how we delivered traffic faster.
And so we've invested a lot in connectivity and being close to all the places that people are, but then also in connectivity to all the places that content is.
And so Cloudflare's network is really well peered.
We have over... I think the new number is 9,800 interconnects worldwide, which means we're sort of plugged into everybody everywhere and have lots of different paths that we can take to get traffic around.
We also give customers the option, and you mentioned CNI, which stands for Cloudflare Network Interconnect, to plug directly into us, to be one of those peering partners in a location, so that when we get traffic that's destined for their network, there's a really fast, direct, secure path with dedicated bandwidth to get that traffic there.
And then we also talked about in a post, I think that was today, it could have been yesterday.
They're all blurring together.
Yeah, exactly. About our specific investment in last mile connectivity, which is again, being close to those end users.
And so specifically for folks that have been working from home since the pandemic, one of the pieces of feedback that we've heard from customers is using Magic Transit in front of their data centers has meant no performance impact for those customers because we're really close.
The place that that ingress traffic is coming in is really close to where those users are, even if they're sitting at home, maybe miles and miles away from a corporate office that's located in a city or somewhere that's traditionally had better connectivity to the rest of the Internet.
That is so cool.
Because you can think about the tangible impact that has on people's literal lives, right?
Like that I could move to work from home or entire workforces can leave the office and performance is just as good, if not better.
And I think there was another, this blog post might have been earlier this week, like you, I'm losing track, but about some intelligent work that we're doing for handling packets too, right?
Yeah, totally. So we talked about two things so far: the fundamental architecture difference, and then the fact that we're just peered with everybody all over the place, which basically means more path options for routes.
And the Internet's actually already pretty smart about picking the best ways to get traffic from point A to point B.
There's this protocol, Border Gateway Protocol, BGP, that the networks and routers that make up the Internet use to talk to each other and figure out which path to take to send traffic around.
But we actually want to go beyond that, right? We don't just want to say, hey, we have lots of options.
And so like let BGP pick the best option from all of those.
We actually want to use our perspective on the Internet, our unique view of all of this traffic that's running through our network, hundreds of billions of HTTP requests every day, and use that to make even smarter decisions on top of what just BGP would normally do.
And we can do that with choosing specific providers or connectivity options to send traffic down at a given location.
So for a while, we've had a product for applications called Argo Smart Routing that you can use if you are using Cloudflare as a reverse proxy, and that speeds up applications.
I think our old statistic used to be by 30% on average across the board, but we also introduced some new information about improvements to application layer Argo earlier this week.
But then we're super excited now to announce Argo for packets, Argo Smart Routing for packets, which brings those same kind of benefits, that same intelligence where we're looking at all the traffic within our network, making smart decisions about how to route packets around to the IP level.
So anyone who has their whole network connected to Cloudflare, their data centers, their offices, their clouds, whatever, you could basically just click a button, flip a switch, and get faster performance for all that IP traffic automatically.
And it's just getting better from here. We have lots of stuff in the pipeline around additional optimizations that we want to do to Argo and just our traffic in general to keep getting faster for our customers.
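The contrast drawn above, between letting BGP pick a path and layering measurements on top, can be sketched as two selection functions over the same candidate paths. This is a deliberately simplified illustration: real BGP best-path selection has many more tie-breakers than AS-path length, and the transcript does not describe how Argo actually scores paths. The `Path` fields, the sample numbers, and the loss penalty are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """One candidate way to reach a destination. All values are made up."""
    name: str
    as_hops: int            # length of the AS path
    measured_rtt_ms: float  # observed round-trip time
    loss_rate: float        # observed fraction of packets lost, 0.0 to 1.0


def pick_by_hop_count(paths):
    """Simplified stand-in for a BGP-style choice: fewest AS hops wins."""
    return min(paths, key=lambda p: p.as_hops)


def pick_by_measurements(paths, loss_penalty_ms=1000.0):
    """Measurement-aware choice: lowest RTT after penalising packet loss."""
    return min(paths, key=lambda p: p.measured_rtt_ms + p.loss_rate * loss_penalty_ms)


CANDIDATES = [
    Path("transit-a", as_hops=2, measured_rtt_ms=95.0, loss_rate=0.02),
    Path("transit-b", as_hops=3, measured_rtt_ms=60.0, loss_rate=0.0),
]
```

With these sample values the hop-count rule prefers `transit-a` because it is shorter on paper, while the measurement-aware rule prefers `transit-b` because it is currently faster and loses no packets. Having more interconnects simply means more candidates for either rule to choose from.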
That is so neat.
Sometimes the brilliant folks who hang out in this company and put in the work to make something like Smart Routing for packets possible, I feel just lucky getting to kind of see that happen.
I like to read the Cloudflare blog post too, just as an observer for all the other developments that are happening outside of my little world.
Kind of to that end though, if you were telling a customer or someone watching this, what do you want them to take away about why magic makes their network faster?
If they wanted to go back and tell their team, hey, I watched this Cloudflare TV segment and they talked about how Cloudflare makes our network faster.
What do you want them to say? Yeah. The summary is kind of three points plus a little bonus one.
First one: the fundamental architecture difference, just like we talked about for Zero Trust, means miles of latency just sort of magically disappear when you're using our services to filter for security at the edge.
The second one is that the scale of our network makes it possible for us to have more options for how to get traffic to its final destination, which means that there's better ones available.
And then the third thing is that on top of that, on top of just letting the Internet do what it does and pick good paths to get traffic from point A to point B, we're adding layers of intelligence that make that traffic even smarter and faster than it would be on its own.
And then the little bonus is that, as Michelle, our COO, likes to say, we're just getting started. There are always new products coming down the pipeline.
So stay tuned for more developments in this sort of Magic space.
WAN optimization and quality of service are just a couple of the places that we're thinking about and want to invest in over the coming months.
So not done here by any means.
That is fantastic. Well, for everyone watching, I can say this because I didn't write the Zero Trust blog post, so I'm not self-promoting.
Please go read Annika's.
It's incredible. It's on blog.cloudflare.com. And then the corresponding one for Zero Trust was written by two product managers in our group, Tim Obezuk and Kenny Johnson.
It's also incredible. You'll get a chance to really go take a deep dive into the data behind what we're talking about here today, the structure and architecture and diagrams of why this works the way it does.
And I know if you're following along on Twitter or LinkedIn, as you follow the Speed Week blog posts, if you have questions or commentary, let us know.
We're really excited about talking about this week.
And really, it's the culmination of not just weeks of work, but years of investment in all the technologies that make this possible.
So this is pretty fun. Yeah. Yeah, totally. Yeah.
Thanks so much for your time, Sam. Yeah. Yeah. Annika, thanks for being here. All right.
I'll see you. I guess we got to go get back to making things faster. Yeah. See you on the Internet.
All right. See you.