Cloudflare at Cloudflare
Presented by: Juan Rodriguez, Eric Reeves
Originally aired on September 24, 2021 @ 9:00 PM - 9:30 PM EDT
Come learn how we use Cloudflare technologies internally to solve problems (or as we say "dogfood our own products" internally).
This week's guest: Eric Reeves, Engineering Manager at Cloudflare
Transcript (Beta)
Welcome, everyone, to Cloudflare at Cloudflare.
We've had a little pause for a few weeks, but we're back on the air.
My name is Juan Rodriguez, I'm Cloudflare CIO.
And in this show, we speak about dogfooding. Dogfooding is a term that we use at Cloudflare to describe using our own products to solve problems and build other products.
And in many cases, we actually invent products just by trying to solve problems.
It's a term that is very much part of the culture at Cloudflare.
And I always like to say that I try to be Cloudflare's first customer, using our own products internally for many things.
Today, my guest is Mr. Eric Reeves. I'm going to let Eric introduce himself. How are you, Eric?
I'm well, thank you. Glad to be here. My name is Eric, I'm the engineering manager for the Spectrum and Argo smart routing teams.
I joined Cloudflare in February of 2017 as a software engineer on what is now called the Emerging Technology and Incubation Team.
Back then it was called Product Strategy. About four months into my career at Cloudflare, the head of that team asked if I was interested in working on a new product called Proxy Anything.
And of course I said, yes.
And the next year we launched that product, Spectrum. We started to build a team in our Austin, Texas office.
And it was during that time that I transitioned to being an engineering manager.
All right, great. Well, it sounds like you've had quite an interesting ride in Cloudflare so far, you know, with different teams and stuff like that.
Yes. So we're going to talk today about two products in Cloudflare.
One is called Spectrum and the other one is called Argo Smart Routing, not to be confused with Argo Tunnel.
And Eric, why don't you tell us a little bit about Spectrum and Argo Smart Routing and what they are, what they do and all those sort of things.
Yeah, absolutely. Well, Spectrum is a reverse proxy product and it extends Cloudflare security and performance beyond web traffic.
There's this thing called the OSI model that describes how computers communicate over the Internet and Spectrum operates at layer four, which means that customers can put Cloudflare's global network in front of any TCP or UDP application.
And by security, of course, I'm talking about, you know, Cloudflare's capacity for mitigating, you know, distributed denial of service attacks.
But we also talk about giving customers access to features that allow them to define their own security posture on their own.
So an example of that is the ability to configure a software-defined IP firewall that they can use to, you know, selectively allow or deny traffic based on their use case.
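The allow-or-deny idea described here can be sketched in a few lines of Python. This is only an illustration of the concept, not Spectrum's firewall or its API: the rule format, the `decide` function, the first-match-wins ordering, and the default action are all assumptions made for the example, and the addresses come from the reserved documentation ranges.

```python
from ipaddress import ip_address, ip_network

# Hypothetical rule list: (action, CIDR block), evaluated top to bottom.
RULES = [
    ("deny", ip_network("203.0.113.0/24")),
    ("allow", ip_network("198.51.100.0/24")),
]


def decide(client_ip, rules=RULES, default="deny"):
    """Return the action of the first rule whose network contains client_ip."""
    addr = ip_address(client_ip)
    for action, network in rules:
        if addr in network:
            return action
    # No rule matched, so fall back to the configured default.
    return default
```

With these example rules, `decide("198.51.100.20")` returns `"allow"`, while an address that matches no rule falls through to the default.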
On the performance side, because Spectrum is a reverse proxy and because it's built on top of the global Anycast network, it benefits from all of that network engineering that's used to, you know, steer our customers' traffic to Cloudflare and then ultimately to the services that we protect.
And along those lines, you know, Spectrum also integrates with our Argo Smart Routing product to accelerate TCP traffic.
Okay. And so, you know, Argo smart routing is a platform within our network that optimizes the network path from Cloudflare to the customer's origin server.
And so, you know, all of the systems that make up Argo collect network performance data that's driven by our customers' traffic and use that information to compute the fastest and most reliable path across the Internet.
So that includes, you know, routing around network congestion, issues on the network, even issues in particular transits between, you know, various Cloudflare points of presence.
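One way to picture "compute the fastest path from measured performance data" is a shortest-path search over a graph whose edges carry observed latencies. The sketch below uses Dijkstra's algorithm for that purpose. It is not a description of how Argo is implemented; the node names, the latency figures, and the `fastest_path` function are hypothetical.

```python
import heapq


def fastest_path(latency_ms, source, target):
    """Dijkstra over a dict of the form node -> {neighbor: latency in ms}."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, hop in latency_ms.get(node, {}).items():
            candidate = cost + hop
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if target not in best:
        return None, float("inf")
    # Walk the predecessor links back from the target to rebuild the route.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1], best[target]


# Made-up measurements: the direct link to the origin is slow (congested),
# so relaying through a second point of presence is faster.
LATENCY_MS = {
    "edge-a": {"origin": 180.0, "edge-b": 40.0},
    "edge-b": {"origin": 90.0},
}
```

Here `fastest_path(LATENCY_MS, "edge-a", "origin")` returns `(["edge-a", "edge-b", "origin"], 130.0)`. If the link between the two edge nodes degraded to 200 ms, the same call would fall back to the direct 180 ms route.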
So while Spectrum uses Argo Smart Routing, customers know Argo as the suite of products that's made up of Argo Smart Routing, which I just described.
There's Argo Tiered Caching, which improves the performance of cached content, particularly for web traffic.
And then Argo Tunnel, which is a way to secure origin services through Cloudflare's network.
Gotcha. Great, yeah. And one of the things that I always try to tell customers that I speak with and, you know, friends that I have in the community is that, you know, one of the, I think, the superpowers that Cloudflare has is basically the fact that it's built into this, you know, great network and all our services basically are built on top of that.
So you can, you know, many, you know, services that you use kind of like higher on the stack, if you want to call it that way, you get a lot of that benefit and all that good, you know, from the global network and the Anycast network and all the other things that are like a little bit lower on the platform, you know, kind of like for free, you know, you can take it out magically without having to basically glue a lot of things together, right?
Yeah, definitely. So tell me a little bit about, you know, how do we use some of these products internally?
How do we dogfood Spectrum and Argo and these products to solve different problems or to build other things?
Yeah, I'll start with Spectrum.
So of course, when we built Spectrum, we were focused on making a top notch customer facing product.
And I'm proud of the fact that Cloudflare has this strong culture of combining products that make them greater than the sum of their parts.
And we anticipated that other teams would want to use Spectrum's technology at some point.
And the most common use case that we've observed is where, you know, some team wants to proxy non-web traffic outside of Cloudflare's network to, you know, some service inside of Cloudflare's network.
And before Spectrum existed, standing up all of that technology would have been cumbersome and at times even infeasible.
So it's been really helpful in that regard. You know, so some of the early adopters were quite willing to go through a manual process of onboarding their service to Spectrum.
But today that process is mostly self-service, meaning that they can use our customer facing API to set up and, you know, proxy traffic from the Internet at large to Cloudflare's network on their own.
And so, you know, a couple of examples of teams that are using Spectrum for their own services, you know, first Cloudflare Registrar, which is Cloudflare's domain registration service.
You know, they operate a Whois service and that service is powered by Spectrum.
So if you send a Whois request to whois.cloudflare.com, that request goes through Spectrum.
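For readers who have not seen a Whois request on the wire, it is a plain TCP exchange: the client connects to port 43, sends the query followed by CRLF, and reads until the server closes the connection. A minimal Python sketch follows. The hostname is the one mentioned in the episode, while the function names and the timeout are assumptions of the example, and what the server returns for any given domain is not shown here.

```python
import socket


def build_whois_query(domain):
    """A Whois query is the name being looked up followed by CRLF."""
    return domain.strip().encode("ascii") + b"\r\n"


def whois(domain, server="whois.cloudflare.com", port=43, timeout=10.0):
    """Send one Whois query over TCP and return the raw text response."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_whois_query(domain))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling `whois("example.com")` needs network access. Because this is ordinary TCP rather than HTTP, it is the kind of traffic a layer four proxy is meant to carry.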
And then one of my favorite examples of using Cloudflare at Cloudflare is for an internal tool that we call Crossbow.
And Crossbow is this super cool debugging tool that's provided by our support operations team.
And it allows us to run tests and diagnostics from any server on Cloudflare's edge.
And that's very valuable for troubleshooting and debugging.
So, you know, it's a command line tool.
I run that command line tool on my laptop and it communicates with the backend service inside Cloudflare's network.
And that service is a communication hub that communicates with all the various servers on our edge network.
And so putting Spectrum between the CLI and the backend allows us to use that tool without using our corporate VPN.
That's special to me because one, I love that tool.
And two, you know, the Spectrum team was able to contribute to putting a dent in obsoleting our corporate VPN.
I remember it's one of the first things that you and I worked on together when I joined.
We had to try to get Crossbow across the line so it could be used with Access.
Yeah, that's right. I remember you walked over to my desk and you said, hi, I'm Juan.
Can you help us put Crossbow behind Spectrum?
Yes. So I remember that. Yeah. And speaking of the VPN, you know, one example of dogfooding early on was actually putting Spectrum in front of our corporate VPN.
And any employee that was connecting to the VPN would be doing so through Spectrum.
And that gave us a very diverse set of use cases and clients and operating systems and so on.
And so in those first few weeks, you know, we discovered quite a few bugs, you know, made some great performance improvements and just, you know, discovered issues that we wouldn't have otherwise found before putting it in front of our customers.
Yeah. And that is one of the things that I think that is fantastic about like us dogfooding these things is like, you know, the idea is that if it's not good enough for us, you know, it's not going to be good enough for our customers.
So we try to basically clean up a lot of these things, you know, before we send it to customers.
Yeah.
And, you know, with Argo Smart Routing, you know, earlier I mentioned, you know, Argo Smart Routing for Spectrum, which is, you know, an example of combining products and, you know, Warp and the Warp Plus product is another example of using Argo to, you know, optimize traffic for mobile devices to the services that they're trying to reach.
And customers know Argo as this feature that they turn on or they toggle.
And that just, you know, within moments starts improving their customer's experience.
Magic, it's magic. Yeah. But within Cloudflare, you know, we think of it as a platform that we can use to optimize any traffic across our network.
And so we're constantly looking for opportunities to use Argo's technologies to improve, you know, how traffic is routed across our network.
Great. So that is very, very, very interesting. And so, yeah, and I remember very clearly, you know, the thing about Crossbow and all that kind of stuff.
And, you know, it's been quite a journey, you know, to try to get all these products, you know, behind Access and, you know, putting more of what I call nails in the coffin of the VPN inside of Cloudflare.
Tell me a little bit about, you know, if you can about what is in the roadmap, things that we're working on, you know, that we can talk about, you know, for Spectrum and Argo, things that you may be working on now or things that, you know, you guys have in the plans that, you know, we can disclose right now.
Yeah, sure. I'd like to talk about some of the features for Spectrum that we shipped this year that I'm very much proud of, yeah.
So earlier in the year, we launched our Spectrum pay-as-you-go product, which extends Cloudflare's benefits to, you know, all of the Cloudflare paid plans.
Through that, we offer protection for protocols like SSH and Minecraft, and business and enterprise plans can also protect the Remote Desktop Protocol.
So that was a great expansion of Cloudflare's scope from what started as an enterprise product to also our pro and business plans.
You know, since then, we've launched this thing called regional services, which is a solution powered by Spectrum for customers who have requirements around, you know, where their traffic can be processed and decrypted, whether those are contractual or regulatory requirements.
So Spectrum was a big part of that. And then most recently, we shipped this feature called port ranges.
And earlier, you know, I talked about how Spectrum is a product that proxies any TCP or UDP port, but in reality, a lot of services that exist on the Internet, you know, they operate on a handful or a range of TCP or UDP ports.
And so instead of having to, you know, create a single Spectrum proxy for each of those ports, you can now configure that in the same Spectrum configuration, which has opened up a lot of use cases that didn't exist before.
And it's been really great for some of our customers who have that need.
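To make the port ranges idea concrete, here is a small Python sketch of expanding a range and pairing edge ports with origin ports one to one. The `"start-end"` string format and both function names are hypothetical, chosen for illustration only; they are not taken from Spectrum's configuration or API.

```python
from typing import Dict


def parse_port_range(spec: str) -> range:
    """Parse "8000-8002" (or a single port such as "25565") into a range."""
    start_text, _, end_text = spec.partition("-")
    start = int(start_text)
    end = int(end_text) if end_text else start
    if not (1 <= start <= end <= 65535):
        raise ValueError(f"invalid port range: {spec!r}")
    return range(start, end + 1)


def map_ports(edge_spec: str, origin_spec: str) -> Dict[int, int]:
    """Pair each edge port with the origin port at the same offset."""
    edge = parse_port_range(edge_spec)
    origin = parse_port_range(origin_spec)
    if len(edge) != len(origin):
        raise ValueError("edge and origin ranges must be the same size")
    return dict(zip(edge, origin))
```

For example, `map_ports("8000-8002", "9000-9002")` returns `{8000: 9000, 8001: 9001, 8002: 9002}`, so one configuration entry stands in for three per-port entries.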
Great. Yeah, that's awesome. And I'm just thinking, so if we think about, you know, customers that may be contemplating a Spectrum or Argo or things like that, what would be, you know, some things that we've learned, you know, as part of our implementation of these products or uses internally that you think like, you know, if I was talking with a customer that I was thinking about like using this within their platform, right, using it to leverage certain services or things like that, you know, that we could give them as words of advice.
Yeah, that's a great question. So a little bit of latitude, you know, thinking about, because Spectrum's a layer four proxy and we allow customers to proxy any port, there were a lot of technological challenges that came with proxying any port at scale.
We spent a lot of time early on charting out how Spectrum would offer protection for a handful of applications and then how like we would operate and manage the platform to do that.
You know, when we arrived at the solution that allowed us to proxy any port, we had a whole new set of challenges, like what are the threat models that exist?
How is, you know, DDoS mitigation going to work and do we have the right kind of observability in place?
But one of the larger challenges was like, how do we present this functionality to the user and how do they interact with the product from the dashboard to the API and then, you know, extending Cloudflare's features in a way that makes sense to them.
And I think that our pay-as-you-go product has been very effective in giving customers access to, you know, a subset of that functionality to allow them to easily onboard a service to Spectrum.
And I think, you know, Minecraft is a great use case where customers want to protect their Minecraft server from denial of service attacks.
They go to the Cloudflare dashboard, they activate Spectrum and they create a Minecraft proxy within seconds.
And within seconds, our global edge network is proxying that traffic.
And, you know, those capabilities are usually sufficient for a lot of customers.
But for those mission-critical services, there's a whole other, like, much more diverse set of features that customers may need access to.
So some of those include, you know, the IP firewall, the ability to, you know, programmatically manage this firewall to selectively allow and deny traffic.
Load balancing Spectrum connections from our edge across their origins based on advanced load balancing directives that are provided by our load balancing product.
Or even, like, advanced logging that customers can use and consume to have their own observability around how Cloudflare's network is protecting their traffic.
And a lot of those features, you know, are available to our enterprise customers.
And so I think one thing to think about is, like, what are the requirements for your service, not just in terms of reliability and uptime, because we know Cloudflare can provide that, but other considerations, like information that your origin service has access to now that it might not otherwise have access to behind a proxy.
What features do you need in order to retain that functionality?
Gotcha. So I think what I'm hearing, Eric, is that if you're thinking about using Spectrum, you can sign up for it through one of our pay-as-you-go plans to get your feet wet, test certain things with some of the capabilities, and then maybe even learn more about requirements that you may have for the higher-level capabilities and potentially, you know, graduate to the full-blown product.
That's right. You know, an easy, low-risk way to basically get used to the service, right?
That's right. Yeah. Good, that's great. Well, I hope that customers, you know, got a little bit of an overview of this and the way that we use the product.
Anything else that you would like to share?
And if not, we can wrap up a little bit early on the call. Um, I don't, I don't think so.
You know, advice to customers would be, you know, we've got a lot of great documentation, knowledge base and, you know, very capable, like, you know, support and sales staff who can help walk you through any questions that you have about the product.
So I encourage you to access those resources and check Spectrum out.
Great. Well, thank you so much, Eric. I appreciate you joining me today.
I hope people like the call today. We're going to wrap up a little early and we'll see you in another couple of weeks with another Cloudflare conference session.
Have a good rest of the week, Eric. Bye. You too. Thank you.
Hi, we're Cloudflare.
We're building one of the world's largest global cloud networks to help make the Internet faster, more secure and more reliable.
Meet our customer, Falabella.
They're South America's largest department store chain with over a hundred locations and operations in over six countries.
My name is Karan Tiwari.
I work as a lead architect in Adesa e-commerce at Falabella.
Like many other retailers in the industry, Falabella is in the midst of a digital transformation to evolve their business culture, to maintain their competitive advantage, and to better serve their customers.
Cloudflare was an important step towards not only accelerating their website properties, but also increasing their organization's operational efficiencies and agility.
So I think we were looking at better agility, better response time in terms of support, better operational capabilities.
Earlier, for a cache purge, it used to take around two hours.
Today, it takes around 20 milliseconds, 30 milliseconds to do a cache purge.
Home page loads faster. Your first view is much faster. It's fast. Cloudflare plays an important role in safeguarding customer information and improving the efficiencies of all of their web properties.
With customers like Falabella and over 10 million other domains that trust Cloudflare with their security and performance, we're making the Internet fast, secure, and reliable for everyone.
Cloudflare, helping build a better Internet.
A botnet is a network of devices that are infected by malicious software programs called bots.
A botnet is controlled by an attacker known as a bot herder. Botnets are made up of thousands or millions of infected devices.
These bots send spam, steal data, fraudulently click on ads, and engineer ransomware and DDoS attacks.
There are three primary ways to take down a botnet: disabling its control centers, running antivirus software, or flashing firmware on individual devices.
Users can protect devices from becoming part of a botnet by creating secure passwords, periodically wiping and restoring systems, and establishing good ingress and egress filtering practices.
What is caching?
In caching, copies of files are saved in a temporary storage location, known as a cache, for quick and easy retrieval.
In the context of a content delivery network, or CDN, a website's files are cached onto a distributed set of CDN servers.
Imagine a user in Tokyo trying to access a website hosted in Los Angeles.
The user's request will have to travel over 5,000 miles to reach the web server, and the response will have to cover the same distance.
That can take a long time.
A globally distributed CDN can cache the website's files in CDN servers around the world.
This way, when a user in Tokyo wants to access a website 5,000 miles away, they can minimize latency by getting the files from a CDN server close to them.
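A rough back-of-the-envelope calculation shows why distance matters. The sketch below estimates the minimum round-trip time over a given distance, assuming signals travel through optical fiber at roughly 200,000 km per second (about two thirds of the speed of light in a vacuum). That speed, the 50-mile figure for a nearby CDN server, and the function name are assumptions of the example. Real requests take longer because of routing, queuing, and multiple round trips per page load.

```python
MILES_TO_KM = 1.609344
FIBER_KM_PER_SECOND = 200_000.0  # assumption: roughly two thirds of c


def round_trip_ms(distance_miles):
    """Lower bound on round-trip time over fiber, in milliseconds."""
    distance_km = distance_miles * MILES_TO_KM
    return 2 * distance_km / FIBER_KM_PER_SECOND * 1000


far = round_trip_ms(5000)  # user to an origin 5,000 miles away
near = round_trip_ms(50)   # user to a hypothetical CDN server 50 miles away
print(f"origin: {far:.1f} ms, nearby cache: {near:.1f} ms")
```

Under these assumptions the distant origin costs at least about 80 ms per round trip, against under 1 ms for the nearby cache, and a page load usually needs many round trips.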
What is a bot?
A bot is a software application that operates on a network. Bots are programmed to automatically perform certain tasks.
Bots can be good or bad.
Good bots conduct useful tasks, like indexing content for search engines, detecting copyright infringement, and providing customer service.
Bad bots conduct malicious tasks, like generating fraudulent clicks, scraping content, spreading spam, and carrying out cyber attacks.
Whether they're helpful or harmful, most bots are automated to imitate and perform simple human behavior on the web at a much faster rate than an actual human user.
For example, search engines use bots to constantly crawl web pages and index content for search, a process that would take an astronomical amount of time for any human user to execute.
The real privilege of working at Mozilla is that we're a mission-driven organization.
What that means is that before we do things, we ask what's good for the users, as opposed to what's going to make the most money.
Mozilla's values are similar to Cloudflare's.
They care about enabling the web for everybody in a way that is secure, in a way that is private, and in a way that is trustworthy.
We've been collaborating on improving the protocols that help secure connections between browsers and websites.
Mozilla and Cloudflare collaborated on a wide range of technologies.
The first place we really collaborated was the new TLS 1.3 protocol, and then we followed it up with QUIC and DNS over HTTPS, and most recently the new Firefox private network.
DNS is core to the way that everything on the Internet works.
It's a very old protocol, and it's also in plain text, meaning that it's not encrypted.
And this is something that a lot of people don't realize. You can be using SSL and connecting securely to websites, but your DNS traffic may still be unencrypted.
When Mozilla was looking for a partner for providing encrypted DNS, Cloudflare was a natural fit.
The idea was that Cloudflare would run the server piece of it, and Mozilla would run the client piece of it, and the consequence would be that we protect DNS traffic for anybody who used Firefox.
Cloudflare was a great partner with this, because they were really willing early on to implement the protocol, stand up a trusted recursive resolver, and create this experience for users.
They were strong supporters of it. One of the great things about working with Cloudflare is their engineers are crazy fast.
So the time between we decide to do something, and we write down the barest protocol sketch, and they have it running in their infrastructure, is a matter of days to weeks, not a matter of months to years.
There's a difference between standing up a service that one person can use, or ten people can use, and a service that everybody on the Internet can use.
When we talk about bringing new protocols to the web, we're talking about bringing it not to millions, not to tens of millions.
We're talking about hundreds of millions to billions of people.
Cloudflare has been an amazing partner on the privacy front.
They've been willing to be extremely transparent about the data that they are collecting, and why they're using it, and they've also been willing to throw those logs away.
Really, users are getting two classes of benefits out of our partnership with Cloudflare.
The first is direct benefits. That is, we're offering services to the user that make them more secure, and we're offering them via Cloudflare.
So that's like an immediate benefit these users are getting. The indirect benefit these users are getting is that we're developing the next generation of security and privacy technology, and Cloudflare is helping us do it.
And that will ultimately benefit every user, both Firefox users and every user of the Internet.
We're really excited to work with an organization like Mozilla that is aligned with the user's interests, and in taking the Internet and moving it in a direction that is more private, more secure, and is aligned with what we think the Internet should be.