Cloudflare TV

All about Cloudflare Hardware, Impact, and happy holidays!

Presented by João Tomé, Rebecca Weekly, Andie Goodwin, John Graham-Cumming
Originally aired on 

In this week's program, we go a bit longer than usual to talk about Cloudflare’s hardware, our Impact Report, and the importance of our blog, but also to wish everyone happy holidays.

João Tomé initially is joined by Rebecca Weekly, our VP of Infrastructure and Head of Hardware Systems Engineering. We delve into Cloudflare’s perspective on hardware, GPUs, CPUs, and the needs of AI services and applications (you’ll learn about AI inference). We also discuss our Cloudflare Gen 12 Server (Bigger, Better, Cooler), Moore’s Law, and what to expect in hardware and AI in 2024.

Next, Andie Goodwin explains the Cloudflare Impact Report, which launched this week, and some of the most impactful projects and endeavors from our company in 2023. We also share some holiday and Christmas stories. Additionally, our Portugal office has a holiday message to everyone.

Last but not least, in the short segment "A Bit of History," our CTO, John Graham-Cumming, discusses the importance of Cloudflare’s blog and how it has shaped the company's culture over the years.


Transcript (Beta)

Hello everyone and welcome to This Week in Net. It's the December 20th, 2023 edition. And this week we're going to talk about hardware.

And we have a special guest to do just that.

I'm João Tomé, based in Lisbon, Portugal. And with me I have Rebecca Weekly, our VP of Infrastructure.

How are you, Rebecca? I'm good. I'm good. Seeing the tail end of the year coming about and, you know, hoping all of our projects land in the midst of the United Colors of Benetton codes.

Red, orange, you know, thank goodness there's not a yellow or a green at this point.

True. A lot going on in terms of security, reliability.

All of those processes are ongoing in a sense.

You are based where in the US? With the Peninsula crew here in the Bay Area.

But yeah, I'm in the San Francisco office. And it's the end of the year, almost Christmas.

Do you celebrate Christmas or are there any other celebrations?

I celebrate all the things. So we have, you know, all sorts of different decorations all over our house.

We really try to be as inclusive as possible.

Is it important for you, the Christmas time, this time of the year, presents, all that?

I love just the ability to take a step back and there's inevitably time to reflect and think through what has happened.

You know, it's been, I think, a tough year for the world and you can't stop really, you know, looking at that, feeling that.

And particularly, I think it's an important time for us as a company to step back and think through where do we want to go from here?

How do we want to continue to scale and serve our customers? And so, you know, it's always a good time when we have LDWs to like fix things, work through the backlog, try and make sure, you know, we've documented ways to help our other people, whether in our own company or within our customer base.

So I always think it's one of the most interesting times to sit back and reflect and I love sunrises, new days, new years.

There's something about, you know, the denouement and then coming out of that dark period, you know, shortest day of the year into what is new and what is fresh.

I find a lot of joy in those moments. And you really enjoy the outdoors, right?

I do. I do. I'm a big trail runner, big trail, everything.

Because I see your Twitter account and you definitely show amazing images from the Bay Area there.

It's just here. I mean, this is like, it's like 10 minutes from my house and I do not live in the hills, you know, on a daily basis.

So it's one of the blessings of this area.

For those who don't know, what does the head of hardware systems engineering do, really?

Well, I think maybe I'll put the analogy out this way.

You know, we are at our heart, a network, a global network that is spread over 100 plus different countries.

You have 300-plus cities, 12,000 interconnects to other networks.

Fundamentally, all of that happens on servers.

Those servers are connected through various network backbone components.

There's switches, there's routers to make all of that work. And somebody has to make sure the hardware is doing what it's expected to do, that it has the right version of the BIOS and the firmware.

And can boot to whatever operating system, you know, those in SRE have chosen and production engineering have chosen to support.

Whether it's a net OS or obviously Linux. So there's a lot of work involved in making sure those systems meet the needs of our teams, are secure, reliable, you know, and ensuring that they are going to serve as efficiently as possible.

So, I mean, if you go look at what has happened in the server space in the last five years, you know, it's more than a 65% improvement in performance per TCO dollar and per watt.

That's going to get harder to deliver as, you know, process nodes are getting smaller and smaller and leakage is pretty high from a power perspective.

So there's interesting techniques and accelerators and other domains that we're starting to explore to try and ensure that we are continuing to deliver the Internet to others in a secure and safe fashion.

You joined Cloudflare in 2022, in April, more than a year ago, if I'm not mistaken.

Yeah, it's almost two years, you know, and it's been an incredible journey.

It's a really special company. I think, you know, when you think about what you get to do every day, I like to be a net force for good, right?

And I fundamentally believe this company serves a net force for good in the world.

So what's better than being able to do that? You joined with a lot of experience with you.

You came with a lot of experience before.

Are you saying I'm old? Experience is not old. That's my new motto. Especially in the semi.

I've built hardware for a long time. In this area, you were mentioning before the past few years, the evolution of the past few years.

And there's a lot of discussion about the Moore's law in the hardware part of the industry.

It's slowing down, right? What is the landscape of the last few years in terms of servers, in terms of hardware, with new additions now with AI also playing a role in terms of CPUs, GPUs?

So, so much. Oh my gosh. So Moore's law was effectively the concept that we would see a doubling of transistor density every 18 months.

And, you know, that stayed with us for a very long time, like north of 50 years that we've had Moore's law and that Moore's law has been scaling very effectively.

And actually, interestingly enough, if you just measure it by the density, we are continuing to see scaling factors for density.

The challenge is that the performance is not scaling with that density.
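As a back-of-the-envelope illustration of the density arithmetic being described, here is a minimal Python sketch. The 18-month doubling cadence is the figure quoted above; the time horizons are just examples, not data from the conversation.

```python
# Minimal sketch of the density-doubling arithmetic quoted above.
# The 18-month cadence comes from the conversation; the horizons are examples.

def density_multiplier(years: float, doubling_period_years: float = 1.5) -> float:
    """Growth factor in transistor density after `years` of steady doubling."""
    return 2 ** (years / doubling_period_years)

if __name__ == "__main__":
    for years in (3, 15, 50):
        print(f"after {years:>2} years: ~{density_multiplier(years):,.0f}x the density")
```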

And why? I mean, there's always, every problem has multiple facets.

But one of the challenges really is as you scale cores for general purpose compute, you end up having to do more and more clever things to get performance.

So you do things like speculative execution to try and assume you know where the next step of that particular core is going to go.

But the more cores you have, the more complexity there is, the more likelihood that you're going to have made a misprediction.

And then you add the domains of, you know, new security exploits that have been used to take those hints and turn them into backdoors into, you know, what is actually happening on those processors.

So there's a lot of things, from, you know, multiple cores to multiple threads to, you know, again, techniques to improve the performance of general purpose compute, that have led to branch misprediction challenges, misses, you know, cache hit rates going down.

That is not unexpected, right?

If you have, I don't know, a 128-floor building and only two elevators, you're going to have a pretty slow rate of getting to the top floor.

You know, you need to continue to expand the number of elevator bays that are available to you and staircases and escalators and, and, and.

And unfortunately, there's a point at which it's just a really long time to empty that building.

And that's really what we're seeing with general purpose computing.

It's still a fantastic domain.

We continue to see opportunities to shrink process nodes, to improve techniques.

And I could talk for days about, you know, different ways in which people are exploring, you know, in some sense, disaggregation, even of centralized computing.

So, you know, CXL, which is Compute Express Link, is about, you know, in some sense going back to a northbridge/southbridge kind of an architecture for CPUs, where you disaggregate the memory controller.

That sounds horrible.

Like everything has been integrated for so many years to make things go faster.

But fundamentally, if you're going to spend so much time shunting around between these different cores and these different memory controllers, maybe it's better to assume you're going to have higher latency, but give yourself so much more capacity.

And there is a loaded latency curve that correlates with having higher capacity, or you can change your buffer width, or you can be more specific to the use case that you're in.

So lots of things happening in a generalized compute domain to try and improve performance.

And then, of course, I would be derelict in not mentioning, you know, alternate architectures.

So we're seeing a move from more CISC-based architectures to more RISC-based architectures.

I started my career, not to date myself, at Silicon Graphics, which was, you know, very committed to MIPS, which is one of these RISC-based architectures.

I know a lot of folks here looked at Sun, started at Sun, thought about Sun, maybe not here specifically, but here in the Bay Area.

And they too were deeply committed to that RISC approach.

Very simple processor. So really focus on, you know, trying to do as little as possible, as efficiently as possible.

And then really focus on your IO speed throughput to be able to have coherent interconnect, you know, small, dumb, low power processors that shunt bits to where you need them to be with intelligent, you know, software doing what's really interesting.

And in some ways, what I just described is kind of our global network, right?

Like, so we see, yeah, we see a lot of opportunity where it's, you know, tons of packets, but they're not necessarily the most performant packets.

Now I say that about our general, you know, services.

Obviously things like workers AI, things like R2 completely changed that game, right?

And we'll see increasingly more heterogeneity to support those kinds of services, even as we continue to try and, you know, reduce the power, increase our efficiencies in the generalized purpose, you know, general purpose compute space.

There's more use cases now possible than one year ago. Although one year ago, ChatGPT was already around, generative AI was already starting.

November 30th, right?

Don't quote me on that, but I'm pretty sure. It was, it was, I can confirm.

And in that sense, of course, that opened a new door in terms of more companies being interested in this sector.

Of course, it's not a new sector, but at least in terms of hype, in terms of interest, it clearly grew after that.

We launched, a few months ago, this GPU capacity for Workers AI, which means having AI inference closer to the users, in more than 100 cities by now, right?

Yeah, we're over 112, last I looked.

So for those who don't know, first, what does that mean?

Having GPU capacity in our data centers spread out throughout the world. What does that mean for those who use the workers AI platform in a sense?

Well, I should defer to, you know, Ali and Philip and Rita and all of the folks who are building the software services for, you know, all the areas in which we're trying to improve our workers capability for our users.

From a GPU perspective, you know, you mentioned generative AI, you know, yes, ChatGPT sort of transformed the world this time last year.

But those kinds of models, transformer-based models, have been around since 2017.

And we've been seeing how that scales and sort of pushes with just the parameter counts, the architecture.

So back to, you know, what I was mentioning earlier about what CPUs do well, and then where they break: deep parallel pipelines. Neural networks end up having sort of a weighted structure that lends itself to deep parallelization.

That is where we start to see CPUs break.

And it's not actually, for most cases, the inference of models, right?

Inference is when you use them to predict something, right? You train a model on a whole huge set of data to like figure out, okay, what is that word that they said?

Or what is that picture that just showed up? Or, you know, name your favorite flavor from a use case.

It's the brain power, in a sense. It's not happening in your computer when you use ChatGPT.

It's not using your computer to put the output there.

It's using brain power somewhere else in the server, right? Yes, to train the model.

Now, it is somewhere relatively close to you inferring from your statement.

So let's say you go into ChatGPT and you say, tell me a bedtime story about X, which I do all the time with my children.

Because I used to be good at telling stories, but like after every night during COVID, I started running out of cool ideas.

Yes, so I take advantage of ChatGPT. But there's what we call the time-to-first-token, right?

That time-to-first-token, where my question is fed into a computer somewhere that is interpolating against the model what a response should be and generating that story.

That is the time-to-first-token. And when you talk about inference, that mean time-to-first-token is sort of the metric of merit.
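To make the metric concrete, here is a minimal sketch of how one might measure time-to-first-token against a streaming inference endpoint. The URL and payload are placeholders (not a real Cloudflare API); the point is only where the clock starts and stops.

```python
# Minimal sketch: measuring time-to-first-token (TTFT) against a hypothetical
# streaming inference endpoint. The URL and payload below are placeholders,
# not a real API; the point is only where the clock starts and stops.
import time
import requests

ENDPOINT = "https://inference.example.com/v1/generate"  # hypothetical

def time_to_first_token(prompt: str) -> float:
    """Seconds from sending the request until the first streamed chunk arrives."""
    start = time.monotonic()
    with requests.post(ENDPOINT, json={"prompt": prompt}, stream=True, timeout=60) as resp:
        resp.raise_for_status()
        for chunk in resp.iter_content(chunk_size=None):
            if chunk:  # first non-empty chunk means the first token(s) arrived
                return time.monotonic() - start
    raise RuntimeError("stream ended before any token was produced")

if __name__ == "__main__":
    samples = [time_to_first_token("tell me a bedtime story about a fox") for _ in range(5)]
    print(f"mean TTFT: {sum(samples) / len(samples) * 1000:.0f} ms")
```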

And in general, in hardware, we never will know everything you want to do in software.

It's not our job. Like, we love you. We want to understand what you need.

But fundamentally, what we have to do in order to do our jobs well is understand what your metrics of merit are.

Requests per second per TCO dollar, tail latency, some sort of: I need to have this within 10 milliseconds of foo, right?

These are the factors we look for. And then as we tune what we're selecting for CPUs, for GPUs, we're measuring those outputs and saying, okay, this is what you saw historically.

Here's the new one. Is it within wiggle room on your countermeasure?

So I mentioned before, requests per second per TCO dollar at 10 milliseconds of tail latency.

You have a measure and a countermeasure.

I can do all sorts of things to make you more efficient. But if I don't check the latency, you're not going to have a great user experience, right?

So you have to have both. It will take a long time to do the perspective. Of course.
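A small sketch of that measure-and-countermeasure idea: rank options by requests per second per TCO dollar, but only accept the ones inside the tail-latency guardrail. All numbers here are invented for illustration, not Cloudflare data.

```python
# Sketch of the "measure and countermeasure" idea: rank platforms by requests
# per second per TCO dollar, but only if they stay inside a tail-latency
# guardrail. All numbers below are made up for illustration.

def efficiency(rps: float, tco_dollars: float) -> float:
    """The measure: requests per second per TCO dollar."""
    return rps / tco_dollars

def acceptable(p99_latency_ms: float, guardrail_ms: float = 10.0) -> bool:
    """The countermeasure: tail latency must stay within the guardrail."""
    return p99_latency_ms <= guardrail_ms

candidates = {
    # name: (requests/sec, TCO $, p99 latency ms); illustrative values only
    "gen11": (120_000, 9_000, 8.2),
    "gen12": (150_000, 9_500, 7.9),
    "cheap_but_slow": (160_000, 8_000, 14.5),
}

for name, (rps, tco, p99) in candidates.items():
    verdict = "ok" if acceptable(p99) else "fails latency guardrail"
    print(f"{name:>15}: {efficiency(rps, tco):.2f} req/s per TCO$  ({verdict})")
```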

Yeah, exactly. So that's what we're measuring for. So again, in the space of inference, we have constraints in terms of power, right?

Our edge network is widely dispersed.

We have everything from 4.5 kilowatt racks to 15 kilowatt racks. A GPU like an H100, we're talking over 700 watts just for the GPU, let alone the CPU it's connected to and the fans and the memory and everything else, right?

These are over kilowatt systems.

Some of them, like if you go buy an MGX, we are talking multiple kilowatts in a 6U box.

These are not things that you can install everywhere in our global network.

That's not what we're built for. That's not what we're good at at our heart.

So how do you find the intersection of what we are good at being low latency to edge users with what we can actually deliver in an effective fashion?
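A rough back-of-the-envelope of the constraint being described: the rack budgets and the "over a kilowatt" GPU node come from the conversation, while the existing server count and per-server draw are assumptions made up for illustration.

```python
# Back-of-the-envelope sketch of the edge power constraint described above.
# Rack budgets and GPU-node wattage are round numbers from the conversation;
# the existing server count and per-server draw are assumptions.

RACK_BUDGETS_KW = [4.5, 10, 15]   # range of edge rack power budgets
EXISTING_SERVERS = 8              # assumed servers already in the rack
WATTS_PER_SERVER = 600            # assumed per-server draw
GPU_NODE_WATTS = 1_200            # "over a kilowatt" GPU system

for budget_kw in RACK_BUDGETS_KW:
    headroom_w = budget_kw * 1000 - EXISTING_SERVERS * WATTS_PER_SERVER
    fits = headroom_w >= GPU_NODE_WATTS
    print(f"{budget_kw:>4} kW rack: {headroom_w:>6.0f} W headroom -> "
          f"{'GPU node fits' if fits else 'GPU node does not fit'}")
```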

So when we were looking at the GPU project, what we really wanted to find out was, okay, we do a lot of inference today.

We do a ton of inference. If you look at bots, if you look at WAF ML, all of these are doing inference at our edge on CPUs.

So when Rita and this team started working together with Celso, with all the other folks, they were doing it on standard CPUs with Constellation AI back in May, right?

The challenge was that couldn't run what people were most excited about with the hype cycle of generative AI.

LLMs are just so big, you need so much more memory allocated to them to be able to store them and still get the performance.

So we started looking at small GPUs that could do LLMs, that could do Llama, and really help us expand, but within the power footprint that we had at the edge.

So we ended up with the L4s. The L4s are the lowest end GPU from NVIDIA. Why did we choose that?

Because we could. Because at the time we were thinking we had to run any model.

I think at this point we've gotten much more specific about which models we'll run.

But I feel like every day Sven and Celso and Phillip and Jesse are like, can we run the Mistral models?

What about the Hugging Face models?

So, that flexibility. Recently we have been updating the models available.

Of course. The Meta ones, the Llama ones, the Mistral ones. Oh yeah, every day.

Yeah, every day. Exactly. And unfortunately, or fortunately, however you want to think about it, NVIDIA is so immersed in the ecosystem because they are definitely the largest incumbent in this domain.

Most models come out of the box working in an NVIDIA ecosystem.

So we are starting to test and play around. I'll say play around with AMD GPUs, the MI300, the MI250.

And I think we'll look at lots of other options.

We've talked to a bunch of startups in this domain, as well as obviously the Grafana.

Grafana, that's funny. Gaudi. I was looking at Grafana graphs before this.

Big surprise. But, you know, different options that are out there.

There's lots of silicon that's out there. Most of the smaller startups are really focused on either LLM specifically or the training market because they see the compute density and domains where they can win.

And our approach is more classic Christensen.

Like we're not trying to take over CoreWeave or Amazon or Oracle on arbitrage for NVIDIA parts.

Like H100's lead time is over 48 weeks. A100's, the one from a year and a half before, is literally 52 weeks.

It's a long time. Impossible to get these parts, nor is that necessarily going to give us anything differentiated or interesting.

So for us, it's about workers. It's about integrating with interesting solutions quickly to support developers.

Fundamentally believing not every customer is going to want to do their own foundational models.

Most of them want to use a standard model and integrate it into the service experience on their website.

Or integrate it into, you know, better dynamics and control for something in their network traffic or, or, or.

And again, the product folks would know far better than me all the different use cases.

But that was really, you know, when we got the constraint space and the timeline, right, I had eight weeks basically to do the entire project.

That meant, you know, okay, we're going to get scrappy and do what we can and retrofit a server we had just designed, which was the Gen 11 server.

And use OpenBMC to change our, you know, basically fan algorithm.

Retrofit with a riser card with a bracket that like literally got machined at, you know, a spot down here in San Jose.

I mean, it was, it was totally an all hands on deck kind of a project to get these things done.

It was amazing. Just for perspective, this means that developers that are using our Workers platform are already building AI, generative AI, LLM services, using our network already.

That's already been working.

And we've been having people all over the world putting these GPUs in our data centers, right?

So that's also an amazing perspective.
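For readers curious what "using Workers AI" looks like in practice, here is a minimal sketch against the Workers AI REST API roughly as it was documented around this time. The account ID, API token, and model name are placeholders and assumptions; check the current Workers AI docs for the exact endpoint shape, payload format, and model identifiers.

```python
# Minimal sketch of calling Workers AI over its REST API, roughly as documented
# at the time of this episode. Account ID, API token, and the model name are
# placeholders/assumptions; verify against the current Workers AI docs.
import os
import requests

ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]   # placeholder: your account ID
API_TOKEN = os.environ["CF_API_TOKEN"]     # placeholder: token with Workers AI access
MODEL = "@cf/meta/llama-2-7b-chat-int8"    # assumed model name from the 2023 launch

url = f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL}"
resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"messages": [{"role": "user", "content": "Tell me a bedtime story about a fox."}]},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())  # the inference itself runs on a GPU in a nearby Cloudflare location
```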

And we wrote two blog posts related to this, if I'm not mistaken, this one, how we used OpenBMC to support AI inference on GPUs around the world.

Should we open that one?

Sure. You were mentioning this one in specific before, right?

Absolutely. So this one was really about, you know, taking that Gen 11 server that we had just designed.

And actually we had designed, you know, and poor Aki might be mad at me, but we had originally designed this server to give ourselves a larger storage footprint at our edge, because you may be familiar with our cache product.

You know, I think it's our second highest revenue grossing product here at the company.

That solution really does need a certain amount of storage to be able to have, you know, excellent retention rates and competitive retention rates.

So you may have remembered a blog from last year where we were talking about, you know, competitiveness with Fastly and others in that domain.

And one of the outcomes was that we decided to design a higher storage server for our edge for that team and really to focus in on higher retention.

So we had just done this server.

We started ramping it in kind of the middle of this year. And all of a sudden it became the one, because it was PCIe Gen 4 compliant and capable, with, you know, a modular form factor and design, of being augmented and retrofitted with OpenBMC out of the gate.

We could make the changes to our firmware to be able to add this card in, to be able to test it, to change the fan algorithms, and then to roll it out in an effective fashion across our edge.
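As a generic illustration of the kind of thing a BMC fan algorithm does (this is not Cloudflare's actual OpenBMC policy, and the breakpoints are invented), a firmware change like the one described typically adjusts a temperature-to-duty-cycle curve along these lines:

```python
# Generic sketch of the kind of fan policy a BMC controls: map the hottest
# sensor reading to a PWM duty cycle. This illustrates the concept only; it is
# not Cloudflare's actual OpenBMC fan algorithm, and the breakpoints are invented.

FAN_CURVE = [
    # (temperature C, fan duty %), interpolated linearly between breakpoints
    (30, 20),
    (50, 35),
    (70, 60),
    (85, 100),
]

def fan_duty(temp_c: float) -> float:
    """Return a fan duty cycle (percent) for the given temperature."""
    if temp_c <= FAN_CURVE[0][0]:
        return FAN_CURVE[0][1]
    for (t0, d0), (t1, d1) in zip(FAN_CURVE, FAN_CURVE[1:]):
        if temp_c <= t1:
            return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)
    return FAN_CURVE[-1][1]

if __name__ == "__main__":
    for temp in (25, 45, 72, 90):
        print(f"{temp} C -> {fan_duty(temp):.0f}% duty")
```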

You know, OpenBMC inherently doesn't make us any more secure.

You know, it's just an alternate way of, it's just an alternate operating system effectively that you can run on your baseboard management controller.

But what it allows us to do is control our own destiny.

And so because we've had such an incredible edge team under Ignat doing ERR, you know, we have a really great control mechanism for rolling out changes to our edge fleet and rolling them back if they're wrong.

That allows us to take the agility of supporting our own firmware development and design, integrating changes, and then rolling them out safely to our edge.

And so because that ecosystem was here, we were able to kind of leverage that to move towards an open source footprint, be the masters of our own destiny in this domain, which lets us make changes to our systems without having to go back to ODMs, without having to go back to the silicon providers, to integrate the fixes and wait usually nine to 12 months.

We've seen this repeatedly with security patches, where by the time AMI does the integration and their testing on the platform, whatever platform from whatever ODM we're working with, and then we roll it out, it can be a really long lead time.

And that's time that there is a security patch out there that we have not done, that we have not integrated, and it means we're vulnerable.

So that's why I love the open source element of this.

That's why we adopted it: to improve our security position.

Again, not because OpenBMC is necessarily better, but because we can be faster in integrating it for our specific platforms because of the ecosystem that we've built here.

I say "we" like my team did it. I really give full credit to Ignat and team, which allows us to move that much faster.

It makes sense and gives that flexibility and control that is required here.

And we also have this one called Cloudflare Gen12 server.

This is such like my heart expands on this topic. Because when I first came to Cloudflare about two years ago, we were in the middle of doing our Gen12 design and we'd sort of started on the treadmill, assuming we should do the exact same form factor, the exact same design that we had done year after year after year.

And as I started looking at it and realizing that these were 350, 380 watt CPUs, which was going to push the average power of our servers, you know, an additional basically 200 watts above where we had been before.

We started to ask the question, like, is a half-width 1U system logical?

I don't think so.

Because again, we have 42U racks, we have 48U racks. We're not going to get all of those sleds at north of 600 watts in if we have a 10 kilowatt rack.

And so our job as a team is to optimize the number, basically the cores per watt per rack.
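A toy sketch of that rack-level framing: compare form factors by how many cores you can land in a rack under a fixed power budget, rather than node by node. The node specs below are invented for illustration, not Cloudflare's actual Gen 11 or Gen 12 numbers.

```python
# Sketch of the rack-level optimization described here: maximize cores per rack
# under a fixed power budget, rather than optimizing one node in isolation.
# All node specs below are invented for illustration.

RACK_BUDGET_W = 10_000   # e.g. a 10 kW rack

nodes = {
    # name: (cores per node, watts per node); illustrative values only
    "half-width 1U": (64, 650),
    "full-width 2U": (128, 1_100),
}

for name, (cores, watts) in nodes.items():
    nodes_per_rack = RACK_BUDGET_W // watts
    print(f"{name:>14}: {nodes_per_rack} nodes, "
          f"{nodes_per_rack * cores} cores, "
          f"{nodes_per_rack * cores / RACK_BUDGET_W:.3f} cores per watt at the rack level")
```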

So when we looked at it at the rack level instead of just at the node level, we realized we were spending more money on our cooling design, we were spending more money on lower-loss materials, over-constraining our form factor, when we had plenty of room in the vertical dimension.

So we worked really hard with Don Callahan, who I can only give like kudos, enough kudos to, to really understand our global footprint.

We had never actually mapped all the different rack powers and configs and sizes across our very large edge fleet until he dug in and started doing that work.

So huge kudos to Don Callahan, who's not on this blog, but deserves all the kudos.

And we really started to take his data and realize, hold on, we're going to go to a larger form factor.

And that means we can use bigger fans, which means we can actually use less power going to the fans to cool the system, because we have more general air to vent.

And because of that, we actually reduced our power per sled and stayed within basically the overall footprint that we had in these sites.

And I think we also were able to kind of branch that forward to planning for the future, highlighting, hey, the trend in general purpose compute is going up, we need more 15 kilowatt rack sites, we need more sites over time, that will give us more headroom, especially as we add R2 nodes.

Especially as we add, you know, GPU nodes, each of which is going to take more and more power.

So this allows us to, you know, look at vertical form factors for those add-in cards over time, do a lot more than being in this 1U half-width box, which is very serviceable.

I mean, we made the decision for good reasons before. But as our needs are changing, we need to augment and change our design.

Makes sense.

And it's quite interesting to see that there are always improvements to be made. More data, more knowledge will allow better improvements, in a sense, or efficiency, for all sorts of things, which is interesting.

This is engineering, João.

You can see it in my face, there's nothing I love more than looking at something, assuming we have the right answer.

And then like getting schooled. Oh my goodness, we're wrong.

We don't know that. What do we do to fix it? How can we make something happen?

We never know how things will evolve. Our knowledge of things will evolve during a full calendar year, which is interesting.

This brings me to my last question, which is 2024 is right around the corner.

In this area, where do you see 2024 going in the hardware area in terms of challenges, but also opportunities?

I think it's getting hot in there. So I'll come back to the thermal elements. We have a varied edge, and we are trying to deliver exabytes of storage for R2 with higher security controls and data sovereignty in more and more regions.

So this is going to be the year of storage and really thinking through how do we bring that onto our own hardware.

More efficiently, more effectively, more reliably.

When we talk about storage, like the cardinal sin back to the metrics of merit is losing data.

So whereas most of our edge network has been built for low latency and high performance at that low latency, you know, within the constraints of: it's not the highest-end compute, it's not AI.

This is almost a flip. It has nothing to do with latency.

People will wait a long time to get their data. You just can't lose it. So I think that's a philosophical change in terms of how we build our edge, how we think about our edge and why that's so critical in what our service to our end users is.
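As a crude illustration of why "don't lose data" dominates the storage math, here is a toy durability calculation. The probabilities are invented, and it ignores correlated failures, repair times, and erasure coding, which real systems rely on; it only shows the shape of the argument.

```python
# Rough sketch of why durability dominates storage design: with n independent
# replicas each having annual failure probability p, the chance of losing an
# object is roughly p**n. The numbers are invented and ignore correlated
# failures, repair, and erasure coding.

def loss_probability(p_single: float, replicas: int) -> float:
    return p_single ** replicas

for replicas in (1, 2, 3):
    print(f"{replicas} replica(s): ~{loss_probability(0.01, replicas):.0e} annual loss probability")
```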

So I'm super excited to see how we transform the world with storage. And I couldn't be more excited to work with folks like Chris Evans and Deval, et cetera, in delivering that service.

I think we're going to learn a ton. I think we're going to make some mistakes.

I pray we lose no one's data. I don't think we will.

I think we have a lot of ways in which we're trying to back that up to make sure we're doing the right thing.

But that will be very exciting. And I promise you, João, nobody has ever said that storage is what I'm most excited about.

But I really do think there's a ton of changes for us as a company coming with the R2 space next year.

I think there's a ton of spaces where we are improving our security posture.

And it is so exciting. I mean, we talked earlier about how much I always post pictures of like trails and, you know, one of my favorite things in life is running.

And, you know, you never get faster just running fast. Sometimes you have to slow down.

You have to go up that hill. You have to work on strength training and other things so that you can then come back and run fast.

You got to rest.

You got to pull back the arrow in order to launch it. Right? Focus. Take time to focus and then go at it.

So this is a domain where there's so much we can do.

There's so much we can do and that we should do to serve our customers better, to serve the Internet better, to serve our core purpose.

That is critical. So it's going to be painful.

There's going to be some bumps. It always is. Like going fast is hard, but it will make us better.

I fundamentally believe that. So there's things we've done in the server side in terms of, you know, going to triple levels of encryption from a key signing perspective that I'm really, really proud of.

I think there's a ton of work we need to do in our network side to make sure that we are that much more secure.

And, you know, afterwards we will be a much better company.

So it is the year of security. It is that. Absolutely. And then the last one is what is going to happen with AI.

So right now we're really built around an inference service with a mean time to first token prioritization where the lowest latency possible in serving that first response of the model is, I hope, going to be served by being disseminated to a hundred plus cities.

Arguably, there are models that will take more than 80 milliseconds just to return that first token, even on the home node.

So being within 50 milliseconds of latency, eyeball latency to that node, unclear.

Unclear that that is the issue. I think what we'll start to realize is as these models are bigger, as our use cases are bigger, the only way to scale performance is going to be more memory bandwidth, more memory capacity, higher IO capacity, which means bigger GPUs, right?

Bigger GPUs on more PCIe lanes actually able to serve these bigger models.

And that will either be because the models are bigger that our customers want to run or our customers start to want to get into fine tuning so that they are augmenting and changing the model, not just using standard models.

So we're not there yet, but I lurk in all the different chat channels because it's one of the things I love about Cloudflare's culture, how transparent we are.

It's coming. And that will take bigger GPUs and more data to really get to good spots.

And that will be, I think, in core tension to the first two objectives of high durable state for storage and security coming together with data sovereignty.

So I think our edge network is going to start looking a little bit different in '24 and a lot different by '27.

And that's super exciting. That's super exciting because it's the evolution of our company into not just content dissemination, but security and support to build services that connect every user at low latency everywhere in the world.

Makes sense.

And it's quite interesting to see the evolution. It's the evolution not only of the company, but of the generative AI perspective, new companies that are being built, and they're building new services.

It's quite an exciting time in this area, for sure.

And it's work in progress. It's growing. So let's see what 2024 will bring.

Last but not least, do we know the number of cities, locations where we'll have GPUs next year?

Is there like a perspective? Well, I just said my perspective, which is, you know, there's a tendency in hardware to want to build ahead of the demand.

But we don't know yet enough about our service needs. And if it's mean time to first token, and we're actually on smaller models, then disseminating more broadly will help us reduce eyeball latency.

This is our core business.

This is how we know how to build it. This is why we're in so many places. But if it ends up going the other direction where it's bigger models, larger data sets, now all of a sudden, the only way to support that with a faster response time is a bigger node.

And we're more limited in terms of sites where we have the power to put in those bigger systems.

So it may end up being we go to fewer places.

And that may be the best choice to deliver to our products' needs and our customers' needs.

And that's what I meant when I said earlier, there are philosophical changes when it comes to storage, when it comes to AI.

What we've built historically was around network dissemination, and very low latency for small workloads, lots of packets.

These are two different domains, they're fundamentally different, the way you build the systems, and what matters to our users are fundamentally different.

So I will be excited if we build zero new GPU sites next year, but build bigger GPUs, so that we can do fine tuning and improve the service experience.

But something you'll find about me is, I don't want to build it until I know the product team wants it and needs it; we can do smaller things, we can iterate faster.

Making big, big bets is exciting and thrilling.

And I'm super excited about what we did with GPUs to disseminate them this broadly.

Now we're in every region, we're in every core domain to be able to support.

But we need to let product team get all those systems online, learn, tell us what's wrong, so we can fix it for this next phase.

Feedback will be really important for sure, from the customers, from the product team.

That will be interesting.

And I remember, over this year and even last year, storage being something that customers ask about very frequently, in a sense.

Absolutely. And I think as we take on these bigger customers, as we become more secure, and we have FedRAMP going from moderate, even potentially to high, by '25, each of those new customers and those enterprise customers are going to have higher-level demands on data sovereignty, security, network telemetry to understand if and when, and redirection.

There's the new EU regulation on AI that possibly will make that perspective of data sovereignty inside the EU, in terms of AI, more relevant than ever.

So that will be interesting too.

Yeah. So it'll be, again, a natural tension between sort of what we've done breadth-first and potentially more depth in region.

And again, I think that will make us a stronger company.

So it's just sometimes you have to know when to jump, and when to like hold back and listen.

And I think we're in a listening phase, as we've made these big bets, to let them marinate and also let them get out of their hype cycle, right?

Like generative AI, a lot of these things are just toys right now.

True. But this is the year where it's so expensive, and the access is so limited, that I think you're going to see the CFOs go, hmm, what is our return on investment for this big expensive thing that we're doing?

True. Are we really getting value out of it?

And in that process, we will learn what we really need to build for everyone.

For sure. This was great, Rebecca. Hope you liked it. Thank you for the opportunity to talk about hardware.

We're kind of like air. I mean, you really notice if it's not around, but nobody really thinks about it otherwise.

True, true.

Thanks for giving us a chance. Learn a lot. Thank you, and have a good end of the year.

You too. Enjoy. Happy holidays. Happy holidays. And now we have Andie Goodwin, project manager at Cloudflare and related to Cloudflare Impact, right?

Hello, Andie. How are you? Hi. Yeah, thanks for having me on the show.

I'm excited to chat today. How are you? I'm good. Where does this show find you?

I'm in Austin, Texas, and it's a slightly chilly and rainy day. How's the weather in Lisbon?

The weather in Lisbon is actually great. A little bit cold, but sunny and a good day, actually.

So, we had this week, actually, our Impact Report. It's already recurring.

In December, we published this impact report. Since when do we do these reports?

This is the third annual report. We started in 2021, and it's grown over time, and we've been thinking about what do we want to achieve with it?

What do we want to highlight? So, this is really an exciting year, and this is also the best it's ever looked for design.

Yeah, it's pretty great, to be honest.

And why do we do it? What's the reason behind it? One thing that's useful is having one central repository where people can understand what is Cloudflare doing that's good for the Internet.

That ends up being really useful for our current customers, prospective customers, for people to kind of get a sense of what spaces we work in, as opposed to visiting 40 different pages on Cloudflare.com.

And then, also, really having an excuse to tell a story as far as what are we proud of in the year?

How did we try to make the Internet more secure, private, open, free, democratic?

Trying to figure out how to communicate to people what was important to us this year.

And then, also, just Cloudflare, as we get to be a bigger company, it becomes more and more challenging for everyone to know what everyone else is working on.

So, having a reason to kind of collect all these stories and get them out, I think, also really helps us push harder and think bigger.

Let me share my screen just to show a few highlights there. This is the press release we launched this week regarding some of the key milestones that are included in the impact report.

There's a lot about Project Galileo, vulnerable Internet properties that were protected.

But, also, there's this new kid in town, which is we shielded K-12 school districts in the U.S., right?

This is a new project. Yeah, and it's really exciting.

We're getting a lot of interest in it. And this is something that you'll see in headlines a lot, that schools can be attacked and that they have the students to protect, a lot of personal data to be careful about.

It feels like this is a really important area to be working in.

The impact report is available for those who want to download it.

What are the main highlights? We already said one, this new project, in a sense.

But what are the main highlights you would want to give people?

Definitely understanding that it's broken into three categories about how we see a better Internet: as being principled, for everyone, and sustainable.

If someone only has a couple minutes, I would definitely have them look at the infographic pages on Project Galileo and the Athenian Project.

The page on Cyber Safe Schools, that one right there.

And maybe also the page on Responsible AI.

Something about the impact report that I think is helpful to understand is it's not just us announcing successes and talking about what we're excited about.

It's also help educating. Part of the aim is helping to educate people on the issues that are preventing us from having a better Internet.

And I mean us as in everyone in the world.

Trying to figure out what issues should people be aware of and what work we're doing on them.

This one, I love this page on transparency.

Kind of understanding Responsible AI, that that's something, a topic, people are talking a lot about and kind of understanding what's at stake.

And then at the end of the report, we have our disclosures.

Those are related to global commitments we make.

So you can find out more information about the efforts Cloudflare puts into topics like privacy, customer data, diversity, things like that.

Actually, just a few weeks ago, Jocelyn from our Project Galileo was in the show and just explained the importance also of elections in 2024 that are coming.

And that's also present in this report in a sense, right?

Yeah, it's exciting. And these are topics that I didn't really know much about before I came to Cloudflare.

We're definitely seeing it more in headlines now as far as like election security and what's at stake.

But like for Project Galileo, you know, it never really occurred to me that, let's say, a nonprofit that helps LGBTQ youth, that people might want to take them down.

That they might want to attack them only to silence them.

I have so much better of an understanding now about how important it is for people to have a voice on the Internet.

And not just be that people that have the most money, the deepest pockets can protect themselves.

Like really looking out for an Internet for everyone.

There's a lot there. A lot of organizations that are being helped.

It's exciting too. Like when you think of journalists around the world who are reporting on corruption and fraud.

That there are people who don't want that truth out there.

And Cloudflare can help protect them so that they get to have a voice.

There's also an area for sustainability in terms of cutting emissions by moving to the cloud in a sense.

And there's a study, a 2023 study referenced here.

It found that moving to a cloud-based service can decrease related carbon emissions by between 78% and 96%.

So there's different numbers in terms of impact here.

Yeah, and then there's a pretty new announcement on this page that we're joining the Science-Based Targets Initiative.

So we'll be working on carbon reduction goals.

We have one page on our own emissions. I'm the one who gets to calculate those.

So that's something I will be doing in 2024 early on for our 2023 emissions.

And then you can also read about how we donate to tree planting projects.

Which is really fun and I love seeing the pictures each year. These ones are from Mexico and Portugal.

Oh, Portugal. That's right. Interesting. Pine forests.

Actually, we have a few blog posts that in a sense are related to policy, to cybersecurity.

And one is present in the impact report in a sense. It's an update on the project cyber safe schools we were mentioning before.

This is an update from Zaid actually, specifically.

Yeah, I'm excited to read his blog post.

He has been cranking these out and I need to get caught up on reading them.

And from my understanding, there's more schools now in that program. More can apply.

So it has been a tremendous success in terms of helping schools, which is always interesting.

There's also some comments here from some of those schools, which is also interesting to see in terms of impact.

Part of the value too, I think, is starting the conversation and helping people understand what's at stake.

And what is out there to help protect themselves.

It's really great to get so much attention on this.

And there's also who can apply. In this case, it's K-12 public school districts in the United States.

Up to 2,500 students in the district.

But we also have a few other projects that can help outside the US. Schools and hospitals and all sorts of things, in a sense.

Yeah, Project Safekeeping.

People should be looking into that if they're working in an area of critical infrastructure.

We'd love to get more applications on that and on all of our projects.

On the blog, last week we had the review; for those who didn't watch, they can see it there.

But we also have a blog post about how we're engaging with Australia's cybersecurity strategy.

And how we're all in. This is a blog post related to how a lot of, not only countries, but even regions like Europe,

are getting their cybersecurity strategies in place.

And that blog post goes there. And there's also this one, also from Zaid and Mike Conlow.

Return to US Net Neutrality Rules.

So, these are the open Internet principles that are behind net neutrality.

Which Cloudflare has long supported. So, we have some comments there for those who want to explore more of this policy area.

Definitely an important issue to read up on.

A lot to explore. Before we go, we still have time to learn a bit about your Christmas in the US.

Do you have any Christmas stories you want to share with the audience?

If you celebrate Christmas or other end of year celebrations.

One of my favorite Christmas memories is selling Christmas trees and wreaths with my grandmother.

I was maybe eight. She was doing it for a season. And it was so like, it was up in Maine.

It was very cold, but cozy. And just so special to be able to see the excitement of people coming to pick out their trees.

This Christmas, I'm hoping to take a hike on Christmas Day.

I'm thinking about a trifle I could make for dessert.

But I really need to get my act together on figuring out a main dish and sides.

So that I can eat more than just dessert. What are your Christmas plans?

Or any special Christmas memories for you? I have a few. But to be honest, for me, it's mostly because I lived there, and my parents still live there, in a forest type of house.

There's a forest outside. For me, Christmas, and returning to my parents' house, is returning to that forest.

So that forest, although there's no snow, unfortunately, there's cold.

And it's quite a good experience to feel inside nature during Christmas.

For me, it's about that. That's beautiful.

I'm hoping to get some gingerbread houses done this weekend, too. So let's wish everyone a Merry Christmas and a Happy Holidays.

Merry Christmas and Happy New Year!

Thank you, Andie. Thank you! And now it's time for A Bit of History.

A segment with Cloudflare CTO, John Graham-Cumming. So the Cloudflare blog has been around since the start of the company.

Actually, before the launch of the company, there was a blog post already.

So it's been around. And the company was very small at the time.

At the beginning, of course, as all startups usually are.

But the Cloudflare blog had initially, in the first years at least, a culture that was developing.

All about explaining what was happening on the Internet, but also explaining how we're building products specifically.

The deep dives, all that.

You started in 2011, but you knew the company before. How did that come about?

That culture of the Cloudflare blog meaning something? Internally, but also externally.

In a technical way. The question is, how do you get people interested in your company?

You could do advertising.

You could do marketing. But what do you need initially in a company? Obviously, you need customers at some point.

And you need awareness to get you those customers.

But more than anything, you need really smart people to build the product.

It's a recruiting problem in the beginning, often. I think one of the things with the blog was, at some point, we realized that the blog wasn't actually about marketing to our customers.

Customers, you're very welcome to read it. And I hope you enjoy it.

But the original idea was to market to engineers and hope they would think, Cloudflare seems like an interesting place to work.

I should apply. And that got us a lot of good people.

I mean, a lot. You have no idea the number of times I speak to someone who's joining Cloudflare.

And they'll say, the blog was one of the things that made me want to join.

And so, that was really the key insight, was write for other engineers.

Because then they'll be interested in your company.

And companies are often very secretive about what they're doing at a technical level.

And we decided we'd just do the opposite. We'd tell people what we're doing.

And there were two reasons for this. One is, it's interesting. If you're an engineer, you really want to read about other experiences.

Not a sanitized version.

Not a simplified version. But what is it like to really build a piece of software that does X at our scale?

And the other one is that we also realized that this stuff wasn't actually secret.

It wasn't actually the case that we weren't going to lose by telling someone how we built something.

Maybe someone would laugh and say, oh, why did you do it that way?

I've done it before, and I did it better than you.

Ha, ha, ha. I'm so clever. But no one's suddenly building a competitor because you're talking about how you do stuff.

And so, it really didn't matter.

And so, it just made sense just to talk about it. And, you know, we've really tried to stick to that.

That's why the blog is run by my department.

I'm the editor. It's not in marketing. And it will remain that way because it's a different community we're speaking to.

And, of course, customers love it because they get to read really technical stuff themselves.

So, welcome customers.

Welcome new employees. Please keep reading the Cloudflare blog. And there's another perspective for those curious.

May that be engineering students, but also other types of people.

Journalists, as I was, just wanting to discover a bit of the Internet.

Hey, what happened there? The Internet came down. This very popular service came down.

What happened there? And over the years, we've been talking about what Cloudflare sees in terms of some big service came down.

What are we seeing?

Why did that happen? So, there's an educative part of that. I mean, one of the things I tell people who write for the Cloudflare blog is our goal is to educate the reader.

We will look smart if we make that person feel smart. We make someone learn something.

And so, they will consider us to be smart people if we explain to somebody.

So, go out and educate people. And the other thing that shouldn't be discounted is the satisfaction somebody gets, like an engineer, in being able to talk about what they do.

Because very often, they can't. The company doesn't want them to talk about how they built something.

But it's very satisfying.

It's a little bit like open sourcing, which we do a lot of. It's, here's my work.

This is what I worked on. And hopefully, in years to come, people will have left Cloudflare and gone to other things.

They can point back to those blog posts and say, hey, this is part of my resume.

I built this thing at Cloudflare.

It's quite interesting. And I remember in 2012, actually, when Google had a big outage, Tom Paseka wrote a blog post about what Cloudflare was seeing.

That came to be a very popular blog post.

Journalists were tagging that blog post at the time.

So, that was a very specific use case of the blog as explaining what Cloudflare was seeing in terms of the Internet, right?

Well, and there was one about the Facebook outage.

That was, I think, our most popular blog post ever. It was, yeah.

And, you know, because people are wondering what's going on. And then they're interested to learn, oh, this is how the Internet works.

This is layers of the Internet.

This is the point at which it went wrong. There's this VentureBeat 2012 blog post about the Google outage mentioning Cloudflare's blog.

So, that was a use case that got attention.

And you were talking about the 2021 Facebook outage.

This was possibly the most popular, I think, in our blog. Well, one of the reasons this was popular is that we put this up before Facebook had fixed the problem.

And I think that, you know, people couldn't go to Facebook. So, they came to our blog and read about why they can't go to Facebook, right?

What's happening?

Whatever was going on. And what was funny, somebody from Facebook wrote to us while it was happening saying, yeah, that's a pretty good description of what we're facing.

We didn't know all the details internally, but we had figured out from looking at how the Internet was responding and what was happening, what this must be.

And they were, you know, we knew what they were fighting, which was really a messy problem for them.

And it was pretty much related to BGP and the Border Gateway Protocol.

And we've been seeing problems related to BGP over the years and outages from specific networks, ISPs, or even Facebook in this case.

As you know well, on Cloudflare Radar, we talk about BGP problems, hijacks, leaks, all that sort of thing.

So, yes, BGP is one of those three-letter protocols that helps keep the Internet working, but also occasionally stops it working.

Exactly. So, a bit of the Cloudflare blog, in a sense.

Just to wrap things up for the blog, from 10 years ago, 2013, up until now, of course, the company is much bigger, more products, different aspects in terms of customers.

How did the blog and its content change over the years?

Much more content today with Innovation Weeks and all sorts of tools, features, deep dives.

But in general, how much has it changed?

Well, I mean, there's more blog posts, right? I mean, this Birthday Week, we had 44 blog posts in five days.

We generally have more blog posts. I think the biggest change is that Cloudflare, over time, hired a very professional product management organization.

And you have more content, which is written by product managers, talking about their products in detail, on top of the engineers talking about how those things are built.

So, it's a combination now. You have a lot of engineering content, and you have a lot of product management content, explaining the products, how they're used, how they integrate, what they look like, how to use them, why we built them, and all that sort of stuff.

That's probably the biggest thing that's changed over time.

But there's something that's stayed on, like, make your blog posts curious enough for others, right?

Make it transparent.

Make it technical, if possible, right? Those are still around. We're targeting a reader who is interested in what we have to say, who is technical but not necessarily an expert in the thing we're speaking about, is likely a non-native English speaker, because we have a global audience.

And what we want to do is we want you to come away with technical knowledge you didn't have before.

So, that's in the backbone of the blogs.

Many blogs. Many blogs, yes. And there's that. There's the Cloudflare blog.

That's a wrap. Thanks, João. Happy Holidays!


This Week in Net
Tune in for weekly updates on the latest news at Cloudflare and across the Internet. Check back regularly for updates. Also available as an audio podcast!