ℹ️ CIO Week: Zaraz Tech Talk
Presented by: Yo'av Moshe, Simona Badoiu, Ruskin Constant, Yair Dovrat
Originally aired on May 15, 2022 @ 10:30 AM - 11:00 AM EDT
In this CIO Week segment, Cloudflare product managers and engineers will take a deep dive into the products and features we launched today.
Read the blog posts:
- Why Cloudflare Bought Zaraz
- Cloudflare acquires Zaraz to enable cloud loading of third-party tools
- Zaraz use Workers to make third-party tools secure and fast
Visit the CIO Week Hub for every announcement and CFTV episode — check back all week for more!
Transcript (Beta)
Hello, everyone. Thanks for joining us today.
We are the Zaraz team that just joined Cloudflare. Together with me are Yo'av, who was the CTO of Zaraz, and Simona and Ruskin, who joined us as our first and second engineers.
So, first of all, congratulations.
You are the team that built this magic technology.
So a big shout-out, and tell us a little bit about Zaraz: what it is, its architecture, how it works, etc.
So Zaraz is a third-party manager.
Basically websites nowadays they have loads of third parties.
They load things like analytics tools, different chatbots, conversion pixels, you name it, different widgets, really everything.
We define third parties by everything that isn't your code, basically.
That's usually coming from some external origin.
So if you fetch a script from somewhere else in order to do something on the website, that would be considered third party.
Got it.
So it's like not the font you are loading on the site, but what type of tools usually are we speaking about?
Yeah.
So it will be tools like Google Analytics, the Facebook pixel, Reddit pixel, Twitter pixel, Snapchat pixel, TikTok pixel.
Everybody has their own pixel nowadays.
Yeah.
Things like this. Okay.
And can you explain a little bit about the architecture of how it actually works, like what was built?
Sure, I can take that one.
I guess there's architecture from two sides, and one is what happens in the browser.
So normally those third-party scripts would just be loaded in the website code and then executed in your browser when you load the page.
And what's special about the approach in the architecture is that we have been able to offload a lot of what would normally run in your browser to run on a cloud machine instead, but crucially a cloud that is physically very close to where your browser is, so that it's still a very fast back and forth.
And that enables us to basically get out of the way of loading everything else on the website, which is what your users actually care about.
So yeah.
Thank you.
So I know because I was part of the team that we decided to build it on Workers.
Can you walk us through how you made the decision? Did you check other alternatives when deciding to build on Workers?
And a little bit on that front, please.
Yeah, that was when we started, really.
I think we started with the idea and we needed to figure out how to actually build it.
Now, the tools we were competing with, or the tools that were the closest ones in terms of offering, were traditional tag managers.
The way they work is that a customer goes into their dashboard and configures this and that.
The moment that user hits publish, a static JavaScript file is generated and then hosted on a CDN to be served super fast whenever it's loaded on a website.
So we had to compete with that. We thought, okay, our software needs to respond in a time that would be at least as fast as a static file being loaded.
And that's practically the fastest request there is on the Internet.
Nothing is faster than a static file on a CDN.
But we wanted to do more than that. We figured that we also had to have a server side, so we couldn't just have the static file.
We wanted to do different things, to do some operations on the server, in order for us to actually shift work away from the browser.
So we were actually looking at different options, like running Docker containers all around the world, or maintaining our own fleet of servers everywhere.
But it just looked impossible at the time. Then we stumbled upon Workers.
At the beginning we didn't believe that it would work, but we gave it a shot and very quickly realized this actually had the potential to work.
We could serve something that looks like a static file but carries a dynamic response.
And yeah, it just worked and we stuck with it.
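To make the "looks like a static file but is dynamic" idea concrete, here is a minimal sketch in TypeScript. It is not Zaraz's code: the `ToolConfig` and `VisitContext` shapes, the `renderLoaderScript` function, and the `window.__activeTools` name are all hypothetical. It only illustrates how a script body could be computed per request from a saved configuration; in an edge function, a string like this would be returned as the body of the JavaScript response.

```typescript
// Hypothetical shapes: none of these names come from Zaraz itself.
interface ToolConfig {
  name: string;           // label for the third-party tool
  enabledPaths: string[]; // path prefixes on which the tool should run
}

interface VisitContext {
  path: string;    // path of the page being loaded
  country: string; // assumed to be supplied by the edge runtime
}

// Decide per request which tools apply, then emit a small script body.
function renderLoaderScript(tools: ToolConfig[], visit: VisitContext): string {
  const active = tools
    .filter((t) => t.enabledPaths.some((p) => visit.path.indexOf(p) === 0))
    .map((t) => t.name);
  // JSON.stringify keeps the embedded values valid JavaScript literals.
  return (
    "window.__activeTools = " + JSON.stringify(active) + ";" +
    "window.__visit = " + JSON.stringify({ country: visit.country }) + ";"
  );
}

const configuredTools: ToolConfig[] = [
  { name: "analytics", enabledPaths: ["/"] },
  { name: "checkout-pixel", enabledPaths: ["/checkout"] },
];

const pricingScript = renderLoaderScript(configuredTools, { path: "/pricing", country: "DE" });
const checkoutScript = renderLoaderScript(configuredTools, { path: "/checkout/step-1", country: "US" });

console.log(pricingScript);
console.log(checkoutScript);
```

The browser receives what looks like one small static file, yet the two visits above get different bodies because the decision about which tools apply was made on the server.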
So may I ask if you can share with us some metrics on how performant this architecture is?
What are the current stats?
Like I'm saying, it's the most performant request you can get on the Internet, and it's really thanks to Workers in that sense.
So we're talking about a response time of about 10 milliseconds.
Yeah.
So it's really unnoticeable from the perspective of users. And as we're shifting to Cloudflare, by the way, even that gets faster or is eliminated in some cases.
So yeah, it's getting even better.
Cool.
So I want to get a little bit into like the developer kind of nerdy, geeky stuff.
And I was wondering, in your opinion: you just mentioned a few components and different parts of the architecture.
What was the hardest thing to build?
There can be many answers, of course.
But overall speaking, what was the hardest?
There are many answers, but in my opinion the most challenging was reverse engineering the tools that we are building now.
And I don't know how to enumerate exactly the reasons why this is the most complicated part, but you can imagine because we have to kind of reproduce what's happening in the browser on the server side.
Maybe my colleagues can help.
So can you walk me through the process a little bit, maybe mentioning a tool or two: how it works, how it's done today, and how you want to do it in the future now that we're part of Cloudflare?
So how it is done today... Yo'av, can you help me with this?
Yeah.
I mean, the whole investigating part, like I said, is very complicated, mostly because, well, the scripts we were trying to figure out like to learn and to replicate, they're almost always minified.
And so we need to find a way to make sense of what's actually happening there.
And I think the process, and Ruskin, if you have examples, feel free to jump in.
But the process is relatively different from tool to tool. So sometimes we would work very closely with the browser's debugger.
Sometimes we would actually focus on the actual network requests coming out from the browser.
Usually it's a combination of the two.
Often we would look at the user side of things in that tool.
So we would ask customers like, if you want to have support for tool X, give us access to X so we can see how it actually works behind the scenes.
But it is very different from tool to tool.
Yeah, it's definitely one of the most, sweating I guess is the word, parts of the job, in that it forces us to have a really intimate understanding of what is happening when you load up a website.
And there's something quite satisfying about just pulling on these different threads, opening up the devtools of your browser when a site loads and seeing all these different variables and bits of information: where are they being sent to, but also where are they coming from, and how are they being transformed?
And so it's in some ways a very pedestrian process, in that you just take it step by step, really looking at, okay, this particular chunk of information, which may well be something innocuous, such as the title of the page or an ID that identifies the button that you've just clicked on.
Like, where is that being sent?
Like, where is that being called?
Whether or not the code is minified, it's a lot of really trying to empathize and get into the head of whoever designed this code to run.
And actually, I think that's probably the place we have to start with every tool: to understand why this tool even exists.
What is it that it's trying to achieve?
What are they selling to their customers?
Because of course, we want to replicate it without sacrificing any of the value proposition.
That is the reason why that tool is even on the site that we're trying to handle.
And there's another question. Sorry, you asked about how things are going to change in the future.
And I think that's something that we're very excited about, even though I think the investigating part was fun in a way.
It's one of the most difficult things we've done, but it is also fun.
And it's a necessary part, because you need to understand what you are doing and what you want to do.
Yeah, but something we are looking forward to a lot, I think, is working closely with the vendors, so that we can basically focus more on writing tools and writing these integrations together with them instead of trying to figure out what they mean by some variable name.
Like you have no idea.
So it's great and yeah, thank you.
That's really interesting.
So do you think that now, being part of Cloudflare, a big chunk of our work would basically be helping them?
And if yes, how do you think we can do that?
And also, maybe you can share a little bit about how it was working with some vendors already, what the process is, and what the interesting parts are.
Yeah.
I guess I would jump in and say, for sure. We're very much looking forward to working with vendors.
And I think we basically extend the value proposition of Cloudflare, in terms of performance, security, availability and reliability, to the vendors by exposing an API that they can use to do this in a more structured manner.
Part of what's very satisfying about the reverse engineering process is where you spot commonalities between the tool that you're working on and other tools that are also in use on a client's site.
Like in many ways, there's a lot of repetition.
You know, there's really only so much information you can take from a user's browser session.
What's the title of the page? What's the IP address?
What's the user agent of the browser that's being sent?
And so one of the parts that we can offer, which helps with the performance aspect, is a standard library or API of available nuggets of information that a tool might want to track.
And the fact that it's standardized also means that it's less error prone, because we'll check it once and then that's useful for all of the tools that are going to be using it. And, yeah, what was the third part?
I said reliability, safety and performance.
Yeah, I think with all of that we get the reliability.
Well, one, I think we had it by leaning on Cloudflare's infrastructure to use Workers, but being more fully integrated now, I think that's baked in even more rigidly, because we have access to support instantly, which I think is another thing that's really great to have at our fingertips now that we're part of the team.
And so the ability to make quick changes and be aware of what's coming up, so that we can stay ahead of any updates to the APIs that we'll be using, is something great that we can now extend to vendors or third-party providers, so that they can also keep up and help us make the rest of the Internet really fast.
Yes.
So in my mind, I don't know if I ever said this to you, my colleagues, it looks like standardizing the interaction between third parties and the websites.
And I think it's very interesting and very needed, because it's an entire ocean of third parties and they are all doing pretty much the same thing, collecting pretty much the same data.
But they are all functioning in so many different ways and every time you are installing a third party, you have to reiterate through all these steps and see how that particular third party is doing.
So this is my point.
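The standardization idea discussed above, collecting the common "nuggets of information" once and letting each tool map them into its own format, could look roughly like the following TypeScript sketch. Everything here is an assumption made for illustration: the `PageContext` shape, the `buildContext` helper, and the `toolA`/`toolB` mappers with their parameter names do not correspond to any real vendor API or to the actual Zaraz interface.

```typescript
// Hypothetical standardized context: the fields echo the examples in the talk
// (page title, IP address, user agent), not a real API.
interface PageContext {
  title: string;
  url: string;
  userAgent: string;
  ip: string;
}

interface RawVisit {
  title: string;
  url: string;
  remoteIp: string;
  headers: Record<string, string>;
}

// Extract and validate the shared values once, in one place.
function buildContext(raw: RawVisit): PageContext {
  // HTTP header names are case-insensitive, so normalize before lookup.
  const headers: Record<string, string> = {};
  for (const key of Object.keys(raw.headers)) {
    headers[key.toLowerCase()] = raw.headers[key];
  }
  return {
    title: raw.title.trim(),
    url: raw.url,
    userAgent: headers["user-agent"] || "",
    ip: raw.remoteIp,
  };
}

// Each tool only declares how the shared context maps to its own field names.
type ToolMapper = (ctx: PageContext) => Record<string, string>;

const mappers: Record<string, ToolMapper> = {
  toolA: (ctx) => ({ page_title: ctx.title, page_url: ctx.url }),
  toolB: (ctx) => ({ title: ctx.title, ua: ctx.userAgent }),
};

const sharedContext = buildContext({
  title: "  Pricing  ",
  url: "https://shop.example.com/pricing",
  remoteIp: "203.0.113.7",
  headers: { "User-Agent": "ExampleBrowser/1.0" },
});

const payloads: Record<string, Record<string, string>> = {};
for (const name of Object.keys(mappers)) {
  payloads[name] = mappers[name](sharedContext);
}

console.log(payloads);
```

Because the trimming and header normalization happen once in `buildContext`, a fix there benefits every mapper, which is the "check it once" benefit described in the conversation.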
That's really interesting.
So it seems like you are probably the group that knows the most about how third-party tools are being used today.
You reverse engineered tens or hundreds of them.
You saw a lot of websites and how they implement them.
What do you think are the main problems?
Like any insights on how this industry works?
Should people be scared of them, and why?
I have one problem.
It's not something to be scared of, but the thing is that sometimes you are loading a lot of code that is not used on your website, because probably you are only sending a page view.
But that tool can do tons of other things.
Okay.
No, that's super common. I remember we heard it even before starting the actual company.
We were in San Francisco, interviewing one of the biggest companies in Silicon Valley.
And I remember they had actually done research; they were big enough to have a research team for this topic.
And their research team went into one of Google's tools, something on the Google stack, and did some analysis that showed that 80% of the code was just never executed.
And like I say, it's like it's super common.
I would say that's the common case: most of the code in the tool would not run for most of the visits.
So that's just wasteful.
And we're seeing loads of these practices everywhere, like functions that are just not written like you would want them to be written.
And I think it comes from the fact that often, when these tools were created, people didn't consider that this code is going to run everywhere, on millions of websites, on millions of CPUs all around the world.
Like when you run it, once you're saying, okay, it's just like a few more milliseconds, like 10, 20, 30 milliseconds.
Yeah.
But when you think about the impact it has on computing, or on the Internet as a whole, you just can't justify it.
That's really interesting.
So, yeah, go ahead.
So I was going to say, it's just it's like a bit of a developer's nightmare, right?
Like this idea that everyone's going to be looking at your code and scrutinizing it.
Yeah, but in this case, we're the team that is doing the scrutinizing and judging harshly.
Yeah.
But to me, it sounds like, yeah, one part of it is how the tool was developed.
But many of the things you've mentioned, like code that's not being used, or functions that run within tag managers that might just be neglected there, with no one even knowing that they are there, can be usage problems and not only third-party design.
Design, yeah.
So what I think about it: most of the problem is that you are not using the entire set of properties that the third party is offering.
And this is inevitable nowadays.
So is there something in the architecture that prevents that from happening?
I mean, what prevents a user from adding a function to Zaraz and then just leaving it there for years without anyone noticing?
Will this affect the site?
I mean, in the old days, yes, but not anymore.
Right.
I think you're happy about. No, no, no.
Go ahead. Yeah, I think it's basically one of the big perks of the fact that we are offloading so much of what was previously running in your browser to a cloud instead.
In a very crude description, we essentially take a snapshot of most of the data that a lot of these tools are acquiring and provide that to a cloud environment, so that it can asynchronously process it and then send off any extra requests that are required.
So to your question: if there is any bloat that comes from user input, and it's a tool that we can execute in the cloud, at least it's only bloating in the cloud and not in millions of users' browsers as well.
But I would say the management part, the tag management part of tag managers in general, and also the thought that we've put into Zaraz and the interface, is about helping to clarify exactly what tools are running and under what conditions, so that it's much easier to highlight what's in use and what's not in use, and to nudge and prompt people to keep their garden of third-party scripts tidy.
But maybe this is an idea for later: to just scan the tools and see what's used and what's not, based on our analytics statistics.
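As a rough illustration of the "snapshot" flow described in this answer, the TypeScript sketch below takes one snapshot of page data and works out, on the server side, which outgoing requests each configured tool needs. The `Snapshot` and `ServerSideTool` shapes, the `planRequests` function, the query parameter names, and the `example.com` endpoints are all hypothetical; a real integration would follow each vendor's own protocol and would send the requests asynchronously after responding to the browser.

```typescript
// Hypothetical shapes, named only for this illustration.
interface Snapshot {
  title: string;
  url: string;
  event: string; // e.g. "pageview" or "purchase"
}

interface ServerSideTool {
  name: string;
  endpoint: string; // placeholder URL, not a real vendor endpoint
  shouldFire: (snapshot: Snapshot) => boolean;
}

interface PlannedRequest {
  tool: string;
  url: string;
}

// Turn one snapshot into the list of requests to send from the cloud.
function planRequests(snapshot: Snapshot, tools: ServerSideTool[]): PlannedRequest[] {
  const planned: PlannedRequest[] = [];
  for (const tool of tools) {
    // A tool that is configured but never matches costs the browser nothing.
    if (!tool.shouldFire(snapshot)) continue;
    const query =
      "ev=" + encodeURIComponent(snapshot.event) +
      "&t=" + encodeURIComponent(snapshot.title) +
      "&u=" + encodeURIComponent(snapshot.url);
    planned.push({ tool: tool.name, url: tool.endpoint + "?" + query });
  }
  return planned;
}

const serverTools: ServerSideTool[] = [
  { name: "analytics", endpoint: "https://analytics.example.com/collect", shouldFire: () => true },
  { name: "purchase-pixel", endpoint: "https://pixel.example.com/tr", shouldFire: (s) => s.event === "purchase" },
];

const plannedRequests = planRequests(
  { title: "Pricing & Plans", url: "https://shop.example.com/pricing", event: "pageview" },
  serverTools,
);

console.log(plannedRequests);
```

In this sketch the unused `purchase-pixel` configuration is evaluated only in the cloud, which mirrors the point that leftover configuration bloats the server side rather than every visitor's browser.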
I never thought about it.
Okay.
So let's maybe dive into some startup stories. I'm wondering if you can share with me a time that something seriously broke.
We crashed something.
And what happened?
Benjamin. We have a crest.
But was there any moment where something seriously broke?
I think we.
Had.
That's good. Go ahead.
I mean, not everything goes according to plan, for sure.
And I think, without naming any particular vendor or any particular client, there have certainly been instances where a tool seems to not be fulfilling exactly what it's expected to be fulfilling.
So whether it's missing data, for example. Let me be more concrete.
We had a customer that had added a tool that allowed them to track the effectiveness of their ad campaigns across a bunch of different media.
And they had just launched a campaign and were expecting to see a lot of uplift in data in the dashboard of that tool.
And then suddenly a few bits of information weren't being sent.
And so, of course, we panicked slightly, and made sure that we had reverse engineered or replicated the functionality as best we could based on what we could see.
But actually, thankfully, it prompted an introduction to the vendor from our customer, which then allowed us to share a bit more of what we're doing and how, and eventually to help them, one, fix a bug which we found, which is great, but also update their API so that we could run it even more confidently in the cloud.
And that's something that we hope to replicate even more now that we're with Cloudflare and running on many more devices, but with a bit more clout and support and a team right behind us.
Yeah.
Thank you for that. Super interesting.
So I'm wondering if maybe you can share with me and the audience some particular feature that you've built that excites you the most?
What are your favorite parts of Zaraz?
Simona, I think you have one.
My favorite part of Zaraz is something that is not running now.
It has not been running since we started working with Cloudflare, but it was building and deploying the client-defined functions in Durable Objects.
This was the feature that I was scared of the most.
This is the most exciting.
This was most fun to build.
This is interesting because the customers can define their functions and we can run them in the cloud.
I don't know.
It's a nice interaction between customers, clients, workers, us, everything.
Just to add on this, I mean, we're rewriting this thing and would probably not use Durable Objects in the future, just because, frankly, the Workers platform has developed and now one Worker can call another Worker.
So we're probably going to use that instead. But it was definitely one of the biggest, most complicated hacks that we managed to do in order to run client code in complete separation from our main code.
Yeah.
Maybe to clarify as well, when we say client code, we mean that as a customer of Zaraz, in the interface, you might want to run custom code that you've written yourself, not necessarily from Google Analytics or from another tool, but something extra.
For example, to transform the page title into a neat little tag that you can feed into your data visualization a bit more easily. That is something I think most developers dealing with user interaction would be scared of, because you're having to trust user input and run it on your infrastructure.
So we needed to make sure that that could be done in a safe way that didn't affect other users or clients, and also didn't bring down the site, or things for us internally.
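The concern about running customer-written code can be illustrated with a small TypeScript sketch. This is deliberately much weaker than what is described in the conversation: the speakers ran client code in complete separation (in Durable Objects, and later by having one Worker call another), whereas the hypothetical `runCustomTransform` helper below only contains errors and bad return values inside a single process. All names are invented for the example.

```typescript
// A customer-supplied function; its return type is unknown because we cannot trust it.
type TitleTransform = (title: string) => unknown;

// Run the custom transform, falling back to the original title if it
// throws or returns something that is not a string.
function runCustomTransform(transform: TitleTransform, title: string): string {
  try {
    const result = transform(title);
    return typeof result === "string" ? result : title;
  } catch {
    return title;
  }
}

// Example customer code: turn a page title into a tidy tag.
const slugify: TitleTransform = (t) =>
  t.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-|-$/g, "");

// Example of buggy customer code.
const broken: TitleTransform = () => {
  throw new Error("bug in customer code");
};

const tagFromSlugify = runCustomTransform(slugify, "Pricing & Plans");
const tagFromBroken = runCustomTransform(broken, "Pricing & Plans");

console.log(tagFromSlugify);
console.log(tagFromBroken);
```

A try/catch like this protects against exceptions but not against infinite loops, excessive memory use, or access to shared state, which is why real isolation between customer code and the main code matters.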
Yeah, Durable Objects came along. It was pretty fresh I think when we started using that.
They were still in beta.
Yeah. Yeah.
It was a pretty good timing. Very exciting.
Cool.
So Yo'av, Ruskin and Simona, you all mentioned different parts of the Workers ecosystem and the fact that it grew and developed while we were working on it.
I'm wondering if you have any advice or suggestion for the entrepreneur or developer that's now listening to you and is just starting to build on Workers.
Anything you'd like to share with them that you think might be insightful?
I can say that if somebody has already started to use Workers, then I think they've very quickly figured out that they can do a lot with it.
But we were initially concerned that we couldn't do everything with Workers.
And I think, given the way this environment grew and developed, and all the things around it, I feel like Workers is always releasing new features.
Basically, I think if somebody has the concern of whether or not they can build a product on Workers, the answer is probably yes.
Unless you're doing something that you might not even want to do, you can probably build it on Workers and get something that is way easier to maintain, and just faster and more secure by design.
So I would talk to those that haven't yet started and tell them, give it a shot, because I think there's a lot to be discovered and you're going to be surprised by how much you can achieve with it.
Also if all builders on workers and teams.
So we're launching today an MVP.
I'm wondering if you can explain to our listeners what exactly we are launching today and how they can use it?
Of course.
So what we've launched today is a simple but, I think, pretty effective and relevant means to basically add a third-party script to your site while bypassing all of the downsides that we spoke about in the introduction.
So instead of having to go through the pain of adding all these scripts, slowing down your website, thinking, oh, I need a solution, and then discovering Zaraz, you can just cut straight to the chase and start loading, for example, Google Analytics.
We've launched with a suite of tools, right?
I think maybe at least 20.
Yeah.
You can basically configure them just by pasting in, for example, your tracking ID or account ID, depending on which tool you've picked, and have that running on your website, especially if that's handled by Cloudflare already.
And the minute it's configured, it's ready to go and tracking, and you shouldn't see any of the slowdown that you would normally see by going through the typical documentation process for these third-party tools, which is where they say, hey, copy and paste this chunk of code into the head of your website, and then in your browser's network tab you see a lot of requests going to these third-party places.
Cool.
So we have very little time. Last question is and we really have a few seconds, how do you feel about joining Cloudflare so far and what do you wish for the future?
Majestic genius, and I feel like we're going to change it.
So that's pretty exciting.
Yeah.
If it was powerful. You know, we've been a tight knit team this year and that is great.
I think in terms of our how we work together and how we've got to know each other, especially with the wider context of the world right now.
But as the counterpoint to that, I think it was also slightly nerve-inducing when we would grow and take on bigger and bigger customers.
And I think now having the clout of the rest of the team, the support, all of those people and functions and departments and just some incredible talent that we've been able to talk to already.
Yeah, I'd like to echo what you have said that like I'm kind of excited about the impact we'll have on the Internet at large.