Using Cloudflare Zaraz for EU Analytics Data
Presented by: Emily Hancock, Yair Dovrat, Jon Levine
Originally aired on April 9, 2023 @ 2:00 PM - 2:30 PM EDT
Learn how Cloudflare Zaraz enables site owners to use analytics tools like Google Analytics, with an approach that protects the privacy of personal information and keeps it in the EU.
Hosted by Cloudflare Chief Privacy Officer Emily Hancock, Zaraz co-founder & Product Manager Yair Dovrat, and Group Product Manager, Data & Analytics Jon Levine.
Don't miss the blog post for more details: Need to keep analytics data in the EU? Cloudflare Zaraz can offer a solution
Transcript (Beta)
All right. Hi, everyone. My name is Jon Levine. I'm a product manager here at Cloudflare for our data products.
And today I'm talking with Emily Hancock, Cloudflare's data protection officer, and Yair Dovrat, who is a product manager here at Cloudflare.
And we're here today to talk about some of the recent regulatory action in the EU related to Google Analytics and the use of application trackers.
So Yair, before coming to Cloudflare, was the co-founder of Zaraz, which is a company that built a service to load third-party scripts like Google Analytics in the cloud.
And we thought this was such a great idea. So Cloudflare actually bought Zaraz last year.
And now Zaraz's third-party tool manager is one of Cloudflare's offerings.
So welcome, both of you. Welcome, Emily. Welcome, Yair. So Emily, why don't you get us started?
There's been a lot happening in the EU related to privacy and Google Analytics.
So why don't you tell me about what are these Google Analytics decisions about?
Is this about GDPR? How did this start? Yeah, sure.
And great to be here to talk about Zaraz. So unfortunately, we can't talk about the Google Analytics decisions without giving a little bit of background around where these came from.
And that starts with the GDPR. And that's the comprehensive data protection regulation that was passed in the EU and went into effect in 2018.
And it applies to EU personal data, regardless of where it's processed.
So the GDPR does have rules around cross-border data transfers. So it says that if data is going to be processed outside the EU, certain things have to be done.
It doesn't prohibit cross-border data transfers, though. So that's very important to remember.
Instead, it provides a number of mechanisms to ensure that the GDPR level privacy protections are available for EU personal data if it's transferred outside the EU to a third country like the United States.
Data transfers from the EU to the US previously were permitted under an agreement that was a political agreement between the United States and the EU called the EU-US Privacy Shield Framework.
But in 2020, July 2020, the Court of Justice of the European Union invalidated the Privacy Shield in a decision that came to be known as Schrems 2.
And that is actually, Schrems is Max Schrems, and he's the lawyer and activist who brought legal action against Facebook for processing EU personal data in the United States and not having what he believed to be the proper privacy protections.
There was a series of lawsuits and they ultimately ended in July 2020 with this court decision.
So the court found that the Privacy Shield was not an effective means to protect EU data from the US government surveillance authorities.
They went through a whole analysis of how the US government can do surveillance on personal data, and the court said that the data, if it's transferred to the United States, can't get the GDPR level of protection from these US surveillance laws unless certain things are done.
So the court did say transfers of EU personal data to the United States are okay if the data is transferred pursuant to one of the other mechanisms that the GDPR has for these cross-border data transfers.
And one of those is something called the EU Standard Contractual Clauses, which is exactly what it sounds like.
It's a set of contractual language, contractual clauses that put in place certain protections.
So, and these are legal agreements that are approved by the EU Commission to enable these data transfers.
So the Schrems 2 court said it's okay for data to be processed in the United States if these standard contractual clauses are used, and if there are supplementary measures taken by the US processors to protect the EU personal data from US government surveillance.
Right. Yeah, that makes sense. It sounds like really the key issue here is EU citizens' personal data, and yeah, where is that being processed?
Is it being transferred outside the EU? And it's interesting, GDPR itself doesn't really say anything about this, but the Schrems 2 decision, which came later and really is based on GDPR, makes it tricky, creates a lot of rules about what you need to do if you want to process the personal data of EU citizens outside of Europe.
So that makes sense. So where does Google Analytics come into this?
Yeah, yeah, exactly. So in the time since that Schrems 2 decision, there's been a number of European data protection regulators looking at this question of, well, what are these supplementary measures that have to be put in place to protect the data from US surveillance and US government authorities?
What is sufficient?
What is it? And right after the Schrems 2 court case, Max Schrems, he has an organization called None of Your Business, which is an advocacy group for data privacy.
That group filed 101 complaints against European websites that were using Google Analytics and Facebook Connect third-party trackers.
And what that group said in these complaints was that use of the trackers violates the Schrems 2 ruling because they send EU personal data to the United States without sufficient supplementary measures.
So fast forward now, 18 months, it took 18 months for a number of data protection regulators to look at all these complaints.
They actually formed a working group to consolidate the review of these complaints.
The Austrian Data Protection Authority in January of 2022 became the first data protection authority to issue a decision.
And it said that a specific website's use of Google Analytics violated GDPR as interpreted by Schrems 2.
And they said this because they said Google Analytics was sending the IP addresses of visitors to the website to Google back in the United States.
And the decision cited earlier court precedent in the EU that IP addresses are considered personal data, even though I think there's definitely some argument around that given the types of technologies that there are to break the connection between an individual and an IP address.
However, the court precedent in Europe is that IP addresses are personal data.
And so the Austrian DPA said, this is personal data. It's going through Google Analytics.
It's going to Google. And they did not feel that the technical and other supplementary measures that Google had in place in August 2020, when the investigation was done, were sufficient.
So this case was followed by, I believe we've got Denmark, the CNIL in France, the data regulator in France, and Norway.
And I think there may be another couple, but several regulators now have said that using Google Analytics in a way that sends IP addresses to Google in the United States is a violation of the Schrems 2 interpretation of the GDPR.
Right. That makes sense. So it sounds like the key idea here is that if you believe that as the courts have ruled that IP addresses are personal data, Google did not have the right protections in place to transfer this.
And that's really the origin of these rulings. Right. So what does this mean now?
So does this mean, if you're a website in Europe, does this mean you can't use Google Analytics at all anymore?
Does this impact other similar tools to that?
Yeah. So I think the important thing to remember is that these decisions relate to these very specific websites' use and implementation of Google Analytics, and how it was implemented in August 2020 when the complaints were filed.
So there are some questions about whether there may be ways to implement Google Analytics without sending IP addresses to Google servers in the United States, which we will talk about.
And then also we know that in response, Google has announced that next year they're going to make a version of Google Analytics available that won't collect IP addresses.
So it's not clear that this is a ban on Google Analytics entirely in Europe, but the trend seems to be that if you're worried, maybe using Google Analytics is not your best bet.
And it's a worrisome precedent. And websites are using dozens of third-party tools, some of which might collect IP addresses or other unique identifiers.
And if these third-party tools are transferring personal data to the United States, it could attract the attention of the EU data regulator.
We just don't know exactly how far this is going to go yet. In addition to the privacy concerns you mentioned, there's security issues about using these third-party tools as well.
And yeah, next year is a ways away. So I know in the meantime, people are going to have concerns about using Google Analytics.
So Yair, this is where Zaraz comes in.
So maybe why don't you start by telling me what Zaraz is and then how it can help in this situation?
Yeah. So Zaraz is what we call a third-party manager, which means we load your third-party stack of tools, tools used for analytics, for marketing, CRM, chatbots.
We load those tools in the cloud using what's called Cloudflare Workers, our edge platform.
And the way it's different is that in the traditional way, when you load third-party tools, the browser does all the work.
So the browser is loading the third-party scripts, hence the security risk because you're loading someone else's script on your website, it slows down the site, and it also creates communication between the browser and the third-party server that, by design of the HTTP request, reveals the end user's IP address.
Zaraz is a totally opposite architecture. So when a user loads the website, we do all of the work on the backend.
So there's one request that goes to a Cloudflare Worker.
This is where the Zaraz service runs. And we communicate with the third-party server from our backend.
So nothing is being sent from the end user's browser, which creates an architecture in which we proxy everything that goes to a third party.
So we can also hide IP addresses. We can mask out some personal data.
It's basically another layer of security between the end user's device and the third-party server.
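The proxying idea described above can be sketched roughly as follows. This is an illustration only, not Zaraz's actual implementation: `BrowserEvent`, `build_vendor_payload`, the `ip_mode` values, and the payload field names are all hypothetical. The sketch shows the one property the speakers stress, namely that the server-side component decides whether the third party ever sees the visitor's IP address.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class BrowserEvent:
    """What the browser sends to the first-party proxy (hypothetical shape)."""
    client_ip: str
    page_url: str
    user_agent: str


def truncate_ip(ip: str) -> str:
    """Zero the host bits: keep a /24 for IPv4 and a /48 for IPv6."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def build_vendor_payload(event: BrowserEvent, ip_mode: str = "drop") -> dict:
    """Build the payload the proxy would forward to a third-party tool.

    ip_mode is an assumed setting:
      "drop"     - never forward the IP (the default here)
      "truncate" - forward only the network part of the address
      "forward"  - forward the address unchanged
    """
    payload = {"page_url": event.page_url, "user_agent": event.user_agent}
    if ip_mode == "truncate":
        payload["ip"] = truncate_ip(event.client_ip)
    elif ip_mode == "forward":
        payload["ip"] = event.client_ip
    elif ip_mode != "drop":
        raise ValueError(f"unknown ip_mode: {ip_mode}")
    return payload


if __name__ == "__main__":
    event = BrowserEvent("203.0.113.77", "https://example.com/pricing", "Mozilla/5.0")
    print(build_vendor_payload(event))
    print(build_vendor_payload(event, ip_mode="truncate"))
```

Because the third party only ever receives a request from the proxy, it sees the proxy's address at the network level, and the payload contains an IP field only if the site owner opts in.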
That's cool. It sounds like really a win-win-win. It's more private, it's more secure, and it's actually faster because all that JavaScript, which can take a long time to load and run, is all happening at our edge, which is super fast instead of having to happen in a user's browser.
So that's really cool.
So maybe we can talk more about data localization specifically because I think what folks are probably wondering is, how is Cloudflare different from Google?
So maybe my logs aren't going to Google, but are they still going to Cloudflare?
How do we address that? Yeah. So that's a great question. And I think we see a lot of customers lately, especially in Europe, coming to us with that exact question.
So it's a good chance to actually explain how this is different. So as I explained, I think Cloudflare is positioned in a unique place here because, in many cases, we proxy your main website, and you're using Cloudflare as a CDN, and we can leverage the Cloudflare global network.
And this is a good example of how natively Zaraz integrates with the entire Cloudflare stack.
So Cloudflare is offering enterprise customers something called the Data Localization Suite.
It's a set of features.
Mainly, we're talking about two features. One is called regional services, and the other one, metadata boundaries.
And I'll start with regional services because I think this correlates strongly to the question here.
So regional services mean that you can specify where you want specific services to run.
And in that stack, you can also choose where you want Zaraz to run.
So if you, for example, set up regional services to only run in Europe, and you have, let's assume, a European bank that's using data localization, we can assure that a request isn't even inspected outside Europe, and the service runs only in European data centers.
And if you have regional services turned on, it will immediately apply to Zaraz as well because Zaraz runs in a first-party context.
So that's the first thing. And I think this is a big change because we can actually leverage the fact that we have this global network and decide where to run specific services, and no other tool out there can do that.
That's cool. That's cool. That seems really key. So with regional services on, everything is just fully encrypted.
We can't even see what's happening in that request unless we're in Europe.
So that cloud service, Zaraz, which is doing that work, is only running in Europe.
It's not running in the U.S. ever when you're using regional services.
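A toy model of the regional-services behaviour described here, under stated assumptions: the data center labels, the `DATA_CENTER_REGION` table, and both function names are invented for illustration and do not reflect how Cloudflare configures or routes traffic. The sketch only captures the rule from the conversation: a data center outside the allowed region does not inspect the request and hands it, still encrypted, to one inside the region.

```python
# Hypothetical mapping of data center labels to regions.
DATA_CENTER_REGION = {
    "FRA": "EU",
    "AMS": "EU",
    "CDG": "EU",
    "IAD": "US",
    "SJC": "US",
}


def may_process(data_center: str, allowed_regions: frozenset) -> bool:
    """True if this data center is allowed to decrypt and inspect the request."""
    return DATA_CENTER_REGION.get(data_center) in allowed_regions


def choose_processing_site(ingress: str, in_region_fallback: str,
                           allowed_regions: frozenset = frozenset({"EU"})) -> str:
    """Pick where the request is decrypted and the service runs.

    If the ingress data center is inside the allowed region it handles the
    request itself; otherwise it passes the request on untouched to the
    fallback, which must itself be inside the region.
    """
    if may_process(ingress, allowed_regions):
        return ingress
    if not may_process(in_region_fallback, allowed_regions):
        raise ValueError("fallback data center is outside the allowed region")
    return in_region_fallback


if __name__ == "__main__":
    print(choose_processing_site("FRA", "AMS"))  # handled where it arrived
    print(choose_processing_site("IAD", "AMS"))  # passed on to an EU site
```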
True. And that's a big edge. And then the second thing is the metadata boundaries.
And this is about us not saving any data. So it sounds like something scary, but it's basically what we use as analytics to customers, let's say.
And we can basically decide not to save any points of data. With Zaraz, by the way, by default, we save nothing except for error logs to improve the service.
And this is also something you can customize. So bottom line, a customer can decide, I want to run services only on a specific region like Europe.
And I also don't want Cloudflare to save any data whatsoever, not even for analytics or errors.
And those two things together are what gives us the most, I think, private tool out there to load third-party tools.
That's really great. So it sounds like with regional services, we're only doing the work in Europe.
And then with the metadata boundary, we're making sure all the logs, everything that's generated that we do collect is staying in Europe.
But even more importantly, when Zaraz runs, we're actually not collecting this data.
So Google doesn't see the IPs.
We're not storing the IPs from running Zaraz. So we're not even really storing that at all.
And the logs we do have about that just say that Zaraz ran and things like that, that's going to stay in Europe.
Yeah. So just to comment on that, you can configure Zaraz.
It's a toggle off thing or toggle on to hide IP addresses from Google, from Facebook.
So this is how it works. You can decide to not reveal end users' IP addresses to Google.
That's really great. Yeah. And then just tying that back to the court decisions, it sounds like that's really going to alleviate the main concern here, which is like, is personal data being transferred?
It's like, well, it's only running in Europe. We're not even storing this data or sending it to Google at all.
We're kind of creating that firewall, that barrier to make sure it doesn't get sent.
That's really great. Cool. Well, this sounds like something really amazing that people will be able to take advantage of right now.
Zaraz is in the Cloudflare dashboard. People can turn it on. But I know we're not done there.
I know, Yair, you have a whole roadmap coming up, especially related to privacy features for Zaraz.
So maybe you can tell me a little bit about what you're working on.
Yeah. So as I told you, we are hearing a lot of requests and feedback from customers all around the globe, actually, not only in Europe.
So we are going to dive deep into building some privacy features this year. I think the biggest one we're currently working on is our own consent management platform that integrates natively with Zaraz and lets you manage consent or any end-user preference to apply on cloud loading of tools.
Because cloud loading of third-party tools is a very new thing.
Managing consent or how you want to treat personal data is still a riddle.
And we want to make it super easy for our customers to do that.
So we're building a very privacy-focused consent manager. That's one big thing we're on.
Second thing is we call it the DLP feature. And I touched this briefly.
The idea is that every Google account I've seen in my life had this issue where accidentally, definitely not intentionally, customers are sending personal data into the third-party systems.
So for example, it happens a lot when you have a form on your website. And I'll just pause you there.
What does DLP stand for again? Data Loss Prevention. Sorry.
Yeah. So you want to make sure that stuff isn't, data's not leaving your website that you weren't intending for it to leave.
Is that right? Yeah, correct. So I'll explain how it happens.
So imagine you have a form on your website, someone fills in the form, and then the company, I don't know, or the email address is added as a query parameter to the URL.
And then Google Analytics is also working on that site.
And Google Analytics collects all URLs to show them on the pageview reports. And then you end up sending a query parameter with a user email into Google Analytics, which is not a good idea.
So with the DLP feature, you can, because we proxy all of those requests, we can look for email addresses, social security numbers, names, and mask those out or alert you or both.
So this is what we mean when we say the DLP feature.
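The leak described above, and the masking step a proxy could apply to it, can be sketched like this. It is a minimal illustration, not the planned DLP feature: `mask_url`, the `PATTERNS` table, and the `REDACTED` placeholder are hypothetical, and the regular expressions are deliberately simple rather than exhaustive.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative patterns only; real detection would need to be more careful.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mask_url(url: str, patterns=PATTERNS):
    """Mask personal data found in query parameter values.

    Returns the cleaned URL plus a list of (parameter, pattern_label)
    findings, so the caller can mask, alert, or both.
    """
    parts = urlsplit(url)
    findings = []
    cleaned = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        for label, pattern in patterns.items():
            if pattern.search(value):
                findings.append((key, label))
                value = pattern.sub("REDACTED", value)
        cleaned.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(cleaned))), findings


if __name__ == "__main__":
    page_url = "https://example.com/thanks?email=jane%40example.com&plan=pro"
    masked, found = mask_url(page_url)
    print(masked)
    print(found)
```

Running the check on the page URL before it is forwarded means the analytics tool receives the page view without the email address, while the findings list gives the site owner something to alert on.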
So that's the second thing. It's really like a defense in depth to make sure you're not sharing anything you don't intend.
Correct. Yeah. So you'll have like preset templates to search for names, for example, and you can also customize it to have your own patterns.
Yeah. And then the third thing is another example of how we can leverage the global network.
It's about loading specific tools or creating what we call triggers, which is the rules according to which you want to load specific third-party tools only on specific geolocations.
So imagine a company that has global operations, they need a place to configure their third-party stack of tools and they need to be able to load different tools on different locations.
And to do that, they need to know where the user is and then to apply the rules to change the actual code on the website.
So we are going to make this a click away.
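A geolocation trigger of the kind described could be modelled as below. The `Trigger` class, the `tools_for` function, and the tool names are made up for illustration and are not the Zaraz configuration format; the sketch assumes the proxy already knows the visitor's two-letter country code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    """Load `tool` only for visitors in `countries` (empty set = everywhere)."""
    tool: str
    countries: frozenset = frozenset()


def tools_for(country: str, triggers) -> list:
    """Return the tools whose trigger matches the visitor's country code."""
    code = country.upper()
    return [t.tool for t in triggers if not t.countries or code in t.countries]


# Hypothetical configuration for a company with global operations.
TRIGGERS = [
    Trigger("analytics-basic"),
    Trigger("ad-pixel", frozenset({"US", "CA"})),
    Trigger("eu-chat-widget", frozenset({"DE", "FR", "AT"})),
]

if __name__ == "__main__":
    print(tools_for("de", TRIGGERS))
    print(tools_for("US", TRIGGERS))
```

Keeping the rules in one table, evaluated where the visitor's location is already known, is what removes the need to change code on the website for each region.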
So that's the third thing. And we have other things, but that's the main idea.
The consent manager is really interesting too.
I mean, I'm sure everyone listening in sees the pop-ups about accepting cookies and consent, but I think what's interesting about when you're using Zaraz is like the actual kind of consent you need is different because the information we're collecting is different.
And so I think that's a great example of where like I mentioned, it's faster, it's more private, it's more secure, but also I hope the user experience is better because it's really not nearly as burdensome to ask people about that.
So that's really great. Yeah. We're trying to make it very user-friendly and it's going to be, I think, if I'm not mistaken, the first consent manager you can actually load on a website without needing to make any code changes because we proxy the website.
That's really amazing. Yeah. I mean, obviously Cloudflare, we're deeply committed to privacy, deeply committed to making sure we're protecting our customers' personal data, the personal data of end users, but we also do want to make the web friendly and easy to use as well.
So great to see that.
Awesome. Well, thank you, Emily, for giving this insightful background on GDPR and Schrems and all these scary concepts and breaking it down and making it very easy to understand.
And thank you, Yair, for helping create Zaraz and for all these great features, which are making the Internet more private and more secure and easier to use.
So thanks to you both and have a great rest of your day. Yeah, thank you.
Thanks. Thank you, Jon. Bye-bye. We're betting on the technology for the future, not the technology for the past.
So having a broad network, having global companies now running at full enterprise scale gives us great comfort.
It's dead clear that no one is innovating in this space as fast as Cloudflare is.
With the help of Cloudflare, we were able to add an extra layer of network security controlled by Allianz, including WAF, DDoS.
Cloudflare uses CDN, and so allows us to keep costs under control and caching and improve speed.
Cloudflare has been an amazing partner in the privacy front. They've been willing to be extremely transparent about the data that they are collecting and why they're using it.
And they've also been willing to throw those logs away.
I think one of our favorite features of Cloudflare has been the worker technology.
Our origins can go down and things will continue to operate perfectly. I think having that kind of a safety net, you know, provided by Cloudflare goes a long ways.
We were able to leverage Cloudflare to save about $250,000 within about a day.
The cost savings across the board is measurable, it's dramatic, and it's something that actually dwarfs the yearly cost of our service with Cloudflare.
It's really amazing to partner with a vendor who's not just providing a great enterprise service, but also helping to move forward the security on the Internet.
One of the things we didn't expect to happen is that the majority of traffic coming into our infrastructure would get faster response times, which is incredible.
Like, Zendesk just got 50% faster for all of these customers around the world because we migrated to Cloudflare.
We chose Cloudflare over other existing technology vendors so we could provide a single standard for our global footprint, ensuring world-class capabilities in bot management and web application firewall to protect our large public-facing digital presence.
We ended up building our own fleet of HAProxy servers, such that we could easily lose one and then it wouldn't have a massive effect.
But it was very hard to manage because we kept adding more and more machines as we grew.
With Cloudflare, we were able to just scrap all of that because Cloudflare now sits in front and does all the work for us.
Cloudflare helped us to improve the customer satisfaction.
It removed the friction with our customer engagement. It's very low maintenance and very cost effective and very easy to deploy and it improves the customer experiences big time.
Cloudflare is amazing. Cloudflare is such a relief. Cloudflare is very easy to use.
It's fast. Cloudflare really plays the first level of defense for us.
Cloudflare has given us peace of mind. They've got our backs. Cloudflare has been fantastic.
I would definitely recommend Cloudflare. Cloudflare is providing an incredible service to the world right now.
Cloudflare has helped save lives through Project Fair Shot.
We will forever be grateful for your participation in getting the vaccine to those who need it most in an elegant, efficient, and ethical manner.
Thank you. Hi, we're Cloudflare.
We're building one of the world's largest global cloud networks to help make the Internet faster, more secure, and more reliable.
Meet our customer, Falabella.
They're South America's largest department store chain, with over a hundred locations and operations in over six countries.
My name is Karan Tiwari.
I work as a lead architect in Odessa e-commerce at Falabella.
Like many other retailers in the industry, Falabella is in the midst of a digital transformation to evolve their business culture to maintain their competitive advantage and to better serve their customers.
Cloudflare was an important step towards not only accelerating their website properties, but also increasing their organization's operational efficiencies and agility.
So I think we were looking at better agility, better response time in terms of support, better operational capabilities.
Earlier, for a cache purge, it used to take around two hours.
Today, it takes around 20 milliseconds, 30 milliseconds to do a cache purge.
The homepage loads faster. Your first view is much faster. It's fast. Cloudflare plays an important role in safeguarding customer information and improving the efficiencies of all of their web properties.
With customers like Falabella and over 10 million other domains that trust Cloudflare with their security and performance, we're making the Internet fast, secure, and reliable for everyone.
Cloudflare, helping build a better Internet.
Enterprises have quite high standards for the scalability and performance of the products that Optimizely is bringing into their organization.
We have a JavaScript snippet that goes on customers' websites that executes all the experiments that they have configured, all the changes that they have configured for any of the experiments.
That JavaScript takes time to download, to parse, and also to execute. And so customers have become increasingly performance conscious.
The reason we partnered with Cloudflare is to improve the performance aspects of some of our core experimentation products.
We needed a way to push this type of decision making and computation out to the edge.
And Workers ultimately surfaced as the no-brainer tool of choice there.
Once we started using Workers, it was really fast to get up to speed.
It was like, oh, I can just go into this playground and write JavaScript, which I totally know how to do.
And then it just works. So that was pretty cool.
Our customers will be able to run 10x, 100x the number of experiments. And from our perspective, that ultimately means they'll get more value out of it.
And the business impact for our bottom line and our top line will also start to mirror that as well.
Workers has allowed us to accelerate our product velocity around performance innovation, which I'm very excited about.
But that's just the beginning.
There's a lot that Cloudflare is doing from a technology perspective that we're really excited to partner on so that we can bring our innovation to market faster.