🎂 Privacy Edge
Presented by: Mari Galicer, Emily Hancock
Originally aired on April 21, 2023 @ 6:30 PM - 7:00 PM EDT
Join Cloudflare Chief Privacy Officer Emily Hancock and Product Manager Mari Galicer to learn more about all the work that has gone into Privacy Edge.
Transcript (Beta)
Hi, welcome to Cloudflare TV. I'm Emily Hancock.
I'm the chief privacy officer here at Cloudflare.
And I'm joined by Mari Galicer, who is a product manager for something that we are going to talk about today, which is called Privacy Edge.
So, Mari, can you please tell us what Privacy Edge is?
Yeah.
Hey, everyone. Nice to meet you.
Like Emily said, I'm a product manager for consumer privacy, and I'm super excited to talk about the launch of this offering, all these privacy-oriented products that I've been working on for probably the better part of a year now.
So basically, Privacy Edge is a collection of products that we're working on with a lot of different partners, and a lot of them are built on open Internet standards.
And what they aim to do is make it really easy for developers, folks who are maintaining and developing platforms and applications, to build privacy into their applications by default.
So to make it really easy, at the network level, to not collect the information that they don't want to collect, or to protect their end users' privacy, like I said, by default.
So the fact that you are a PM for consumer privacy is really interesting, I think, because most people probably think of Cloudflare as a B2B company, which we generally are.
And I know you and I have worked together a lot, and it's pretty cool that we do have somebody who is thinking specifically not so much about the privacy offerings that we necessarily bring just to our enterprise or paying customers, but also to technologies that will work for just the everyday user on the Internet.
So can you explain why it's called Edge?
And, maybe going back to what I was just saying, why is Cloudflare doing this when what we usually do is come up with technologies that help our paying customers?
And so this is kind of a shift in direction.
So tell me a little bit more about that.
Yeah.
So I think that to take a step back, I often think a lot about how when the Internet was first built, I mean, this goes back maybe a couple of decades, but it wasn't built with privacy in mind.
Right.
And I think a lot of the original architects of the Internet, the people who were trying to provide a certain level of connectivity, weren't necessarily thinking about privacy because it just wasn't necessary at that point. We weren't at a place where industry and commerce on the Internet had actually happened, and you couldn't predict it, right?
Like back in the 90s, you couldn't predict that cookies would become the kind of third-party tracking mechanism that they are now, or any of the really advanced ways in which data brokers and other third-party trackers operate.
So if you pair that with the narrative of how Cloudflare first started, right, at least the meat of its business in its first years was with CDN and DDoS.
And that is the way in which we created the network that we have now.
And so folks might not think of Cloudflare right now as a network, but that's really what it is.
It's really what empowers all our products. And so now what I'm thinking about is we have this network that is really foundational in the Internet.
It's serving a huge amount of traffic.
It's serving millions and millions of requests every day.
But how can we build privacy into that foundation, right?
Like, how can we think about our role in this, and how can we have our customers also think about their role in serving traffic in a more private way?
And that means serving it private by default.
And so there's this principle of least privilege and, you know, not collecting the data that you don't need to collect.
And I think that part of my goal here is to make that really easy.
And I think that some of the offerings that I talk about in the blog post that got published today are about making it really easy to not collect that data as requests are served through that network.
Right.
And that kind of goes to Cloudflare's mission, which is to help build a better Internet.
And we like to also say that a better Internet is a more private Internet.
So yeah, this ties directly with that. So you mentioned your blog post today.
Can you outline for us a little bit what Privacy Edge is and what the components of it are?
Yeah, so Privacy Edge is four products.
I want to make sure I get all of them.
It's Privacy Gateway, which is a super lightweight proxy that encrypts request data and forwards it through an IP-blinding relay.
For folks that are familiar with our kind of traditional proxying or reverse proxying offerings, it's very similar to that, but with a privacy lens built in.
Basically what it does is it forwards a customer's traffic through a relay and that relay replaces the IP address with a Cloudflare IP address so that the origin server or the person or company operating that application doesn't see the IP addresses of the users that are using their service.
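As a rough illustration of the relay step described here, a minimal Python sketch might look like the following. The addresses, the `Request` class, and the function names are hypothetical and chosen only for this example; they are not taken from Privacy Gateway itself.

```python
from dataclasses import dataclass

# Hypothetical relay address, taken from a documentation-only IP range.
RELAY_IP = "198.51.100.7"


@dataclass(frozen=True)
class Request:
    source_ip: str  # the address the receiver believes the request came from
    body: bytes     # request payload, passed through untouched


def relay_forward(incoming: Request) -> Request:
    """Forward the body unchanged, with the relay as the apparent source."""
    return Request(source_ip=RELAY_IP, body=incoming.body)


def origin_log(req: Request) -> str:
    """What an origin server could write to its access log."""
    return f"request from {req.source_ip}, {len(req.body)} bytes"


client_request = Request(source_ip="203.0.113.42", body=b"opaque-encrypted-payload")
forwarded = relay_forward(client_request)
print(origin_log(forwarded))
```

The point of the sketch is only that the origin's log line can contain the relay's address and never the client's, because the client's address is not part of what the relay forwards.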
The second is code auditability, and we talked about this, I think it was a few months ago, when we announced our partnership with WhatsApp.
So this is all about making sure that the code that is delivered through some mechanism, in this case in the browser, is actually, authentically the code that is supposed to be delivered.
You might be wondering, why would I even worry about that?
Why is it even a threat? But I think if you look at the kinds of users that are most targeted by cyber attacks, it's folks like journalists, human rights defenders, activists, people like that, and some malicious actor replacing the code of a sensitive application can become a real threat.
WhatsApp, for a lot of folks, is a sensitive application.
And so what this does is it makes sure that the code these folks are running is actually the code that WhatsApp says is delivered.
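One common way to check that delivered code matches what a publisher says it shipped is to compare a cryptographic digest of the delivered bytes against a digest published ahead of time. The sketch below shows only that general idea; it assumes a trusted place to fetch the published digest from, and the function names and sample code strings are made up for illustration rather than drawn from the WhatsApp deployment.

```python
import hashlib
import hmac


def digest(code: bytes) -> str:
    """SHA-256 digest of a code bundle, as a hex string."""
    return hashlib.sha256(code).hexdigest()


def is_authentic(delivered_code: bytes, published_digest: str) -> bool:
    """True when the delivered bytes hash to the digest published in advance."""
    return hmac.compare_digest(digest(delivered_code), published_digest)


# Hypothetical release: the publisher commits to this digest ahead of time.
released = b"function send(msg) { return encrypt(msg); }"
published = digest(released)

# Hypothetical tampering: extra code appended somewhere in delivery.
tampered = released + b" exfiltrate(msg);"

print(is_authentic(released, published))
print(is_authentic(tampered, published))
```

Any change to the delivered bytes, even a single character, produces a different digest, so the comparison fails for the tampered copy.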
The third is Private Proxy.
And this we talked about last year with Apple.
It's part of our implementation for Private Relay.
And basically folks can think about this as kind of like the protection that a VPN offers, but built into applications and products.
So how can you ensure that your users' traffic is tunneled through a secure proxy, but also that the secure proxy doesn't necessarily collect or maintain a lot of information about your end users?
And the last is Cooperative Analytics.
And we actually talked about this a little while ago.
It is a really cool, I don't know what to say besides cutting-edge, cryptography solution that is based on secret sharing.
So basically it takes a measurement and splits it into multiple shares, and different parties each operate one part of the service, so that each collects its shares of the measurements and aggregates them on its own. Then they put them all together to get the aggregate measurement over a population.
So why would you go through all of that kind of work? A lot of it is...
I was just going to say, yeah, I have so many questions, by the way, about everything you're talking about.
But on this one, what do you use that for?
Like, what are people going to do with it?
Yeah.
So I like to think about the example, you know, this is not a perfect one or I'm not saying we should do this tomorrow, but like a voting example where you can think about it more simply as a poll or something like that.
But what if you wanted to collect the results of a poll or an election or something like that, but you didn't necessarily want to see who each individual person voted for?
So that could be a big privacy concern or an abuse of power concern, right, where you don't actually need to see each individual's vote.
I don't need to see that like Mari voted for this result in the poll or that Emily voted for this candidate in the election.
But you still need to calculate the overall aggregate results of that election.
And so what cooperative analytics, and this kind of underlying multiparty computation approach, does is it allows you to separate that step of aggregation out from the part that is connected with the identity of the person voting.
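The secret-sharing idea behind the poll example can be sketched with additive shares. This is a minimal illustration under stated assumptions: two aggregating parties, a hypothetical yes/no poll, and an arbitrary modulus picked for the example. It leaves out the checks a real system would need to stop clients from submitting invalid inputs.

```python
import secrets

# Arbitrary large modulus chosen for this sketch.
MODULUS = 2**61 - 1


def split(value: int, parties: int = 2) -> list[int]:
    """Split a value into random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


# Hypothetical poll: 1 means "yes", 0 means "no".
votes = [1, 0, 1, 1, 0]

# Each party only ever sees its own share of each vote, and keeps a running sum.
per_party_totals = [0, 0]
for vote in votes:
    for i, share in enumerate(split(vote)):
        per_party_totals[i] = (per_party_totals[i] + share) % MODULUS

# Only the combined per-party sums reveal anything: the aggregate count.
total = sum(per_party_totals) % MODULUS
print(total)
```

Each individual share is a uniformly random number on its own, so neither party learns how any one person voted, yet adding the two running totals recovers the number of "yes" votes.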
That's so interesting.
So was voting really kind of at the heart of it when people started working on this?
Was that kind of a use case?
I think the early iterations of this, the original part of the system, is based on a paper called Prio by Henry Corrigan-Gibbs, who's a researcher.
And I think the first implementation at scale was used for COVID exposure notifications.
So Apple and Google collaborated on a system.
This is kind of in the earlier stage of the pandemic where we wanted to do contact tracing and provide people with alerts when they had been exposed.
But you don't necessarily want to give the system itself the data that shows the location of every single person, you know, using the system.
Right.
So there's this tension: you have a system that has a lot of utility and that is actually keeping people safe, but that presents a pretty intense privacy and safety risk if it falls into the hands of the wrong body or is just operated by an entity that folks don't trust.
And so this was a really useful approach in that situation, and it was kind of piloted in that case.
That makes a lot of sense.
And it is good to know, because I think we're going to run into more and more situations where we're going to need to move surveys, voting, and record keeping about public health issues online.
More and more of that is going online, as is everything in our lives, it seems.
And so coming up with really novel technologies for how to make sure that the data is not being tampered with, number one, but also that you can kind of break the connection between the individuals and the thing you're trying to count is going to be really critical, I think, going forward.
Yeah.
And I think it's indicative of a shift among the folks designing these systems, in the sense that we can think more now about what data do I actually need.
Do I need this aggregation as a whole, or do I need each individual input that I can slice up and then look into and interrogate over time?
And a lot of times we actually just need these holistic, aggregate insights into these really big sets of data.
We don't actually need each individual input.
And I think that part of the goal here is to de-risk that, like how can you actually look at those statistics that talk about these huge populations of data while actually preserving the privacy of the folks who are users in that system?
Yeah, it makes doing those big aggregations a lot less scary.
So.
So I want to go back to Privacy Gateway and Private Proxy. To the very non-technical person, right, both of these things do something to break the connection between the end user, so me, a random person going online or sending a text message using Apple messaging or something, and what's happening on the other end.
So there's kind of this break, so you don't know that it's me doing the things.
Can you explain a little bit, though, how these two things are the same and how they're different?
Because they're both, right? They're the same in a way, but they use different technologies.
Yeah.
Yeah. So let's start with Privacy Gateway. So basically how Privacy Gateway works is that there are three different parties involved.
So there's the client or end user device. You can think of this as like your laptop, your phone, running a smartphone app or something like that.
There's a server that is actually running that application or technology.
So you can think of that as the application developers or like the businesses that are building that app.
And then there's the proxy, or what we're calling the relay here.
And if you think about the privacy premise here, what the privacy use cases are motivated by is that there are actually some cases in which the application developer doesn't need to know where the HTTP requests coming into their application are coming from. And what kind of use cases are those? They're generally applications that are very privacy sensitive.
So you could think of password managers, or, like we were talking about, COVID data, or it could be a tip line for a journalist or something like that.
It could be DNS requests.
And we've actually implemented a version of this called Oblivious DNS for, you know, for our resolver.
So you have this need, right, that these applications say like, actually I don't really need to know the IP address and I don't want to log that in my system.
I actually don't want to see this, but the requests actually have to come from somewhere.
You can't just magic away that information. That's just not how the internet works.
I don't know, the internet seems pretty magic to me.
But yeah, I mean, trust me, as I've gotten into more of this work and started reading more about cryptography, I'm like, whoa, this is kind of magic.
So yeah, especially the cryptography stuff. But yeah, these requests have to come from somewhere, right?
And the way that they can come from somewhere without being from the originating client device is to put a proxy in between.
So we actually already had a solution that basically just does that: it proxies information in between client and server.
However, what we did with Privacy Gateway is we use this emerging IETF standard called Oblivious HTTP, and this offers a layer of privacy beyond what a normal proxy does.
So what it does is it actually takes the public key of that application server and encrypts the contents, or the body, of the request from that client device to the application server before sending it to the proxy.
So this means that you can safely proxy or like in our case, we can safely proxy requests without actually being able to inspect at all the contents of that request.
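To make the "encrypt to the server's public key before it reaches the relay" step concrete, here is a toy sketch. It is deliberately not the real construction: Oblivious HTTP builds on HPKE, whereas this example uses a plain Diffie-Hellman exchange over an illustrative prime with a hash-derived XOR pad, has no authentication, and is not secure. Every name, parameter, and the sample request body are assumptions made for the example.

```python
import hashlib
import secrets

# Toy parameters for illustration only; not a secure or standard group choice.
P = 2**255 - 19
G = 2


def keypair() -> tuple[int, int]:
    """Return (private, public) for the toy Diffie-Hellman exchange."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def keystream(shared: int, length: int) -> bytes:
    """Stretch the shared secret into `length` pad bytes using SHA-256."""
    out = b""
    counter = 0
    while len(out) < length:
        block = shared.to_bytes(32, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def xor(data: bytes, pad: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, pad))


def client_seal(server_public: int, body: bytes) -> tuple[int, bytes]:
    """Client side: encrypt the body so only the server's key can open it."""
    eph_private, eph_public = keypair()
    shared = pow(server_public, eph_private, P)
    return eph_public, xor(body, keystream(shared, len(body)))


def server_open(server_private: int, eph_public: int, sealed: bytes) -> bytes:
    """Server side: derive the same shared secret and remove the pad."""
    shared = pow(eph_public, server_private, P)
    return xor(sealed, keystream(shared, len(sealed)))


server_private, server_public = keypair()
body = b"POST /tips: hello"
eph_public, sealed = client_seal(server_public, body)

# The relay only handles `eph_public` and `sealed`; it never holds server_private.
relay_view = sealed
recovered = server_open(server_private, eph_public, relay_view)
print(recovered == body)
```

The split of knowledge is the part that carries over: the relay sees who is sending but only ciphertext, while the server can read the body but only sees the relay as the sender.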
And so I think this is a really big improvement upon traditional Proxying solutions.
Obviously, there are trade-offs, right, with trying to do anything. So with any proxy system you have to create two TLS connections.
There's a potential latency trade-off with that.
But we haven't really seen a huge latency penalty so far from what we've tested.
And then there's also the time it takes to like generate those keys and encrypt things, which is not negligible, but also seems to be generally worth it for these very privacy centered use cases.
And then you also asked about the private proxy.
So basically, the Privacy Gateway solution I was just talking about is a bit of a lightweight solution, because it's just proxying requests from client to server.
With Private Proxy, we're talking about tunneling all of the traffic from a device to that origin server.
So it's a little bit more heavyweight. Why would you want to tunnel all of the traffic?
Right.
Well, for Private Relay, obviously they've decided that with iCloud+ that's actually a really great feature to have for certain types of traffic or for people who have those additional privacy needs.
But also, with being able to tunnel all the traffic, there are all these risks of device-level fingerprinting and third-party tracking that become a lot less of a concern when you're tunneling all of the traffic through a private proxy service.
That makes sense and that helps me to understand the difference between them a little bit.
So going back, though, to Privacy Gateway, we have a pretty cool partner that we're working with on Privacy Gateway.
Can you tell me a little bit more about our partner Flo Health?
Yeah, so Flo Health is a period tracking app. And just a little bit of context for folks that maybe haven't been reading the news for the past few months or something like that: these apps were getting a lot of heat after Roe v. Wade was overturned.
A lot of women's health tracking apps and things like this were, you know, rightfully so, a little bit more worried after the overturn of Roe v. Wade about their data retention policies and about the potential footprint that their users were leaving as they were using those apps.
So Flo started working on this thing called Anonymous mode.
And Anonymous Mode is a really cool feature that basically allows one to use the functionality of their app without actually tying their personal health data to any personally identifying information.
So what it does is it just strips out all of the potentially identifying things, like emails, like your name, whatever could potentially be used to tie that data to you, and allows you to use it essentially anonymously.
So Flo's mobile app team had kind of figured out all the different pieces to strip that identifying information from their servers.
However, there's still a lot of identifying information, like we were talking about.
The requests need to come from somewhere, right? And requests also have all these things like headers, user agents, stuff like that, that can be used to track people over time.
And ostensibly you can imagine some sort of malicious actor kind of looking at the network traffic and correlating that with some user data or something like that.
That's where we came in with Privacy Gateway. We had actually just been exploring how to launch it, and it was pretty new.
And we're like, oh, is it ready?
But then Flo came along and we're like, it's got to be ready for this.
If anything, it's got to be ready for this.
And so we were able to implement it for them for Anonymous Mode.
And what's really cool is that if you go back to the original threat model that I was talking about, like how does the server now see requests, the server receives requests as if they were coming from Cloudflare's Privacy Gateway server instead of the individual client users.
So Flo is not in a place where they are even able to log the originating client IP addresses, because it's all coming from Cloudflare.
And I think that's really cool.
That is really cool.
So, I mean, I think that gets to what a lot of people were afraid of, which is that law enforcement in certain states that now have laws restricting abortion rights would go to a company like Flo and say, I want to know who's been accessing this, or I want to know something about a pattern of activity here. And now the IP addresses they'll turn over won't be able to be traced back to the individual user; they'll come back to us.
But we have no way of tracing that back to the individual user.
So yeah, that's.
Yeah, there are so many other, you know, this is almost like an emergency band-aid type of situation, right?
But what I want to encourage is for folks developing applications that have anything to do with sensitive data, whether it be, like this one, a health tracking app, or location data, secure communication, anything that could potentially be sensitive: they should also be thinking about these examples and how they could be mitigating these kinds of risks.
Yeah, absolutely.
And that's one of the I mean, that's one of the fundamental kind of privacy principles, right, is that you want to minimize data and you want to build in privacy by design.
And it's so cool that we are seeing these new privacy-enhancing technologies come out, and it really gives application developers, website developers, whoever, a lot of ways to be creative about how they can minimize data collection and really protect the data of their users.
Because we've all heard stories about, you know, the flashlight app that was tracking people's location.
Yeah.
And not that they did that on purpose, right? They're just like, I didn't realize I was using this SDK and was accidentally tracking the location of all my users.
Right.
But it's like, okay, it's 2022, time to do better. Yeah, definitely.
Definitely.
And, I mean, there certainly are so many ways everybody can build privacy into their applications that, yes, it's definitely time to do better.
Well, one of the things you mentioned was these standards and the IETF.
So can you tell us a little bit about this idea of a standards-based approach and how we're participating with organizations that are developing certain standards?
Yeah.
Yeah. So first of all, I really did not know about this whole standards world before I came to Cloudflare.
I'd heard about TLS, one of the most in-use and ubiquitous standards on the Internet, but I really didn't know a lot, and I credit a lot to our research team.
We have an amazing research team here at Cloudflare, and they do a lot of work with these standards-making bodies to push these things forward.
So for folks who don't know, a standard is basically a set of rules that anyone can go and implement and then achieve something called interoperability, that is, be able to communicate with another body, entity, or company implementing the same standard.
And what's really cool about this is that the work gets done for the entire ecosystem.
It's not something that just benefits Cloudflare.
It's not something that just benefits any other company or institution in particular, but it benefits every single player, and therefore every single user, that might be implementing or consuming these standards.
So an example of how standards work comes into the privacy space is, like I said, with Flo and with this Privacy Gateway product that we're launching today: it's based on a standard called Oblivious HTTP.
And Oblivious HTTP is a way of describing the encryption between the client and the origin server that we were talking about, and how the messages not only get encrypted but then get sent from each party to the other.
And what's cool about that is actually anyone can go and implement their own version of Oblivious HTTP.
Someone else could.
Some competitor of ours could go and implement Oblivious HTTP and call it whatever they want.
And that's how the standards world works: anyone can implement them. But what I think is interesting is this kind of juxtaposition: there's a standard that improves people's privacy, but then also, not all implementations are created equal, right?
Not all implementing bodies implement them in the same way.
Right.
So why I think Cloudflare particularly is well positioned to do a lot of work in this privacy standards space specifically,
and why I decided to come work here, is because people really trust Cloudflare, I think, and we just have a really great culture around doing work in the open, like we're doing right now, just talking about how we actually build these products and the ways that they're good, but also the ways that they can be improved.
And I think that with Oblivious HTTP, right, there's this notion of, you have this proxy in between, but do you trust that the proxy is actually doing the thing that they say they're doing?
And that's only as good as the reputation of the institution or company.
So I think that there's a lot of interesting forces at play in between, like the standards based work, but then also what we at Cloudflare are doing with it.
But the most I can do is just say, you know, we're really working hard at this.
And I think that there's always room to improve how transparent we are and how much more we can build in public around these standards.
Great.
Sorry. I think my zoom cut out for just a minute there.
But I was just going to say, we only have a couple of minutes left, even though I think we could probably talk about this for a couple of hours.
What's next for Privacy Edge, and what should we be looking out for in the future?
Yeah, so a lot of the stuff is, as I said in my blog post, early access.
So that means that these are early iterations of these products and we're looking for new partners to implement them with.
We have some really exciting ones that I can't talk about right now, but that hopefully we'll be able to announce in the next couple of months or so.
But I think that as we get introduced to more and more partners in this space, like for example with the WhatsApp code auditability stuff, we've been approached by some other folks that also have really interesting needs that are compatible with the standards-based approach we're taking, but just a different flavor of it, right?
And so we're working on some different flavors, or different iterations of deployments of these, that are really interesting, as well as how to deploy these things at scale.
So we're kind of like learning on the go.
Like I said, with Flo, we just kind of put it together because they were a really important use case.
But I think that there are still some tricky issues and tricky things at scale that we're still in the process of figuring out.
So we have a lot of cool partnerships on the line.
We have a lot of cool things that we need to do at scale.
There's also some new emerging standards that are coming into place.
One has to do with supply chain integrity.
There's another standard that might come along bringing the code auditability stuff to the browser.
There are so many different pieces for Privacy Gateway, like how we could improve the integrity of the system using cryptography and other techniques.
So yeah, there's just there's so much to do.
It's a lot.
There's a lot. Yeah, there's going to be a lot happening in this space, so.
You're not going to get very bored here, I don't think.
Definitely not.
So you mentioned partners.
We had the blog post today, so that's on our Cloudflare blog web page.
Is there information in there on how partners can reach out to us to learn more?
Yeah, perfect.
So at the end of the blog post, there's a link to a landing page that has a contact form and stuff like that, and that will go to me and the other members of our team.
Great.
Well, I think we're at about time here.
So Mari, thank you so much for telling us about Privacy Edge and all the really, really cool stuff that you're working on.
And I'm glad I get to work with you on this stuff, because it's so much of what, I think, keeps my job interesting.
Yeah, no problem.
Thanks, and thanks everyone for joining us.