Cloudflare U: The Problem with Using IP Addresses for User Identification
Presented by: Alex Chen, Nate Sales
Originally aired on August 18, 2021 @ 2:00 PM - 2:30 PM EDT
Join Alex Chen and Nate Sales, interns of Cloudflare's Research team, as they discuss the issues that stem from using IP addresses to identify users.
Hi everyone, I'm Alex and I'm here with Nate today to talk about our internship experience on the research team as well as the project we've been working on.
Nate, could you introduce yourself to our audience?
Sure. Well yeah, so my name is Nate Sales.
I'm a research intern this summer. I go to school at the Catlin Gabel School.
I'm a rising senior and I'm really excited to be here with you, Alex, to talk about this project today.
Yeah, likewise. So yeah, I'm Alex Chen. I'm a PM intern on the research team as well and a rising junior at Yale studying computer science and economics.
So I guess to start, I wanted to talk about what we've learned so far during our internship at Cloudflare.
So yeah, Nate, could you talk about your role and what it's been like working with your team?
Sure. Yes, this year I've been focusing on sort of the engineering side of research.
So I've been working with the research team and still working on research, but sort of focusing on the implementation side and some of the technical aspects of actually getting in and writing the code to build these projects.
So I'm going to talk a little bit more.
The project that I've been working on is related to CGNAT and multi-user IP detection.
But really, in general, about the role:
it's very team oriented, and I think that's one of my favorite parts about working here at Cloudflare: there really is an opportunity to connect with people outside your team and to work in this collaborative way, getting to meet a lot of different people from different teams in sort of a cross-functional experience.
Awesome. Yeah, so I've also had a really good experience working across teams.
This is my first product management internship.
Previously, I have done software engineering internships.
So it's been really cool to be able to exercise some of these soft skills and meet a lot of PMs and engineers on other teams and kind of work with a bunch of stakeholders to get this project over the line.
Nate and I did work on this same project.
So I guess you all will get to hear two different perspectives on kind of what it's like to work on this kind of thing with the research team.
So I guess first, Nate, do you want to provide some context into what CGNATs are and also just multi-user IPs in general?
Sure. Yes, I think one of the most important sort of background items to address here when talking about CGNAT and multi-user IPs in general is sort of the difference between IPv4 and IPv6.
So IPv4 is sort of the original version of the Internet protocol.
It's very old. It's been with the Internet since the very beginning. And IPv4 has a 32-bit address space, which seemed huge at the time, this massive number, millions and millions and millions of IPs.
And that scaled well for a few years.
But now that the Internet has grown so much, the v4 address space is just about used up.
And so then once that started to happen, people were wondering, what are we going to do to sort of increase this address space?
Because this is sort of a resource that has been exhausted.
And what do we do to fix this and to sort of expand the usable address space for IP addresses?
So the solution to that was IPv6, which instead of a 32-bit address space is a 128-bit address space, which is just enormous.
And so that pretty much solved the problem of assigning IP addresses, because there's just so many to go around.
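To make the difference in scale concrete, here is a small sketch that only does the arithmetic implied by the two bit widths mentioned above. The variable names are illustrative. Note that 32 bits works out to roughly 4.3 billion addresses in total.

```python
# Address-space sizes implied by the bit widths of IPv4 and IPv6.
IPV4_BITS = 32
IPV6_BITS = 128

ipv4_total = 2 ** IPV4_BITS   # every possible 32-bit address
ipv6_total = 2 ** IPV6_BITS   # every possible 128-bit address

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:.3e}")
print(f"IPv6 space is 2**{IPV6_BITS - IPV4_BITS} times larger than IPv4")
```

These are upper bounds on the raw address space; the number of addresses actually assignable to hosts is lower, because some ranges are reserved.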
There's no shortage anymore. The problem, though, is that not all of the Internet is v6 enabled.
So there are some sites that are still on IPv4, and there are some clients, some users at home that are still on IPv4 as well.
According to Cloudflare Radar, currently about 77% of Internet traffic is IPv4, and that's still the vast majority of traffic.
So we need some way to sort of interoperate between these two address families, still keeping v4 working, but while working on v6 deployment.
So CGNAT is one solution on sort of the carrier side, so for people that would be working on networks where humans are actually interacting through the Internet, as opposed to the server side, which wouldn't really be using CGNAT.
Operating on CGNAT is a way to have multiple users sharing the same IP address.
And this could be households or businesses, but it could be up to hundreds or even thousands of users sharing the same IP address.
And pretty much this is what CGNAT is, and that was some background for sort of what this project is looking at.
Awesome. Thanks for giving that context, Nate.
So yeah, Radar is a really cool dashboard that Cloudflare has. Check it out at radar.cloudflare.com.
You can see a bunch of stats on kind of the Internet, and one of those is IPv4 versus IPv6.
But yeah, the majority of traffic routed on the Internet today is still relying on IPv4.
And so CGNAT is one solution that's deployed that allows a bunch of private IP addresses to share fewer public IP addresses.
This can lead to problems for existing security products that rely on IP addresses as a way to kind of distinguish requests that are coming in from different origins.
And so yeah, this is kind of the problem that we've been examining through this project.
And we in the long term think IP addresses shouldn't be used at all to identify users.
There are a variety of kind of IP address obfuscation technologies that are being developed.
And even with the move to IPv6, which theoretically would allow every single user and every single device their own IP address, because the IP address space is so much larger.
We think that kind of with these developments, the industry as a whole, the Internet as a whole is moving in the direction of IP addresses not being relevant for this use case.
And so there is this problem that still exists today of legitimate users being kind of misidentified due to being behind middleboxes like CGNATs.
And that can cause security products to make the wrong decisions regarding kind of what to do with those requests.
So it is an important problem to solve. And it really is a balance of security, performance and reliability.
And it's one that we wanted to look at and attempt to solve with this project for customers.
So I guess we can start off with the process of how we actually went about building this product and also share kind of our experiences with this summer internship.
So as a PM intern, I started off with a bunch of information gathering chats, I think over 15, to kind of gather the insights needed to learn about who we were actually building this kind of product for.
And so this really involved speaking with PMs on a variety of teams, engineering managers, engineers, solutions engineers, and really anyone else who could kind of lend perspective on their use cases.
And after kind of gathering that information, I synthesized those into more, I guess, concise insights, and then summarized that in what's called a product requirements document.
And so this PRD was one of my main deliverables during this internship, where I kind of summarized the target customers and then also prioritized the requirements needed for our minimum viable product.
So this really describes kind of the capabilities that users have requested, and really kind of drives the development of the product itself.
And so again, I guess I can pass things off to Nate, to talk about kind of the functional spec process, and then also how he went about building this product.
Yeah, so pretty much once this PRD is done, which is going to describe the overarching goals and sort of objectives of this project, the PRD can then be sort of turned into a functional spec, which is starting to get into the technical details of really, like from a software sort of architecture level, how is this service going to be built, how it's going to be designed, how exactly it's going to work.
So that's sort of where the engineering part comes in, is turning sort of these goals and objectives that set up a product into sort of this very specific technical diagram about the exact sort of inner workings of the service.
So pretty much with this service, which is focused on multi-user IP detection, it sort of started out as a research project to figure out, could we build some sort of system to figure out which IPs could be CGNAT and could be multi-user.
And so pretty much there are two parts to that service. There's a variety of data sources, and there's about five different data sources that we use, using internal and external data.
And there's this scheduled job called an inserter, which pretty much goes through all those data sources, processes them and does all sorts of this sort of inference work.
And then it stores it in a database that has sort of a local cache, because one of the things we need is to not sacrifice performance in order to enhance security.
And so we have to keep these queries super fast and snappy.
So we have this sort of intermediary cache to store that data.
And then the second part is just an API, which exposes the data internally to Cloudflare services.
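The two-part design described above (a scheduled inserter that processes data sources into a cached store, plus an internal lookup API) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual service: the class and function names, the per-IP scores, the averaging step, and the 0.5 threshold are all hypothetical, and the talk does not say how the real sources are combined.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# A hypothetical data source returns (ip, score) pairs, where a higher
# score means "more likely to be a multi-user IP".
DataSource = Callable[[], Iterable[Tuple[str, float]]]


class MultiUserIPCache:
    """In-memory stand-in for the local cache in front of the database."""

    def __init__(self) -> None:
        self._scores: Dict[str, float] = {}

    def put(self, ip: str, score: float) -> None:
        self._scores[ip] = score

    def get(self, ip: str) -> float:
        # Unknown IPs get a score of 0.0 (treated as single-user).
        return self._scores.get(ip, 0.0)


def run_inserter(sources: List[DataSource], cache: MultiUserIPCache) -> None:
    """Scheduled job: read every source, combine scores, refresh the cache."""
    combined: Dict[str, List[float]] = {}
    for source in sources:
        for ip, score in source():
            combined.setdefault(ip, []).append(score)
    for ip, scores in combined.items():
        # Averaging is an assumption made for this sketch only.
        cache.put(ip, sum(scores) / len(scores))


def is_multi_user(cache: MultiUserIPCache, ip: str, threshold: float = 0.5) -> bool:
    """Lookup that an internal API endpoint could expose to other services."""
    return cache.get(ip) >= threshold


# Two made-up sources using documentation-range addresses.
def source_a() -> Iterable[Tuple[str, float]]:
    return [("203.0.113.7", 0.9), ("198.51.100.4", 0.1)]


def source_b() -> Iterable[Tuple[str, float]]:
    return [("203.0.113.7", 0.7)]


cache = MultiUserIPCache()
run_inserter([source_a, source_b], cache)
print(is_multi_user(cache, "203.0.113.7"))
print(is_multi_user(cache, "198.51.100.4"))
```

Keeping the lookup path to a single dictionary read mirrors the point about fast queries: all the inference work happens in the scheduled job, not at request time.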
So we can use this in our products.
Got it. Yeah, so it is important: the internal API aspect is what allows us to then consume that information with other services at Cloudflare and deploy it at the edge.
So it can actually be useful for our teams. So yeah, Nate, let's say you're looking at a bunch of request data, and you want to actually determine which of these requests is coming from a multi-user IP.
What kind of data points have you found useful for your research?
Sure. Yes, as I mentioned, there's about five different independent data sources.
And there's a lot of data points within those sort of five categories.
So some of the internal data that we're leveraging is a very simple count of number of source ports open to an IP address.
And this is starting to get into sort of how these CGNAT boxes work, by translating sort of these open connections coming from clients into a single source IP with lots of open source ports.
So let's say that we have five households, and they all are going to have some Internet clients talking to servers on the Internet.
And they're all going to be put through the same CGNAT middlebox.
So what happens is that all these open source ports coming from all the devices are going to be translated and sort of compressed into a single IP, which is then going to have all of these source ports sort of cumulatively added up from all those Internet clients to that single IP address.
And so one of the things we can look for is we can look for IPs that have a disproportionately large number of open source ports, which could help identify which IPs may be used by multiple users.
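The source-port signal described above can be illustrated with a short sketch. It assumes connection records are available as (source IP, source port) pairs and flags IPs whose count of distinct source ports is far above the median. The function names, the synthetic traffic, and the factor of 5 are assumptions for illustration; the talk does not say what threshold or baseline is actually used.

```python
from collections import defaultdict
from statistics import median


def distinct_ports_per_ip(connections):
    """Count distinct source ports observed for each source IP."""
    ports = defaultdict(set)
    for src_ip, src_port in connections:
        ports[src_ip].add(src_port)
    return {ip: len(seen) for ip, seen in ports.items()}


def flag_possible_multi_user(connections, factor=5.0):
    """Flag IPs with disproportionately many source ports vs. the median."""
    counts = distinct_ports_per_ip(connections)
    baseline = median(counts.values())
    return {ip for ip, n in counts.items() if n > factor * baseline}


# Synthetic traffic: three ordinary IPs with four connections each...
connections = []
for ip in ["198.51.100.1", "198.51.100.2", "198.51.100.3"]:
    connections += [(ip, 40000 + p) for p in range(4)]

# ...and one public IP fronting five households with twenty connections
# each, as a CGNAT middlebox would present them.
connections += [
    ("203.0.113.9", 10000 + household * 100 + p)
    for household in range(5)
    for p in range(20)
]

print(distinct_ports_per_ip(connections))
print(flag_possible_multi_user(connections))
```

A high port count alone is only a hint: a single busy client can also open many connections, which is one reason to combine this signal with other data sources.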
Awesome. Yeah, so kind of the information that is then offered to our teams can be really useful for certain products.
One example is a rate limiting product.
So being able to kind of protect websites when they get kind of hit with many requests, potentially from the same IP address.
And so we want to be able to distinguish cases where it's malicious traffic being directed to a website versus a bunch of people sitting behind a CGNAT middlebox that has just translated or is only exposing, you know, a few public IP addresses at once.
And so if those people behind the CGNAT middlebox previously tried to request a site that was protected by rate limiting, that could cause issues there with them being falsely challenged or blocked.
And so really the goal of this project is to kind of streamline the experience for those users that were previously mistakenly blocked, while at the same time still offering our security products and securing our customer sites.
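One way a rate limiting product might consume a multi-user signal is to give flagged IPs a larger request budget per window. The sketch below shows that idea only; the limits, the factor, and the "challenge"/"allow" actions are hypothetical, and the talk does not describe how any Cloudflare product actually applies the signal.

```python
from collections import Counter

BASE_LIMIT = 100         # hypothetical requests per window for a normal IP
MULTI_USER_FACTOR = 10   # hypothetical extra allowance for shared IPs


def limit_for(ip, multi_user_ips):
    """Return the per-window request budget for an IP."""
    if ip in multi_user_ips:
        return BASE_LIMIT * MULTI_USER_FACTOR
    return BASE_LIMIT


def decide(requests_in_window, multi_user_ips):
    """Map each source IP seen in the window to 'allow' or 'challenge'."""
    counts = Counter(requests_in_window)
    return {
        ip: "challenge" if n > limit_for(ip, multi_user_ips) else "allow"
        for ip, n in counts.items()
    }


# Both IPs send 500 requests, but only one is flagged as multi-user.
requests = ["203.0.113.9"] * 500 + ["198.51.100.1"] * 500
decisions = decide(requests, multi_user_ips={"203.0.113.9"})
print(decisions)
```

The trade-off is the balance mentioned earlier: a larger budget reduces false challenges for users behind a shared IP, but it also gives an attacker on that same IP more room, so the factor would need tuning against real traffic.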
So yeah, I guess another question I had would be, if we have the time, Nate, what else do we want to look at?
Are there any other interesting data sources that exist?
And like what else do we want to build out?
Yeah, so definitely a lot more to explore here.
That's just sort of the tip of the iceberg in terms of the research outlook to this.
We have dozens more data sources that we started to look at.
Some of the things are like PeeringDB, using certain models to sort of infer network type based on traffic, and some research that's going on currently.
One of the big things that I think is important is getting some ground truth validation data.
And so we have all of this sort of inferred data that we can look at source ports, user agents, and things like that to sort of determine if an IP probably is multi-user.
But really what we have to do is figure out if there is a way to validate this data.
So one of the things that we're currently doing is reaching out to some of our ISP partners to see about confirming or denying if any of these IPs truly are used by CGNATs or sort of multi-user middlebox translators.
Currently the validation process uses a service called RIPE Atlas, which is offered by network operators sort of as a way to run queries from these little probes, these little network boxes that are located all around the world.
And so one of the things we can do is we can use these RIPE Atlas probes, which are located within networks, and we can do a traceroute out of the network and we can see if there are any intermediary hops, things in sort of the shared or private address space that could be used as part of a NAT or other sort of translation mechanism.
And so currently the validation works using RIPE Atlas, which is the method that we've been using to look at sort of more ground truth data compared to just pure inference like we're doing with source ports and user agents.
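The traceroute check described above can be sketched with Python's standard ipaddress module. The sketch assumes the hop addresses have already been extracted from a traceroute result as strings; fetching results from RIPE Atlas is not shown. Treating any shared or private hop after the first one as a hint of carrier-side translation is an assumption made for illustration, not the team's stated rule. The ranges listed are the RFC 1918 private blocks and the RFC 6598 shared address space.

```python
import ipaddress

# RFC 1918 private ranges plus the RFC 6598 shared address space,
# which is reserved for carrier-grade NAT deployments.
SHARED_OR_PRIVATE = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "100.64.0.0/10")
]


def in_shared_or_private_space(hop):
    """True if the hop address falls in a shared or private range."""
    addr = ipaddress.ip_address(hop)
    return any(addr in net for net in SHARED_OR_PRIVATE)


def suggests_upstream_nat(hops):
    """Heuristic: ignore the first hop (typically the home gateway) and
    report whether any later hop is still in shared or private space."""
    return any(in_shared_or_private_space(hop) for hop in hops[1:])


# Hypothetical traceroute paths, using documentation ranges for public hops.
path_with_cgnat = ["192.168.1.1", "100.64.12.1", "203.0.113.1", "198.51.100.20"]
path_without = ["192.168.1.1", "203.0.113.1", "198.51.100.20"]

print(suggests_upstream_nat(path_with_cgnat))
print(suggests_upstream_nat(path_without))
```

The ranges are listed explicitly rather than relying on the `is_private` attribute, because 100.64.0.0/10 is not classified as private by the ipaddress module.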
Got it. And yeah, from a product side, it's unfortunate, yeah, that our internship is ending next Friday, but we definitely would want to kind of integrate with more products and also establish more feedback loops so that we can kind of gather additional validation data that can be used to further refine the accuracy of our multi-user IP detection.
And so, yeah, also reaching out to ISP partners, as Nate mentioned.
I guess taking a step back from our project, kind of curious what the most challenging experience you had this summer was, Nate.
That's a good question. I think some of the most challenging experiences.
I think one of the things that's sort of a bit challenging is starting to adapt to a brand new development environment.
Going from sort of my past work, which was more of an academic setting, into sort of this industry setting, which I think is a little bit different in terms of adapting to review processes and things like that.
But I think that Cloudflare really does a very good job about this in sort of keeping communication open between teams, which I think is definitely something that helps with sort of navigating this development process and things like that, because there's always someone that's open and willing to answer questions if you get stuck.
I'm curious, sort of from the product side, what do you think were some of your most challenging experiences this summer, Alex?
Yeah, definitely I had a lot to learn for this summer, this being my first PM internship.
I think it really was learning how to communicate effectively, make sure everyone was kind of updated on the status of our project and bought into kind of our vision for how we want to take it and the prioritization of the requirements that were outlined in the PRD.
And then even beyond that, just how to drive an effective meeting.
I think that was something that I hadn't really, it was a skill I didn't really get the chance to leverage much at previous software engineering internships.
And as a PM intern, it really kind of that responsibility fell on me.
And so coming into meetings prepared with an agenda, the points we wanted to cover, any questions that came up, taking good notes, and then sending those notes to everyone involved and making sure everyone was in kind of agreement of what the next steps were.
Those were all kind of really important skills that I got to kind of learn over the course of this entire summer.
So I think the challenging aspect of this internship really was, yeah, adapting from that engineering mindset to more of a product management mindset.
Yeah, so I guess another question, Nate, for you is do you have any advice for future interns?
You're really a special case here with still being in high school.
And so, yeah, I guess any advice to other potential interns that are looking forward to potentially interning at a company like Cloudflare?
Sure. Yeah, so I think sort of the biggest advice I would give and something that I started to take advantage of sort of later on in the internship is really just setting up meetings with people.
I found that people are super willing to chat, people from all different teams, a little bit like I mentioned before, people are just so open and willing to share things.
So I would say like absolutely reach out, find someone that's doing interesting work, someone you're interested in, someone that's doing something different maybe than what you're working on.
People are so willing to just put down 15, 30 minutes for a quick chat just to get to know each other and to sort of expand with the type of work you're familiar with.
People from different teams, I got to chat with all sorts of people, which has been super fun.
I'm curious, for the same type of question, Alex, what do you think would be your biggest advice for future interns?
Yeah, definitely. So I agree with you. It's been really cool being able to chat with so many brilliant people at Cloudflare and not just within the product engineering organization and obviously the research team.
Yeah, I think speaking with people in business intelligence, solutions engineering, and yeah, the legal, the huge number of teams at Cloudflare, it's been really cool kind of getting their perspective and being able to learn while we've been here interning.
I think another piece of advice I'd have for interns that are interning at a company like Cloudflare or interning in tech is really kind of focusing on establishing your learning mentality.
I think this has been kind of a really important, yeah, really important attitude that I've come into Cloudflare with.
And I think it's really paid off as far as approaching kind of difficult situations or being able to kind of resolve disagreements.
I think I've learned a lot from making mistakes and I've learned a lot from the brilliant people at Cloudflare and really thinking of every problem that comes up as an opportunity to learn, improve, and deliver a better final result for our customers has been really, really beneficial to me.
So I think, yeah, making sure you have the right mentality as an intern is probably my advice.
We did get actually a question from the audience. So it's, this is great.
Could you share a bit more about how each of you developed your technical know-how?
If you're just getting started on your technical journey today, where would you begin?
Nate, you want to start off?
Yeah, sure.
I can start with this one. Yeah. So I think sort of what's been most useful for me is really getting hands-on and trying things.
So definitely like before Cloudflare, there was a lot of time just sort of spent, especially during quarantine where I've been inside for a while, really just finding some interesting problems to tackle and then actually starting to write some code, getting your hands in there, working on something and just trying to figure it out.
This is definitely something that I think Cloudflare encourages, sort of this time for experimentation, especially from a research side and getting this freedom and flexibility to really find some interesting question that you want to answer, some problems to solve, and then really just like totally put everything towards that and just focus on figuring this out with the support of everybody around you, I think, which is great.
I definitely think sort of the true hands-on aspects has been really fun and I think educational for me.
Awesome. Yeah, and I guess I can share more of a kind of college perspective.
So I really had minimal programming experience, you know, before college.
I thought I was going to be a mechanical engineer and was involved in things like FIRST, FRC, like the robotics competition in high school, and so I think I was definitely technically inclined, but not so much on the software side of things.
So coming into college, yeah, I kind of started just with an intro CS class and was really bad at it, and so I still tried to, you know, stick with it, and yeah, I actually think I dropped out of my data structures and algorithms class, which is the second CS class I took, just because I was doing so poorly in it, and I think really just like speaking with the professor and going to office hours the second time around.
I kind of changed my approach to focusing a lot more on the kind of abstract problem-solving component to software development and coding, and I think that really helped.
So spending, I guess, less time just trying to churn out as much code as possible, working on both projects and assignments, and more time really, you know, trying to understand abstractly, like, you know, what was the kind of approach I should be taking to solve these problems, and so it, yeah, I think by spending more time on the whiteboard and less time actually coding, that proved, like, super helpful during my technical journey, and yeah, I think definitely echo your point.
Projects are a great way to learn.
There are a lot of resources online to get you started, and then also internships.
I had really good experiences during my software engineering internships at two different places, and I think it's really cool to be able to get different perspectives on what it's like to work in a production environment where the stakes are a lot higher, and there's more testing, more collaboration.
You know, your product impacts a lot more users than most likely your personal projects, and so yeah, I would highly recommend kind of seeking out those experiences as well.
I totally agree.
Awesome. Well, yeah, thanks for that question, and yeah, feel free to ask more questions if anyone in the audience has any.
I think another thing I wanted to ask, Nate, is, I guess, what were some of the interesting, like, trade-offs or, like, interesting engineering decisions that came up during the development of this project, and, like, how did you kind of determine, like, which was, or what direction to take?
Sure. So I think one of the main ones was sort of balancing this limited time span of the internship.
I think especially with a project that is very research focused, there is always more work to do, and so I think one of the biggest sort of things I was focusing on towards the beginning was trying to sort of clearly scope out this project in sort of manageable chunks that can be achieved in this limited two weeks that we have in terms of time frame.
So I think that, like, that's definitely an important thing to start to think of at the very beginning, starting to plan sort of what this product is going to look like in terms of a time perspective, because for me, there's always more work that can be done.
There's always more learning to be had.
And definitely, sort of trying to boil it down just to these five key data sources that I talked about earlier was important to be able to actually, like, deliver and ship code on time.
I think that's key.
Got it. Yeah, I think I've also had to make some, you know, product trade-offs, and this really comes with kind of prioritizing what the MVP requirements needed to be.
And so, yeah, I think really timing was probably the biggest limitation, and I think it is for a lot of interns.
Like, what can we actually accomplish during the summer? So being very, very focused in scope was definitely something that drove a lot of the prioritization decisions being made.
All right, well, I think we're gonna wrap up a little early.
We highly recommend, if there are any, like, prospective interns or students in the audience, apply for the Cloudflare internships.
I think both the PM and engineering internships here are excellent, and I've definitely learned a ton over the course of the summer.
I think for sure, I totally agree.
Totally echo what Alex said. Absolutely. I think we're coming to the end of the time here, so thank you all very much for watching.
Thank you.
My name is Jagger McConnell.
I am the CEO of Crunchbase. Crunchbase is the world's leading provider of private company information on the Internet.
We have over 30 million people coming to us using us for finding their next investment, finding their next investor, even salespeople going and finding their next opportunity, all within Crunchbase.
When we break a news story, when we have a new funding event happening that's a big deal, tons of traffic comes into Crunchbase, and those are the most critical times for us to stay up.
If we go down, our users would be disappointed.
They'd go find that news somewhere else, and that's when we honestly rely on Cloudflare at the most.
I don't have the bandwidth or the headcount to go and have a huge team trying to keep our servers up on time, trying to mitigate those attacks, and that's when you go and turn to a vendor to go and say, look, this is your core competency.
Please do that for us because you're going to do better than we can.
From the CDN perspective, 92% of our traffic goes to the CDN servers, so they don't actually touch our origin servers.
It's pretty amazing. That means that when those spikes happen, the traffic is distributed across that entire network of servers, which not only saves us bandwidth, and that ultimately saves us cost, but also from an end-user perspective, things are super fast, and that makes them happy.
Cloudflare allows us to go and look like we have a huge international presence by having 115 different servers worldwide.
It's actually pushing our data out to those edge servers, and that allows, from our end-user perspective, to have a faster experience because the servers are right next to them, so there are very few hops.
Even though they're in Australia or in Asia, they have an incredible experience using Crunchbase because of that CDN.
The most critical thing is that they can rely on us.
They know that when they go to crunchbase.com, they find the thing that they expect in having that server up and running.
Cloudflare lets us deliver on that promise, that expectation that that user has by being up and running.
When you're a large website, you have a huge target on your back, so Cloudflare helps us go and mitigate that risk with our DDoS mitigation, with their WAF protection.
Just last month, we had over 1,600 different attacks.
It's funny. Lots of people crawl us, as you might imagine. When we're a data set, people go and try to get our information in whatever way they can, but what's funny is that sometimes they're not the best coders when they try to go and write those crawlers.
What it actually becomes is a DDoS attack where it's one IP trying to just get as much information from the same page over and over and over, millions and millions of requests.
They don't even probably intend to be doing that.
We see those sorts of attacks all the time. That's, again, where Cloudflare can step in and just make sure that doesn't happen.
That is an important thing for us from a security standpoint.
Obviously, with all these hackers trying to do ransomware and trying to take down our servers, having the Cloudflare protection there ensures that we are safe.
Honestly, I think the hackers just go somewhere else when they see Cloudflare protecting our servers.
Thanks for watching.