🔒 Dispelling Privacy Myths
Presented by: Tara Whalen
Originally aired on December 29, 2021 @ 2:30 AM - 3:00 AM EST
Privacy is often complex, which can lead to confusion about important concepts. This session highlights and clarifies some common misunderstandings, to help you approach privacy challenges and solutions more effectively.
English
Privacy Week
Transcript (Beta)
Hello. Thank you for joining today's session on Dispelling Privacy Myths. My name is Tara Whalen.
I'm a research lead in privacy here at Cloudflare, and I'm pleased to give you a presentation on privacy during Cloudflare's Privacy and Compliance Week.
So what's this session all about? Well, anyone who's been working in privacy can tell you that privacy is a pretty complex topic.
There are a lot of ideas, concepts, considerations, requirements, and it's really easy to get confused.
So this session is meant to highlight and clarify some common privacy misconceptions to help you better understand many aspects of privacy.
This could be a privacy requirement. This could be a privacy regulation.
It could be a foundational concept in privacy. I can also talk to you about some possibilities in privacy and some new opportunities.
The idea is that, in practice, having a better understanding of privacy will help you better develop products and solutions, particularly those that have requirements for privacy and for protecting your users' data.
Now in terms of today's talk, which is only 30 minutes long, I did think about the fact that there are a lot of privacy misconceptions that one could choose from to talk about.
So I thought given that this was Privacy and Compliance Week that I would choose some concepts around data protection.
This session will involve clarifying terms that are often confused with one another, and describing ways in which privacy concepts overlap with, or differ from, one another.
And I also want to highlight aspects of privacy that may be overlooked. Depending on how you're thinking about privacy, you may not be thinking through all of the aspects that you need to consider.
And I also want to underscore the importance of how terms are used in privacy.
There can be differences in their formal usage and their informal usage, which become important when you move into more regulated spaces.
Before I dive into that, I thought I would talk a little bit about my own privacy background.
As I mentioned in the introduction, I currently work in privacy research at Cloudflare, where I help to build a better Internet with a particular privacy lens.
So I'm a Canadian working in the United States, which means I've worked in privacy in a couple of different countries.
I've had a long career in privacy and security, and I've worn a lot of different hats in my time in privacy.
So I am a computer scientist.
So I have spent time as an engineer. I've spent time building products and services.
I'm also a scholar and an academic. So I have also spent a lot of time thinking about privacy and the complexities of privacy.
And I also worked as a technical expert at a privacy regulator.
So formerly, I worked at the Office of the Privacy Commissioner of Canada, where there was much more focus on the regulatory aspects of privacy.
And all of these different roles helped me to understand different perspectives.
So I understand a lot of the needs for building product.
I understand the complexities in the concepts of privacy from the scholarly work.
I also understand a lot of the regulatory context around privacy. But I will note, as you will see in the footnote, that I am not actually a lawyer.
A lot of the discussion today is going to involve regulation, but I am not providing a professional legal opinion or interpretation of it.
I will give you my best knowledge of these things, but recognize this is not a legal interpretation in any way.
Here is the overview of the particular privacy misconceptions that we're going to talk about today.
You can see these are generally all around those basic concepts around privacy, around data protection, and what personal data is versus personally identifiable information, private data, sensitive data.
These things have nuances that become important when you want to think about what the implications of these definitions are when you are in the privacy space and, for example, trying to roll out a product or service.
We're going to start out with one of the more complex aspects.
Ordinarily, I would not begin by jumping onto one of the more complex topics right out of the gate, but I thought it was important in this case to set the stage, to introduce some of the concepts that we're going to return to repeatedly throughout the session.
One of these is around privacy and data protection.
Now, what I'm going to say is that there's no real clean distinction between these, and legal scholars argue a lot of the finer points, but what is worth noting is that these terms can be and often are used differently.
I have tried to highlight some of the more significant aspects in the table of privacy versus data protection.
I'll note, for one thing, privacy tends to be very broad.
So it's a more social concept with a lot of facets, and we'll go into that in a moment.
And when you talk about data protection, in contrast, this is also very complex.
It isn't really very simple, but it is more focused.
So it is more focused on data-specific aspects, whereas privacy may cover more than data.
When we talk about these things informally, you'll probably also note that when people talk about data privacy, they're often really discussing confidentiality: keeping data away from some parties and sharing it with only a limited number of others.
When we talk more formally around things like data protection, you'll note that the conversation becomes broader.
It isn't necessarily just about confidentiality.
It isn't necessarily just about the flows of the data, but it will incorporate other aspects of the data as well, things like data accuracy.
And again, I'll go into all of these points on the following slides.
Now, to make things a little bit confusing, I will also note that some regulations will use the term privacy when they're talking about data protection aspects, and some will call it data protection.
So you can't just rely on the term in a vacuum to know what aspect is being discussed.
I will also note that privacy is more often found in things like human rights legislation, where it's discussed in terms of rights to private life and privacy of the person, and its links to human dignity.
Sometimes you will find data protection in human rights-focused legislation or rights legislation.
The EU Charter of Fundamental Rights, for example, has a specific right around data protection.
But you won't always find this to be the case in other regulations. So we'll jump right into privacy.
This, as you can well imagine, is not easily defined.
And I am not going to try today either to define it in a short session. I'll note that there are a lot of concepts, a lot of related aspects that you would put under the privacy umbrella.
And some of these relate to data privacy, but they don't all relate to data privacy.
It really is a very, very broad domain with a lot of aspects to consider.
So here's an example of one way of laying out privacy.
This is not the only way. But Daniel Solove wrote a paper in 2002, "Conceptualizing Privacy," which laid out, from the perspective of U.S. legal scholarship, what privacy means, and in which he highlighted it as a family of related concepts.
So, again, this is just one way of laying it out. You may choose other ways, but I found it a good summary.
So you'll note on the left, there are about six different ways of conceptualizing aspects of privacy.
Elements such as the right to be let alone is a very important concept in American legal scholarship.
It was thought to be where the privacy rights had their foundations with Warren and Brandeis all the way back in 1890.
We also talk about things like limited access to the self, like your ability to not have others have access to you as an element of privacy.
The next two are more tightly connected to this idea of data protection and, to some degree, data privacy.
The idea of secrecy, if you want to conceal particular matters from other people.
So, again, that's very much that idea of access control and confidentiality.
There's also control over personal information.
So, again, very much this idea around what data am I sharing with an organization and who can they disclose it to and what can they use it for, which all involve control over information about me.
And the next two are, again, broader and more closely aligned to this idea of the person.
So personhood, protecting dignity and using privacy as a way of supporting the protection of the person and self.
And intimacy, also having aspects of life that are set apart and using privacy to control access to that part of your life.
So, again, lots of broad concepts to bear in mind when thinking about this big idea of privacy.
But then we move to data protection.
And I'll note that many of the regulations around how you do data processing will use the term data protection and not privacy.
It does depend, but you will often see data protection used.
And, of course, the most well-known example, we have the General Data Protection Regulation or the GDPR in the EU, which many of you may be familiar with.
You will see data protection right in the middle there, where it's spelled out very clearly.
And in data protection regulations, you describe lots of ways in which you will appropriately handle data.
So there could be principles around personal information handling, the appropriate practices for this.
An example would be data minimization. So the idea that you only collect the information needed to deliver a particular product or service, and no more than that, and restrict it to the purpose for which it was collected.
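To make that concrete, here is a minimal sketch, in Python, of what data minimization can look like in code: only the fields needed for a stated purpose are kept, and everything else is dropped before storage. The purposes and field names here are invented for illustration, not taken from any particular regulation or product.

```python
# Hypothetical sketch of data minimization: collect only the fields needed
# for a stated purpose, and drop anything extra before it is ever stored.
# Field names and purposes are illustrative only.

REQUIRED_FIELDS_BY_PURPOSE = {
    "shipping": {"name", "street_address", "city", "postal_code", "country"},
    "newsletter": {"email"},
}

def minimize(submitted: dict, purpose: str) -> dict:
    """Keep only the fields needed for this purpose; discard everything else."""
    allowed = REQUIRED_FIELDS_BY_PURPOSE[purpose]
    return {k: v for k, v in submitted.items() if k in allowed}

form = {"email": "a@example.com", "birthday": "1990-01-01", "phone": "555-0100"}
print(minimize(form, "newsletter"))  # {'email': 'a@example.com'}; birthday and phone are never stored
```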
Now, there are some requirements that are not really about data flow, and they're not really about confidentiality.
So there are some additional principles, for instance, that talk more about the appropriate handling in a broader sense.
So the idea, for example, of data accuracy. So if you have information about a person, you have an obligation to ensure that that information is accurate.
And the data subject can get information about what data is being held about them and are able to make corrections so that what you have about them is correct.
One doesn't necessarily think about that when you think of privacy mostly in terms of confidentiality and data flow.
You might not think about data accuracy.
So there are some elements you have to bear in mind when you think about data protection.
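As an illustration of the accuracy principle just described, here is a small hypothetical sketch of a rectification flow: the data subject can see what is held about them and submit corrections that overwrite the stored values. The record store and field names are made up for the example.

```python
# Hypothetical sketch of a rectification flow: the data subject can view
# what is held about them and supply corrections, keeping records accurate.
# The store and field names are illustrative only.

records = {"u42": {"name": "Grce Hopper", "city": "Arlington"}}  # note the misspelled name held on file

def view_my_data(user_id: str) -> dict:
    """Let the data subject see exactly what is held about them."""
    return dict(records.get(user_id, {}))

def correct_my_data(user_id: str, corrections: dict) -> dict:
    """Apply corrections supplied by the data subject."""
    records[user_id].update(corrections)
    return dict(records[user_id])

print(view_my_data("u42"))
print(correct_my_data("u42", {"name": "Grace Hopper"}))
```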
So some highlights for you there.
So in practice, I know that in the tech industry, where I work, when you see privacy, it's generally referring to the more data-protection, data-focused elements of privacy.
You do need to think about all the requirements of data protection, not just the confidentiality part.
And I'll note that a lot of these discussions are more common now.
We've had GDPR, which has definitely been spreading and growing since it was launched in 2018, and there are other similar laws.
And so there's more material available to learn about these things.
And a lot of businesses have now had to consider these requirements as they have to be compliant with the GDPR.
Additionally, regulations may include support for individual rights as well.
So GDPR has, again, right of access.
So that's that idea that the data subject can ask about the data that you have about them.
They have the right to have their data provided in a form that lets them move it from one service to another, this idea of portability.
And again, might not be the first thing you think about for privacy, but there are other requirements that you have to bear in mind while you are being compliant with data protection regulation.
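As a rough illustration of access and portability, here is a hypothetical sketch that gathers everything held about a data subject and exports it in a common, machine-readable format (JSON). The data store and function names are invented for the example, not drawn from any real system.

```python
# Hypothetical sketch of supporting access and portability requests:
# gather everything held about a data subject and export it in a common,
# machine-readable format. The data store is a stand-in.

import json

user_store = {
    "u123": {"profile": {"name": "Ada", "email": "ada@example.com"},
             "orders": [{"id": 1, "item": "book"}]},
}

def export_subject_data(user_id: str) -> str:
    """Return all records held about this data subject as portable JSON."""
    records = user_store.get(user_id, {})
    return json.dumps({"user_id": user_id, "data": records}, indent=2)

print(export_subject_data("u123"))
```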
But I'm also going to note that you want to think more about broader privacy concerns.
I did do this focus on data, but there are some other things to think about.
We did have that broad umbrella of privacy.
So you may be developing a feature, for example, and you're absolutely fulfilling all of the data protection requirements, all of your systems and your processes are in place.
And you may have a user who may still feel this feature is invasive.
It might not be something about the data per se. It might be about other aspects of privacy.
Maybe it's about the limited access to self component and not really so much about the data focused elements.
So you have to have a broad perspective when thinking about the challenges of privacy.
So I wanted to take a moment to talk about personally identifiable information, or PII, and differences between this and personal data.
I'm going to note PII, in my experience, tends only to be used in the United States.
It's a similar idea to personal data, which is the term that's used in the GDPR, but they're not really the same thing.
So personal data was set up deliberately to be broad. So again, it's "any information relating to an identified or identifiable natural person," taken straight from the text of the GDPR.
Again, very broad in the types of information.
It can be something that directly identifies a person. It can be things where it can be put in combination to identify a natural person.
PII, in general, I'd say, can be considered a subset of personal data.
It tends to be more narrowly scoped.
There isn't really one single definition of PII. In the United States, for instance, there's no nationwide definition.
Things may be sector-specific.
There are some very broad, expansive definitions, and some of them, again, are very specific types of information.
And I'll give you an example of this.
So the U.S. Office of Privacy and Open Government, within the Department of Commerce, gives a definition of personally identifiable information.
And you will see some examples that they give, such as social security number or biometric records, which, again, can be linked to a specific individual.
Or it can be information that becomes identifying when combined with other information, such as date and place of birth. Compare that to the California Song-Beverly Credit Card Act, which is very specific and very much, as you can see, about credit cards and data processing.
And even here, they've changed the term slightly to personal identification information.
And it's information about a cardholder and talks about information set forth on the credit card, such as cardholder address and telephone number.
So they are not defining it in the same way.
These are both talking about information that is connected to a person, but they've framed it in a very different way.
So what's my recommendation around PII and personal data?
I generally recommend that, if you are trying to talk broadly about personal information and personal data, you not use the term PII unless you need to.
There are times when it's very appropriate. So if you are making reference to a specific regulation, which includes the term PII, and you are trying to make reference to that definition, then it's absolutely appropriate.
Apart from that, there is this risk of confusion because it means a lot of different things.
And depending on the definition, it may not actually incorporate those aspects of the data that you are trying to protect.
And I will note that absolutely be careful not to use it interchangeably with terms such as personal data, which in a lot of contexts have a very specific meaning.
So you can't really swap those out one for the other.
So now on to private versus personal data.
So here's a distinction to think about when considering regulation.
This is a topic that has come up for me over the years in a variety of different contexts.
People get confused about privacy when they think about data as private, when really they need to be thinking about whether it's personal.
So the key distinction here isn't whether the data is private or public; that's not the key differentiator for the protections required.
It's whether the data is personal or not personal.
So you have to be a little bit careful to think about whether it's personal.
Is it connected to an identified or identifiable person? The regulation is often around personal data.
So things like the GDPR talks about personal data.
So data can be personal and also be public. The fact that something is public data doesn't mean it isn't personal data anymore, for instance.
The simple example I often use: suppose you had a data breach.
There was a whole bunch of personal data that got exposed.
So that was made public. You wouldn't say that it's not personal data anymore just because it's available to you in the public space.
That's not the consideration that you make when thinking about what is the nature of the data.
It would still retain all the requirements for data protection.
So you can't make an assumption that because something is available to you publicly that it need not still be considered personal data.
That is not the axis to look at.
Again, it's personal versus non-personal, not private versus public.
Of course, you can't just take personal data and make it public because you have requirements around controlling disclosure.
But absolutely, it's the personal component of it that you have to keep front of mind.
There's often also discussions around sensitive data versus personal data.
Informally, you will hear discussions around sensitive data in a generic sense: data that has special requirements, that needs to be particularly well protected, where there is a very high risk should something happen to it.
So if you mishandle the data and inadvertently disclose it, for example, then there could be a high risk to an individual.
There could be an infringement of a right or a serious harm. If you want to talk about the formal ways in which sensitive data is described, there are specific requirements in regulation about this.
And they're in the same spirit as the common sense way of talking about sensitive data, but they are much more tightly and carefully defined.
So I'll take the GDPR as an example again in Article 9, which has a discussion of sensitive personal data.
Again, it's still personal data, but it's a particular type.
And you will note there are several categories of personal data under consideration: racial or ethnic origin, political opinions,
religious or philosophical beliefs. And something that some people, depending on your country or background, may not recognize:
trade union membership is considered something that could be of high risk.
Also, genetic data, biometric data are also considered sensitive.
And of course, the health, sex life, sexual orientation.
So the degree to which these are considered sensitive is such that you'll see that the processing is prohibited with some narrow exceptions or very carefully constructed exceptions.
So the default is that this data is not available for processing.
And then you have to demonstrate why it is appropriate for you to be handling this kind of data, recognizing the high risk.
What is defined as sensitive is very much going to be regulation-specific.
So you want to see what the requirements are for your situation.
You may be working in a particularly regulated space that talks about what types of data are sensitive.
There may be considerations around health data, for example.
And this may require special handling of data.
There may be processes that you need to follow. You may need to do documentation.
You may have to do a privacy impact assessment, for example, before you would handle data like this.
So be very careful when dealing with this kind of data to check what regulations apply to you so that you can remain compliant.
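To illustrate the "prohibited by default" posture, here is a hypothetical sketch in which processing of a special category is refused unless an explicitly documented exception has been recorded for it. The category list follows the ones mentioned above; the gating logic and names are purely illustrative, not a compliance mechanism.

```python
# Hypothetical sketch of "prohibited by default" handling for special
# categories of personal data: processing is refused unless an explicit,
# documented exception has been recorded for that category.

SPECIAL_CATEGORIES = {
    "racial_or_ethnic_origin", "political_opinions", "religious_or_philosophical_beliefs",
    "trade_union_membership", "genetic_data", "biometric_data",
    "health", "sex_life_or_sexual_orientation",
}

# Illustrative register of documented exceptions, e.g. explicit consent
# recorded after a privacy impact assessment.
documented_exceptions = {"health": "explicit consent, documented in PIA record"}

def can_process(category: str) -> bool:
    """Allow ordinary data; allow special categories only with a documented exception."""
    if category not in SPECIAL_CATEGORIES:
        return True  # ordinary personal data: normal handling rules apply
    return category in documented_exceptions  # special category: default answer is no

print(can_process("health"))          # True: an exception is documented
print(can_process("biometric_data"))  # False: prohibited by default
```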
I do want to add a little bonus after all of that, which is for the people who think personal data is entirely off limits.
So I have had people say, well, "I didn't think you could use it in any way," which is not the case.
You end up with this unnecessary binary framing, where it seems like it's an all-or-nothing type of approach:
either the data is personal and off limits, or it's not personal and that's fine.
That's not the way to think about this. And if you do frame it in this all or nothing way, it can also incentivize people who are trying to use data to mischaracterize data as being not personal if they think that's what's required in order to be able to use it.
So, no, it is the case that personal data certainly can be used in many cases.
There are many situations in which it is necessary, in which you just have to use it appropriately.
It's good to bear in mind that there are restrictions on the ways in which you use it.
So you must be careful.
You have to follow all applicable regulations and also be thoughtful about how you use it.
It's very important to maintain user trust. So if you are going to be using personal data, then you absolutely have to be respectful and be responsible and follow all the requirements.
A few closing thoughts on privacy and data protection.
Absolutely, compliance with data protection regulation is very important, but it's not really the whole privacy story.
I would say that user expectations may not always be addressed by privacy regulation.
And I encourage you to go above and beyond just what's required by compliance.
That has to be your baseline.
You must fulfill your compliance obligations. But think a bit more broadly.
Think about sort of larger individual and social issues around privacy.
Think about what users might want, might need, what their expectations are.
I encourage you, when you can, to do user research in this area because it helps you to understand how users are thinking about privacy.
So you can go above and beyond what you've put in place for your responsible data handling and make really excellent products where users really feel that you have been responsible with the data that they've entrusted to you.
So I just want to thank all of you for tuning in to this session.
I really enjoy communicating about privacy with people, with the community.
And I feel some responsibility to bring a lot of the concepts that I've learned over the years in these different environments to all the people that I interact with.
I'm interested in hearing about other topics that people would like to know about, what people would like to learn about in privacy, perhaps for some kind of a follow-up.
So this, I hope, has been useful to you. I hope that you have learned something.
I encourage you to contact me if you have feedback and questions, and I'd love to hear about it.
I hope you will also take advantage of all the presentations that are happening during Privacy and Compliance Week here at Cloudflare.
And I want to thank you for tuning in and listening. Thanks so much.

The real privilege of working at Mozilla is that we're a mission-driven organization.
And what that means is that before we do things, we ask, what's good for the users as opposed to what's going to make the most money?
Mozilla's values are similar to Cloudflare's.
They care about enabling the web for everybody in a way that is secure, in a way that is private, and in a way that is trustworthy.
We've been collaborating on improving the protocols that help secure connections between browsers and websites.
Mozilla and Cloudflare have collaborated on a wide range of technologies.
The first place we really collaborated was the new TLS 1.3 protocol.
And then we followed that up with QUIC and DNS over HTTPS, and most recently, the new Firefox Private Network.
DNS is core to the way that everything on the Internet works.
It's a very old protocol, and it's also in plain text, meaning that it's not encrypted.
And this is something that a lot of people don't realize.
You can be using SSL and connecting securely to websites, but your DNS traffic may still be unencrypted.
When Mozilla was looking for a partner for providing encrypted DNS, Cloudflare was a natural fit.
The idea was that Cloudflare would run the server piece of it, and Mozilla would run the client piece of it.
And the consequence would be that we'd protect DNS traffic for anybody who used Firefox.
Cloudflare was a great partner with this because they were really willing early on to implement the protocol, stand up a trusted recursive resolver, and create this experience for users.
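As a concrete illustration of encrypted DNS, here is a minimal sketch of a DNS-over-HTTPS lookup against Cloudflare's public resolver using its JSON API at cloudflare-dns.com/dns-query. This only shows the idea of carrying a DNS query inside an encrypted HTTPS connection instead of plain-text UDP; it is not how Firefox's built-in DoH client is implemented.

```python
# Minimal sketch of a DNS-over-HTTPS lookup using Cloudflare's public
# resolver's JSON API. The query travels inside an encrypted HTTPS
# connection rather than as plain-text DNS.

import json
import urllib.request

def doh_lookup(hostname: str, record_type: str = "A") -> list:
    """Resolve a hostname over HTTPS and return the answer records."""
    url = f"https://cloudflare-dns.com/dns-query?name={hostname}&type={record_type}"
    req = urllib.request.Request(url, headers={"Accept": "application/dns-json"})
    with urllib.request.urlopen(req) as resp:
        answer = json.load(resp)
    return [a["data"] for a in answer.get("Answer", [])]

print(doh_lookup("example.com"))  # e.g. a list of A records for example.com
```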
They were strong supporters of it. One of the great things about working with Cloudflare is their engineers are crazy fast.
So the time between we decide to do something, and we write down the barest protocol sketch, and they have it running in their infrastructure, is a matter of days to weeks, not a matter of months to years.
There's a difference between standing up a service that one person can use or ten people can use, and a service that everybody on the Internet can use.
When we talk about bringing new protocols to the web, we're talking about bringing it not to millions, not to tens of millions.
We're talking about hundreds of millions to billions of people.
Cloudflare's been an amazing partner in the privacy front.
They've been willing to be extremely transparent about the data that they are collecting and why they're using it, and they've also been willing to throw those logs away.
Really, users are getting two classes of benefits out of our partnership with Cloudflare.
The first is direct benefits.
That is, we're offering services to the user that make them more secure, and we're offering them via Cloudflare.
So that's like an immediate benefit these users are getting.
The indirect benefit these users are getting is that we're developing the next generation of security and privacy technology, and Cloudflare is helping us do it.
And that will ultimately benefit every user, both Firefox users and every user of the Internet.
We're really excited to work with an organization like Mozilla that is aligned with the user's interests, and in taking the Internet and moving it in a direction that is more private, more secure, and is aligned with what we think the Internet should be.