🔒 Security Week Product Discussion: Improving HTTPS Redundancy and Configurability
Presented by: Dina Kozlov, Patrick Donahue
Originally aired on September 1, 2023 @ 11:00 PM - 11:30 PM EDT
Join Cloudflare's Product Management team to learn more about the products announced today during Security Week.
Tune in daily for more Security Week at Cloudflare!
Transcript (Beta)
All right, welcome back to Security Week. I'm your host, Patrick Donahue. Hopefully you caught our kickoff session earlier today with Michael Tremante.
If not, it'll be posted soon on Cloudflare TV, so you can go re-watch it there.
It gives a great preview of the week.
I'm joined today by Dina Kozlov, who's the product manager for our SSL team.
She just published a blog post on blog.cloudflare.com, so go check it out if you haven't seen it yet.
SSL/TLS is a team near and dear to my heart. I used to product manage that team probably five, six years ago, at least now.
So Dina, let's start with the basics.
What are SSL/TLS certificates and why are they important? How are they used?
For sure. Hi, everyone. Really excited to be here. So TLS certificates are essentially the core of security on the Internet.
They do multiple things.
One of the most important things that they do is they authenticate a website, meaning that when you go visit some domain and it has a TLS certificate, what that certificate is saying is that you are, in fact, visiting the correct server and the right website.
The other thing that they do is they help web traffic stay encrypted between clients and servers.
And so overall, they're really important to keeping sites secure and safe online.
Great. So how does one get a certificate?
I remember when I first got one many, many years ago, I want to say it was, gosh, late 90s, early 2000s.
I remember having to send some articles of incorporation of a company and actually mail this back to a company and wait a long time to get it.
Presumably that's not how they're issued today. Can you just talk me through what that issuance process looks like?
Sure. So since then, the process has sped up a lot.
It's almost instantaneous. But you have different organizations called certificate authorities.
And certificate authorities are responsible for issuing certificates to applications or domain owners.
And so what happens is as a website owner, I want to get a certificate for example.com.
The first thing that I need to do is prove that I own example.com because I shouldn't be getting certificates for websites that are not in my control.
And so the way I do that is through a process called domain control validation.
There's a few different ways that you can prove your ownership of a domain.
You can either do it through DNS records.
You can do it by serving HTTP tokens. Some certificate authorities send an email to you and you validate it that way.
But essentially once you've completed the domain control validation and shown that you do in fact own this website, the certificate authority then issues the certificate to you.
With Cloudflare, what happens is we're the proxy for websites. And so our customers will essentially prove that they own their domain to us or to the certificate authority.
And then we will go and we will fetch a certificate from our different CA partners like DigiCert and Let's Encrypt.
And then we will take that certificate and serve it at our edge so that when any client connects to Cloudflare's edge, that connection is encrypted and all of those websites have a TLS certificate.
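As a rough illustration of the DNS-based domain control validation described above, here is a minimal sketch of how the TXT record for an ACME dns-01 challenge (RFC 8555) is derived. The token and account thumbprint values are placeholders made up for this example; in practice the CA supplies the token and the thumbprint comes from the ACME account key. This is not presented as Cloudflare's own implementation.

```python
import base64
import hashlib


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as ACME (RFC 8555) requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def dns01_txt_record(domain: str, token: str, account_thumbprint: str) -> tuple[str, str]:
    """Return the (name, value) of the TXT record for an ACME dns-01 challenge."""
    # The key authorization joins the CA's token and the account key thumbprint.
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return f"_acme-challenge.{domain}", b64url(digest)


# Placeholder inputs (assumptions for illustration only).
name, value = dns01_txt_record("example.com", "sample-token", "sample-thumbprint")
print(name, "TXT", value)
```

The domain owner publishes that TXT record, and the CA looks it up to confirm control of the domain before issuing.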
Yeah, really important that these certificates are only given to those that can demonstrate control.
I think I read about a hack a few weeks ago or maybe a month ago now where somebody was able to hijack control and get a certificate issued and sort of bypass that.
Then obviously if you have a certificate, you can intercept traffic and decrypt traffic.
And so really important to secure that.
I know we've got products that help CAs check those methods from many parts around the world, but we'll save that for another day.
So you're issued this certificate.
Is that a one-time thing or do you have to do that sort of periodically?
So every certificate comes with an expiration date. So you do have to periodically renew your certificate so that you always have a fresh one that is served at the edge.
And one of the reasons why short-lived certificates, meaning certificates that expire in 14 or 30 days, are much better than long-lived certificates like year-long certs is that the certificate and its private key are only valid for that period of time.
So if there is a disaster and someone gains control of your private key, their ability to use that private key expires when the certificate expires.
So we do encourage our customers to use short-lived certificates. And we are integrating more and more with certificate authorities who actually default to 90-day certs, which means that essentially you would renew that certificate every 90 days and get a new one.
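To make the point about short-lived certificates concrete, here is a small sketch that computes how long a stolen private key stays usable if the certificate is never revoked. The issuance and compromise dates are hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta


def exposure_days(issued: date, lifetime_days: int, compromised: date) -> int:
    """Days a stolen key remains usable if the certificate is never revoked."""
    expires = issued + timedelta(days=lifetime_days)
    return max((expires - compromised).days, 0)


issued = date(2022, 1, 1)
compromised = date(2022, 1, 11)  # hypothetical: key stolen 10 days after issuance
for lifetime in (30, 90, 365):
    print(f"{lifetime}-day cert: exposed for {exposure_days(issued, lifetime, compromised)} days")
```

With these assumed dates, the 30-day certificate leaves a 20-day window, while the year-long certificate leaves 355 days.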
Yeah, I think that it's great to see the ecosystem, the WebPKI ecosystem, get to a point where everything is automated and you're not mailing things around or doing manual validation procedures.
Because to your point, being able to get certificates issued quickly and frequently is really great for hygiene, especially when you want to deprecate old protocols and things like that.
So it sounds like you're reissuing when certificates expire. What else would you reissue for?
Are there other reasons why it makes sense to do that?
Yep. So the good reason is certificate renewals when they're set to expire.
There's a few bad reasons why we would want to renew certificates. One of these is, for example, a vulnerability.
So for example, Heartbleed is a vulnerability that was exposed a few years ago.
If a server was running a vulnerable version of OpenSSL, an attacker could make use of that to gain control of that server's private keys.
And so in that scenario, you essentially want to roll the private keys onto a new set, which also requires that you issue a new certificate.
So that is one example.
Another example, aside from a vulnerability, is someone gaining control of your private keys. You would want to instantly get new keys and a new certificate.
Another reason that we've seen a few times recently is that certificate authorities, if they find an issue in their system, are required by the CA/Browser Forum standards to revoke any faulty certificates, either within 24 hours or within five days, depending on how bad the issue that they discover is.
And so we saw this recently, Let's Encrypt had to reissue about 2 million certificates.
Cloudflare customers were not impacted by this.
But actually about nine months ago, one of our CAs did have to do a mass certificate revocation, because they found that they were reissuing certificates on old domain control validation tokens.
And so we had to reissue 5,000 certificates immediately.
And so with either the 24-hour or the five-day window, that's a very short period of time in which you have to get a new certificate, or else the certificate authority will mark it as revoked.
And any browsers or clients that check for revoked certificates will no longer be able to load your website unless you have a new certificate ready to go.
Yeah, that I remember, I think Heartbleed was 2014.
I think that was the year before I joined Cloudflare.
And I remember we had a very large, back then, revocation list. CRL was the way of the day before OCSP, and OCSP stapling, became a lot more prevalent, and it was quite a burden serving that very large revocation list.
And then these days, a lot of browsers don't even check that.
So being able to quickly reissue and move to a different certificate is clearly important here.
I think, just to fill everyone in on what Cloudflare has done to protect against this, you mentioned OpenSSL had this vulnerability.
We, of course, moved to BoringSSL, a smaller code base, a bit more hardened and tested in a project run by Google.
Obviously, we've also moved keys. So Heartbleed was getting the bits of the private key a little bit at a time.
I just remember, we had this challenge where we put up a challenge site and said, try to get the private key for this host name.
And we were thinking, this might take days or weeks for somebody to do.
And I think it was within 24 hours.
I might be off a little bit. But somebody was able to extract the private key.
And so we moved those private keys out of that process and did something that we call keyless, which you've obviously spent quite a bit of time working on, doing keyless maybe even on that machine or in different locations.
And we can do that today by doing it in other parts of the world where we can serve keys from and customers can keep those keys themselves if they want on their own infrastructure.
So a lot of work that's been done to kind of harden against things like Heartbleed.
But of course, there are things that are going to happen with CAs.
So you just took us through a couple of those things. At Cloudflare scale today, we have millions and millions and millions and millions of certificates that we're managing on behalf of our customers.
How long would it take to kind of reissue all of those certificates?
So our team did an estimation. It would take about weeks to reissue 45 million certificates.
And one of the reasons why it would take so long is even if our pipeline and system can handle the reissuance, we rely on our certificate authority partners to be able to issue as well.
And so there's a lot of different kind of components in the system that need to work to be able to reissue at that scale.
And with 45 million, that's just too high of a scale for our systems.
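A back-of-the-envelope sketch shows why reissuing at this scale takes so long when throughput is capped by the CA. The issuance rates below are invented for illustration; the transcript does not state the actual rates that Cloudflare or its CA partners can sustain.

```python
SECONDS_PER_DAY = 86_400


def days_to_reissue(cert_count: int, certs_per_second: float) -> float:
    """Days needed to reissue cert_count certificates at a sustained rate."""
    return cert_count / certs_per_second / SECONDS_PER_DAY


# Hypothetical sustained issuance rates, in certificates per second.
for rate in (10, 25, 100):
    print(f"{rate:>3} certs/s -> {days_to_reissue(45_000_000, rate):.1f} days")
```

Under these assumed rates, even 25 certificates per second means roughly three weeks, far longer than a 24-hour or five-day revocation window.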
So yeah.
Yeah, no, I remember when we were originally renewing all those universal SSL certificates, we were getting rate limited. We worked with different, or fewer, CA partners back then, and when you take a dependency on a third party to issue, you can only go as fast as the underlying CA.
And so clearly not something that we want to have to do or wait weeks if there is a particular issue, if the CA is compromised, our customers expect us to react a lot quicker than that.
And that's why they use Cloudflare to make that easy for them and take that burden away from them.
So what would happen if we weren't able to reissue those?
What would actually happen from a security perspective?
So either in a key compromise or in a CA revocation, your application domain is left vulnerable.
So either attackers gain control of your private keys and are able to serve the domain on your behalf.
And you're not able to kind of regain control. Or if you're left without a certificate at all, then all of the traffic to your website is insecure.
And so any onlooker can get information like credit card, bank statements.
And so you definitely want to keep that information encrypted and secure.
And to do that, you have to have a TLS certificate that is valid. And so you would essentially be left insecure or just completely down because you wouldn't want to serve insecure traffic to your customers.
Neither of those sound like a good option.
So let's jump into your announcement today. What are we doing kind of proactively to avoid that scenario?
Yeah, so I'm super excited to talk about backup certificates.
So essentially, the team started thinking about this a few months ago of if we had to reissue all 45 million certificates, how would we do that?
Since we're not able to do that reliably or we don't want to wait until a disaster to do that, what we're going to start doing is ordering a backup certificate for all of our universal certificate packs.
So essentially, every time we order a new pack, either you onboard a new zone or a renewal comes up, we usually just get one universal SSL certificate.
But now we're going to get a backup certificate.
And the certificate is going to be issued from a different CA than the primary one.
And so this is to help with cases like CA revocation. If that comes up, it wouldn't impact the backup certificate.
And the other thing that we're doing is we're wrapping the backup certificate with a different private key than the primary certificate.
So that again, in the case of a key compromise, the backup certificate is unaffected.
And so then in the event of a disaster, our team can quickly deploy backup certificates to the edge, preventing impact.
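The backup design described above can be sketched as a small data structure: a pack holds a primary and a backup certificate, the backup must come from a different CA and use a different private key, and the backup is only selected when the primary is affected. All class, field, and CA names here are hypothetical and not taken from Cloudflare's systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    hostname: str
    issuer_ca: str
    key_id: str  # identifier of the private key, not the key material itself


@dataclass
class CertificatePack:
    primary: Certificate
    backup: Certificate

    def __post_init__(self) -> None:
        # The backup only helps if it shares neither CA nor key with the primary.
        if self.primary.issuer_ca == self.backup.issuer_ca:
            raise ValueError("backup must come from a different CA")
        if self.primary.key_id == self.backup.key_id:
            raise ValueError("backup must use a different private key")

    def select(self, revoked_cas: set[str], compromised_keys: set[str]) -> Certificate:
        """Prefer the primary; fall back to the backup if the primary is affected."""
        for cert in (self.primary, self.backup):
            if cert.issuer_ca not in revoked_cas and cert.key_id not in compromised_keys:
                return cert
        raise RuntimeError("no usable certificate; reissuance required")


pack = CertificatePack(
    primary=Certificate("example.com", "primary-ca", "key-1"),
    backup=Certificate("example.com", "backup-ca", "key-2"),
)
print(pack.select(revoked_cas={"primary-ca"}, compromised_keys=set()).issuer_ca)
```

Because the two certificates share neither a CA nor a key, a single CA revocation or a single key compromise leaves one of them usable.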
Very cool. So sounds like we're starting this process now, or we're about to start it.
Like, can you just take me through? I didn't quite catch.
Who are we doing this for? Is this universal SSL customers? I know we have other products, advanced certificate manager, that sort of thing.
Like, who is going to benefit from this?
So the end goal is to have a backup certificate for every Cloudflare cert.
But to start, we have our universal SSL product. That's the product that gives all customers free TLS certificates.
And so we have two kinds of customers.
We have customers that use Cloudflare as their authoritative DNS provider.
And we have customers that only use Cloudflare as their proxy and use a DNS provider that's elsewhere.
And so to start, we're going to issue backup certs for customers who use Cloudflare as their authoritative DNS provider.
And we're going to issue backup certs for all of the universal certificates.
We do in the future want to support our customers who have their DNS elsewhere.
The only caveat there is that issuing backup certificates for those customers will likely involve manual intervention with the customer, because they would have to add in the domain control validation records that we talked about earlier.
But then... Sorry, go ahead.
I was just going to say, but after we issue backup certificates for all universal SSL certs, which make up about 76% of the certificates in our pipeline, after that, we want to issue backup certs for advanced certificates and for our SSL for SaaS customers, for SaaS providers that manage TLS certificates for their millions of customers.
We also want to issue backup certificates for those. And then also for our customers who upload their own certificates to Cloudflare, if they would like, we would also one day like to issue backup certificates for them, in case they ever forget to renew the certificate that they upload or some issue comes up, so that they also have a backup that's ready to go.
That makes sense.
And so in order to have a backup, and it sounds like a great ordering of from most value to most people, and then transitioning over time to getting everything covered, sounds like we're going to have to have multiple CA partners, certificate authority partners, the people performing that validation you mentioned.
Can you talk a little bit about how do we think about that? Do we have multiple today?
Are we adding any? How do we actually get to the point where we have a good diversity of that ecosystem?
Sure. So today, the primary certificate authorities that we use are DigiCert and Let's Encrypt.
The backup certificate authority that we are currently using is Sectigo.
We are currently working on adding more CAs into our pipeline.
So for example, Google is one of these certificate authorities.
And so we want to get to a point where we onboard more and more ACME-compliant CAs.
ACME just means that they have an automatic certificate issuance integration.
And so we want to get to a point where we can essentially load balance between different backup CAs.
Our customers can have multiple backup certificates. And so that way, no one CA has too much impact on our customers if an issue ever comes up.
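One simple way to picture load balancing between backup CAs is to pick, for each new backup certificate, the CA other than the primary that has issued the fewest backups so far. This is only a sketch of one possible policy with made-up CA names; the transcript does not say how Cloudflare would actually distribute issuance.

```python
from collections import Counter


def pick_backup_ca(primary_ca: str, available_cas: list[str], issued: Counter) -> str:
    """Choose the least-used CA that is not the primary CA."""
    candidates = [ca for ca in available_cas if ca != primary_ca]
    if not candidates:
        raise ValueError("need at least one CA other than the primary")
    # Counter returns 0 for CAs that have not issued anything yet.
    return min(candidates, key=lambda ca: issued[ca])


cas = ["ca-a", "ca-b", "ca-c"]  # hypothetical CA names
issued = Counter({"ca-b": 5, "ca-c": 2})
choice = pick_backup_ca("ca-a", cas, issued)
issued[choice] += 1
print(choice)
```

Spreading backups this way keeps any single CA's share small, so an incident at one CA touches fewer certificates.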
Yeah. And that's really interesting.
I know in talking to customers previously, sometimes they work with a particular CA, some kind of work with DigiCert already, or Let's Encrypt, or one of the other ones.
And they have a preference or an affinity for that.
I know we've let people sort of choose that from a primary perspective. Do you anticipate we'll let people choose who they'd like from a backup perspective as well?
For sure. One of the things that we want to add to backup certificates is configurability.
So allowing our customers to choose their backup certificate authority.
We're just right now starting with Sectigo. But like I said, we would like in the future to give our customers the option to choose.
And in the future, also have multiple backup certificates.
So if there ever is an issue with one of our backup CAs, there's also a backup authority to go.
That's great. And I know as somebody who was a previous Cloudflare customer before I became a Cloudflare employee, it was great to not have to worry about that stuff.
Dealing with that is not fun.
And I'd rather turn it over to somebody else to sort of manage that process, somebody that specializes in it.
So great to see that on behalf of our customers.
You mentioned something called ACME. I just want to spend a second on that.
If you think back to when we were early days Cloudflare integrating to the different CAs, it felt like everyone had their own sort of specific API that we were calling and going through different flows.
Some would sort of do the validation and issuance all at once.
Some would do validation as a separate step from issuance and so on.
And the effort for the engineering team that you work with seems to double or triple, or whatever it is, as you add each CA.
How does ACME kind of help that process? I think we lost Dina here.
So I'll answer that question myself. The thing about ACME is it's a standard.
And so the idea is that CAs will implement to a standard. Let's Encrypt really helped pioneer this.
And they were one of the big early adopters and helping drive automated issuance.
And so it's great to see certificate authorities adopting this model.
It makes it easy for companies like Cloudflare to integrate. And so if you're a certificate authority listening to this and you don't have an ACME endpoint, we look forward to seeing you launch one and make it easier to work with Cloudflare.
So with that, I think we'll wrap.
I know we started to issue some of these and we're excited to hear your feedback.
If you have any questions, please let us know. Dina is monitoring all that feedback and she's looking forward to incorporating that into the product plans in the roadmap.
Looks like we got a question coming in.
I will try to answer to the best of my knowledge. And if Dina is able to rejoin us, she can probably add some additional context there.
So one is, as part of the backup certificate generation process, do you have plans to also generate and publish TLSA/DANE DNS records?
That's a great question. And let me explain what those records are.
And so if you think about when you go to a website and you're going to HTTPS, you get that certificate back to your browser.
Your browser is essentially chaining that back up to some sort of root certificate that ships with either the browser or the operating system.
Mozilla operates a trust store.
There are other trust stores out there, and Mozilla's is used by multiple browsers.
Depending on the platform you're running on, that chain of trust goes up.
There's also another way to convey the validity of the certificate, the specific certificate that you're looking to do, and that's by publishing DNS records.
And so this is something that has not really been adopted on the web side, largely for latency concerns.
And so the nice thing about having it locally and chaining back to it is you can validate and prove the correct certificate without having to make any sort of DNS queries.
But this has really been adopted quite a bit on the email side.
And so if we're able to then take that certificate that we're issuing and automatically publish in DNS and manage it on behalf of our customers, that becomes much easier.
And so that is absolutely something I know that the team is thinking about and wanting to do for those that do like TLSA or DANE DNS records.
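For readers unfamiliar with TLSA records, here is a minimal sketch of how one could be built from a certificate. It uses the "3 0 1" parameters (DANE-EE usage, full-certificate selector, SHA-256 matching type) as one common choice; the input bytes are a placeholder rather than a real certificate, and nothing here reflects how Cloudflare would publish these records.

```python
import hashlib


def tlsa_record(hostname: str, cert_der: bytes, port: int = 443) -> str:
    """Build a 'DANE-EE, full certificate, SHA-256' (3 0 1) TLSA record."""
    digest = hashlib.sha256(cert_der).hexdigest()
    return f"_{port}._tcp.{hostname}. IN TLSA 3 0 1 {digest}"


# Placeholder bytes stand in for a real DER-encoded certificate.
print(tlsa_record("example.com", b"placeholder-der-bytes"))
```

With a real PEM certificate, the standard library's `ssl.PEM_cert_to_DER_cert` can produce the DER bytes. A record like this would need to be republished whenever the served certificate changes, including a switch from primary to backup.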
And so if you're interested, go ahead and read more about that. The second question is, will you use both certificates randomly in production, or will almost all certificates be from the primary CA until vulnerability is detected and everything switches to the backup?
My understanding is that we are going to have an order, and so we'll have a primary certificate that will always be used, and then a backup certificate that would get flipped if there was some sort of compromise or reason to use a different certificate.
Not actually a bad idea from a feature request perspective to check resiliency, but from a predictability perspective, from a way to actually give some predictable responses.
Some of our customers will actually make connections and sort of expect to see a particular certificate back.
And so if we were kind of randomizing or bouncing between those, might throw off some of their testing.
Of course, some customers will actually also pin in a mobile application typically, they'll say, I expect this particular certificate, hopefully not the leaf certificate, the one that's sent back, but maybe something higher up in the chain, an intermediate, for example, because you may be reissuing on renewal, and you don't want to break your application, you don't want to lock people out of it.
And so my expectation from chatting with the team is that we will use that primary and then switch to the backup certificate as needed if a vulnerability is detected.
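The pinning concern mentioned above can be sketched with a simplified check: a client accepts a connection only if some certificate in the presented chain matches a pinned SHA-256 fingerprint. Real pinning usually hashes the public key rather than the whole certificate, and the byte strings below are placeholders, so treat this as an illustration under those assumptions.

```python
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(cert_der).hexdigest()


def chain_matches_pins(chain_der: list[bytes], pinned: set[str]) -> bool:
    """True if any certificate in the chain matches a pinned fingerprint."""
    return any(fingerprint(cert) in pinned for cert in chain_der)


# Placeholder certificates (not real DER data).
intermediate_a, intermediate_b = b"intermediate-from-ca-a", b"intermediate-from-ca-b"
pins = {fingerprint(intermediate_a)}  # client pins the primary CA's intermediate

print(chain_matches_pins([b"leaf-v1", intermediate_a], pins))      # primary chain
print(chain_matches_pins([b"leaf-v2", intermediate_a], pins))      # renewed leaf, same CA
print(chain_matches_pins([b"backup-leaf", intermediate_b], pins))  # backup from another CA
```

In this sketch, pinning the intermediate survives a leaf renewal, but a backup certificate issued by a different CA would not match. That is one reason a predictable primary-then-backup order is easier for pinning customers than random rotation.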
So the last question is, any plans to have a Cloudflare CA?
So this is something that we've talked about over the years, and if Dina comes back, I'll let her speak to kind of the current thinking here.
It's an area where from a where we focus perspective, it doesn't necessarily make the most sense to do.
We like having a diversity of partners to work with, going and becoming a certificate authority and going sort of through that process of getting validated and being added to a root trust store takes many years.
And there's not a whole lot of advantage for us doing it when we have great partners that we can work with and that have adopted standard APIs such as ACME.
And so our preference is to spend that engineering time and audit time elsewhere and building solutions to diversify across them.
I don't think I'll rule that out. It may be something that we do in the future, but it doesn't make the most sense if you think about how do we spend our engineering time, where do we prioritize?
It's not something that our customers are asking for or asking about.
And so I don't think that you'll see that from us anytime soon.
We like the CAs we work with and we'll continue to expand that ecosystem.
So with that, I will wind down here. Thank you so much for joining.
Hopefully you enjoyed the session. If you have any questions, please reach out with feedback.
Dina is excited to hear your thoughts and ideas here.
If you're an enterprise customer, you can give that to your customer success manager.
We also have the Cloudflare community forums that you can join and post questions on.
And we look forward to hearing your feedback there. With that, have a great day.
We're betting on the technology for the future, not the technology for the past.
So having a broad network, having global companies, now running at full enterprise scale gives us great comfort.
It's dead clear that no one is innovating in this space as fast as Cloudflare is.
With the help of Cloudflare, we were able to add an extra layer of network security controlled by Allianz, including WAF, DDoS.
Cloudflare uses CDN and so allows us to keep costs under control and caching and improve speed.
Cloudflare has been an amazing partner in the privacy front. They've been willing to be extremely transparent about the data that they are collecting and why they're using it.
And they've also been willing to throw those logs away.
I think one of our favorite features of Cloudflare has been the worker technology.
Our origins can go down and things will continue to operate perfectly.
I think having that kind of a safety net, you know, provided by Cloudflare goes a long ways.
We were able to leverage Cloudflare to save about $250,000 within about a day.
The cost savings across the board is measurable, it's dramatic, and it's something that actually dwarfs the yearly cost of our service with Cloudflare.
It's really amazing to partner with a vendor who's not just providing a great enterprise service but also helping to move forward the security on the Internet.
One of the things we didn't expect to happen is that the majority of traffic coming into our infrastructure would get faster response times, which is incredible.
Zendesk just got 50% faster for all of these customers around the world because we migrated to Cloudflare.
We chose Cloudflare over other existing technology vendors so we could provide a single standard for our global footprint, ensuring world-class capabilities in bot management and web application firewall to protect our large public-facing digital presence.
We ended up building our own fleet of HAProxy servers such that we could easily lose one and then it wouldn't have an impact, but it was very hard to manage because we kept adding more and more machines as we grew.
With Cloudflare we were able to just scrap all of that because Cloudflare now sits in front and does all the work for us.
Cloudflare helped us to improve the customer satisfaction.
It removed the friction with our customer engagement.
It's very low maintenance and very cost effective and very easy to deploy and it improves the customer experiences big time.
Cloudflare is amazing.
Cloudflare is such a relief. Cloudflare is very easy to use. It's fast.
Cloudflare really plays the first level of defense for us. Cloudflare has given us peace of mind.
They've got our backs. Cloudflare has been fantastic. I would definitely recommend Cloudflare.
Cloudflare is providing an incredible service to the world right now.
Cloudflare has helped save lives through Project Fairshot.
We will forever be grateful for your participation in getting the vaccine to those who need it most in an elegant, efficient, and ethical manner.
Thank you.