Why AutoTrader Migrated DNS and WAF to Cloudflare
Presented by: Alex Cruz Farmer , Mark Bell
Originally aired on June 10, 2020 @ 11:00 PM - 12:00 AM EDT
Best of: Cloudflare Connect London - 2019
Join Cloudflare Lead Security Product Manager Alex Cruz Farmer and AutoTrader Systems Engineer Mark Bell for a discussion on AutoTrader's migration to Cloudflare, and the associated benefits.
English
DNS
Migration
Cloudflare Connect
Transcript (Beta)
Hi, everybody.
My name is Alex Cruz Farmer. Oops, sorry. My name is Alex Cruz Farmer.
I'm the Product Manager for the Firewall at Cloudflare. The best way to really show how great Cloudflare is, is through our customers.
So we actually have Mark from AutoTrader here, who's going to give you a quick deep dive into what AutoTrader have done with Cloudflare.
And then afterwards, I'm going to whip you through some of the technologies that Cloudflare have behind the scenes.
So over to Mark.
👏👏👏 Hi, everyone.
I'd just like to say thanks very much to Cloudflare for having me today.
It's an honor and privilege to be here. So I'm Mark, and today I'm going to talk to you about why AutoTrader decided to migrate our DNS and WAF to Cloudflare.
So the agenda is, I'm going to talk a little bit about our company, and then I'm going to go over our cloud migration journey, and why we chose to use Cloudflare.
So some facts and figures here. So AutoTrader is the UK and Ireland's largest digital automotive marketplace.
93% of UK customers know who we are, and that's kind of testament to our strong brand position in the UK.
We get around 55 million cross-platform visits a month, and this equates to around about 94 advert views per second.
So this is a timeline that shows kind of the inception of the company in 1977.
So we started off as a magazine business. Basically, when I get asked questions about the company, I always get asked, oh, the magazine, because it's kind of a nostalgic thing.
So in 1996, we then launched our kind of initial online website.
In 2007, we then transitioned across to looking to move the print revenue across to digital, and this is because at that time, we saw that basically the print revenue and the print market was going down, so we needed to ensure we were kind of digitally ready to take on the challenges going forward.
In 2010, we launched our website for retailers option, website app.
So basically, what this is, is a dealer, for example, like bobscars.co.uk, they can basically have their own website and advertise the stock that they have on our website through their own website, and we kind of like manage all the infrastructure for that.
So we kind of manage all the servers, all the kind of DNS, all the DNS zones, et cetera, for that.
And then in 2013, we actually closed the magazine business, and that's when we actually became a 100% digital business.
So I'm now going to start talking about kind of our cloud migration journey and where we've come from.
So our previous solution is that we manage two data centers.
So we have a data center in Manchester Equinix, and we also have a data center in Runcorn Electronics.
The solution that we were using at the time was open source, so that was PowerDNS, and we also used VeriSign as a managed solution.
For our WAF, we used F5 Big IP, and for DDoS, we used Arbor. Just a little note on the PowerDNS to VeriSign bit.
So all of our DNS zones are on PowerDNS, and we're performing an AXFR sync to VeriSign.
So that's basically like a DNS zone transfer.
So factors for our migration decision. So in mid-2018, we decided to migrate all of our kind of on-premise applications to Google Cloud, specifically GKE, which is Google Kubernetes Engine.
One of the reasons that we decided to do this was we thought that we could get some faster feedback for our application developers that could deploy things faster and just better than the kind of infrastructure we had on-premise.
VeriSign had also been bought by another provider, which meant that at the time we decided to do this, we'd need to transition to another technology anyway.
So we'd either need to go with the one that VeriSign had been bought by, or we'd need to go to a different solution.
And there was a closed timeframe on this, so we needed to do it all by February 2019.
And at this time, we saw an opportunity to review the market to coincide with our cloud migration.
So what did we need to consider? So we needed to review all the different DNS providers that were out there.
And from my perspective, being a systems engineer, I was more interested in the performance to make sure that we were kind of getting a better performance than what we had previously.
We needed to research the offerings against each other. So we needed to make sure that we basically looked at the pros and cons of each of the products and kind of picked the one that was right for us.
And a key part of that is reviewing the costs as well, because things can get quite expensive.
One of the reasons why we actually picked Cloudflare was because they have an interconnect program with Google.
So you can save up to 75% of your egress costs using Google Cloud and Cloudflare.
And then once we decided on the solution, we needed to come up with some migration steps, because as I mentioned earlier, for our dealer website platform, we've got thousands of zones.
So we've got over 9,000 dealer zones and 3,000 active dealers using our products.
So it's quite a lot of zones to migrate. So you're probably asking, well, why Cloudflare?
So as I mentioned earlier, improved DNS performance.
So when we migrated to Cloudflare, we've seen an improvement of 50%.
So we were getting around 25 milliseconds from VeriSign. And when we moved to Cloudflare, based on internal and external monitoring, we're getting around 10 to 13.
Also, if you go to dnsperf.com, which is a completely third-party site, you can see that Cloudflare is pretty much at the top on that site, which is great.
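As a quick check on the figures quoted above (around 25 ms before, 10 to 13 ms after), the change can be expressed as a percentage reduction. This is only a restatement of the arithmetic; the function name is illustrative and not something from the talk.

```python
def latency_reduction_pct(before_ms: float, after_ms: float) -> float:
    """Percentage reduction in response time going from before_ms to after_ms."""
    return (before_ms - after_ms) / before_ms * 100


# Figures quoted in the talk: roughly 25 ms before, 10 to 13 ms after.
for after in (13, 10):
    print(f"25 ms -> {after} ms: {latency_reduction_pct(25, after):.0f}% lower")
```

The two ends of the quoted range work out to roughly 48% and 60%, which is consistent with the "around 50%" improvement mentioned.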
Another great thing is the ease of use of the interface itself and the API.
So we currently use the API to perform backups of all the zones. We also use the API in conjunction with our GKE environment with Helm.
So basically, we use what's called the ExternalDNS incubator plugin, and that allows our application developers to automatically provision DNS addresses that are automatically enabled for DDoS and WAF out of the box, which is really fantastic because previously we were having to use some automation that was going directly to the database and putting things in there, which is not great.
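The zone backups mentioned above could look something like the following. This is a minimal sketch, not AutoTrader's actual script: it assumes the Cloudflare v4 API's DNS record export endpoint and a bearer API token, and the zone ID, token, and destination path are placeholders supplied by the caller.

```python
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"


def build_export_request(zone_id: str, api_token: str) -> urllib.request.Request:
    """Build a GET request for the BIND-format export of one zone's DNS records."""
    url = f"{API_BASE}/zones/{zone_id}/dns_records/export"
    headers = {"Authorization": f"Bearer {api_token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def backup_zone(zone_id: str, api_token: str, dest_path: str) -> None:
    """Download the zone file and write it to dest_path (performs a network call)."""
    request = build_export_request(zone_id, api_token)
    with urllib.request.urlopen(request) as response:
        zone_file = response.read().decode("utf-8")
    with open(dest_path, "w", encoding="utf-8") as handle:
        handle.write(zone_file)
```

Keeping the request construction separate from the download makes it possible to check the URL and headers without touching the network; a scheduled job would then call `backup_zone` once per zone.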
And then the DNS and DDoS and WAF in a unified interface because, as I said before, we had three different products for each of these things.
So getting it all in one place is really powerful.
And then looking forward to the future, as has been mentioned earlier, we're looking to utilize the edge with Cloudflare Workers.
Another thing that I've not actually put on there, just to say it's really good about Cloudflare, is the fact that they've got really good support.
So things like live chat. So live chat's really great if you want to ask a quick question and get the answer back.
And also, with our customer success managers and solution engineers, they've been really helpful in our transition to Cloudflare.
So how do we migrate to Cloudflare?
So it's kind of a phased approach. So what we did was basically batch up the zones.
So as I mentioned earlier, we had quite a lot of zones to migrate.
So we did it in a phased approach. And as part of this, I had to do some cleanup.
So by checking all of the name servers that are in our kind of infrastructure, I identified that there were over 8,000 defunct DNS zones because no one was cleaning it up.
And really, the reason that I've mentioned this is because I'll never have to do this cleanup task again.
Because Cloudflare automatically check that your zone is assigned to their name servers.
And if it's not assigned to their name servers, then they'll automatically delete the zone, which is really fantastic.
So I'll never have to do that again. In order to do this, I had to build a migration virtual machine with MySQL.
And basically what I did was sync all of the zone files into this migration VM from our live PowerDNS.
And then to get all the zones into Cloudflare, I used an open source script called PDNS to Cloudflare, which is a bash script, which is great.
As a system engineer, I love bash.
And just a shout out to OctoDNS because, unfortunately, it didn't work with our version of PowerDNS.
But if you're using any other different DNS technologies, that's a really good one.
It's created by GitHub. And it allows you to migrate from different offerings like AWS, Google DNS, or Fastly and things like that.
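The cleanup and batching steps described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the script used in the migration: the name server hostnames and zone names are made up, and the NS records for each zone are assumed to have been looked up already and passed in as plain lists.

```python
from typing import Dict, Iterable, Iterator, List, Set


def is_defunct(zone_nameservers: Iterable[str], our_nameservers: Set[str]) -> bool:
    """Treat a zone as defunct if none of its delegated name servers are ours."""
    normalised = {ns.rstrip(".").lower() for ns in zone_nameservers}
    return normalised.isdisjoint(our_nameservers)


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items for a phased migration."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical data: zone name -> NS records found at the registry.
OUR_NS = {"ns1.example-hosting.test", "ns2.example-hosting.test"}
delegations: Dict[str, List[str]] = {
    "bobscars.co.uk": ["ns1.example-hosting.test.", "ns2.example-hosting.test."],
    "oldgarage.co.uk": ["ns1.other-provider.test."],
    "citymotors.co.uk": ["NS2.example-hosting.test."],
}

live_zones = sorted(z for z, ns in delegations.items() if not is_defunct(ns, OUR_NS))
for number, batch in enumerate(batches(live_zones, 2), start=1):
    print(f"batch {number}: {batch}")
```

Filtering before batching means defunct zones are never imported at all, which keeps each migration batch small enough to verify by hand.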
So some facts and figures from our migration. So as I mentioned earlier, we migrated over 9,000 dealer zones, 300 of our main zones.
These are things like autotrader.co.uk, carzone.ie, which is our Irish site, and motortradedelivery.com.
And at the time of migration, our autotrader.co.uk zone contained over 4,000 records, which is by far our largest zone.
So going over to the Cloudflare WAF features.
Some of the features that we noticed that were really good were the enforcement of a CAPTCHA challenge.
So we actually noticed that one of our third-party sites that we don't actively maintain ourselves had a vulnerability with the contact form.
So what this allowed us to do was enforce a CAPTCHA challenge, which is really great.
Also, for your login URI, you can increase the security level.
So say, for example, you want to increase the security level to high for a specific URI, such as login.
This is really powerful as well because it might be that you don't want to have the highest security level on every zone that you've got.
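To make the two features above more concrete, here is a hedged sketch of what the corresponding API payloads might look like. It assumes the shape of Cloudflare's Firewall Rules and Page Rules APIs as documented around the time of the talk; the hostname, paths, and descriptions are hypothetical, and the sketch only builds the JSON bodies without sending them.

```python
import json

# Actions accepted by this sketch; "challenge" is the CAPTCHA-style challenge.
ALLOWED_ACTIONS = {"block", "challenge", "js_challenge", "allow", "log", "bypass"}


def firewall_rule(expression: str, action: str, description: str) -> dict:
    """Build one firewall rule body: a filter expression plus the action to take."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return {
        "filter": {"expression": expression},
        "action": action,
        "description": description,
    }


def security_level_page_rule(url_pattern: str, level: str = "high") -> dict:
    """Build a page rule body that raises the security level for matching URLs."""
    return {
        "targets": [
            {"target": "url", "constraint": {"operator": "matches", "value": url_pattern}}
        ],
        "actions": [{"id": "security_level", "value": level}],
        "status": "active",
    }


# Hypothetical hostname and paths, for illustration only.
contact_rule = firewall_rule(
    'http.host eq "dealer.example.com" and http.request.uri.path eq "/contact"',
    "challenge",
    "CAPTCHA on the contact form",
)
login_rule = security_level_page_rule("*dealer.example.com/login*")
print(json.dumps([contact_rule, login_rule], indent=2))
```

Scoping both rules to a single path is what allows the rest of the zone to stay on a lower security level.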
So firewall analytics. So this is something we actually fed back to Cloudflare.
So when we were reviewing the other providers, as I mentioned earlier, we identified that the analytics platform in some of the other providers was a bit better.
But to be fair to Cloudflare, they took the feedback on board, and they've implemented the new firewall analytics.
So you get a nice graph now. So this basically shows all the WAF events.
You can see log, simulate, block, and challenge, and you can filter these.
And once you filter them, you can see all the granular details.
You can see the specific URI, the ASN number, and the actual rule that's blocked, which is really helpful.
And as part of this, actually, we're currently looking at, with the GKE stuff that we're working on, we're looking at automating a bit of this through Terraform using the Cloudflare plugin.
And also, just onto this, we're also utilizing Logpush.
And basically, Logpush allows us to push the request logs into either an AWS S3 or GCP bucket.
Obviously, as we're migrating to Google Cloud, we decided to use GCP buckets.
And then in conjunction with this, we're using Filebeat and ELK for visualization.
As you can see, there's a graph there that shows all of our kind of domains and the requests going through.
This is really good, actually, because Cloudflare currently limit the number of data points that you can see in the UI.
So then this allows us to take all of the data out, and we can store that for as long as we want.
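The per-domain request graph described above boils down to counting log lines by hostname. The sketch below assumes the logs arrive as newline-delimited JSON with a `ClientRequestHost` field, as in Cloudflare's HTTP request log dataset; the sample lines are invented, and in the setup described in the talk this aggregation is done by the ELK stack rather than by a script.

```python
import json
from collections import Counter
from typing import Iterable


def requests_per_host(ndjson_lines: Iterable[str]) -> Counter:
    """Count requests per hostname from newline-delimited JSON log records."""
    counts: Counter = Counter()
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        counts[record.get("ClientRequestHost", "unknown")] += 1
    return counts


# Invented sample records standing in for a downloaded log file.
sample = [
    '{"ClientRequestHost": "www.example.com", "EdgeResponseStatus": 200}',
    '{"ClientRequestHost": "api.example.com", "EdgeResponseStatus": 403}',
    '{"ClientRequestHost": "www.example.com", "EdgeResponseStatus": 200}',
]
for host, count in requests_per_host(sample).most_common():
    print(host, count)
```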
Cool. So that's all for me.
Thanks very much for listening. There's some URLs here if you're interested to learn more about our company.
So plc.autotrader.co.uk. There's a direct link to PDNS to Cloudflare, and also a case study that we did with Cloudflare and Google.
Thank you very much. I'm going to hand over to Alex now. Cool.
Thanks, Mark. That was great. So as I mentioned, I'm Alex Cruz Farmer. I'm a product manager for our firewall.
So as John mentioned, I mean, one of the big things that Cloudflare want to do and we are doing is building a better Internet.
And it's a mission that we've taken on board.
But we can't do it alone. We need your help.
We absolutely need your help to make sure that we can do that. So Cloudflare have a number of different products and features, and Mark's demonstrated a number of them through his presentation.
And he's touched on just a few of what AutoTrader use.
I mean, Cloudflare is not just about having a WAF. There's a lot more behind the scenes that happens.
So, for example, as Mark mentioned, we provide DNS and DNSSEC.
We have a leading DDoS protection platform. We provide SSL out of the box and we make it easy.
I mean, we were one of the first pioneers in making sure we've made SSL available to everybody.
And as John mentioned, you know, we're leading the way with things like TLS 1.3.
We have features and capabilities like rate limiting to control the number of requests going to origins.
You can limit abuse. You can look at fraudulent logins and a number of other things.
And then we obviously have our WAF.
So that's where we contain a number of our capabilities like our managed rule sets, which is something that Cloudflare curate.
And then we're managing and we're building and we're using the data that we have to improve them on a day-to-day basis.
But we also have other capabilities in there as well. So, for example, firewall rules, which Mark demonstrated.
It's a really flexible way of being able to create protections for your domains and your applications.
And then finally, we've got our bot mitigation as well.
But those are just like the core pieces of our security portfolio.
And what's really interesting is that we don't just focus on web applications.
We're also branching out. So Orbit, for example, it's our ability and our capability to protect IoT devices against attack.
We have Spectrum, which is now opening up a whole world of different opportunities for you, our customers, where we can now protect TCP and UDP applications or even ports.
We can protect them against volumetric attacks.
And then we'll then go forward and then become a bit more application aware and then provide you that better insight, but also better security at the edge.
And then we've got Access, which then provides you a great security interstitial in front of your application.
And then you can integrate that with your Active Directory, GitHub, Facebook, whatever it may be.
And it's something that I personally use to secure up my personal applications that I have running at home.
I also personally use, as well, Argo Tunnel because I'm on an IP address.
There's some carrier-grade NAT, and there's some other crazy stuff that happens.
And I run everything in containers because I want to make it easy.
So I use Argo Tunnel, which then creates a secure encrypted tunnel between Cloudflare and my origin.
And the great thing is that it's very seamless.
It's literally a one-liner command for me to deploy that. And then finally, we have Workers, which allows us to run JavaScript at the edge.
Now, you're probably saying, well, you've just, like, given me a whole world of different features and products and things, but, you know, why?
So one of the things I want to do is go through with you and walk you through what the OWASP Top 10 is.
So check my clock. I've got a few minutes. So I'm going to try and do the OWASP Top 10 in 10 minutes.
I bet my colleague, Pat, that I wouldn't be able to do it, but he seems to think I can.
So for those of you who don't know, the OWASP Top 10 is an open-source framework, which essentially provides the world, essentially, a standard on how and what they should be looking at securing their applications against.
So the last framework refresh was in 2017. Previously, before that, it was 2013.
And it's really interesting to see how they've then evolved as they go on.
So obviously, we're now expecting a new one in 2021, but for now, we're going to focus on 2017.
So firstly, injection. If we look at injection, it's pretty straightforward.
Anything to do with SQL injection, anything to do with remote code execution, but, you know, that doesn't mean anything.
What's the context behind that?
So in each one of these examples, I'm going to give you some taxonomy.
I'm going to give you some examples and statistics. I'm going to give you some proactive steps that you can do outside of being a Cloudflare customer to help maintain your security, and then also explain to you what Cloudflare features and capabilities are available to help you protect against each one of the OWASP Top 10.
So A1, injection. So, again, ability to inject malicious code. So where is this being seen in the world?
Well, Equifax, they had an attack where someone came along and injected some code, and then the attackers were able to then exfiltrate 143 million records.
That's significant. If you consider Equifax and the type of data that they have, that's huge.
Magecart, British Airways, Ticketmaster, all of these attacks were all where attackers injected bad code into an application, and as we all know, the effects were phenomenal.
I mean, Magecart has been extremely successful at what they do, and they made a ton of money.
So what can you do to protect yourself?
You're going to hear a couple of these again because they're repeatable, and the great thing is that a lot of these things are quite easy to mitigate and work towards.
So vulnerability tests. Your applications will always have CVEs coming out.
Now they've got opportunities for open source tools for fuzzing and other ways to find vulnerabilities now.
We're seeing tons and tons of vulnerabilities and CVEs coming out everywhere, and it's extremely hard to stop that.
On average, an organization has 20-plus SaaS products, and they have a whole load of open source software as well as corporate software as well.
I mean, if we look at SharePoint, for example, SharePoint had a huge vulnerability recently as well, and that's something that Cloudflare then built and added within our Cloudflare managed rule sets to provide you that protection.
So, you know, you can have a bit of respite before you can patch your platform itself.
For bigger organizations, there are bug bounty programs.
When I go and talk to a lot of our enterprise customers, this is what they use.
They found it more successful than using vulnerability and pen testing because the bug bounty guys are more motivated to find vulnerabilities in applications than a traditional pen tester or a traditional vulnerability scanner would find.
So customers, for example, HackerOne, I believe, are a customer of ours, as well as BugCrowd as well, can work with you to build that out.
But on the Cloudflare side, obviously, our Cloudflare managed rule sets, as I mentioned, is a really great way of securing your application.
So, for example, if you're running WordPress, it's a very easy one-click enable for a rule set that will give you that level of protection against common vulnerabilities.
We also have firewall rules, which is what Mark mentioned, which then can give you that flexibility to add protections to your application.
We provide regular expression access, and we provide a whole ton of flexibility within that.
And obviously through our security level as well. Our security level is powered by our data.
We use machine learning, and we use all the bad things that people are doing, and then we curate, essentially, our IP reputation database, which can then be controlled both through firewall rules as well as through our security level.
And what we found from customers and the feedback that we've had is that because a lot of the bad actors are essentially scattergunning their approach, we're now able to pick up and learn where the bad things are happening and then deal with them immediately and as quickly as we can.
And what we see is, I believe, over our proactive detection between our DDoS, our Layer 7 DDoS and security level, we block around about 15 to 20 billion events a day.
So, broken authentication.
This is a fascinating one. When I was just doing some research on this, just literally in the last couple of months, there's been two significant breaches.
Now, we've all heard about credential stuffing, where databases have been stolen.
Troy Hunt, who's a big friend of Cloudflare, I think the latest stat that I found yesterday when I was updating this, is that he's got just under 8 billion username and passwords on haveibeenpwned.com.
That's insane. When I've been talking to some more commercial bodies who provide these style services, they have over 16 billion username and passwords.
What's been fascinating is that Uniqlo had 460,000 accounts compromised just between April 23rd and May 10th.
Now, that's huge, and this was in APAC.
But just because those accounts were compromised, that doesn't mean it stops there.
They had access to PII. They had access to partial credit card numbers.
There's a lot of damage that that information can do.
Reddit did a mass reset on a whole ton of username and passwords recently because of that exact reason.
They saw a massive credential stuffing attack, so they took proactive measures to deal with it.
So what can you do within your organization?
Well, you can monitor for an increase in authentication errors.
You can look at things like 403s or 401 errors or whatever your application provides you back.
Do things like enforcing 2FA. If you can enforce two-factor authentication, it requires a human to do something else.
And whether that's via email or whether it's through a token, it makes a big difference because it means that credential stuffing just won't work.
So Cloudflare provide a number of different tools that you can use.
So, for example, our bot management. A majority of these credential stuffing attacks come from bots.
So that's a really solid way of solving it.
Through firewall rules, you can add some protections in there.
For example, Mark demonstrated on the portal they're using CAPTCHAs to validate a human when they access their site.
You can then use things like our rate limiting as well.
So you can add rate limiting onto your login pages. So if you're seeing an increase in 403 errors, rate limiting can count the number of 403s that are seen and now block the IP address from them repeating those attacks.
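The idea of counting 403s per IP address and blocking repeat offenders can be illustrated with a small sliding-window counter. This is a generic sketch of the technique, not Cloudflare's implementation; the class name, threshold, and window length are assumptions chosen for the example.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class FailedLoginLimiter:
    """Flag an IP once it has produced `threshold` 403s within `window_seconds`."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, ip: str, status: int, now: float) -> bool:
        """Record one response; return True if the IP should currently be blocked."""
        hits = self._hits[ip]
        # Drop 403s that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if status == 403:
            hits.append(now)
        return len(hits) >= self.threshold


limiter = FailedLoginLimiter(threshold=3, window_seconds=60)
# 203.0.113.7 is a documentation address used purely for illustration.
for second in (0, 10, 20):
    blocked = limiter.record("203.0.113.7", 403, now=second)
print("blocked after three failures:", blocked)
```

Counting only failed responses, rather than all requests, is the design choice that matters here: legitimate users who log in successfully never approach the threshold.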
And then finally, obviously, our security level.
Because we're learning. We're constantly learning all the time.
We can then use the data that we have to then apply that onto your applications as well through the security level and also through firewall rules.
And you can use and get access to this data through Cloudflare Logs.
Because Cloudflare Logs will provide you that data or give you the error codes that you need.
And then you can start building those protections and you have that full visibility into what's going on.
Fascinatingly as well, and it's unfortunately fallen off the bottom here, but there is a blog post written by John Graham-Cumming, our CTO, about how you can integrate Have I Been Pwned and Cloudflare Workers together to provide a really interesting credential stuffing solution as well.
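The Have I Been Pwned integration mentioned above relies on the Pwned Passwords range lookup, where only the first five characters of a password's SHA-1 hash are sent and the match is done locally. The sketch below shows that matching step in Python; it is not the Worker from the blog post (which would be JavaScript), and the response body here is a hand-made stand-in for what the range endpoint would return.

```python
import hashlib
from typing import Tuple


def split_sha1(password: str) -> Tuple[str, str]:
    """Return the 5-character prefix and 35-character suffix of the SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(password: str, range_response: str) -> int:
    """Look up the password's hash suffix in a 'SUFFIX:COUNT' response body."""
    _, suffix = split_sha1(password)
    for line in range_response.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


prefix, suffix = split_sha1("password")
# Only `prefix` would be sent to the range endpoint; the body below is invented.
fake_response = f"{'0' * 35}:1\n{suffix}:42\n"
print(prefix, breach_count("password", fake_response))
```

Because only the prefix leaves the client, the lookup service never sees the full hash, which is what makes this check reasonable to run on every login attempt.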
So sensitive data exposure.
The fascinating thing is that we talk a lot about SSL.
But it's amazing how many people out there don't have SSL enabled at all.
6.8% of the top 100,000 websites in the world use insecure SSL protocols. 21% of the Alexa top 1,000 don't even use encryption at all.
That's just amazing to hear that.
So what can you do as an organization to make sure that you can maintain that?
Well, firstly, obviously use Cloudflare because we're going to sort all those worldly problems out for you.
But also maintain your own ciphers as well.
Your origin could be exposed on the Internet. So it's important that you provide that security there as well.
Use technologies like HSTS. I mean, that is something Cloudflare can support and help you with.
But also look at DNS features as well.
We see a lot of DNS attacks, DNS poisoning. So DNSSEC is something that Cloudflare provides out of the box as well.
And we make that easy for you. So one of the other things that's fascinating is, and I just realized there's a mistake in my presentation because we renamed Argo Tunnel to Access Tunnel recently, is that Access Tunnel also will then provide that protection between your origin and Cloudflare.
So whilst we were able to provide a solid level of protection between the client and Cloudflare, there is always this opportunity of weakness between the origin and Cloudflare.
And this is where our Access Tunnel product can solve that problem by creating that encrypted tunnel between Cloudflare and your origin.
One of the other things that we can do is we can do very straightforward things.
Automatically enable any HTTP links within your source code.
We can automatically rewrite those to HTTPS. So you can almost use us as that safety blanket to give you that level of protection.
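The rewriting described above can be pictured as a search-and-replace over link attributes, limited to hosts that are known to serve HTTPS. The following is a simplified sketch of that idea and not how Cloudflare implements it: the regular expression only covers `href` and `src` attributes, and the set of HTTPS-capable hosts is an assumption passed in by the caller.

```python
import re
from typing import Set

# Matches href="http://host or src='http://host and captures the host part.
_ATTR_URL = re.compile(
    r'(?P<prefix>\b(?:href|src)=["\'])http://(?P<host>[^/"\']+)',
    re.IGNORECASE,
)


def rewrite_insecure_links(html: str, https_capable_hosts: Set[str]) -> str:
    """Upgrade http:// links to https:// for hosts known to support HTTPS."""

    def _swap(match: "re.Match[str]") -> str:
        host = match.group("host")
        if host.lower() in https_capable_hosts:
            return f'{match.group("prefix")}https://{host}'
        return match.group(0)  # leave unknown hosts untouched

    return _ATTR_URL.sub(_swap, html)


page = '<img src="http://cdn.example.com/a.png"><a href="http://legacy.example.net/">x</a>'
print(rewrite_insecure_links(page, {"cdn.example.com"}))
```

Leaving unknown hosts alone avoids breaking resources that are genuinely only available over HTTP.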
Sorry, I'm flying through. I've got seven minutes left.
I think we're almost doing okay. All right, so hands up. How many people here are still using XML in their applications?
Oh, a couple. All right, so this one will be really quick.
Stop using XML. That's basically the answer to this.
But more importantly, what we've been seeing is that XML processors have been targets of vulnerabilities.
They're old. They're not something that people are investing a load of effort into.
So it's been fascinating seeing XML bombs and people basically taking down XML APIs with just huge volumes of data.
So what can you do? Disable XXE if you don't need it. Why have it on?
And you'll see that as a trend as well, that disabling things you don't need is a really good way of securing the application.
Limit and validate XML payloads.
I mean, you can do that through Cloudflare Workers, and you can validate them before they hit your application.
Cloudflare's managed rule sets as well within our web application firewall do have some good protections against XXE attacks.
But also, again, rate limiting.
If you're seeing a repeated attack, use rate limiting.
It's a really good, solid way of you being able to limit the number of requests that comes to your application.
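The advice to limit and validate XML payloads can be applied at the application as well as at the edge. Here is a minimal Python sketch of the idea, assuming a hypothetical 64 KiB size limit and a blanket refusal of document type declarations; a hardened parser library would be the more thorough option in production.

```python
import xml.etree.ElementTree as ET

MAX_XML_BYTES = 64 * 1024  # hypothetical limit, tune to your API


def validate_xml_payload(payload: bytes, max_bytes: int = MAX_XML_BYTES) -> ET.Element:
    """Reject oversized or DTD-bearing XML before handing it to the parser."""
    if len(payload) > max_bytes:
        raise ValueError("XML payload too large")
    # Entity expansion attacks need a DOCTYPE or ENTITY declaration, so a rough
    # byte-level check refuses both outright (ASCII-compatible encodings only).
    upper = payload.upper()
    if b"<!DOCTYPE" in upper or b"<!ENTITY" in upper:
        raise ValueError("DTDs and entity declarations are not accepted")
    return ET.fromstring(payload)


root = validate_xml_payload(b"<order><id>1</id></order>")
print(root.tag, root.find("id").text)
```

Checking size and declarations before parsing means a malicious document is rejected without the parser ever expanding it.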
So broken access control. This is basically a way of someone being able to get access to an endpoint on your API or on your web application that you think is protected.
So there's two really good examples.
Magento just had a vulnerability in March 2019, which they've patched. But Cathay Pacific, which was fascinating, had a 10-year-old or so vulnerability in their loyalty program, which gave attackers access to essentially it should have been an authenticated area where they got access to a whole ton of information that was extremely sensitive.
I believe that 41 system credentials were leaked during that attack based on the report that they provided.
So what can you do within your organization to do better?
Well, firm up your SDLC. These mistakes often happen when code is often missed or authentication is not put in properly or the authorization and the code and the validation isn't there.
But also you can use things like penetration testing and security audits as well.
So you can validate the access.
So when we look at our APIs, we're making sure that the right access is available to the right people.
But ultimately, log. You've got to log and know what's going on.
Logging access and logging authentication failures and seeing people doing systematic things.
So again, you can add protections with firewall rules.
You can do things like if this cookie is not available, make sure you block this request because we expect these cookies to be available or these session tokens to be available on this request.
So if it's not there, block it. You can use rate limiting again because what we see is systematic abuse where there's a huge scan across your infrastructure or applications, and you can use rate limiting to then limit that.
And obviously, as a good one, you can always use Cloudflare Access to make sure that there is authentication there, and then you can integrate that within your application as well.
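The "block it if the expected cookie is missing" rule described above is simple enough to show as plain logic. This sketch mirrors the decision in Python so it can be tested; the cookie name `session_token` is hypothetical, and at the edge the same check would be written as a firewall rule expression rather than application code.

```python
from http.cookies import SimpleCookie


def should_block(cookie_header: str, required_cookie: str = "session_token") -> bool:
    """Return True when the request lacks the cookie we expect on this endpoint."""
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    return required_cookie not in cookies


print(should_block("theme=dark; session_token=abc123"))  # has the cookie
print(should_block("theme=dark"))                        # missing the cookie
```

Note that this only checks for presence; validating that the session token is genuine remains the application's job.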
So security misconfigurations.
I know I'm whistling through this. Anybody can have a conversation with me offline.
But ultimately, this is where attackers find applications that are running default configurations, default credentials, or in most of the worst circumstances, no authentication at all.
So in March 2019, First American Mortgage had an open S3 bucket, and that had 885 million mortgage documents that were leaked because they didn't put any security or any authentication in front of their S3 bucket.
Inmediata Health leaked 1.5 million patient records due to a website misconfiguration.
So what can you do? Protect your applications before putting them live, making sure that you don't have anything default turned on, and disable or remove any of these default modules or credentials.
There are new technologies, like DAST, so Dynamic Application Security Testing, and these things are good, and they're available.
They're not the cheapest thing in the world, but they will give you what you need, and they will help you find these gaps in your security.
So Cloudflare managed rule sets do have some capabilities in there to look at common applications which we know are sensitive, so things like config files that are not supposed to be ever read by an external entity.
You can use things like firewall rules as well to firewall off access to your application.
You can use things like adding in that you require a cookie or you require a safe IP address to get through.
There's a ton of different things that you can use there, and obviously looking at things like rate limiting and Cloudflare access can significantly help you as well.
So cross-site scripting. I mean, this is something that is sort of age-old and has been around forever, but I think it's really misunderstood as to how powerful this can be for an attacker.
So, for example, let's consider something random.
So alex.com, I have an open cross-site scripting attack, but I'm also the most popular website in the world, and you basically, I don't know, store your Bitcoin in my magic wallet or something.
Someone finds then a cross-site scripting attack on my website.
If they're able to then attack me and then basically plant some JavaScript in this attack, they will then be able to potentially steal my users' session cookies, be able to then log in to the application, and then go and basically steal all the Bitcoin if they wanted to do that.
So WordPress had a vulnerability in their live chat support platform where 60,000 users which used it were potentially compromised because of this cross-site scripting vulnerability.
And again, WordPress themselves have equally had some vulnerabilities with cross-site scripting.
And again, WordPress are very good at patching them.
And they're also very good at informing us so we can then add that security in to our managed rule sets to make sure that you, our customers, are protected.
So again, regular vulnerability and penetration tests.
Make sure that you're running up-to-date software. Bug bounty programs as well will help you find cross-site scripting attacks.
Some of our bigger enterprise customers say that the best success they've had finding cross-site scripting attacks or vulnerabilities is actually through bug bounty programs.
So a Cloudflare managed rule set is a really good way of being able to add that protection to your application.
So insecure deserialization. So ultimately, the bottom line is that when you have a piece of XML and you need to put it in some kind of machine-readable format that's a little bit easier, essentially you can use this technology to make essentially your XML a bit easier to read.
I didn't really explain that very well.
But anyway, so examples of this is that Ruby itself was found to be vulnerable.
But this also extended to a number of other applications as well.
Or sorry, coding languages as well. And even in April 2019, Oracle themselves, WebLogic, had a remote code execution attack as well.
And the reason is that once these pieces of data have been essentially deserialized, they're often used and then stored.
And then often the web server then may execute them at some other time.
And it's that execution piece and the storing piece which is where the vulnerability really is.
So Cloudflare's managed rule sets do provide that level of protection to essentially give you some respite against these style attacks.
And Cloudflare Workers equally as well. But ultimately, you need to have necessary checks on your application workflow.
But also logging attempts for deserialization because it is something that your applications can detect.
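A minimal sketch of the "necessary checks" and logging mentioned here, in Python. It assumes the application can accept a data-only format (JSON) instead of a format that can rebuild arbitrary objects; the field names `user_id` and `action` and the logger name are hypothetical. Anything that does not parse, or does not have exactly the expected shape, is rejected and logged so the attempt is visible.

```python
import json
import logging

logger = logging.getLogger("deserialization")

# Hypothetical schema: the only keys this endpoint is willing to accept.
ALLOWED_KEYS = {"user_id", "action"}


def load_request(raw: bytes) -> dict:
    """Parse untrusted input as plain JSON data and validate its shape."""
    try:
        data = json.loads(raw)
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        logger.warning("rejected undecodable payload: %s", exc)
        raise ValueError("malformed payload") from exc

    if not isinstance(data, dict) or set(data) != ALLOWED_KEYS:
        logger.warning("rejected payload with unexpected shape")
        raise ValueError("unexpected payload shape")

    if not isinstance(data["user_id"], int) or not isinstance(data["action"], str):
        logger.warning("rejected payload with unexpected field types")
        raise ValueError("unexpected field types")

    return data
```

JSON parsing only ever yields plain dictionaries, lists, strings, and numbers, so nothing in the payload is stored and executed later, which is the step where the vulnerability lives.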
So one of the things that I found fascinating when I came into the security world was that actually a number of people that I spoke to often used software that they just picked up from GitHub or picked off someone's fork or whatever it may be.
And a lot of those pieces of applications may even have CVEs or vulnerabilities available.
So one of the things that I think is really critical is that when you're loading up a new application or you're finding it, make sure you've got the most up-to-date version.
And you can use things like vulnerability scanners to go and understand whether those applications do have vulnerabilities in them.
But if you've got old applications or things that you would consider legacy that are in that position where I can't update these because I think it might break everything, then try and disable as much as you can.
And ultimately, if you can patch them, do do that. But Cloudflare's managed rule sets, again, will provide you that good level of protection.
Finally, and the last one, and I'm way over time, but insufficient logging and monitoring.
So one of the things and one of the trends that I've talked about a lot through this presentation is making sure you have that visibility into what's going on.
As Mark mentioned, Cloudflare's firewall analytics was a huge step in the right direction to give AutoTrader that visibility into what was going on.
So between the 23rd of April and the 22nd of May, Cloudflare blocked 1.3 trillion requests.
And ultimately, for us to store all of that data and then provide that to our customers is critical because it means that they can see what's going on.
So having that integration with things like Cloudflare Logs, as Mark has at AutoTrader, but we've also made that quite easy now with integrations with things like Sumo Logic, Splunk, Datadog, and Looker, and I believe I saw Datadog here today.
But ultimately, what can you do at your organization to make that successful? One of the big things that we often see and Target had when they got attacked was that they had a ton of investment made in the security side.
They had FireEye. They had all the things that they needed.
But the challenge that they faced was that they had so many alerts and so many errors going off that the people monitoring in their SOCs just basically ignored them because they didn't really know what they meant and they didn't really understand why they were coming in.
So that was the reason why Target's attackers were able to then live and sit there and harvest everything that they needed to.
And similarly with Magecart as well.
They go in under the radar and then they stay under the radar for as long as possible.
And that's the reason why being able to control your signal-to-noise ratio is really important.
So understand what is critically important to your business. All right.
So in summary, make security part of your SDLC if it's not already. And I believe that a majority of people here will always be considering security as part of their SDLC.
But invest in the monitoring side. Assess risk and make sure that you're communicating about security up and down your organization.
What I often see is that security is talked about lower down, but it's not often talked about at a board level.
And the C-level execs do need to understand what's going on and they do need to understand how much of a risk or if there are potential risks to your organization.
The great thing about Cloudflare is that our customer success and our solutions team as well as our support teams are excellent.
So if you do need help with any of the things I've talked about today, they are more than happy to help set this up.
Our product marketing team came up with a great concept about the Cloudflare maturity model.
And that will help you get a much better understanding as to how well you're using Cloudflare's products and services and also make sure we can get you on a plan to increasing your security.
There are great resources available through our blog, our developer documentation, and our enterprise newsletter.
And obviously if you need anything more from me, my email address is alexcf at Cloudflare.com or Twitter at alexcf.
Great. Thank you, everybody. Thank you for coming.
I hope you enjoyed your event. I hope you had a very nice lunch. And welcome to this afternoon session, so the first one talking about DevOps and how to build serverless computing solutions.
And in this talk, for 30 minutes, we are going to talk about how to accelerate your application delivery.
Yeah.
One, two, one, two.
Is that any better? A little bit more, please. And here, one, two, three, four.
Good? Okay, great. So, yeah, I was saying we are going to talk about application delivery and how to make it faster and easier, but not from the angle of Cloudflare's portfolio this time. We'll focus on what it actually takes to get an application created, connected to the Internet, and connected to Cloudflare's edge, so that traffic starts flowing through Cloudflare to reach your origin.
So before we start a bit about myself, I'm Stéphane Nouvelon.
I'm part of the solution engineering team at Cloudflare, and it's been now two and a half years.
And the solution engineering team at Cloudflare plays a technical advocacy role for our customers and has two main objectives.
So to help you make the most of Cloudflare and to help you integrate our solutions in the most efficient way.
So Instant Promo, the solution engineering team has prepared some very hands-on demonstrations.
You can find them next to the entrance of the Conti suite room, I guess.
So that's a nice occasion to see them face-to-face and to see our products in action.
So the agenda for today: we're going to start by talking about the challenges of getting an application online and connected to the Internet.
Then we're going to take a step back and define what is Cloudflare and how does it work at core, stripping any portfolios we have in the equation, but talking about the concept of Cloudflare.
We're going to see how to reduce the distance between your origins and Cloudflare by connecting Cloudflare virtually on top of your rack with some tunnels.
So actually having the opposite instead of you talking to us, having us talking to you.
And we are going to see how to bring that logic and those benefits to Kubernetes deployments with our ingress controller.
We are going to see how to gain agility and a bit of latency with serverless computing with Cloudflare Workers.
And we're going to see how to automate your configuration in a nice way without investing time with APIs, because we know that we are not the only cloud vendor you use in your stack.
So we want you to make this easy and we want to work for you on this.
And we're going to end up with a live demo.
So you're going to see the journey of a small web application I created for the occasion running on Kubernetes with two services.
We are going to open tunnels to Cloudflare so Cloudflare balances the load between both of them.
And we are going to apply logic in serverless to authenticate access to some assets with tokens.
And we're going to automate everything in Terraform.
So the main objective is to see me like doing hands-on manipulation without touching the interface.
So you have an idea of what you can do in your DevOps pipelines.
So first, so what does it take to publish an application? So maybe the first point can be seen a bit trivial but it is not actually.
It is arising some challenges.
You need a public IP. So if you know Cloudflare a bit, you know that you do create from the beginning to the bottom.
You create a DNS record which is a subdomain of your domain where you want to host your application.
And that subdomain on the Cloudflare side will need to point somehow to a public IP, IPv4, quad-A, whatever.
The problem with a public IP is that this IP might be dynamic.
So you need to keep state between your different systems interacting with it. That is extra work, and it may come with a cost if you want it to be static.
The second thing is that you might need to do some traffic engineering to make sure that this new traffic reaching that IP is actually reaching your final application to make it work.
And last but not least, in terms of security, everything that you open to the outside world from your trusted zone needs to be patched.
It needs to be patched to be sure that only Cloudflare is reaching your application, your origin.
The second thing is the eternal dilemma around running code server-side and client-side.
And I think John has made a very nice demonstration explaining both of the challenge that they bring.
So lack of control for the client-side and maybe a question of latency and performance for the server-side.
So we are going to see how to bring serverless to run code, to stitch together the best of the two worlds.
And finally, because there is a multiplication of Cloud vendors, as I said, we have the humility to think that Cloudflare is not the only one you use in your stack.
And we want to work for you because we don't want you to learn by heart 10 different APIs and integrate with them.
So that's why we made the investment in Terraform, which allows you to do your automation with only one language, with only one abstraction level.
So let's take a step back. So when I need to define what is Cloudflare, I like to not explain Cloudflare by its portfolio because that can be confusing.
We do a lot of stuff. I like to explain that Cloudflare is a network, a very large network present in 180 different locations.
And this network has a main goal to put in contact visitors and web assets.
And as a result, the web assets are supposed to receive only clean traffic because we do our magic by this position of man-in-the-middle, security, reliability, and performance.
And then your application is safer and faster.
And your visitors are connecting to your application where they are with the Anycast network and the BGP concept that we are leveraging.
So in that presentation, we are going to focus on the right-hand side, so basically between the cloud and the web assets.
So not talking about the portfolio of Cloudflare, but actually what it takes to get that web asset created, hosted somewhere, or maybe in multiple locations, and to get it connected to Cloudflare to start publishing your application.
So the first product I wanted to introduce to make this happen.
So usually you know that you need to point to an IP address.
You need to have this IP address available and secure and everything.
So what about doing actually the opposite? So instead of having Cloudflare talking to you, you talk to Cloudflare to maintain tunnels.
So you have a private connection with Cloudflare.
You have your application public on the Internet, but private for the external world.
So you are sure that all of the traffic reaching your application is flowing through Cloudflare.
So as a solution, Argo Tunnel provides VPN tunnels and minimal bootstrapping configuration.
So when you declare a new service, the minimal bootstrapping will be done for you.
So DNS records or load balancers if you have more than one traffic ingestion point. And then as an extra on the performance side, we can apply some compression as well.
So you go an extra mile in terms of performance. So now to bring the benefits of Argo Tunnel to Kubernetes.
So quickly about Kubernetes. Kubernetes, I will describe it as Kubernetes is to the containers, what VMware is to the virtual machines.
It's a container scheduler that allows you to operate at scale, the scalability, the network, the storage, and everything related with your deployment of containers that can be microservice or that can be your own policy of having all of your application containerized.
So the ingress controller provides exactly the same benefits as Argo Tunnel with one bonus, one exception: the way it works, as a daemon.
So you do create your service and you can see on the right-hand side, line 44 and 45.
So there's a minimal configuration, but those annotations are saying to your Kubernetes that this service will need to be connected to Cloudflare's edge, thanks to tunnels, that the bootstrapping of the configuration will be done in such a way that this service will be load-balanced and inserted in the pool named left.
And on line 52, that this application will need to be public through a hostname named demo.justalittlebyte.ovh.
So now, and because DNS records and load balancers are not the only thing you can create on Cloudflare, far from this, so you can manipulate everything, basically reliability, security, performance.
As I said, we want you to do it in a nice way without investing time wrapping up with your API.
So we have different solutions. We have SDKs in PHP, in Go, in Python, but we also invest in Terraform, which is quite popular and supported by the majority of the cloud vendors, and Cloudflare has its own provider.
So as a result, you manage your infrastructure configuration as code, not only for Cloudflare.
If you do maintain your application, your containers, load balancers, instance, and Cloudflare configuration, you do manage one infrastructure with one language, one source code, and for sure that's very compatible with continuous integration and continuous delivery pipelines you have, you might have.
So now about the challenge of running code server-side or client-side.
As I said, server-side, if your application is not global, but your market is, in some case, your computing is done far away from the visitor, so it can induce some latency.
At the same time, client-side, you do have a lack of visibility of simply starting with the fact that is JavaScript enabled?
Am I in a risk to break my application?
Or even like the browser type or the operating system, which needs you to do an extra work of adaptability to make sure that you maintain like a global compatibility.
So with Cloudflare Workers, we provide a third place to run code, which is basically stitching together the best of the two worlds.
You have your script executed in every one of our locations, pushed to the edge and live in less than three seconds, and computed, in almost all of the world, less than 10 milliseconds from your visitors.
So it provides you very fast machines to do your computing intensive work that you do again and again and again at your origin that you can push to the edge.
You can implement some stateful logic with the storage we have.
So it means that one worker instantiation can have access to the state or the disposition of the workers instantiation that happened before.
And finally, and again, very compatible with continuous integration and deployment.
So everything we do at Cloudflare starts with the API and then the UI.
So you can see a nice UI where you can do preview. You can test and troubleshoot in advance.
But then when it comes to automate, that is also available to you.
So we are going to go through a small demonstration that's going to last 15 minutes.
And we're going to wrap up the three concepts we've seen. So there's going to be the journey of a small web application.
I'm going to create it and host it on two Kubernetes clusters in different locations, which I want private, so no connection from the Internet outside and only outgoing HTTPS connections, connected to the edge of Cloudflare and load balanced by Cloudflare without touching anything.
I want the DNS records and load balancers to be created automatically.
And we're going to focus on the security scenario I have to authenticate all of the assets on my website.
So the videos, I want my visitors to come to my website to get a token to access the video.
If they share the link, I want it to be expired after 60 seconds. And for closing the subject, we are going to automate everything with Terraform.
So as a no directive, I'm not going to touch the interface of Cloudflare at all.
So the first step, we have seen it already.
So that's the configuration of Kubernetes, very basic one.
I'm just pointing to the container I want my service to run, the number of replicas, like quite usual Kubernetes stuff.
The main change we have here is about the ingress, where I'm not pointing to a load balancer, a public IP or a VPC that allows me to receive traffic from outside.
I'm just telling Argo Tunnel, hey, that service will need to be connected to Cloudflare's edge, needs to be automated in terms of bootstrapping, needs to be load balanced, because I'm going to start with left, but then I'm going to create the right side for resiliency, and I want that service to run and to be available on demo.justalittlebyte.ovh.
So we are going to do that. So for doing so, I'm using the SDK of Kubernetes, the kubectl, which is the command line, and I'm going to do in parallel something.
So I'm going to show you the logs live of my Argo Tunnel machine.
So I'm skipping the step of creating Argo Tunnels in Kubernetes. That's very trivial.
You do it once. That's compatible with scalability. That's very well documented.
You do configure it once and you just leave it. It's going to work forever.
So we are going to see logs, and we are going to come back to the logs when I'm going to create the service to see what happens with the Ingress Controller.
So I applied the first configuration, which is the left.
And if we go to the logs, we see the Ingress Controller discovering a new service, demo.justalittlebyte.ovh, pointing to my origin left on port 80, and it starts spinning up service tunnels.
So we see two tunnels created to Frankfurt and Brussels.
The reason is resiliency. We want two tunnels connected to different machines so we can start moving bits around if there is an issue locally.
And we are going to do the same for the right-hand side.
And normally, again, we should see the service discovered by Ingress Controller.
Cool.
Discovered. This time to the right, same port, and we're starting connecting tunnels.
So the tunnels are created per service, four times, but you can configure that depending on your resiliency needs.
And the tunnels are going to be connected to the closest position from your Kubernetes cluster and where it is.
That's the advantage of having an anycast network so you can guess that my locations are already in Europe somewhere because we see Frankfurt and Brussels.
And Amsterdam here.
Cool. So I think you can do the test yourself now. So the first step is done. So my application, normally, I'm not going to test.
Yes, I'm going to test. This application is already running on Kubernetes.
Kubernetes is private. I want to be sure that traffic flows through Cloudflare before reaching my origin and is already load balanced by Cloudflare.
So let's do the test. Cool. So that's my page.
Done. First step, okay. So now the second step, and if you look on your phone, the main page provides you a link to go to a Kayak video.
I want that video to be served to visitors flowing through my page, and not have those users sharing it on social media or whatever.
I want them because I do have a subscription. I want them to connect to my website.
I want to tokenize that access and to make sure that it's expired after 60 seconds.
So that logic, why not running it at the origin? Because it wouldn't make sense.
So every request I'm going to receive, maybe 35% of them are going to be bounced back.
So why doing that at the origin where the request will need to flow the whole way from the visitors through Cloudflare and through the origin at the end to have a waste of bandwidth?
Because I can do that at the edge at 10 milliseconds from my visitors and very quickly as well.
So that's the client-side integration.
That's how it works. And if you do the test already, if you click on Kayak video, so the video is going to play anyway because that's still at the phase of the concept.
The client-side is integrated, but there is no way for now to actually validate the token, if it is good or not.
So you can see on that link, you don't see that link very much.
So you can see in that link, I'm explaining to you.
So that logic is basically generating on the fly, for any request you make to the main page, a unique token based on the timestamp when you requested the page plus the path, with a secret I'm going to share with Cloudflare Workers, which is Cloudflare.
And at the end, I insert that link into the page, so when you click on that link, you are going to have the token, which is valid for 60 seconds.
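The token generation described here can be sketched in Python as follows. This is an illustration under stated assumptions, not the code shown in the talk: the use of HMAC-SHA256, the hex encoding, the query parameter name `verify`, and the example path are all hypothetical. What comes from the talk is the shape: a timestamp and a path signed with a shared secret ("Cloudflare" in the demo), and a link carrying the timestamp, a dash, and the token.

```python
import hashlib
import hmac
import time
from typing import Optional

# Demo secret named in the talk; a real deployment would use a private random key.
SECRET = b"Cloudflare"


def sign_path(path: str, now: Optional[int] = None) -> str:
    """Return `path` with a query string of the form <timestamp>-<token>."""
    timestamp = int(time.time()) if now is None else now
    message = f"{path}{timestamp}".encode()
    # Keyed hash over path + timestamp: changing either one invalidates the token.
    token = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return f"{path}?verify={timestamp}-{token}"


# Hypothetical asset path, signed at page-render time.
print(sign_path("/video/kayak.mp4"))
```

Because the timestamp travels in the link in the clear, the checking side can recompute the token and also decide whether the 60 seconds have elapsed.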
That's it. So that's not working because that's only client-side. And that's the way to do it server-side.
Basically, the worker will do three things.
So the first thing is validating the validity of the request. Is my request containing a query string?
Is that query string composed by two groups? The first one, the time frame, which is decimal.
So that's the regex to the right-hand side and separated with a dash and then having the token.
If it is the case, we are going to start importing the key, which is Cloudflare, the same as the client-side.
And we are going to start verifying and serving the content if valid and not expired and refusing if not valid or if expired.
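The three checks the worker performs can be sketched like this. It is written in Python for illustration, whereas the worker in the demo runs as a script on Cloudflare's edge; the function names, the hex HMAC-SHA256 token format, and the status messages are assumptions. The structure follows the description: check the query string format with a regex, recompute the token with the shared key, then serve only if the token is valid and not older than 60 seconds.

```python
import hashlib
import hmac
import re
import time
from typing import Optional, Tuple

SECRET = b"Cloudflare"   # same demo key as on the signing side
MAX_AGE_SECONDS = 60     # links expire after 60 seconds

# Two groups separated by a dash: decimal timestamp, then a hex token.
QUERY_PATTERN = re.compile(r"([0-9]+)-([0-9a-f]{64})")


def expected_token(path: str, timestamp: int) -> str:
    """Recompute the token the signing side would have produced."""
    message = f"{path}{timestamp}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def check_request(path: str, verify: Optional[str],
                  now: Optional[int] = None) -> Tuple[int, str]:
    """Return (status, reason) for a request to a protected asset."""
    # 1. Is there a query string, and does it have the expected format?
    if not verify:
        return 403, "missing token"
    match = QUERY_PATTERN.fullmatch(verify)
    if match is None:
        return 403, "invalid token"

    # 2. Does the token match the path and timestamp it claims to cover?
    timestamp = int(match.group(1))
    if not hmac.compare_digest(match.group(2), expected_token(path, timestamp)):
        return 403, "invalid token"

    # 3. Is the link still fresh?
    current = int(time.time()) if now is None else now
    if current - timestamp > MAX_AGE_SECONDS:
        return 403, "token expired"
    return 200, "ok"
```

Checking the signature before the age means an edited timestamp is reported as an invalid token rather than as an expired one.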
So let's go back to this configuration.
And that's here that we are going to introduce some logic.
So we are for sure going to use Terraform. I'm too lazy to use the interface.
And in that configuration, I'm basically defining three things.
I'm defining the fact that this configuration is going to use the provider Cloudflare.
So I'm going to do things related to Cloudflare. For sure, some authentication in number two because that's at the end talking to the API.
And the third thing, I'm going to define what kind of resource I want to create.
So the first resource is actually the creation of the script and the upload to Cloudflare's Edge.
And the second one is effectively the trigger because if you have the script, the script does nothing by default.
You activate the script by triggers, by portion of your application you want to trigger on.
And in that situation, I do ask the script to be computed on the link slash token node.
So all of my asset behind token node, something will need to trigger and will need to have a valid link.
Otherwise, you will receive a 403. So Terraform is great because it provides basically two steps.
I'm simplifying and it does more than two. But actually the first place, the first step is the plan.
So the plan is testing the existence of the resources you are asking it to create.
And in that result, you can see two things.
So the first thing is that I don't have any syntax error, which is good.
And the second thing is that the plan is about two additions. So actually the script and the route do not exist yet, which is true.
And the second phase is about applying the change.
So syntax check again, checking the number of addition and I'm running it in interactive mode, but for sure you can like automate everything and do it programmatically.
And if I validate, normally it is going to start converting all of your resources into a succession of API calls that you don't need to know about.
You just talk about resources in Terraform and it does the magic for you.
So you can see here that the application is complete. Two resources have been added.
So normally I have my script and triggering on slash token or something.
Cool.
So let's go back to the main page or maybe let's keep this link, which was normally a valid link, but more than 60 seconds I guess spent already.
So if I do refresh, yes, I'm going to start seeing token expired and you can see how fast it was for Terraform to push to the API, to the API to start pushing everywhere in all of the data center and me connected to potentially London, I guess, start getting the result of what we have done.
So to go back, I'm going to need to connect to the main page because I do have my subscription.
I want to control all of my visitors.
And if I do that again, I'll get access to the page. So you can see that the whole token consists of the timestamp and the path of the video.
So if I'm smart, I'm trying to be smart and change the timestamp for something else, I'm going to start to see invalid token.
Same way if I'm trying to strip something from the token, I'm going to see an error as well.
Cool.
So wrapping up, we have seen three things today, only focusing on the right-hand side: how to get an application created, to have it hosted somewhere, connected to Cloudflare's edge, and to start doing your business.
So flowing traffic to your applications without thinking about traffic engineering or to get validation to that new traffic flowing to that public IP.
So Argo Tunnel provides you a way to do actually the opposite.
You do SD1 to Cloudflare so Cloudflare can come back to your origin so you keep them private.
Plus bootstrapping, some automation, minimal automation about DNS records and load balancing, and as a bonus, a performance gain from Brotli compression directly at the listener side.
The second thing about the acceleration, think about all of your application, the type of computing scenario you have very repetitive that you have some challenge around having it scale.
You can move it to Cloudflare's Edge to have fast next to your visitors everywhere and we grow with you for sure to have it computed quickly.
And the last is about DevOps and having it automated. When it comes to delivering applications and microservices at scale, you for sure don't want to use the APIs, the interface or whatever.
You want to integrate that into a nice global CI-CD pipeline.
So with Terraform, we provide you a framework supporting one language, where you don't need to integrate with our APIs or any other APIs.
With that said, thank you for your attention and you do have the full configuration available to this link and for anything else, we can continue the discussion outside.
Thank you. Hi, we're Cloudflare.
We're building one of the world's largest global cloud networks to help make the Internet faster, more secure and more reliable.
Meet our customer, AO.com, an online retailer specializing in electrical goods from washers, dryers and refrigerators to televisions and home entertainment systems.
They transact over £1 billion per year. My name's Austin Davis. I'm a DevOps engineer at AO.com.
We work on solutions to make development teams go faster. One of the challenges faced by AO.com was to be the best amongst their competition when it came to site performance.
Faster web performance for the business is very important in our industry.
In e-commerce, speed is a differentiating factor. You can literally buy any product from us.
You can get that from other retailers, but what you can't get from other retailers is speed.
Having the best security for their business is another major differentiator for AO.com.
Web security for the business is hugely important.
We're seeing more attacks. We're seeing our competitors get breached.
If we ever have downtime, even if it's just for 10 minutes, the cost to our business is huge.
AO.com saw immediate benefits by selecting Cloudflare as their performance and security provider, from quick production adoption to ease of use to competitive pricing.
With Cloudflare, it seems like security is through and through in all the products, constantly in the forefront of what they do.
With things like workers, with things like the security, they all seem to be miles ahead of the competition.
With customers like AO.com and over 10 million other domains that trust Cloudflare with their security and performance, we're making the Internet fast, secure, and reliable for everyone.
Cloudflare. Helping build a better Internet.