Between Two Clouds - A Look Inside Cloudflare Support
Presented by: Shane Ossa, Garret Brown
Originally aired on September 30, 2021 @ 12:30 AM - 1:00 AM EDT
Inside Cloudflare Support explores the people and processes behind Cloudflare's Customer Support team and service. Each segment will include a discussion with a different Customer Support professional on their experiences and their take on the effort to support Cloudflare's customers and products.
Transcript (Beta)
Hello and welcome to Between Two Clouds: Inside Cloudflare Customer Support. My name is Shane Ossa.
I'm the host of this segment, this Cloudflare TV segment, which aims to get to know members of the Cloudflare customer support team and get to know our process and our service a little bit.
So I'm the training manager for the team.
It's my job to make sure everyone gets the training they need to do their jobs.
I have a whole team, which is really great. Shout out to my team. And today I'm honored to be joined by our support manager of Escalation Engineering, Garret Brown.
Hey Garret. Hey Shane. Thanks for coming on. I've been meaning to get you on this forever.
You put yourself on the list and we've gotten to your name on the list.
So here we are. I finally made it. You made it to the big time. Yeah, it's good to have you.
There's so much I'd like to talk to you about. For the listeners and viewers, some context:
Garret is the head of Escalation Engineering on the customer support team at Cloudflare.
So his team handles the really tough escalations on customer issues.
Usually it's stuff where the average tech support engineer might not have access to a tool or data, and that gets escalated up to Garret's team.
Yep, that's exactly right. And how's that going?
Oh, it's good. It's good. So we've only existed for about two years now. So we're actually a relatively new group, but we're global.
So I manage the entire global team.
We're fairly small, less than 10 people at the moment. But yeah, we get to handle kind of all the tier two escalations from the customer facing teams.
So we handle more than just support. We actually handle issues from solutions, sales, customer success, all kinds of places.
Yeah. So, I mean, it's still relatively new.
Two years is a decent amount of time. Still learning. You're not relatively new.
I can see that you have a shirt that has the old, old, old Cloudflare logo.
That's right. I don't know how marketing is going to feel about that.
But yeah, tell us how long have you been at Cloudflare? So I've been here for over five years, which doesn't seem like a long time, but it is a long time.
That's a long time.
Yeah. But I actually started as a technical support engineer. So I was one of those frontline engineers, customer facing, getting things done, and then just worked my way up, became a team lead, did that for a while, and then moved to escalation engineering.
Yeah. Yeah, that's right. And so you've moved around a couple of times too.
I remember that you were living in San Francisco, but now you're based in Austin.
Yeah. So when I started with the company, I moved from Ohio to California, San Francisco.
I was there for a whopping three months. And then there was talk of opening an Austin office.
So at first I actually declined it. I said, no, I'm not interested.
I just got here. And then I thought about it and I was like, actually, I will take that offer.
So I moved out here and I've loved it ever since. So one of the first people in the Austin office?
Yeah. One of the first. I wasn't the first.
Right. Right. But I was one of the first people. Yeah. So we were very small.
I think there were only like three of us in the office in the beginning on the support side.
Yeah. Yeah. That's crazy. Yeah. So I've been with Cloudflare four and a half years now and you were here when I joined and you were one of the best tech support engineers.
And then all our tech support engineers are amazing. And then, yeah, I mean, the team has grown really fast.
I think it was about 25 or 28 since I joined and we've now grown to over 130, I think, on our team.
Yeah. Quite significant amount of people now.
So it's awesome. It's crazy to think that when I was doing enterprise, we maybe had like 10 tickets in the queue.
We actually had, like, two people on the enterprise queue. Now it's hundreds of tickets, multiple engineers, all kinds of things going on.
Yeah.
Super exciting. Yeah. Yeah. It's been interesting, challenging and fun to sort of try to build systems, tooling and process to sort of go along with the evolution of the company and the team and the product offering and the growth of customers.
And sort of a natural progression is to develop sort of like you said, a tier two, second level of support.
Can you talk to us a bit about that? Is that pretty common in the industry?
From what I understand, yes. I mean, from what I've seen, a lot of other companies will have tier one, tier two, tier three, maybe even a tier four type thing.
We don't have it as set in stone what a tier is.
Right. But kind of loosely defined, our tier one team is kind of our front line technical support engineers.
Tier two would be kind of my team.
And then tier three, I suppose, would be kind of when we escalate to product and engineering.
Right. So my team, we kind of handle anything between customer facing teams and product and engineering teams.
But the tier two aspect is really just, you know, that we're getting handed, exactly like you said in the beginning, the more difficult problems, deep dives, things that need a higher level of access, certain tooling, things like that.
So. And I heard you might be hiring.
Oh, we're hiring. Yeah. So you got to look at the careers page. We've got openings around the world.
Cloudflare.com slash careers. That's right.
And you can come and work for Garret's team. For a candidate for escalation engineer, what sort of skill set, background, experience, or just general attitudes would you be looking for?
The thing that I always tell candidates is that the number one trait I'm looking for is really curiosity.
I want people who are motivated, but also curious.
Right. Because that kind of translates to being driven and learning and growing.
Right. We don't expect that we're going to hire people who know every single thing; that just doesn't happen.
We hire people who have a good base set of kind of fundamental skills and understanding, and then we grow them so that they can handle kind of the issues that come across our plate.
Right. Right. But we've been really heavily pushing a network focus.
Okay. Right.
So because we have Magic Transit, Magic WAN, all these different things that we have to handle.
And so network focus is a big thing, kind of layer three, layer four fundamentals.
But honestly, I'm looking for people who are motivated to even learn.
Right. And that doesn't mean we're not looking for people who understand layer seven: proxy, DNS, HTTP.
Right. We want people with a wide variety of skill sets in the technical realm.
Yeah. Yeah.
We like people who want to know how things work, who want to set it up themselves, set up a server themselves, put it behind Cloudflare, play with the features, you know, and really get to know it, do testing and experimentation.
Right. Yep. And that's exactly how I learned. I mean, I'm a nerd at heart.
I love doing stuff. And so I love to, to just set something up, build it, learn from it, break it, fix it.
That's how I really feel like you kind of learn how things really, really work.
Yeah. Academics is great. But being able to put that into practice is, is, you know, the fundamental part.
So absolutely. Yeah, we have a training program.
I mean, I can speak to that for a moment that we have, you know, a very structured training program that takes people through their first 90 days, we have sort of an intensive boot camp at the beginning that introduces everyone to the core concepts.
But then it continues on with hands-on training, remote training, eLearnings, broken servers that you have to fix, and little assessments that are submitted and graded, where we give people feedback. That really helps us identify individual knowledge gaps and individual training needs.
People can play around with our test servers, but we also walk them through and hold someone's hand, if they need it, setting up their own test servers.
And learning command line tools, right? People come to us with some command line experience, but we also do a ton of training on curl and dig, and lots of other sort of, you know, common open source command line tools.
Exactly.
And then, I mean, we have a lot of our own kind of things that we've built, like crossbow and whatnot, that we want to train engineers on because they haven't used that before.
Right. Command line is huge, not being afraid of the command line.
When I started, I barely knew command line at all. Right. You know, so like, that's, it just wasn't something I had to deal with before.
Yeah. Right. Yeah.
So kind of my past history was, I worked at Ohio University for many, many years.
I actually started when I was like 16, doing their IT. Yeah. And so that was in one of their engineering departments, the Institute for Corrosion.
And so I managed all of their stuff, kind of worked to manage their computers, kind of went from having a bunch of different computers that were all independent to going on Active Directory, right?
So we were very much a Windows place. That's why my command line was just a little bit of PowerShell.
Yeah. But it taught me the kind of the fundamentals of troubleshooting, which is something that I also look for in engineers.
And that's kind of that curiosity of being able to take a problem, you know, looking at it as a puzzle and breaking it down and saying, you know, let me, let me try this.
Let me look at this.
Let me rule this out and then move forward. Right. And that's really the fundamentals with the issues that come across our desk: we have to be able to take that issue and move it forward.
Even if the engineer themselves doesn't solve it in their shift, we want to take a look at it and move it forward.
Right. So do debugging on this, rule this thing out, check that out, say that, that looks funny to me.
That looks unexpected. Maybe I'll talk to engineering about it, escalate it to them, get their point of view.
Right. But it's kind of things like that.
So, but yeah, my background is, is very much IT and, and software development.
Right. So those were, those were kind of the things that I had.
And then I started doing technical support engineering here. Right. And that taught me a ton of stuff, right.
Cloud, proxy, everything like that. And now those are my loves.
And now I do Linux, Linux, Linux, and I don't think I've installed Windows in probably five years.
You know, you should give it a try. It's gotten a lot better recently, but.
Honestly, I like Windows. So I'm operating system agnostic, I guess.
Like, I don't, I don't, I think they all have their place in certain regards.
Right. So, you know, what's your, what's your sort of latest home project in terms of, you know, geeking out on some of the tech?
Well, I've been doing, I kind of have so many different little projects that I got that I kind of work on.
I was building my own kind of like enterprise servers. And so I got those built, some, some R720s that I set up.
And so I learned all that, right. And I put some virtualization software on those and I'm running all kinds of random things. It's basically a home lab where I can go in and set up whatever I want.
So I could install Windows if I wanted to, I could install, you know, Ubuntu, whatever I want.
A good, good platform to be able to kind of build on. Besides that, I was doing a lot with like Prometheus, messing around with Prometheus.
That's all kinds of little things.
Lately, I've been doing a lot of Cloudflare Workers though. So a lot of like software development.
Right. That's been my, my projects lately. Nice.
And what, what brought you to Cloudflare? Like you were doing Ohio State, you know, kind of IT and, or not Ohio State, I guess the university.
Yeah. That's a, that's a no, no.
We're not Ohio State. Okay. Okay. Totally different. No, but what, what brought me to Cloudflare was, so actually I was part of a, a startup with a friend of mine and he actually moved out to San Francisco and he was trying to get me to move to, to San Francisco.
And, and I was like, well, you know, I don't know if it's right for me.
I hear it's expensive, but I'll definitely come check it out.
Right. So he actually, the startup paid to fly me out there and I spent a week out there and I was like, I actually really do like it.
And so I started looking for jobs and Cloudflare was one of those that came up and I was like, I've seen this company before.
It looks really interesting. And I applied. And then I, I think the recruiter reached out to me probably within less than a week, said they're interested.
I took the take-home test and then did a, did a phone interview and went from there.
Do you still have the take-home test? Could you like?
I'm sure I do. And I'm sure I did. I'm sure actually I did quite bad on it compared to what I could do now.
But that's the secret about the take-home test is it really doesn't matter if you do bad, it matters that you try and that you use your curiosity and drive, because we don't expect that you're going to know every little thing.
We want to, we want to see that you're, you're, you know, motivated and driven and trying.
Right. I mean, even if you get a question wrong, just writing your thought process down, right.
Exactly. This is what I was thinking. And I tested this and I tried that.
And based on that, you know, I'm leaning towards this being the real cause of the problem.
And, you know, it's just like when you're solving customer issues: as you gain experience, you'll be able to know what the answer is almost immediately.
But initially, when you come across a problem for the first time, you're, you're going to have to use that, that troubleshooting methodology and just kind of work through it.
Right. So that's really what we're looking for in those tests. Like, can you use kind of that methodology to try to solve the problem?
Right. Or take it that step forward.
So. Yeah, that's great. So what's, what's upcoming, what's next for the escalation engineer team?
Like what's, what's on your roadmap? What are you building?
What are you working on? What's kind of exciting? We've got a lot of things going.
A lot of kind of, we've been working with engineering and kind of optimizing our process.
Right. Better reporting, being better able to know where we are in the customer journey, the customer story, to ensure that we're catching things, right.
The worst thing, kind of the worst experience that we can have for customers is like something gets escalated and then it just kind of hangs and doesn't really, you know, get to a resolution fast enough.
Right.
So really trying to catch those, really trying to just make everything a little bit more clear for where we are.
We, we probably handle over a hundred escalations a week.
And some of those we're handling ourselves.
Some of those we're reaching out to engineering teams for help with.
And so keeping track of all those and making sure that things are happening in a timely fashion is really important.
So. Yeah.
But besides that, growing the team is huge. Right. Yeah. So right now I manage the entire team.
And so I'm hoping to bring on some managers to help me with that as we scale.
So scaling and growth. Right. It's hard to manage 24 people.
You need a couple of other managers in there to help with that.
That's right. That's right. Yeah. Yeah. Yeah. You got me thinking about a lot of things as you were sort of talking about how we're scaling up the escalation program.
You know, we talk a lot, as the training manager and the manager of escalations, about how we can close some of those knowledge loops. You mentioned in passing that your team would handle most of these things, or some of them, right.
And the others get escalated. So there's, there's an opportunity there when the escalation stops and gets fixed by your team to sort of do a retrospective on that incident or issue and say, could the tech support engineer or success manager, or the solutions engineer who escalated that have solved it themselves?
Can we enable them?
Is it a tooling problem? You know, is it simply that they don't have access to the same data that we do, or is it a knowledge opportunity, a learning opportunity?
Right. And then there's another one from, from engineering down to escalations, right.
Is it something that Cloudflare engineering can enable the escalation engineering team to handle, right?
Yep. And that's exactly, it's a, it's really at every level that we need feedback loops, right.
Yeah. Because it's a better customer experience. One, a lot of customers will solve these problems themselves if they can, right.
So training, just getting that information to the customer so they can, they could take care of it is the ultimate solution.
Sometimes they have to reach out to support engineers.
And if support engineers have to reach out to us, we want to bring that knowledge down to them and so on.
Right. Just as you were saying, from every tier.
So it's kind of the feedback loops, the review that we want to go through. And, you know, especially if we see the same escalation multiple times, we look at it and say, how do we make this more efficient?
Yeah. And how do we bring that knowledge, that, that training, that access level down so that it gets resolved faster?
Yeah. Yeah. I mean, these could be, you know, feature requests that are on product's roadmap. The types of escalations that you're going to see really vary per product line and per the maturity of a product, right.
We have products that are super mature, and products that are still kind of beta or early access.
Right. And those issues are getting escalated, and your team is sort of helping the engineering and product teams prioritize, you know, what feature to build next or what thing to fix.
Or if it got escalated and, like you said, it's a repeat one: oh, customers keep needing this piece of data, can we expose that to them?
How can we expose that to them on the dashboard or via API?
Exactly. Exactly. So that's, that's part of the fun is there's always, always something going on, always a problem to solve, puzzles to solve, new products coming that we have to learn, right.
And become subject matter experts in, and that's why your team helps tremendously with that.
So never a dull moment, always, always something good going on.
Yeah. I'd love to know, you know, you mentioned some of these are fun, like, are there any product lines in particular right now that are, you know, either challenging, quite challenging or fun to sort of investigate?
I'm thinking like newer ones, Bot Management. I mean, I know Workers has this whole other edge, Workers subrequests, you know, layer to debug.
What's, what's hard to debug or what's fun to debug? Are there any of these right now, or you're just getting a lot of.
I think it depends on, so one of the things that I've been working on is kind of, so escalation engineers have to be generalists.
I often say, but, but yeah, we have to kind of specialize a little bit, right.
We have to be able to handle any, any problem that comes to us. But some of my engineers know networking inside and out.
Others know, you know, maybe Workers and the platform, proxy, layer seven better.
And so it just depends. I would say all the products you mentioned can be difficult to debug and problem solve.
Magic Transit is definitely one of those that we've been working extensively to get better at.
But it is, you know, it's one of those that's growing super fast.
More and more customers using it all the time. And so we're getting better and better at troubleshooting it.
Workers is also another one.
A lot of the hard part about Workers is that customers write their own code. And so we want to ensure that they're successful with that, but we also don't necessarily want to write their code for them.
Yeah. Right. And so we want to help them along their way to get what they want done without kind of putting ourselves into the situation of becoming their developer.
That one's difficult, right? We kind of often have to treat their code as kind of a black box and make sure our platform is handling it correctly.
And then work with them to identify what next steps to take to kind of rectify the problem.
So it just depends. Right. But yeah, I, I would say, you know, some of my engineers that are really strong in network, we, we have things that are network related that are third party.
Right.
And we have to troubleshoot those because Cloudflare is in the middle. Right. And those can be very challenging as well, because it's not really in our hands.
It's something that's outside of us, but Cloudflare presents the problem to, to the eyeball or the client.
And therefore we're kind of on the hook to make sure that we figure out what the problem is and next steps towards the solution.
Yeah. So layers, layers of things, but like I said, it's never, never a dull moment, always something.
Yeah. And so there's, there's, there's data that we're tracking there, right?
Number of escalations, how many needed to be escalated from there? You know, how long did it take to resolve, you know, what was the root cause?
You know, what was the time to resolution?
And we are working with the product specialist sub-team on Cloudflare support to put together sort of a product health card or a product scorecard, where one of the metrics is, you know, what's the escalation ratio of this product line?
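[Editor's sketch] The "escalation ratio" mentioned here can be read as escalated tickets divided by total tickets for each product line. The Python below is only an illustration under that assumption. The ticket fields `product` and `escalated`, the function name `escalation_ratio`, and the sample counts are all hypothetical; the transcript does not describe Cloudflare's actual scorecard or data model.

```python
from collections import Counter


def escalation_ratio(tickets):
    """Return {product: escalated tickets / total tickets}.

    `tickets` is an iterable of dicts with the hypothetical keys
    "product" (str) and "escalated" (bool). This schema is assumed
    for illustration and does not come from the transcript.
    """
    totals = Counter()
    escalated = Counter()
    for ticket in tickets:
        totals[ticket["product"]] += 1
        if ticket["escalated"]:
            escalated[ticket["product"]] += 1
    # Every product in `totals` has at least one ticket, so no division by zero.
    return {product: escalated[product] / totals[product] for product in totals}


if __name__ == "__main__":
    # Made-up sample data, not real support figures.
    sample = [
        {"product": "Magic Transit", "escalated": True},
        {"product": "Magic Transit", "escalated": False},
        {"product": "Workers", "escalated": False},
        {"product": "Workers", "escalated": False},
    ]
    for product, ratio in escalation_ratio(sample).items():
        print(f"{product}: {ratio:.0%} of tickets escalated")
```

A ratio like this is only one signal; as the conversation goes on to say, it sits alongside other data points such as root cause and time to resolution.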
Right.
We have a strong partnership with the product team to sort of, you know, look at how we can improve the customer experience, improve the product experience, right?
We are, in some ways, the eyes and ears. The product team definitely meets with customers all the time, but we're the ones, you know, handling issues every day.
Right. And usually these are edge cases, you know, these are people doing either sort of testing the limits of the platform or trying out something really experimental on their side.
Right. That we get escalations.
And then that's just one of the many data points that we talk about in conversations with the product team as to, you know, how we can make a better customer experience.
So. Yep. And our, I mean, our customers drive some of these products because they come to us and they say, Hey, we want a solution for this.
Right. And so our teams work with them and say, okay, let's build it or let's, you know, fix what we're doing and make it work.
And so sometimes escalation engineers are, are in the middle of that where, you know, especially if it's like a bug, we will have to identify it and say, yep, this is an issue.
We were able to reproduce the problem. This is the expectation of what should happen.
How can we move from what is happening to what should happen? And then we work with engineering and, you know, communicate that back to the customer.
So. Nice. Yep. Can customers contact your team directly?
Usually no, just because we are a tier two. Right. But we do in dealing with incidents and very large customers.
Yeah. If we are the subject matter expert on the problem, we're going to be jumping into the phone call with, you know, that premium customer to really kind of be that, that technical, you know, resource.
Yeah, exactly. So that's usually where we're going to be more customer facing, but we're not really a customer facing team, right.
We support customer facing teams.
Yeah. Yeah. So they can't, you know, there's a special bat phone that rings the escalation engineer desk. But not really.
You will jump on a call, you will jump on the bat phone, if one of the people who does have the bat phone gets a call and they need you to, yeah.
You've mentioned engineering.
Have you ever had someone from the escalation engineering team transition to like a full-on back-end engineering or development role?
Yeah. So there's definite growth paths, right?
We always want to have growth paths in the department.
And one of those growth paths is that, as an escalation engineer, you get to specialize, like I said.
I have engineers who, say, specialize in networking and learn a ton of stuff about the network.
They embed with the network team. And then maybe their career goal is to eventually become a network engineer.
Right.
And so we've seen that in the past with, with engineers. And so that's definitely a growth path there.
There's, there's a variety of different growth paths.
So. Yeah, that's great. I mean, in general, the support team likes people to join and stay for life on the support team.
We have a lot of different roles. You can be an escalation engineer, a tech support engineer, a manager if you're on the manager track, a trainer if you like helping other people grow, a product specialist if you want to geek out on how, you know, Cloudflare products work and can get better, or join the support operations team if you're actually a programmer and you want to build support tools.
And we have all sorts of roles within the support team.
But we also just like to keep people at Cloudflare in general. We've had people join engineering, product, success, solutions, sales, you name it. And the people who have graduated from support, you know, those people are amazing, because they understand the support side.
And they go on to do other things in the company.
And they always, you know, remember that support side.
Yeah. I talked to them all the time where I'm like, Hey, can you help me with this?
Hey, you know, so it's that alumni style. Yeah, it's huge.
It's huge. Yeah, empathy for the customer support role, and also just the customer in general, right?
Geez, we're running out of time, Garret. It's flown by.
You've been in Austin now, how long? Five, five years, four years, three years? Actually, five years in?
Yeah, five years, just about five years. Wow. So I know this is controversial, but do you want to wade into the world of who has the best either tacos or who has the best barbecue?
Or are you just gonna not? I won't stir the pot, but I will say it's Austin.
Oh, yeah, no, I mean, within Austin. Oh, within. Oh, yeah, that's tough.
That's tough. Can you give us like a top three, or give me, you know, a taco and/or barbecue place that we should try out there?
There are certain places that I can't talk about because they're that good.
And then you don't want everyone to show up. Yeah, okay. There's probably co-branding.
Can't give away my spots. Don't mention it. Don't mention it. Because I think if we mention a certain taco truck or something like that, you know, they could potentially send us tacos. Like, you know, like a radio show, we could do spots.
So if you're out there, and you're a taco truck, and you're listening to Cloudflare TV, contact Garret Brown at Cloudflare and Shane at Cloudflare.
Tell Shane, and I will take the free tacos.
But in all seriousness, great to talk to you.
Thanks for coming on. Yeah, you too, Shane. I appreciate it. Yeah.
And if you're watching out there, thanks for watching Between Two Clouds. My name is Shane.
We'll be back in a couple weeks. Take care. We're coming.