📊 Life as a Data Engineer at Cloudflare
Presented by: Ruby Shen, Harry Hough
Originally aired on September 29, 2021 @ 11:30 PM - 12:00 AM EDT
Join Ruby and Harry as they discuss their experience working as Data Engineers at Cloudflare.
English
Engineering
Transcript (Beta)
Hey everyone, thanks for tuning in to the Cloudflare TV session today. My name is Ruby. I've been working at Cloudflare as a data engineer for almost two years now, and I have my teammate Harry here with me today.
We've been getting similar questions when we interview candidates, as the team is growing really fast, so we thought it would be very helpful for us to share our experiences working as data engineers at Cloudflare. Hopefully this will help you out if you're looking to start a career in data, or you're looking for new opportunities and think Cloudflare would be a great option for you.
So Harry, do you want to take it away and introduce yourself a little bit?
Yeah, definitely. So hello, Cloudflare TV viewers. I'm going to talk about some of the topics relating to our experiences that we'll be covering today, and give a general background on the BI team.
I joined about a month after Ruby, I think, so I haven't been here quite as long.
We will also cover how we both got into data engineering, as well as what technical skills we find important to do our jobs.
Ruby also recently moved to the Cloudflare Lisbon office from Austin, so we'll get to hear a little bit about her experiences.
I'm in the Austin office, so I'm definitely sad to see Ruby go to Lisbon, but it's exciting.
We now have BI team members in Austin, Lisbon and San Francisco, which is really cool.
We will start out with kind of a presentation-type format, but most of the rest of the segment will be more of a casual conversation between me and Ruby about our personal experiences.
Ruby is going to start off by giving us a little bit of background on some of the job functions of the Cloudflare BI team.
Sure. Personally speaking, I think this is the number one frequently asked question from candidates when I'm interviewing.
The question I get most is, where does the BI team fit into the entire organization? If someone's interviewing for a data engineering role, they're usually curious how the team is set up: are we going to be aligned by different lines of business and the kinds of projects and efforts we support, or are we going to be aligned by job functions?
So the answer is kind of both.
I wanted to say, obviously it's BI week and this is our first session.
Everybody here falls under the whole BI umbrella, and under BI we have three smaller teams. The first team is data engineering.
It's the team that Harry and I are on. Our main job as data engineers is, first, to build a very reliable platform, and to build and maintain the infrastructure to support any data pipelines and data products.
The second main responsibility of our job is to build the pipelines and source the data, bringing it in-house into our cloud platform, so that the analysts and data scientists, including our business partners, can then build insights, reports and dashboards on top of it.
The second part of our whole BI organization is the data analyst team. Like I mentioned before, after the engineers build a platform and the data sets are sitting in the cloud platform, it doesn't really mean anything unless we can actually provide visualizations and dashboards and come up with insights for the business to help them make data-driven decisions.
So that is our data analyst team. Now the third team, which is also very critical to the whole BI org, is the data scientist team.
I also included machine learning engineers here because I think their job is very critical. It can be similar to what data engineers do, but their main focus is around deploying models for data scientists and building platforms and frameworks that are more geared towards data science products.
So this is just the overall picture of what the team looks like at the moment.
Next I want to hand it to Harry so he can tell us a little more about, within the whole data world, what the components are, what the other different roles are, and how we all fit in under this whole umbrella.
Sure, definitely. So we're actually a business intelligence team, so our roles as data engineers are maybe slightly different than on other teams with data engineers.
There are other data engineers at Cloudflare, just to be clear, but they really do slightly different work than we do.
So this is kind of a data role hierarchy, at least how I see it within Cloudflare.
So the first step is data collection. This is like logging and configuration data, and this is, as I was saying, in the realm of the data team and the engineering organization.
The next step is moving and storing the data. This is actually where BI team data engineers come in: we replicate data that has been collected by these other teams into the BI team's data lake.
The next step after that would be transform and aggregate.
This is BI team data engineers as well. We actually create curated data sets.
For some products, you need 15 or 20 tables in order to query them.
So we just create one table that is really, really easy to query and very reproducible, and then we also create aggregations.
For example, most of the data we get is at maybe one-hour grain or one-minute grain, so we roll that up into day grain and month grain to make those tables less onerous to query.
The next step above that is analytics. That's in the role of data analysts, like Ruby was talking about, and that's reporting, and then maybe some testing and experimentation and maybe some simple machine learning models.
And then the next step is machine learning and deep learning, and this is fully in the realm of the data science team.
But for analytics and for machine learning and deep learning, we actually support those teams with data sets.
So those teams aren't expected to get data for themselves. They're just using the data that we have curated and rolled up, effectively.
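The roll-up from hour grain to day grain and month grain mentioned above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the (timestamp, request count) row layout, the sample values and the `roll_up` helper are hypothetical and are not the BI team's actual tables or pipeline.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical hour-grain rows: (timestamp, request_count). Sample values only.
hourly_rows = [
    (datetime(2021, 9, 1, 0), 120),
    (datetime(2021, 9, 1, 1), 80),
    (datetime(2021, 9, 2, 13), 200),
    (datetime(2021, 10, 1, 7), 50),
]


def roll_up(rows, grain):
    """Sum request counts per calendar day or per calendar month."""
    if grain not in ("day", "month"):
        raise ValueError("grain must be 'day' or 'month'")
    totals = defaultdict(int)
    for ts, count in rows:
        day = ts.date()
        # Month grain is keyed by the first day of the month.
        key = day if grain == "day" else day.replace(day=1)
        totals[key] += count
    return dict(sorted(totals.items()))


daily = roll_up(hourly_rows, "day")      # one row per day
monthly = roll_up(hourly_rows, "month")  # one row per month
```

The coarser tables have far fewer rows than the hour-grain source, which is the sense in which they are less onerous to query.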
So now you have a better view of this hierarchy and some of the roles that Ruby's talked about.
I think it would be good to talk about some of the projects that we're working on.
For me, one of the exciting projects I'm working on is a full stack web application. It's basically a data product to help internal teams that don't necessarily know how to use SQL answer really important business questions.
So it's usually used by teams like sales and marketing.
This is kind of cool, as it's something that could typically be seen as outside the realm of data engineers, and probably is outside the realm of data engineers in the collect phase, so the data team and the engineering org.
So it's really cool to have this kind of varied work on our team. My background is in full stack engineering before I joined Cloudflare, and I did do data stuff before that too.
So it's cool to go back into that kind of stuff.
I also do a lot of data ingestion tickets.
That's bringing data from these internal sources into the BI team, and those can be interesting because they can be streaming or they can be batch. I also do a lot of cross-functional work, I mean everyone on the team does, basically providing insights for Cloudflare for Teams, which is a really cool, interesting product, and Cloudflare Workers.
So if you guys haven't heard of those products, definitely check them out.
So what type of stuff are you working on right now, Ruby?
Well, aside from the project you just talked about, right? Harry and I are actually both working on that, and I handle more of the back end stuff, at least on that project, to feed data into our Postgres database so that it eventually feeds into the front end.
But aside from that, I think the other one that would be interesting is the product 360, for expansion purposes specifically.
Most recently I worked on a project that helped the marketing team generate marketing emails and send usage data to our targeted customers. By that I mean, based on usage patterns, we found a very specific group of our customers that would be really great candidates for our bot management products, just because of the way the traffic is flowing in to their websites.
So we thought it would be really helpful, and along with the email campaign we wanted to produce some valuable insights for them as well, to give them an idea of how many requests are coming in, bot traffic, et cetera.
That was really interesting because I got to work with the marketing team, as mentioned, and that really brings in the final business value, what I'm even working this hard for, and I actually see the end result we're producing and what kind of value we're adding for our customers as well.
So to me that was very interesting, because it's not that I'm just doing the back end of it. I'm also building a pipeline that goes from end to end, right? It went from very raw data sources to producing business value.
So yeah, I think that's actually very interesting, Harry.
I think we both kind of went over the different job responsibilities, and even within our projects there are different things and different roles we need to perform, different hats we have to wear.
I wonder, how did you get into data engineering? How did you even know this is something that you want to do, and what the job really includes, I guess?
Yeah, so I touched on it a bit before, right? My background before data stuff was really full stack type engineering.
So I have kind of a weird path, and I think it's difficult for people to see a path into this kind of world.
I mean, there's a very traditional path, right, but there are other paths which I think are interesting too.
For me, I went to school for a history degree, which I dropped out of because I didn't like it, and then I changed. I thought I would try out game design.
I thought if you love playing video games you probably love game design, which is maybe not the case.
Not really, but sure. And it turned out to be kind of more art than code.
So I ended up coming to Austin. The reason I moved to Austin was to do a JavaScript full stack boot camp, but that mostly prepares you to be a front-end engineer, maybe a full-stack engineer, but I would say it's probably more heavily front-end.
My first job after that was at a really small startup with around 10 people, only three of whom were in Austin. That company grew very, very quickly, and basically our traffic was an aggregate of all of our customers' traffic, because we had some embed code on our customers' websites that would do custom analytics for them so they could see reports on the performance of their website.
So we had to build that custom analytics system, and at the time I really only knew how to do front-end development and maybe some Node.js stuff.
So I had to learn all about the big data world, about Cassandra, Spark, all these types of things, and ClickHouse. Actually, when I was applying to Cloudflare, that was one thing I talked about: ClickHouse and Spark are something that we also use on the BI team and within Cloudflare.
And then beyond that, it ends up that you need to give insights to marketing teams, to the CEO, to all these types of people, so that really gave me this great experience that I could use when I was interviewing to come to Cloudflare.
How about you, Ruby? I think you have maybe a little bit more of a traditional route, but a bit unexpected as well, I would say.
Yeah, definitely, because I think mine compared to yours is very conventional. I did my master's and bachelor's both in management information systems. Anyone who's familiar with the major or the field knows that with majors like that you're pretty much geared towards working in data after you graduate. But I think the confusing part for me is that back when I was in school, I didn't really understand what engineering means, what data engineering means, and what it means to work in data. After I graduated with my bachelor's degree, I was thinking, should I get a CS degree, or should I pursue a more advanced math and statistics degree, in order to be either a software engineer or in a data scientist type of role? So I talked to some professors and got more understanding of what the field is like, and I decided to go for the same major, information systems, for my master's.
That experience really helped me understand the field. Working on some school projects from end to end, and also my internship experience, really gave me a taste of the entire data world. It includes bringing in the data and building infrastructure, the more engineering side of things. After that you can be a data scientist, where you have more of a math background and you build models to do predictive analytics. And at the same time you can pursue an analyst route as well, if you're very into analytics and digging through data to find insights. That really gave me an opportunity to learn that my interest really lies in engineering, because I'm not really the best at doing analytical work, but I really like problem solving. I like taking things apart, solving problems, learning new things and figuring things out working with systems. So that helped me be less confused and figure out my future path.
Also, after I graduated, unlike Harry, I started off with a bigger company, so it was more of a corporate culture, and I only really worked with a very small part of their technology and a very small part of their data. A few years later I had gotten really comfortable in what I do, so I wanted to learn more and grow as an individual. I decided to join a smaller company so that I could handle the end-to-end of how data infrastructure is set up in the entire company. And with all that, the rest is history. I landed in Austin and now in Lisbon at Cloudflare, and I can't be happier.
So Harry, after all that, we've been talking a lot about how good data engineering is here, how it's set up, and how much we're enjoying it. But do you want to go over a little bit of, when we're interviewing people, what kind of skills you're looking for? What do you have to have in order to perform the essential duties of this job?
Sure, and I hope everyone's appreciating these great memes that we have included on the slides. The number one skill that I look for, and most of the team probably looks for, is SQL skills. I think this is probably the most critical skill for a data engineer. Spark now is moving completely towards SQL, and it helps to understand SQL data models, so I just feel like this skill is absolutely critical. We usually do SQL tests during the interviews, and for data warehousing, knowing SQL is great as well. So I would say that, for me, is the most critical skill.
Yeah, I definitely agree with you. I think SQL would be one of the most essential skills for our job, because you really need to know how data works, the logic behind everything when you do data modeling, and how tables interact with each other. But the other thing for me is coding. Like you said, Spark is widely used to do ETL, I think in most companies nowadays and even on cloud platforms. And as I mentioned before, a big part of our job is to build and maintain a sustainable platform and framework for all the pipelines that we're building on top of it, so coding is really important because of that. It's not just that when we do Spark jobs we need to do them in either Python or Scala for the most part, or Java at some companies. In order to troubleshoot some of the systems, and to perform the DevOps side of the job, coding is really important too.
That's really what our initial technical phone screens consist of. After a candidate talks to our hiring managers, we usually have a 45-minute to one-hour technical screen where we do live coding, and that's usually half SQL problems and half coding problems. We really want to see that the candidate has the ability to work through problems, break them into smaller chunks, and present and communicate their solutions.
I think that goes into the next thing I want to ask you, Harry. Aside from the technical skills, which are very important without a doubt, they're the basics of this job, what else are you looking for when you're looking for that perfect teammate?
Well, I guess culture fit is a big one. And what do we mean by culture fit? I feel like usually when I hear that in an interview I kind of cringe, because it's so hard to know what that means and how to achieve culture fit, all these kinds of things. But I think for us it just means shared values: shared values with us as the BI team, and shared values with Cloudflare's values in general. That's really how I see culture fit. Ultimately, we have a very careful team dynamic that we've cultivated, and a careful company dynamic, so we really want to protect that for our existing employees and for anyone that joins in the future. So that is something that we really think hard about. Even at Cloudflare, the final round is actually with a C-level executive, so we care a lot, I would say, about hiring. If you're doing the interview process you might even be a little frustrated with how much Cloudflare cares about hiring, because it's quite a long process, but throughout the company this is something that's taken really seriously. So what else would you say is important when finding the right teammate, Ruby?
I definitely think, like you said, culture fit is, although it's very cliche, a big part of it. Essentially, when I interview people, my bottom line when it comes down to it is that I want to work with this person, I want this person to be on my team. At some point, if we are on the same project, we might have a few hours of interactions every single day. So I really want someone who is humble. They can be very smart, but being humble I think is a very good quality to have, and I think that's really part of our team culture as well as the company culture. I know we're touching on the company culture here, but that is something that I look for specifically.
Also, like I mentioned before, problem-solving skill is very important, because with our job you're always going to learn something new. You're always going to encounter things that you've never done before, that you just didn't know in your past. So when it comes to doing something for the first time, how do you learn, how do you break down problems, and how do you handle the frustration of some piece of code not working for an entire day? That, I think, is essential for me when I'm looking for a potential teammate.
On top of all that, I think diversity is also something that we value a lot, both the team and Cloudflare. By diversity I don't just mean it in the traditional sense, race and gender. Obviously, for me, that's great: as a minority female engineer I feel very welcomed, and nobody on the team ever treated me differently. That was awesome. But that's a different story, because it's just the baseline of how everyone else understands diversity. In addition to all that, the team really does a good job of bringing in people from different locations. We have three locations just for the BI team right now: we have Lisbon, we have San Francisco, we have Austin. We have, I believe, an eight-hour time difference between Lisbon and San Francisco, and we still work together.
The other thing I think is really helpful is that, like Harry and I mentioned, there are different ways to get into engineering and get into data. Having people from different backgrounds really helps bring in new perspectives when we do infrastructure designs, when we brainstorm new projects, and when we design a certain solution. I always find it very interesting how much your background experience can impact how you think. Do you agree, Harry?
Yeah, definitely. I mean, it's really cool to see. There have actually been a few Cloudflare TV segments, I think, on people at Cloudflare that went to coding boot camps, so it's cool to see other people that took the same kind of path as you. We also have a great veterans community and all sorts of internal ERGs and stuff, so that everyone feels like they have a place here, which I think is really awesome.
I also want to touch a little bit on the interview process. At least my philosophy is that we try to make people feel really comfortable. It is so hard to evaluate people in such compressed time horizons, but for example, if I'm giving a SQL test or something like that, we try to throw in some easier questions at the beginning so that people can gain their confidence. For us, we just want to try and get the best out of people in that hour. So I think we really approach this stuff carefully. It's super important, and everyone on the team really thinks about it. I usually get emails before interview panels saying, hey, can you please focus on this, can you please focus on that. The process here, at least for me as a candidate and as the interviewer, is so much more thoughtful than it is at a lot of other places that I've interviewed at, so I really appreciate that.
Yeah, I completely agree. I had very similar experiences when I was interviewing for Cloudflare. Also, when we were taking interviewer trainings, one thing people always emphasized is to make sure the candidate is getting all the information they need to make the most informed decision, because it's a two-way selection. Not only do we need to see if a candidate is a good fit, they also need to get enough information to see whether Cloudflare is a good fit for them too.
Yeah, absolutely.
Yeah, so the next topic we want to get into: I think we both mentioned the location in Lisbon, and, well, Harry really brought it up and said we have to cover this because I recently moved to Lisbon.
Yeah, so this is a beautiful photograph that Ruby took and sent to us after she went to Lisbon. So at the very least, moving to Lisbon makes you a better photographer, I would say.
For sure. Actually, just this morning we had this chat going on the Lisbon office channel, and people were sharing all these amazing pictures that everybody took with their phones this past weekend, because the weather was amazing here.
Absolutely. So what made you move to Lisbon, Ruby? This is a big move personally, right, and it was a big professional move too. So how did you get this opportunity at Cloudflare? How did this come about, and why did you ultimately decide to take it?
Yeah, for sure. I believe it was a little over a year ago that Cloudflare started publicizing within the company that we were going to establish a new office in Europe, and eventually we decided on Lisbon. The first step Cloudflare took was to send a landing team there to make sure that we bring in the company culture, so it's not just an office starting fresh on its own. Even back then I was really interested in this entire thing, because it's very interesting to see someone start a new office, how it's going to grow, what the future direction of that office is, and what hiring looks like there. So with all of these questions, it was always in the back of my mind, like, oh, wouldn't it be great to work there for a little bit?
But when the BI team started growing in Lisbon, or when we decided that we wanted to establish a team there, that's really what triggered me into thinking about this seriously. I actually reached out to a few landing team members when they went over as part of the first effort, and they were very helpful and very patient with me. They answered all my questions and addressed some of my concerns. That's when I finally decided to reach out to my manager, Kalpana, to see if my moving there would help the team settle down there, and if I could add value as kind of a Cloudflare veteran, because everybody's pretty new at Cloudflare. So even with my almost two years of experience, I'm a pretty old employee in a way, at least for the BI team. She was very supportive too, and Cloudflare made this process really smooth.
I think the other big thing for me is the growth. That's what's really tempting to me, because like I said, I was never this involved in establishing a new team before. Personally, if down the line I want to get into leadership positions, this really helps me understand the dynamics and gives me new perspectives on how things work. And also I get to travel all around Europe once COVID settles down. So there are no cons in this situation, really.
Got it. So what was the process like of moving countries? My dad lives in England, and I haven't been able to go see him for a long time, because you have to quarantine for two weeks when you get to London. So what was the process of moving like? Was it very difficult? Did you have to do any quarantine, anything like that?
Fortunately I didn't have to, excuse me, quarantine once I got here, but I did have to go through some extra steps, like getting my work visa way ahead of time and getting a COVID test. Again, luckily, Cloudflare really took good care of the landing teams and any landing efforts. We had a relocation package, and they also hired a relocation agent to help me settle down and find an apartment here. Company lawyers were also helping us figure out the work situation: what kind of visa you need, preparing your documents, and once you get here you need an appointment to figure out your residence permit and all of that. So it's painful, for sure, but it's not bad, because Cloudflare really gave us a lot of help in that area.
You sound very interested in this, though. Are you?
Oh yeah, I'm definitely gonna come visit. I think a few people might come visit.
Yeah, I really look forward to seeing you guys here once things settle down, for sure.
Yeah, I guess it probably won't be till maybe the end of the summer or something, but I think we'll definitely come.
Yeah. Well, I think the last thing we want to touch on, and we already went into it a little bit when talking about interviewing people and now with the Lisbon thing, is this: you've been with Cloudflare for almost two years too, right? How do you like the company culture and the team culture so far?
Well, one thing that happened recently. This picture here is a picture of all the supplies, I think, that were left over in the office in Austin. I mean, obviously no one's been there for, I guess, a year now, right? Since March, yeah. But there was a winter storm in Texas, which I'm sure everyone is aware of, and that storm was really crazy. I live kind of near downtown, and people that lived in that area were kind of okay, because we lost power for only a few hours. But a lot of our team members, I would say probably most of our team members, at least on the BI team, live up in Pflugerville or Leander, which is in the north Austin area, and those people lost power for days and days and lost water. Really kind of scary situations. I mean, people had negative degrees Celsius, very low degrees Fahrenheit, in their homes.
Didn't you run out of pasta at some point?
I ran out of everything. I didn't shop before that, so I just had some pasta and some sauce and that was it. I mean, people lost power, right, so they couldn't even boil water. So I feel like Cloudflare really stepped up in this situation, and it felt like more than just one of these nine-to-five jobs. I felt like they really cared, and they really went above and beyond what they had to do or were expected to do. That kind of culture, it feels really good to be part of that, to be honest.
Yeah, I think so too.
I think we only have maybe 20 seconds left, so maybe you can just wrap it up quickly, Ruby.
Yeah. So thanks for tuning in to our Cloudflare TV session. This is the first one, and for the rest of the week we have BI week, I think, from Monday to Friday. Please check out the Cloudflare.tv schedule and see if there are any topics that you're interested in. Also, please follow Cloudflare on LinkedIn, and if you're interested in this job or any other positions at Cloudflare, look at our job postings and come join us.