*APAC Heritage Month* Founder Focus: Beyang Liu
Beyang Liu is a co-founder of Sourcegraph, a web-based code search and navigation tool for dev teams.
Transcript (Beta)
All right, welcome to another episode of Founder Focus. I'm your host, Jade Wang, and I run the startups program here at Cloudflare.
Today, we are joined by our guest, Beyang Liu, who is co-founder and CTO of Sourcegraph.
Welcome to the show. Hey, Jade, thanks so much for having me.
It's great to be here. Awesome, and by the way, everyone, if you have questions, you can call in your audience question or email in your question, and we will take them near the end of the episode.
So this information is also down below, livestudio at Cloudflare.tv.
So without further ado, so first, really quick, what is Sourcegraph for people who haven't heard of it yet?
Yeah, so Sourcegraph is great code search for all. So what that means is it's a tool for searching over all your code, diving into a specific part of it and kind of building up a working understanding of it so that you can kind of understand the context into which you're committing code and therefore work a lot more efficiently and quickly.
So basic idea is you type in your query to that search box there, that jumps you to a point of code that's interesting to you.
You can also type in like a regular expression to see all instances of a certain pattern.
And then we jump you into this code navigation interface, which is kind of like IDE-like in nature, but optimized for the process of reading and understanding code.
So you get all the standard, jump to definition, find references, and additional contextual information plugged in in line in code to assist with the tasks that you probably do most of the day as a developer, which is reading and understanding code.
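As a rough illustration of the regular-expression search described above, here is a minimal sketch in Python. It is not how Sourcegraph is implemented; the function name `search_code` and the in-memory `files` mapping are assumptions made purely for this example.

```python
import re
from typing import Dict, List, Tuple


def search_code(files: Dict[str, str], pattern: str) -> List[Tuple[str, int, str]]:
    """Return (path, line_number, line) for every line matching the regex."""
    regex = re.compile(pattern)
    matches = []
    for path in sorted(files):
        for line_number, line in enumerate(files[path].splitlines(), start=1):
            if regex.search(line):
                matches.append((path, line_number, line.strip()))
    return matches


# Hypothetical in-memory "repository", used only for this example.
files = {
    "api/client.py": "def fetch_user(id):\n    return get('/users/' + id)\n",
    "web/views.py": "user = fetch_user(request.id)\nprint(user)\n",
}

# Find every definition or call of a function whose name starts with "fetch_".
for path, line_number, line in search_code(files, r"fetch_\w+\("):
    print(f"{path}:{line_number}: {line}")
```

A real code search engine would use an index rather than scanning every line, but the sketch shows the basic idea of a pattern query that jumps you to each matching location.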
So before Sourcegraph existed, I'm given to understand that there are some larger companies with large code bases that had developed internal versions that are basically like rudimentary precursors to Sourcegraph, is that correct?
Yeah, exactly. So we are not the first code search engine.
I think the first thing that you could call code search probably traces its roots back to Bell Labs in the 80s.
But yeah, as you said, there are larger companies that have developed their own internal code search engines.
Google is one of the more prominent examples.
Google code search, you ask any developer at Google, or who used to work at Google, and ask them, what's code search like?
And they'll tell you that they love it.
It's an integral part of their development process.
A bunch of other companies have internal versions as well. I think Etsy built an open source version called Hound.
There's another code search tool called LiveGrep created by Nelson Elhage, who was an engineer at Stripe at the time, I believe.
And a lot of other instances, both in open source and inside companies of code search.
And what we've tried to do is build what we think is the best in class version of code search and make that accessible to every developer inside every company.
Awesome. So you've recently raised a large round in the past year, like 73 million, is that right?
Yeah, that's the total amount of funding we raised last year.
So what's next on the roadmap? What can we look forward to in terms of releases in the next year?
And what are you currently hiring for? Given that our audience is mostly DevOps and developers and people who might be interested.
Totally. So we are aggressively hiring in all departments right now, everything from sales and marketing to of course, engineering and product and support, really all functions.
Our big goal this year is really to go out and build the Google for code.
And what I mean by that is, I think there's yet to be this kind of single standard search engine that searches over the entire universe of code, both the giant corpus of code that's in open source and all the relevant code to you inside your private repositories, your company's code base.
And so we want to go and build that. Yeah, sorry. So like, if a company is currently using which repository tools would you be compatible with?
Like if they're using GitHub, GitLab?
Yeah, so we are compatible with GitHub, GitLab, Bitbucket, and really any Git or Perforce based code host.
Perforce is something that we just added recently for some of the larger enterprise customers that we're dealing with.
And so, the theme last year for us was really building out this like big enterprise business because we wanted to prove that there was a significant market for what we're building and it helps to pay the bills.
And this year, given all the funding that we've raised, we really want to put that money to use building this tool that is universal, that can be used by every developer, whether you're working inside a large enterprise or you're just a single person working in open source.
And so the whole point of raising all that funding was so we could invest it in this kind of longer term effort that's not quite as revenue focused, but really about building this kind of common destination that any developer out there can go to and help them access and take full advantage of all the code that already exists in the world.
Awesome. So talk to me about find and replace batch changes.
I mean, I had read a part of your blog post about it and like, is it fair to say that you've made a lot of the sort of boring and morale sapping chores just much easier and faster to do?
And like, are there other things that are soon on the horizon that you're really personally looking forward to?
Yeah, I think, you know, taking boring and, you know, sloggy tasks and making them more fun or just more quick to do is a big part of, you know, the theme of the company, which is, you know, make developers happier and thereby more productive.
For batch changes specifically, so that was a feature that I guess, first of all, explain what it is.
So, you know, a lot of companies, especially the larger your organization's code base gets, run into this issue where you wanna make some, you know, fairly straightforward or simple change to a shared API or library that is depended on by all these other, you know, repositories in your code base.
And because of the scale of the code base, making that simple change also requires making a bunch of downstream changes to the dependents of that API or library as well.
And that becomes a slog because all of a sudden you have to open up, you know, a pull request against all these different repositories.
You have to, you know, work with the members of each of those teams and kind of keep track of and manage it all.
A lot of companies build a custom tool to deal with this.
It goes by various names, you know, large scale refactorings or some people call it code shepherding.
And this was a problem that we found was pretty endemic across many of our customers, especially the ones who are growing rapidly or have larger code bases and organizations.
So we went ahead and just kind of like listened to them and rolled that into a fully fledged feature and product which we call batch changes, which is all about helping you automate the process of making these large scale refactors and track progress against each repository or each part of the code base from a single dashboard.
And so, yeah, to your point, I think we've identified a task that was previously very annoying and difficult and time-consuming, especially because, again, like the change that you're trying to make is not like that difficult.
It's not like a thing that really exercises the unique creativity of the human mind.
It's just something that you need to execute on a wide scale.
And so we try to make that much easier and hopefully more enjoyable as well.
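To make the idea of a batch change concrete, here is a minimal sketch in Python of applying one find-and-replace across several repositories and tracking the result per repository. It does not use Sourcegraph's actual Batch Changes feature or API; `ChangeStatus`, `plan_batch_change`, the repository names, and the `old_api`/`new_api` identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ChangeStatus:
    repo: str
    files_changed: List[str]
    state: str  # "open" if a change is proposed, "no-op" if nothing matched


def plan_batch_change(
    repos: Dict[str, Dict[str, str]], old: str, new: str
) -> List[ChangeStatus]:
    """Replace `old` with `new` in every file and report one status per repo."""
    statuses = []
    for repo in sorted(repos):
        changed = []
        for path in sorted(repos[repo]):
            content = repos[repo][path]
            if old in content:
                repos[repo][path] = content.replace(old, new)
                changed.append(path)
        statuses.append(ChangeStatus(repo, changed, "open" if changed else "no-op"))
    return statuses


# Hypothetical repositories (repo name -> {file path -> file content}).
repos = {
    "billing": {"charge.py": "from shared import old_api\nold_api.call()\n"},
    "search": {"index.py": "import shared\nshared.old_api.call()\n"},
    "docs": {"intro.md": "Nothing to change here.\n"},
}

# The list of statuses plays the role of the single tracking dashboard.
statuses = plan_batch_change(repos, "old_api", "new_api")
for status in statuses:
    print(f"{status.repo}: {status.state} {status.files_changed}")
```

In practice each "open" entry would correspond to a pull request against that repository that its owning team still has to review and merge; the sketch only models the bookkeeping.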
So we touched on a lot of themes about developer productivity and like, you know, CTOs and VPs of Eng everywhere are like thinking really hard about this.
From your perspective and thinking about developer productivity all day, like what are some good and bad ways of measuring developer productivity?
Yeah, so, you know, I'll start with a bad way. I think a bad way is measuring in terms of number of commits over time.
So, you know, the mug that you had right there, that's kind of the famous GitHub commit graph, you know, not to rag on GitHub too much.
I think they're a great company, but I do think that one of the common mistakes people make, you know, not necessarily with bad intentions is to start tracking developer productivity directly or indirectly by number of commits or some proxy of lines of code.
And the reason that's bad is, you know, I kind of tie it back to the outcome that you're trying to achieve.
So like the goal of any piece of code that you write is you're automating some, you know, previously complex task and reducing it to something that's more simple and straightforward.
And if you truly want to measure developer productivity, at the end of the day, the only real way to measure that is by the impact of the software that you write.
And so there's kind of two like common metrics. You know, one is revenue.
Like if you're selling your software for money, how much are people willing to pay for it?
Because the price someone's willing to pay for a good or service signifies kind of like a lower bound on the value that they assign to it.
And then the second is amount of human time saved. So, you know, if you're an open source library, you're not selling it, you're giving it away for free.
You can kind of gauge the true productivity of your code through the amount of time that you save, you know, human beings that are using your software.
Now, tying those two top level metrics back to, you know, what a single developer does, especially if it's a shared code base where many developers are contributing is a non-trivial problem.
And that's where people fall into this trap of like, okay, let's try to reduce it to more, you know, tactical chunks.
And a lot of times I think folks make the mistake of deriving some metric from lines of code or number of commits as that kind of first layer of development productivity.
So if some, I mean, presumably when we give people better tools to work with, generally their productivity goes up, right?
Like what is a good way of thinking about that on a tactical level?
Or like, you know, if a team has good morale, they're going to be doing more on a particular day.
Or, you know, if there are super distracting events going on in the world, like all of what everyone has had to go through in the entire year of 2020, that's going to, you know, sap people's morale and make them less productive.
Yeah, I think, you know, happiness and morale are a pretty good proxy for overall productivity and perhaps a much better proxy than, you know, lines of code or number of commits.
And I think that's because, you know, most developers, there's a very strong correlation between happiness and productivity.
Like, I don't think I know too many developers who are happy at their jobs when they're not being productive.
And I think, you know, conversely, if I think back to like the best moments I've had as a developer, it's really when you're kind of like in the zone, when you sit down at your machine and you just feel like you're in flow state, like all the pieces of contextual information that you want to have, that you need to have in order to write code productively are just kind of paged into the L1 cache of your brain, so to speak.
And, you know, again, to plug Sourcegraph a bit, I think that's a big part of what we help people do because there's all these different parts, pieces of context, knowledge about various parts of the code base that are relevant to you when you're, you know, focused on authoring one specific, you know, new feature or fixing a bug.
And we can help accelerate and defrictionize your ability to collect that information into your brain so that you can work in that kind of optimal flow state of high productivity.
So before the show went live, we talked very briefly about the book, The Mythical Man-Month, and I'd love to get your thoughts on, you know, like it seems like at some point there is a "did this particular problem get solved" aspect of, you know, developer productivity, right?
Yeah, yeah, so, you know, that book is great.
It's a classic. I think, you know, every developer should read it at some point in their careers.
I will say, you know, I read it twice, I read it once at the beginning of my career where I didn't really get it.
I thought, oh, this is like super dry and I kind of made it halfway through and then stopped reading it.
But then I read it like two or three years later and it was like a page turner. I couldn't put it down because like so many of the things that they discuss, I think are just things that you encounter day-to-day in professional software development.
Well, you just had a lot more context for a lot of what was taught, right?
Oh, I think you're frozen.
Oh, we're back. Sorry, where did I cut out? Did I freeze or?
Yeah, you froze for a moment there. Okay, Mythical Man-Month, great book. Everyone should read it.
I think what, so- Second time it was a page turner. Yeah, second time was a page turner.
And they talk about developer productivity a lot in that book.
I don't think they provide like the answer. Like the answer is something we're still figuring out, but I think they do provide guidance as to how to avoid a lot of the common like pitfalls or anti-answers, so to speak.
And among them is kind of tracking productivity by lines of code or number of commits.
And yeah, there's this really visceral analogy I remember from that book, where the author draws an analogy for software development teams.
He says they're more like surgical teams, like a surgeon in a medical setting, where each person in that surgery room has a specialized role.
And you're ultimately measuring things by the outcome, which is, does the patient survive and get better or do they die on the operating table?
But the way that we treat software development, developer productivity, we often don't think of a development team like a surgery team.
We often treat it as a slaughterhouse where in a slaughterhouse, it's very metrics driven.
You're just thinking about like number of whatever animal it is you're butchering, like how many of those are you getting through per hour?
And it's all about throughput. And when you start to think about software development like that, it reduces what is a very creative, high-variance, and outcome-driven task to essentially a widget factory, which is not what you want, because you can write a lot of code that at the end of the day doesn't benefit users or customers that much.
Yeah, that's, I remember my husband telling, quoting him like, of course you don't know how long it's gonna take.
If it were a solved problem from before, then it would take zero time to do.
Yeah. Just the nature of working with bits instead of atoms.
Exactly. So speaking of a lot of the events of the past year, how has 2020 been for you?
Did you have to transition to remote work?
Like both in terms of like what it was like to work during that time and also what it was like for the business?
Yeah, so 2020 was a crazy year for us, I think as it was for everyone.
We were a bit fortunate in that, our company has always been remote friendly and at the beginning of 2020, we actually made the decision to go full remote.
And so we got rid of our office, which used to be in San Francisco and went office-less in January of 2020, which was in retrospect pretty good timing.
That's super prescient. Yeah, so it's not like we foresaw any of this happening, but I guess we just got lucky in that regard.
And so, in some ways we were kind of set up to handle things as best as we could, given that we had invested in the remote async work processes earlier, having kind of like adopted those as we grew up as a company.
But it was still very tough. We had teammates around the world and the varying kind of lockdown restrictions, I think take a toll on everyone.
Because when you're basically just restricted to staying inside your house and seeing the people in your immediate household, you can't go out with friends, you can't meet other people.
For us, we've more than doubled in size in the past year.
And to this day, I've not met the majority of people that are part of our company now in person.
And so that's crazy. And that makes collaboration and just like day-to-day work a lot tougher.
And so, we would try to be mindful of that as much as possible.
So let's, now that we're sort of in the second half of the episode, I'd like to focus this on sort of like personal journey and origin story.
So can you tell the story of how you and Quinn first met and how you decided to start Sourcegraph?
Yeah, so I think Quinn and I first met in college, but we didn't really work closely together until post-college when we both joined a company called Palantir working on the commercial team.
So essentially, we were in this kind of startup within a startup inside Palantir that was trying to open up a new line of business.
This was going after large Fortune 500 enterprises.
And because of that setup, we were both working fairly closely with technical counterparts on the customer side.
He was more on the business side.
I was more on the development side, but the lines were pretty blurry.
He wrote a lot of code and I was involved in the customer conversations.
And in kind of working through that, we started to realize that a lot of the challenges that we were seeing these technical software teams face inside these large banks, those were kind of our target customer at the time.
We said like, hey, a lot of these challenges around reading and understanding their existing code and trying to move faster, these are challenges that we both feel as software engineers.
And we're both familiar with code search as a concept. I think it was like relatively unknown around that time, circa 2011, 2012.
But Quinn had previously done a lot of work in open source.
He had used things like OpenGrok for searching over open source code.
I previously worked inside Google and experienced the glory that is Google code search.
And so we were like, hey, something like that could alleviate a lot of the problems that we're seeing and help these teams move faster.
And it would also scratch our own itch because I think every professional software developer has a moment where they take a step back, they think about the frustrations of the day-to-day, and then they think back to that first moment where they fell in love with programming and they're like, ah, how can I get back to that?
And for us, code search was kind of a means to help more developers get back to that magical flow state on more of a day-to-day basis.
So we kind of started talking about this idea while we were at Palantir.
Fast forward a couple of years, we didn't really act on it immediately.
A couple of years later, we ended up bumping into each other at a house party in San Francisco.
We started talking about this again and then started hacking on it and then went from there.
So between the sort of 2012, 2013 era and now, I feel like the entire DevTools ecosystem has evolved so much.
Can you tell us about what things have changed and what are the forces behind it?
Because it used to be super hard to get a DevTools company off the ground because so many investors were just allergic to it.
Yeah. Now it's so different.
It's crazy. You're absolutely right. When we first started the company, we spoke to investors and a lot of them would just get up and walk away as soon as we said the term DevTools.
So in those days, when we were working on the investor pitch, we were like, okay, how can we not say the terms dev and tool together in the same sentence?
Because that is a very negative indicator to people and they won't give us money.
Nowadays, I feel like it's the hot thing. It's like second to cryptocurrency. Yeah.
But yeah, things have changed. I think part of the things that are driving this shift in investor mentality is business reality.
So you had a couple of big DevTool companies or acquisitions, GitHub by Microsoft being the first one, and then you had GitLab and HashiCorp.
These are real big companies that have proven out there exists a super large market for developer productivity tools.
And I think their success has in turn been driven by the overall trend of software is eating the world and it continues to eat the world.
And as a consequence, more and more companies are effectively becoming software companies.
Even companies that we don't think of as traditionally as being in the high tech sphere, they have sizable engineering teams now.
And now they're getting to the point where their code base and their engineering teams are rivaling the code bases of big tech companies, especially big tech companies, five or 10 years ago.
So now they're getting to the point where they need all these tools that previously only a small subset of companies in the world needed and built internally.
And so a market exists to create those sorts of tools for everyone and bring them to the kind of developer masses, so to speak.
It's really cool and interesting to think of like non-technology industry verticals as essentially also being tech companies, because it's kind of the reality that our world has forced them to live in.
Yeah, yeah.
I mean, like just to take a recent news item, like that the whole oil pipeline that got hacked recently and drove up gas prices around the country, that was a complete software move.
And it's just crazy that like that is perhaps like the number one threat to the economy and national security these days is a software thing.
So that just goes to show like even like an oil company where you think of them as just like drilling for oil and taking it out of the ground, moving to the place.
No, software plays a key component of that infrastructure. So I'd like to imagine, do a thought experiment, right?
Let's say I invited you to a magical Zoom meeting where you're having a conversation with yourself from 2013.
I would like you to tell me like, what is that conversation like?
Like, what do you wish you knew?
Like what are like, you know, assuming that you don't break physics and causality.
Yeah, yeah. That is a great question. Like what would I tell myself in 2013 other than buy Bitcoin, right?
That's it. I think I would say that I would tell myself I would go to that person and say, are you thinking like just like general personal, professional, or like more specifically about Sourcegraph?
Like what would we do? I mean, either, like whatever is more salient to you.
Yeah, I think I would tell that person to go all in on self-hosted enterprise as a go-to-market strategy on the business side.
Cause that's the thing that ended up paying off for us.
It's kind of also a little bit against the conventional wisdom of like, okay, start with like SMB cloud.
Turns out that's, you know, maybe good for some companies, but it doesn't necessarily generalize to all companies, certainly not for us.
As far as like general advice to, you know, me as a person, I think, I think I would tell that person to just go and make the effort to get in front of a lot more people and take their feedback to heart.
You know, one of the things I regret not doing is iterating more against user feedback in the early stages.
There's always this like tension between, you know, the faith that you have in the ultimate vision of what you're building versus the feedback that you're getting from users on the ground.
And often, you know, they're in kind of direct tension because you build something and you're like, hey, this is gonna be magical, just wait.
And then someone tells you like, no, this is not very good.
I don't really see it. And you kind of have to figure out how to rectify the two.
And I wish, you know, I had probably been more of the mindset of like, you know, listening to user feedback earlier on, because I think it would have driven, you know, faster iteration cycle for Sourcegraph in the early stages.
That makes a lot of sense.
So in the last three and a half minutes of our episode, I'd like to steer this biographical and just like ask you about, you know, how you, where did you grow up?
What was it like? And what were your experiences like? Yeah, so kind of the arc of my personal background, I was born in China, came to the United States when I was around two, landed in Iowa, spent about five years in Iowa, then five years in Minnesota, then came out to SoCal for middle school and high school, and then ultimately ended up in the Bay Area for college and have been here ever since.
And I think, yeah, like childhood was awesome. I really loved the Midwest.
I think there's a lot of great people there. A lot of folks that I knew from those days have now, you know, come out to San Francisco, you know, ended up reconnecting with them.
And yeah, overall, I think I had a good upbringing. Given that you've lived in a lot of different places, like a California type of place and the Midwest, whether it's Iowa or Minnesota, did you feel like the demographics were different, and can you characterize what that was like?
Yeah, for sure there are demographic differences. One thing I noticed when going to SoCal was just the, you know, there are a lot more people who are like Asian, you know, coming from Minnesota and Iowa.
And I think that kind of cut both ways.
Like, you know, it was interesting. Like when I was in Iowa and Minnesota, I never really felt like a separation, I guess on the basis of race.
Like I was fortunate enough, you know, the city I was in, all my friends, you know, they're very like open people.
And, you know, I felt like I had a really tight knit friend group there.
When I came out to California, you know, for the first time, I kind of became conscious of some of those differences.
And it was interesting to watch the dynamics because there is this kind of thing that happens when you have, I guess, more people of a certain, you know, minority population, you know, self-segregation oftentimes occurs.
And I think one of the lessons I took from my upbringing in the Midwest is to kind of avoid that mentality and to as much as I can, just like make friends with people from, you know, a wide variety of backgrounds, because at the end of the day, we're all, you know, individual people, we're all, you know, very different and colorful based on our own kind of personal histories and experiences.
And I think like, that's probably the thing to shoot for.
It's certainly what's worked for me in my life. I think that's, yeah, I think that's a great message for everyone.
Just like every individual person is their own universe and to sort of approach that with open eyes.
For sure. Well, thank you so much. It's been a great episode and thank you for coming on our show.
Thanks so much for having me, Jade. This was a lot of fun.
Thanks. All right. Bye. See you. All right.