Sourcegraph, the company behind a universal code search platform for companies including Amazon, PayPal, Qualtrics, and Cloudflare, has raised $50 million in a series C round of funding led by Sequoia Capital.

As many tech luminaries have noted in recent years, every company is now a software company, which means every company now has to deal with code in some form. But as these codebases grow bigger and more disparate, as more developer tools and repositories are added to the mix, and as engineering personnel come and go, it becomes harder to keep on top of things.

Big code

Much in the same way as companies harness and process “big data” to extract insights from multiple large and complex data sets, “big code” is a concept that seeks to address the growing volume and variety of source code that businesses have to deal with across their development projects. Big code is chiefly concerned with the extent to which the amount of code; the variety and complexity of languages, systems, and tools; and increased expectations around speedy software cycles impact companies’ ability to function optimally. And this is where Sourcegraph is carving its niche.

Founded in 2013, San Francisco-based Sourcegraph brings together the various strands that modern development and operations (DevOps) teams rely on, spanning repositories, programming languages, file formats, editors, and more. For example, if a developer needs to know how to use a particular function or service, what impact changing a piece of code will have on other dependencies, or where the correct library is for a particular task, this is where Sourcegraph comes into play.


There are plenty of individual tools designed to tackle the “big code” problem. Datadog, for example, is a cloud monitoring platform for applications and infrastructure that provides error metrics so companies can narrow down the problem if a website goes offline. A developer can then use something like Codecov to check the test coverage for a codebase and LaunchDarkly to deactivate a new experimental feature that caused the fault. The problem is that all these tools exist separately.

“These services all help with big code — the tough part is weaving them together into a workflow that a developer can follow quickly,” Sourcegraph cofounder and CEO Quinn Slack told VentureBeat. “All of a company’s developer tools should be woven together in order to efficiently address the big code problem.” As such, Sourcegraph offers various extensions that allow companies to integrate all these tools in a single platform.

The big code problem isn’t a new phenomenon, of course; it’s just that more companies are beginning to realize that it is an addressable issue. “Previously, everyone just accepted that it was, and always would be, hard to build software — which is strange, because technology is all about lifting barriers to progress in other fields,” Slack continued. “We know that big code has a direct impact on the business outcomes of software development efforts, but most organizations are just starting to recognize that big code is a problem that can be solved or mitigated.”

Some companies, including Facebook and Google, have in fact built their own universal code search tools internally, likely consuming considerable resources in the process. But from a practical perspective, such an endeavor makes little sense for most companies, as evidenced by some of the big names (such as Amazon) that use Sourcegraph instead of building their own. Much in the same way as companies leverage APIs from the likes of Twilio to build voice and SMS features into their products, or Stripe to integrate payments, it just makes more economic sense to let a dedicated third party do the heavy lifting of joining multiple codebases and developer tools.

“Building universal code search is really hard,” Slack added. “It needs to support all major languages; all code hosts, such as GitHub, GitLab, Bitbucket, and all recent versions of those; huge monorepos (mono repositories); and millions of repositories — and it needs to integrate with many tools in the dev stack. That’s a lot of work.”


Before this round, Sourcegraph had raised $48 million, including a $23 million tranche announced back in March. That earlier raise preceded a significant growth period for the company, with its annual recurring revenue (ARR) quadrupling and its all-remote workforce more than doubling to 80. Moreover, the company said that it added 26 major enterprise customers to its client base in 2020, with “zero enterprise churn,” which Slack at least partly attributes to the rush to remote work.

“In the era of big code, most companies are operating like tech companies,” he said. “Developers are managing 100 times more code than they were a decade ago, and code is only getting more complex. During the pandemic, most development teams are working from home, which makes communication harder and forces developers to work more independently. Sourcegraph empowers developers to be less dependent on hallway talk to figure out big code issues and collaborate better remotely.”

Adding esteemed Silicon Valley VC firm Sequoia Capital to its roster of investors is a major coup for Sourcegraph, considering Sequoia’s previous investments in the DevOps space, which include GitHub (acquired by Microsoft for $7.5 billion), Altiscale (acquired by SAP), MongoDB (IPO), Unity (IPO), and Docker. And that’s not to mention its long history of investments in game-changing companies such as Apple, Google, Airbnb, and PayPal. Other investors in this round include Goldcrest Capital, Craft, Redpoint, and Felicis Ventures, with Sequoia partner Andrew Reed now joining Sourcegraph’s board of directors.

While parallels can be drawn to the big data revolution that helped enterprises benefit from myriad new types of applications, the “big code revolution” feels like it’s still in its relative infancy. Indeed, AI and machine learning are now crucial components of big data applications such as cybersecurity software, and this is something that will likely creep into Sourcegraph’s platform to enable new kinds of applications built around universal code search. “Stay tuned” was Slack’s response when asked about his company’s plans on this front.

“Most companies’ code is scattered across a lot of different systems, and Sourcegraph is becoming the one place where it’s all accessible inside of a company,” he added. “That’s a prerequisite for doing machine learning or AI on code. That said, Sourcegraph is an on-premises product — companies keep full control over their code, and will pilot the progress of this development.”

