Base Case

Headless business intelligence

Jan 7th, 2021
Ankur Goyal and Alana Goyal

It’s 2021, and everyone is excited about the post-Snowflake world (Snowflake is worth just shy of $80B at the time of writing). At basecase, we are bullish about the next generation of software companies built on top of cloud-based data warehouses like Snowflake. Specifically, we believe there’s an open opportunity to solve a critical problem that all data-driven businesses face: calculating metrics consistently. Metrics like daily active users, funnel events, and churn signals are critical to scaling a product-led growth motion, but are still too difficult to shuttle into the core workflows that product and go-to-market teams engage in every day.

The category of products that should be addressing this is business intelligence (BI) tools; however, by bundling analytical logic (i.e. the definition of business metrics) with one way of consuming them (visualizing charts and graphs), today’s BI tools are limited in their scope. For example, how might you trigger a salesperson to reach out to a prospect when they hit a certain metric? Or better yet, reach out to the user directly?

In this post, we’ll take a closer look at this problem, walk through various attempts at solving it, and then motivate an approach that we believe can produce a highly scalable and enduring business: headless business intelligence. If you’re an entrepreneur who resonates with this idea or is already working on something similar, we’d love to meet you and help.

Status quo

The process of effectively using business data involves multiple steps: collecting it, dumping it into one place, defining metrics, and then using them to make better decisions. Historically, the first two steps (collecting and gathering) were incredibly challenging, and often the bottleneck in this process. Products like Segment, Fivetran, and Stitch have largely solved the data collection problem, and, by integrating directly with cloud-based data warehouses like Snowflake, BigQuery, and Redshift, businesses can easily dump it in one place.

Now that this is the norm for most businesses, a new bottleneck is becoming apparent: making effective use of the data. Data warehouses speak SQL, which was originally designed for non-technical users, but has not evolved to support modern analysis straightforwardly. Simple tasks like user sessionization, funnel analysis, and data deduplication often require 1,000+ line SQL queries which must be written by expert data engineers or generated programmatically. Furthermore, the output of a SQL query is a table, which is useful in certain contexts, but not sufficient to power bar/line charts, complex pivot tables, funnel visualizations, etc. As a result, businesses often need to employ another layer of tooling to actually leverage the data in a data warehouse.

Enter business intelligence tools. BI tools such as Looker, PowerBI, Tableau, and Data Studio are designed specifically to abstract away the complexities of SQL and make it easy to visualize simple business metrics over the underlying data. While these tools enable business users to use their data better, they have one fatal constraint: users can only use the metrics they’ve defined within the four walls of the visualization tool. Historically, this was fine, as analysts only really consumed metrics to inform manual decisions (e.g. changing a product feature based on user engagement).

However, with an increasing number of companies focused on product-led growth, these metrics have become crucial to business workflows. For example, if a user is nearing their capacity limit, you might want a salesperson to reach out and give them a nudge. Or, if they are about to churn, you might want to email them a free offer. You may even want your data scientists to try and model behavior that optimizes certain metrics.

Unfortunately, these scenarios are nearly impossible to support, since metrics cannot be accessed outside of the BI tool. This is catastrophic: not only is it a lot of work to implement the same metrics in multiple systems, but it’s even harder to do so consistently. Engineering is often the bottleneck, which is expensive and distracts from core product development. Worse, inconsistent worldviews across sales, marketing, data science, and product management lead to friction and a lack of team alignment.

The ideal solution

Imagine instead that you could disentangle metric definition from visualization. In this world, the teams that own metrics would be able to define them once, in a way that’s consistent across dashboards, automation tools, sales reporting, and so on. Let’s call this “Headless BI”.

Consider the following metric: daily active users (DAU). Defining DAU is a complex task that involves specifying which activity signals to track and how frequently to expect them. With headless BI, product managers could define this complex logic without writing SQL, potentially even in a UI. They could then point one or more BI tools at this logic to visualize daily active usage over time or correlate it with signals like industry or use case. They could even work with engineering or design to build product features that consume this signal over an API. For example, you might want to prompt a user with an advanced tutorial after they’ve been a DAU for 10 days. Without headless BI, you’d have to redefine DAU in each BI tool you use, and implement the logic directly in SQL to leverage the signal in your product.
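As a rough illustration of "define once, consume anywhere", here is a minimal Python sketch. Everything in it is hypothetical: the `Event` record, the `ACTIVE_EVENTS` set, and the function names are assumptions made for the example, not part of any real product. It also assumes that "a DAU for 10 days" means 10 consecutive active days, which the scenario above does not specify.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical activity signals that count as "active" (an assumption for this sketch).
ACTIVE_EVENTS = {"login", "create_report", "share_dashboard"}


@dataclass(frozen=True)
class Event:
    user_id: str
    day: date
    name: str


def active_users(events, day):
    """The single DAU definition: users with a qualifying event on `day`."""
    return {e.user_id for e in events if e.day == day and e.name in ACTIVE_EVENTS}


def dau_series(events, start, days):
    """Consumer 1 (visualization): (day, DAU count) pairs ready for a chart."""
    series = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        series.append((day, len(active_users(events, day))))
    return series


def active_streak(events, user_id, end, max_days=365):
    """Consumer 2 (product): consecutive active days ending on `end`."""
    streak = 0
    while streak < max_days and user_id in active_users(events, end - timedelta(days=streak)):
        streak += 1
    return streak


def should_show_advanced_tutorial(events, user_id, today, threshold=10):
    """Product trigger built on the same definition the chart uses."""
    return active_streak(events, user_id, today) >= threshold


# Tiny made-up dataset: "ana" logs in for 10 straight days, "bo" only once.
start = date(2021, 1, 1)
events = [Event("ana", start + timedelta(days=d), "login") for d in range(10)]
events += [Event("bo", start, "page_view"), Event("bo", start + timedelta(days=1), "login")]

print(dau_series(events, start, 2))
print(should_show_advanced_tutorial(events, "ana", start + timedelta(days=9)))
```

Both the chart data and the tutorial trigger call `active_users`, so changing what counts as "active" changes every consumer at once. That is the property a headless BI layer would need to provide across tools rather than inside one script.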

Beyond that, imagine the implications outside of product, namely sales, marketing, and customer success. For example, if a user has been active for 14 days, you might want a salesperson to email them to schedule a meeting. Or, if a prospect at a certain stage in the funnel stops being active, you might want to update their record in your CRM so the team can reach out. Unfortunately, there’s no way to export from a BI tool into an email automation platform or CRM, so in most cases, this automation is either impossible or driven by integrations that require redefining DAU and other metrics.

In short, the core criteria of a compelling solution are:

  • Easy to define metrics, without writing code (SQL)
  • Metrics can be used flexibly across BI visualizations, SaaS integrations, and an API
  • Metrics can be queried in real-time and at high enough scale to power automation like email triggers, product experiences, etc.

Alongside these customer benefits, a headless BI product is also much stickier than a traditional BI tool. BI dashboards live and die by the analysts who look at them to make decisions. Since a headless BI tool can be integrated into many business workflows (e.g. product, sales, marketing, customer success), it is much harder to rip out. Furthermore, headless BI can easily survive as UIs evolve to support more visualization types across desktop, mobile, and web, essentially turning visualization into a thin layer over universally defined metric logic. Finally, companies can invest in headless BI without locking in their BI strategy to a single visualization platform, making it an appealing choice for customers who are not ready to put all of their eggs into one basket.

An open opportunity

Business metric definition is a big problem, so it’s no surprise that many different products attack pieces of it. However, we have not come across a product or company taking a truly scalable, headless approach and believe this is a massive open opportunity.

[Figure: BI Market Map]

Modern BI tools make metrics easy to define and analyze: Looker makes metric analysis self-service for business users; Mode lets you visualize SQL directly; Tableau, PowerBI, and Data Studio are practically universal. However, none of these tools allow you to use the metrics you define outside of their UIs, via integrations or APIs. A headless BI product, by contrast, would make each of these tools much simpler and more effective. In a large enough organization, you’re bound to use multiple BI tools, and with headless BI, you can ensure they’re visualizing the same metrics.

More recently, “Reverse Fivetran” tools (Census, Hightouch, Polytomic, Headsup) allow you to define metrics and export them to sales & marketing tools (Salesforce, Hubspot, etc.). However, while they are well integrated with other SaaS tools, the metrics you define cannot directly be accessed through BI tools or queried ad-hoc through APIs. Without careful project planning and engineering, you are at risk of defining an alternate truth for the consumers of these tools vs. BI users.

Data warehouses actually offer some basic features to solve this problem. The key building block in SQL is a view, which allows you to define a queryable relation in terms of another SQL query. In many cases, you can define metrics as views over the underlying data, and simply point BI and Reverse Fivetran tools at these views. However, at any reasonable scale, views must be precomputed (materialized), which is barely supported in data warehouses (including Snowflake).
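To make the view idea concrete, here is a small sketch that uses Python's built-in `sqlite3` module as a stand-in for a data warehouse. The `events` table, its rows, the `daily_active_users` view, and the choice of which event names count as activity are all made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (user_id TEXT, day TEXT, name TEXT);
INSERT INTO events VALUES
  ('ana', '2021-01-01', 'login'),
  ('ana', '2021-01-01', 'create_report'),
  ('bo',  '2021-01-01', 'page_view'),
  ('bo',  '2021-01-02', 'login'),
  ('ana', '2021-01-02', 'login');

-- The metric is defined once, as a view that any SQL-speaking tool can query.
CREATE VIEW daily_active_users AS
SELECT day, COUNT(DISTINCT user_id) AS dau
FROM events
WHERE name IN ('login', 'create_report')
GROUP BY day;
""")

# A BI tool or an export job would run a query like this against the view.
rows = conn.execute(
    "SELECT day, dau FROM daily_active_users ORDER BY day"
).fetchall()
print(rows)
```

Note that this is a plain, non-materialized view: SQLite re-runs the underlying aggregation every time the view is queried. That is fine for five rows, and it is exactly the cost that becomes prohibitive at scale without precomputation.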

A few interesting companies solve pieces of this, which could be components of a headless BI solution:

  • Materialize is building a SQL database optimized for materialized views, and Rockset indexes semi-structured data directly, either of which could make database views performant enough to represent complex metrics on messy data. If so, Materialize or Rockset could be the underlying datastore for headless BI.
  • DBT lets you more easily build layers of SQL with a version-controlled software engineering workflow. However, it’s optimized for engineers who understand how to write SQL, and it is explicitly not trying to solve the metric definition problem. A headless BI product could leverage DBT to templatize common metric patterns or even offer SQL-level extensibility.
  • Supergrain[1] is building a new kind of data platform where data analysts can define their metrics centrally and deliver them to end users in the interfaces of their choice.

We recognize that this is not an easy problem to solve, but the possibilities are compelling. Headless BI can multiply the impact of data-driven efforts in small & large companies, as each metric you define can be readily consumed for both analysis and automation. If you have opinions about this topic or are building a company in this space, we would love to chat.


Thanks to Yan-David Erlich, Anand Mariappan, and Mo Bingham for their feedback on this post.

Footnotes

  1. basecase is an investor in Supergrain.