

Time series databases are not new, but the first-generation time series databases were primarily focused on financial data, the volatility of stock trading, and systems built to support trading. But financial data is hardly the only application of time series data anymore; in fact, it’s only one among numerous applications across various industries. The fundamental conditions of computing have changed dramatically over the last decade. Monolithic mainframes have vanished, replaced by serverless servers, microservers, and containers. Today, everything that can be a component is a component. In addition, we are witnessing the instrumentation of every available surface in the material world: streets, cars, factories, power grids, ice caps, satellites, clothing, phones, microwaves, milk containers, planets, human bodies. So now, everything inside and outside the company is emitting a relentless stream of metrics and events, or time series data.

This means that the underlying platforms need to evolve to support these new workloads: more data points, more data sources, more monitoring, more controls. What we’re witnessing, and what the times demand, is a paradigmatic shift in how we approach our data infrastructure and how we approach building, monitoring, controlling, and managing systems. What we need is a performant, scalable, purpose-built time series database.

For example: With a time series database, it is common to request a summary of data over a large time period. This requires going over a range of data points to perform some computation, such as the percentage increase of a metric this month over the same period in the last six months, summarized by month. This kind of workload is very difficult to optimize for with a distributed key-value store. TSDBs are optimized for exactly this use case, giving millisecond-level query times over months of data.

Another example: With time series databases, it’s common to keep high-precision data around for a short period of time. This data is aggregated and downsampled into longer-term trend data. This means that every data point that goes into the database will have to be deleted after its period of time is up. This kind of data lifecycle management is difficult for application developers to implement on top of regular databases. They must devise schemes for cheaply evicting large sets of data and constantly summarizing that data at scale. With a time series database, this functionality is provided out of the box.
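
As a rough illustration of the data lifecycle this section describes (keeping high-precision data briefly, downsampling it into longer-term trends, and evicting it once its retention period ends), here is a minimal Python sketch. The function names `downsample` and `evict_expired`, the in-memory list of `(timestamp, value)` tuples, and the sample data are all assumptions made for illustration. They are not the API of any particular time series database, which would do this work internally and at far larger scale.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return sorted((start, mean(values)) for start, values in buckets.items())


def evict_expired(points, now, retention_seconds):
    """Keep only the points that are still inside the retention window."""
    cutoff = now - retention_seconds
    return [(ts, value) for ts, value in points if ts >= cutoff]


# Hypothetical raw series: one reading every 10 seconds for 2 hours.
raw = [(t, float(t % 60)) for t in range(0, 7200, 10)]
now = 7200

# Roll the raw data up into hourly averages (the long-term trend data),
# then drop raw points older than 30 minutes.
hourly = downsample(raw, bucket_seconds=3600)
raw = evict_expired(raw, now, retention_seconds=1800)

print(hourly)
print(len(raw), "raw points retained")
```

Even in this toy form, the application has to scan every raw point to summarize it and again to expire it, which hints at why doing the same thing by hand on a general-purpose database becomes costly at scale.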
