Talking TimeSeries

In 11.70.xC3, we added some new time series capabilities. Why would you care?

Time series are found everywhere. A time series is simply data that is collected over time. It could be changes in stock prices and transaction volumes. It could also be the readings of your home electric meter, taken every 15 minutes, for example, to provide a much more accurate picture of how electricity is being used. Other time series examples include weather information, network traffic, thermal readings in a large data center, and so on.

One key characteristic of time series is that the processing always includes a time component. For example, you want to get all the meter readings for one month for a specific customer. With this data, you can calculate daily consumption, running averages, etc. To do this type of processing, you need quick access to the specific range of data you want to analyze, and you also need to get it in time order.
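
To make the "range of data, in time order" requirement concrete, here is a minimal Python sketch. It is not Informix code: the function name `readings_in_range` and the sample meter data are made up for illustration. It only shows that when readings are already kept in time order, a time range can be located with two binary searches and no sorting step.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def readings_in_range(timestamps, values, start, end):
    """Return (timestamp, value) pairs with start <= timestamp < end.

    Assumes `timestamps` is sorted ascending, so two binary searches
    find the slice without scanning or sorting the whole series.
    """
    lo = bisect_left(timestamps, start)
    hi = bisect_left(timestamps, end)
    return list(zip(timestamps[lo:hi], values[lo:hi]))


# Hypothetical meter: one reading every 15 minutes for two days,
# each reading reporting 0.25 kWh consumed in that interval.
origin = datetime(2011, 6, 1)
timestamps = [origin + timedelta(minutes=15 * i) for i in range(2 * 96)]
values = [0.25] * len(timestamps)

# All readings for the second day, already in time order.
day_two = readings_in_range(
    timestamps, values, origin + timedelta(days=1), origin + timedelta(days=2)
)
daily_total = sum(value for _, value in day_two)
```

The daily consumption is then a plain sum over the returned slice, and a running average could be computed over the same slice in a single pass.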

Informix provides a data type that is used specifically to optimize time series data. It also comes with an extensive set of functions used to manipulate these time series. Informix TimeSeries provides three major benefits:

  • Space savings

    In a standard relational database, each time series element must have an identifier and a time stamp because the time series elements are stored in a separate table from the object they refer to (such as an electric meter). You then need an index on the identifier and the time stamp so you can join it with the table that lets you select which identifier and time range you want to operate on.

    In contrast, Informix TimeSeries stores the time series in the same table and row that represents the object. You don't need the additional index on the identifier to join tables. Also, the data is kept in time order in the time series. In the case of regular intervals, you don't even need to keep the time stamp, since the position gives you the time.

    In customer tests we have regularly seen Informix TimeSeries take one third of the disk space used by the standard relational approach.


  • Performance benefits

    Just having less data on disk gives Informix TimeSeries a significant performance benefit, but it does not stop there.

    The data is ordered. This means it is much faster to get to the exact subset of data that you want to process. Because the data is ordered, you will likely find the next record you are looking for on the same page as the last one. In a relational system, all this data could be scattered over a large number of pages without any specific ordering. This would cause many more I/O operations than in the case of Informix TimeSeries. You also don't need to go through an additional sorting step before you process your data.

    In customer tests we have seen as much as a 60x performance improvement over the standard relational approach in some queries.


  • Simpler development

    Informix TimeSeries comes with a set of functions that allow you to manipulate the time series. For example, if you need to group your readings to go from a 15-minute interval to an hourly interval, it can be done in a simple statement. Similarly, if you want to calculate a running average: a simple statement. This means you don't need to write specialized code to provide this processing. It is built into the Informix TimeSeries capabilities.

    The end result can be simpler code, less maintenance, and faster time to market.
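
The ideas in the three points above can be sketched in a few lines of Python. This is only an illustration of the concepts, not the Informix TimeSeries API: the class `RegularSeries` and its method names are hypothetical. It shows a regular series that stores no time stamps (the position gives the time), a regrouping from 15-minute to hourly values, and a running average.

```python
from datetime import datetime, timedelta


class RegularSeries:
    """Values at a fixed interval; time stamps are derived, never stored."""

    def __init__(self, origin, interval, values):
        self.origin = origin      # time stamp of the first element
        self.interval = interval  # fixed spacing between elements
        self.values = list(values)

    def timestamp_at(self, index):
        # The position alone is enough to recover the time stamp.
        return self.origin + index * self.interval

    def index_of(self, when):
        # Inverse mapping: go straight to a position from a time stamp.
        return (when - self.origin) // self.interval

    def aggregate_sum(self, factor):
        # Sum each run of `factor` consecutive values into one element
        # of a coarser series (e.g. 4 x 15 minutes -> 1 hour).
        sums = [
            sum(self.values[i:i + factor])
            for i in range(0, len(self.values), factor)
        ]
        return RegularSeries(self.origin, self.interval * factor, sums)

    def running_average(self, window):
        # Average of the last `window` values at each position; the
        # first positions average over however many values exist so far.
        averages = []
        total = 0.0
        for i, value in enumerate(self.values):
            total += value
            if i >= window:
                total -= self.values[i - window]
            averages.append(total / min(i + 1, window))
        return averages


# Hypothetical readings: two hours of 15-minute values.
quarter_hourly = RegularSeries(
    datetime(2011, 6, 1), timedelta(minutes=15), [1, 2, 3, 4, 5, 6, 7, 8]
)
hourly = quarter_hourly.aggregate_sum(4)
smoothed = quarter_hourly.running_average(4)
```

In this sketch the regrouping and the running average are each one method call on the series, which is the kind of simplification the built-in functions aim for. In Informix the equivalent work is done by the database's own functions rather than by application code like this.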

Informix TimeSeries also provides the ability to create relational views on top of your time series data. This opens the door to using standard off-the-shelf products for tasks such as reporting.

With this very brief introduction, we are now ready to talk about the improvements made in 11.70.xC3. This will have to wait until next time :-)