Aggregation of time series values
Different ways to aggregate data without the time grouping distorting the results
projects, grafana, sql, postgresql, influxdb, python
1273 Words
2025-05-26
This post is about some information I have not found anywhere else on aggregated queries in Grafana. The concept is not unique to Grafana and has also occurred to me in Python when using pandas resample. It can happen with InfluxDB or PostgreSQL as well, so we have a variety of examples to go through.
The main starting point of our problem is data where each value is tied to a time slot, i.e. one value per time interval. For example, looking like this:
time | value | technology | unit |
---|---|---|---|
11:00 | 4 | solar | unit1 |
11:15 | 8 | solar | unit2 |
11:30 | 6 | wind | unit3 |
The example given here is about average power measurement values of a time slot in kW. It is not the generated electrical energy in kWh. If it were, we would not have this problem - but we would have other ones.
Other use cases of this include average requests per second, MB written per second, network throughput, the current number of WiFi clients, and the like.
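To make the pitfall concrete before we look at queries, here is a minimal Python sketch (with made-up numbers) of why power readings must be averaged, not summed, when re-bucketing:

```python
# Quarter-hourly average power readings in kW for one hour of one unit
readings = [4.0, 8.0, 6.0, 2.0]

# Correct hourly value for power: the mean, still in kW
hourly_mean = sum(readings) / len(readings)

# A naive SUM yields 20 kW, four times too large - it would only be
# meaningful if we reinterpreted it as energy and divided by 4
naive_sum = sum(readings)

print(hourly_mean, naive_sum)  # 5.0 20.0
```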
Easy Working Queries
We can of course now select the time series data for a single unit in Grafana with InfluxDB:
SELECT
MEAN("value") AS "Value"
FROM data WHERE $timeFilter AND ("unit" = 'unit1')
GROUP BY time($__interval)
The last GROUP BY evaluates to GROUP BY time(1h) in the Query Inspector of Grafana, therefore taking the average of all values within an hour - and displaying one value per hour as a time series.
Great! Similarly, for PostgreSQL:
SELECT
$__timeGroupAlias(start_time,$__interval),
AVG("value") AS "Value"
FROM data WHERE $__timeFilter(start_time) AND "unit" = 'unit1'
GROUP BY 1
By the way, a lesser-known feature of SQL is referencing a column by its (1-based) index - here, GROUP BY 1 spares us another write-out of $__timeGroupAlias(start_time,$__interval).
The use of $__timeGroupAlias and $__timeFilter looks weird at first - these are Grafana macros. The former evaluates to time_bucket('3600.000s',start_time) AS "time", while $__timeFilter evaluates to start_time BETWEEN '2019-01-01T00:00:00Z' AND '2019-01-31T00:00:00Z'.
This ensures that the correct time filter is used and that values are aggregated correctly, even when zooming out quite far.
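Under the hood, such bucketing is just flooring each timestamp to the start of its interval. A small Python sketch of what time_bucket does, simplified to Unix timestamps and ignoring offsets and time zones:

```python
def time_bucket(interval_s: int, ts: int) -> int:
    """Floor a Unix timestamp to the start of its interval,
    like TimescaleDB's time_bucket() (offsets ignored)."""
    return ts - ts % interval_s

# 11:00, 11:15 and 11:30 all fall into the same hourly bucket
assert time_bucket(3600, 39600) == 39600  # 11:00
assert time_bucket(3600, 40500) == 39600  # 11:15
assert time_bucket(3600, 41400) == 39600  # 11:30
```

With a quarter-hourly interval the same timestamps would each keep their own bucket, which is exactly why the macro has to adapt to $__interval.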
Another important benefit of the average is that data stored multiple times does not skew your values. This can of course also be a pain point when trying to figure out why a sum of some value is unexpectedly high, while the other queries show correct results.
Aggregating queries
The fun begins when we want to query an aggregation. In our example from above, we want one time series that sums over all units.
The problem now is that using the mean query gives you the mean across all units, which is not what we want, while using SUM as the aggregation gives the sum over all units - and also over time: when zooming out of the graph (which increases the interval), the values get bigger. That's not what we want either.
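This effect is easy to reproduce outside of Grafana. A Python sketch with constant 5 kW readings shows how SUM grows with the bucket size while MEAN stays put:

```python
readings = [5.0] * 96  # one day of quarter-hourly 5 kW readings

def rebucket(values, per_bucket, agg):
    """Aggregate consecutive readings into buckets of per_bucket values."""
    return [agg(values[i:i + per_bucket])
            for i in range(0, len(values), per_bucket)]

mean = lambda xs: sum(xs) / len(xs)

print(rebucket(readings, 4, sum)[0])    # hourly SUM: 20.0, already wrong
print(rebucket(readings, 96, sum)[0])   # daily SUM: 480.0, grows with zoom
print(rebucket(readings, 4, mean)[0])   # hourly MEAN: 5.0
print(rebucket(readings, 96, mean)[0])  # daily MEAN: 5.0, stays put
```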
How can we solve this?
Multiple graphs through GROUP BY
One easy way is to reduce our problem to the single queries from above and create an individual series per unit. This is done by adding unit to the GROUP BY clause:
SELECT
MEAN("value") AS "Value"
FROM data WHERE $timeFilter
GROUP BY time($__interval), "unit"
In Grafana we can then stack the individual series to get their sum. Alternatively, we could use client-side aggregations for this, which are less efficient and reduce the dashboard's performance.
Fixed interval size
Instead of the dynamic interval size, one can use SUM in combination with a fixed interval size. If we have, for example, quarter-hourly data, we need to add a conversion factor and divide by 4.
SELECT
SUM("value")/4 AS "Value"
FROM data WHERE $timeFilter
GROUP BY time(1h)
This has two drawbacks. First, it is no longer possible to see the quarter-hourly values, and we have to keep the conversion factor in mind. Second, when zooming out to monthly or yearly ranges, the query still fetches hourly values, which leads to a very high response time and reduces the dashboard's performance a lot.
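There is a third caveat worth noting (my own observation, not specific to Grafana): the fixed factor silently assumes exactly four readings per hour, while MEAN adapts to the actual count:

```python
mean = lambda xs: sum(xs) / len(xs)

# Exactly four quarter-hourly readings: SUM/4 equals the MEAN
full_hour = [4.0, 8.0, 6.0, 2.0]
assert sum(full_hour) / 4 == mean(full_hour) == 5.0

# One reading missing: the fixed factor underestimates, MEAN does not
gappy_hour = [4.0, 8.0, 6.0]
assert sum(gappy_hour) / 4 == 4.5  # too low
assert mean(gappy_hour) == 6.0     # true average power
```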
SUM-MEAN subquery
The best way to tackle this is a subquery which brings the values to the correct interval size, then sums up the individual values. This looks quite complex at first but is understandable once you get used to it.
SELECT time, SUM(value) AS "value" FROM
(
SELECT $__timeGroupAlias(start_time,$__interval),
AVG(value) AS "value", unit
FROM data
WHERE $__timeFilter(start_time)
GROUP BY 1, "unit"
) subquerydata
GROUP BY 1
ORDER BY 1
Here the subquery produces grouped means, removing our burden of differing aggregation sizes. We then have only one value per interval and unit - so the only thing left is to sum these values up.
This is also dynamic, as the GROUP BY clause respects changes to $__interval.
Note that we do not need a WHERE clause in the outer query, as the inner query does not return any values outside the filter. Filtering once is enough.
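The same two-stage aggregation can be written out in plain Python (with hypothetical sample rows) to see what the subquery does:

```python
from collections import defaultdict

# rows of (time bucket, unit, raw quarter-hourly power reading in kW)
rows = [
    (39600, "unit1", 4.0), (39600, "unit1", 8.0),
    (39600, "unit2", 6.0), (39600, "unit2", 2.0),
]

# Inner query: MEAN per (time bucket, unit)
groups = defaultdict(list)
for bucket, unit, value in rows:
    groups[(bucket, unit)].append(value)
means = {key: sum(vals) / len(vals) for key, vals in groups.items()}

# Outer query: SUM over units per time bucket
totals = defaultdict(float)
for (bucket, _unit), unit_mean in means.items():
    totals[bucket] += unit_mean

print(dict(totals))  # {39600: 10.0} -- 6 kW (unit1) + 4 kW (unit2)
```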
Aggregating queries by another value
For our initial data, we could also ask for the total generation per technology. This requires grouping the inner query by unit and technology - and the outer query by technology:
SELECT time, SUM(value) as "value", technology FROM
(
SELECT $__timeGroupAlias(start_time,$__interval),
AVG(value) AS "value", unit, technology
FROM data
WHERE $__timeFilter(start_time)
GROUP BY 1, "unit", technology
) subquerydata
GROUP BY 1, technology
ORDER BY 1
So we need to adjust four places - both SELECT clauses and both GROUP BY clauses. That's it.
Real world examples
As I work in science, which does not require an NDA (non-disclosure agreement), I can also show some real examples.
For example, this one from the assume simulation, visualizing generation per technology:
SELECT time, SUM(volume) as "volume", tech FROM
(
SELECT
$__timeGroupAlias(start_time,$__interval),
AVG(accepted_volume) AS "volume",
unit_id, pm.technology AS tech
FROM market_orders mo
JOIN power_plant_meta pm ON pm.index=mo.unit_id AND pm.simulation=mo.simulation
WHERE
$__timeFilter(start_time) AND
market_id = '$market' AND
mo.simulation = '$simulation'
GROUP BY 1, unit_id, bid_id, pm.technology
ORDER BY 1
) subquery
GROUP BY 1, tech
ORDER BY 1, tech DESC
Or this one for the consumption data from the electricity usage of the university's campus. This data is stored in InfluxDB, so it uses the slightly different InfluxQL query language:
SELECT sum("P_Ges") AS "Strom HV" from (
SELECT mean("P_ges") AS "P_Ges"
FROM "Strom_Hauptverteilung"
WHERE $timeFilter
GROUP BY time($__interval), "id" fill(null)
)
WHERE $timeFilter
GROUP BY time($__interval) fill(none)
Conclusion
In this post, we went through a few different methods of aggregating time series queries. The crux of this always arises when values are stored as power instead of energy.
Another solution, used a lot with Telegraf, is to store monotonically increasing counters and then visualize the non-negative derivative of the data. This is quite common in InfluxDB, but it leaves data which is hard to interpret on its own otherwise.
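A sketch of that counter approach, assuming a monotonic counter sampled at fixed intervals; negative jumps from counter resets are discarded, similar in spirit to InfluxDB's non_negative_derivative():

```python
def non_negative_derivative(samples, dt=1.0):
    """Per-interval rate from a monotonic counter; negative jumps
    (counter resets) yield None instead of a bogus negative rate."""
    rates = []
    for prev, cur in zip(samples, samples[1:]):
        delta = (cur - prev) / dt
        rates.append(delta if delta >= 0 else None)
    return rates

# bytes-written counter that resets to 0 after the third sample
print(non_negative_derivative([100, 150, 230, 0, 40]))
# [50.0, 80.0, None, 40.0]
```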
I wondered how I could visualize this problem without just adding screenshots from Grafana, but I did not come up with a sufficient one. So be it. You have to tinker with it yourself to get an understanding of what an AVG of SUM query means and when it equals the SUM of AVG query. :)
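For tinkering, here is a tiny Python comparison with my own toy numbers: with complete data per timestamp the two orders agree, with a gap they diverge.

```python
mean = lambda xs: sum(xs) / len(xs)

# Two units, two timestamps, complete data
u1, u2 = [4.0, 8.0], [6.0, 2.0]
sum_of_avg = mean(u1) + mean(u2)                    # 6.0 + 4.0
avg_of_sum = mean([a + b for a, b in zip(u1, u2)])  # mean of 10.0, 10.0
assert sum_of_avg == avg_of_sum == 10.0

# unit2 missing its second reading: the two orders disagree
assert mean(u1) + mean([6.0]) == 12.0  # SUM of AVG
assert mean([4.0 + 6.0, 8.0]) == 9.0   # AVG of SUM
```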
So far I have gotten good results using the SUM-MEAN aggregation, as it does not create performance bottlenecks on the server or the client. Since I did not find similar information elsewhere, this post exists to be linked to for others.