InfluxDB Flux - Filter where field matches value

18.8k views Asked by At

I'm using InfluxDB with Grafana, and I have a measurement called items with some tags and a field called itemType. I need to filter for rows where itemType is a certain string. The following InfluxQL query does exactly what I need:

SELECT * FROM "items" WHERE "itemType" = 'example'

How can I do the same in Flux?

I currently have the following query which does everything except filter by field:

from(bucket: "dbname/autogen")
    |> range(start: 2020-10-12T01:56:34Z, stop: 2020-10-12T02:54:10Z)
    |> filter(fn:(r) => r._measurement == "items")
    |> aggregateWindow(every: 5m, fn: count)

But replacing the filter function with filter(fn:(r) => r._measurement == "items" and r.itemType == "example") returns no results, even though the InfluxQL query above does return data when used in the InfluxDB CLI.

2

There are 2 answers

1
kes_ On BEST ANSWER

Your specified flux query would work if itemType was a tag, however, since it is a field one of the ways to make the query work is by setting conditions on the field name and its value as follows:

from(bucket: "dbname/autogen")
    |> range(start: 2020-10-12T01:56:34Z, stop: 2020-10-12T02:54:10Z)
    |> filter(fn:(r) => r._measurement == "items" and r._field == "itemType" and r._value == "example")
    |> aggregateWindow(every: 5m, fn: count)
0
Sebastian Staacks On

I am a bit surprised that this has not been answered in more than three years as it popped up several times while I was researching this. If I understand the question correctly, then the problem is based on a misunderstanding of how data is structured in InfluxDB and the accepted answer ignores that the question aims at retrieving not just the count of matching items but other columns that belong to the filtered "rows" (SELECT * FROM ...).

Columnar format

InfluxDB is not a traditional relational database. Your request does not return a table like an Excel sheet with one column for each field and field names as their headers, but literally a column that contains the field names.

Example:

_time _field _value (...and a few more taht do not matter here)
00:10 "fruit" "apple"
00:10 "color" "red"
00:20 "fruit" "apple"
00:20 "color" "green"
00:30 "fruit" "banana"
00:30 "color" "yellow"
00:40 "fruit" "banana"
00:40 "color" "brown"

Let's say you want to retrieve the color of all apples with a query like SELECT * FROM something WHERE fruit = "apple", right?

If you filter with r._field == "fruit" and r._value == "apple" (as suggested by the accepted answer) you will only retrieve two lines:

_time _field _value (...and a few more taht do not matter here)
00:10 "fruit" "apple"
00:20 "fruit" "apple"

In particular, you will lose all other fields except the one you are filtering.

Row format

What you want is your related data being structured in a row, which you probably know from a relational database like MySQL. You want this:

_time fruit color
00:10 apple red
00:20 apple green
00:30 banana yellow
00:40 banana brown

If you filter this with r.fruit == "apple" you get all the data that belongs to the value you are filtering for:

_time fruit color
00:10 apple red
00:20 apple green

pivot()

In order to convert your columnar data in influxdb to row data you can use the command pivot().

It takes three arguments:

  • The key that should be grouped into rows (in my example _time)
  • The key that should be grouped into columns (i.e. the headings or keys of your new table, this is usually _field)
  • The key that contains the values (most of the time _value)

So, putting it all together...

tl;dr

The equivalent to SELECT * FROM something WHERE fruit = "apple" in flux is

from(bucket: "mybucket")
    |> range(start: 2020-10-12T01:56:34Z, stop: 2020-10-12T02:54:10Z)
    |> filter(fn:(r) => r._measurement == "something")
    |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
    |> filter(fn:(r) => r.fruit == "apple")

(Technically it is not entirely equivalent because of the mandatory range.)

Oh, and if you want to aggregate this, you should do it before the pivot as pretty much everything else expects the columnar format.