Prometheus Vector Operations Are Not Associative
I was working on a PromQL query at work to find disk usage on Kubernetes control plane nodes - the first step was to identify those nodes:
kube_node_role{role="control-plane"}
...then to use vector matching operators to associate each control plane node with the node-exporter running on it:
kube_node_role{role="control-plane"} * on(node) group_left(pod) kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"}
...and finally, get the filesystem usage reported by each matching node-exporter, associated with the corresponding node:
node_filesystem_avail_bytes * on(pod) group_left(node) kube_node_role{role="control-plane"} * on(node) group_left(pod) kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"}
This got me the dreaded "many-to-many matching not allowed: matching labels must be unique on one side" error! This kinda surprised me - after all, I'd tested node_filesystem_avail_bytes and kube_node_role{role="control-plane"} * on(node) group_left(pod) kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"} in isolation, and they both worked fine - and when I evaluated them in separate query panels, they yielded compatible label sets!
Here's the fixed version:
node_filesystem_avail_bytes * on(pod) group_left(node) (kube_node_role{role="control-plane"} * on(node) group_left(pod) kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"})
Do you see it?
I'm guessing you may have since I kinda gave it away with this post's title, but in case you haven't, here's a diff between the two queries after they've been formatted:
node_filesystem_avail_bytes
* on(pod) group_left(node)
+ (
kube_node_role{role="control-plane"}
* on(node) group_left(pod)
kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"}
+ )
So while the resulting value of a vector match operation may be associative (as far as I know - please reach out if you have counterexamples), the resulting label set is not!
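To make the difference concrete: the Prometheus documentation states that binary operators on the same precedence level are left-associative, so the first query should group as in the sketch below. The parentheses here are added purely for illustration; they only make the implicit grouping visible.

```promql
# How the unparenthesized query groups (left to right).
# The outer parentheses are illustrative, not part of the original query.
(
    node_filesystem_avail_bytes
  * on(pod) group_left(node)
    kube_node_role{role="control-plane"}
)
* on(node) group_left(pod)
  kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"}
```

Read this way, the first match runs on(pod) against the bare kube_node_role series, before kube_pod_info has had a chance to attach a node-exporter pod to each node. Presumably nothing guarantees that pod is unique on that side, which would explain the many-to-many error.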
This is kind of my fault for starting with the B * C part of an A * B * C query and then adding the A part at the beginning - I'm not exactly sure why I did this! I think I might just favor group_left, or maybe I just like leading with the metric whose values I care about, followed by the "filter" part of the query, much like how a SQL SELECT statement is structured.
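For comparison, here is a hedged sketch of the opposite ordering: keep B * C at the front and append A at the end, so that left-to-right evaluation already does the right thing without parentheses. Since A is then the "many" side on the right, the last match has to use group_right instead of group_left. This is an untested assumption about how the same metrics and labels would behave, not something verified against a real cluster.

```promql
# Hypothetical reordering, not tested against a real cluster:
# B * C is evaluated first, then matched with A on the "many" side.
  kube_node_role{role="control-plane"}
* on(node) group_left(pod)
  kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"}
* on(pod) group_right(node)
  node_filesystem_avail_bytes
```

The trade-off is readability: the metric whose values matter ends up last, which is the reverse of the SELECT-like ordering described above.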