Prometheus Vector Operations Are Not Associative

I was working on a PromQL query at work to find disk usage on Kubernetes control plane nodes - the first step was to identify those nodes:

kube_node_role{role="control-plane"}

...then to use vector matching operators to associate each control plane node with the node-exporter running on it:

kube_node_role{role="control-plane"} * on(node) group_left(pod) kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"}

...and finally, get the available filesystem space reported by each matching node-exporter, associated with the corresponding node:

node_filesystem_avail_bytes * on(pod) group_left(node) kube_node_role{role="control-plane"} * on(node) group_left(pod) kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"}

This got me the dreaded "many-to-many matching not allowed: matching labels must be unique on one side" error! This kinda surprised me. After all, I'd tested node_filesystem_avail_bytes and kube_node_role{role="control-plane"} * on(node) group_left(pod) kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"} in isolation and they both worked fine, and when I evaluated them in separate query panels, they yielded compatible label sets!

Here's the fixed version:

node_filesystem_avail_bytes * on(pod) group_left(node) (kube_node_role{role="control-plane"} * on(node) group_left(pod) kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"})

Do you see it?

I'm guessing you may have since I kinda gave it away with this post's title, but in case you haven't, here's a diff between the two queries after they've been formatted:

  node_filesystem_avail_bytes
    * on(pod) group_left(node)
+ (
    kube_node_role{role="control-plane"}
      * on(node) group_left(pod)
    kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"}
+ )

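So while the resulting value of a vector match operation may be associative (as far as I know - please reach out if you have counterexamples), the resulting label set is not! PromQL binary operators at the same precedence level are left-associative, so without the parentheses the first query is evaluated as (A * B) * C rather than A * (B * C).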
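
Writing out the parentheses that the broken query implicitly gets shows where it goes wrong. This is only a sketch of the grouping, based on PromQL's documented left-to-right handling of same-precedence binary operators. The explanation in the comments assumes kube_node_role has no per-node pod label of its own, which is what the group_left(pod) in the original query suggests.

```promql
# The broken query with its implicit grouping made explicit.
# Assumption: kube_node_role carries no per-node "pod" label, so under on(pod)
# every control-plane series ends up with the same match key. With more than
# one control-plane node, the "one" side is no longer unique and the inner
# match is rejected before kube_pod_info is ever consulted.
(
  node_filesystem_avail_bytes
    * on(pod) group_left(node)
  kube_node_role{role="control-plane"}
)
  * on(node) group_left(pod)
kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"}
```

In the fixed version, the inner match runs first and copies the pod label onto each kube_node_role series, which is what gives the outer on(pod) match something unique to join on.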

This is kind of my fault for starting with the B * C part of an A * B * C query and then adding the A part at the beginning - I'm not exactly sure why I did this! I think I might just favor group_left, or maybe I just like leading with the metric whose values I care about, followed by the "filter" part of the query, much like how a SQL SELECT statement is structured.
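
For what it's worth, building the query in B * C order and appending A at the end shouldn't need parentheses, because left-to-right evaluation then groups it as (B * C) * A. The sketch below is an untested rearrangement of the fixed query, not something from the original debugging session. It swaps group_left(node) for group_right(node) because the "many" side (node_filesystem_avail_bytes) now sits on the right, and it relies on the same label assumptions as the queries above.

```promql
# Untested sketch: same three selectors, reordered so that the default
# left-to-right grouping is the one we want.
kube_node_role{role="control-plane"}
  * on(node) group_left(pod)
kube_pod_info{created_by_name="node-exporter", namespace="kube-prometheus"}
  # The left-hand side now has one series per pod; copy its "node" label
  # onto the many filesystem series on the right.
  * on(pod) group_right(node)
node_filesystem_avail_bytes
```

The trade-off is readability: the metric whose values matter ends up last, which is the opposite of the SELECT-like ordering described above.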

Published on 2025-03-29