Skip to content

Feat/watch namespaces#1146

Open
wilfriedroset wants to merge 5 commits into
clastix:masterfrom
wilfriedroset:feat/watch-namespaces
Open

Feat/watch namespaces#1146
wilfriedroset wants to merge 5 commits into
clastix:masterfrom
wilfriedroset:feat/watch-namespaces

Conversation

@wilfriedroset

Copy link
Copy Markdown

This PR addresses #1145.

Full disclosure: I've used claude with opus 4.7 extensively to land the feature as I'm no golang expert. I've fully tested manually on a production cluster and self-review the changes

testcontainers-go v0.42 changed Container.Exec to return a nil stdout
reader when the exec call fails before producing any output (v0.41
always returned an empty, non-nil reader). Calling io.ReadAll on the
nil reader panics with a nil-pointer dereference, which was breaking
worker_kubeadm_join_test.go since the testcontainers bump.

Make the three Exec sites return early when err != nil — that is the
case in which stdout was nil and the panic occurred, and on a non-nil
err the exitCode and stdout values are not meaningful anyway. In the
two best-effort blocks (modprobe br_netfilter, swapoff -a) the
returned error is logged and the spec moves on; in the join command
block the error fails the It via Expect.

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>
Introduce an opt-in --watch-namespaces flag on both the controller and
the kubeconfig-generator. When set, the manager's cache only watches
namespaced informers in the listed namespaces via
controller-runtime's cache.Options.DefaultNamespaces. Cluster-scoped
resources (DataStore, ClusterRole, ...) are unaffected and stay
cluster-wide. When the flag is empty (default), Kamaji keeps its
current cluster-wide behaviour.

The new internal/manager package hosts three small helpers:

* NewCacheFunc returns a cache.NewCacheFunc with the resync period and
  the optional namespace scope applied.
* MergeWatchedNamespaces canonicalises the user-supplied list (trims,
  drops blanks, deduplicates while preserving first-seen order) and
  appends caller-supplied extras. Returns nil when the user list is
  empty so the cache stays cluster-wide.
* ValidateNamespaces fails fast on invalid identifiers (RFC 1123
  label) so misconfiguration surfaces at startup rather than as
  opaque watch errors.

Both binaries pass their install namespace through MergeWatchedNamespaces
extras so it is always part of the watch set: the controller needs it
for the migration Job watch (predicated on KamajiNamespace), and the
kubeconfig-generator includes it for symmetry to keep leader-election
working defensively.

Unit tests cover all three helpers plus the applyOptions transform end
to end, including the canonicalisation path through applyOptions.

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>
Thread the new --watch-namespaces flag through the Helm chart on both
the controller and the kubeconfig-generator deployment templates, with
a single values.yaml entry (watchNamespaces, defaulting to []) feeding
both. When the list is empty the flag is omitted so existing installs
keep their cluster-wide behaviour.

The kubeconfig-generator template also gains a POD_NAMESPACE env var
projected from metadata.namespace, mirroring the controller. The
operator code path that auto-includes the install namespace in the
watch set reads its own pod namespace from this env var, so without
the projection --watch-namespaces would silently drop the install
namespace for the kubeconfig-generator.

charts/kamaji/README.md regenerated via helm-docs.

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>
Add an Ordered Ginkgo describe that mutates the running kamaji
controller-manager Deployment to validate cache scoping end-to-end:

* the first It creates a TenantControlPlane in a watched namespace
  and asserts the operator reconciles it;
* the second It creates one in an unwatched namespace and asserts
  the operator never touches it (Consistently);
* an originalArgs snapshot + DeferCleanup restores the Deployment
  args even when an assertion aborts the run.

The describe is named with a zz_ prefix so Ginkgo executes it after
every other spec in the suite, minimising the blast radius.

Several test helpers were tuned for stability:

* findManagerDeployment further filters by the manager container's
  args[0] entrypoint, since the kubeconfig-generator deployment
  shares the same selector labels (both use kamaji.selectorLabels).
* waitForRollout strictly asserts len(pods.Items) == replicas with
  no DeletionTimestamp, so terminating pods from the previous
  generation cannot leak past the rollout (this also fixes a latent
  HaveLen(1) flake in PrintKamajiLogs in utils_test.go).
* waitForWebhookReady polls the kamaji-webhook-service
  EndpointSlices for at least one Ready endpoint.
* createTCPWithRetry retries the Create on transient
  webhook-unreachable errors. The container readiness probe targets
  healthz (8081), not the webhook port (9443), so EndpointSlice
  readiness alone does not guarantee the webhook server has bound
  yet, and kube-proxy may also lag by one informer tick.

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>
Document the new --watch-namespaces flag: what it scopes, what
remains cluster-wide, the install-namespace auto-include, and a
short Helm install example. Linked from the guides nav so it is
discoverable from the rendered site.

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>
@netlify

netlify Bot commented May 7, 2026

Copy link
Copy Markdown

Deploy Preview for kamaji-documentation ready!

Name Link
🔨 Latest commit 560c485
🔍 Latest deploy log https://app.netlify.com/projects/kamaji-documentation/deploys/69fc4ed8dc62ed0007fb1ebc
😎 Deploy Preview https://deploy-preview-1146--kamaji-documentation.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@prometherion

Copy link
Copy Markdown
Member

Full disclosure: I've used claude with opus 4.7 extensively to land the feature as I'm no golang expert

Thanks for being transparent.

Although we appreciate contributions, I have some remarks when merging external contributors code, mostly because it creates direct implications in maintaining such a feature.

I'm honestly neither convinced by the feature request since I see a potential security threat.

@wilfriedroset

Copy link
Copy Markdown
Author

I have some remarks when merging external contributors code, mostly because it creates direct implications in maintaining such a feature.

I would happily rework the PR to meet your standard and liking. would you be able to elaborate regarding your concerns?

I'm honestly neither convinced by the feature request

My usecase is that I'm one of the tenant of a shared k8s cluster. My platform team are focused on powering the platform itself. Deploying and maintaining kamaji for a single tenant is out of their scope. I want to deploy and maintain my instance of kamaji in my own namespace, hence scoping the namespace watched by the operator becomes handy.

I see a potential security threat.

I was under the impression that this opt-in feature was adopted by several operators.
I failed to understand this part and the associated tradeoffs. Does this security threat relates to kamaji itself or generally to the scoped watch? Do you mind detailing the security threat?

@prometherion

Copy link
Copy Markdown
Member

My concerns are regarding RBAC: although the changes introduce the ability to watch only for a specific subset of Namespaces, the resulting Kamaji instance still relies on a ClusterRole. This should be narrowed down and refactored to be a Role object, but again, this would create a problem when watching multiple Namespaces, since we would need to distribute Role and RoleBinding across all the watched Namespaces.

There are no traces of this support in the Helm templates: we could just ignore this and leverage only the ClusterRole, but that would mean an attacker could easily change the arguments of the Kamaji namespace-scoped instance and start disrupting others' Namespace installations, as well as exfiltrating Secrets, ConfigMaps, etc.

This is just the thread modelling for the Tenant Control Plane: there's also the Datastore API, which contains sensitive data and relies on the Cluster-scoped level. In a multi-tenant environment, you would need to grant access to the Datastore of that tenant; otherwise, data could be potentially leaked with a sophisticated attack, hard to reproduce, but still possible.

For such features, as maintainers, we prefer to have a proper discussion and a clear picture: as you can see, it's not just a matter of a simple feature, but it would require an important refactoring of the code-base and the installation of Kamaji itself, which would then be a burden on CLASTIX, which is maintaining the project and honours customers with (where possible) bug-free stable releases.

Last but not least, Issues are used to track bugs: features or enhancement proposals are preferred to be discussed in the Discussion section, or privately in case of sensitive use-cases.

@wilfriedroset

Copy link
Copy Markdown
Author

Last but not least, Issues are used to track bugs: features or enhancement proposals are preferred to be discussed in the Discussion section, or privately in case of sensitive use-cases.

Noted, and sorry for the wrong door! I had followed https://kamaji.clastix.io/guides/contribute/ and landed on issue #1145, but I see now that a Discussion would have been the better starting point. The PR was really meant as a "here is what it could look like" sketch rather than a request to merge as-is — happy to move the conversation wherever suits you best.

My concerns are regarding RBAC: although the changes introduce the ability to watch only for a specific subset of Namespaces, the resulting Kamaji instance still relies on a ClusterRole. This should be narrowed down and refactored to be a Role object, but again, this would create a problem when watching multiple Namespaces, since we would need to distribute Role and RoleBinding across all the watched Namespaces.

What about letting users bring their own RBAC via a chart value like rbac: false?
Defaults stay exactly as they are today, and admins who want a stricter posture can bind the existing ClusterRole with a RoleBinding scoped to one or more namespaces (which works fine for our use case).
WDYT?

There are no traces of this support in the Helm templates: we could just ignore this and leverage only the ClusterRole, but that would mean an attacker could easily change the arguments of the Kamaji namespace-scoped instance and start disrupting others' Namespace installations, as well as exfiltrating Secrets, ConfigMaps, etc.

I want to make sure I'm reading the threat model the same way you are: if an attacker can already mutate the operator's args, it seems they could also swap the image, add a malicious init container that inherits the pod's RBAC, mount extra volumes, etc.
At which point ClusterRole vs Role doesn't really change the outcome.
Is there a scenario you have in mind where args-only mutation is the realistic boundary?

This is just the thread modelling for the Tenant Control Plane: there's also the Datastore API, which contains sensitive data and relies on the Cluster-scoped level. In a multi-tenant environment, you would need to grant access to the Datastore of that tenant; otherwise, data could be potentially leaked with a sophisticated attack, hard to reproduce, but still possible.

I should have called this out upfront: I deliberately stayed clear of the cluster-scoped vs namespace-scoped DataStore question in this PR. A TCP in namespace A still references a cluster-scoped DataStore, so cross-tenant leakage stays structurally possible even with a perfectly scoped cache and RBAC.
That trade-off is fine for our setup, but I can see how it isn't for everyone. it feels like a related-but-separate topic that deserves its own thread (and probably a bigger PR).

For what it's worth, this is a commonly used pattern in the operator ecosystem: many projects ship a namespace-scoped CR alongside a cluster-scoped variant: cert-manager's Issuer / ClusterIssuer being an example. We can also see this pattern in a more recent operator with cnpg's ImangeCatalog / ClusterImageCatalog. Long term, what we'd love to be able to do is define our own DataStore resources directly inside our tenant namespaces, each binding to a Secret (which in our setup syncs PG credentials from our internal secret store). That would close the loop on multi-tenancy without requiring cluster-admin involvement for every new tenant. This is a topic for a separate discussion.


The PR description was probably overselling things. "narrow the watch cache" is a more honest framing than "namespace-scoped Kamaji", since the latter hints at guarantees this change does not actually provide. Would you be open to:

  1. reframing the change along those lines, and
  2. adding an rbac toggle in the chart so admins can opt out of the bundled RBAC?

@prometherion

Copy link
Copy Markdown
Member

What about letting users bring their own RBAC via a chart value like rbac: false? Defaults stay exactly as they are today, and admins who want a stricter posture can bind the existing ClusterRole with a RoleBinding scoped to one or more namespaces (which works fine for our use case). WDYT?

We would end up maintaining two different deployment strategies: a cluster-scoped one (the default), and one focused on single/multiple Namespaces. The latter, although optional, could raise issues with the security assessment we go through, and it's not acceptable to us. I'm not saying it's not possible, it would require time and efforts we're currently allocating to other features prioritized.

At which point ClusterRole vs Role doesn't really change the outcome.

That's why Kamaji is installed in its own Namespace: Tenants restricted to other Namespaces can't change it.

A TCP in namespace A still references a cluster-scoped DataStore, so cross-tenant leakage stays structurally possible even with a perfectly scoped cache and RBAC

I was referring to the privilege escalation mechanism in Kubernetes: you can't grant RBAC permissions that the user doesn't hold. To install Kamaji at the Tenant level, the Operator must have cluster-scoped permissions, which doesn't fit the multi-tenancy design in Kubernetes.

For what it's worth, this is a commonly used pattern in the operator ecosystem: many projects ship a namespace-scoped CR alongside a cluster-scoped variant: cert-manager's Issuer / ClusterIssuer being an example. We can also see this pattern in a more recent operator with cnpg's ImangeCatalog / ClusterImageCatalog. Long term, what we'd love to be able to do is define our own DataStore resources directly inside our tenant namespaces, each binding to a Secret (which in our setup syncs PG credentials from our internal secret store). That would close the loop on multi-tenancy without requiring cluster-admin involvement for every new tenant. This is a topic for a separate discussion.

That's an enhancement proposal; it can be put in our backlog. Kamaji has its own feature development map driven by customers: that's how we keep the project alive, getting financed directly from those benefiting from it.

  • reframing the change along those lines, and

Even addressing the changes, the better approach would be to have a local DataStore reference, but that would be highly impactful and would require a considerable amount of work. We're happy for contributions, but such a scope would be very massive and would require a very detailed review process. We would prefer addressing this on our own, being the ones maintaining the project for everybody, especially for our customers, who aren't looking for such a feature.

  • adding an rbac toggle in the chart so admins can opt out of the bundled RBAC?

This is a no-go for the security implications mentioned before.

@wilfriedroset

Copy link
Copy Markdown
Author

I understand all your points. Is there some options I did not identify/discuss here to continue the work on this feature/PR?

@prometherion

Copy link
Copy Markdown
Member

Such impactful changes are problematic since they burden the development activities, as happened with the NATS support, Gateway API, or other contributions that caused bug reports to our customers.

The development of these features requires financial support, either by contributing a dedicated maintainer to the project on behalf of the requester or through a formal feature request, development, and prioritisation process designed to ensure project maintainability and alignment with the expectations of commercial adopters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants