Doppler Secret Operator with GKE

Previously my cluster was running the Doppler Secret Operator with no issues. My cluster was set up with e2-medium machines and autoscaling, and it had scaled down to a single e2-medium machine. Today I decided to switch the cluster from autoscaling to a manual pool of 3 dedicated e2-micro machines. The rest of my cluster is functioning just fine; only the doppler-operator-controller-manager is failing, with the error: “Does not have minimum availability”. I do not believe it is a resource issue, as 3 e2-micro machines should be pretty closely comparable to one e2-medium machine.

When I inspect the workload I see two errors: “Pod errors: Unschedulable” and “Does not have minimum availability”. The logs contain various messages, some of which I’ll include here:

2023-07-13T18:00:45.964Z INFO controller-runtime.manager.controller.dopplersecret Stopping workers {"reconciler group": "secrets.doppler.com", "reconciler kind": "DopplerSecret"}

2023-07-13T18:00:41.972Z INFO controllers.DopplerSecret Reconciling dopplersecret {"dopplersecret": "doppler-operator-system/doppler-token"}

2023-07-13T18:00:41.972Z INFO controllers.DopplerSecret Requeue duration set {"dopplersecret": "doppler-operator-system/doppler-token", "requeueAfter": "1m0s"}

2023-07-13T18:00:41.978Z ERROR controllers.DopplerSecret Unable to update dopplersecret {"dopplersecret": "doppler-operator-system/doppler-token", "error": "Failed to load Doppler Token: Failed to fetch token secret reference: Secret \"doppler-token-secret\" not found"}

I did not make changes to the doppler-token or the Doppler Secret Operator configuration files. I suspect the issue is that the workload was not migrated to the new node pool, since it lives in a different namespace.

Hi @rrmangum!

Our Helm chart defines the following resource limits and requests for the controller manager:

resources:
  limits:
    cpu: 100m
    memory: 256Mi
  requests:
    cpu: 100m
    memory: 256Mi

It’s also configured to create 1 replica by default. You should be able to adjust these in your values.yaml file if you’re using one. If you are, can you post what you’re using currently?
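
For example, a minimal values.yaml override might look something like this. This is just a sketch: the exact key names can vary between chart versions, so double-check them against the chart’s own values.yaml (e.g. via helm show values) before relying on it.

# values.yaml (illustrative only; verify the keys for your chart version)
replicaCount: 1
resources:
  limits:
    cpu: 100m
    memory: 256Mi
  requests:
    cpu: 100m
    memory: 128Mi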

Regards,
-Joel

I used kubectl to apply the secret operator. This is my .yaml for the secret operator; the same file also includes the configuration for my application-specific containers. I have them separated with a triple dash, like this:

apiVersion: secrets.doppler.com/v1alpha1
kind: DopplerSecret
metadata:
  name: doppler-token
  namespace: doppler-operator-system
spec:
  tokenSecret: 
    name: doppler-token-secret
  managedSecret: 
    name: doppler-token
    namespace: default
---
# Application specific yaml configuration 

@rrmangum Could you try running this for me?

kubectl get secrets -n doppler-operator-system

Does it list the doppler-token-secret secret? The next thing to ensure is that the namespace under metadata is doppler-operator-system for all of your DopplerSecret CRDs: they must exist in the same namespace as the operator. The managedSecret can target other namespaces, but the DopplerSecret itself cannot.
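
You can also list the DopplerSecret resources across every namespace to see where each one currently lives:

kubectl get dopplersecrets --all-namespaces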

Could you also run the following commands and send me the output?

kubectl get pods -n doppler-operator-system
kubectl describe pod $NAME_OF_POD_FROM_PREVIOUS_COMMAND -n doppler-operator-system

Regards,
-Joel

kubectl get secrets -n doppler-operator-system

Returns “No resources found in doppler-operator-system namespace.” The doppler-token secret exists in the default namespace. Should it be in both namespaces? It was working before, though, and I didn’t change the location of the doppler-token secret.

I believe the only CRD I have is the one I sent you.

kubectl get pods -n doppler-operator-system

Returns:
NAME                                                  READY   STATUS    RESTARTS   AGE
doppler-operator-controller-manager-cf7d9d8f-4zdz4   0/2     Pending   0          95m

kubectl describe pod $NAME_OF_POD_FROM_PREVIOUS_COMMAND -n doppler-operator-system

Returns:

Name:             doppler-operator-controller-manager-cf7d9d8f-4zdz4
Namespace:        doppler-operator-system
Priority:         0
Service Account:  doppler-operator-controller-manager
Node:             <none>
Labels:           control-plane=controller-manager
                  pod-template-hash=cf7d9d8f
Annotations:      cloud.google.com/cluster_autoscaler_unhelpable_since: 2023-07-13T18:45:22+0000
                  cloud.google.com/cluster_autoscaler_unhelpable_until: Inf
Status:           Pending
IP:               
IPs:              <none>
Controlled By:    ReplicaSet/doppler-operator-controller-manager-cf7d9d8f
Containers:
  kube-rbac-proxy:
    Image:      gcr.io/kubebuilder/kube-rbac-proxy:v0.14.1
    Port:       8443/TCP
    Host Port:  0/TCP
    Args:
      --secure-listen-address=0.0.0.0:8443
      --upstream=http://127.0.0.1:8080/
      --logtostderr=true
      --v=10
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pglnv (ro)
  manager:
    Image:      dopplerhq/kubernetes-operator:1.2.7
    Port:       <none>
    Host Port:  <none>
    Command:
      /manager
    Args:
      --health-probe-bind-address=:8081
      --metrics-bind-address=127.0.0.1:8080
      --leader-elect
    Limits:
      cpu:     100m
      memory:  256Mi
    Requests:
      cpu:        100m
      memory:     256Mi
    Liveness:     http-get http://:8081/healthz delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:    http-get http://:8081/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pglnv (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  kube-api-access-pglnv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason             Age                  From                Message
  ----     ------             ----                 ----                -------
  Normal   NotTriggerScaleUp  77s (x572 over 96m)  cluster-autoscaler  pod didn't trigger scale-up:
  Warning  FailedScheduling   42s (x22 over 96m)   default-scheduler   0/3 nodes are available: 3 Insufficient memory. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod..

@rrmangum So, it looks like you have two potential problems. First, the Doppler operator pod isn’t being scheduled because there’s insufficient memory:

Warning  FailedScheduling   42s (x22 over 96m)   default-scheduler   0/3 nodes are available: 3 Insufficient memory. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod..

Our controller has a 256Mi memory request, so it seems like your new cluster doesn’t have any nodes with enough free memory to schedule it on.

Second, it sounds like the doppler-token-secret secret is in the wrong namespace. It should be in the doppler-operator-system namespace. Note that this is distinct from the doppler-token DopplerSecret CRD you described above: that CRD references a k8s secret containing the Doppler token used to fetch the secrets for the CRD entry. In our documentation, this secret is created with the name doppler-token-secret, and it should be created in the doppler-operator-system namespace.

It may currently exist in the default namespace for historical reasons, depending on when you originally installed the operator. In previous versions the token secret could live in other namespaces, but that was changed as a security update, and it now has to be in the same namespace as the operator (along with the DopplerSecret CRDs). As mentioned, the managedSecret in the CRD can be in any namespace, though.

Regards,
-Joel

I scaled up the cluster to 4 nodes and that had no effect. Each node provides ~500MB of memory, so adding one node should have been enough for the Doppler secret operator. I do not think it is a resourcing issue?

I added a namespace field for the tokenSecret in the configuration file and applied it to my cluster, trying to specify where the doppler-token-secret is located.

apiVersion: secrets.doppler.com/v1alpha1
kind: DopplerSecret
metadata:
  name: doppler-token
  namespace: doppler-operator-system
spec:
  tokenSecret: 
    name: doppler-token-secret
    namespace: doppler-operator-system
  managedSecret: 
    name: doppler-token
    namespace: default

That also had no effect; kubectl get secrets -n doppler-operator-system still returns “No resources found in doppler-operator-system namespace.” When I run kubectl get secrets -n default I see the doppler-token Kubernetes secret. How else can I move the doppler-token-secret?

@rrmangum You don’t need to add a namespace entry for the tokenSecret. That must be in the operator namespace, so it’s inferred. You need to actually create the secret in that namespace:

kubectl create secret generic doppler-token-secret \
  --namespace doppler-operator-system \
  --from-literal=serviceToken=dp.st.dev.XXXX

With regard to adding another node: that may or may not help, depending on what gets deployed on each node. If anything else is being deployed, it’s very possible that the required space isn’t available. What’s needed is 256Mi of free memory on a single node (i.e., if each node provides 1GB of memory and 900MB are used on each, having 5 nodes means you have 500MB free across the cluster, but only 100MB free on any given node). Try increasing the size of the nodes in your cluster rather than increasing the number of nodes and see if that helps.
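
One way to see how much room is actually left on each node is to compare allocatable memory against what’s already been requested, for example:

# Allocatable memory per node
kubectl get nodes -o custom-columns=NAME:.metadata.name,MEMORY:.status.allocatable.memory

# Per-node summary of resources already requested by scheduled pods
kubectl describe nodes | grep -A 8 "Allocated resources"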

So I created the secret in the doppler-operator-system namespace and migrated workloads to a pool with e2-small machines. The secret operator started functioning correctly.

As a test, I deleted the doppler-token-secret from the doppler-operator-system namespace, and the secret operator is still functioning correctly.

The cluster is working as intended now, thanks for the help!

@watsonian

Bringing this back because I clearly didn’t understand what I was doing! LOL

At this point I have created all the secrets necessary and applied them with my CRDs. I also added the annotation tag to my deployment CRD.
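
For reference, the annotation I added to the Deployment is the reload one from the Doppler docs; roughly (assuming I read the docs right):

metadata:
  annotations:
    secrets.doppler.com/reload: 'true'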

When I run kubectl describe dopplersecrets -n doppler-operator-system I can see the controller is running with messages like “Controller is continuously syncing secrets” and “Controller is ready to reload deployments. 2 found.”

However my application is failing to get my secrets. Any ideas?
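
For what it’s worth, the application containers reference the managed secret roughly like this (a sketch of my Deployment, using the standard envFrom/secretRef pattern):

envFrom:
  - secretRef:
      name: doppler-token   # the managed secret in the default namespace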

I can also see my secret names using kubectl describe secret doppler-token. It just shows the size of each secret instead of its actual value.
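
(I realize kubectl describe never prints the values themselves; to inspect one I can decode it manually with something like this, where MY_KEY is a placeholder for one of the keys it lists:)

# MY_KEY is a placeholder for one of the keys shown by kubectl describe
kubectl get secret doppler-token -o jsonpath='{.data.MY_KEY}' | base64 -d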

I found out that backend secret retrieval is working. I have two secrets on the frontend that are both undefined. Did I miss a configuration step?

@rrmangum Sorry for the delay getting back to you! Lost track of this thread.

It’s probably a misconfiguration of some kind. Could you post both the working and the broken DopplerSecret CRDs?