Details
Type: Bug
Status: Open
Priority: High
Resolution: Unresolved
Description
Kubernetes cluster: EKS v1.21
Percona Operator: v1.14
- Using the cr.yaml deployment from the example
- Using the secrets and RBAC from the example as well:
  https://github.com/percona/percona-server-mongodb-operator/blob/v1.14.0/deploy/rbac.yaml
  https://github.com/percona/percona-server-mongodb-operator/blob/v1.14.0/deploy/secrets.yaml
- Added the LoadBalancer expose type and Service annotations below; the Services are created properly (x3):
expose:
  exposeType: LoadBalancer
  servicePerPod: true
  serviceAnnotations:
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
The cluster behaves well: we don't see any issues with the replica sets or any other pods.
However, the Operator logs show failures connecting to Mongo, specifically:
failed to start balancer: failed to get mongos connection: ping mongo
At first, the logged errors contain the LoadBalancer details, for example:
{ Addr: aab9156aff40940058d1faa56edc4ff3-2f77aba0fe4711b7.elb.us-east-1.amazonaws.com:27017, Type: Unknown, Last error: dial tcp: lookup aab9156aff40940058d1faa56edc4ff3-2f77aba0fe4711b7.elb.us-east-1.amazonaws.com on 172.20.0.10:53: no such host }
2023-05-08T09:18:32.598Z INFO Cluster state changed {"controller": "psmdb-controller", "object": {"name":"my-cluster-name","namespace":"mongodb-raw-cluster"}, "namespace": "mongodb-raw-cluster", "name": "my-cluster-name", "reconcileID": "1c503b62-1d13-497b-a690-d59bdba01270", "previous": "ready", "current": "initializing"}
2023-05-08T09:18:32.624Z ERROR Reconciler error {"controller": "psmdb-controller", "object": {"name":"my-cluster-name","namespace":"mongodb-raw-cluster"}, "namespace": "mongodb-raw-cluster", "name": "my-cluster-name", "reconcileID": "1c503b62-1d13-497b-a690-d59bdba01270", "error": "failed to start balancer: failed to get mongos connection: ping mongo: server selection error: context deadline exceeded, current topology: { Type: Unknown, Servers: [{ Addr: adaf7f7e941244ad89941ef3931abdb6-5f9477407cc2fa7f.elb.us-east-1.amazonaws.com:27017, Type: Unknown, Last error: dial tcp: lookup adaf7f7e941244ad89941ef3931abdb6-5f9477407cc2fa7f.elb.us-east-1.amazonaws.com on 172.20.0.10:53: no such host }, { Addr: aab9156aff40940058d1faa56edc4ff3-2f77aba0fe4711b7.elb.us-east-1.amazonaws.com:27017, Type: Unknown, Last error: dial tcp: lookup aab9156aff40940058d1faa56edc4ff3-2f77aba0fe4711b7.elb.us-east-1.amazonaws.com on 172.20.0.10:53: no such host }, { Addr: a4aaba90ff28142c399f01e118923726-a9c2f358ccf57032.elb.us-east-1.amazonaws.com:27017, Type: Unknown, Last error: dial tcp: lookup a4aaba90ff28142c399f01e118923726-a9c2f358ccf57032.elb.us-east-1.amazonaws.com on 172.20.0.10:53: no such host }, ] }", "errorVerbose": "server selection error: context deadline exceeded, current topology: { Type: Unknown, Servers: [{ Addr: adaf7f7e941244ad89941ef3931abdb6-5f9477407cc2fa7f.elb.us-east-1.amazonaws.com:27017, Type: Unknown, Last error: dial tcp: lookup adaf7f7e941244ad89941ef3931abdb6-5f9477407cc2fa7f.elb.us-east-1.amazonaws.com on 172.20.0.10:53: no such host }, { Addr: aab9156aff40940058d1faa56edc4ff3-2f77aba0fe4711b7.elb.us-east-1.amazonaws.com:27017, Type: Unknown, Last error: dial tcp: lookup aab9156aff40940058d1faa56edc4ff3-2f77aba0fe4711b7.elb.us-east-1.amazonaws.com on 172.20.0.10:53: no such host }, { Addr: a4aaba90ff28142c399f01e118923726-a9c2f358ccf57032.elb.us-east-1.amazonaws.com:27017, Type: Unknown, Last error: dial tcp: lookup a4aaba90ff28142c399f01e118923726-a9c2f358ccf57032.elb.us-east-1.amazonaws.com on 172.20.0.10:53: no such host }, ] }\nping mongo\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo.Dial\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo/mongo.go:64\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb.MongosClient\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/psmdb/client.go:70\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).mongosClientWithRole\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/connections.go:30\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).enableBalancerIfNeeded\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/balancer.go:78\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:507\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594\nfailed to get mongos connection\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).enableBalancerIfNeeded\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/balancer.go:80\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:507\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594\nfailed to start balancer\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:508\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/contr[email protected]/pkg/internal/controller/controller.go:235
Later, the error changed to this:
failed to start balancer: failed to get mongos connection: ping mongo: timed out while checking out a connection from connection pool: context deadline exceeded; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 1
2023-05-08T09:21:45.643Z ERROR Reconciler error {"controller": "psmdb-controller", "object": {"name":"my-cluster-name","namespace":"mongodb-raw-cluster"}, "namespace": "mongodb-raw-cluster", "name": "my-cluster-name", "reconcileID": "f0b875e7-03f0-4163-a8d4-c7545cf31fa2", "error": "failed to start balancer: failed to get mongos connection: ping mongo: timed out while checking out a connection from connection pool: context deadline exceeded; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 1", "errorVerbose": "timed out while checking out a connection from connection pool: context deadline exceeded; maxPoolSize: 100, connections in use by cursors: 0, connections in use by transactions: 0, connections in use by other operations: 1\nping mongo\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo.Dial\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo/mongo.go:64\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb.MongosClient\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/psmdb/client.go:70\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).mongosClientWithRole\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/connections.go:30\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).enableBalancerIfNeeded\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/balancer.go:78\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:507\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594\nfailed to get mongos connection\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).enableBalancerIfNeeded\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/balancer.go:80\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:507\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594\nfailed to start balancer\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:508\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /go/pkg/mod/sigs.k8s.io/contr[email protected]/pkg/internal/controller/controller.go:235
I'm not sure what is wrong here, but from the code it seems that the Operator tries to connect through the LoadBalancers. That looks wrong, since the Operator runs inside the cluster and does not need to leave the cluster to ping mongos.
https://github.com/percona/percona-server-mongodb-operator/blob/00a07f560de62c871b7e3f891a82f9814177c25c/pkg/psmdb/service.go#L320-L332
Commit: https://github.com/percona/percona-server-mongodb-operator/pull/862
Ticket: https://jira.percona.com/browse/K8SPSMDB-599
It seems that the part that sets the host to the LoadBalancer endpoint should be omitted:
    if mongos := cr.Spec.Sharding.Mongos; mongos.Expose.ExposeType == corev1.ServiceTypeLoadBalancer {
        for _, i := range svc.Status.LoadBalancer.Ingress {
            host = i.IP
            if len(i.Hostname) > 0 {
                host = i.Hostname
            }
        }
    } else {
        host = svc.Name + "." + cr.Namespace + "." + cr.Spec.ClusterServiceDNSSuffix
    }
    return host, nil
}
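To make the difference between the two branches concrete, here is a small standalone sketch. It is not the Operator's code: the ingress type, the function names externalHost and internalHost, the Service name my-cluster-name-mongos-0 and the DNS suffix svc.cluster.local are all assumptions made for illustration. It only mirrors the host selection quoted above, so the LoadBalancer branch can be compared with the in-cluster name that the else branch would produce.

```go
package main

import "fmt"

// ingress mirrors only the two fields of a load balancer ingress entry
// that the quoted snippet reads. (Illustrative type, not corev1.)
type ingress struct {
	IP       string
	Hostname string
}

// externalHost follows the LoadBalancer branch of the quoted snippet:
// the last ingress entry wins, and a hostname is preferred over an IP.
func externalHost(ingresses []ingress) string {
	host := ""
	for _, i := range ingresses {
		host = i.IP
		if len(i.Hostname) > 0 {
			host = i.Hostname
		}
	}
	return host
}

// internalHost follows the else branch: the in-cluster DNS name of the Service.
func internalHost(svcName, namespace, dnsSuffix string) string {
	return svcName + "." + namespace + "." + dnsSuffix
}

func main() {
	// Hostname taken from the log above; Service name and suffix are assumed.
	lb := []ingress{{Hostname: "aab9156aff40940058d1faa56edc4ff3-2f77aba0fe4711b7.elb.us-east-1.amazonaws.com"}}
	fmt.Println("LoadBalancer branch:", externalHost(lb))
	fmt.Println("in-cluster branch:  ", internalHost("my-cluster-name-mongos-0", "mongodb-raw-cluster", "svc.cluster.local"))
}
```

If the Operator always used the second form for its own connections, it would not depend on the ELB hostnames resolving from inside the cluster. Whether that is acceptable for all expose types is for the maintainers to judge.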
Workaround
The workaround we found is to set the expose type to ClusterIP and create the LoadBalancer Services ourselves, outside the Operator, which eliminates the error logs.
In both cases, the cluster seems to be functioning well.
For example:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
  labels:
    app.kubernetes.io/component: mongos
    app.kubernetes.io/instance: my-cluster-name
    statefulset.kubernetes.io/pod-name: my-cluster-name-mongos-0
  name: my-cluster-name-lb-mongos-0
  namespace: mongodb-raw-cluster
spec:
  ports:
  - name: mongodb
    port: 27017
  selector:
    statefulset.kubernetes.io/pod-name: my-cluster-name-mongos-0
  type: LoadBalancer
I would be happy to get feedback on this.