Data Management / DM-20443

Use NodePort approach for enabling external communication to a Kafka cluster

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      A common approach for enabling external communication to Kafka deployed on k8s is to place a load balancer service in front of each Kafka broker. That provides a simple yet secure solution using SSL termination.

      That is the approach we currently use in our EFD deployment at GKE.
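As an illustration, a per-broker LoadBalancer Service of that kind might look like the sketch below. The name, labels, and port are assumptions for illustration, not taken from the actual deployment; the `statefulset.kubernetes.io/pod-name` label is what pins the Service to a single broker pod.

```yaml
# Hypothetical external Service for broker 0; one of these per broker.
apiVersion: v1
kind: Service
metadata:
  name: cp-kafka-0-external
spec:
  type: LoadBalancer
  selector:
    app: cp-kafka
    # pin this Service to exactly one broker pod of the StatefulSet
    statefulset.kubernetes.io/pod-name: cp-kafka-0
  ports:
  - name: external
    port: 9094
    targetPort: 9094
    protocol: TCP
```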

      For a single-machine deployment using k3s ("kubes") we are limited to a single external IP address, so that approach does not work. Ingress cannot be used here to route traffic to each broker because Kafka uses its own binary protocol rather than HTTP.

      In this ticket we'll investigate the NodePort approach for enabling external communication to the Kafka cluster.

      We will test this approach on both k3s and GKE deployments. In the end, my assumption in DM-19745 that we could have n>1 brokers behind one load balancer service may be right, if done correctly.


            Activity

            Angelo Fausti added a comment -

            Using Kafkacat to test the installation https://github.com/lsst-sqre/sqr-031/pull/3

            Angelo Fausti added a comment - edited

            In the DM-19745 implementation, for the k3s deployment we see the error below when reaching the Kafka cluster:

            kafkacat -P -b test-efd0.lsst.codes:9094 -t test_topic
            Hello EFD!
            ^D
             
            % ERROR: Local: Broker transport failure: test-efd0.lsst.codes:9094/bootstrap:Failed to resolve test-efd1.lsst.codes
            

            In this scenario we cannot have one LoadBalancer per broker, because we are limited to a single external IP.

            If we use the NodePort approach and configure KAFKA_ADVERTISED_LISTENERS properly we can still have one LoadBalancer for the cluster and n>1 brokers. Here is the rationale for that:

            From [1] "... when you run a client, the broker you pass to it is just where it’s going to go and get the metadata about brokers in the cluster from. The actual host & IP that it will connect to for reading/writing data is based on the data that the broker passes back in that initial connection—even if it’s just a single node and the broker returned is the same as the one connected to."

            [1] Kafka Listeners - Explained
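What that implies for the broker configuration: each broker keeps an internal listener for in-cluster traffic and advertises the single external hostname with its own unique NodePort on the external listener. A minimal sketch for broker 0 using the Confluent image's listener environment variables — the hostname and port follow the examples in this ticket, while the listener names and internal DNS name are assumptions:

```yaml
# Hypothetical listener settings for broker 0 (brokers 1 and 2 would
# advertise ports 31091 and 31092 respectively). A NodePort Service
# would then map nodePort 31090 to targetPort 9094.
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: "INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT"
KAFKA_LISTENERS: "INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9094"
KAFKA_ADVERTISED_LISTENERS: "INTERNAL://cp-kafka-0.cp-kafka-headless:9092,EXTERNAL://test-efd0.lsst.codes:31090"
KAFKA_INTER_BROKER_LISTENER_NAME: "INTERNAL"
```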

            Angelo Fausti added a comment - edited

            For the NodePort approach, you can create services like this:

             
            [afausti@ts-csc-01 ~]$ cat confluent-0-nodeport.yaml
            apiVersion: v1
            kind: Service
            metadata:
              name: confluent-0-nodeport
            spec:
              selector:
                app: cp-kafka
              type: NodePort
              ports:
              - name: cp-kafka-0
                port: 9092
                targetPort: 9092
                nodePort: 31090
                protocol: TCP
            

            to reach each broker on ports 31090, 31091 and 31092.
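The services for brokers 1 and 2 differ only in name and nodePort. A small shell sketch to generate all three manifests — file and service names mirror the example above; note that with the shared `app: cp-kafka` selector any broker pod can answer on any of the three NodePorts, so a per-pod selector would be needed to strictly pin each port to one broker:

```shell
# Sketch: generate one NodePort Service manifest per broker (0-2),
# mirroring confluent-0-nodeport.yaml above.
for i in 0 1 2; do
  cat > "confluent-${i}-nodeport.yaml" <<EOF
apiVersion: v1
kind: Service
metadata:
  name: confluent-${i}-nodeport
spec:
  selector:
    app: cp-kafka
  type: NodePort
  ports:
  - name: cp-kafka-${i}
    port: 9092
    targetPort: 9092
    nodePort: $((31090 + i))
    protocol: TCP
EOF
done
# list the generated manifests, ready for kubectl apply -f
ls confluent-*-nodeport.yaml
```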

            $ kafkacat -L -b test-efd0.lsst.codes:31090
            Metadata for all topics (from broker 0: 140.252.32.142:31090/0):
             3 brokers:
              broker 0 at 140.252.32.142:31090
              broker 2 at 140.252.32.142:31092
              broker 1 at 140.252.32.142:31091 (controller)
             125 topics
            

            Now the following works fine:

            kafkacat -P -b test-efd0.lsst.codes:31090 -t test_topic
            Hello EFD!
            ^D
            

            Enabling debug, one can see the actual broker used for sending the message:

            kafkacat -P -b test-efd0.lsst.codes:31090 -t test_topic -d broker
            

            which was broker 1 in my test. So this test shows the NodePort approach working with one LoadBalancer.

            Angelo Fausti added a comment -

            There's been some recent work on cp-helm-charts enabling NodePort for external access on k8s, notably:

            • https://github.com/confluentinc/cp-helm-charts/commit/2f64c115feb6c5021b6819f7447f711fb57cd36e
            • https://github.com/confluentinc/cp-helm-charts/commit/e0e1636974527a8cc6cf71a73ff225a5bacdda54

            So I synced our cp-helm-charts fork with upstream release v5.2.2 to test the new features:

            • PR to cp-helm-charts here

            and changed terraform-efd and terraform-efd-gke accordingly:

            • PRs to terraform-efd and terraform-efd-gke
            Angelo Fausti added a comment -

            Added secrets for deploying the EFD test-gke instance to Vault:

            $ vault kv list secret/dm/square/efd/test-gke
            Keys
            ----
            github
            grafana_oauth
            influxdb_admin
            influxdb_telegraf
            prometheus_oauth
            tls
            

            Angelo Fausti added a comment -

            The conclusion for this ticket is that we can use the NodePort approach when deploying the cp-helm-charts to Kubes (k3s) with just one load balancer, and use three load balancers (one per Kafka broker) when deploying to GKE or to a full k8s cluster. The difference is reflected in the chart configuration.
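As a sketch of that configuration difference for the k3s case — the value keys below are assumptions about the cp-helm-charts NodePort support referenced above and should be checked against the chart's values.yaml:

```yaml
# Hypothetical values fragment for the Kubes (k3s) deployment:
# expose brokers via NodePort behind the single external IP.
cp-kafka:
  nodeport:
    enabled: true
    firstListenerPort: 31090   # broker i is advertised at <host>:(31090 + i)
```

The GKE deployment would instead keep NodePort disabled and configure one external LoadBalancer per broker, as in the current EFD setup.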


              People

              • Assignee: Angelo Fausti
              • Reporter: Angelo Fausti
              • Watchers: Angelo Fausti
