# Use NodePort approach for enabling external communication to a Kafka cluster


## Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s: None
• Labels: None
• Story Points: 4.2
• Team: SQuaRE

## Description

A common approach for enabling external communication to Kafka deployed on Kubernetes is to place a load balancer service in front of each Kafka broker. That provides a simple but still secure solution using SSL termination.

That is the approach we currently use in our EFD deployment on GKE.
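For illustration, that per-broker setup can be sketched as one Service per broker (the name, labels, and port here are hypothetical, not the actual EFD manifests):

```yaml
# Hypothetical sketch of a per-broker LoadBalancer Service.
# One such Service per broker gives each broker its own external IP.
apiVersion: v1
kind: Service
metadata:
  name: cp-kafka-0-loadbalancer
spec:
  type: LoadBalancer
  selector:
    # Pin the Service to a single broker pod of the StatefulSet
    statefulset.kubernetes.io/pod-name: cp-kafka-0
  ports:
  - name: external
    port: 9094
    targetPort: 9094
    protocol: TCP
```

Because each broker is individually addressable from outside, the advertised listeners can simply point at each Service's external address.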

For a single-machine deployment using k3s ("kubes") we are limited to a single external IP address, so that approach does not work. An Ingress cannot be used here to route the traffic to each broker, because Kafka speaks its own binary protocol rather than HTTP.

In this ticket we'll investigate the NodePort approach for enabling external communication to the Kafka cluster.

We will test this approach on both k3s and GKE deployments. In the end, my assumption in DM-19745 that we could have n>1 brokers behind one load balancer service may turn out to be right if configured correctly.

## Activity

Angelo Fausti added a comment -

Using kafkacat to test the installation: https://github.com/lsst-sqre/sqr-031/pull/3

Angelo Fausti added a comment - edited

In the DM-19745 implementation, for the k3s deployment, we see the error below when reaching the Kafka cluster:

```
$ kafkacat -P -b test-efd0.lsst.codes:9094 -t test_topic
Hello EFD!
^D
% ERROR: Local: Broker transport failure: test-efd0.lsst.codes:9094/bootstrap: Failed to resolve test-efd1.lsst.codes
```

In this scenario we cannot have one LoadBalancer per broker, because we are limited to a single external IP.

If we use the NodePort approach and configure KAFKA_ADVERTISED_LISTENERS properly, we can still have one LoadBalancer for the whole cluster and n>1 brokers. Here is the rationale:

From [1] "... when you run a client, the broker you pass to it is just where it’s going to go and get the metadata about brokers in the cluster from. The actual host & IP that it will connect to for reading/writing data is based on the data that the broker passes back in that initial connection—even if it’s just a single node and the broker returned is the same as the one connected to."

[1] Kafka Listeners - Explained
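Concretely, this is why the advertised listeners matter: each broker can advertise the single external hostname together with its own NodePort, so the metadata clients receive is reachable from outside the cluster. A sketch with hypothetical hostnames and internal service names (not the actual chart configuration):

```yaml
# Hypothetical per-broker advertised listeners: one external hostname,
# a distinct NodePort per broker, plus the internal listener.
brokers:
  cp-kafka-0:
    KAFKA_ADVERTISED_LISTENERS: EXTERNAL://test-efd0.lsst.codes:31090,PLAINTEXT://cp-kafka-0.cp-kafka-headless:9092
  cp-kafka-1:
    KAFKA_ADVERTISED_LISTENERS: EXTERNAL://test-efd0.lsst.codes:31091,PLAINTEXT://cp-kafka-1.cp-kafka-headless:9092
  cp-kafka-2:
    KAFKA_ADVERTISED_LISTENERS: EXTERNAL://test-efd0.lsst.codes:31092,PLAINTEXT://cp-kafka-2.cp-kafka-headless:9092
```

A client bootstrapping against any one address then learns externally resolvable addresses for all three brokers.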
Angelo Fausti added a comment - edited

For the NodePort approach, you can create services like this:

```yaml
# confluent-0-nodeport.yaml
apiVersion: v1
kind: Service
metadata:
  name: confluent-0-nodeport
spec:
  selector:
    app: cp-kafka
  type: NodePort
  ports:
  - name: cp-kafka-0
    port: 9092
    targetPort: 9092
    nodePort: 31090
    protocol: TCP
```

These services reach each broker on ports 31090, 31091 and 31092:

```
$ kafkacat -L -b test-efd0.lsst.codes:31090
Metadata for all topics (from broker 0: 140.252.32.142:31090/0):
 3 brokers:
  broker 0 at 140.252.32.142:31090
  broker 2 at 140.252.32.142:31092
  broker 1 at 140.252.32.142:31091 (controller)
 125 topics
```

Now the following works fine:

```
$ kafkacat -P -b test-efd0.lsst.codes:31090 -t test_topic
Hello EFD!
^D
```

Enabling debug output, one can see the actual broker used for sending the message:

```
$ kafkacat -P -b test-efd0.lsst.codes:31090 -t test_topic -d broker
```

which was broker 1 in my test. So this test shows the NodePort approach working with a single LoadBalancer.

Angelo Fausti added a comment -

There's been some recent work on cp-helm-charts enabling NodePort for external access on k8s, notably:

• https://github.com/confluentinc/cp-helm-charts/commit/2f64c115feb6c5021b6819f7447f711fb57cd36e
• https://github.com/confluentinc/cp-helm-charts/commit/e0e1636974527a8cc6cf71a73ff225a5bacdda54

So I synced our cp-helm-charts fork with upstream release v5.2.2 to test the new features:

• PR to cp-helm-charts here

and changed terraform-efd and terraform-efd-gke accordingly:

• PRs to terraform-efd and terraform-efd-gke
Angelo Fausti added a comment -

Added secrets for deploying the EFD test-gke instance to Vault:

```
$ vault kv list secret/dm/square/efd/test-gke
Keys
----
github
grafana_oauth
influxdb_admin
influxdb_telegraf
prometheus_oauth
tls (env)
```
Angelo Fausti added a comment -

The conclusion of this ticket is that we can use the NodePort approach with just one load balancer when deploying the cp-helm-charts to Kubes (k3s), and use three load balancers (one per Kafka broker) when deploying to GKE or another full k8s cluster. The difference is reflected in the chart configuration.
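As a sketch of how the two deployments could differ in the Helm values (the key names here are illustrative, not the chart's real schema; check the chart's values.yaml for the actual options):

```yaml
# Hypothetical Helm values sketch; key names are illustrative.
# k3s ("kubes"): single external IP, so expose brokers via NodePort
cp-kafka:
  nodeport:
    enabled: true
    firstListenerPort: 31090   # brokers get 31090, 31091, 31092

# GKE: one LoadBalancer per broker instead
# cp-kafka:
#   loadBalancer:
#     enabled: true
#     domain: lsst.codes
```

The chart values select the exposure mechanism; the rest of the deployment stays the same.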


## People

• Assignee: Angelo Fausti
• Reporter: Angelo Fausti
• Watchers: Angelo Fausti