DM-27550

How to fix Kafka when there is a mismatch between the schema IDs in the Schema Registry and those in the messages, or how to avoid it.

    Details

    • Type: Story
    • Status: To Do
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      This ticket describes a failure mode in the EFD that requires more investigation to understand what the proper fix should be.

      We lost the kube04 node in the NTS cluster (it had to be cordoned). One of the brokers (broker-1) was running there. However, the broker-1 pod could not be terminated, and I wasn't able to reschedule it to another node. The only apparent solution was to redeploy Kafka (perhaps that was not really necessary, but I didn't see any other path forward at that moment).

      [In retrospect, I could perhaps have used {{kubectl delete pods <pod> --grace-period=0 --force}}.]

      The problem was that redeploying Kafka removed all schemas from the Schema Registry, but there were still messages in the persistent volumes of broker-0 and broker-2. These messages had the old schema IDs recorded in their first bytes.
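
      To make "recorded in their first bytes" concrete, here is a minimal sketch of reading the schema ID from a message payload. It assumes the messages use the standard Confluent wire format (one magic byte equal to 0, then a 4-byte big-endian schema ID, then the Avro body). The function name {{read_schema_id}} and the sample payload are hypothetical, for illustration only.

```python
import struct

MAGIC_BYTE = 0  # assumption: Confluent wire format marker


def read_schema_id(payload: bytes) -> int:
    """Return the schema ID stored in the 5-byte header of a serialized message."""
    if len(payload) < 5:
        raise ValueError("payload too short to contain a schema ID header")
    # ">bI" = big-endian, 1 signed byte (magic) + 4-byte unsigned int (schema ID)
    magic, schema_id = struct.unpack(">bI", payload[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id


# Hypothetical message: header for schema ID 42 followed by a placeholder body.
example = struct.pack(">bI", MAGIC_BYTE, 42) + b"avro-body"
print(read_schema_id(example))
```

      Running something like this over the messages still held by broker-0 and broker-2 would list the old schema IDs that the redeployed registry is expected to resolve.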

      [I didn't expect the schemas to be removed. They should have been preserved in the {{_schemas}} internal topic, since this topic is replicated across the three brokers. Not sure what happened here.]

      After Kafka was redeployed, the producers registered the topic schemas in the Schema Registry again. However, there's no guarantee that a schema gets the same ID it had before, so the IDs stored in old messages may now point to a different schema or to none at all. That explains the schema ID mismatch when trying to deserialize the old messages, and it leaves the cluster in an inconsistent state.
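
      The failure mode can be illustrated with a toy model. This is only a sketch: {{ToyRegistry}} is a hypothetical in-memory stand-in, not the real Schema Registry, and it assumes IDs are handed out sequentially in registration order. The topic names are made up, and each topic name stands in for its schema.

```python
class ToyRegistry:
    """In-memory stand-in for a schema registry (not the real service)."""

    def __init__(self):
        self._by_id = {}
        self._next_id = 1

    def register(self, subject: str) -> int:
        # Assumption: IDs are assigned sequentially in registration order.
        schema_id = self._next_id
        self._by_id[schema_id] = subject
        self._next_id += 1
        return schema_id

    def lookup(self, schema_id: int):
        # Returns None when the ID is unknown to this registry.
        return self._by_id.get(schema_id)


def find_mismatches(old_messages, registry):
    """Return (topic, schema_id, resolved) for IDs that no longer map to the topic's schema."""
    mismatches = []
    for topic, schema_id in old_messages:
        resolved = registry.lookup(schema_id)
        if resolved != topic:
            mismatches.append((topic, schema_id, resolved))
    return mismatches


# Before the redeploy: producers register in one order.
before = ToyRegistry()
old_ids = {t: before.register(t) for t in ["topic-a", "topic-b", "topic-c"]}
old_messages = list(old_ids.items())  # (topic, schema ID stored in the message)

# After the redeploy: empty registry, producers come back in a different order.
after = ToyRegistry()
for t in ["topic-c", "topic-a"]:
    after.register(t)

for mismatch in find_mismatches(old_messages, after):
    print(mismatch)
```

      In this toy run every old message is affected: two IDs now resolve to the wrong schema and one resolves to nothing, which matches the two kinds of deserialization failure described above.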

            People

            Assignee:
            afausti Angelo Fausti
            Reporter:
            afausti Angelo Fausti