Skip to main content

Posts

Showing posts with the label debug

Deal with corrupted messages in Apache Kafka

Under some strange circumstances, it can happen that a message in a Kafka topic is corrupted. This often happens when using 3rd party frameworks with Kafka. In addition, Kafka < 0.9 does not have a lock on Log.read() at the consumer read level, but does have a lock on Log.write(). This can lead to a rare race condition as described in KAKFA-2477 [1]. A likely log entry looks like this: ERROR Error processing message, stopping consumer: (kafka.tools.ConsoleConsumer$) kafka.message.InvalidMessageException: Message is corrupt (stored crc = xxxxxxxxxx, computed crc = yyyyyyyyyy Kafka-Tools Kafka stores the offset of each consumer in Zookeeper. To read the offsets, Kafka provides handy tools [2]. But you can also use zkCli.sh, at least to display the consumer and the stored offsets. First we need to find the consumer for a topic (> Kafka 0.9): bin/kafka-consumer-groups.sh --zookeeper management01:2181 --describe --group test Prior to Kafka 0.9, the only way to get this in...

HDFS debugging scenario

The first step to debug issues in a running hadoop - environment to take a look at the stacktraces, easy accessible over jobtracker/stacks and let you show all running stacks in a jstack view. You will see the running processes, as an example I discuss a lab testing scenario, see below. http://jobtracker:50030/stacks Process Thread Dump:  43 active threads Thread 3203101 (IPC Client (47) connection to NAMENODE/IP:9000 from hdfs):   State: TIMED_WAITING   Blocked count: 6   Waited count: 7   Stack:     java.lang.Object.wait(Native Method)     org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:676)     org.apache.hadoop.ipc.Client$Connection.run(Client.java:719) In that case the RPC connection has a state "TIMED_WAIT" in a block and waited count. That means, the namenode could not answer the RPC request fast enough. The problem belongs the setup as I see often in production environments. For demonstration...