You can troubleshoot and trace errors that occur with WSO2 Message Broker in a given environment by using the methods described below.
Debugging
The following table provides descriptions of the important classes in WSO2 MB that will be useful when you debug a session.
Category | Class | Description |
---|---|---|
Inbound | org.wso2.andes.kernel.disruptor.inbound.InboundEventManager | All inbound events (e.g. message arrival, subscription add/close events) are handled through this class. |
Inbound | org.wso2.andes.kernel.disruptor.inbound.MessagePreProcessor | The incoming message goes through this processor first, where its message ID and destination data are populated to ensure the message order closest to the message arrival time. |
Inbound | org.wso2.andes.kernel.disruptor.inbound.ContentChunkHandler | This processor will take the message content chunks, convert them to the andes core chunk size and delegate the rest of the work to the MessageWriter. |
Inbound | org.wso2.andes.kernel.disruptor.inbound.MessageWriter | This processor will write the message metadata and content chunks to the storage database using a batch approach. |
Inbound | org.wso2.andes.kernel.disruptor.inbound.StateEventHandler | Upon saving the message to storage, this handler is triggered to notify a message received event, or to notify a message acknowledged event from the consumer side. |
Inbound | org.wso2.andes.kernel.disruptor.inbound.InboundTransactionEvent | This event is used to communicate the transaction commit/rollback events from the publisher side. |
Outbound | org.wso2.andes.kernel.disruptor.delivery.DeliveryEventHandler | This processor is used to deliver the message to one/all of the active subscriptions based on the message destination. |
Outbound | org.wso2.andes.kernel.MessageFlusher | This class is used to hand over the message to the consumer after reading from the internal message buffer (readButUndeliveredMessages). |
Outbound | org.wso2.andes.kernel.slot.SlotDeliveryWorker | There are multiple SlotDeliveryWorkers managed by SlotDeliveryWorkerManager to read messages from the database after selecting a slot range from the coordinator. The messages are then pushed to the message flusher for delivery. |
Outbound | org.wso2.andes.kernel.slot.SlotManagerClusterMode | This is where the coordinator logic resides within WSO2 MB. All slots are managed and distributed through this class across the cluster. |
AMQP | org.wso2.andes.server.AMQChannel | A channel is used for delivering and accepting messages to/from the broker. Each AMQP consumer/publisher has its own unique channel with a channel ID. |
AMQP | org.wso2.andes.amqp.QpidAndesBridge | This is used as the bridge between the Qpid messaging events and Andes events. |
MQTT | org.dna.mqtt.wso2.AndesMQTTBridge | This is used as the bridge between the moquette messaging events and Andes events. |
MQTT | org.dna.mqtt.moquette.messaging.spi.impl.ProtocolProcessor | This handles all events coming through the moquette disruptor (e.g. subscriber-connect, pub-acks) and connects to the AndesMQTTBridge as required to bridge the MQTT functionality. |
MQTT | org.wso2.andes.mqtt.connectors.PersistenceStoreConnector | This class acts as an interface before storing MQTT messages to the message store, validating the message format, in addition to handling events such as consumer/publisher creation and closing in terms of the message store. |
MQTT | org.wso2.andes.mqtt.MQTTTopicManager | This class handles the lifecycle of MQTT subscriptions and also takes part in routing a given message to matching subscribers. |
Message tracing
This is an MB-specific logging implementation for tracing a message through its inbound event until it is delivered to the consumer application. This implementation has minimal impact on the performance of the broker functionality. To enable message tracing in WSO2 MB:
- Open the log4j.properties file stored in the <MB_HOME>/repository/conf folder. Uncomment the following:
#log4j.logger.org.wso2.andes.tools.utils.MessageTracer=TRACE,CARBON_TRACE_LOGFILE
Once message tracing is enabled, you can start the server and execute a grep command with the ID of the message you want to trace. This prints all the log entries related to that message ID on your terminal.
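The grep step can also be scripted. The following is a minimal Python sketch, assuming the trace log is a plain-text file with one log entry per line and that the message ID appears in each entry as a decimal number. The sample log lines and the function name trace_lines are hypothetical and are not taken from WSO2 MB; the real trace format may differ.

```python
import re
from typing import Iterable, List


def trace_lines(lines: Iterable[str], message_id: int) -> List[str]:
    """Return the log lines that mention message_id as a whole number.

    The digit boundaries keep ID 1024 from matching inside 110245.
    """
    pattern = re.compile(r"(?<!\d)" + re.escape(str(message_id)) + r"(?!\d)")
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


if __name__ == "__main__":
    # Hypothetical log lines, used only to illustrate the filtering.
    sample = [
        "TRACE {MessageTracer} - message id 1024 received\n",
        "TRACE {MessageTracer} - message id 110245 received\n",
        "TRACE {MessageTracer} - message id 1024 delivered\n",
    ]
    for entry in trace_lines(sample, 1024):
        print(entry)
```

To use it on a real file, pass an open file object instead of the sample list. Matching on digit boundaries is a design choice that avoids false hits when one message ID is a substring of another.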
Heap dump and thread stack analysis
As with any other Java product, if the MB cluster fails due to resource exhaustion, the heap and thread dumps will point you towards the cause of the leak. Therefore, it is important to be able to retrieve heap and thread dumps from an environment at the point when an error occurs. This avoids the need to reproduce the exact issue again (especially in the case of production issues). Resource exhaustion can happen for two reasons:
- A bug in the system.
- An actual limitation of resources due to low configuration values.
You can easily create a heap dump and a thread dump using the CarbonDump tool that is shipped with your product. This will also provide information about the product version and any patch inconsistencies.
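If CarbonDump is not an option, the standard JDK tools jstack and jmap are a common alternative for collecting the same two kinds of dumps. The sketch below is not the CarbonDump tool and does not reproduce its output. It only builds the command lines for a given process ID. The helper name dump_commands and the output file name are hypothetical.

```python
from typing import List


def dump_commands(pid: int, out_dir: str = ".") -> List[List[str]]:
    """Build jstack/jmap command lines for the JVM with the given PID."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    heap_file = f"{out_dir}/heap-{pid}.hprof"  # hypothetical file name
    return [
        # Thread dump with lock information, printed to stdout.
        ["jstack", "-l", str(pid)],
        # Binary heap dump written to heap_file.
        ["jmap", f"-dump:format=b,file={heap_file}", str(pid)],
    ]


if __name__ == "__main__":
    import shlex

    # 12345 stands in for the PID of the broker's JVM.
    for command in dump_commands(12345, "/tmp"):
        print(shlex.join(command))
```

Each command list can be passed to subprocess.run. Both tools generally need to run as the same operating system user that owns the JVM process.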
Using Wireshark to analyze protocol communication
Wireshark is a network traffic analysis tool with powerful filtering features. Given that WSO2 MB uses the AMQP and MQTT protocols (which are different from HTTP), Wireshark is a good way of capturing the network traffic and verifying that the packets arrive in the expected order with the correct data.
Detecting database anomalies
This section explains how you can identify errors by evaluating the condition of the database. Even though most of the database schema is self-explanatory, it is still good to know the special cases where the slot ranges are stored and how the safe zone is evaluated. The following diagram illustrates the slot-based message delivery algorithm:
Given that the coordinator is the decision maker for all operations, information on slots must also be maintained in a central location. Therefore, the slot-related information in the database is stored mainly in four tables, as shown below.
Table | Description |
---|---|
| Each slot, the assigned node ID and the current status are maintained here. |
| Whenever a node communicates a possible slot range to the coordinator, the node will decide on the appropriate message ID range to be included in the slot and update this table with the last |
| This table contains the last published message ID for each node in order to calculate the global safe zone (minimum messageID from all nodes) that is required for deleting slots upon completion. |
| Whenever a slot is given by the coordinator to an MB node for processing, its |
Within this context, you can infer the following validations in the database at any given time:
- There should not be any slots in the MB_SLOT table if the MB_METADATA table is empty. This is an eventual guarantee. Even if there are slots queued for deletion, this rule must still be satisfied after some time.
- There should be no records in the MB_METADATA table if the MB_CONTENT table is empty (one-to-one relationship).
- Given the minimum message ID in the MB_NODE_TO_LAST_PUBLISHED_ID table, all slots within the MB_SLOT table with the “assigned” status (state = 2) and the endMessageID less than the minimum published ID should be deleted (or at least be cleared after some time).
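The three rules above can be sketched as a small checker. This is a minimal illustration, assuming you have already queried the row counts and the slot rows yourself. The Slot type, its field names, and the function names are hypothetical and do not reflect the actual column names in the MB schema. Only the "assigned" state value of 2 comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, List

ASSIGNED_STATE = 2  # "assigned" slot status, as stated in the rules above


@dataclass
class Slot:
    # Hypothetical field names, not the real MB_SLOT columns.
    slot_id: int
    state: int
    end_message_id: int


def count_violations(slots: int, metadata: int, content: int) -> List[str]:
    """Check the two row-count rules; return a description of each violation."""
    problems = []
    if metadata == 0 and slots > 0:
        problems.append("MB_SLOT has rows but MB_METADATA is empty")
    if content == 0 and metadata > 0:
        problems.append("MB_METADATA has rows but MB_CONTENT is empty")
    return problems


def stale_slots(slots: List[Slot], last_published: Dict[str, int]) -> List[Slot]:
    """Return assigned slots that end below the minimum last published ID."""
    if not last_published:
        return []
    # Global safe zone: the minimum last published message ID across nodes.
    safe_zone = min(last_published.values())
    return [s for s in slots
            if s.state == ASSIGNED_STATE and s.end_message_id < safe_zone]
```

Because these rules are eventual guarantees, a single reported violation is not necessarily an error. It becomes meaningful when it persists across repeated checks.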
Retrieving logs from the JMS client
You can simply monitor the logs from the JMS clients connecting to WSO2 MB by enabling the following startup property on the clients:
-Damqj.protocol.logging.level=true
Monitoring Java metrics
The metrics dashboard of WSO2 MB provides general JVM metrics as well as MB-specific metrics to help you identify how the broker is running in a loaded/relaxed environment. This functionality will give you information such as the unexpected increases of delivery channels, latencies of database reads/writes etc., which will help you identify possible errors in the system. See the documentation on metrics for instructions on how to configure and use the metrics dashboard.