Store Logs in a Relational Database
First, the Hive query language is used to retrieve the required data. The data is then stored in a relational database so that gadgets can be created against the relational data. Read about creating Hive queries to analyse data in the BAM documentation.
- Log in to BAM. Go to the management console and click Add in the Analytics menu.
- Write a Hive script to retrieve data from Cassandra and store it in a relational database. For example, consider the following details when writing a sample Hive query. This query retrieves all log information into a MySQL database.
- Default keyspace:
EVENT_KS
- The required column family (CF):
log_0_AS_2013_01_07
- The MySQL database:
MYBAMDB
Given below is a sample query written using the above information, to retrieve the log events. You can change the necessary parameters accordingly (cassandra.cf.name, mapred.jdbc.url, mapred.jdbc.username, mapred.jdbc.password).
CREATE EXTERNAL TABLE IF NOT EXISTS LogEventInfo (
    key STRING, tenantID INT, serverName STRING, appName STRING,
    priority STRING, logTime DOUBLE, logger STRING, message STRING)
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH SERDEPROPERTIES (
    "cassandra.host" = "localhost",
    "cassandra.port" = "9160",
    "cassandra.ks.name" = "EVENT_KS",
    "cassandra.ks.username" = "admin",
    "cassandra.ks.password" = "admin",
    "cassandra.cf.name" = "log_0_AS_2013_01_07",
    "cassandra.columns.mapping" = ":key,payload_tenantID,payload_serverName,payload_appName,payload_priority,payload_logTime,payload_logger,payload_message" );

CREATE EXTERNAL TABLE IF NOT EXISTS Logs (
    tenantID INT, serverName STRING, appName STRING, priority STRING,
    logTime DOUBLE, logger STRING, message STRING)
STORED BY 'org.wso2.carbon.hadoop.hive.jdbc.storage.JDBCStorageHandler'
TBLPROPERTIES (
    'mapred.jdbc.driver.class' = 'com.mysql.jdbc.Driver',
    'mapred.jdbc.url' = 'jdbc:mysql://localhost:3306/MYBAMDB',
    'mapred.jdbc.username' = 'root',
    'mapred.jdbc.password' = 'root',
    'hive.jdbc.update.on.duplicate' = 'true',
    'hive.jdbc.table.create.query' = 'CREATE TABLE LogEvent(tenantID INT, serverName VARCHAR(200), appName VARCHAR(200), priority VARCHAR(200), logTime DOUBLE, logger VARCHAR(800), message VARCHAR(3800))');

insert overwrite table Logs
    select tenantID, serverName, appName, priority, logTime, logger, message
    from LogEventInfo;
- The first part of the Hive query maps the Cassandra column family to a Hive storage handler that can extract data from Cassandra.
- The second part maps a relational database table to the JDBC storage handler, establishing the link between Cassandra and the relational database.
- The third part contains your analytics logic.
- You can test your script by clicking Execute. If there are no errors, you can go to your MySQL database and check for the data.
- Once you have finished writing the Hive script, you can schedule it as explained in Scheduling an Analytic Script.
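As a quick sanity check after the script runs, you can query the target table directly in MySQL. This is a sketch, assuming the LogEvent table created by the hive.jdbc.table.create.query property in the script above:

```sql
-- Count the stored log events by priority level
USE MYBAMDB;
SELECT priority, COUNT(*) AS event_count
FROM LogEvent
GROUP BY priority;
```

If the script ran correctly, you should see one row per priority level (for example, INFO, WARN, ERROR) with the number of log events stored for each.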
Log Analysis using Gadgets
Once you have created the required analytics for your logs as explained above, you can view the analytics using Gadgets in BAM. You can create your own gadgets using the Gadget Gen Tool in BAM, based on how you want to analyse the log information. Read about gadgets in BAM for more details.
Realtime Log Analysis
WSO2 BAM, together with WSO2 CEP features, provides a mechanism to perform realtime analysis of data and to alert the relevant parties when an error occurs. Read about realtime analytics in BAM for more details. The steps given below explain how to configure the system to perform realtime analysis of logs in BAM.
Step 1: Setup email notifications
Since the mail transport in BAM is used to send email alerts to recipients, we must enable it. To do that, open the <BAM_HOME>/repository/conf/axis2/axis2-client.xml file and add the email configurations.
<transportSender name="mailto" class="org.apache.axis2.transport.mail.MailTransportSender">
    <parameter name="mail.smtp.from"></parameter>
    <parameter name="mail.smtp.user"></parameter>
    <parameter name="mail.smtp.password"></parameter>
    <parameter name="mail.smtp.host"></parameter>
    <parameter name="mail.smtp.port"></parameter>
    <parameter name="mail.smtp.starttls.enable">true</parameter>
    <parameter name="mail.smtp.auth">true</parameter>
</transportSender>
Step 2: Creating Event Adaptors
In order to perform realtime analytics of logs in BAM, we need to create input and output event adaptors:
- Log in to BAM and go to the Configure tab.
- Create the following adaptors by clicking the relevant option on the navigator.
- Input Adaptor: To capture log events coming to BAM.
- Output Adaptor: To send emails.
Step 3: Creating Stream Definitions
- Go to the Event Processor menu in the Main tab of the BAM management console.
- Click Event Streams. The Available Event Streams page will open.
- Click Add Event Stream to create a new stream definition. You can now enter the details of the log event stream in the following screen.
- Enter a name for the Stream, a version and description in the respective fields.
In the Stream Attributes section, add the data from the LogEvent. For example, given below are the attributes in the appserver log event:
Meta Data Attributes:
- clientType {String}

Payload Data Attributes:
- tenantID {String}
- serverName {String}
- appName {String}
- logTime {Long}
- priority {String}
- message {String}
- logger {String}
- ip {String}
- instance {String}
- stacktrace {String}
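If you prefer to define the stream via its JSON definition rather than the management console form, the attributes above would look roughly as follows. This is a sketch: the stream name log.event and version 1.0.0 are illustrative placeholders, and the exact JSON shape should be checked against your BAM version's stream definition format:

```json
{
  "name": "log.event",
  "version": "1.0.0",
  "metaData": [
    {"name": "clientType", "type": "STRING"}
  ],
  "payloadData": [
    {"name": "tenantID", "type": "STRING"},
    {"name": "serverName", "type": "STRING"},
    {"name": "appName", "type": "STRING"},
    {"name": "logTime", "type": "LONG"},
    {"name": "priority", "type": "STRING"},
    {"name": "message", "type": "STRING"},
    {"name": "logger", "type": "STRING"},
    {"name": "ip", "type": "STRING"},
    {"name": "instance", "type": "STRING"},
    {"name": "stacktrace", "type": "STRING"}
  ]
}
```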
Step 4: Creating the Execution Plan
- Go to the Event Processor menu in the Main tab of the BAM management console.
- Click Execution Plans. The Available Execution Plans page will open.
- Click Add Execution Plan to create a new plan.
- Give a suitable name and description for the execution plan.
- Now, give the name of the input stream you created in the Import Stream field.
- Click Import to load the input stream.
- Now, write the CEP query, which analyses the events from the input stream. See the sample query below, written to detect errors in the logs. According to this query, if an error is found in an event, that event is sent to an output stream.
CEP Query
from LogEvents[priority == "ERROR"] select message,stacktrace,serverName insert into ExceptionStream
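If per-event alerts are too noisy, the query can be extended to alert only when errors accumulate. The following sketch uses a Siddhi time window to raise an event when more than ten errors occur on the same server within five minutes; the threshold, window size, and the ErrorBurstStream output stream name are all hypothetical choices for illustration:

```sql
from LogEvents[priority == "ERROR"]#window.time(5 min)
select serverName, count(message) as errorCount
group by serverName
having errorCount > 10
insert into ErrorBurstStream
```

If you use a variant like this, remember that the exported stream and the event formatter (described below) must reference the stream name your query actually inserts into.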
- After creating the query, add the stream to which the error logs are sent. Note that, according to our CEP query, the exported stream name should be 'ExceptionStream'.
- Click Add. The output stream (ExceptionStream) will be auto-generated.
- Once the ExceptionStream is created, select it as the exported stream and create a new event formatter. This formatter specifies the email body and email-related information, such as the subject, the To address, and so on. Set the output mapping type to text so that the content of the email message body can be given inline.
Email Body
Error Occurred in {{serverName}} – {{message}} {{stacktrace}}
In this body, we take the message, stacktrace and server name from the output stream (ExceptionStream) and add a readable message for the email body.
- Add the event formatter and save the execution plan. We have now successfully created the event trigger to monitor error logs.