WSO2 DAS stores data in the Data Access Layer (DAL) and analyzes it according to the defined analytics queries. As the volume of stored data grows over time, the analysis and summarization jobs take longer to run. Because it is usually not necessary to analyze all data to produce the final result, you can apply data purging to reduce both the time taken to execute the analytics scripts and the disk usage.
...
- Log in to the Management Console as an admin user, if you are not already logged in.
- Click Main, and then click Data Explorer.
- Select the required table in the Table Name field, and then click Schedule Data Purging to open the Schedule Data Purging dialog box.
- Clear the Enable Data Purging check box.
- Click Save, and close the dialog box.
Global data purging
You can perform data purging as a global operation that affects all tenants. Follow the steps below to perform global data purging.
- Open the <DAS_HOME>/repository/conf/analytics/analytics-config.xml file, and change the configuration within the <analytics-data-purging> element as shown below.

    <analytics-data-purging>
       <!-- Below entry will indicate purging is enable or not. If user wants to enable data purging for cluster then this property need to be enable in all nodes -->
       <purging-enable>true</purging-enable>
       <cron-expression>0 0 12 * * ?</cron-expression>
       <!-- Tables that need include to purging. Use regex expression to specify the table name that need include to purging.-->
       <purge-include-table-patterns>
          <table>.*</table>
          <!--<table>.*jmx.*</table>-->
       </purge-include-table-patterns>
       <!-- All records that insert before the specified retention time will be eligible to purge -->
       <data-retention-days>365</data-retention-days>
    </analytics-data-purging>
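Before restarting the server, it can help to read the settings back and confirm they are what you intended. The following is a minimal Python sketch of such a check, not a tool shipped with WSO2 DAS. The function name read_purging_settings and the embedded SAMPLE string are illustrative assumptions; only the element names are taken from the <analytics-data-purging> configuration.

```python
import xml.etree.ElementTree as ET

# Sample fragment for illustration; element names follow the
# <analytics-data-purging> block of analytics-config.xml.
SAMPLE = """
<analytics-data-purging>
   <purging-enable>true</purging-enable>
   <cron-expression>0 0 12 * * ?</cron-expression>
   <purge-include-table-patterns>
      <table>.*</table>
      <!--<table>.*jmx.*</table>-->
   </purge-include-table-patterns>
   <data-retention-days>365</data-retention-days>
</analytics-data-purging>
"""


def read_purging_settings(element):
    """Return the purging settings found under an <analytics-data-purging> element.

    Hypothetical helper: it only reads the XML and does not talk to DAS.
    """
    tables = element.findall("purge-include-table-patterns/table")
    return {
        "enabled": element.findtext("purging-enable", "false").strip().lower() == "true",
        "cron": element.findtext("cron-expression", "").strip(),
        # Commented-out <table> entries are skipped by the XML parser.
        "patterns": [t.text.strip() for t in tables if t.text],
        "retention_days": int(element.findtext("data-retention-days", "0").strip()),
    }


settings = read_purging_settings(ET.fromstring(SAMPLE))
for name, value in settings.items():
    print(name, "=", value)
```

To check a real installation, parse the analytics-config.xml file with ET.parse and pass the <analytics-data-purging> element to the function. How that element is nested in the file, and whether the file declares an XML namespace, are not covered here, so the lookup may need adjusting.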
The properties in the <analytics-data-purging> configuration are described below.
- <purging-enable>: Set the value to true to enable data purging.
- <cron-expression>: The cron expression that schedules the data purging operation. For example, the following cron expression configures the purging job to run at 12:00 PM (noon) every day: 0 0 12 * * ? For more information on cron expressions, see the Oracle documentation.
- <purge-include-table-patterns>: The tables from which data is purged. By default, data purging is performed on all tables, as follows: <table>.*</table> To restrict purging to specific tables, define a regular expression or a table name within the <table> property. Define one <table> tag for each regular expression if you want to specify multiple tables.
- <data-retention-days>: The number of days ('n') for which data is retained in the selected tables. For example, the default value 365 purges all data that is more than one year old.
Note: You can purge all records by setting a negative value (e.g. -1) for <data-retention-days>.
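The way <purge-include-table-patterns> and <data-retention-days> together decide what is eligible for purging can be sketched in a few lines of Python. This is an illustration, not the DAS implementation. It assumes that a pattern must match the whole table name and that the cutoff is the current time minus the retention period. The function names and table names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone


def purge_cutoff(now, retention_days):
    """Records inserted before the returned instant are eligible for purging.

    Assumption: cutoff = now - retention_days. A negative value therefore
    places the cutoff in the future, so every existing record is eligible.
    """
    return now - timedelta(days=retention_days)


def tables_to_purge(table_names, include_patterns):
    """Return the table names matched by at least one include pattern.

    Assumption: a pattern must match the full table name.
    """
    compiled = [re.compile(p) for p in include_patterns]
    return [t for t in table_names if any(c.fullmatch(t) for c in compiled)]


# Hypothetical table names, for illustration only.
tables = ["ORG_WSO2_jmx_STATS", "HTTP_EVENTS", "LOGIN_EVENTS"]
now = datetime(2016, 6, 1, 12, 0, tzinfo=timezone.utc)

print(tables_to_purge(tables, [".*"]))        # default: every table
print(tables_to_purge(tables, [".*jmx.*"]))   # only tables containing "jmx"
print(purge_cutoff(now, 365))                 # keep roughly the last year
print(purge_cutoff(now, -1) > now)            # negative value: everything is eligible
```

Under these assumptions, listing several <table> entries widens the set of tables, because a table is included as soon as any one pattern matches it.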
Disabling data purging in a clustered mode
...