
Archiving Cassandra Data

WSO2 BAM uses MapReduce jobs to archive Cassandra data. As a result, you can archive a large amount of data using a cluster of Hadoop nodes. You can run the archive process manually or schedule it using a cron expression, as explained below.


  1. Log in to the BAM management console and select the Archive Data menu under the Configure menu.


    Archiving data manually

  2. The Cassandra data archive configuration page opens. Select the Date Range option to archive data manually for a specific date range.
    (Screenshot: manual data archival)
    The configuration parameters are explained below.

    Parameter: Description
    Stream Name: In the BAM data model, a stream name maps to a Cassandra column family. Provide the stream name to archive the data stored under that stream.
    Version: Version of the stream. Used to specify which version to archive when there are multiple versions under the same stream name (as recommended).
    Date range: Specifies the start and end dates. E.g., From - 25/01/2013 00:00:00 AM to 03/02/2013 00:00:00 AM
    Username/Password: Cassandra username and password (same as the BAM credentials).
    External Cassandra cluster: Connection URL - connection details of the Cassandra cluster. E.g., 10.100.60.150:9160,10.100.60.151:9160
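
The connection URL in the example above is a comma-separated list of host:port entries. As a minimal sketch of how such a value could be checked before you submit it, the following Python snippet splits it into (host, port) pairs. The function name parse_connection_url is hypothetical and is not part of BAM; the validation rules are assumptions based only on the example value.

```python
def parse_connection_url(url: str) -> list[tuple[str, int]]:
    """Split a 'host:port,host:port' string into (host, port) pairs.

    Hypothetical helper for illustration; BAM does its own parsing.
    """
    nodes = []
    for entry in url.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray or trailing commas
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        nodes.append((host, int(port)))
    if not nodes:
        raise ValueError("connection URL contains no nodes")
    return nodes


# The example value from the parameter description above.
print(parse_connection_url("10.100.60.150:9160,10.100.60.151:9160"))
```

An entry without a port, such as 10.100.60.150 on its own, raises a ValueError in this sketch.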

    Scheduling the archive

  3. Select the Below this number of days option to schedule an archival process. For example:
    (Screenshot: scheduling the archive)
    The configuration parameters are explained below:

    Parameter: Description
    Stream Name: In the BAM data model, a stream name maps to a Cassandra column family. Provide the stream name to archive the data stored under that stream.
    Version: Version of the stream. Used to specify which version to archive when there are multiple versions under the same stream name (as recommended).
    No of days: Keeps only the last n days of data in the column family. For example, according to the above configuration, the system keeps the data from the last 90 days and archives the older data.
    Cron expression: Used to schedule the archive process. For example, according to the above configuration, the archive job runs every day at midnight.
    External Cassandra cluster: Connection URL - connection details of the Cassandra cluster. E.g., 10.100.60.150:9160,10.100.60.151:9160
    Username/Password: Cassandra username and password (same as the BAM credentials).
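
To make the effect of the number-of-days setting concrete, here is a small Python sketch of the retention arithmetic. It is not BAM's implementation: the function names are hypothetical, and treating the cutoff as exclusive (an event exactly n days old is kept) is an assumption, since the behaviour at the boundary is not documented here.

```python
from datetime import datetime, timedelta


def archive_cutoff(now: datetime, days_to_keep: int) -> datetime:
    """Return the point in time before which data would be archived."""
    if days_to_keep < 0:
        raise ValueError("days_to_keep must not be negative")
    return now - timedelta(days=days_to_keep)


def should_archive(event_time: datetime, now: datetime, days_to_keep: int) -> bool:
    """True if the event is older than the retention window (assumed exclusive)."""
    return event_time < archive_cutoff(now, days_to_keep)


# Illustrative run date; 90 is the value from the example configuration.
run_time = datetime(2013, 2, 3)
print(archive_cutoff(run_time, 90))
print(should_archive(datetime(2012, 10, 1), run_time, 90))
print(should_archive(datetime(2013, 1, 25), run_time, 90))
```

With a scheduled job, the cutoff moves forward on every run, so each execution archives only the data that has aged out since the previous one.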
    • The name of the archive column family is <original column family name> + _arch.
    • Cassandra column families for streams are generated with underscores (_) in their names. When archiving, enter the stream name with dots (.) instead of those underscores. For example, if the column family is org_wso2_bam_phone_retail_store_kpi, enter the stream name as org.wso2.bam.phone.retail_store.kpi. Note that the underscore in retail_store stays as it is in this example.
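
The two naming notes above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the dot-to-underscore mapping is inferred from the single example given here rather than taken from BAM's source.

```python
def column_family_name(stream_name: str) -> str:
    """Column family name for a stream: dots become underscores (inferred rule)."""
    return stream_name.replace(".", "_")


def archive_column_family_name(stream_name: str) -> str:
    """Archive column family: <original column family name> + _arch."""
    return column_family_name(stream_name) + "_arch"


stream = "org.wso2.bam.phone.retail_store.kpi"
print(column_family_name(stream))          # org_wso2_bam_phone_retail_store_kpi
print(archive_column_family_name(stream))  # org_wso2_bam_phone_retail_store_kpi_arch
```

The mapping is not reversible in general: retail_store contains an underscore of its own, so you cannot recover the stream name from the column family name by blindly replacing every underscore with a dot. Use the stream name as it was originally defined.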
  4. Click Submit once you are done.
    Once you submit a scheduled archive, the system creates a Hive script and executes it.

  5. Click Main, and then click List under the Analytics menu.

    Note that step 6 does not apply to the manual archiving process, which only executes the Hive query and does not save it.

  6. Select your script and click the Schedule Script link associated with it to change the schedule time of your script.
