This topic provides instructions on how to set up an active-active deployment of WSO2 API Manager across multiple datacenters.
...
Raw data accumulates only within each datacenter and is not replicated. The summarized data (STATS_DB) in each datacenter is replicated bidirectionally.
The exception is the API-M alerting use case, which does not work in such a deployment because it relies on file-based indexing storage.
...
Note: If the two API-M Analytics nodes run on the same virtual machine, it is mandatory to have a port offset. Use port offsets 1 and 2 for the two Analytics servers.
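For reference, the port offset is set in the carbon.xml file of each server. The snippet below is a minimal sketch, assuming the standard WSO2 Carbon file layout (<Analytics_Home>/repository/conf/carbon.xml).

```xml
<!-- First Analytics node: offset 1 shifts every default port by 1,
     e.g. the Thrift data receiver moves from 7611 to 7612.
     Use <Offset>2</Offset> on the second node. -->
<Ports>
    <Offset>1</Offset>
</Ports>
```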
Step 3 - Configure APIM 2.1.0 with the API-M Analytics 2.1.0 clustered setup
- Configure APIM 2.1.0 and the two API-M Analytics 2.1.0 nodes. For instructions on how to configure these nodes, see Configuring APIM Analytics.
- When configuring databases, use the same set of databases used in Step 2.
- After enabling Analytics, open the <API-M_HOME>/repository/conf/api-manager.xml file and add both Analytics server URLs under the DASServerURL section as a comma-separated list, as shown below. The ports 7612 and 7613 correspond to the default Thrift receiver port (7611) with offsets 1 and 2.

```xml
<DASServerURL>{tcp://localhost:7612,tcp://localhost:7613}</DASServerURL>
```
Apply the solution to add the data center ID
Note: Make sure that you have configured the databases according to the instructions in the previous section.
To ensure that no primary key violation takes place, you have to change the database schema by adding the data center ID as an extra column to the tables in STATS_DB, and also add it to the primary key combination. This makes sure that when database syncing happens, both analytics clusters can write to their respective databases without conflicts. A custom Spark user-defined function (UDF) reads the data center name from a system property, and it is used whenever data is inserted into the STATS_DB via the Spark script.
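As a minimal sketch of the idea, the UDF lets every summarization query tag the rows it writes with the local data center ID, so identical summary keys produced in the two datacenters differ in their dataCenter component. The UDF name below is a hypothetical stand-in; the actual name is defined in the patch JAR added later in this section.

```sql
-- Hypothetical: getDataCenterId() stands in for the real UDF shipped in
-- org.wso2.analytics.apim.spark_2.1.0.jar. Run from the Spark console;
-- it should return the value passed via -Dcarbon.data.center (e.g. DC1).
SELECT getDataCenterId();
```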
Follow the steps below to apply the changes in each datacenter.
- Shut down the APIM 2.1.0 server and the API-M Analytics 2.1.0 servers in the clustered setup.
- Add the following parameter to the <Analytics_Home>/repository/conf/analytics/spark/spark-defaults.conf file in each Analytics server node. Use a unique ID for each datacenter, for example DC1 for the first datacenter and DC2 for the second.

```
spark.executor.extraJavaOptions -Dcarbon.data.center=DC1
```
- Download the analytics-apim.xml file and replace the existing one in the <Analytics_Home>/repository/conf/template-manager/domain-template/ directory in each Analytics server node.
- Download the org.wso2.analytics.apim.spark_2.1.0.jar file and add it as a patch to each of the API-M Analytics server nodes. This file contains the newly written UDF that reads the data center ID from a system property.
- Replace the <Analytics_Home>/repository/deployment/server/carbonapps/org_wso2_carbon_analytics_apim-1.0.0.car file with this CApp in each Analytics server node.
- Run the following PostgreSQL script against the WSO2AM_STATS_DB.
```sql
Alter table API_REQUEST_SUMMARY add column dataCenter varchar(256) NOT NULL DEFAULT 'DefaultDC';
Alter table API_REQUEST_SUMMARY DROP CONSTRAINT API_REQUEST_SUMMARY_pkey;
Alter table API_REQUEST_SUMMARY ADD PRIMARY KEY (api,api_version,version,apiPublisher,consumerKey,userId,context,hostName,year,month,day,dataCenter);

Alter table API_VERSION_USAGE_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
Alter table API_VERSION_USAGE_SUMMARY DROP CONSTRAINT API_VERSION_USAGE_SUMMARY_pkey;
Alter table API_VERSION_USAGE_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,context,hostName,year,month,day,dataCenter);

Alter table API_Resource_USAGE_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
Alter table API_Resource_USAGE_SUMMARY DROP CONSTRAINT API_Resource_USAGE_SUMMARY_pkey;
Alter table API_Resource_USAGE_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,consumerKey,context,resourcePath,method,hostName,year,month,day,dataCenter);

Alter table API_RESPONSE_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
Alter table API_RESPONSE_SUMMARY DROP CONSTRAINT API_RESPONSE_SUMMARY_pkey;
Alter table API_RESPONSE_SUMMARY ADD PRIMARY KEY (api_version,apiPublisher,context,hostName,year,month,day,dataCenter);

Alter table API_FAULT_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
Alter table API_FAULT_SUMMARY DROP CONSTRAINT API_FAULT_SUMMARY_pkey;
Alter table API_FAULT_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,consumerKey,context,hostName,year,month,day,dataCenter);

Alter table API_DESTINATION_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
Alter table API_DESTINATION_SUMMARY DROP CONSTRAINT API_DESTINATION_SUMMARY_pkey;
Alter table API_DESTINATION_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,context,destination,hostName,year,month,day,dataCenter);

Alter table API_LAST_ACCESS_TIME_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
Alter table API_LAST_ACCESS_TIME_SUMMARY DROP CONSTRAINT API_LAST_ACCESS_TIME_SUMMARY_pkey;
Alter table API_LAST_ACCESS_TIME_SUMMARY ADD PRIMARY KEY (tenantDomain,apiPublisher,api,dataCenter);

Alter table API_EXE_TME_DAY_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
Alter table API_EXE_TME_DAY_SUMMARY DROP CONSTRAINT API_EXE_TME_DAY_SUMMARY_pkey;
Alter table API_EXE_TME_DAY_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,context,year,month,day,tenantDomain,dataCenter);

Alter table API_EXE_TIME_HOUR_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
Alter table API_EXE_TIME_HOUR_SUMMARY DROP CONSTRAINT API_EXE_TIME_HOUR_SUMMARY_pkey;
Alter table API_EXE_TIME_HOUR_SUMMARY ADD PRIMARY KEY (api,version,tenantDomain,apiPublisher,context,year,month,day,hour,dataCenter);

Alter table API_EXE_TIME_MIN_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
Alter table API_EXE_TIME_MIN_SUMMARY DROP CONSTRAINT API_EXE_TIME_MIN_SUMMARY_pkey;
Alter table API_EXE_TIME_MIN_SUMMARY ADD PRIMARY KEY (api,version,tenantDomain,apiPublisher,context,year,month,day,hour,minutes,dataCenter);

Alter table API_THROTTLED_OUT_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
Alter table API_THROTTLED_OUT_SUMMARY DROP CONSTRAINT API_THROTTLED_OUT_SUMMARY_pkey;
Alter table API_THROTTLED_OUT_SUMMARY ADD PRIMARY KEY (api,api_version,context,apiPublisher,applicationName,tenantDomain,year,month,day,throttledOutReason,dataCenter);

Alter table API_REQ_USER_BROW_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
Alter table API_REQ_USER_BROW_SUMMARY DROP CONSTRAINT API_REQ_USER_BROW_SUMMARY_pkey;
Alter table API_REQ_USER_BROW_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,year,month,day,os,browser,tenantDomain,dataCenter);

-- Execute the following queries only if APIM_GEO_LOCATION_STATS is enabled from the admin app.
-- Alter table API_REQ_GEO_LOC_SUMMARY add column dataCenter varchar(254) NOT NULL DEFAULT 'DefaultDC';
-- Alter table API_REQ_GEO_LOC_SUMMARY DROP CONSTRAINT API_REQ_GEO_LOC_SUMMARY_pkey;
-- Alter table API_REQ_GEO_LOC_SUMMARY ADD PRIMARY KEY (api,version,apiPublisher,year,month,day,country,city,tenantDomain,dataCenter);
```
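As an optional check, not part of the original procedure, you can describe one of the altered tables from psql to confirm the schema change took effect; the dataCenter column with its default and the rebuilt composite primary key should both be listed.

```sql
-- psql meta-command: shows columns, defaults, and the primary key index.
\d API_REQUEST_SUMMARY
```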
- Restart the API-M 2.1.0 server and the API-M Analytics 2.1.0 servers in the clustered setup.
Synchronize the databases
Info: In the active-active datacenter architecture, a request may arrive at either datacenter and be fulfilled by it. The analytics-related details of that request are stored in the STATS_DB of that same datacenter. Consequently, when analytics-related details are requested, the two datacenters can return different results according to their own STATS_DBs. To avoid this, the same set of data must be maintained in the STATS_DBs of both datacenters.
You can synchronize the databases by sharing the STATS_DB or by using a replication mechanism. This can be done in one of two ways:
- Using a bidirectional replication mechanism - This is master-master replication, where changes done in one node are replicated to the other nodes.
- Using a master-slave mechanism - The STATS_DB is shared among all the nodes. When the master node becomes unavailable, a slave node takes over as the master.
Follow the steps below to synchronize the databases using the bidirectional replication (BDR) mechanism.
Warning: Note that these instructions were tested on Ubuntu with PostgreSQL 9.4.
Note: Install and enable the PostgreSQL apt repository for PGDG. This repository is required by the BDR packages.
- Create a 2ndquadrant.list file in the /etc/apt/sources.list.d/ directory with the repository URL given below. Change codename according to your OS version (for example, xenial for Ubuntu 16.04).

```
deb http://packages.2ndquadrant.com/bdr/apt/ codename-2ndquadrant main
```
- Import the repository key, and then update the package lists, as shown below.

```
wget --quiet -O - http://packages.2ndquadrant.com/bdr/apt/AA7A6805.asc | sudo apt-key add -
sudo apt-get update
```
- Remove the postgresql-9.4 packages if you have them installed already.

Tip: BDR requires a patched version of PostgreSQL 9.4 that conflicts with the official packages. If you already have PostgreSQL 9.4 installed, either from apt.postgresql.org or from your distribution's official repository, you need to dump all your databases and then uninstall the official PostgreSQL 9.4 packages before you install BDR.

```
# To take the database dump:
pg_dump database1 -f backup_stat_db.sql
```

```
# To remove the postgresql-9.4 packages:
sudo apt-get remove postgresql-9.4
```
- Install the BDR packages. Sample commands are given below.

```
sudo apt-get update
sudo apt-get install postgresql-bdr-9.4 postgresql-bdr-9.4-bdr-plugin
```
- Make the following changes to the files in the /etc/postgresql/9.4/main/ directory on both nodes.

```
# postgresql.conf
listen_addresses = '*'
shared_preload_libraries = 'bdr'
wal_level = 'logical'
track_commit_timestamp = on
max_connections = 100
max_wal_senders = 10
max_replication_slots = 10
max_worker_processes = 10
```

```
# pg_hba.conf - add the following entries
hostssl all         all       x.x.x.x/32 trust # Own IP address
hostssl all         all       z.z.z.z/32 trust # Second node IP address
hostssl replication postgres  x.x.x.x/32 trust # Own IP address
hostssl replication postgres  z.z.z.z/32 trust # Second node IP address
```
- Restart PostgreSQL on both nodes. Sample commands are given below.

```
systemctl unmask postgresql
systemctl restart postgresql
```
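As an optional sanity check, not part of the original procedure, you can confirm from psql that the BDR library was loaded after the restart:

```sql
-- Should print 'bdr' if the postgresql.conf change was applied.
SHOW shared_preload_libraries;
```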
- Create the STATS_DB database and users. Sample commands are given below.

```sql
CREATE DATABASE stat_db;
CREATE ROLE stat_db_user WITH SUPERUSER LOGIN PASSWORD 'SuperPass';
GRANT ALL PRIVILEGES ON DATABASE stat_db TO stat_db_user;
```
- Create the BDR extension on the STATS_DB in both nodes. Sample commands are given below.

```sql
\c stat_db;
create extension pgcrypto;
create extension btree_gist;
create extension bdr;
```
You can check the BDR extension as follows:

```sql
SELECT bdr.bdr_variant();
```

```
stat_db=# SELECT bdr.bdr_variant();
 bdr_variant
-------------
 BDR
(1 row)
```

```sql
SELECT bdr.bdr_version();
```

```
stat_db=# SELECT bdr.bdr_version();
    bdr_version
-------------------
 1.0.2-2016-11-11-
(1 row)
```
- Create the first master node.

Warning: Do this step only in Node 1.

```sql
-- Creating the first master node
SELECT bdr.bdr_group_create(
    local_node_name := 'node1',
    node_external_dsn := 'host=<OWN EXTERNAL IP> port=5432 dbname=stat_db'
);
```
You can verify this as shown below.

```sql
SELECT bdr.bdr_node_join_wait_for_ready();
```

```
stat_db=# SELECT bdr.bdr_node_join_wait_for_ready();
 bdr_node_join_wait_for_ready
------------------------------
(1 row)
```
- Create the second master node.

Warning: Do this step only in Node 2.

```sql
-- Creating the second master node
SELECT bdr.bdr_group_join(
    local_node_name := 'node2',
    node_external_dsn := 'host=<OWN EXTERNAL IP> port=5432 dbname=stat_db',
    join_using_dsn := 'host=<NODE1 EXTERNAL IP> port=5432 dbname=stat_db'
);
```
You can verify this with the same command given in the previous step.
- Restore the database data.

Warning: Do this step in only one of the two nodes.

```
psql stat_db < backup_stat_db.sql
```
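As a final, optional smoke test, not part of the original procedure, you can verify that bidirectional replication works by creating a hypothetical throwaway table on one node and reading it on the other; BDR 1.x replicates both the DDL and the row.

```sql
-- On node 1: create a small test table and insert a row.
CREATE TABLE bdr_smoke_test (id int PRIMARY KEY, note text);
INSERT INTO bdr_smoke_test VALUES (1, 'hello from node1');

-- On node 2, after a moment: both the table and the row should appear.
SELECT * FROM bdr_smoke_test;

-- Clean up from either node; the DROP replicates as well.
DROP TABLE bdr_smoke_test;
```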
You have now successfully set up an active-active multi datacenter deployment.