WSO2 Enterprise Integrator (WSO2 EI) consists of four profiles (ESB, Message Broker, Business Process Server, and Analytics) that can persist a user's personally identifiable information (PII) in various sources, namely log files and RDBMSs. Organizations that use WSO2 EI have a legal obligation to remove all instances of a user's PII from the system if the user requests it. For example, an employee who resigns from the organization may ask the organization to remove all instances of their PII from its systems. You can fulfill this requirement by anonymizing the user's PII in the system or, in some cases, by completely removing it.
...
Every log statement follows the same pattern, in which the "USER_NAME" keyword is followed by an actual username ("Sam" in this example). The regex pattern shown below matches this log statement, and the Forget-Me Tool uses it to anonymize the username.
Add this pattern to the ei-patterns.xml file (stored in the <EI_HOME>/wso2/tools/forget-me/conf/log-config/ directory).
Code Block
<pattern key="pattern3">
    <detectPattern>(.)*(USER_NAME)(.)*${username}(.)*</detectPattern>
    <replacePattern>${username}</replacePattern>
</pattern>
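To see what this pattern does to a log line, the following Python sketch substitutes a username into the detect pattern and rewrites a matching line. It is only an illustration: `anonymize_line` is a hypothetical helper, and the way the Forget-Me Tool actually expands `${username}` and applies the replace pattern is assumed here rather than taken from the tool itself.

```python
import re

# Detect pattern copied from the ei-patterns.xml entry above.
DETECT_PATTERN = r"(.)*(USER_NAME)(.)*${username}(.)*"


def anonymize_line(line: str, username: str, pseudonym: str) -> str:
    """Replace the username in a line that matches the detect pattern.

    Hypothetical helper: the placeholder expansion below is an assumption
    about how the tool treats ${username}.
    """
    detect = DETECT_PATTERN.replace("${username}", re.escape(username))
    if re.fullmatch(detect, line) is None:
        return line  # the line does not follow the USER_NAME pattern
    return line.replace(username, pseudonym)


log_line = "[EI-Core] INFO - LogMediator USER_NAME = Sam"
print(anonymize_line(log_line, "Sam", "86c3bfd9"))
```

Lines that do not contain the "USER_NAME" keyword are returned unchanged, which is why every kind of log statement that carries PII needs its own pattern entry.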
Update the config.json file (stored in the <EI_HOME>/wso2/tools/forget-me/conf/ directory) as shown below. This file contains references to all the log files (except any service-specific log file) in the system that store the above user information. If you have enabled a service-specific log file, you need to add that file name (see the element descriptions given below).
Code Block
{
  "processors": [
    "log-file"
  ],
  "directories": [
    {
      "dir": "log-config",
      "type": "log-file",
      "processor": "log-file",
      "log-file-path": "<EI_HOME>/repository/logs",
      "log-file-name-regex": "(audit.log|warn.log|wso2carbon.log)(.)*"
    }
  ]
}
The elements in the above configuration are explained below.
- "processors": The processors listed for this element specify whether the tool will run on log files, RDBMSs, or analytics streams. In the case of the ESB profile, we only need to remove PII from log files, and therefore, the processor is set to "log-file".
- "directories": This element lists the directories that correspond to the processors. In the case of the ESB profile, we need to specify the directories that store log files.
  - "log-file-path": This specifies the directory path to the log files. Note that all the relevant log files are stored in the <EI_HOME>/repository/logs/ directory.
    Note Be sure to replace the "log-file-path" value with the correct absolute path to the location where the log files are stored. If you are on Windows, be sure to use the forward slash ("/") instead of the backslash ("\"). For example: C:/Users/Administrator/Desktop/wso2ei-6.2.0/repository/logs.
  - "log-file-name-regex": This gives the list of log files (stored in the log-file-path) that persist the user's PII. Note that the above log-file-name-regex includes the audit.log, warn.log, and wso2carbon.log files, as well as the archived files of the same logs. If you have enabled a service-specific log file, be sure to add the file name to this list.
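The effect of the "log-file-name-regex" value can be checked with a short Python sketch before running the tool. The file names in `candidates` are hypothetical, and matching the whole file name against the regex is an assumption about how the tool applies it.

```python
import re

# Value of "log-file-name-regex" from the config.json shown above.
LOG_FILE_NAME_REGEX = r"(audit.log|warn.log|wso2carbon.log)(.)*"


def select_log_files(file_names):
    """Return the names that the regex matches in full (assumed semantics)."""
    pattern = re.compile(LOG_FILE_NAME_REGEX)
    return [name for name in file_names if pattern.fullmatch(name)]


candidates = [
    "wso2carbon.log",
    "wso2carbon.log.2018-03-12",  # hypothetical archived log name
    "audit.log",
    "my-service.log",             # hypothetical service-specific log
]
print(select_log_files(candidates))
```

In this sketch the service-specific file is not selected, which illustrates why such a file name has to be added to the regex explicitly.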
Open a command prompt and navigate to the <EI_HOME>/bin directory.
Execute the following command to anonymize the user information that matches the pattern you added to the ei-patterns.xml file:
On Linux:
Code Block ./forgetme.sh -U Sam
On Windows:
Code Block forgetme.bat -U Sam
This will result in the following:
- Copies will be created of all the log files specified in the config.json file. The following is the format of the log copy: anon-<time_stamp>-<original_log_name>.log. For example, anon-1520946791793-warn.log.
- The PII will be anonymized in the copies. The log files will display the user information as a pseudonym.
Code Block [EI-Core] INFO - LogMediator USER_NAME = 86c3bfd9-f97c-4b08-9f15-772dcb0c1c
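When reviewing the generated copies, it can help to split a copy's file name back into its parts. The Python sketch below does this for the documented example name. `parse_anon_name` is a hypothetical helper, and reading the time stamp as milliseconds since the Unix epoch is an assumption that the source does not state.

```python
import re
from datetime import datetime, timezone

# "anon-", a numeric time stamp, "-", then the rest of the name.
ANON_NAME = re.compile(r"anon-(\d+)-(.+)")


def parse_anon_name(file_name):
    """Split an anonymized copy's name into (creation time, original name part)."""
    match = ANON_NAME.fullmatch(file_name)
    if match is None:
        return None  # not an anonymized copy
    millis = int(match.group(1))
    # Assumption: the time stamp is milliseconds since the Unix epoch.
    created = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    return created, match.group(2)


print(parse_anon_name("anon-1520946791793-warn.log"))
```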
Note For the list of commands you can run using the Forget-Me tool, see this link.
...
Anonymizing PII in the BPMN (activiti) component
The PII references stored by the BPMN component can be removed from log files as well as the BPMN-specific database by using the Forget-Me Tool.
Follow the steps given below.
- Add the relevant drivers for your BPMN-specific database to the <EI_HOME>/wso2/tools/forget-me/lib directory. For example, if you have changed your BPMN database from the default H2 database to MySQL, copy the MySQL driver to this directory.
- Open the activiti-datasources.xml file (stored in the <EI_HOME>/wso2/tools/forget-me/conf/datasources/ directory), and specify the details of the RDBMS that stores the metadata from BPMN workflows.
- Update the config.json file (stored in the <EI_HOME>/wso2/tools/forget-me/conf/ directory) as shown below. This file contains references to all the log files in the system, and the RDBMS that stores the user information from BPMN workflows.
Code Block
{
  "processors": [
    "log-file",
    "rdbms"
  ],
  "directories": [
    {
      "dir": "log-config",
      "type": "log-file",
      "processor": "log-file",
      "log-file-path": "<EI_HOME>/wso2/business-process/repository/logs",
      "log-file-name-regex": "(audit.log|warn.log|wso2carbon.log)(.)*"
    },
    {
      "dir": "sql",
      "type": "rdbms",
      "processor": "rdbms"
    }
  ],
  "extensions": [
    {
      "dir": "datasources",
      "type": "datasource",
      "processor": "rdbms"
    }
  ]
}
The elements in the above configuration are explained below.
- "processors": The processors listed for this element specify whether the tool will run for log files, RDBMSs, or analytics streams. In the case of the BPMN component of the BPS profile, we need to remove PII from log files, as well as the BPMN-specific database. Therefore, the processors are set to "log-file" and "rdbms".
- "directories": This element lists the directories that correspond to the processors. In the case of the BPMN component, we need to specify the directories that store log files, as well as the directory of the SQL scripts for the BPMN database. Therefore, the above configuration contains two directories: "log-config" and "sql".
  - "log-file-path": This specifies the directory path to the logs. In this example, all the relevant log files for BPS are stored in the <EI_HOME>/wso2/business-process/repository/logs/ directory.
    Note Be sure to replace the "log-file-path" value with the correct absolute path to the location where the log files are stored. If you are on Windows, be sure to use the forward slash ("/") instead of the backslash ("\"). For example: C:/Users/Administrator/Desktop/wso2ei-6.2.0/repository/logs.
  - "log-file-name-regex": This gives the list of log files (stored in the log-file-path) that persist the user's PII. Note that the above log-file-name-regex includes the audit.log, warn.log, and wso2carbon.log files, as well as the archived files of the same logs.
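The rules described for this configuration can be turned into a quick pre-flight check. The Python sketch below is not part of the Forget-Me Tool; `find_config_problems` is a hypothetical helper that encodes the notes above (every entry's processor is listed under "processors", and "log-file-path" is an absolute path written with forward slashes).

```python
import json

# The BPMN config.json from above, with the <EI_HOME> placeholder left in.
CONFIG_TEXT = """
{
  "processors": ["log-file", "rdbms"],
  "directories": [
    {"dir": "log-config", "type": "log-file", "processor": "log-file",
     "log-file-path": "<EI_HOME>/wso2/business-process/repository/logs",
     "log-file-name-regex": "(audit.log|warn.log|wso2carbon.log)(.)*"},
    {"dir": "sql", "type": "rdbms", "processor": "rdbms"}
  ],
  "extensions": [
    {"dir": "datasources", "type": "datasource", "processor": "rdbms"}
  ]
}
"""


def find_config_problems(config):
    """Return human-readable problems found in a Forget-Me config dict."""
    problems = []
    enabled = set(config.get("processors", []))
    entries = config.get("directories", []) + config.get("extensions", [])
    for entry in entries:
        name = entry.get("dir")
        if entry.get("processor") not in enabled:
            problems.append(f"{name}: processor not enabled")
        path = entry.get("log-file-path", "")
        if "\\" in path:
            problems.append(f'{name}: use "/" in log-file-path')
        if path.startswith("<EI_HOME>"):
            problems.append(f"{name}: replace <EI_HOME> with an absolute path")
    return problems


print(find_config_problems(json.loads(CONFIG_TEXT)))
```

Run against the configuration exactly as printed above, the only finding is the unreplaced <EI_HOME> placeholder in "log-file-path".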
Open a command prompt and navigate to the <EI_HOME>/bin directory.
Run the tool using the following command:
On Linux:
Code Block ./forgetme.sh -U <USERNAME>
On Windows:
Code Block forgetme.bat -U <USERNAME>
This will result in the following:
- Copies will be created of all the log files specified in the config.json file. The following is the format of the log copy: anon-<time_stamp>-<original_log_name>.log. For example, anon-1520946791793-warn.log.
- The PII will be anonymized in the copies. The log files will display the user information as a pseudonym.
- The user's PII will be removed from the BPMN database.
Note For the list of commands you can run using the Forget-Me tool, see this link.
...
Note that the PII is not removed from the original log files. It is the responsibility of the organization to remove the original log files that contain the user's PII.
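Because the original files are left in place, one way to decide which of them to delete is to scan the log directory for originals that still mention the user. The following Python sketch is only a suggestion under stated assumptions: `originals_containing` is a hypothetical helper, it treats every file whose name starts with "anon-" as an anonymized copy, and it does a plain substring search.

```python
from pathlib import Path


def originals_containing(log_dir, username):
    """List original (non-"anon-") log files that still mention the username."""
    hits = []
    for path in sorted(Path(log_dir).iterdir()):
        # Skip sub-directories and the anonymized copies made by the tool.
        if not path.is_file() or path.name.startswith("anon-"):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        if username in text:
            hits.append(path.name)
    return hits
```

For example, calling `originals_containing` with the absolute path of your log directory and the username "Sam" returns the names of the original files that still need to be removed.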
Removing Human Task and BPEL process instances
If you are using Human Tasks and BPEL workflows in your BPS profile, you can remove a user's personally identifiable information (PII) from the BPS instance by removing all process instances and task instances (associated with message exchanges) from the server.
WSO2 EI is shipped with a set of SQL scripts (stored in the bpel and humantask folders in the <EI_HOME>/wso2/business-process/repository/resources/cleanup-scripts directory) that you can use for removing process instances and task instances from the BPS profile. There are two ways of doing this:
...
Stream Name | Attribute List |
---|---|
org.wso2.gdpr.students | |
org.wso2.gdpr.students.marks | |
These PII references can be removed from the Analytics database by using the Forget-Me Tool. Follow the steps given below.
- Add the relevant drivers for your Analytics-specific databases to the <EI_HOME>/wso2/tools/forget-me/lib directory. For example, if you have changed your Analytics databases from the default H2 instances to MySQL, copy the MySQL driver to this directory.
- Create a folder named 'streams' in the <EI_HOME>/wso2/tools/forget-me/conf/ directory.
- Create a new file named streams.json with the content shown below, and store it in the streams directory that you created in the previous step. This file holds the details of the streams and the attributes with PII that need to be removed from the database.
Code Block
{
  "streams": [
    {
      "streamName": "org.wso2.gdpr.students",
      "attributes": ["username", "email", "dateOfBirth"],
      "id": "username"
    },
    {
      "streamName": "org.wso2.gdpr.students.marks",
      "attributes": ["username"],
      "id": "username"
    }
  ]
}
The above configuration includes the following:
- streamName: The name of the stream.
- attributes: The list of attributes that contain PII.
- id: The ID attribute, which holds the value that needs to be anonymized (replaced with a pseudonym).
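If you maintain many streams, the file can also be generated rather than written by hand. The Python sketch below writes the documented example to a streams/streams.json file under a given conf directory. `write_streams_file` is a hypothetical helper, and the check that each "id" also appears in "attributes" is an assumption drawn from the example, not a documented rule.

```python
import json
from pathlib import Path

# Stream definitions copied from the streams.json example above.
STREAMS = {
    "streams": [
        {"streamName": "org.wso2.gdpr.students",
         "attributes": ["username", "email", "dateOfBirth"],
         "id": "username"},
        {"streamName": "org.wso2.gdpr.students.marks",
         "attributes": ["username"],
         "id": "username"},
    ]
}


def write_streams_file(conf_dir, streams=STREAMS):
    """Create <conf_dir>/streams/streams.json and return its path."""
    for stream in streams["streams"]:
        # Assumption: the id attribute is also listed under "attributes",
        # as it is in both entries of the documented example.
        if stream["id"] not in stream["attributes"]:
            raise ValueError(f'{stream["streamName"]}: id not in attributes')
    target_dir = Path(conf_dir) / "streams"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "streams.json"
    target.write_text(json.dumps(streams, indent=2), encoding="utf-8")
    return target
```

Pass the absolute path of the <EI_HOME>/wso2/tools/forget-me/conf/ directory as `conf_dir`.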
Update the config.json file (stored in the <EI_HOME>/wso2/tools/forget-me/conf/ directory) as shown below.
Code Block
{
  "processors": [
    "analytics-streams"
  ],
  "directories": [
    {
      "dir": "analytics-streams",
      "type": "analytics-streams",
      "processor": "analytics-streams"
    }
  ]
}
Open a command prompt and navigate to the <EI_HOME>/bin directory.
Run the tool using the following command:
On Linux:
Code Block ./forgetme.sh -U <USERNAME> -carbon <EI_ANALYTICS_HOME>
On Windows:
Code Block forgetme.bat -U <USERNAME> -carbon <EI_ANALYTICS_HOME>
Note For the list of commands you can run using the Forget-Me tool, see this link.
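If you need to process several users, the command line can be assembled in a script. The Python sketch below only builds the argument list from the two options shown above (-U and -carbon). `build_forgetme_command` is a hypothetical helper, and the path passed in the example is a placeholder, not a real install location.

```python
def build_forgetme_command(username, carbon_home=None, windows=False):
    """Return the Forget-Me command line as an argument list."""
    script = "forgetme.bat" if windows else "./forgetme.sh"
    command = [script, "-U", username]
    if carbon_home is not None:
        # Only the Analytics example above passes the -carbon option.
        command += ["-carbon", carbon_home]
    return command


# Placeholder path, used only for illustration.
print(build_forgetme_command("Sam", carbon_home="/path/to/ei-analytics-home"))
```

The resulting list could then be passed to `subprocess.run` with the working directory set to <EI_HOME>/bin.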
Anonymizing PII of business process analytics
Shown below are the data streams used by the BPS profile of WSO2 EI along with sample attributes with PII references.
These PII references can be removed from the Analytics database by using the Forget-Me Tool. Follow the steps given below.