Add Apache Solr-Based Indexing

API Manager uses Apache Solr-based indexing for API documentation content. This gives both the API Publisher and the API Store a full-text search facility for searching through API documentation and finding documents and their related APIs. The search syntax is doc:keyword. The search looks for the keyword in any word or phrase in the documentation content and returns both the matching documents and the associated APIs.

The following media types have Apache Solr-based indexers by default. The indexers are configured using the <indexers> element in <APIM_HOME>/repository/conf/registry.xml.

Text : text/plain
PDF : application/pdf
MS Word : application/msword
MS PowerPoint : application/vnd.ms-powerpoint
MS Excel : application/vnd.ms-excel
XML : application/xml

Writing a custom indexer

In addition to the default ones, you can write your own indexer implementation and register it as follows:

Write a custom indexer. Given below is a sample indexer code.

package org.wso2.indexing.sample;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Arrays;
import org.apache.solr.common.SolrException;
import org.wso2.carbon.registry.core.exceptions.RegistryException;
import org.wso2.carbon.registry.core.utils.RegistryUtils;
import org.wso2.carbon.registry.indexing.IndexingConstants;
import org.wso2.carbon.registry.indexing.AsyncIndexer.File2Index;
import org.wso2.carbon.registry.indexing.indexer.Indexer;
import org.wso2.carbon.registry.indexing.solr.IndexDocument;

public class PlainTextIndexer implements Indexer {
    public IndexDocument getIndexedDocument(File2Index fileData) throws SolrException,
            RegistryException {

        /* Create an index document from the resource path and raw content */
        IndexDocument indexDoc = new IndexDocument(fileData.path,
                RegistryUtils.decodeBytes(fileData.data), null);

        /* Specify the required field/value pairs for this index document.
         * These fields can be queried when searching. */
        Map<String, List<String>> fields = new HashMap<String, List<String>>();
        fields.put("path", Arrays.asList(fileData.path));

        if (fileData.mediaType != null) {
            fields.put(IndexingConstants.FIELD_MEDIA_TYPE, Arrays.asList(fileData.mediaType));
        } else {
            fields.put(IndexingConstants.FIELD_MEDIA_TYPE, Arrays.asList("text/plain"));
        }

        /* Set the fields on the index document */
        indexDoc.setFields(fields);
        return indexDoc;
    }
}

Add the custom indexer JAR file to the <APIM_HOME>/repository/components/lib directory.
Update the <indexers> element in the <APIM_HOME>/repository/conf/registry.xml file with the new indexer. Content is indexed by the indexer whose media type pattern matches the content's media type. For example:
```
<indexers>
     <indexer class="org.wso2.indexing.sample.PlainTextIndexer" mediaTypeRegEx="text/plain" profiles="default,api-store,api-publisher"/>
</indexers>
```
The attributes of the above configuration are described below:
class : Java class name of the indexer
mediaTypeRegEx : A regex pattern to match the media type
profiles : APIM profiles in which the indexer is available
Restart the server.
Add API documentation using the new media type, and then search for a term in the documentation using the syntax doc:keyword. The search results show that the documentation has been indexed according to its media type.
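
Before deploying the JAR, it can help to check the field-building logic of the sample indexer on its own. The following is a minimal, dependency-free sketch of that logic, not the indexer itself. The class name IndexFieldsSketch, the method buildFields, the constant value "mediaType" (a stand-in for IndexingConstants.FIELD_MEDIA_TYPE, whose real value may differ), and the resource path are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IndexFieldsSketch {

    // Assumption: stand-in for IndexingConstants.FIELD_MEDIA_TYPE; the real value may differ.
    public static final String FIELD_MEDIA_TYPE = "mediaType";

    // Builds the same field map as the sample indexer: the resource path,
    // plus the media type with a fallback to text/plain when none is set.
    public static Map<String, List<String>> buildFields(String path, String mediaType) {
        Map<String, List<String>> fields = new HashMap<>();
        fields.put("path", Arrays.asList(path));
        String effectiveType = (mediaType != null) ? mediaType : "text/plain";
        fields.put(FIELD_MEDIA_TYPE, Arrays.asList(effectiveType));
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical resource path, used only for illustration.
        System.out.println(buildFields("/hypothetical/docs/guide.txt", null));
    }
}
```

Keeping this logic in a small static method makes the fallback to text/plain easy to unit test without starting the server.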

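
To illustrate how a media type pattern selects an indexer, the following sketch looks up an indexer class name by testing a media type against registered regex patterns. It assumes the pattern must match the entire media type string (Pattern.matches semantics) and that the first matching entry wins; the actual registry behavior may differ. The class MediaTypeMatchSketch, the method findIndexer, and the indexer com.example.HypotheticalBinaryIndexer are hypothetical names used only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class MediaTypeMatchSketch {

    // Hypothetical pattern-to-class table, mirroring <indexer> entries in registry.xml.
    private static final Map<String, String> INDEXERS = new LinkedHashMap<>();
    static {
        INDEXERS.put("text/plain", "org.wso2.indexing.sample.PlainTextIndexer");
        // Hypothetical second entry showing a pattern that covers two media types.
        INDEXERS.put("application/(pdf|msword)", "com.example.HypotheticalBinaryIndexer");
    }

    // Returns the class name of the first indexer whose pattern matches the
    // whole media type, or null if no pattern matches (assumed semantics).
    public static String findIndexer(String mediaType) {
        for (Map.Entry<String, String> entry : INDEXERS.entrySet()) {
            if (Pattern.matches(entry.getKey(), mediaType)) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(findIndexer("text/plain"));
        System.out.println(findIndexer("application/pdf"));
        System.out.println(findIndexer("image/png"));
    }
}
```

Under these assumptions, a pattern such as text/plain matches only that exact media type, while a broader pattern such as text/.* would route every text subtype to the same indexer.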