

Solr lets you create custom transformers. In this article we discuss what transformers are, why we need them, and how to create and configure one.
If you need any kind of custom processing before a row is sent to Solr, you can write a transformer of your own.
Let us take an example use case. Suppose you have a field named “FULL_TEXT” in your schema, of type “text_general”. The database stores only the file path, but we want to index the file content. One solution is to write a ReadFileTransformer that reads the file content and passes it to Solr for indexing.
Step 1: Write ReadFileTransformer
To write a custom transformer in Solr, perform the following steps.
- Add the solr-dataimporthandler and slf4j libraries to the project classpath
- Extend the Transformer class
- Override its transformRow method
- Write the logic to read the file and put its content into the row map
package com.javadevzone;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.DataImporter;
import org.apache.solr.handler.dataimport.Transformer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;

/**
 * Created by JavaDeveloperZone on 7/27/2017.
 */
public class ReadFileTransformer extends Transformer {

    private static final Logger LOGGER = LoggerFactory.getLogger(ReadFileTransformer.class);

    @Override
    public Object transformRow(Map<String, Object> row, Context context) {
        List<Map<String, String>> fields = context.getAllEntityFields();
        for (Map<String, String> field : fields) {
            // Check if this field has readFile="true" specified in the data-config.xml
            String readFile = field.get("readFile");
            if ("true".equals(readFile)) {
                String columnName = field.get(DataImporter.COLUMN);
                // Get this field's value (the file path) from the current row
                Object filePath = row.get(columnName);
                // Read file content and put the updated value back in the current row
                if (filePath != null) {
                    try {
                        Path path = Paths.get(filePath.toString());
                        if (Files.exists(path) && !Files.isDirectory(path)) {
                            byte[] fileContent = Files.readAllBytes(path);
                            // Note: decodes the bytes with the platform default charset
                            row.put(columnName, new String(fileContent, 0, fileContent.length));
                        }
                    } catch (Exception e) {
                        // On failure the row keeps the original file path value
                        LOGGER.error("Error while reading file!!! ", e);
                    }
                }
            }
        }
        return row;
    }
}
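The core logic of the transformer can be tried out without a running Solr instance. The following is only a minimal sketch under stated assumptions: the class ReadFileSketch and its two-argument transformRow method are hypothetical and not part of Solr, the list of attribute maps stands in for what context.getAllEntityFields() returns, the "column" key is assumed to match DataImporter.COLUMN, and the files are assumed to be UTF-8 encoded.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; not part of Solr or of the original project.
final class ReadFileSketch {

    // 'fields' plays the role of context.getAllEntityFields():
    // one attribute map per <field> element of the entity.
    static Map<String, Object> transformRow(Map<String, Object> row,
                                            List<Map<String, String>> fields) {
        for (Map<String, String> field : fields) {
            if (!"true".equals(field.get("readFile"))) {
                continue; // field is not flagged, leave its value untouched
            }
            // Assumption: "column" is the key that DataImporter.COLUMN refers to.
            String columnName = field.get("column");
            Object filePath = row.get(columnName);
            if (filePath == null) {
                continue;
            }
            Path path = Paths.get(filePath.toString());
            if (Files.exists(path) && !Files.isDirectory(path)) {
                try {
                    byte[] bytes = Files.readAllBytes(path);
                    // Assumption: the file is UTF-8 encoded.
                    row.put(columnName, new String(bytes, StandardCharsets.UTF_8));
                } catch (IOException e) {
                    // Keep the original path value if the file cannot be read.
                    System.err.println("Error while reading file: " + e.getMessage());
                }
            }
        }
        return row;
    }

    public static void main(String[] args) throws IOException {
        // Create a temporary file that stands in for a path stored in the database.
        Path tmp = Files.createTempFile("full-text", ".txt");
        Files.write(tmp, "hello solr".getBytes(StandardCharsets.UTF_8));

        Map<String, Object> row = new HashMap<>();
        row.put("ID", 1);
        row.put("FULL_TEXT_1", tmp.toString());

        Map<String, String> idField = new HashMap<>();
        idField.put("column", "ID");
        Map<String, String> textField = new HashMap<>();
        textField.put("column", "FULL_TEXT_1");
        textField.put("readFile", "true");

        transformRow(row, Arrays.asList(idField, textField));
        System.out.println(row.get("FULL_TEXT_1")); // the file content, no longer the path
        Files.delete(tmp);
    }
}
```

Unlike the transformer above, this sketch decodes with an explicit charset instead of the platform default, which may be worth considering if the indexed files are not all in the server's default encoding.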
Step 2: Configure ReadFileTransformer
Configure ReadFileTransformer in db-data-config.xml as below. The transformer processes every field that has readFile="true".
<dataConfig>
    <dataSource name="jdbc-1" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/products" user="root" password="root@123" />
    <document name="products">
        <!-- The transformer attribute registers the custom transformer for this entity -->
        <entity name="item" dataSource="jdbc-1" query="select * from item" transformer="com.javadevzone.ReadFileTransformer">
            <field column="ID" name="ID" />
            <field column="FULL_TEXT_1" name="FULL_TEXT_1" readFile="true"/>
            <field column="FULL_TEXT_2" name="FULL_TEXT_2" readFile="true"/>
        </entity>
    </document>
</dataConfig>
Step 3: Configure ReadFileTransformer dependency
Build the project as a JAR and add it to the Solr core's lib directory.
Refer to DIHCustomTransformer for more details.
