

Solr provides a facility for creating custom transformers. In this article we discuss what transformers are, why we need them, and how to create and configure one.
If you need any kind of custom processing before sending the row to Solr, you can write a transformer of your own.
Let us take an example use case. Suppose you have a field named "FULL_TEXT" in your schema with type="text_general". The database stores only the file path, but we want to index the file content. One solution is to write a ReadFileTransformer that reads the file content and passes it to Solr for indexing.
Step 1: Write ReadFileTransformer
To write a custom transformer in Solr, perform the following steps:
- Add the solr-dataimporthandler and slf4j libraries to the project classpath
- Extend the Transformer class
- Override its transformRow method
- Write the logic to read the file and put its content into the row map
package com.javadevzone;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.DataImporter;
import org.apache.solr.handler.dataimport.Transformer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;

/**
 * Created by JavaDeveloperZone on 7/27/2017.
 */
public class ReadFileTransformer extends Transformer {

    private static final Logger LOGGER = LoggerFactory.getLogger(ReadFileTransformer.class);

    @Override
    public Object transformRow(Map<String, Object> row, Context context) {
        List<Map<String, String>> fields = context.getAllEntityFields();
        for (Map<String, String> field : fields) {
            // Check if this field has readFile="true" specified in the data-config.xml
            String readFile = field.get("readFile");
            if ("true".equals(readFile)) {
                String columnName = field.get(DataImporter.COLUMN);
                // Get this field's value (the file path) from the current row
                Object filePath = row.get(columnName);
                // Read file content and put the updated value back in the current row
                if (filePath != null) {
                    try {
                        Path path = Paths.get(filePath.toString());
                        if (Files.exists(path) && !Files.isDirectory(path)) {
                            byte[] fileContent = Files.readAllBytes(path);
                            // Decode as UTF-8 (assumed encoding) instead of the platform default
                            row.put(columnName, new String(fileContent, StandardCharsets.UTF_8));
                        }
                    } catch (Exception e) {
                        LOGGER.error("Error while reading file: " + filePath, e);
                    }
                }
            }
        }
        return row;
    }
}
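The file-reading step can be tried outside Solr before wiring it into the Data Import Handler. The following is a minimal sketch that uses only the JDK: `FileContentReader` and `replacePathWithContent` are hypothetical names introduced for illustration, not part of Solr, and the sketch assumes the files are UTF-8 text.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: mirrors the transformer's file-reading step without Solr classes.
final class FileContentReader {

    // Replaces the path stored under columnName with the content of that file.
    // The row is left unchanged when the value is missing or is not a regular file.
    static Map<String, Object> replacePathWithContent(Map<String, Object> row, String columnName)
            throws IOException {
        Object filePath = row.get(columnName);
        if (filePath == null) {
            return row;
        }
        Path path = Paths.get(filePath.toString());
        if (Files.isRegularFile(path)) {
            // Assumption: the file is UTF-8 encoded text.
            row.put(columnName, new String(Files.readAllBytes(path), StandardCharsets.UTF_8));
        }
        return row;
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway file so the example does not depend on any existing path.
        Path tmp = Files.createTempFile("full-text", ".txt");
        Files.write(tmp, "hello solr".getBytes(StandardCharsets.UTF_8));

        Map<String, Object> row = new HashMap<>();
        row.put("FULL_TEXT_1", tmp.toString());
        replacePathWithContent(row, "FULL_TEXT_1");

        System.out.println(row.get("FULL_TEXT_1"));
        Files.delete(tmp);
    }
}
```

Keeping this logic in a plain static method makes it possible to test the read behaviour (missing value, missing file, directory) without starting a Solr core.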
Step 2: Configure ReadFileTransformer
Configure ReadFileTransformer in db-data-config.xml as shown below. The entity must name the transformer class in its transformer attribute; otherwise the transformer is never applied. Each field marked with readFile="true" has its file path replaced by the file content.
<dataConfig>
    <dataSource name="jdbc-1"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost/products"
                user="root"
                password="root@123" />
    <document name="products">
        <entity name="item" dataSource="jdbc-1"
                transformer="com.javadevzone.ReadFileTransformer"
                query="select * from item">
            <field column="ID" name="ID" />
            <field column="FULL_TEXT_1" name="FULL_TEXT_1" readFile="true"/>
            <field column="FULL_TEXT_2" name="FULL_TEXT_2" readFile="true"/>
        </entity>
    </document>
</dataConfig>
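Each field element of the entity reaches the transformer as a map of its attributes, which is how readFile="true" is detected. The sketch below imitates that lookup with plain maps standing in for the result of `context.getAllEntityFields()`. `ReadFileColumns`, `columnsToRead`, and `field` are hypothetical names, and the `"column"` key is assumed to match the value of `DataImporter.COLUMN`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the attribute check done inside transformRow.
final class ReadFileColumns {

    // Returns the column names of all fields declared with readFile="true".
    static List<String> columnsToRead(List<Map<String, String>> fields) {
        List<String> columns = new ArrayList<>();
        for (Map<String, String> field : fields) {
            if ("true".equals(field.get("readFile"))) {
                // Assumption: "column" is the key Solr exposes as DataImporter.COLUMN.
                columns.add(field.get("column"));
            }
        }
        return columns;
    }

    // Builds an attribute map like the one produced for a <field .../> element.
    static Map<String, String> field(String column, String readFile) {
        Map<String, String> attributes = new HashMap<>();
        attributes.put("column", column);
        attributes.put("name", column);
        if (readFile != null) {
            attributes.put("readFile", readFile);
        }
        return attributes;
    }

    public static void main(String[] args) {
        // Mirrors the three fields of the "item" entity in the configuration.
        List<Map<String, String>> fields = new ArrayList<>();
        fields.add(field("ID", null));
        fields.add(field("FULL_TEXT_1", "true"));
        fields.add(field("FULL_TEXT_2", "true"));

        System.out.println(columnsToRead(fields));
    }
}
```

With the configuration shown, only FULL_TEXT_1 and FULL_TEXT_2 are selected, so the ID column passes through unchanged.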
Step 3: Configure ReadFileTransformer dependency
Build the project into a JAR and add it to the Solr core's lib directory.
Refer to DIHCustomTransformer for more details.