Solr custom transformers

Solr provides a facility for creating custom transformers. In this article we discuss what transformers are, why we need them, and how to create and configure one.

If you need any kind of custom processing before sending the row to Solr, you can write a transformer of your own.

Let us take an example use case. Suppose you have a field named "FULL_TEXT" in your schema, of type "text_general". The database stores only the file path, but we want to index the file's content. One solution is to write a ReadFileTransformer that reads the file content and passes it to Solr for indexing.

Step 1: Write ReadFileTransformer

To write a custom transformer in Solr, perform the following steps:

Add the solr-dataimporthandler and slf4j libraries to the project classpath
Extend the Transformer class
Override its transformRow method
Write the logic to read the file and put its content into the row map

package com.javadevzone;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.DataImporter;
import org.apache.solr.handler.dataimport.Transformer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;

/**
 * Created by JavaDeveloperZone on 7/27/2017.
 */
public class ReadFileTransformer extends Transformer {
    private static final Logger LOGGER = LoggerFactory.getLogger(ReadFileTransformer.class);

    @Override
    public Object transformRow(Map<String, Object> row, Context context) {
        List<Map<String, String>> fields = context.getAllEntityFields();
        for (Map<String, String> field : fields) {
            // Check if this field has readFile="true" specified in the data-config.xml
            String readFile = field.get("readFile");
            if ("true".equals(readFile)) {
                String columnName = field.get(DataImporter.COLUMN);
                // Get this field's value (the file path) from the current row
                Object filePath = row.get(columnName);
                // Read file content and put the updated value back in the current row
                if (filePath != null) {
                    try {
                        Path path = Paths.get(filePath.toString());
                        if (Files.exists(path) && !Files.isDirectory(path)) {
                            byte[] fileContent = Files.readAllBytes(path);
                            // Assumes the files are UTF-8 encoded; change the charset if they are not
                            row.put(columnName, new String(fileContent, StandardCharsets.UTF_8));
                        }
                    } catch (Exception e) {
                        // On failure the row keeps the original file path value
                        LOGGER.error("Error while reading file: " + filePath, e);
                    }
                }
            }
        }
        return row;
    }
}
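The file-reading step can also be tried outside Solr. The following is a minimal sketch, not part of the original transformer: FileColumnReader and replacePathWithContent are hypothetical names, the row is a plain Map as in transformRow, and the files are assumed to be UTF-8 encoded.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical helper (not a Solr class): isolates the file-reading step
// of ReadFileTransformer so it can be tested without a running Solr.
final class FileColumnReader {
    private FileColumnReader() {
    }

    /**
     * If row[columnName] holds the path of an existing, non-directory file,
     * replaces that value with the file's content.
     *
     * @return true if the value was replaced, false if the row was left unchanged
     */
    static boolean replacePathWithContent(Map<String, Object> row, String columnName)
            throws IOException {
        Object filePath = row.get(columnName);
        if (filePath == null) {
            return false; // column missing or NULL in the database
        }
        Path path = Paths.get(filePath.toString());
        if (!Files.exists(path) || Files.isDirectory(path)) {
            return false; // nothing readable at that path
        }
        // Assumption: the file is UTF-8 encoded
        byte[] fileContent = Files.readAllBytes(path);
        row.put(columnName, new String(fileContent, StandardCharsets.UTF_8));
        return true;
    }
}
```

Calling replacePathWithContent(row, "FULL_TEXT_1") on a row whose FULL_TEXT_1 value points to a text file leaves the file's content in that column; a row whose path does not exist is returned unchanged, which matches the transformer's behavior of indexing the original value when the file cannot be read.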

Step 2: Configure ReadFileTransformer

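Configure ReadFileTransformer in db-data-config.xml as below. The transformer attribute on the entity registers the class by its fully qualified name, and readFile="true" marks the columns whose file paths should be replaced with file content.

<dataConfig>
    <dataSource name="jdbc-1" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/products" user="root" password="root@123" />
    <document name="products">
        <entity name="item" dataSource="jdbc-1" query="select * from item" transformer="com.javadevzone.ReadFileTransformer">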
            <field column="ID" name="ID" />
            <field column="FULL_TEXT_1" name="FULL_TEXT_1" readFile="true"/>
            <field column="FULL_TEXT_2" name="FULL_TEXT_2" readFile="true"/>
        </entity>
    </document>
</dataConfig>
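To illustrate how the custom readFile attribute reaches the transformer: the transformer code iterates context.getAllEntityFields(), a list with one map of attributes per field element. The sketch below imitates that list with plain maps for the three fields configured above; ReadFileColumns, field and flaggedColumns are hypothetical names used only for illustration, and the "column" key is assumed to be the value behind DataImporter.COLUMN.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration (not a Solr class): models each <field .../>
// element as a map of its attributes and picks the columns flagged
// with readFile="true".
final class ReadFileColumns {
    private ReadFileColumns() {
    }

    /** Builds the attribute map for one field element; readFile may be null. */
    static Map<String, String> field(String column, String name, String readFile) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("column", column);
        attributes.put("name", name);
        if (readFile != null) {
            attributes.put("readFile", readFile);
        }
        return attributes;
    }

    /** Returns the column names whose field element carries readFile="true". */
    static List<String> flaggedColumns(List<Map<String, String>> fields) {
        List<String> columns = new ArrayList<>();
        for (Map<String, String> field : fields) {
            if ("true".equals(field.get("readFile"))) {
                columns.add(field.get("column"));
            }
        }
        return columns;
    }
}
```

With the three fields from the configuration above, flaggedColumns selects FULL_TEXT_1 and FULL_TEXT_2 and skips ID, because ID has no readFile attribute.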

Step 3: Configure ReadFileTransformer dependency

Build the project into a JAR and add it to the Solr core's lib directory so that Solr can load the ReadFileTransformer class.

Refer to DIHCustomTransformer for more details.

Tags: indexing, solr, transformers

Java Developer Zone

http://javadeveloperzone.com
