

Solr supports indexing and searching documents in many languages. In this article we discuss how to index and search content in Hindi, one of the most widely spoken languages in India and an official language of the country.
Solr provides three filters for handling Hindi text:
- IndicNormalizationFilterFactory
- HindiNormalizationFilterFactory
- HindiStemFilterFactory
Let’s look at how to configure these filter factories and use them.
Step 1: Create FieldType
Create a custom fieldType and add the filter factories listed above, as shown below.
<fieldType name="text_hindi" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Synonyms are expanded at index time only. -->
    <filter class="solr.SynonymFilterFactory" synonyms="hindi/synonyms.txt" ignoreCase="true" expand="true"/>
    <!-- Case-insensitive stop word removal.
         enablePositionIncrements="true" in both the index and query
         analyzers leaves a 'gap' for more accurate phrase queries.
         Newer Solr releases (5.0 and later) no longer accept this
         attribute; remove it on those versions.
    -->
    <filter class="solr.StopFilterFactory" words="hindi/stopwords.txt" ignoreCase="true" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Mark the words in protwords.txt as keywords so the stemmer skips them.
         HindiStemFilterFactory itself does not take a "protected" attribute. -->
    <filter class="solr.KeywordMarkerFilterFactory" protected="hindi/protwords.txt"/>
    <!-- Normalize before stemming so that spelling variants reduce to the same stem. -->
    <filter class="solr.IndicNormalizationFilterFactory"/>
    <filter class="solr.HindiNormalizationFilterFactory"/>
    <filter class="solr.HindiStemFilterFactory"/>
    <!-- Runs last so duplicates produced by the earlier filters are removed. -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="hindi/stopwords.txt" ignoreCase="true" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="hindi/protwords.txt"/>
    <filter class="solr.IndicNormalizationFilterFactory"/>
    <filter class="solr.HindiNormalizationFilterFactory"/>
    <filter class="solr.HindiStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
Step 2: Field Configuration
Now use the field type created above in the field definition.
<field name="FULL_TEXT" type="text_hindi" indexed="true" stored="true"/>
Step 3: Add documents
Add documents that contain Hindi content, such as “जावा डेवलपर ज़ोन बहुत अच्छे ब्लॉग लिखते हैं”. Here we use the document upload screen of the Solr admin dashboard.
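As a rough illustration, the sample sentence could be submitted as a Solr XML update message, for example by pasting it into the upload screen with the document type set to XML. The id field and its value are assumptions made for this sketch: it assumes the schema defines id as the unique key, which this article does not show.

```xml
<!-- Sketch of an XML update message. The id field and its value are
     assumed for illustration; only FULL_TEXT comes from this article. -->
<add>
  <doc>
    <field name="id">hindi-doc-1</field>
    <field name="FULL_TEXT">जावा डेवलपर ज़ोन बहुत अच्छे ब्लॉग लिखते हैं</field>
  </doc>
</add>
```

Depending on the commit settings of the core, a commit may be needed before the document becomes visible to searches.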

Step 4: Search documents
That’s it. To test whether a particular document has been indexed, run a query such as FULL_TEXT:"जावा डेवलपर". Solr should return one document.

Refer to Language Analysis, Stemming, Configure stop words, and Configure synonyms for more details.
