Solr provides the option to configure synonyms for use during both indexing and querying of textual data.

Consider, for example, the terms MB, mib, megabyte and megabytes: any of these four variations may appear in our documents or site content. If we want a search for one of them to also find the others, one option is to configure synonyms. It is generally a good idea to apply synonyms at query time, because applying them at index time can increase the index size.

Let’s look at a complete example of configuring and searching with synonyms, and at how it behaves.

Step 1 : Create a field type or change an existing one

We need to add a filter called SynonymFilterFactory to our field type definition to enable synonyms while indexing or searching.

<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>

Configuration parameters:

1.  synonyms

It indicates the name of the synonyms file.

2. ignoreCase

It specifies case sensitivity. If its value is true, synonyms are matched without regard to case.

3. expand

If it is true, a synonym is expanded to all equivalent synonyms. If it is false, all equivalent synonyms are reduced to the first in the list. According to the Solr documentation, this setting applies to comma-separated lists of equivalent terms; explicit mappings written with => ignore it.
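To make the effect of expand concrete, here is a minimal Python sketch of how one comma-separated list of equivalent terms is treated. It is only a toy model written for this post, not Solr's implementation; the function name apply_equivalents is made up for illustration, and the lowercasing is a simplification of ignoreCase="true".

```python
def apply_equivalents(token, group, expand=True):
    """Toy model of how `expand` treats one comma-separated equivalence list.

    Illustration only: this is not how Solr implements the filter.
    """
    lowered = [term.lower() for term in group]  # simplified ignoreCase="true"
    if token.lower() not in lowered:
        return [token]      # token is not in the list: leave it untouched
    if expand:
        return lowered      # expand="true": emit every equivalent term
    return lowered[:1]      # expand="false": reduce to the first term


group = ["MB", "mib", "megabyte", "megabytes"]
print(apply_equivalents("megabyte", group, expand=True))
# ['mb', 'mib', 'megabyte', 'megabytes']
print(apply_equivalents("megabyte", group, expand=False))
# ['mb']
```

With expand=False every variation collapses to the first term of the list, so the order of the terms in the list decides which one survives.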

FieldType configuration :

<fieldType name="text_gen_synonyms" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
</fieldType>

Step 2 : Field configuration:

We use the field type created above in our field definition.

<field name="FULL_TEXT" type="text_gen_synonyms" indexed="true" stored="true"/>

Step 3 : Configure synonyms:

Add the synonym mappings below to the synonyms.txt file. An explicit mapping in a Solr synonyms file has a fixed format:

word => synonyms_word_1 , synonyms_word_2

pixima => pixma
GB => gib,gigabyte,gigabytes
MB => mib,megabyte,megabytes
TV => Television, Televisions, TVs
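As a rough illustration of what these mappings mean for a query such as “tv GB”, the following Python sketch parses the four lines above and rewrites a token list. The helper names parse_synonyms and rewrite are hypothetical, the parser handles only the => form, and the real filter places synonyms at the same token position instead of producing a flat list.

```python
def parse_synonyms(text):
    """Parse explicit `lhs => rhs1, rhs2` lines into a dict (toy parser)."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" not in line:
            continue  # equivalence lists are out of scope for this sketch
        lhs, rhs = line.split("=>", 1)
        targets = [t.strip().lower() for t in rhs.split(",") if t.strip()]
        for source in lhs.split(","):
            mapping[source.strip().lower()] = targets
    return mapping


def rewrite(tokens, mapping):
    """Replace each token by its mapped terms; unmapped tokens pass through."""
    out = []
    for token in tokens:
        out.extend(mapping.get(token.lower(), [token.lower()]))
    return out


SYNONYMS = """\
pixima => pixma
GB => gib,gigabyte,gigabytes
MB => mib,megabyte,megabytes
TV => Television, Televisions, TVs
"""

mapping = parse_synonyms(SYNONYMS)
print(rewrite(["tv", "GB"], mapping))
# ['television', 'televisions', 'tvs', 'gib', 'gigabyte', 'gigabytes']
```

Note that the left-hand term itself is not in the result. The Solr documentation describes explicit mappings as replacing the left-hand side with the alternatives on the right, so if the original term should keep matching, it needs to be repeated on the right-hand side.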

That’s it. Now, to verify our synonym configuration, do the following in the Solr Admin UI.

  1. Select the Solr core name from the drop-down list.
  2. Click on Analysis.
  3. Select the field name that we created earlier.
  4. Enter text in Field Value (Query), such as “tv GB”, and click Analyse Values.
  5. You will see that Solr’s SynonymFilterFactory has added the synonyms for each term. All the synonyms are present in the output; the filter in the field type definition is configured with expand=true.
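The same check can be scripted. The sketch below only builds the request URL for Solr's field analysis handler (/analysis/field), with the analysis.fieldname and analysis.query parameters. The host, port and core name mycore are assumptions, and it is worth confirming that the handler is available in your Solr version before relying on it.

```python
from urllib.parse import urlencode


def build_analysis_url(base_url, core, field_name, query_text):
    """Build a field-analysis URL for query-time analysis of `query_text`."""
    params = {
        "analysis.fieldname": field_name,  # field whose query analyzer is used
        "analysis.query": query_text,      # text to run through that analyzer
        "wt": "json",                      # ask for a JSON response
    }
    return f"{base_url}/{core}/analysis/field?{urlencode(params)}"


# Host and core name are placeholders; FULL_TEXT is the field from Step 2.
url = build_analysis_url("http://localhost:8983/solr", "mycore", "FULL_TEXT", "tv GB")
print(url)
# http://localhost:8983/solr/mycore/analysis/field?analysis.fieldname=FULL_TEXT&analysis.query=tv+GB&wt=json
```

Opening this URL in a browser, or fetching it with any HTTP client, should return the token stream after each filter, similar to what the Analysis screen shows.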

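If we set expand=false in our field type definition, as below,

<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>

then each list of equivalent synonyms is reduced to its first term. This applies to entries written as comma-separated equivalents (for example MB,mib,megabyte,megabytes); according to the Solr documentation, explicit mappings written with => ignore the expand parameter.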


Refer to the SynonymFilterFactory documentation for more details.

 
