1. Overview

Solr supports fuzzy search based on the Damerau-Levenshtein distance, an edit distance algorithm.

Fuzzy searches find terms that are similar to a specified term without necessarily being an exact match.

2. How it works

Solr first generates all possible matching terms that are within the maximum edit distance specified for the query term, and then checks the term dictionary to find out which of those generated terms actually exist in the index.
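To make the idea of "terms within a maximum edit distance" concrete, here is a minimal Python sketch. It is not how Solr is implemented: instead of generating candidate terms, it simply compares the query term against every term of a small in-memory list, and it uses the optimal string alignment variant of Damerau-Levenshtein. The names osa_distance and fuzzy_matches and the sample term list are made up for illustration.

```python
from typing import List


def osa_distance(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, substitutions and
    transpositions of adjacent characters (optimal string alignment)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent characters swapped: count as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def fuzzy_matches(term: str, dictionary: List[str], max_edits: int = 2) -> List[str]:
    """Return the dictionary terms within max_edits of term (hypothetical helper)."""
    return [t for t in dictionary if osa_distance(term, t) <= max_edits]


# Hypothetical term dictionary, standing in for the indexed terms.
terms = ["samsung", "apple", "samsuns", "sony"]
close_terms = fuzzy_matches("samsun", terms, max_edits=2)
```

A brute-force comparison like this is only practical for a tiny list; it is meant to show what "within edit distance 2" means, not how to search a real index.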

3. Syntax

The ~ operator is used to run fuzzy searches. Add the ~ operator after each term; an optional edit distance can be specified right after it, as shown below.

FIELD_NAME:TERM_1~{Edit_Distance} OR
  FIELD_NAME:TERM_2~{Edit_Distance} OR
  FIELD_NAME:TERM_3~{Edit_Distance}
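The syntax above can be assembled programmatically. The following Python sketch builds fuzzy clauses and joins them with a boolean operator. The helper names fuzzy_clause and fuzzy_query are hypothetical, and the sketch assumes the terms contain no characters that would need escaping in the Solr query syntax.

```python
from typing import Iterable, Optional


def fuzzy_clause(field: str, term: str, edits: Optional[int] = None) -> str:
    """Build FIELD_NAME:TERM~ or FIELD_NAME:TERM~N (hypothetical helper).

    When edits is None the bare ~ is emitted, so the default edit
    distance applies.
    """
    suffix = "~" if edits is None else f"~{edits}"
    return f"{field}:{term}{suffix}"


def fuzzy_query(clauses: Iterable[str], operator: str = "OR") -> str:
    """Join individual clauses with a boolean operator such as OR or AND."""
    return f" {operator} ".join(clauses)


query = fuzzy_query(
    [fuzzy_clause("manu", "Samsun", 1), fuzzy_clause("manu", "Electroni", 2)],
    operator="AND",
)
```

Here query holds a two-term fuzzy query string that can be passed as the q parameter of a select request.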

4. Examples

Now let’s look at a couple of examples for better understanding. We have indexed the techproducts example data and use it in all examples.

4.1 Without Edit Distance Example

If no edit distance is specified, Solr uses 2 as the default edit distance.

4.1.1 Query

http://localhost:8983/solr/FuzzySearchExample/select?
  indent=on&
  q=manu:Samsun~&
  wt=json&
  fl=id,manu

The above query will match terms such as samsung, samsun, samsuns, etc.

4.1.2 Output

{
  "responseHeader":{
    "status":0,
    "QTime":198,
    "params":{
      "q":"manu:Samsun~",
      "indent":"on",
      "fl":"id,manu",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"SP2514N",
        "manu":"Samsung Electronics Co. Ltd."}]
  }}
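For readers who want to issue such a query from code, the sketch below builds the request URL with proper encoding and reads the documents out of a JSON response shaped like the one above. It makes no network call; the function names build_select_url and matched_docs are hypothetical, and the base URL assumes the same local core used in this article.

```python
import json
from typing import Any, Dict, List
from urllib.parse import urlencode

# Assumed to match the local core used in the examples of this article.
BASE_URL = "http://localhost:8983/solr/FuzzySearchExample/select"


def build_select_url(q: str, fl: str = "id,manu") -> str:
    """Return a select URL with the query parameters percent-encoded."""
    params = {"indent": "on", "q": q, "wt": "json", "fl": fl}
    return f"{BASE_URL}?{urlencode(params)}"


def matched_docs(response_text: str) -> List[Dict[str, Any]]:
    """Extract the docs array from a Solr JSON response body."""
    body = json.loads(response_text)
    return body["response"]["docs"]


url = build_select_url("manu:Samsun~")
```

The resulting url can be fetched with any HTTP client, and the response text passed to matched_docs.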

4.2 Edit Distance Example

4.2.1 Query

http://localhost:8983/solr/FuzzySearchExample/select?
  indent=on&
  q=manu:Samsun~1 AND manu:Electroni~2&
  wt=json&fl=id,manu

4.2.2 Output

{
  "responseHeader":{
    "status":0,
    "QTime":1181,
    "params":{
      "q":"manu:Samsun~1 AND manu:Electroni~2",
      "indent":"on",
      "fl":"id,manu",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"SP2514N",
        "manu":"Samsung Electronics Co. Ltd."}]
  }}

4.3 Incorrect Edit Distance Example

If the edit distance is a fractional value greater than 1, Solr throws a syntax error: “Fractional edit distances are not allowed!”

A query term like develope~0.8 will work, but develope~1.1 will not.

If the edit distance is a fraction less than 1, Solr converts that value (0.8 in our case) to a proper edit distance and runs the query.
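The rule above can be checked on the client side before a query is sent, which avoids a round trip that ends in an error. The Python sketch below encodes only the rule as stated in this section (a fractional value greater than 1 is rejected); it is not Solr's own validation code, and the function name is hypothetical.

```python
def is_accepted_edit_distance(raw: str) -> bool:
    """Apply the rule from this section to the text that follows ~.

    Whole numbers and fractions below 1 are accepted; a value with a
    fractional part that is greater than 1 is rejected. Other checks
    Solr may perform are deliberately left out of this sketch.
    """
    value = float(raw)
    has_fraction = value != int(value)
    return not (has_fraction and value > 1)


# Values taken from the examples in this article.
checks = {raw: is_accepted_edit_distance(raw) for raw in ["0.8", "1", "1.1", "1.5", "2"]}
```

With this check, develope~0.8 passes while develope~1.1 and Electroni~1.5 are flagged before the request is made.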

4.3.1 Query

http://localhost:8983/solr/FuzzySearchExample/select?
  indent=on&
  q=manu:Samsun~1 AND manu:Electroni~1.5&
  wt=json&fl=id,manu

4.3.2 Output

{
  "responseHeader":{
    "status":400,
    "QTime":38,
    "params":{
      "q":"manu:Samsun~1 AND manu:Electroni~1.5",
      "indent":"on",
      "fl":"id,manu",
      "wt":"json"}},
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","org.apache.solr.search.SyntaxError"],
    "msg":"org.apache.solr.search.SyntaxError: Fractional edit distances are not allowed!",
    "code":400}}

5. Conclusion

In this article we discussed the fuzzy search syntax and how to run fuzzy searches, with examples covering a query without an edit distance, a query with an edit distance, and a query with an invalid edit distance.

6. References

Refer to the Solr Reference Guide, Solr Multiple Filter Queries, and Main Query vs Filter Query for more details.
