1. Overview

Proximity search is an extension of phrase search. In a phrase search, the query terms must appear next to each other in the given order. Proximity search relaxes this requirement: it matches two or more query terms that occur in the same context, within a specified slop. The slop is the maximum distance, counted in term positions, by which the query terms may be separated.
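To make the notion of slop concrete, the sketch below estimates the smallest slop a phrase needs in order to match a piece of text. It is a simplified model, not Solr's implementation: it assumes lowercase word tokenization with a regular expression (real Solr field analysis may differ), and the helper name `required_slop` is hypothetical. The model follows the usual description of slop as the number of position moves needed to line the terms up as an exact phrase.

```python
import re
from itertools import product


def required_slop(text, phrase):
    """Smallest slop at which `phrase` would match `text` in this simplified model.

    Returns None when some query term does not occur in the text.
    """
    tokens = re.findall(r"\w+", text.lower())
    terms = re.findall(r"\w+", phrase.lower())

    # Positions at which each query term occurs in the text.
    positions = [[i for i, tok in enumerate(tokens) if tok == term] for term in terms]
    if not terms or any(not p for p in positions):
        return None

    best = None
    for combo in product(*positions):
        # Shift each position back by the term's offset in the phrase;
        # an exact phrase match makes all shifted values equal.
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        moves = max(shifted) - min(shifted)
        if best is None or moves < best:
            best = moves
    return best


print(required_slop("Test with some GB18030 encoded characters", "GB18030 characters"))
```

Under these assumptions the call prints 1, because a single word ("encoded") sits between the two query terms. A reversed pair such as the text "b a" for the phrase "a b" needs a slop of 2 in this model.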

A proximity search with a large slop can return documents in which the words are widely separated. Solr gives a higher relevance score to documents in which the query terms are closer together.
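The effect of closeness on relevance can be illustrated with the weighting that Lucene's classic similarity applies to sloppy phrase matches, 1 / (distance + 1). Whether your Solr version and configured similarity use exactly this formula is an assumption here; the sketch only shows the direction of the effect, and `sloppy_weight` is a hypothetical helper name.

```python
def sloppy_weight(distance):
    """Weight of one sloppy phrase match; a smaller distance gives a larger weight."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (distance + 1)


# Terms that are adjacent (distance 0) outweigh terms that are far apart.
for distance in (0, 1, 4, 9):
    print(distance, round(sloppy_weight(distance), 3))
```

The final score of a document also depends on other factors such as term frequency and field length, so this weight is only one ingredient.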

2. How it works

Solr hands a proximity query to Lucene as a phrase query with a non-zero slop. The span-query counterpart in Lucene is SpanNearQuery, which matches terms that occur within a given slop of each other.

SpanNearQuery works in two modes. In ordered mode (implemented by NearSpansOrdered), a document matches only if its terms appear in the same order as the query terms, within the given slop. In unordered mode (implemented by NearSpansUnordered), term order is not considered.
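The difference between ordered and unordered matching can be sketched as follows. This is a simplified model over a list of tokens, not Lucene's code: `near_match` is a hypothetical helper, and it counts slop as the number of positions inside the matched window that are not occupied by query terms.

```python
from itertools import product


def near_match(tokens, terms, slop, in_order):
    """Return True if all `terms` occur in `tokens` within `slop` extra positions."""
    if not terms:
        return False
    positions = [[i for i, tok in enumerate(tokens) if tok == term] for term in terms]
    for combo in product(*positions):
        if len(set(combo)) != len(combo):
            continue  # two query terms cannot share one token position
        if in_order and list(combo) != sorted(combo):
            continue  # ordered mode: terms must appear in query order
        # Positions inside the matched window that are not query terms.
        gaps = max(combo) - min(combo) + 1 - len(combo)
        if gaps <= slop:
            return True
    return False


tokens = "hard drive by samsung".split()
print(near_match(tokens, ["samsung", "drive"], slop=1, in_order=False))  # True
print(near_match(tokens, ["samsung", "drive"], slop=1, in_order=True))   # False
```

The unordered check succeeds because only one token ("by") separates the two terms, while the ordered check fails because "samsung" comes after "drive" in the text.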

3. Syntax

The ~ operator is used to run proximity searches. Append the ~ operator and a numeric value (a valid slop, that is, a non-negative integer) to a phrase query, as shown below.

{FIELD_NAME_1:"TERM_1 TERM_2"~<SLOP>} OR
{FIELD_NAME_2:"TERM_1 TERM_2 TERM_3"~<SLOP>}
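A small helper can assemble query strings in this form. The sketch below is illustrative only: `proximity_query` is a hypothetical name, and the escaping covers just backslashes and double quotes inside terms, which is an assumption about what your data needs.

```python
def proximity_query(field, terms, slop):
    """Build a `field:"t1 t2 ..."~slop` query string."""
    if slop < 0:
        raise ValueError("slop must be a non-negative integer")
    # Escape backslashes and double quotes so a term cannot end the phrase early.
    escaped = [t.replace("\\", "\\\\").replace('"', '\\"') for t in terms]
    return '{}:"{}"~{}'.format(field, " ".join(escaped), int(slop))


print(proximity_query("name", ["GB18030", "characters"], 2))
# name:"GB18030 characters"~2
```

Several such clauses can be combined by joining the returned strings with " OR ".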

4. Examples

Now let’s look at a couple of examples for better understanding. We have indexed the techproducts example data and use it in all the examples.

4.1 Query

Here we use a small slop of 2, which suits a user who wants a more specific search, or data that contains some noise between the two terms.

http://localhost:8983/solr/ProximitySearchExample/select?
  indent=on&
  q=name:"GB18030 characters"~2&
  wt=json&
  rows=10
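The request above is shown unencoded for readability; a client normally has to percent-encode the quotes, colon, and space in the q parameter. The following sketch builds the same request with Python's standard library. The host, port, and core name are taken from the example, and the function names `build_select_url` and `run_query` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Core URL taken from the example above; adjust host, port, and core name as needed.
SELECT_URL = "http://localhost:8983/solr/ProximitySearchExample/select"


def build_select_url(query, rows=10):
    """Return the select URL with the query parameters percent-encoded."""
    params = {"indent": "on", "q": query, "wt": "json", "rows": rows}
    return SELECT_URL + "?" + urlencode(params)


def run_query(query, rows=10):
    """Send the request and return the list of matching documents."""
    with urlopen(build_select_url(query, rows)) as response:
        return json.load(response)["response"]["docs"]


print(build_select_url('name:"GB18030 characters"~2'))
```

Calling `run_query('name:"GB18030 characters"~2')` requires a running Solr instance with the example core; `build_select_url` on its own has no such requirement.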

4.2 Output

{
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "q":"name:\"GB18030  characters\"~2",
      "indent":"on",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"GB18030TEST",
        "name":"Test with some GB18030 encoded characters",
        "features":["No accents here",
          "这是一个功能",
          "This is a feature (translated)",
          "这份文件是很有光泽",
          "This document is very shiny (translated)"],
        "price":0.0,
        "price_c":"0.0,USD",
        "inStock":true,
        "_version_":1596475132917841920}]
  }}
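A response of this shape can be read programmatically. The sketch below parses a trimmed copy of the output above and pulls out the hit count and the id and name of each document; `summarize` is a hypothetical helper, and the trimmed JSON keeps only the fields it uses.

```python
import json

# Trimmed copy of the response shown above.
raw = """
{
  "responseHeader": {"status": 0, "QTime": 2},
  "response": {"numFound": 1, "start": 0, "docs": [
    {"id": "GB18030TEST", "name": "Test with some GB18030 encoded characters"}
  ]}
}
"""


def summarize(response_text):
    """Return (numFound, [(id, name), ...]) from a Solr JSON response."""
    body = json.loads(response_text)["response"]
    docs = [(doc["id"], doc.get("name")) for doc in body["docs"]]
    return body["numFound"], docs


print(summarize(raw))
```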

4.3 Query

Here we use a large slop of 10, which suits a user who wants a more general query rather than a very specific one.

http://localhost:8983/solr/ProximitySearchExample/select?
  indent=on&
  q=name:"Samsung hard drive"~10&
  wt=json&
  rows=10

4.4 Output

{
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "q":"name:\"Samsung hard drive\"~10",
      "indent":"on",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"SP2514N",
        "name":"Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133",
        "manu":"Samsung Electronics Co. Ltd.",
        "manu_id_s":"samsung",
        "cat":["electronics",
          "hard drive"],
        "features":["7200RPM, 8MB cache, IDE Ultra ATA-133",
          "NoiseGuard, SilentSeek technology, Fluid Dynamic Bearing (FDB) motor"],
        "price":92.0,
        "price_c":"92.0,USD",
        "popularity":6,
        "inStock":true,
        "manufacturedate_dt":"2006-02-13T15:26:37Z",
        "store":"35.0752,-97.032",
        "_version_":1596475133470441472}]
  }}

5. Conclusion

In this article we have discussed the basics of proximity search, how it works, its syntax, and how to run proximity searches with various slops. Lucene's span queries implement the ordered and unordered variants of this matching in NearSpansOrdered and NearSpansUnordered.

6. References

Refer to the Solr Reference Guide, Solr Multiple Filter Queries, Main Query vs Filter Query, and Solr Fuzzy Search for more details.
