

Solr supports regular expression search. The Solr/Lucene regular expression engine is not Perl-compatible; it supports a smaller range of operators.
In the previous article, solr-regular-expression-part-1, we discussed some of the basic operators that Solr/Lucene supports.
Grouping
Parentheses “()” can be used to form sub-patterns. The quantity operators (such as “+”, “*”, “?” and “{}”) operate on the shortest preceding pattern, which can be a group. For string “ababab”:
(ab)+       # match
ab(ab)+     # match
(..)+       # match
(...)+      # match (two groups of three characters)
(ab)*       # match
abab(ab)?   # match
ab(ab)?     # no match
(ab){3}     # match
(ab){1,2}   # no match
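Lucene patterns must match the whole term, not just part of it, which is why a pattern like ab(ab)? fails here. As a rough way to experiment outside Solr, the sketch below uses Python's re.fullmatch, which is also anchored at both ends. Python's re module is a different engine from Lucene's, so this is only an approximation that happens to agree on these basic grouping operators; the helper name matches_whole_string is made up for illustration.

```python
import re

def matches_whole_string(pattern: str, text: str) -> bool:
    # Lucene anchors every pattern to the full term; re.fullmatch behaves the same way.
    return re.fullmatch(pattern, text) is not None

text = "ababab"
patterns = ["(ab)+", "ab(ab)+", "(..)+", "(ab)*",
            "abab(ab)?", "ab(ab)?", "(ab){3}", "(ab){1,2}"]

# Map each pattern to True (match) or False (no match).
grouping_results = {p: matches_whole_string(p, text) for p in patterns}

for pattern, matched in grouping_results.items():
    print(f"{pattern:<12} {'match' if matched else 'no match'}")
```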
Alternation
The pipe symbol “|” acts as an OR operator. The match will succeed if the pattern on either the left-hand side OR the right-hand side matches. The alternation applies to the longest pattern, not the shortest. For string “aabb”:
aabb|bbaa   # match
aacc|bb     # no match
aa(cc|bb)   # match
a+|b+       # no match
a+b+|b+a+   # match
a+(b|c)+    # match
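The difference between aacc|bb and aa(cc|bb) comes down to how far the alternation reaches. The sketch below reproduces it with Python's re.fullmatch as a stand-in for Lucene's anchored matching. This is an approximation using a different engine, and the helper name is hypothetical.

```python
import re

def matches_whole_string(pattern: str, text: str) -> bool:
    # Anchored match, approximating how Lucene applies a pattern to a whole term.
    return re.fullmatch(pattern, text) is not None

text = "aabb"

# Without parentheses, "|" splits the entire pattern: "aacc" OR "bb".
ungrouped = matches_whole_string("aacc|bb", text)

# With parentheses, only the group is alternated: "aa" then ("cc" OR "bb").
grouped = matches_whole_string("aa(cc|bb)", text)

# Each side must match the whole string on its own, so "a+|b+" fails.
single_runs = matches_whole_string("a+|b+", text)

print(ungrouped, grouped, single_runs)
```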
Character classes
Ranges of potential characters may be represented as character classes by enclosing them in square brackets “[]”. A leading ^ negates the character class. The allowed forms are:
[abc]     # 'a' or 'b' or 'c'
[a-c]     # 'a' or 'b' or 'c'
[-abc]    # '-' or 'a' or 'b' or 'c'
[abc\-]   # '-' or 'a' or 'b' or 'c'
[^abc]    # any character except 'a' or 'b' or 'c'
[^a-c]    # any character except 'a' or 'b' or 'c'
[^-abc]   # any character except '-' or 'a' or 'b' or 'c'
[^abc\-]  # any character except '-' or 'a' or 'b' or 'c'
The dash “-” indicates a range of characters, unless it is the first character in the class (directly after a leading ^ if there is one) or it is escaped with a backslash.
For string “abcd”:
ab[cd]+    # match
[a-d]+     # match
[^a-d]+    # no match
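The dash rules are easy to get wrong, so here is a small sketch that checks them. It uses Python's re.fullmatch as an approximation of Lucene's anchored matching. Python is a different engine, but it treats a leading or escaped dash in a class the same way, and the helper name is invented for this example.

```python
import re

def matches_whole_string(pattern: str, text: str) -> bool:
    # Anchored match against the full string.
    return re.fullmatch(pattern, text) is not None

# How each class treats a literal "-" character.
dash_results = {
    "[a-c]":    matches_whole_string("[a-c]", "-"),     # dash is a range operator here
    "[-abc]":   matches_whole_string("[-abc]", "-"),    # leading dash is literal
    r"[abc\-]": matches_whole_string(r"[abc\-]", "-"),  # escaped dash is literal
    "[^-abc]":  matches_whole_string("[^-abc]", "-"),   # negated class that excludes the dash
}

# The examples for the string "abcd".
abcd_results = {p: matches_whole_string(p, "abcd")
                for p in ["ab[cd]+", "[a-d]+", "[^a-d]+"]}

print(dash_results)
print(abcd_results)
```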
Optional operators
These operators are available by default, because the flags parameter defaults to ALL.
Complement
The complement is probably the most useful option. The shortest pattern that follows a tilde “~” is negated. For instance, “ab~cd” means:
Starts with a
Followed by b
Followed by a string of any length that is anything but c
Ends with d
For the string “abcdef”:
ab~df       # match
ab~cf       # match
ab~cdef     # no match
a~(cb)def   # match
a~(bc)def   # no match
Enabled with the COMPLEMENT or ALL flags.
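Python's re module has no “~” operator, so the complement cannot be tried there directly. The sketch below emulates the semantics for the simple case of a literal prefix, one negated sub-pattern, and a literal suffix. The function name and the three-part split are assumptions made for illustration, not anything Lucene exposes.

```python
import re

def matches_with_complement(prefix: str, negated: str, suffix: str, text: str) -> bool:
    """Emulate prefix~(negated)suffix for literal prefix and suffix strings."""
    # The literal parts must be present and must not overlap.
    if len(text) < len(prefix) + len(suffix):
        return False
    if not (text.startswith(prefix) and text.endswith(suffix)):
        return False
    # Whatever sits between them must NOT match the negated pattern as a whole.
    middle = text[len(prefix):len(text) - len(suffix)]
    return re.fullmatch(negated, middle) is None

text = "abcdef"
complement_results = {
    "ab~df":     matches_with_complement("ab", "d", "f", text),
    "ab~cf":     matches_with_complement("ab", "c", "f", text),
    "ab~cdef":   matches_with_complement("ab", "c", "def", text),
    "a~(cb)def": matches_with_complement("a", "cb", "def", text),
    "a~(bc)def": matches_with_complement("a", "bc", "def", text),
}
print(complement_results)
```

The key point is that “~c” stands for any string other than exactly “c”, including longer strings such as “cde” and the empty string.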
Interval
The interval option enables the use of numeric ranges, enclosed by angle brackets “<>”. For string “solr90”:
solr<1-100>     # match
solr<01-100>    # match
solr<001-100>   # no match
Enabled with the INTERVAL or ALL flags.
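The third example can look surprising, since 90 is numerically inside 1 to 100. The sketch below assumes the rule that appears to explain it: when both bounds are written with the same number of digits, the number in the string must use exactly that many digits; otherwise any number of leading zeros is accepted. That rule is my reading of the behaviour, and the function is a hypothetical emulation, not Lucene code.

```python
import re

def matches_interval(prefix: str, low: str, high: str, text: str) -> bool:
    """Emulate prefix<low-high> under the assumed fixed-width rule."""
    if not text.startswith(prefix):
        return False
    digits = text[len(prefix):]
    if re.fullmatch(r"[0-9]+", digits) is None:
        return False
    # Assumption: equal-width bounds force the number to have that exact width.
    if len(low) == len(high) and len(digits) != len(low):
        return False
    lo, hi = sorted((int(low), int(high)))
    return lo <= int(digits) <= hi

interval_results = {
    "solr<1-100>":   matches_interval("solr", "1", "100", "solr90"),
    "solr<01-100>":  matches_interval("solr", "01", "100", "solr90"),
    "solr<001-100>": matches_interval("solr", "001", "100", "solr90"),
}
print(interval_results)
```

Under the same assumption, solr<001-100> would accept “solr090”, because the number is then padded to three digits.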
Intersection
The ampersand “&” joins two patterns in a way that both of them have to match. For string “aaabbb”:
aaa.+&.+bbb     # match
aaa&bbb         # no match
Using this feature usually means that you should rewrite your regular expression.
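The intersection can be thought of as two separate anchored matches that must both succeed. The sketch below emulates that with two re.fullmatch calls in Python, which has no “&” operator of its own. The function name is hypothetical and the engines differ, so treat it as an illustration of the semantics only.

```python
import re

def matches_intersection(left: str, right: str, text: str) -> bool:
    # Both sides must match the entire string independently.
    return (re.fullmatch(left, text) is not None
            and re.fullmatch(right, text) is not None)

text = "aaabbb"
both_sides = matches_intersection("aaa.+", ".+bbb", text)   # each side covers the whole string
literal_sides = matches_intersection("aaa", "bbb", text)    # neither side covers the whole string

print(both_sides, literal_sides)
```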
Any string
The at sign “@” matches any string in its entirety. This could be combined with the intersection and complement above to express “everything except”. For instance:
@&~(solr.+)   # anything except strings that begin with "solr" followed by at least one more character
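Because “@” accepts every string, intersecting it with a complement reduces to “the excluded pattern does not match the whole string”. The Python sketch below emulates that reading. The function name is invented for illustration, and re.fullmatch stands in for Lucene's anchored matching.

```python
import re

def anything_except(excluded_pattern: str, text: str) -> bool:
    # "@&~(X)" accepts a string exactly when X fails to match it as a whole.
    return re.fullmatch(excluded_pattern, text) is None

except_results = {s: anything_except("solr.+", s)
                  for s in ["solr90", "lucene", "solr", ""]}
print(except_results)
```

Note that the bare string “solr” is still accepted, because “.+” needs at least one character after the prefix. Excluding it as well would take “solr.*” as the negated pattern.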
See solr-regular-expression-part-1 to read part 1 of this series.