* Stop escaping ? and {. As of SQL:2008, SIMILAR TO is defined to have a
POSIX-compatible interpretation of ? as well as {m,n} and related constructs,
so we should allow these things through to our regex engine.
* Escape ^ and $. It appears that our regex engine treats ^^ at the
beginning of the string the same as ^, and similarly for $$ at the end of
the string, which meant that SIMILAR TO was effectively ignoring ^ at the
start of the pattern and $ at the end. Since these are not supposed to be
metacharacters in SIMILAR TO, this is a bug.
The second part of this is arguably a back-patchable bug fix, but I'm
hesitant to do that because it might break applications that are expecting
something like "col SIMILAR TO '^foo$'" to work like a POSIX pattern.
Seems safer to only change it at a major version boundary.
Per discussion of an example from Doug Gorley.
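
The two changes above can be illustrated with a minimal sketch of the
per-character translation. This is not the real similar_escape(): the function
name sketch_similar_to_posix is hypothetical, ESCAPE-clause and bracket
expression handling are left out, and the "^(?:" ... ")$" wrapper is an
assumption inferred from the ^^ and $$ behavior described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified sketch; NOT the real similar_escape().
 * Translates a SIMILAR TO pattern in src into a POSIX-style regex in dst.
 * Returns 0 on success, -1 if dst is too small.
 */
int
sketch_similar_to_posix(const char *src, char *dst, size_t dstlen)
{
    size_t      n = 0;
    const char *p;

    assert(src != NULL && dst != NULL);

    /* Worst case: 2 output bytes per input byte, plus "^(?:", ")$" and NUL */
    if (dstlen < 2 * strlen(src) + 7)
        return -1;

    /* Assumed wrapper: anchor the whole pattern at both ends */
    memcpy(dst, "^(?:", 4);
    n = 4;

    for (p = src; *p; p++)
    {
        char        pchar = *p;

        if (pchar == '%')
        {
            /* SQL wildcard: any sequence of characters */
            dst[n++] = '.';
            dst[n++] = '*';
        }
        else if (pchar == '_')
            dst[n++] = '.';     /* SQL wildcard: any single character */
        else if (pchar == '\\' || pchar == '.' ||
                 pchar == '^' || pchar == '$')
        {
            /* Not metacharacters in SIMILAR TO, so escape them */
            dst[n++] = '\\';
            dst[n++] = pchar;
        }
        else
            dst[n++] = pchar;   /* includes ? and {, now passed through */
    }

    dst[n++] = ')';
    dst[n++] = '$';
    dst[n] = '\0';
    return 0;
}
```

Under these assumptions, the pattern '^foo$' comes out as ^(?:\^foo\$)$, so
the ^ and $ are matched literally, while 'a{2,3}b?' reaches the regex engine
with its quantifiers intact.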
-
+
Functions and Operators
or more times.
+
+ ? denotes repetition of the previous item zero
+ or one time.
+
+
+
+ {m} denotes repetition
+ of the previous item exactly m times.
+
+
+
+ {m,} denotes repetition
+ of the previous item m or more times.
+
+
+
+ {m,n}
+ denotes repetition of the previous item at least m and
+ not more than n times.
+
+
Parentheses () can be used to group items into
- Notice that bounded repetition operators (? and
- {...}) are not provided, though they exist in POSIX.
- Also, the period (.) is not a metacharacter.
+ Notice that the period (.) is not a metacharacter
+ for SIMILAR TO.
*
*
* IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/utils/adt/regexp.c,v 1.82 2009/06/11 14:49:04 momjian Exp $
+ * $PostgreSQL: pgsql/src/backend/utils/adt/regexp.c,v 1.83 2009/10/10 03:50:15 tgl Exp $
*
* Alistair Crooks added the code for the regex caching
* agc - cached the regular expressions used - there's a good chance
/*
* similar_escape()
- * Convert a SQL99 regexp pattern to POSIX style, so it can be used by
+ * Convert a SQL:2008 regexp pattern to POSIX style, so it can be used by
* our regexp engine.
*/
Datum
}
else if (pchar == '_')
*r++ = '.';
- else if (pchar == '\\' || pchar == '.' || pchar == '?' ||
- pchar == '{')
+ else if (pchar == '\\' || pchar == '.' ||
+ pchar == '^' || pchar == '$')
{
*r++ = '\\';
*r++ = pchar;