Bruce Momjian [Fri, 7 Sep 2018 00:42:24 +0000 (20:42 -0400)]
doc: wording fix
Author: Liudmila Mantrova
Backpatch-through: 9.6 and 10 only
Tom Lane [Thu, 6 Sep 2018 14:49:45 +0000 (10:49 -0400)]
Make contrib/unaccent's unaccent() function work when not in search path.
Since the fixes for CVE-2018-1058, we've advised people to schema-qualify
function references in order to fix failures in code that executes under
a minimal search_path setting. However, that's insufficient to make the
single-argument form of unaccent() work, because it looks up the "unaccent"
text search dictionary using the search path.
The most expedient answer seems to be to remove the search_path dependency
by making it look in the same schema that the unaccent() function itself
is declared in. This will definitely work for the normal usage of this
function with the unaccent dictionary provided by the extension.
It's barely possible that there are people who were relying on the
search-path-dependent behavior to select other dictionaries with the same
name; but if there are any such people at all, they can still get that
behavior by writing unaccent('unaccent', ...), or possibly
unaccent('unaccent'::text::regdictionary, ...) if the lookup has to be
postponed to runtime.
Per complaint from Gunnlaugur Thor Briem. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/CAPs+M8LCex6d=DeneofdsoJVijaG59m9V0ggbb3pOH7hZO4+cQ@mail.gmail.com
Amit Kapila [Thu, 6 Sep 2018 04:49:51 +0000 (10:19 +0530)]
Fix the overrun in hash index metapage for smaller block sizes.
The commit
620b49a1 changed the value of HASH_MAX_BITMAPS with the intent
to allow many non-unique values in hash indexes without worrying to reach
the limit of the number of overflow pages. At that time, this didn't
occur to us that it can overrun the block for smaller block sizes.
Choose the value of HASH_MAX_BITMAPS based on BLCKSZ such that it gives
the same answer as now for the cases where the overrun doesn't occur, and
some other sufficiently-value for the cases where an overrun currently
does occur. This allows us not to change the behavior in any case that
currently works, so there's really no reason for a HASH_VERSION bump.
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/CAA4eK1LtF4VmU4mx_+i72ff1MdNZ8XaJMGkt2HV8+uSWcn8t4A@mail.gmail.com
Tom Lane [Wed, 5 Sep 2018 17:47:16 +0000 (13:47 -0400)]
Make argument names of pg_get_object_address consistent, and fix docs.
pg_get_object_address and pg_identify_object_as_address are supposed
to be inverses, but they disagreed as to the names of the arguments
representing the textual form of an object address. Moreover, the
documented argument names didn't agree with reality at all, either
for these functions or pg_identify_object.
In HEAD and v11, I think we can get away with renaming the input
arguments of pg_get_object_address to match the outputs of
pg_identify_object_as_address. In theory that might break queries
using named-argument notation to call pg_get_object_address, but
it seems really unlikely that anybody is doing that, or that they'd
have much trouble adjusting if they were. In older branches, we'll
just live with the lack of consistency.
Aside from fixing the documentation of these functions to match reality,
I couldn't resist the temptation to do some copy-editing.
Per complaint from Jean-Pierre Pelletier. Back-patch to 9.5 where these
functions were introduced. (Before v11, this is a documentation change
only.)
Discussion: https://postgr.es/m/CANGqjDnWH8wsTY_GzDUxbt4i=y-85SJreZin4Hm8uOqv1vzRQA@mail.gmail.com
Bruce Momjian [Wed, 5 Sep 2018 02:34:07 +0000 (22:34 -0400)]
docs: improve AT TIME ZONE description
The previous description was unclear. Also add a third example, change
use of time zone acronyms to more verbose descriptions, and add a
mention that using 'time' with AT TIME ZONE uses the current time zone
rules.
Backpatch-through: 9.3
Amit Kapila [Tue, 4 Sep 2018 05:19:05 +0000 (10:49 +0530)]
Prohibit pushing subqueries containing window function calculation to
workers.
Allowing window function calculation in workers leads to inconsistent
results because if the input row ordering is not fully deterministic, the
output of window functions might vary across workers. The fix is to treat
them as parallel-restricted.
In the passing, improve the coding pattern in max_parallel_hazard_walker
so that it has a chain of mutually-exclusive if ... else if ... else if
... else if ... IsA tests.
Reported-by: Marko Tiikkaja
Bug: 15324
Author: Amit Kapila
Reviewed-by: Tom Lane
Backpatch-through: 9.6
Discussion: https://postgr.es/m/CAL9smLAnfPJCDUUG4ckX2iznj53V7VSMsYefzZieN93YxTNOcw@mail.gmail.com
Amit Kapila [Tue, 4 Sep 2018 03:13:37 +0000 (08:43 +0530)]
During the split, set checksum on an empty hash index page.
On a split, we allocate a new splitpoint's worth of bucket pages wherein
we initialize the last page with zeros which is fine, but we forgot to set
the checksum for that last page.
We decided to back-patch this fix till 10 because we don't have an easy
way to test it in prior versions. Another reason is that the hash-index
code is changed heavily in 10, so it is not advisable to push the fix
without testing it in prior versions.
Author: Amit Kapila
Reviewed-by: Yugo Nagata
Backpatch-through: 10
Discussion: https://postgr.es/m/
5d03686d-727c-dbf8-0064-
bf8b97ffe850@2ndquadrant.com
Michael Paquier [Sun, 2 Sep 2018 19:40:45 +0000 (12:40 -0700)]
Fix initial sync of slot parent directory when restoring status
At the beginning of recovery, information from replication slots is
recovered from disk to memory. In order to ensure the durability of the
information, the status file as well as its parent directory are
synced. It happens that the sync on the parent directory was done
directly using the status file path, which is logically incorrect, and
the current code has been doing a sync on the same object twice in a
row.
Reported-by: Konstantin Knizhnik
Diagnosed-by: Konstantin Knizhnik
Author: Michael Paquier
Discussion: https://postgr.es/m/
9eb1a6d5-b66f-2640-598d-
c5ea46b8f68a@postgrespro.ru
Backpatch-through: 9.4-
Tom Lane [Sat, 1 Sep 2018 20:02:47 +0000 (16:02 -0400)]
Doc: fix oversights in "Client/Server Character Set Conversions" table.
This table claimed that JOHAB could be used as a server encoding, which
was true originally but hasn't been true since 8.3. It also lacked
entries for EUC_JIS_2004 and SHIFT_JIS_2004.
JOHAB problem noted by Lars Kanis, the others by me.
Discussion: https://postgr.es/m/
c0f514a1-b7a9-b9ea-1c02-
c34aead56c06@greiz-reinsdorf.de
Tom Lane [Sat, 1 Sep 2018 19:27:13 +0000 (15:27 -0400)]
Avoid using potentially-under-aligned page buffers.
There's a project policy against using plain "char buf[BLCKSZ]" local
or static variables as page buffers; preferred style is to palloc or
malloc each buffer to ensure it is MAXALIGN'd. However, that policy's
been ignored in an increasing number of places. We've apparently got
away with it so far, probably because (a) relatively few people use
platforms on which misalignment causes core dumps and/or (b) the
variables chance to be sufficiently aligned anyway. But this is not
something to rely on. Moreover, even if we don't get a core dump,
we might be paying a lot of cycles for misaligned accesses.
To fix, invent new union types PGAlignedBlock and PGAlignedXLogBlock
that the compiler must allocate with sufficient alignment, and use
those in place of plain char arrays.
I used these types even for variables where there's no risk of a
misaligned access, since ensuring proper alignment should make
kernel data transfers faster. I also changed some places where
we had been palloc'ing short-lived buffers, for coding style
uniformity and to save palloc/pfree overhead.
Since this seems to be a live portability hazard (despite the lack
of field reports), back-patch to all supported versions.
Patch by me; thanks to Michael Paquier for review.
Discussion: https://postgr.es/m/
1535618100[email protected]
Noah Misch [Sat, 1 Sep 2018 05:59:58 +0000 (22:59 -0700)]
Ignore server-side delays when enforcing wal_sender_timeout.
Healthy clients of servers having poor I/O performance, such as
buildfarm members hamster and tern, saw unexpected timeouts. That
disagreed with documentation. This fix adds one gettimeofday() call
whenever ProcessRepliesIfAny() finds no client reply messages.
Back-patch to 9.4; the bug's symptom is rare and mild, and the code all
moved between 9.3 and 9.4.
Discussion: https://postgr.es/m/
20180826034600[email protected]
Michael Paquier [Fri, 31 Aug 2018 18:04:07 +0000 (11:04 -0700)]
Ensure correct minimum consistent point on standbys
Startup process has improved its calculation of incorrect minimum
consistent point in
8d68ee6, which ensures that all WAL available gets
replayed when doing crash recovery, and has introduced an incorrect
calculation of the minimum recovery point for non-startup processes,
which can cause incorrect page references on a standby when for example
the background writer flushed a couple of pages on-disk but was not
updating the control file to let a subsequent crash recovery replay to
where it should have.
The only case where this has been reported to be a problem is when a
standby needs to calculate the latest removed xid when replaying a btree
deletion record, so one would need connections on a standby that happen
just after recovery has thought it reached a consistent point. Using a
background worker which is started after the consistent point is reached
would be the easiest way to get into problems if it connects to a
database. Having clients which attempt to connect periodically could
also be a problem, but the odds of seeing this problem are much lower.
The fix used is pretty simple, as the idea is to give access to the
minimum recovery point written in the control file to non-startup
processes so as they use a reference, while the startup process still
initializes its own references of the minimum consistent point so as the
original problem with incorrect page references happening post-promotion
with a crash do not show up.
Reported-by: Alexander Kukushkin
Diagnosed-by: Alexander Kukushkin
Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, Alexander Kukushkin
Discussion: https://postgr.es/m/
153492341830.1368.
3936905691758473953@wrigleys.postgresql.org
Backpatch-through: 9.3
Alexander Korotkov [Thu, 30 Aug 2018 11:18:53 +0000 (14:18 +0300)]
Enforce cube dimension limit in all cube construction functions
contrib/cube has a limit to 100 dimensions for cube datatype. However, it's
not enforced everywhere, and one can actually construct cube with more than
100 dimensions having then trouble with dump/restore. This commit add checks
for dimensions limit in all functions responsible for cube construction.
Backpatch to all supported versions.
Reported-by: Andrew Gierth
Discussion: https://postgr.es/m/87va7uybt4.fsf%40news-spur.riddles.org.uk
Author: Andrey Borodin with small additions by me
Review: Tom Lane
Backpatch-through: 9.3
Alexander Korotkov [Thu, 30 Aug 2018 11:09:25 +0000 (14:09 +0300)]
Split contrib/cube platform-depended checks into separate test
We're currently maintaining two outputs for cube regression test. But that
appears to be unsuitable, because these outputs are different in out few checks
involving scientific notation. So, split checks involving scientific notation
into separate test, making contrib/cube easier to maintain. Backpatch to all
supported versions in order to make further backpatching easier.
Discussion: https://postgr.es/m/CAPpHfdvJgWjxHsJTtT%2Bo1tz3OR8EFHcLQjhp-d3%2BUcmJLh-fQA%40mail.gmail.com
Author: Alexander Korotkov
Backpatch-through: 9.3
Tom Lane [Fri, 31 Aug 2018 16:26:20 +0000 (12:26 -0400)]
Make checksum_impl.h safe to compile with -fstrict-aliasing.
In general, Postgres requires -fno-strict-aliasing with compilers that
implement C99 strict aliasing rules. There's little hope of getting
rid of that overall. But it seems like it would be a good idea if
storage/checksum_impl.h in particular didn't depend on it, because
that header is explicitly intended to be included by external programs.
We don't have a lot of control over the compiler switches that an
external program might use, as shown by Michael Banck's report of
failure in a privately-modified version of pg_verify_checksums.
Hence, switch to using a union in place of willy-nilly pointer casting
inside this file. I think this makes the code a bit more readable
anyway.
checksum_impl.h hasn't changed since it was introduced in 9.3,
so back-patch all the way.
Discussion: https://postgr.es/m/
1535618100[email protected]
Alvaro Herrera [Thu, 30 Aug 2018 08:39:56 +0000 (05:39 -0300)]
Mention change of width of values generated by SERIAL sequences
This changed during pg10 development, but had not been documented.
Co-authored-by: Jonathan S. Katz
Discussion: https://postgr.es/m/
20180828163408[email protected]
Michael Paquier [Thu, 30 Aug 2018 00:11:19 +0000 (17:11 -0700)]
Stop bgworkers during fast shutdown with postmaster in startup phase
When a postmaster gets into its phase PM_STARTUP, it would start
background workers using BgWorkerStart_PostmasterStart mode immediately,
which would cause problems for a fast shutdown as the postmaster forgets
to send SIGTERM to already-started background workers. With smart and
immediate shutdowns, this correctly happened, and fast shutdown is the
only mode missing the shot.
Author: Alexander Kukushkin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAFh8B=mvnD8+DZUfzpi50DoaDfZRDfd7S=gwj5vU9GYn8UvHkA@mail.gmail.com
Backpatch-through: 9.5
Tom Lane [Tue, 28 Aug 2018 23:46:59 +0000 (19:46 -0400)]
Make pg_restore's identify_locking_dependencies() more bulletproof.
This function had a blacklist of dump object types that it believed
needed exclusive lock ... but we hadn't maintained that, so that it
was missing ROW SECURITY, POLICY, and INDEX ATTACH items, all of
which need (or should be treated as needing) exclusive lock.
Since the same oversight seems likely in future, let's reverse the
sense of the test so that the code has a whitelist of safe object
types; better to wrongly assume a command can't be run in parallel
than the opposite. Currently the only POST_DATA object type that's
safe is CREATE INDEX ... and that list hasn't changed in a long time.
Back-patch to 9.5 where RLS came in.
Discussion: https://postgr.es/m/11450.
1535483506@sss.pgh.pa.us
Andrew Gierth [Tue, 28 Aug 2018 13:43:51 +0000 (14:43 +0100)]
postgres_fdw: don't push ORDER BY with no vars (bug #15352)
Commit
aa09cd242 changed a condition in find_em_expr_for_rel from
being a bms_equal comparison of relids to bms_is_subset, in order to
support order by clauses on foreign joins. But this also allows
through the degenerate case of expressions with no Vars at all (and
hence empty relids), including integer constants which will be parsed
unexpectedly on the remote (viz. "ERROR: ORDER BY position 0 is not in
select list" as in the bug report).
Repair by adding an additional !bms_is_empty test.
Backpatch through to 9.6 where the aforementioned change was made.
Per bug #15352 from Maksym Boguk; analysis and patch by me.
Discussion: https://postgr.es/m/
153518420278.1478.
14875560810251994661@wrigleys.postgresql.org
Andrew Gierth [Tue, 28 Aug 2018 08:52:25 +0000 (09:52 +0100)]
Avoid quadratic slowdown in regexp match/split functions.
regexp_matches, regexp_split_to_table and regexp_split_to_array all
work by compiling a list of match positions as character offsets (NOT
byte positions) in the source string.
Formerly, they then used text_substr to extract the matched text; but
in a multi-byte encoding, that counts the characters in the string,
and the characters needed to reach the starting byte position, on
every call. Accordingly, the performance degraded as the product of
the input string length and the number of match positions, such that
splitting a string of a few hundred kbytes could take many minutes.
Repair by keeping the wide-character copy of the input string
available (only in the case where encoding_max_length is not 1) after
performing the match operation, and extracting substrings from that
instead. This reduces the complexity to being linear in the number of
result bytes, discounting the actual regexp match itself (which is not
affected by this patch).
In passing, remove cleanup using retail pfree() which was obsoleted by
commit
ff428cded (Feb 2008) which made cleanup of SRF multi-call
contexts automatic. Also increase (to ~134 million) the maximum number
of matches and provide an error message when it is reached.
Backpatch all the way because this has been wrong forever.
Analysis and patch by me; review by Kaiting Chen.
Discussion: https://postgr.es/m/
[email protected]
see also https://postgr.es/m/
[email protected]
Tom Lane [Mon, 27 Aug 2018 19:11:12 +0000 (15:11 -0400)]
Fix missing dependency for pg_dump's ENABLE ROW LEVEL SECURITY items.
The archive should show a dependency on the item's table, but it failed
to include one. This could cause failures in parallel restore due to
emitting ALTER TABLE ... ENABLE ROW LEVEL SECURITY before restoring
the table's data. In practice the odds of a problem seem low, since
you would typically need to have set FORCE ROW LEVEL SECURITY as well,
and you'd also need a very high --jobs count to have any chance of this
happening. That probably explains the lack of field reports.
Still, it's a bug, so back-patch to 9.5 where RLS was introduced.
Discussion: https://postgr.es/m/19784.
1535390902@sss.pgh.pa.us
Tom Lane [Sun, 26 Aug 2018 18:21:55 +0000 (14:21 -0400)]
Make syslogger more robust against failures in opening CSV log files.
The previous coding figured it'd be good enough to postpone opening
the first CSV log file until we got a message we needed to write there.
This is unsafe, though, because if the open fails we end up in infinite
recursion trying to report the failure. Instead make the CSV log file
management code look as nearly as possible like the longstanding logic
for the stderr log file. In particular, open it immediately at postmaster
startup (if enabled), or when we get a SIGHUP in which we find that
log_destination has been changed to enable CSV logging.
It seems OK to fail if a postmaster-start-time open attempt fails, as
we've long done for the stderr log file. But we can't die if we fail
to open a CSV log file during SIGHUP, so we're still left with a problem.
In that case, write any output meant for the CSV log file to the stderr
log file. (This will also cover race-condition cases in which backends
send CSV log data before or after we have the CSV log file open.)
This patch also fixes an ancient oversight that, if CSV logging was
turned off during a SIGHUP, we never actually closed the last CSV
log file.
In passing, remember to reset whereToSendOutput = DestNone during syslogger
start, since (unlike all other postmaster children) it's forked before the
postmaster has done that. This made for a platform-dependent difference
in error reporting behavior between the syslogger and other children:
except on Windows, it'd report problems to the original postmaster stderr
as well as the normal error log file(s). It's barely possible that that
was intentional at some point; but it doesn't seem likely to be desirable
in production, and the platform dependency definitely isn't desirable.
Per report from Alexander Kukushkin. It's been like this for a long time,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/CAFh8B==iLUD_gqC-dAENS0V+kVrCeGiKujtKqSQ7++S-caaChw@mail.gmail.com
Bruce Momjian [Sat, 25 Aug 2018 19:03:32 +0000 (15:03 -0400)]
doc: correct syntax of pgtrgm examples in older releases
Reported-by: Liudmila Mantrova
Discussion: https://postgr.es/m/
ded40ecb-557e-8c50-7d58-
69f4b5226664@postgrespro.ru
Backpatch-through: 9.6 and 10 only
Bruce Momjian [Sat, 25 Aug 2018 17:35:14 +0000 (13:35 -0400)]
doc: "Latest checkpoint location" will not match in pg_upgrade
Mention that "Latest checkpoint location" will not match in pg_upgrade
if the standby server is still running during the upgrade, which is
possible. "Match" text first appeared in PG 9.5.
Reported-by: Paul Bonaud
Discussion: https://postgr.es/m/
c7268794-edb4-1772-3bfd-
04c54585c24e@trainline.com
Backpatch-through: 9.5
Bruce Momjian [Sat, 25 Aug 2018 17:01:24 +0000 (13:01 -0400)]
doc: add doc link for 'applicable_roles'
Reported-by: Ashutosh Sharma
Discussion: https://postgr.es/m/CAE9k0PnhnL6MNDLuvkk8USzOa_DpzDzFQPAM_uaGuXbh9HMKYw@mail.gmail.com
Author: Ashutosh Sharma
Backpatch-through: 9.3
Bruce Momjian [Sat, 25 Aug 2018 16:01:53 +0000 (12:01 -0400)]
docs: Clarify pg_ctl initdb option text to match options proto.
The options string appeared in PG 10.
Reported-by: pgsql-kr@postgr