Add text to "Populating a Database" pointing out that bulk data load into a

author Tom Lane

Sat, 29 May 2010 21:08:04 +0000 (21:08 +0000)

committer Tom Lane

Sat, 29 May 2010 21:08:04 +0000 (21:08 +0000)
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml

index 9400ebcc151a7955a2faa278996ab30541768cee..4b6768bb694428143ee16e7c28d9e79c0faf7f89 100644 (file)
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -1,4 +1,4 @@
-
+
  
   
    Performance Tips
@@ -870,11 +870,11 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
  
     
      If you are adding large amounts of data to an existing table,
-    it might be a win to drop the index,
-    load the table, and then recreate the index.  Of course, the
+    it might be a win to drop the indexes,
+    load the table, and then recreate the indexes.  Of course, the
      database performance for other users might suffer
-    during the time the index is missing.  One should also think
-    twice before dropping unique indexes, since the error checking
+    during the time the indexes are missing.  One should also think
+    twice before dropping a unique index, since the error checking
      afforded by the unique constraint will be lost while the index is
      missing.
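
The drop, load, recreate sequence described in this paragraph might look like the following. This is a minimal sketch, not part of the patch: the table `measurements`, the index `measurements_taken_at_idx`, the column `taken_at`, and the file path are all hypothetical names chosen for illustration.

```sql
-- Hypothetical names throughout; adjust to the real schema.
-- While the index is missing, queries from other sessions that relied on it
-- will fall back to slower plans.
DROP INDEX measurements_taken_at_idx;

-- Server-side COPY reads a file on the database server and needs the
-- corresponding privileges.
COPY measurements FROM '/path/to/measurements.csv' WITH (FORMAT csv);

-- Rebuild the index once, over the fully loaded table.
CREATE INDEX measurements_taken_at_idx ON measurements (taken_at);

-- Refresh planner statistics after the bulk change.
ANALYZE measurements;
```

If the dropped index is a unique index, duplicate rows loaded in the meantime will only be detected when the index is recreated, which is the loss of error checking the paragraph warns about.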
     
@@ -890,6 +890,19 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
      the constraints.  Again, there is a trade-off between data load
      speed and loss of error checking while the constraint is missing.
     
+
+   
+    What's more, when you load data into a table with existing foreign key
+    constraints, each new row requires an entry in the server's list of
+    pending trigger events (since it is the firing of a trigger that checks
+    the row's foreign key constraint).  Loading many millions of rows can
+    cause the trigger event queue to overflow available memory, leading to
+    intolerable swapping or even outright failure of the command.  Therefore
+    it may be necessary, not just desirable, to drop and re-apply
+    foreign keys when loading large amounts of data.  If temporarily removing
+    the constraint isn't acceptable, the only other recourse may be to split
+    up the load operation into smaller transactions.
+   
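
Both options named in the added paragraph can be sketched in SQL. This is an illustration under assumed names, not text from the patch: the tables `orders` and `customers`, the constraint `orders_customer_id_fkey`, the columns, and the file paths are hypothetical.

```sql
-- Option 1: drop the foreign key, load, then re-apply it.
-- Hypothetical schema: orders.customer_id references customers(id).
ALTER TABLE orders DROP CONSTRAINT orders_customer_id_fkey;

COPY orders FROM '/path/to/orders.csv' WITH (FORMAT csv);

-- Re-adding the constraint checks the loaded rows at this point rather than
-- queuing one pending trigger event per inserted row during the load.
ALTER TABLE orders
    ADD CONSTRAINT orders_customer_id_fkey
    FOREIGN KEY (customer_id) REFERENCES customers (id);

-- Option 2: keep the constraint and split the load into smaller pieces.
-- Outside an explicit transaction block, each statement commits on its own,
-- so the pending trigger events never cover more than one chunk.
COPY orders FROM '/path/to/orders_part1.csv' WITH (FORMAT csv);
COPY orders FROM '/path/to/orders_part2.csv' WITH (FORMAT csv);
```

With option 1, rows that violate the constraint are reported only when the constraint is re-added, so cleanup happens after the load rather than during it.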
    
  
    
@@ -930,11 +943,11 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
      When loading large amounts of data into an installation that uses
      WAL archiving or streaming replication, it might be faster to take a
      new base backup after the load has completed than to process a large
-    amount of incremental WAL data. You might want to disable archiving
-    and streaming replication while loading, by setting
+    amount of incremental WAL data.  To prevent incremental WAL logging
+    while loading, disable archiving and streaming replication, by setting
      wal_level to minimal,
-     archive_mode off, and
-     max_wal_senders to zero).
+     archive_mode to off, and
+     max_wal_senders to zero.
      But note that changing these settings requires a server restart.
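
Because these settings take effect only after a restart, it can be worth confirming the running values before starting the load. A minimal sketch using `SHOW`; the parameter names `wal_level`, `archive_mode`, and `max_wal_senders` are the ones spelled out in the pg_dump hunk of this same diff.

```sql
-- Inspect the values the running server is actually using.
SHOW wal_level;        -- the text recommends minimal during the load
SHOW archive_mode;     -- the text recommends off during the load
SHOW max_wal_senders;  -- the text recommends zero during the load
```

Running the same three statements after the load, once the settings have been restored and the server restarted, gives a quick check before taking the new base backup.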
     
  
@@ -1006,7 +1019,8 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
      pg_dump dump as quickly as possible, you need to
      do a few extra things manually.  (Note that these points apply while
      restoring a dump, not while creating it.
-    The same points apply when using pg_restore to load
+    The same points apply whether loading a text dump with
+    psql or using pg_restore to load
      from a pg_dump archive file.)
     
  
@@ -1027,10 +1041,11 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
       
        
         If using WAL archiving or streaming replication, consider disabling
-       them during the restore. To do that, set archive_mode off,
+       them during the restore. To do that, set archive_mode
+       to off,
         wal_level to minimal, and
-       max_wal_senders zero before loading the dump script,
-       and afterwards set them back to the right values and take a fresh
+       max_wal_senders to zero before loading the dump.
+       Afterwards, set them back to the right values and take a fresh
         base backup.
        
       
@@ -1044,10 +1059,14 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
         possibly discarding many hours of processing.  Depending on how
         interrelated the data is, that might seem preferable to manual cleanup,
         or not.  COPY commands will run fastest if you use a single
-       transaction and have WAL archiving turned off. 
-       pg_restore also has a 
-       which allows concurrent data loading and index creation, and has
-       the performance advantages of doing COPY in a single transaction.
+       transaction and have WAL archiving turned off.
+      
+     
+     
+      
+       If multiple CPUs are available in the database server, consider using
+       pg_restore's 
+       allows concurrent data loading and index creation.