+++ /dev/null
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA25886
- for
; Sun, 12 Mar 2000 23:31:10 -0500 (EST)
-Received: from news.tht.net (news.hub.org [216.126.91.242]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA04589 for
; Sun, 12 Mar 2000 23:19:33 -0500 (EST)
-Received: from hub.org (hub.org [216.126.84.1])
- by news.tht.net (8.9.3/8.9.3) with SMTP id XAA42854;
- Sun, 12 Mar 2000 23:05:05 -0500 (EST)
- by hub.org (8.9.3/8.9.3) with ESMTP id XAA95917
- for
; Sun, 12 Mar 2000 23:00:56 -0500 (EST)
-Received: (from pgman@localhost)
- by candle.pha.pa.us (8.9.0/8.9.0) id WAA25403
-Subject: [HACKERS] Fix for RENAME
-To: PostgreSQL-development
-Date: Sun, 12 Mar 2000 22:59:56 -0500 (EST)
-X-Mailer: ELM [version 2.4ME+ PL72 (25)]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Precedence: bulk
-Status: RO
-
-I have thought about the issue with ALTER TABLE RENAME and keeping the
-file system in sync with the database.
-
-It seems there are three commands that can cause these to get out of
-sync:
-
- CREATE TABLE/INDEX
- DROP TABLE/INDEX
- ALTER TABLE RENAME
-
-Now, if we had file names based only on the oid, we can eliminate file
-renaming for RENAME, but the others are still a problem.
-
-Seems there are three ways to get out of sync:
-
- ABORT transaction
- backend crash
- OS crash
-
-The last two are the same, except the backend crash restarts the
-postmaster, while the OS crash has the postmaster starting up normally.
-
-Here is my idea. Create a C List of file names to unlink on transaction
-commit or abort. For CREATE, unlink created files on transaction ABORT.
-For DROP, unlink dropped files on COMMIT. For RENAME, create a hard
-link for the new table linked to old table, and unlink the old file name
-on COMMIT or the new file on ABORT.
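The list described above can be sketched in C as follows. This is a minimal illustration, not PostgreSQL code: every name here (pending_add, note_create, note_drop, note_rename, pending_finish) and the fixed-size storage are assumptions made for the example. The sketch only decides which files are due for removal; the caller would then call unlink() on each returned path.

```c
#include <assert.h>
#include <string.h>

#define MAX_PENDING  64    /* hypothetical limits, chosen for the example */
#define MAX_PATH_LEN 256

/* When a queued file should be removed. */
typedef enum { ON_COMMIT, ON_ABORT } UnlinkWhen;

typedef struct {
    char       path[MAX_PATH_LEN];
    UnlinkWhen when;
} PendingUnlink;

static PendingUnlink pending[MAX_PENDING];
static int n_pending = 0;

/* Queue one path; returns 0 on success, -1 if it does not fit. */
int pending_add(const char *path, UnlinkWhen when)
{
    if (n_pending >= MAX_PENDING || strlen(path) >= MAX_PATH_LEN)
        return -1;
    strcpy(pending[n_pending].path, path);
    pending[n_pending].when = when;
    n_pending++;
    return 0;
}

/* CREATE: the new file must go away if the transaction aborts. */
int note_create(const char *file) { return pending_add(file, ON_ABORT); }

/* DROP: the old file may only go away once the transaction commits. */
int note_drop(const char *file) { return pending_add(file, ON_COMMIT); }

/* RENAME: called after new_file has been hard-linked to old_file. */
int note_rename(const char *old_file, const char *new_file)
{
    if (n_pending + 2 > MAX_PENDING ||
        strlen(old_file) >= MAX_PATH_LEN || strlen(new_file) >= MAX_PATH_LEN)
        return -1;                      /* queue both entries or neither */
    pending_add(old_file, ON_COMMIT);
    pending_add(new_file, ON_ABORT);
    return 0;
}

/*
 * End of transaction: store in out[] the paths that are now due for
 * unlink() and reset the list. The pointers stay valid until the next
 * pending_add() call. Returns the number of paths stored.
 */
int pending_finish(int committed, const char **out, int max_out)
{
    UnlinkWhen fire = committed ? ON_COMMIT : ON_ABORT;
    int n = 0;

    assert(out != NULL);
    for (int i = 0; i < n_pending; i++)
        if (pending[i].when == fire && n < max_out)
            out[n++] = pending[i].path;
    n_pending = 0;
    return n;
}
```

Keeping the decision separate from the actual unlink() call makes the commit/abort rule easy to check without touching the file system.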
-
-That takes care of COMMIT and ABORT. For backend crash or OS crash, add
-a postgres command-line flag for recovery. Have the postmaster on
-startup or shared memory refresh start up a postgres backend on every
-database with the recovery flag set. Have the postgres backend find all
-the oids in the pg_class table, and have it go through every file in the
-database directory and remove all files that don't match the oids/names
-in pg_class. Also, remove all old sort, noname, and temp files at the
-same time. Seems we should be doing this anyway.
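The recovery pass described above amounts to a set difference between the files found in the database directory and the names known to pg_class. A minimal C sketch of that comparison follows; find_orphans and its parameters are hypothetical, and reading the directory and pg_class is assumed to have happened already.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if name appears in known[0..n_known), else 0. */
static int is_known(const char *name, const char **known, int n_known)
{
    for (int i = 0; i < n_known; i++)
        if (strcmp(name, known[i]) == 0)
            return 1;
    return 0;
}

/*
 * files[]: names found in the database directory.
 * known[]: file names derived from pg_class (assumed already collected).
 * Stores in orphans[] every file with no catalog entry and returns
 * how many were stored.
 */
int find_orphans(const char **files, int n_files,
                 const char **known, int n_known,
                 const char **orphans, int max_orphans)
{
    int n = 0;

    assert(orphans != NULL);
    for (int i = 0; i < n_files && n < max_orphans; i++)
        if (!is_known(files[i], known, n_known))
            orphans[n++] = files[i];
    return n;
}
```

The linear search keeps the example short; a real pass over a large directory would more likely sort both lists or use a hash table.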
-
-Care would have to be taken that a corrupted database that caused a
-postgres crash on connection would not get the postmaster startup into
-an infinite loop.
-
-Comments?
-
---
- Bruce Momjian | http://www.op.net/~candle
- + If your life is a hard drive, | 830 Blythe Avenue
- + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA23826
- for
; Tue, 14 Mar 2000 13:33:29 -0500 (EST)
-Received: by wallace.ece.rice.edu
- via sendmail from stdin
- id (Debian Smail3.2.0.102)
-Date: Tue, 14 Mar 2000 12:33:32 -0600
-From: "Ross J. Reedstrom"
-To: Hiroshi Inoue
-Subject: Re: [HACKERS] Fix for RENAME
-Mime-Version: 1.0
-Content-Type: text/plain; charset=us-ascii
-User-Agent: Mutt/1.0i
-Status: RO
-
-Hiroshi -
-I've just about finished working up a patch to store the physical
-file name in the pg_class table. There are only two places that
-require a Rule for generating the filename, and one of them is
-only used for bootstrapping. For the initial cut, I used the rule:
-
-The filename consists of the TABLENAME, an underscore, and the OID.
-If this is longer than NAMEDATALEN, shorten the TABLENAME.
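The rule above can be illustrated with a short C sketch. It is a stand-in written for this example, not the makeObjectName function mentioned in the message; the function name make_phys_filename and the NAMEDATALEN value of 32 are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAMEDATALEN 32   /* assumed value; buffer size including the NUL */

/*
 * Build "<relname>_<oid>" in buf, which must hold NAMEDATALEN bytes.
 * If the result would be too long, the relname part is shortened and
 * the OID suffix is kept whole. Returns 0 on success, -1 on failure.
 */
int make_phys_filename(char *buf, const char *relname, unsigned int oid)
{
    char suffix[16];
    int  suffix_len;
    int  room;
    int  name_len;

    assert(buf != NULL && relname != NULL);

    suffix_len = snprintf(suffix, sizeof suffix, "_%u", oid);
    room = NAMEDATALEN - 1 - suffix_len;   /* bytes left for relname */
    if (suffix_len < 0 || room < 1)
        return -1;

    name_len = (int) strlen(relname);
    if (name_len > room)
        name_len = room;                   /* shorten the TABLENAME */

    memcpy(buf, relname, (size_t) name_len);
    memcpy(buf + name_len, suffix, (size_t) suffix_len + 1); /* copies NUL */
    return 0;
}
```

Because the OID suffix is never truncated, two relations with the same long name still get distinct file names.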
-
-I implemented this rule by exporting Tom's makeObjectName function
-from analyze.c, which is used to make other system generated names
-that are required to be human readable. Replacing this
-rule with any other in the future would be straightforward, except
-for bootstrap. There are a number of places in bootstrap that need to
-know the filename. I've factored them out into yet another set of
-#defines (in catname.h) to make that easier.
-
-
-I'm working through the regression tests right now: this is a relatively
-extensive change, since it modifies the low level access routines, and the
-buffer cache (which I indexed on physical filename, rather than relname,
-as it is now). Hopefully, I caught all the places that assume relname ==
-filename == unique name within a single database (see, I want schemas...)
-
-Ross
---
-Ross J. Reedstrom, Ph.D.,
-NSBRI Research Scientist/Programmer
-Computer and Information Technology Institute
-Rice University, 6100 S. Main St., Houston, TX 77005
-
-
-
-
-
-On Tue, Mar 14, 2000 at 02:24:52PM +0900, Hiroshi Inoue wrote:
-> > -----Original Message-----
-> >
-> > > > They use the existing table file. It is only when
-> > > > adding/removing/renaming file system files that this
-> > out-of-sync problem
-> > > > happens.
-> > > >
-> >
-> > Not sure. I was going to get the CREATE/DROP/RENAME working as it
-> > should then as we add more features, we can implement this solution for
-> > them too.
-> >
->
-> Hmm,is general solution difficult ?
-> Is more flexible naming rule bad ?
->
-> This the 3rd or 4th time that I mention the following.
->
-> PostgreSQL doesn't keep the information in itself where tables are
-> allocated. So we need a naming rule to find where existent tables
-> are allocated. Don't you wonder the spec ?
->
-> Regards.
->
-> Hiroshi Inoue
->
->
-
-Received: from corvette.mascari.com (dhcp26136016.columbus.rr.com [24.26.136.16])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA04395
- for
; Tue, 14 Mar 2000 17:32:14 -0500 (EST)
-Received: from mascari.com (ferrari.mascari.com [192.168.2.1])
- by corvette.mascari.com (8.9.3/8.9.3) with ESMTP id RAA09562;
- Tue, 14 Mar 2000 17:27:22 -0500
-Date: Tue, 14 Mar 2000 17:28:26 -0500
-From: Mike Mascari
-X-Mailer: Mozilla 4.7 [en] (Win95; I)
-X-Accept-Language: en
-MIME-Version: 1.0
-CC: Hiroshi Inoue ,
-Subject: Re: [HACKERS] Fix for RENAME
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: RO
-
-Bruce Momjian wrote:
->
-> > Hmm,is general solution difficult ?
-> > Is more flexible naming rule bad ?
-> >
-> > This the 3rd or 4th time that I mention the following.
->
-> That's because I didn't understand.
->
-> >
-> > PostgreSQL doesn't keep the information in itself where tables are
-> > allocated. So we need a naming rule to find where existent tables
-> > are allocated. Don't you wonder the spec ?
->
-> How does naming the files in the database help our DROP/CREATE problem?
-> It would help RENAME a little bit. Not sure about the others because
-> currently they don't have a problem.
-
-I've been thinking about this somewhat, and I think the first
-step necessary in correctly supporting ROLLBACK-able DDL
-statements in transactions is the change to relname_oid file names.
-Imagine the scenario:
-
-CREATE TABLE test (key int4);
-
-a) Session #1:
-
-BEGIN;
-
-b) Session #2:
-
-BEGIN;
-DROP TABLE test;
-CREATE TABLE test (value varchar(32));
-
-c) Session #1:
-
-DROP TABLE test;
-COMMIT;
-
-d) Session #2:
-
-COMMIT;
-
-What's clear to me is that, if DDL statements are to be
-ROLLBACK-able, either (1) an AccessExclusive lock is held on the
-relation until transaction commit (like Phillip Warner stated was
-Dec/Rdb's behavior) or (2) PostgreSQL must be capable of
-supporting "multi-versioned schema" as well as tuples. Before
-step 'c' is executed, both tables must simultaneously exist in
-the database with the same name, which works fine in the catalog
-thanks to MVCC, but requires that, on disk, there exists:
-
-test_01231 - Session #1's table, available for ROLLBACK
-test_13421 - Session #2's table, available for COMMIT
-
-Now, I believe it was Andreas who suggested that VACUUM be
-modified to perform cleanup. I agree with this. VACUUM will need
-to check for aborted relation tuples in pg_class and remove the
-associated file from the filesystem in the event, for example,
-that Session #2 aborted -or- Session #1 aborted leaving the
-original pg_class tuple the "active" one and Session #2 attempted
-to COMMIT, which violates the UNIQUE constraint on the relname of
-pg_class. In addition, for "active" relation entries, VACUUM
-should verify the filename is relname_oid for the given oid.
-If it is not, it should rename the file on the filesystem.
-Again, this is purely cosmetic, for administrative purposes
-only, and it would limit any lack of atomicity to the label
-of the relation file until the next VACUUM is run.
-
-For the case of ALTER TABLE RENAME, ALTER TABLE DROP COLUMN,
-etc., the same functionality would apply. But, as in previous
-discussions regarding ALTER TABLE DROP COLUMN, PostgreSQL MUST be
-capable of allowing multiple tuples with different attribute
-counts and types within the same relation:
-
-CREATE TABLE test (key int4);
-
-a) Session #1:
-
-BEGIN;
-
-b) Session #2:
-
-BEGIN;
-ALTER TABLE test ADD COLUMN value int4;
-INSERT INTO test values (1, 1);
-
-c) Session #1:
-
-INSERT INTO test values (0);
-COMMIT;
-
-d) Session #2:
-
-COMMIT;
-
-This also means that Hiroshi's plan to suppress the visibility of
-attributes for ALTER TABLE DROP COLUMN would be required anyway,
-to allow for "multi-versioning" of attributes within a single
-tuple (i.e., like multi-versioning of tuples within relations):
-an attribute is either visible or not, but the tuple should
-always grow until, of course, the next VACUUM.
-
-So, to support rollback-able DDL statements ("multi-versioning
-schema", if you will), PostgreSQL needs:
-
-1) relation names of the form relname_oid
-2) support "multi-versioning" of attributes within a single tuple
-3) modify VACUUM to:
-
- A) Remove filesystem files whose pg_class tuples are no longer
-valid
- B) Rename filesystem files to match the relname in pg_class when
-the relname_oid file name doesn't match
- C) Reconstruct relations after attributes have been
-added/dropped.
-
-4) All DDL statements should perform their non-create filesystem
-functions in the now infamous "post-transaction-commit" trigger.
-If the backend should crash between the time the transaction
-committed and the rename() or unlink(), no adverse effects would
-be encountered with the database WRT data, VACUUM would clean up
-the rename() problem, and, worst-case scenario, an old
-relname_oid file would lie around unused. But at least it
-would no longer prohibit the creation of a table by the same
-name....
-
-Just my humble opinion,
-
-Mike Mascari
-
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA08792
- for
; Tue, 14 Mar 2000 21:30:35 -0500 (EST)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id LAA00515; Wed, 15 Mar 2000 11:29:09 +0900
-From: "Hiroshi Inoue"
-To: "Ross J. Reedstrom" ,
-Cc: "PostgreSQL-development"
-Subject: RE: [HACKERS] Fix for RENAME
-Date: Wed, 15 Mar 2000 11:35:46 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Importance: Normal
-Status: ROr
-
-> -----Original Message-----
->
-> Hiroshi -
-> I've just about finished working up a patch to store the physical
-> file name in the pg_class table. There are only two places that
-> require a Rule for generating the filename, and one of them is
-> only used for bootstrapping.
-
-Thanks for trying it.
-It's nice that only two places require a naming rule.
-
-I'm not attached to any one naming rule.
-The only requirement is uniqueness, and the rule
-could be changed according to the situation.
-For example, we could change the naming rule according to
-the kind of relation, such as system/user relations.
-
-I'm now inclined to introduce a new system relation to store
-the physical path name. It could also have table(data)space
-information in the (near ?) future.
-It seems better to separate it from pg_class because table(data?)
-space may change the concept of table allocation.
-
-Comments ?
-
-Regards.
-
-Hiroshi Inoue
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA17887
- for
; Wed, 15 Mar 2000 03:00:57 -0500 (EST)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id CAA02974 for
; Wed, 15 Mar 2000 02:54:44 -0500 (EST)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id QAA00734; Wed, 15 Mar 2000 16:53:56 +0900
-From: "Hiroshi Inoue"
-Cc: "Ross J. Reedstrom" ,
- "PostgreSQL-development"
-Subject: RE: [HACKERS] Fix for RENAME
-Date: Wed, 15 Mar 2000 17:00:35 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Importance: Normal
-Status: ROr
-
-> -----Original Message-----
->
-> > I'm now inclined to introduce a new system relation to store
-> > the physical path name. It could also have table(data)space
-> > information in the (near ?) future.
-> > It seems better to separate it from pg_class because table(data?)
-> > space may change the concept of table allocation.
->
-> Why not just put it in pg_class?
->
-
-Not sure, it's only my feeling.
-Comments please, everyone.
-
-In this thread we have taken a practical approach that doesn't break the
-file-per-table assumption, and it wouldn't be so difficult to implement.
-In fact Ross has already tried it.
-
-However, there was a discussion about data(table)spaces some months ago,
-and a new discussion is under way now.
-Judging from the previous discussion, I can't expect that it will reach
-a practical consensus (there were so many opinions). We can take a
-practical step toward the future by encapsulating the information about
-table allocation. Separating table allocation info from pg_class seems
-to be one way to do that.
-There may be more essential things to encapsulate.
-
-Comments ?
-
-Regards.
-
-Hiroshi Inoue
-
-
-Received: from hub.org (hub.org [216.126.84.1])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA05789
- for
; Thu, 16 Mar 2000 04:02:29 -0500 (EST)
-Received: from hub.org (hub.org [216.126.84.1])
- by hub.org (8.9.3/8.9.3) with SMTP id CAA27302;
- Thu, 16 Mar 2000 02:58:55 -0500 (EST)
- by hub.org (8.9.3/8.9.3) with ESMTP id CAA23907
- for
; Thu, 16 Mar 2000 02:37:54 -0500 (EST)
-Received: from darwin.oche.de (uucp@localhost)
- by downtown.oche.de (8.9.3/8.9.3/Debian/GNU) with SMTP id IAA30654
- for
; Thu, 16 Mar 2000 08:40:04 +0100
-Received: from mne by darwin.oche.de with local (Exim 3.12 #1 (Debian))
- id 12VUhX-0003Vz-00
- for
; Thu, 16 Mar 2000 08:28:11 +0100
-Date: Thu, 16 Mar 2000 08:28:11 +0100 (CET)
-From: Martin Neumann
-Subject: [HACKERS] RfD: Design of tablespaces
-MIME-Version: 1.0
-Content-Type: TEXT/plain; CHARSET=US-ASCII
-Message-Id:
-Precedence: bulk
-Status: RO
-
-
-I have written down some thoughts on the concept of tablespaces.
-I would be happy to get some comments on them.
-
------------------------------------------------------------------
- Implementation of tablespaces within PostgreSQL
-- a brainstorming paper designed for general discussion -
-
-by Martin Neumann, 2000/3/15
-
-
-1. What are tablespaces?
--------------------------
-
-Tablespaces make it possible to distribute storage objects
-over multiple points of storage (POS). Therefore one could
-say a tablespace can be a POS.
-
-Example:
-
-tablespace_a -----> /mnt/raid/arena0/
-tablespace_b -----> /mnt/raid/emc0/
-
-Tablespaces can also store their data on other tablespaces:
-
-tablespace_c -----> tablespace_b
-
-This is quite interesting for administration purposes.
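Resolving a tablespace that stores its data on another tablespace can be sketched as a short lookup loop. The C below is only an illustration under assumptions: the Tablespace struct, its fields, and resolve_tablespace are hypothetical names, and the hop limit is one way to guard against tablespaces that link to each other in a cycle.

```c
#include <assert.h>
#include <string.h>

/* One tablespace: either a directory or a link to another tablespace. */
typedef struct {
    const char *name;
    const char *directory;  /* NULL when this tablespace is a link */
    const char *link_to;    /* NULL when this tablespace has a directory */
} Tablespace;

/*
 * Follow links until a tablespace with a real directory is found.
 * Returns that directory, or NULL if the name is unknown or the links
 * form a cycle.
 */
const char *resolve_tablespace(const Tablespace *ts, int n, const char *name)
{
    assert(ts != NULL && name != NULL);

    /* A chain without a cycle needs at most n lookups. */
    for (int hops = 0; hops <= n; hops++) {
        const Tablespace *found = NULL;

        for (int i = 0; i < n; i++) {
            if (strcmp(ts[i].name, name) == 0) {
                found = &ts[i];
                break;
            }
        }
        if (found == NULL)
            return NULL;               /* unknown tablespace */
        if (found->directory != NULL)
            return found->directory;   /* reached a point of storage */
        if (found->link_to == NULL)
            return NULL;               /* neither directory nor link */
        name = found->link_to;         /* follow the link */
    }
    return NULL;                       /* too many hops: a cycle */
}
```

With the example above, resolving tablespace_c would follow its link to tablespace_b and return /mnt/raid/emc0/.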
-
-
-2. What are its advantages?
-----------------------------
-
-As you can choose a different tablespace for every storage
-object (table, index etc.) it is easy to improve the following
-aspects of your system:
-
- - Reliability
-
- You can put storage objects (mostly tables) you strongly depend
- on onto a more reliable tablespace (mirrored RAID or perhaps
- simply a directory which gets backed up more often than others).
-
- - Speed
-
- You can put storage objects you rarely need onto a rather slow
- tablespace and keep your fast tablespaces free of them.
-
- A fast, but more expensive RAID-Stripeset can be used more
- efficiently as it doesn't get filled with non-performance
- sensitive data.
-
- Distributing storage objects that have equal speed requirements
- across different tablespaces also makes sense, as you gain speed
- by spreading data over more than one hard disk spindle.
-
- - Manageability
-
- You can grant and revoke rights on a per-tablespace basis.
-
- As every storage object belongs to exactly one tablespace,
- you can easily group storage objects using a tablespace.
-
-
-3. What about disk I/O?
-------------------------
-
-Tablespaces tell the storage manager only where to store
-the data, not how. This is the reasonable way.
-
-
-4. Usage
----------
-
-CREATE TABLESPACE tsname TYPE storage_type storage_options
-
-Examples:
-
-CREATE TABLESPACE tsemc0
- TYPE classic DIRECTORY /mnt/raid/emc0 NOFSYNC
-
-CREATE TABLESPACE tsarena0 TYPE raw DEVICE /dev/araid/0
- MINSIZE 128 MAXSIZE 4096 GROW 4 32 SHRINK 2 32
- BLOCKSIZE 16384
-
-CREATE TABLESPACE quick0 TYPE link TABLESPACE tsarena0;
-
---
-
-CREATE TABLE tbname ( ... ) TABLESPACE tsname;
-
-Examples:
-
-CREATE TABLE foo (
- id int4 NOT NULL UNIQUE,
- name text NOT NULL
-) TABLESPACE tsemc0;
-
-CREATE TABLE bar (
- id int4 NOT NULL UNIQUE,
- name text NOT NULL
-) TABLESPACE default;
-
-If the tablespace isn't given, the storage object is created
-in the "default" tablespace.
-
-"default" is PostgreSQL's default tablespace and the only one
-that has to exist on each system.
-
---
-
-ALTER TABLESPACE tsname tssettings
-
-Examples:
-
-ALTER TABLESPACE tsemc0 DIRECTORY /mnt/raid/emc1
-
-
-NOTE: altering tablespaces without recreating the contained
-storage objects introduces many problems.
-Realisation is difficult and won't be my first goal.
-
---
-
-DROP TABLESPACE tsname [FORCE]
-
-Examples:
-
-DROP TABLESPACE tsarena0
-
-This will immediately remove the tablespace tsarena0
-if it contains no storage objects.
-
-If it still contains some, the tablespace is marked for
-deletion.
-
-This means:
-1. you can't create new storage objects in the tablespace
-2. if the last storage object inside gets dropped, the
- tablespace will be removed.
-
-
-DROP TABLESPACE tsarena0 FORCE
-
-This will remove the tablespace including all contained
-storage objects immediately.
-
---
-
-VACUUM tsname
-
-Example:
-
-VACUUM tsemc1
-
-This will vacuum a single tablespace with all contained
-storage objects.
------------------------------------------------------------------
-
---
-Martin Neumann, Welkenrather Str. 118c, 52074 Aachen, Germany
-Tel. 0241 / 8876-080 - Mobil: 0173 / 27 69 632
-..------.---------------------------------------------------------
-| at | Inform GmbH - Abteilung Airport Logistics
-| work | Pascalstr. 23 - 52076 Aachen - Tel. 02408 / 9456-0
-
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA21372
- for
; Wed, 14 Jun 2000 19:00:59 -0400 (EDT)
-Received: from mailout02.sul.t-online.com (mailout02.sul.t-online.com [194.25.134.17]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id SAA01930 for
; Wed, 14 Jun 2000 18:51:11 -0400 (EDT)
-Received: from fwd01.sul.t-online.de
- by mailout02.sul.t-online.com with smtp
- id 132Lz6-0004ec-01; Thu, 15 Jun 2000 00:50:08 +0200
-Received: from hot.jw.home (340000654369-0001@[62.224.107.172]) by fwd01.sul.t-online.de
- with esmtp id 132Lyy-0tYyi9C; Thu, 15 Jun 2000 00:50:00 +0200
-Received: (from wieck@localhost)
- by hot.jw.home (8.8.5/8.8.5) id WAA07887;
- Wed, 14 Jun 2000 22:43:39 +0200
-Subject: Re: [HACKERS] Big 7.1 open items
- am"
-To: Tom Lane
-Date: Wed, 14 Jun 2000 22:43:39 +0200 (MEST)
-CC: Oliver Elphick
, Bruce Momjian ,
-Reply-To: Jan Wieck
-X-Mailer: ELM [version 2.4ME+ PL68 (25)]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Status: ROr
-
-Tom Lane wrote:
-> "Oliver Elphick"
writes:
-> > I suggest that DROP TABLE in a transaction should not be allowed.
->
-> I had actually made it do that for a short time early this year,
-> and was shouted down. On reflection I have to agree; it's too useful
-> to be able to do
->
-> begin;
-> drop table foo;
-> create table foo(new schema);
-> ...
-> end;
->
-> You do indeed lose big if you suffer an error partway through, but
-> the answer to that is to fix our file naming conventions so that we
-> can support rollback of drop table.
-
- Belongs IMHO to the discussion to keep separate what is
- separate (having indices/toast-relations/etc. in separate
- directories and whatnot).
-
- I've never been really happy with the file naming
- conventions. The need for a filesystem entry to have the same
- name as the DB object that is associated with it isn't right.
- I know, some people love to be able to easily identify the
- files with ls(1). OTOH what is that good for?
-
- Well, someone can easily see how big the disk footprint of
- his data is. Wow, what information. Anything else?
-
- Why not change the naming to something like this:
-
- /catalog_tables/pg_...
- /catalog_index/pg_...
- /user_tables/oid_...
- /user_index/oid_...
- /temp_tables/oid_...
- /temp_index/oid_...
- /toast_tables/oid_...
- /toast_index/oid_...
- /whatnot_???/...
-
- This way, it would be much easier to separate all the
- different object types onto different physical media. We would
- lose some transparency, but I've always wondered what
- people USE that for (except for just wanting to know). For
- convenience we could implement another little utility that
- tells the object size, like
-
- DESCRIBE TABLE/VIEW/whatnot
-
- that returns the physical location and storage details of the
- object. And psql could use it to print this info additionally
- in the \d commands. This would give unprivileged users access to
- this info, but so be it; it's not a security issue IMHO.
-
- The subdirectory an object goes into has to be controlled by
- the relkind. So we need to tidy up that a little too. I think
- it's worth it.
-
- The object's storage location (the bare file) would now
- contain the OID. So we avoid naming conflicts for temp
- tables, naming conflicts during DROP/CREATE in a transaction,
- and the like.
-
- Comments?
-
-
-Jan
-
---
-
-#======================================================================#
-# It's easier to get forgiveness for being wrong than for being right. #
-# Let's break this rule - forgive me. #
-
-
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA02821
- for
; Wed, 14 Jun 2000 22:06:52 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16609;
- Wed, 14 Jun 2000 22:07:16 -0400 (EDT)
-To: Jan Wieck
-cc: Oliver Elphick
, Bruce Momjian ,
-Subject: Re: [HACKERS] Big 7.1 open items
- message dated "Wed, 14 Jun 2000 22:43:39 +0200"
-Date: Wed, 14 Jun 2000 22:07:15 -0400
-From: Tom Lane
-Status: RO
-
-> I've never been really happy with the file naming
-> conventions. The need of a filesystem entry to have the same
-> name of the DB object that is associated with it isn't right.
-> I know, some people love to be able to easily identify the
-> files with ls(1). OTOH what is that good for?
-
-I agree with Jan on this: let's just change the file names over to
-be OIDs. Then we can have rollbackable DROP and RENAME TABLE easily.
-Naming the files after the logical names of the tables is nice if it
-doesn't cost anything, but it is *not* worth the trouble to preserve
-a relationship between filename and tablename when it is costing us.
-And it's costing us big time. That single feature is hurting us on
-functionality, robustness, and portability, and for what benefit?
-Not nearly enough. It's time to just let go of it.
-
-> Why not changing the naming to be something like this:
-
-> /catalog_tables/pg_...
-> /catalog_index/pg_...
-> /user_tables/oid_...
-> /user_index/oid_...
-> /temp_tables/oid_...
-> /temp_index/oid_...
-> /toast_tables/oid_...
-> /toast_index/oid_...
-> /whatnot_???/...
-
-I don't see a lot of value in that. Better to do something like
-tablespaces:
-
- //
-
- regards, tom lane
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA25561
- for
; Wed, 14 Jun 2000 22:20:56 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16708;
- Wed, 14 Jun 2000 22:21:30 -0400 (EDT)
-cc: Jan Wieck
, Oliver Elphick ,
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 14 Jun 2000 19:13:47 -0400"
-Date: Wed, 14 Jun 2000 22:21:30 -0400
-From: Tom Lane
-Status: ROr
-
-> You need something that works from the command line, and something that
-> works if PostgreSQL is not running. How would you restore one file from
-> a tape.
-
-"Restore one file from a tape"? How are you going to do that anyway?
-You can't save and restore portions of a database like that, because
-of transaction commit status problems. To restore table X correctly,
-you'd have to restore pg_log as well, and then your other tables are
-hosed --- unless you also restore all of them from the backup. Only
-a complete database restore from tape would work, and for that you
-don't need to tell which file is which. So the above argument is a
-red herring.
-
-I realize it's nice to be able to tell which table file is which by
-eyeball, but the price we are paying for that small convenience is
-just too high. Give that up, and we can have rollbackable DROP and
-RENAME now (I'll personally commit to making it happen for 7.1).
-Continue to insist on it, and I don't think we'll *ever* have those
-features in a really robust form. It's just not possible to do
-multiple file renames atomically.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA10091
- for
; Wed, 14 Jun 2000 22:31:41 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5F2UI853244;
- Wed, 14 Jun 2000 22:30:18 -0400 (EDT)
- by hub.org (8.10.1/8.10.1) with ESMTP id e5F2Th852641
- for
; Wed, 14 Jun 2000 22:29:43 -0400 (EDT)
-Received: (from pgman@localhost)
- by candle.pha.pa.us (8.9.0/8.9.0) id WAA06576;
- Wed, 14 Jun 2000 22:28:53 -0400 (EDT)
-Subject: Re: [HACKERS] Big 7.1 open items
- pm"
-To: Tom Lane
-Date: Wed, 14 Jun 2000 22:28:53 -0400 (EDT)
-CC: Jan Wieck
, Oliver Elphick ,
-X-Mailer: ELM [version 2.4ME+ PL77 (25)]
-MIME-Version: 1.0
-Content-Transfer-Encoding: 7bit
-Content-Type: text/plain; charset=US-ASCII
-Precedence: bulk
-Status: RO
-
-> > You need something that works from the command line, and something that
-> > works if PostgreSQL is not running. How would you restore one file from
-> > a tape.
->
-> "Restore one file from a tape"? How are you going to do that anyway?
-> You can't save and restore portions of a database like that, because
-> of transaction commit status problems. To restore table X correctly,
-> you'd have to restore pg_log as well, and then your other tables are
-> hosed --- unless you also restore all of them from the backup. Only
-> a complete database restore from tape would work, and for that you
-> don't need to tell which file is which. So the above argument is a
-> red herring.
->
-> I realize it's nice to be able to tell which table file is which by
-> eyeball, but the price we are paying for that small convenience is
-> just too high. Give that up, and we can have rollbackable DROP and
-> RENAME now (I'll personally commit to making it happen for 7.1).
-> Continue to insist on it, and I don't think we'll *ever* have those
-> features in a really robust form. It's just not possible to do
-> multiple file renames atomically.
->
-
-OK, I am flexible. (Yea, right.) :-)
-
-But seriously, let me give some background. I used Ingres, that used
-the VMS file system, but used strange sequential AAAF324 numbers for
-tables. When someone deleted a table, or we were looking at what tables
-were using disk space, it was impossible to find the Ingres table names
-that went with the file. There was a system table that showed it, but
-it was poorly documented, and if you deleted the table, there was no way
-to look on the tape to find out which file to restore.
-
-As far as pg_log, you certainly would not expect to get any information
-back from the time of the backup table to current, so the current pg_log
-would be just fine.
-
-Basically, I guess we have to do it, but we have to print the proper
-error messages for cases in the backend where we just print the file name.
-Also, we have to now replace the 'ls -l' command with something that
-will be meaningful.
-
-Right now, we use 'ps' with args to display backend information, and ls
--l to show disk information. We are going to lose that here.
-
-
-
---
- Bruce Momjian | http://www.op.net/~candle
- + If your life is a hard drive, | 830 Blythe Avenue
- + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA09340
- for
; Wed, 14 Jun 2000 22:31:00 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16783
- for
; Wed, 14 Jun 2000 22:31:34 -0400 (EDT)
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 14 Jun 2000 22:23:58 -0400"
-Date: Wed, 14 Jun 2000 22:31:33 -0400
-From: Tom Lane
-Status: RO
-
-> Can I phone you?
-
-Sure, I'm here.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA27501
- for
; Wed, 14 Jun 2000 22:38:28 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5F2bD870244;
- Wed, 14 Jun 2000 22:37:13 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5F2af869743
- for
; Wed, 14 Jun 2000 22:36:41 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id WAA16814;
- Wed, 14 Jun 2000 22:36:19 -0400 (EDT)
-cc: Jan Wieck
, Oliver Elphick ,
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 14 Jun 2000 22:28:53 -0400"
-Date: Wed, 14 Jun 2000 22:36:19 -0400
-From: Tom Lane
-Precedence: bulk
-Status: ROr
-
-> But seriously, let me give some background. I used Ingres, that used
-> the VMS file system, but used strange sequential AAAF324 numbers for
-> tables. When someone deleted a table, or we were looking at what tables
-> were using disk space, it was impossible to find the Ingres table names
-> that went with the file. There was a system table that showed it, but
-> it was poorly documented, and if you deleted the table, there was no way
-> to look on the tape to find out which file to restore.
-
-Fair enough, but it seems to me that the answer is to expend some effort
-on system admin support tools. We could do a lot in that line with less
-effort than trying to make a fundamentally mismatched filesystem
-representation do what we need.
-
- regards, tom lane
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA06306
- for
; Wed, 14 Jun 2000 23:13:26 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id XAA16988;
- Wed, 14 Jun 2000 23:13:53 -0400 (EDT)
-cc: Jan Wieck
, Oliver Elphick ,
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 14 Jun 2000 22:44:16 -0400"
-Date: Wed, 14 Jun 2000 23:13:52 -0400
-From: Tom Lane
-Status: ROr
-
-> That was my point --- that in doing this change, we are taking on more
-> TODO items, that may detract from our main TODO items.
-
-True, but they are also TODO items that could be handled by people other
-than the inner circle of key developers. The actual rejiggering of
-table-to-filename mapping is going to have to be done by one of the
-small number of people who are fully up to speed on backend internals.
-But we've got a lot more folks who would be able (and, hopefully,
-willing) to design and code whatever tools are needed to make the
-dbadmin's job easier in the face of the new filesystem layout. I'd
-rather not expend a lot of core time to avoid needing those tools,
-especially when I feel the old approach is fatally flawed anyway.
-
-> Even gdb shows us the filename/tablename in backtraces. We are never
-> going to be able to reproduce that.
-
-Backtraces from *what*, exactly? 99% of the backend is still going
-to be dealing with the same data as ever. It might be that poking
-around in fd.c will be a little harder, but considering that fd.c
-doesn't really know or care what the files it's manipulating are
-anyway, I'm not convinced that this is a real issue.
-
-> I guess I don't consider table schema commands inside transactions and
-such to be as big an item as the utility features we will need to
-> build.
-
-You've *got* to be kidding. We're constantly seeing complaints about
-the fact that rolling back DROP or RENAME TABLE fails --- and worse,
-leaves the table in a corrupted/inconsistent state. As far as I can
-tell, that's one of the worst robustness problems we've got left to
-fix. This is a big deal IMHO, and I want it to be fixed and fixed
-right. I don't see how to fix it right if we try to keep physical
-filenames tied to logical tablenames.
-
-Moreover, that restriction will continue to hurt us if we try to
-preserve it while implementing tablespaces, ANSI schemas, etc.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24286
- for
; Thu, 15 Jun 2000 03:03:32 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5F72T815284;
- Thu, 15 Jun 2000 03:02:29 -0400 (EDT)
-Received: from mailo.vtcif.telstra.com.au (mailo.vtcif.telstra.com.au [202.12.144.17])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5F721814963
- for
; Thu, 15 Jun 2000 03:02:01 -0400 (EDT)
-Received: (from uucp@localhost) by mailo.vtcif.telstra.com.au (8.8.2/8.6.9) id RAA01186; Thu, 15 Jun 2000 17:01:48 +1000 (EST)
-Received: from maili.vtcif.telstra.com.au(202.12.142.17)
- via SMTP by mailo.vtcif.telstra.com.au, id smtpd0SbI.z; Thu Jun 15 17:00:39 2000
-Received: (from uucp@localhost) by maili.vtcif.telstra.com.au (8.8.2/8.6.9) id RAA21419; Thu, 15 Jun 2000 17:00:37 +1000 (EST)
-Received: from localhost(127.0.0.1), claiming to be "mail.cdn.telstra.com.au"
- via SMTP by localhost, id smtpdWTHrU_; Thu Jun 15 16:59:34 2000
-Received: from lunitari.nimrod.itg.telecom.com.au (lunitari.nimrod.itg.telecom.com.au [192.53.254.48]) by mail.cdn.telstra.com.au (8.8.2/8.6.9) with ESMTP id QAA04796; Thu, 15 Jun 2000 16:59:33 +1000 (EST)
-Received: from nimrod.itg.telecom.com.au (majere [192.53.254.45])
- by lunitari.nimrod.itg.telecom.com.au (8.9.1/8.9.3) with ESMTP id QAA18056;
- Thu, 15 Jun 2000 16:58:17 +1000 (EST)
-Date: Thu, 15 Jun 2000 16:56:12 +1000
-From: Chris Bitmead
-Organization: IBM Global Services
-X-Mailer: Mozilla 4.6 [en] (X11; I; SunOS 5.6 sun4u)
-X-Accept-Language: en
-MIME-Version: 1.0
-To: "Ross J. Reedstrom"
-CC: PostgreSQL-development
-Subject: Re: [HACKERS] Big 7.1 open items
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Precedence: bulk
-Status: RO
-
-"Ross J. Reedstrom" wrote:
-
-> Any strong objections to the mixed relname_oid solution? It gets us
-> everything oids does, and still lets Bruce use 'ls -l' to find the big
-> tables, putting off writing any admin tools that'll need to be rewritten,
-> anyway.
-
-Doesn't relname_oid defeat the purpose of oid file names, which is that
-they don't change when the table is renamed? Wasn't it going to be oids
-with a tool to create a symlink of relname -> oid ?
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24604
- for
; Thu, 15 Jun 2000 03:31:15 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA01191 for
; Thu, 15 Jun 2000 03:15:28 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5F7CP835301;
- Thu, 15 Jun 2000 03:12:25 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5F7Bt833744
- for
; Thu, 15 Jun 2000 03:11:55 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA18801;
- Thu, 15 Jun 2000 03:11:53 -0400 (EDT)
-To: "Ross J. Reedstrom"
-cc: PostgreSQL-development
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to "Ross J. Reedstrom"
- message dated "Thu, 15 Jun 2000 01:03:12 -0500"
-Date: Thu, 15 Jun 2000 03:11:52 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-"Ross J. Reedstrom" writes:
-> Any strong objections to the mixed relname_oid solution?
-
-Yes!
-
-You cannot make it work reliably unless the relname part is the original
-relname and does not track ALTER TABLE RENAME. IMHO having an obsolete
-relname in the filename is worse than not having the relname at all;
-it's a recipe for confusion, it means you still need admin tools to tell
-which end is really up, and what's worst is you might think you don't.
-
-Furthermore it requires an additional column in pg_class to keep track
-of the original relname, which is a waste of space and effort.
-
-It also creates a portability risk, or at least fails to remove one,
-since you are critically dependent on the assumption that the OS
-supports long filenames --- on a filesystem that truncates names to less
-than about 45 characters you're in very deep trouble. An OID-only
-approach still works on traditional 14-char-filename Unix filesystems
-(it'd mostly even work on DOS 8+3, though I doubt we care about that).
-
-Finally, one of the reasons I want to go to filenames based only on OID
-is that that'll make life easier for mdblindwrt. Original relname + OID
-doesn't help, in fact it makes life harder (more shmem space needed to
-keep track of the filename for each buffer).
-
-Can we *PLEASE JUST LET GO* of this bad idea? No relname in the
-filename. Period.
-
- regards, tom lane
-
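Editorial sketch of the OID-only naming argued for above, not PostgreSQL code. It assumes a toy catalog (the class name `ToyCatalog`, the starting OID 16384, and the snapshot-based rollback are all hypothetical) in which the file name is derived from the OID alone, so a rename or its rollback never touches the filesystem.

```python
import os
import tempfile


class ToyCatalog:
    """Toy stand-in for a system catalog: maps relation name -> OID."""

    def __init__(self, datadir):
        self.datadir = datadir
        self.name_to_oid = {}
        self.next_oid = 16384  # arbitrary starting value for this sketch

    def create_table(self, relname):
        oid = self.next_oid
        self.next_oid += 1
        # The file name depends only on the OID, never on the relname.
        open(os.path.join(self.datadir, str(oid)), "wb").close()
        self.name_to_oid[relname] = oid
        return oid

    def rename_table(self, old_name, new_name):
        if new_name in self.name_to_oid:
            raise ValueError(f"relation {new_name!r} already exists")
        # Catalog-only change: no file is renamed.
        self.name_to_oid[new_name] = self.name_to_oid.pop(old_name)

    def snapshot(self):
        return dict(self.name_to_oid)

    def rollback(self, snap):
        # Undoing a rename is just restoring the old mapping.
        self.name_to_oid = dict(snap)


with tempfile.TemporaryDirectory() as datadir:
    cat = ToyCatalog(datadir)
    cat.create_table("foo")
    files_before = sorted(os.listdir(datadir))
    saved = cat.snapshot()
    cat.rename_table("foo", "foo_old")
    cat.rollback(saved)
    print(files_before == sorted(os.listdir(datadir)), cat.name_to_oid)
```

Because the only mutable state is the name-to-OID mapping, there is no window in which the catalog and the directory can disagree about a rename.
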
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24592
- for
; Thu, 15 Jun 2000 03:31:10 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA01213 for
; Thu, 15 Jun 2000 03:15:46 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA18833;
- Thu, 15 Jun 2000 03:14:30 -0400 (EDT)
-cc: Jan Wieck
, Oliver Elphick ,
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 14 Jun 2000 23:21:15 -0400"
-Date: Thu, 15 Jun 2000 03:14:30 -0400
-From: Tom Lane
-Status: RO
-
-> Well, we did have someone do a test implementation of oid file names,
-> and their report was that it looked pretty ugly. However, if people are
-> convinced it has to be done, we can get started. I guess I was waiting
-> for Vadim's storage manager, where the whole idea of separate files is
-> going to go away anyway, I suspect. We would then have to re-write all
-> our admin tools for the new format.
-
-I seem to recall him saying that he wanted to go to filename == OID
-just like I'm suggesting. But I agree we probably ought to hold off
-doing anything until he gets back from Russia and can let us know
-whether that's still his plan. If he is planning one-huge-file or
-something like that, we might as well let these issues go unfixed
-for one more release cycle.
-
- regards, tom lane
-
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA24584
- for
; Thu, 15 Jun 2000 03:30:56 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id JAA29140;
- Thu, 15 Jun 2000 09:31:12 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Thu, 15 Jun 2000 09:31:12 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C604AF7DE4@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Tom Lane'"
, Bruce Momjian
-Cc: Jan Wieck
, Oliver Elphick ,
-Subject: AW: [HACKERS] Big 7.1 open items
-Date: Thu, 15 Jun 2000 09:31:11 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: RO
-
-
-> > You need something that works from the command line, and
-> something that
-> > works if PostgreSQL is not running. How would you restore
-> one file from
-> > a tape.
->
-> "Restore one file from a tape"? How are you going to do that anyway?
-> You can't save and restore portions of a database like that, because
-> of transaction commit status problems. To restore table X correctly,
-> you'd have to restore pg_log as well, and then your other tables are
-> hosed --- unless you also restore all of them from the backup. Only
-> a complete database restore from tape would work, and for that you
-> don't need to tell which file is which. So the above argument is a
-> red herring.
-
-From what I know it is possible to simply restore one table file
-since pg_log keeps all tid's. Of course it cannot guarantee integrity
-and does not work if the table was altered.
-
-> I realize it's nice to be able to tell which table file is which by
-> eyeball, but the price we are paying for that small convenience is
-> just too high. Give that up, and we can have rollbackable DROP and
-> RENAME now (I'll personally commit to making it happen for 7.1).
-> Continue to insist on it, and I don't think we'll *ever* have those
-> features in a really robust form. It's just not possible to do
-> multiple file renames atomically.
-
-In the last proposal Bruce and I had it all laid out for tabname + oid
-with no overhead in the normal situation, and little overhead if a rename
-table crashed or was not rolled back or committed properly
-which imho had all advantages combined.
-
-Andreas
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA25144
- for
; Thu, 15 Jun 2000 04:31:03 -0400 (EDT)
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id EAA03225 for
; Thu, 15 Jun 2000 04:05:41 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA100894;
- Thu, 15 Jun 2000 10:04:52 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Thu, 15 Jun 2000 10:04:52 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C604AF7DE7@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Don Baccus'" ,
- Bruce Momjian
-Cc: Jan Wieck
, Oliver Elphick ,
-Subject: AW: [HACKERS] Big 7.1 open items
-Date: Thu, 15 Jun 2000 10:04:51 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="windows-1252"
-Status: RO
-
-
-> In reality, very few people are going to be interested in restoring
-> a table in a way that breaks referential integrity and other
-> normal assumptions about what exists in the database.
-
-This is not true. In my DBA history it would have saved me man-weeks
-of work if an easy and efficient restore of one single table from backup
-had been available in Informix and Oracle.
-We always had to restore most of the whole system to another machine only
-to get back at some table info that would then be manually re-added
-to the production system.
-A restore of one table to a different/new tablename would have been
-very convenient, and this is currently possible in PostgreSQL.
-(create new table with same schema, then replace new table data file
-with file from backup)
-
-> The reality
-> is that most people are going to engage in a little time travel
-> to a past, consistent backup rather than do as you suggest.
-
-No, this is what is done most of the time, but it is very inconvenient
-to tell people that they lose all work from past days, so it is usually
-done as I noted above if possible. We once had a situation where all data
-was deleted from a table, but the problem was only noticed 3 weeks later.
-
-> This is going to be more and more true as Postgres gains more and
-> more acceptance in (no offense intended) the real world.
->
-> >Right now, we use 'ps' with args to display backend
-> information, and ls
-> >-l to show disk information. We are going to lose that here.
->
-> Dependence on "ls -l" is, IMO, a very weak argument.
-
-In normal situations where everything works I agree, it is the
-error situations where it really helps if you see what data is where.
-debugging, lsof, Bruce already named them.
-
-Andreas
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA25151
- for
; Thu, 15 Jun 2000 04:31:07 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id EAA04151 for
; Thu, 15 Jun 2000 04:30:23 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5F8RI883087;
- Thu, 15 Jun 2000 04:27:18 -0400 (EDT)
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5F8Qx881928
- for
; Thu, 15 Jun 2000 04:27:00 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA79848;
- Thu, 15 Jun 2000 10:26:13 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Thu, 15 Jun 2000 10:26:14 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C604AF7DE8@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Tom Lane'" ,
- "Ross J. Reedstrom"
-
-Cc: PostgreSQL-development
-Subject: AW: [HACKERS] Big 7.1 open items
-Date: Thu, 15 Jun 2000 10:26:12 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Precedence: bulk
-Status: ROr
-
-
-> "Ross J. Reedstrom" writes:
-> > Any strong objections to the mixed relname_oid solution?
->
-> Yes!
->
-> You cannot make it work reliably unless the relname part is
-> the original
-> relname and does not track ALTER TABLE RENAME.
-
-It does, or should at least. The only problem case is where the db crashes
-during alter or commit/rollback. This could be fixed by the first open that
-fails to find the file, or by vacuum, or some other utility.
-
-> IMHO having
-> an obsolete
-> relname in the filename is worse than not having the relname at all;
-> it's a recipe for confusion, it means you still need admin
-> tools to tell
-> which end is really up, and what's worst is you might think you don't.
->
-> Furthermore it requires an additional column in pg_class to keep track
-> of the original relname, which is a waste of space and effort.
-
-It does not.
-
-> Finally, one of the reasons I want to go to filenames based
-> only on OID
-> is that that'll make life easier for mdblindwrt. Original
-> relname + OID
-> doesn't help, in fact it makes life harder (more shmem space needed to
-> keep track of the filename for each buffer).
-
-I do not see this. The filename is constructed from relname+oid.
-If it is not found, do a directory scan for *_<oid>.dat; if found --> rename.
-
-Andreas
-
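Editorial sketch of the lazy repair described in the message above, under stated assumptions: files are named `<relname>_<oid>.dat` in a single directory, and the helper name `open_relation_file` is hypothetical. It shows only the lookup-then-rename step, not crash safety or locking.

```python
import glob
import os


def open_relation_file(datadir, relname, oid):
    """Return the path for relname+oid, repairing a stale prefix if needed."""
    expected = os.path.join(datadir, f"{relname}_{oid}.dat")
    if os.path.exists(expected):
        return expected

    # Fallback: the OID suffix is unique, so scan for any file carrying it.
    pattern = os.path.join(glob.escape(datadir), f"*_{oid}.dat")
    matches = glob.glob(pattern)
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one file for OID {oid}, found {len(matches)}"
        )

    # A single rename brings the file name back in line with the catalog.
    os.rename(matches[0], expected)
    return expected
```

The repair costs a directory scan only when the expected name is missing, which matches the claim of no overhead in the normal case. It does not address the objection that the prefix can be stale between the crash and the first open.
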
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA25462
- for
; Thu, 15 Jun 2000 05:01:02 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id EAA04667 for
; Thu, 15 Jun 2000 04:45:51 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5F8gr817124;
- Thu, 15 Jun 2000 04:42:53 -0400 (EDT)
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5F8gX815763
- for
; Thu, 15 Jun 2000 04:42:34 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA29072;
- Thu, 15 Jun 2000 10:41:51 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Thu, 15 Jun 2000 10:41:51 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C604AF7DE9@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Tom Lane'"
-Cc: PostgreSQL-development
-Subject: AW: [HACKERS] Big 7.1 open items
-Date: Thu, 15 Jun 2000 10:41:50 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Precedence: bulk
-Status: RO
-
-> It's just not possible to do
-> multiple file renames atomically.
-
-This is not necessary, since *_<oid> is unique regardless of relname prefix.
-
-Andreas
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA03846
- for
; Thu, 15 Jun 2000 08:30:58 -0400 (EDT)
-Received: from thelab.hub.org (nat193.152.mpoweredpc.net [142.177.193.152]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id IAA14167 for
; Thu, 15 Jun 2000 08:16:58 -0400 (EDT)
-Received: from localhost (scrappy@localhost)
- by thelab.hub.org (8.9.3/8.9.3) with ESMTP id JAA74856;
- Thu, 15 Jun 2000 09:14:29 -0300 (ADT)
-X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
-Date: Thu, 15 Jun 2000 09:14:29 -0300 (ADT)
-From: The Hermit Hacker
-cc: Tom Lane , Jan Wieck ,
-Subject: Re: [HACKERS] Big 7.1 open items
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=US-ASCII
-Status: RO
-
-On Wed, 14 Jun 2000, Bruce Momjian wrote:
-
-> > Backtraces from *what*, exactly? 99% of the backend is still going
-> > to be dealing with the same data as ever. It might be that poking
-> > around in fd.c will be a little harder, but considering that fd.c
-> > doesn't really know or care what the files it's manipulating are
-> > anyway, I'm not convinced that this is a real issue.
->
-> I was just throwing gdb out as an example. The bigger ones are ls,
-> lsof/fstat, and tar.
-
-You've lost me on this one ... if someone does an lsof of the process, it
-will still provide them a list of open files ... are you complaining about
-the extra step required to translate the file name to a "valid table"?
-
-Oh, one point here ... this whole 'filenaming issue' ... as far as ls is
-concerned, at least, only affects the superuser, since he's the only one
-that can go 'ls'ng around in the directories ...
-
-And, ummm, how hard would it be to have \d in psql display the "physical
-table name" as part of its output?
-
-Slight tangent here:
-
-One thing that I think would be great if we could add is some sort of:
-
-SELECT db_name, disk_space;
-
-query where a database owner, not the superuser, could see how much disk
-space their tables are using up ... possible?
-
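Editorial sketch of the disk-space report asked about above, assuming OID-only file names and a name-to-OID mapping obtained elsewhere (the function `table_disk_usage` and the one-file-per-table layout are assumptions for illustration; multi-segment tables and indexes are ignored).

```python
import os


def table_disk_usage(datadir, name_to_oid):
    """Map each relation name to the size in bytes of its OID-named file."""
    usage = {}
    for relname, oid in name_to_oid.items():
        path = os.path.join(datadir, str(oid))
        # A missing file is reported as zero rather than raising.
        usage[relname] = os.path.getsize(path) if os.path.exists(path) else 0
    return usage
```

A report like this is the kind of admin tool that would replace 'ls -l' once file names no longer carry the table name.
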
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA03842
- for
; Thu, 15 Jun 2000 08:30:54 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id IAA15241 for
; Thu, 15 Jun 2000 08:31:29 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5FCSM877572;
- Thu, 15 Jun 2000 08:28:22 -0400 (EDT)
-Received: from zrtps06s.us.nortel.com ([47.140.48.50])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5FCRS877255
- for
; Thu, 15 Jun 2000 08:27:28 -0400 (EDT)
-Received: from ertpg15e1.nortelnetworks.com (actually zrtph06n.us.nortel.com)
- by zrtps06s.us.nortel.com; Thu, 15 Jun 2000 08:26:34 -0400
-Received: from zrtpd004.us.nortel.com (actually zrtpd004)
- by ertpg15e1.nortelnetworks.com; Thu, 15 Jun 2000 08:26:11 -0400
-Received: from zrtpd003.us.nortel.com ([47.140.224.137])
- by zrtpd004.us.nortel.com
- with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21)
- id MPQCZWMM; Thu, 15 Jun 2000 08:26:10 -0400
-Received: from americasm01.nt.com (hrtpp28d.us.nortel.com [47.190.110.250])
- by zrtpd003.us.nortel.com
- with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21)
- id L1N0XG78; Thu, 15 Jun 2000 08:26:12 -0400
-Date: Thu, 15 Jun 2000 08:28:12 -0400
-X-Sybari-Space: 00000000 00000000 00000000
-From: "Mark Hollomon"
-Reply-To: "Mark Hollomon"
-Organization: Nortel Networks
-X-Mailer: Mozilla 4.04 [en] (Win95; U)
-MIME-Version: 1.0
-To: "Ross J. Reedstrom"
-CC: PostgreSQL-development
-Subject: Re: [HACKERS] Big 7.1 open items
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-X-Orig:
-Precedence: bulk
-Status: RO
-
-Ross J. Reedstrom wrote:
->
-> Any strong objections to the mixed relname_oid solution? It gets us
-> everything oids does, and still lets Bruce use 'ls -l' to find the big
-> tables, putting off writing any admin tools that'll need to be rewritten,
-> anyway.
-
-I would object to the mixed name.
-
-Consider:
-
-CREATE TABLE FOO ....
-ALTER TABLE FOO RENAME FOO_OLD;
-CREATE TABLE FOO ....
-
-For the same atomicity reason, rename can't change the
-name of the files. So, which foo_<oid> is the FOO_OLD
-and which is FOO?
-
-In other words, in the presence of rename, putting
-relname in the filename is misleading at best.
-
---
-
-Mark Hollomon
-ESN 451-9008 (302)454-9008
-
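Editorial sketch of the ambiguity in the example above, assuming the file keeps the relname it was created with plus its OID (the OIDs 1001 and 1002 and the function `simulate` are hypothetical).

```python
def simulate():
    """Replay CREATE FOO, RENAME FOO -> FOO_OLD, CREATE FOO.

    Returns a dict mapping each current table name to its file name.
    """
    files = {}     # current table name -> file name on disk
    next_oid = 1001

    def create(name):
        nonlocal next_oid
        # The file name is fixed at creation: lower-cased relname + OID.
        files[name] = f"{name.lower()}_{next_oid}"
        next_oid += 1

    def rename(old, new):
        # Rename is catalog-only; the file keeps its original prefix.
        files[new] = files.pop(old)

    create("FOO")
    rename("FOO", "FOO_OLD")
    create("FOO")
    return files


if __name__ == "__main__":
    for table, filename in sorted(simulate().items()):
        print(table, "->", filename)
```

Both files end up with the prefix `foo_`, so the name alone cannot tell an administrator which one belongs to FOO_OLD; only the catalog can.
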
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA03837
- for
; Thu, 15 Jun 2000 08:30:45 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5FCTb883200;
- Thu, 15 Jun 2000 08:29:37 -0400 (EDT)
-Received: from smtp1.andrew.cmu.edu (SMTP1.ANDREW.CMU.EDU [128.2.10.81])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5FCT7881265
- for
; Thu, 15 Jun 2000 08:29:07 -0400 (EDT)
-Received: from export.andrew.cmu.edu (EXPORT.ANDREW.CMU.EDU [128.2.23.2])
- by smtp1.andrew.cmu.edu (8.9.3/8.9.3) with ESMTP id IAA02782
- for
; Thu, 15 Jun 2000 08:29:02 -0400 (EDT)
-Date: Thu, 15 Jun 2000 08:29:02 -0400 (EDT)
-From: Brian E Gallew
-X-Mailer: BatIMail version 3.2
-To: "PostgreSQL-development"
-Subject: Re: [HACKERS] Big 7.1 open items
-Mime-Version: 1.0 (generated by tm-edit 7.106)
-Content-Type: multipart/signed; protocol="application/pgp-signature";
- boundary="pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1"; micalg=pgp-md5
-Content-Transfer-Encoding: 7bit
-Precedence: bulk
-Status: RO
-
-
---pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1
-Content-Type: text/plain; charset=US-ASCII
-
-Then spoke up and said:
-> Precedence: bulk
->
-> > But seriously, let me give some background. I used Ingres, that used
-> > the VMS file system, but used strange sequential AAAF324 numbers for
-> > tables. When someone deleted a table, or we were looking at what tables
-> > were using disk space, it was impossible to find the Ingres table names
-> > that went with the file. There was a system table that showed it, but
-> > it was poorly documented, and if you deleted the table, there was no way
-> > to look on the tape to find out which file to restore.
->
-> Fair enough, but it seems to me that the answer is to expend some effort
-> on system admin support tools. We could do a lot in that line with less
-> effort than trying to make a fundamentally mismatched filesystem
-> representation do what we need.
-
-We've been an Ingres shop as long as there's been an Ingres. While
-we've also had the problem Bruce noticed with table names, we've
-*also* used the trivial fix of running a (simple) Report Writer job
-each night, immediately before the backup, that lists all of the
-database tables/indices and the underlying files.
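
A nightly job of this kind might look roughly like the sketch below. It is only an illustration in Python, not the Report Writer job mentioned above: the `write_file_map` helper and the sample catalog contents are hypothetical, and a real job would read the name-to-file mapping from the system tables.

```python
import io


def write_file_map(tables, out):
    """Write one 'logical name<TAB>file name' line per relation, sorted by name."""
    for relname, filename in sorted(tables.items()):
        out.write(f"{relname}\t{filename}\n")


# Hypothetical catalog contents; a real job would query the system tables.
tables = {"orders": "AAAF324", "customers": "AAAF101"}

report = io.StringIO()
write_file_map(tables, report)
print(report.getvalue(), end="")
```

Writing the report immediately before the backup means the listing and the backed-up files describe the same state.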
-
-True, if someone drops/recreates a table twice between backups we
-can't find the intermediate file name, but since we also haven't
-backed up that filename, this isn't an issue.
-
-Also, the consistency issue is really not as important as you would
-think. If you are restoring a table, you want the information in it,
-whether or not it's consistent with anything else. I've done hundreds
-of table restores (can you say "modify table to heap"?) and never once
-has inconsistency been an issue. Oh, yeah, and we don't shut the
-database down for this, either. (That last isn't my choice, BTW.)
-
---
-=====================================================================
-| JAVA must have been developed in the wilds of West Virginia. |
-| After all, why else would it support only single inheritance?? |
-=====================================================================
-=====================================================================
-
---pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1
-Content-Type: application/pgp-signature
-Content-Transfer-Encoding: 7bit
-
------BEGIN PGP MESSAGE-----
-Version: 2.6.2
-Comment: Processed by Mailcrypt 3.3, an Emacs/PGP interface
-
-iQBVAwUBOUjMDYdzVnzma+gdAQHUowH+JglNasUWT5RKSnF3pzNdy5nyrGmLhbWa
-Oom1oUqToxcyfjVFL34dXpnIlvNHO0K2Di4NKZ9HykwOHzrnExf15w==
-=yXoe
------END PGP MESSAGE-----
-
---pgp-sign-Multipart_Thu_Jun_15_08:29:00_2000-1--
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA04418
- for
; Thu, 15 Jun 2000 09:31:04 -0400 (EDT)
-Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id JAA20080 for
; Thu, 15 Jun 2000 09:22:36 -0400 (EDT)
-Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
- by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id GAA05755;
- Thu, 15 Jun 2000 06:21:54 -0700 (PDT)
-X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
-Date: Thu, 15 Jun 2000 05:40:49 -0700
-To: Zeugswetter Andreas SB ,
- Bruce Momjian
, Tom Lane
-From: Don Baccus
-Subject: Re: AW: [HACKERS] Big 7.1 open items
-Cc: Jan Wieck
, Oliver Elphick ,
- 188.sd.spardat.at>
-Mime-Version: 1.0
-Content-Type: text/plain; charset="us-ascii"
-Status: RO
-
-At 10:04 AM 6/15/00 +0200, Zeugswetter Andreas SB wrote:
->
->> In reality, very few people are going to be interested in restoring
->> a table in a way that breaks referential integrity and other
->> normal assumptions about what exists in the database.
->
->This is not true. In my DBA history it would have saved me man-weeks
->of work if an easy and efficient restore of one single table from backup
->had been available in Informix and Oracle.
->We always had to restore most of the whole system to another machine only
->to get back at some table info that would then be manually re-added
->to the production system.
-
-I'm missing something, I guess. You would do a createdb, do a filesystem
-copy of pg_log and one file into it, and then read data from the table
- without having to restore the other tables in the database?
-
-I'm just curious - when was the last time you restored a Postgres
-database in this piecemeal manner, and how often do you do it?
-
-
-
-- Don Baccus, Portland OR
- Nature photos, on-line guides, Pacific Northwest
- Rare Bird Alert Service and other goodies at
- http://donb.photo.net.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA04607
- for
; Thu, 15 Jun 2000 14:46:21 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA12695 for
; Thu, 15 Jun 2000 12:48:58 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5FGjXI40370;
- Thu, 15 Jun 2000 12:45:33 -0400 (EDT)
-Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5FGjJI39359
- for
; Thu, 15 Jun 2000 12:45:20 -0400 (EDT)
-Received: by rice.edu
- via sendmail from stdin
- id (Debian Smail3.2.0.102)
-Date: Thu, 15 Jun 2000 11:45:19 -0500
-From: "Ross J. Reedstrom"
-To: Tom Lane
-Cc: PostgreSQL-development
-Subject: Re: [HACKERS] Big 7.1 open items
-Mail-Followup-To: Tom Lane ,
-Mime-Version: 1.0
-Content-Type: text/plain; charset=us-ascii
-User-Agent: Mutt/1.0i
-Precedence: bulk
-Status: ROr
-
-On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote:
-> "Ross J. Reedstrom" writes:
-> > Any strong objections to the mixed relname_oid solution?
->
-> Yes!
->
-> You cannot make it work reliably unless the relname part is the original
-> relname and does not track ALTER TABLE RENAME. IMHO having an obsolete
-> relname in the filename is worse than not having the relname at all;
-> it's a recipe for confusion, it means you still need admin tools to tell
-> which end is really up, and what's worst is you might think you don't.
-
-The plan here was to let VACUUM handle renaming the file, since it
-will already have all the necessary locks. This shortens the window
-of confusion. ALTER TABLE RENAME doesn't happen that often, really -
-the relname is there just for human consumption, then.
-
->
-> Furthermore it requires an additional column in pg_class to keep track
-> of the original relname, which is a waste of space and effort.
->
-
-I actually started down this path thinking about implementing SCHEMA:
-since tables in the same DB but in different schemas can have the same
-relname, I figured I needed to change that. We'll need something in
-pg_class to keep track of what schema a relation is in, instead.
-
-> It also creates a portability risk, or at least fails to remove one,
-> since you are critically dependent on the assumption that the OS
-> supports long filenames --- on a filesystem that truncates names to less
-> than about 45 characters you're in very deep trouble. An OID-only
-> approach still works on traditional 14-char-filename Unix filesystems
-> (it'd mostly even work on DOS 8+3, though I doubt we care about that).
-
-Actually, no. Since I store the filename in a name attribute, I used this
-nifty function somebody wrote, makeObjectName, to trim the relname part,
-but leave the oid. (Yes, I know it's yours ;-)
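
The trimming idea can be sketched as follows. This is not the actual makeObjectName code; `physical_name` and the 31-character `NAME_LIMIT` are assumptions chosen for illustration. The point is only that the relname part absorbs the truncation while the OID suffix is kept whole.

```python
NAME_LIMIT = 31  # assumed maximum length of a name value; adjust as needed


def physical_name(relname, oid, limit=NAME_LIMIT):
    """Return '<relname>_<oid>', trimming relname so the result fits in limit."""
    suffix = f"_{oid}"
    room = limit - len(suffix)
    if room < 1:
        raise ValueError("limit too small to hold the OID suffix")
    return relname[:room] + suffix


print(physical_name("foo", 1234))
print(physical_name("a_very_long_relation_name_that_overflows", 1234))
```

Because the OID is never cut, two long names that share a prefix still map to distinct file names.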
-
->
-> Finally, one of the reasons I want to go to filenames based only on OID
-> is that that'll make life easier for mdblindwrt. Original relname + OID
-> doesn't help, in fact it makes life harder (more shmem space needed to
-> keep track of the filename for each buffer).
-
-Can you explain in more detail how this helps? Not by letting the bufmgr
-know that oid == filename, I hope. We need to improve the abstraction
-of the smgr, not add another violation. Ah, sorry, mdblindwrt _is_
-in the smgr.
-
-Hmm, grovelling through that code, I see how it could be simpler if reloid
-== filename. Heck, we even get to save shmem in the buffdesc.blind part,
-since we only need the dbname in there, now.
-
-Hmm, I see I missed the relpath_blind() in my patch - oops. (relpath()
-is always called with RelationGetPhysicalRelationName(), and that's
-where I was putting in the relphysname)
-
-Hmm, what's all this with functions in catalog.c that are only called by
-smgr/md.c? seems to me that anything having to do with physical storage
-(like the path!) belongs in the smgr abstraction.
-
->
-> Can we *PLEASE JUST LET GO* of this bad idea? No relname in the
-> filename. Period.
->
-
-Gee, so dogmatic. No one besides Bruce and Hiroshi discussed this _at
-all_ when I first put up patches two months ago. O.K., I'll do the oids
-only version (and fix up relpath_blind)
-
-Ross
-
---
-Ross J. Reedstrom, Ph.D.,
-NSBRI Research Scientist/Programmer
-Computer and Information Technology Institute
-Rice University, 6100 S. Main St., Houston, TX 77005
-
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA27548
- for
; Thu, 15 Jun 2000 17:45:37 -0400 (EDT)
-Received: from mcadnote1 (ppm122.noc.fukui.nsk.ne.jp [210.161.188.41])
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id GAA07248; Fri, 16 Jun 2000 06:45:30 +0900
-From: "Hiroshi Inoue"
- "Ross J. Reedstrom"
-Cc: "Tom Lane" ,
- "PostgreSQL-development"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Fri, 16 Jun 2000 06:48:21 +0900
-Message-ID:
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="us-ascii"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
-Importance: Normal
-Status: ROr
-
-> -----Original Message-----
->
-> > > Can we *PLEASE JUST LET GO* of this bad idea? No relname in the
-> > > filename. Period.
-> > >
-> >
-> > Gee, so dogmatic. No one besides Bruce and Hiroshi discussed this _at
-> > all_ when I first put up patches two months ago. O.K., I'll do the oids
-> > only version (and fix up relpath_blind)
->
-> Hold on. I don't think we want that work done yet. Seems even Tom is
-> thinking that if Vadim is going to re-do everything later anyway, we may
-> be better with a relname/oid solution that does require additional
-> administration apps.
->
-
-Hmm, why does the naming rule come first?
-
-I've never emphasized the naming rule, except that names should be unique.
-My main point has been to reduce the need for a naming rule as much
-as possible. IIRC, by keeping the storage location in pg_class, Ross's
-trial patch leaves only 2 places where a naming rule is required.
-So wouldn't we be free of the naming rule (it would not be so difficult
-to change the naming rule if it is found to be bad)?
-
-I've also mentioned many times that neither relname nor oid is sufficient
-for uniqueness. In addition, neither relname nor oid is
-necessary for uniqueness.
-IMHO, it's bad to rely on an item which is neither necessary nor
-sufficient.
-I proposed relname+unique_id naming once. The unique_id is
-independent of the oid. The relname is only a convenience for the
-DBA, so we don't have to change it on RENAME.
-The db's consistency is much more important than the DBA's
-satisfaction.
-
-Comments ?
-
-Regards.
-
-Hiroshi Inoue
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA00764
- for
; Thu, 15 Jun 2000 19:01:02 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id SAA17328 for
; Thu, 15 Jun 2000 18:57:32 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5FMsMI97744;
- Thu, 15 Jun 2000 18:54:22 -0400 (EDT)
-Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5FMs0I94252
- for
; Thu, 15 Jun 2000 18:54:00 -0400 (EDT)
-Received: by rice.edu
- via sendmail from stdin
- id (Debian Smail3.2.0.102)
-Date: Thu, 15 Jun 2000 17:53:59 -0500
-From: "Ross J. Reedstrom"
-To: PostgreSQL-development
-Subject: Re: [HACKERS] Big 7.1 open items
-Mail-Followup-To: PostgreSQL-development
-Mime-Version: 1.0
-Content-Type: text/plain; charset=us-ascii
-User-Agent: Mutt/1.0i
-Precedence: bulk
-Status: RO
-
-On Thu, Jun 15, 2000 at 05:48:59PM -0400, Bruce Momjian wrote:
-> > I've also mentioned many times that neither relname nor oid is sufficient
-> > for uniqueness. In addition, neither relname nor oid is
-> > necessary for uniqueness.
-> > IMHO, it's bad to rely on an item which is neither necessary nor
-> > sufficient.
-> > I proposed relname+unique_id naming once. The unique_id is
-> > independent of the oid. The relname is only a convenience for the
-> > DBA, so we don't have to change it on RENAME.
-> > The db's consistency is much more important than the DBA's
-> > satisfaction.
-> >
-> > Comments ?
->
-> I am happy not to rename the file on 'RENAME', but seems no one likes
-> that.
-
-Good, 'cause that's how I've implemented it so far. Actually, all
-I've done is port my previous patch to current, with one little
-change: I added a macro RelationGetRealRelationName which does what
-RelationGetPhysicalRelationName used to do: i.e. return the relname with
-no temptable funny business, and used that for the relcache macros. It
-passes all the serial regression tests: I haven't run the parallel tests
-yet. ALTER TABLE RENAME rolls back nicely. I'll need to learn some more
-about xacts to get DROP TABLE rolling back.
-
-I'll drop it on PATCHES right now, for comment.
-
-Ross
---
-Ross J. Reedstrom, Ph.D.,
-NSBRI Research Scientist/Programmer
-Computer and Information Technology Institute
-Rice University, 6100 S. Main St., Houston, TX 77005
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA01651
- for
; Thu, 15 Jun 2000 20:00:59 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA20985 for
; Thu, 15 Jun 2000 19:57:49 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5FNsgI25402;
- Thu, 15 Jun 2000 19:54:42 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5FNsCI22412
- for
; Thu, 15 Jun 2000 19:54:12 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA02263;
- Thu, 15 Jun 2000 19:53:52 -0400 (EDT)
-To: "Ross J. Reedstrom"
-cc: PostgreSQL-development
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to "Ross J. Reedstrom"
- message dated "Thu, 15 Jun 2000 11:45:19 -0500"
-Date: Thu, 15 Jun 2000 19:53:52 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-"Ross J. Reedstrom" writes:
-> On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote:
->> "Ross J. Reedstrom" writes:
->>>> Any strong objections to the mixed relname_oid solution?
->>
->> Yes!
-
-> The plan here was to let VACUUM handle renaming the file, since it
-> will already have all the necessary locks. This shortens the window
-> of confusion. ALTER TABLE RENAME doesn't happen that often, really -
-> the relname is there just for human consumption, then.
-
-Yeah, I've seen tons of discussion of how if we do this, that, and
-the other thing, and be prepared to fix up some other things in case
-of crash recovery, we can make it work with filename == relname + OID
-(where relname tracks logical name, at least at some remove).
-
-Probably. Assuming nobody forgets anything.
-
-I'm just trying to point out that that's a huge amount of pretty
-delicate mechanism. The amount of work required to make it trustworthy
-looks to me to dwarf the admin tools that Bruce is complaining about.
-And we only have a few people competent to do the work. (With all
-due respect, Ross, if you weren't already aware of the implications
-for mdblindwrt, I have to wonder what else you missed.)
-
-Filename == OID is so simple, reliable, and straightforward by
-comparison that I think the decision is a no-brainer.
-
-If we could afford to sink unlimited time into this one issue then
-it might make sense to do it the hard way, but we have enough
-important stuff on our TODO list to keep us all busy for years ---
-I cannot believe that it's an effective use of our time to do this.
-
-
-> Hmm, what's all this with functions in catalog.c that are only called by
-> smgr/md.c? seems to me that anything having to do with physical storage
-> (like the path!) belongs in the smgr abstraction.
-
-Yeah, there's a bunch of stuff that should have been implemented by
-adding new smgr entry points, but wasn't. It should be pushed down.
-(I can't resist pointing out that one of those things is physical
-relation rename, which will go away and not *need* to be pushed down
-if we do it the way I want.)
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA01647
- for
; Thu, 15 Jun 2000 20:00:58 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA21034 for
; Thu, 15 Jun 2000 19:58:30 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA02283;
- Thu, 15 Jun 2000 19:57:05 -0400 (EDT)
-cc: "Ross J. Reedstrom" ,
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Thu, 15 Jun 2000 15:35:45 -0400"
-Date: Thu, 15 Jun 2000 19:57:05 -0400
-From: Tom Lane
-Status: RO
-
->> Gee, so dogmatic. No one besides Bruce and Hiroshi discussed this _at
->> all_ when I first put up patches two months ago. O.K., I'll do the oids
->> only version (and fix up relpath_blind)
-
-> Hold on. I don't think we want that work done yet. Seems even Tom is
-> thinking that if Vadim is going to re-do everything later anyway, we may
-> be better with a relname/oid solution that does require additional
-> administration apps.
-
-Don't put words in my mouth, please. If we are going to throw the
-work away later, it'd be foolish to do the much greater amount of
-work needed to make filename=relname+OID fly than is needed for
-filename=OID.
-
-However, I'm pretty sure I recall Vadim stating that he thought
-filename=OID would be required for his smgr changes anyway...
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA02731
- for
; Thu, 15 Jun 2000 21:01:01 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id UAA23469 for
; Thu, 15 Jun 2000 20:36:36 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5G0WDI97134;
- Thu, 15 Jun 2000 20:32:13 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5G0VsI97003
- for
; Thu, 15 Jun 2000 20:31:54 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id JAA07328; Fri, 16 Jun 2000 09:26:04 +0900
-From: "Hiroshi Inoue"
-To: "Bruce Momjian" ,
- "Tom Lane"
-Cc: "PostgreSQL-development"
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Fri, 16 Jun 2000 09:28:14 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-2022-jp"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Importance: Normal
-Precedence: bulk
-Status: RO
-
-> -----Original Message-----
-> Behalf Of Tom Lane
->
-> "Ross J. Reedstrom" writes:
-> > On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote:
-> >> "Ross J. Reedstrom" writes:
-> >>>> Any strong objections to the mixed relname_oid solution?
-> >>
-> >> Yes!
->
-> > The plan here was to let VACUUM handle renaming the file, since it
-> > will already have all the necessary locks. This shortens the window
-> > of confusion. ALTER TABLE RENAME doesn't happen that often, really -
-> > the relname is there just for human consumption, then.
->
-> Yeah, I've seen tons of discussion of how if we do this, that, and
-> the other thing, and be prepared to fix up some other things in case
-> of crash recovery, we can make it work with filename == relname + OID
-> (where relname tracks logical name, at least at some remove).
->
-
-I've seen little discussion of how to avoid relying on a naming rule.
-I've proposed many times that we should keep the information about
-where a table is stored in the database itself. I've never seen
-clear objections to it, so can I take it that my proposal is OK?
-Isn't that much more important than the naming rule? With such a
-mechanism, we could easily replace a bad naming rule.
-And I believe that Ross's work is mostly about the mechanism,
-not the naming rule.
-
-Now I like neither relname nor oid because it's not sufficient
-for my purpose.
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA03637
- for ; Thu, 15 Jun 2000 22:01:01 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id VAA28521 for ; Thu, 15 Jun 2000 21:58:46 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id VAA02730;
- Thu, 15 Jun 2000 21:57:27 -0400 (EDT)
-To: "Hiroshi Inoue"
-cc: "Bruce Momjian" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to "Hiroshi Inoue"
- message dated "Fri, 16 Jun 2000 09:28:14 +0900"
-Date: Thu, 15 Jun 2000 21:57:27 -0400
-From: Tom Lane
-Status: ROr
-
-"Hiroshi Inoue" writes:
-> Now I like neither relname nor oid because it's not sufficient
-> for my purpose.
-
-We should probably not do much of anything with this issue until
-we have a clearer understanding of what we want to do about
-tablespaces and schemas.
-
-My gut feeling is that we will end up with pathnames that look
-something like
-
-.../data/base/DBNAME/TABLESPACE/OIDOFRELATION
-
-(with .N attached if a segment of a large relation, of course).
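
A minimal sketch of how such a pathname could be assembled, in Python for brevity. The `relation_path` helper, its parameters, and the sample values are hypothetical; the sketch also assumes that the first segment carries no suffix and that later segments get `.1`, `.2`, and so on.

```python
import posixpath


def relation_path(data_dir, dbname, tablespace, rel_oid, segment=0):
    """Build .../base/DBNAME/TABLESPACE/OID, adding '.N' for segments after the first."""
    filename = str(rel_oid)
    if segment > 0:
        filename += f".{segment}"
    return posixpath.join(data_dir, "base", dbname, tablespace, filename)


print(relation_path("/usr/local/pgsql/data", "mydb", "ts1", 16384))
print(relation_path("/usr/local/pgsql/data", "mydb", "ts1", 16384, segment=2))
```

Nothing in the path depends on the relation's logical name, so a rename never has to touch the filesystem.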
-
-The TABLESPACE "name" should likely be an OID itself, but it wouldn't
-have to be if you are willing to say that tablespaces aren't renamable.
-(Come to think of it, does anyone care about being able to rename
-databases? ;-)) Note that the TABLESPACE will often be a symlink
-to storage on another drive, rather than a plain subdirectory of the
-DBNAME, but that shouldn't be an issue at this level of discussion.
-
-I think that schemas probably don't enter into this. We should instead
-rely on the uniqueness of OIDs to prevent filename collisions. However,
-OIDs aren't really unique: different databases in an installation will
-use the same OIDs for their system tables. My feeling is that we can
-live with a restriction like "you can't store the system tables of
-different databases in the same tablespace". Alternatively we could
-avoid that issue by inverting the pathname order:
-
-.../data/base/TABLESPACE/DBNAME/OIDOFRELATION
-
-Note that in any case, system tables will have to live in a
-predetermined tablespace, since you can't very well look in pg_class
-to find out which tablespace pg_class lives in. Perhaps we should
-just reserve a tablespace per database for system tables and forget
-the whole issue. If we do that, there's not really any need for
-the database in the path! Just
-
-.../data/base/TABLESPACE/OIDOFRELATION
-
-would do fine and help reduce lookup overhead.
-
-BTW, schemas do make things interesting for the other camp:
-is it possible for the same table to be referenced by different
-names in different schemas? If so, just how useful is it to pick
-one of those names arbitrarily for the filename? This is an advanced
-version of the main objection to using the original relname and not
-updating it at RENAME TABLE --- sooner or later, the filenames are
-going to be more confusing than helpful.
-
-Comments? Have I missed something important about schemas?
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA04586
- for
; Thu, 15 Jun 2000 22:27:44 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5G2POI23418;
- Thu, 15 Jun 2000 22:25:24 -0400 (EDT)
- by hub.org (8.10.1/8.10.1) with ESMTP id e5G2P3I23299
- for
; Thu, 15 Jun 2000 22:25:04 -0400 (EDT)
-Received: (from pgman@localhost)
- by candle.pha.pa.us (8.9.0/8.9.0) id WAA04345;
- Thu, 15 Jun 2000 22:24:53 -0400 (EDT)
-Subject: Re: [HACKERS] Big 7.1 open items
-To: Tom Lane
-Date: Thu, 15 Jun 2000 22:24:52 -0400 (EDT)
-CC: Hiroshi Inoue , Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-X-Mailer: ELM [version 2.4ME+ PL77 (25)]
-MIME-Version: 1.0
-Content-Transfer-Encoding: 7bit
-Content-Type: text/plain; charset=US-ASCII
-Precedence: bulk
-Status: RO
-
-> "Hiroshi Inoue" writes:
-> > Now I like neither relname nor oid because it's not sufficient
-> > for my purpose.
->
-> We should probably not do much of anything with this issue until
-> we have a clearer understanding of what we want to do about
-> tablespaces and schemas.
-
-Here is an analysis of our options:
-
-                        Work required           Disadvantages
-----------------------------------------------------------------------------
-
-Keep current system     no work                 rename/create no rollback
-
-relname/oid but         less work               new pg_class column,
-no rename change                                filename not accurate on
-                                                rename
-
-relname/oid with        more work               complex code
-rename change during
-vacuum
-
-oid filename            less work, but          confusing to admins
-                        need admin tools
-
---
- Bruce Momjian | http://www.op.net/~candle
- + If your life is a hard drive, | 830 Blythe Avenue
- + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
-
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA05230
- for ; Thu, 15 Jun 2000 22:41:48 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id LAA07495; Fri, 16 Jun 2000 11:41:43 +0900
-From: "Hiroshi Inoue"
-To: "Tom Lane"
-Cc: "Bruce Momjian" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Fri, 16 Jun 2000 11:43:52 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-2022-jp"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Status: RO
-
-Sorry for my previous mail. It was posted by my mistake.
-
-> -----Original Message-----
->
-> "Hiroshi Inoue" writes:
-> > Now I like neither relname nor oid because it's not sufficient
-> > for my purpose.
->
-> We should probably not do much of anything with this issue until
-> we have a clearer understanding of what we want to do about
-> tablespaces and schemas.
->
-> My gut feeling is that we will end up with pathnames that look
-> something like
->
-> .../data/base/DBNAME/TABLESPACE/OIDOFRELATION
->
-
-Schema is a logical concept and irrelevant to physical location.
-I strongly object to your suggestion unless the above means the
-*default* location.
-A tablespace is an encapsulation of table allocation, and its
-name should basically be irrelevant to the location. So the above
-seems very bad to me.
-
-Anyway, I don't see any advantage in a fixed-mapping
-implementation. After the rework, we should at least have the possibility
-of placing a specific table in an arbitrary separate directory.
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA06634;
- Thu, 15 Jun 2000 23:30:59 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA03227; Thu, 15 Jun 2000 23:18:54 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id MAA07544; Fri, 16 Jun 2000 12:18:06 +0900
-From: "Hiroshi Inoue"
-To: "Bruce Momjian"
, "Tom Lane"
-Cc: "Bruce Momjian" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Fri, 16 Jun 2000 12:20:16 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Status: RO
-
-> -----Original Message-----
->
-> > "Hiroshi Inoue" writes:
-> > > Now I like neither relname nor oid because it's not sufficient
-> > > for my purpose.
-> >
-> > We should probably not do much of anything with this issue until
-> > we have a clearer understanding of what we want to do about
-> > tablespaces and schemas.
->
-> Here is an analysis of our options:
->
->                         Work required           Disadvantages
-> ----------------------------------------------------------------------------
->
-> Keep current system     no work                 rename/create no rollback
->
-> relname/oid but         less work               new pg_class column,
-> no rename change                                filename not accurate on
->                                                 rename
->
-> relname/oid with        more work               complex code
-> rename change during
-> vacuum
->
-> oid filename            less work, but          confusing to admins
->                         need admin tools
->
-
-Please add my opinion for naming rule.
-
-relname/unique_id but   need some work          new pg_class column,
-no relname change.      for unique-id           filename not relname
-                        generation
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA06924
- for
; Fri, 16 Jun 2000 00:01:00 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA05470 for
; Thu, 15 Jun 2000 23:59:46 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5G3uaI10809;
- Thu, 15 Jun 2000 23:56:36 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5G3uKI10702
- for
; Thu, 15 Jun 2000 23:56:21 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id MAA07571; Fri, 16 Jun 2000 12:55:33 +0900
-From: "Hiroshi Inoue"
-To: "Tom Lane"
-Cc: "PostgreSQL-development"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Fri, 16 Jun 2000 12:57:44 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-2022-jp"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Precedence: bulk
-Status: RO
-
-> -----Original Message-----
->
-> "Hiroshi Inoue" writes:
-> > Please add my opinion for naming rule.
->
-> > relname/unique_id but   need some work          new pg_class column,
-> > no relname change.      for unique-id           filename not relname
-> >                         generation
->
-> Why is a unique ID better than --- or even different from ---
-> using the relation's OID? It seems pointless to me...
->
-
-For example, to implement the CLUSTER command we would need
-another new file for the target relation to hold the sorted
-rows, but we would not want to change the OID, would we?
-The same would be needed for table reconstruction in general.
-If I remember correctly, you once proposed OID+version
-naming for such cases.
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA08093
- for ; Fri, 16 Jun 2000 02:00:59 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA10174 for ; Fri, 16 Jun 2000 01:34:44 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id OAA07656; Fri, 16 Jun 2000 14:33:12 +0900
-From: "Hiroshi Inoue"
-To: "Tom Lane"
-Cc: "Bruce Momjian" ,
- "PostgreSQL-development" ,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Fri, 16 Jun 2000 14:35:21 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Status: RO
-
-> -----Original Message-----
->
-> "Hiroshi Inoue" writes:
-> > A tablespace is an encapsulation of table allocation, and its
-> > name should basically be irrelevant to the location. So the above
-> > seems very bad to me.
-> > Anyway, I don't see any advantage in a fixed-mapping
-> > implementation. After the renewal, we should at least have the
-> > possibility of allocating a specific table in an arbitrary
-> > separate directory.
->
-> Call a "directory" a "tablespace" and we're on the same page,
-> aren't we? Actually I'd envision some kind of admin command
-> "CREATE TABLESPACE foo AS /path/to/wherever".
-
-Yes, I think 'tablespace -> directory' is the most natural
-extension under the current file_per_table storage manager.
-If a many_tables_in_a_file storage manager is introduced, we
-may be able to change the definition of TABLESPACE
-to 'tablespace -> files', like Oracle.
-
-> That would make
-> appropriate system catalog entries and also create a symlink
-> from ".../data/base/foo" (or some such place) to the target
-> directory.
-> Then when we make a table in that tablespace,
-> it's in the right place. Problem solved, no?
->
-
-I don't like symlinks for dbms data files. However, it may
-be OK if symlinks are limited to the 'tablespace->directory'
-correspondence and all tablespaces (including the default
-one etc.) are symlinks. That is simple, and all debugging would
-be done under a tablespace_is_symlink environment.
-
-> It gets a little trickier if you want to be able to split
-> multi-gig tables across several tablespaces, though, since
-> you couldn't just append ".N" to the base table path in that
-> scenario.
->
-
-This does not seem that easy to solve now.
-Ross doesn't change this naming rule for multi-gig
-tables in his trial either.
-
-> I'd be interested to know what sort of facilities Oracle
-> provides for managing huge tables...
->
-
-As far as I know from older Oracle versions, one TABLESPACE
-could have many DATAFILEs, each of which could contain
-many tables.
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA08109
- for ; Fri, 16 Jun 2000 02:01:02 -0400 (EDT)
-Received: from hub.org ([email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA11218 for ; Fri, 16 Jun 2000 01:57:33 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5G5tLI49492;
- Fri, 16 Jun 2000 01:55:21 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5G5tAI49395
- for ; Fri, 16 Jun 2000 01:55:10 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id BAA05749;
- Fri, 16 Jun 2000 01:54:46 -0400 (EDT)
-To: "Hiroshi Inoue"
-cc: "PostgreSQL-development"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to "Hiroshi Inoue"
- message dated "Fri, 16 Jun 2000 12:57:44 +0900"
-Date: Fri, 16 Jun 2000 01:54:46 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-"Hiroshi Inoue" writes:
->> Why is a unique ID better than --- or even different from ---
->> using the relation's OID? It seems pointless to me...
-
-> For example, to implement the CLUSTER command we would need
-> another new file for the target relation to hold the sorted
-> rows, but we would not want to change the OID, would we?
-> The same would be needed for table reconstruction in general.
-> If I remember correctly, you once proposed OID+version
-> naming for such cases.
-
-Hmm, so you are thinking that the pg_class row for the table would
-include this uniqueID, and then committing the pg_class update would
-be the atomic action that replaces the old table contents with the
-new? It does have some attraction now that I think about it.
-
-But there are other ways we could do the same thing. If we want to
-have tablespaces, there will need to be a tablespace identifier in
-each pg_class row. So we could do CLUSTER in the same way as we'd
-move a table from one tablespace to another: create the new files in
-the new tablespace directory, and the commit of the new pg_class row
-with the new tablespace value is the atomic action that makes the new
-files valid and the old files not.
-
-You will probably say "but I didn't want to move my table to a new
-tablespace just to cluster it!" I think we could live with that,
-though. A tablespace doesn't need to have any existence more concrete
-than a subdirectory, in my vision of the way things would work. We
-could do something like making two subdirectories of each place that
-the dbadmin designates as a "tablespace", so that we make two logical
-tablespaces out of what the dbadmin thinks of as one. Then we can
-ping-pong between those directories to do things like clustering "in
-place".
-
-Basically I want to keep the bottom-level mechanisms as simple and
-reliable as we possibly can. The fewer concepts are known down at
-the bottom, the better. If we can keep the pathname constituents
-to just "tablespace" and "relation OID" we'll be in great shape ---
-but each additional concept that has to be known down there is
-another potential problem.
-
- regards, tom lane
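
Editor's sketch of the "ping-pong" idea described in the message above, under stated assumptions: the two subdirectory names ("a" and "b"), the dictionary standing in for the pg_class row, and the function names are all hypothetical, and a plain text file stands in for a relation file. It only illustrates that the catalog update is the single step that switches readers from the old file to the new one.

```python
import os
import tempfile


def cluster_in_place(catalog, tablespace_dir, rel_oid, sort_key):
    """Rewrite a relation in sorted order in the *other* half of the
    tablespace, then flip the catalog entry to point at the new half."""
    old_half = catalog[rel_oid]                      # hypothetical: "a" or "b"
    new_half = "b" if old_half == "a" else "a"
    old_path = os.path.join(tablespace_dir, old_half, str(rel_oid))
    new_path = os.path.join(tablespace_dir, new_half, str(rel_oid))

    with open(old_path) as f:
        rows = f.read().splitlines()

    # Build the new file completely before anything refers to it.
    os.makedirs(os.path.dirname(new_path), exist_ok=True)
    with open(new_path, "w") as f:
        f.write("\n".join(sorted(rows, key=sort_key)) + "\n")

    # Stand-in for committing the updated pg_class row: from here on the
    # new file is the valid one and the old file is dead.
    catalog[rel_oid] = new_half
    os.remove(old_path)
    return new_path


def demo():
    """Cluster a three-row 'table' and report where it ended up."""
    with tempfile.TemporaryDirectory() as ts:
        os.makedirs(os.path.join(ts, "a"))
        with open(os.path.join(ts, "a", "18720"), "w") as f:
            f.write("cherry\napple\nbanana\n")
        catalog = {18720: "a"}
        new_path = cluster_in_place(catalog, ts, 18720, sort_key=str)
        with open(new_path) as f:
            rows = f.read().splitlines()
        old_exists = os.path.exists(os.path.join(ts, "a", "18720"))
        return catalog[18720], rows, old_exists
```

If the process dies before the catalog entry is flipped, the old file is untouched and the half-written new file is simply garbage to be cleaned up, which is the property the message is after.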
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA12816
- for ; Fri, 16 Jun 2000 03:31:04 -0400 (EDT)
-Received: from hub.org ([email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA14405 for ; Fri, 16 Jun 2000 03:03:38 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5G71YI83633;
- Fri, 16 Jun 2000 03:01:34 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5G713I82023
- for ; Fri, 16 Jun 2000 03:01:04 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id QAA07731; Fri, 16 Jun 2000 16:00:57 +0900
-From: "Hiroshi Inoue"
-To: "Tom Lane"
-Cc: "PostgreSQL-development"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Fri, 16 Jun 2000 16:03:06 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Precedence: bulk
-Status: RO
-
-> -----Original Message-----
->
-> "Hiroshi Inoue" writes:
-> >> Why is a unique ID better than --- or even different from ---
-> >> using the relation's OID? It seems pointless to me...
->
-> > For example, to implement the CLUSTER command we would need
-> > another new file for the target relation to hold the sorted
-> > rows, but we would not want to change the OID, would we?
-> > The same would be needed for table reconstruction in general.
-> > If I remember correctly, you once proposed OID+version
-> > naming for such cases.
->
-> Hmm, so you are thinking that the pg_class row for the table would
-> include this uniqueID,
-
-No, I just include the place where the table is stored (the pathname,
-under the current file_per_table storage manager) in the pg_class row,
-because I don't want to rely on the table allocation rule (currently the
-naming rule) to access existing relation files. This has always been my
-main point. A many_tables_in_a_file storage manager wouldn't be able to
-live without keeping this kind of information.
-This information (where it is stored) is different from tablespace
-(where to store) information. There was an idea to keep the information
-in an opaque entry in pg_class which only a specific storage manager
-could handle. There was an idea to have a new system table which
-keeps the information, and so on...
-
-> and then committing the pg_class update would
-> be the atomic action that replaces the old table contents with the
-> new? It does have some attraction now that I think about it.
->
-> But there are other ways we could do the same thing. If we want to
-> have tablespaces, there will need to be a tablespace identifier in
-> each pg_class row. So we could do CLUSTER in the same way as we'd
-> move a table from one tablespace to another: create the new files in
-> the new tablespace directory, and the commit of the new pg_class row
-> with the new tablespace value is the atomic action that makes the new
-> files valid and the old files not.
->
-> You will probably say "but I didn't want to move my table to a new
-> tablespace just to cluster it!"
-
-Yes.
-
-> I think we could live with that,
-> though. A tablespace doesn't need to have any existence more concrete
-> than a subdirectory, in my vision of the way things would work. We
-> could do something like making two subdirectories of each place that
-> the dbadmin designates as a "tablespace", so that we make two logical
-> tablespaces out of what the dbadmin thinks of as one.
-
-Certainly we could design TABLESPACE (where to store) as above.
-
-> Then we can
-> ping-pong between those directories to do things like clustering "in
-> place".
->
-
-But then we would probably have to keep track, in (e.g.) pg_class, of the
-directory the table was *ping-ponged* into. Is such an implementation
-cleaner or more extensible than mine (keeping the stored place exactly)?
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA13087
- for ; Fri, 16 Jun 2000 04:01:11 -0400 (EDT)
-Received: from hub.org ([email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA16002 for ; Fri, 16 Jun 2000 03:37:24 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5G7ZZI51521;
- Fri, 16 Jun 2000 03:35:35 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5G7ZEI51350
- for ; Fri, 16 Jun 2000 03:35:14 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA06103;
- Fri, 16 Jun 2000 03:34:47 -0400 (EDT)
-To: Chris Bitmead
-cc: PostgreSQL-development
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Chris Bitmead
- message dated "Fri, 16 Jun 2000 15:36:04 +1000"
-Date: Fri, 16 Jun 2000 03:34:47 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-Chris Bitmead writes:
-> Tom Lane wrote:
->> I don't see a lot of value in that. Better to do something like
->> tablespaces:
->>
->> //
-
-> What is the benefit of having oidoftablespace in the directory path?
-> Isn't tablespace an idea so you can store it somewhere completely
-> different?
-> Or is there some symlink idea or something?
-
-Exactly --- I'm assuming that the tablespace "directory" is likely
-to be a symlink to some other mounted volume. The point here is
-to keep the low-level file access routines from having to know very
-much about tablespaces or file organization. In the above proposal,
-all they need to know is the relation's OID and the name (or OID)
-of the tablespace the relation's assigned to; then they can form
-a valid path using a hardwired rule. There's still plenty of
-flexibility of organization, but it's not necessary to know that
-where the rubber meets the road (eg, when you're down inside mdblindwrt
-trying to dump a dirty buffer to disk with no spare resources to find
-out anything about the relation the page belongs to...)
-
- regards, tom lane
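
Editor's sketch of the "hardwired rule" argued for in the message above: the low-level code forms a file path from nothing but the tablespace identifier and the relation OID. The directory layout, the function name, and the sample OIDs are assumptions for illustration; the exact path format in the quoted proposal did not survive in this archive.

```python
import posixpath


def relation_path(data_dir, tablespace_oid, relation_oid):
    """Form a relation's file path from two OIDs with a fixed rule.

    No catalog lookup is needed, which is the point: a routine that is
    flushing a dirty buffer knows only these identifiers. Whether the
    tablespace directory is a real directory or a symlink to another
    volume is invisible at this level.
    """
    return posixpath.join(data_dir, str(tablespace_oid), str(relation_oid))
```

For example, `relation_path("/data/base", 16384, 18720)` yields `/data/base/16384/18720`; relocating the tablespace means repointing the `16384` entry, not changing this function.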
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA28913
- for ; Fri, 16 Jun 2000 11:01:05 -0400 (EDT)
-Received: from mailout05.sul.t-online.com (mailout05.sul.t-online.com [194.25.134.82]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id KAA01818 for ; Fri, 16 Jun 2000 10:46:42 -0400 (EDT)
-Received: from fwd06.sul.t-online.de
- by mailout05.sul.t-online.com with smtp
- id 132xN9-0006ze-03; Fri, 16 Jun 2000 16:45:27 +0200
-Received: from hot.jw.home (340000654369-0001@[62.158.179.251]) by fwd06.sul.t-online.de
- with esmtp id 132xMx-0E54HQC; Fri, 16 Jun 2000 16:45:15 +0200
-Received: (from wieck@localhost)
- by hot.jw.home (8.8.5/8.8.5) id OAA15163;
- Fri, 16 Jun 2000 14:42:12 +0200
-Subject: Re: [HACKERS] Big 7.1 open items
-To: Tom Lane
-Date: Fri, 16 Jun 2000 14:42:12 +0200 (MEST)
-CC: Hiroshi Inoue , Bruce Momjian ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Reply-To: Jan Wieck
-X-Mailer: ELM [version 2.4ME+ PL68 (25)]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Status: ROr
-
-Tom Lane wrote:
->
-> It gets a little trickier if you want to be able to split
-> multi-gig tables across several tablespaces, though, since
-> you couldn't just append ".N" to the base table path in that
-> scenario.
->
-> I'd be interested to know what sort of facilities Oracle
-> provides for managing huge tables...
-
- Oracle tablespaces are a collection of 1...n preallocated
- files. Each table then is bound to a tablespace and
- allocates extents (chunks) from those files.
-
- There are some per-table attributes that control the extent
- sizes, with default values coming from the tablespace: the
- initial extent size, the nextextent and the pctincrease.
- There is a hardcoded limit on the number of extents a table
- can have at all. In Oracle7 it was 512 (or somewhat below;
- I don't recall exactly). Maybe that's gone with Oracle8, I
- don't know.
-
- This storage concept has IMHO a couple of advantages over
- ours.
-
- The tablespace files are preallocated, so there will
- never be a change in block allocation during runtime, and
- that's the basis for fdatasync() being sufficient at
- syncpoints. All that might be inaccurate after a crash is
- the last modified time in the inode, and that's totally
- irrelevant for Oracle. The fsck will never fail, and
- everything else is up to Oracle's recovery.
-
- The number of total tablespace files is limited to a
- value that ensures, that the backends can keep them all
- open all the time. It's hard to exceed that limit. A
- typical SAP installation with more than 20,000
- tables/indices doesn't need more than 30 or 40 of them.
-
- It is perfectly prepared for raw devices, since a
- tablespace in a raw device installation is simply an area
- of blocks on a disk.
-
- There are also disadvantages.
-
- You can run out of space even if there are plenty GB's
- free on your disks. You have to create tablespaces
- explicitly.
-
- If you've chosen inadequate extent size parameters, you
- end up with highly fragmented tables (slowing things down)
- or get stuck running against maxextents, where only a reorg
- (export/import) helps.
-
-
-Jan
-
---
-
-#======================================================================#
-# It's easier to get forgiveness for being wrong than for being right. #
-# Let's break this rule - forgive me. #
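
Editor's sketch of how the three per-table parameters named in the message above (initial extent, nextextent, pctincrease) and the extent limit could interact. The growth rule used here, where each extent after the second is the previous one enlarged by pctincrease percent, is an assumption made for illustration, as are the function names; the message itself does not spell out the formula, and a real system would also round sizes to whole blocks.

```python
def extent_sizes(initial, next_extent, pctincrease, count):
    """Return the sizes of the first `count` extents of a table.

    Assumed rule: extent 1 is `initial`, extent 2 is `next_extent`,
    and each later extent is the previous one grown by `pctincrease`
    percent (integer arithmetic, rounded down).
    """
    sizes = []
    for i in range(count):
        if i == 0:
            size = initial
        elif i == 1:
            size = next_extent
        else:
            size = sizes[-1] * (100 + pctincrease) // 100
        sizes.append(size)
    return sizes


def max_table_size(initial, next_extent, pctincrease, maxextents):
    """Largest size a table can reach before it runs against maxextents."""
    return sum(extent_sizes(initial, next_extent, pctincrease, maxextents))
```

With small extents and pctincrease of 0 the ceiling is reached quickly, which is the "stuck running against maxextents" situation the message describes: `max_table_size(100, 50, 0, 3)` is only 200 units.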
-
-
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA28898
- for ; Fri, 16 Jun 2000 11:00:39 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA07184;
- Fri, 16 Jun 2000 11:00:35 -0400 (EDT)
-To: Jan Wieck
-cc: Hiroshi Inoue , Bruce Momjian ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
- message dated "Fri, 16 Jun 2000 14:42:12 +0200"
-Date: Fri, 16 Jun 2000 11:00:35 -0400
-From: Tom Lane
-Status: RO
-
-> There are also disadvantages.
-
-> You can run out of space even if there are plenty GB's
-> free on your disks. You have to create tablespaces
-> explicitly.
-
-Not to mention the reverse: if I read this right, you have to suck
-up your GB's long in advance of actually needing them. That's OK
-for a machine that's dedicated to Oracle ... not so OK for smaller
-installations, playpens, etc.
-
-I'm not convinced that there's anything fundamentally wrong with
-doing storage allocation in Unix files the way we have been.
-
-(At least not when we're sitting atop a well-done filesystem,
-which may leave the Linux folk out in the cold ;-).)
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA29853
- for ; Fri, 16 Jun 2000 12:01:02 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA08255 for ; Fri, 16 Jun 2000 11:48:10 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA07461;
- Fri, 16 Jun 2000 11:46:41 -0400 (EDT)
-To: Jan Wieck
-cc: Hiroshi Inoue , Bruce Momjian ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
- message dated "Fri, 16 Jun 2000 14:42:12 +0200"
-Date: Fri, 16 Jun 2000 11:46:41 -0400
-From: Tom Lane
-Status: RO
-
-> Tom Lane wrote:
->> It gets a little trickier if you want to be able to split
->> multi-gig tables across several tablespaces, though, since
->> you couldn't just append ".N" to the base table path in that
->> scenario.
->>
->> I'd be interested to know what sort of facilities Oracle
->> provides for managing huge tables...
-
-> Oracle tablespaces are a collection of 1...n preallocated
-> files. Each table then is bound to a tablespace and
-> allocates extents (chunks) from those files.
-
-OK, to get back to the point here: so in Oracle, tables can't cross
-tablespace boundaries, but a tablespace itself could span multiple
-disks?
-
-Not sure if I like that better or worse than equating a tablespace
-with a directory (so, presumably, all the files within it live on
-one filesystem) and then trying to make tables able to span
-tablespaces. We will need to do one or the other though, if we want
-to have any significant improvement over the current state of affairs
-for large tables.
-
-One way is to play the flip-the-path-ordering game some more,
-and access multiple-segment tables with pathnames like this:
-
- .../TABLESPACE/RELATION -- first or only segment
- .../TABLESPACE/N/RELATION -- N'th extension segment
-
-This isn't any harder for md.c to deal with than what we do now,
-but by making the /N subdirectories be symlinks, the dbadmin could
-easily arrange for extension segments to go on different filesystems.
-Also, since /N subdirectory symlinks can be added as needed,
-expanding available space by attaching more disks isn't hard.
-(If the admin hasn't pre-made a /N symlink when it's needed,
-I'd envision the backend just automatically creating a plain
-subdirectory so that it can extend the table.)
-
-A limitation is that the N'th extension segments of all the relations
-in a given tablespace have to be in the same place, but I don't see
-that as a major objection. Worst case is you make a separate tablespace
-for each of your multi-gig relations ... you're probably not going to
-have a very large number of such relations, so this doesn't seem like
-unmanageable admin complexity.
-
-We'd still want to create some tools to help the dbadmin with slinging
-all these symlinks around, of course. But I think it's critical to keep
-the low-level file access protocol simple and reliable, which really
-means minimizing the amount of information the backend needs to know to
-figure out which file to write a page in. With something like the above
-you only need to know the tablespace name (or more likely OID), the
-relation OID (+name or not, depending on outcome of other argument),
-and the offset in the table. No worse than now from the software's
-point of view.
-
-Comments?
-
- regards, tom lane
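
Editor's sketch of the segment naming scheme proposed in the message above: the first segment lives directly in the tablespace directory, and the N'th extension segment lives in a subdirectory named N, which the dbadmin may have made a symlink to another filesystem. The segment size constant and the function name are hypothetical; only the path shapes come from the message.

```python
import posixpath

# Hypothetical segment size: number of blocks stored per segment file.
SEGMENT_BLOCKS = 131072


def segment_path(base, tablespace, relation, block_number,
                 seg_blocks=SEGMENT_BLOCKS):
    """Map (tablespace, relation, block offset) to the file holding it.

    .../TABLESPACE/RELATION      first or only segment
    .../TABLESPACE/N/RELATION    N'th extension segment
    """
    segment = block_number // seg_blocks
    if segment == 0:
        return posixpath.join(base, tablespace, relation)
    return posixpath.join(base, tablespace, str(segment), relation)
```

The inputs are exactly the three items the message lists (tablespace, relation, offset in the table), so the lookup stays a pure computation; for instance block 300000 of relation `18720` in tablespace `ts1` maps to `.../ts1/2/18720`.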
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA00649
- for ; Fri, 16 Jun 2000 12:31:49 -0400 (EDT)
-Received: from huey.jpl.nasa.gov (huey.jpl.nasa.gov [128.149.68.100]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA13118 for ; Fri, 16 Jun 2000 12:31:52 -0400 (EDT)
-Received: from golem.jpl.nasa.gov (hectic-1 [128.149.68.203])
- by huey.jpl.nasa.gov (8.8.8+Sun/8.8.8) with ESMTP id JAA15007;
- Fri, 16 Jun 2000 09:27:18 -0700 (PDT)
-Received: from alumni.caltech.edu (localhost.localdomain [127.0.0.1])
- by golem.jpl.nasa.gov (Postfix) with ESMTP
- id DD8426F51; Fri, 16 Jun 2000 16:27:22 +0000 (UTC)
-Date: Fri, 16 Jun 2000 16:27:22 +0000
-From: Thomas Lockhart
-Organization: Yes
-X-Mailer: Mozilla 4.7 [en] (X11; I; Linux 2.2.14-15mdksmp i686)
-X-Accept-Language: en
-MIME-Version: 1.0
-To: Tom Lane
-Cc: Jan Wieck , Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: RO
-
-> ... But I think it's critical to keep
-> the low-level file access protocol simple and reliable, which really
-> means minimizing the amount of information the backend needs to know
-> to figure out which file to write a page in. With something like the
-> above you only need to know the tablespace name (or more likely OID),
-> the relation OID (+name or not, depending on outcome of other
-> argument), and the offset in the table. No worse than now from the
-> software's point of view.
-> Comments?
-
-I'm probably missing the context a bit, but imho we should try hard to
-stay away from symlinks as the general solution for anything.
-
-Sorry for being behind here, but to make sure I'm on the right page:
-o tablespaces decouple storage from logical tables
-o a database lives in a default tablespace, unless specified
-o by default, a table will live in the default tablespace
-o (eventually) a table can be split across tablespaces
-
-Some thoughts:
-o the ability to split single tables across disks was essential for
-scalability when disks were small. But with RAID, NAS, etc etc isn't
-that a smaller issue now?
-o "tablespaces" would implement our less-developed "with location"
-feature, right? Splitting databases, whole indices and whole tables
-across storage is the biggest win for this work since more users will
-use the feature.
-o location information needs to travel with individual tables anyway.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA01191;
- Fri, 16 Jun 2000 13:01:01 -0400 (EDT)
-Received: from thelab.hub.org (nat193.152.mpoweredpc.net [142.177.193.152]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA15282; Fri, 16 Jun 2000 12:53:23 -0400 (EDT)
-Received: from localhost (scrappy@localhost)
- by thelab.hub.org (8.9.3/8.9.3) with ESMTP id NAA28326;
- Fri, 16 Jun 2000 13:50:37 -0300 (ADT)
-X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
-Date: Fri, 16 Jun 2000 13:50:37 -0300 (ADT)
-From: The Hermit Hacker
-cc: Tom Lane , Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=US-ASCII
-Status: RO
-
-On Thu, 15 Jun 2000, Bruce Momjian wrote:
-
-> > "Hiroshi Inoue" writes:
-> > > Now I like neither relname nor oid because it's not sufficient
-> > > for my purpose.
-> >
-> > We should probably not do much of anything with this issue until
-> > we have a clearer understanding of what we want to do about
-> > tablespaces and schemas.
->
-> Here is an analysis of our options:
->
->                         Work required        Disadvantages
-> ----------------------------------------------------------------------------
->
-> Keep current system     no work              rename/create no rollback
->
-> relname/oid but         less work            new pg_class column,
-> no rename change                             filename not accurate on
->                                              rename
->
-> relname/oid with        more work            complex code
-> rename change during
-> vacuum
->
-> oid filename            less work, but       confusing to admins
->                         need admin tools
-
-My vote is with Tom on this one ... oid only ... the admin should be able
-to do a quick SELECT on a table to find out the OID->table mapping, and I
-believe it's already been pointed out that you can't just restore one file
-anyway, so it kinda negates the "server isn't running" problem ...
-
-
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA01188
- for ; Fri, 16 Jun 2000 13:01:01 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA15530 for ; Fri, 16 Jun 2000 12:55:38 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA07750;
- Fri, 16 Jun 2000 12:54:00 -0400 (EDT)
-To: Thomas Lockhart
-cc: Jan Wieck , Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Thomas Lockhart
- message dated "Fri, 16 Jun 2000 16:27:22 -0000"
-Date: Fri, 16 Jun 2000 12:54:00 -0400
-From: Tom Lane
-Status: RO
-
-Thomas Lockhart writes:
->> ... But I think it's critical to keep
->> the low-level file access protocol simple and reliable, which really
->> means minimizing the amount of information the backend needs to know
->> to figure out which file to write a page in. With something like the
->> above you only need to know the tablespace name (or more likely OID),
->> the relation OID (+name or not, depending on outcome of other
->> argument), and the offset in the table. No worse than now from the
->> software's point of view.
->> Comments?
-
-> I'm probably missing the context a bit, but imho we should try hard to
-> stay away from symlinks as the general solution for anything.
-
-Why?
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA02086
- for ; Fri, 16 Jun 2000 14:54:59 -0400 (EDT)
-Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id OAA26430 for ; Fri, 16 Jun 2000 14:40:00 -0400 (EDT)
-Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
- by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id LAA08661;
- Fri, 16 Jun 2000 11:38:36 -0700 (PDT)
-X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
-Date: Fri, 16 Jun 2000 10:50:23 -0700
-To: Tom Lane , Jan Wieck
-From: Don Baccus
-Subject: Re: [HACKERS] Big 7.1 open items
-Cc: Hiroshi Inoue , Bruce Momjian ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Mime-Version: 1.0
-Content-Type: text/plain; charset="us-ascii"
-Status: RO
-
-At 11:46 AM 6/16/00 -0400, Tom Lane wrote:
-
->OK, to get back to the point here: so in Oracle, tables can't cross
->tablespace boundaries,
-
-Right, the construct AFAIK is "create table/index foo on tablespace ..."
-
-> but a tablespace itself could span multiple
->disks?
-
-Right.
-
->Not sure if I like that better or worse than equating a tablespace
->with a directory (so, presumably, all the files within it live on
->one filesystem) and then trying to make tables able to span
->tablespaces. We will need to do one or the other though, if we want
->to have any significant improvement over the current state of affairs
->for large tables.
-
-Oracle's way does a reasonable job of isolating the datamodel
-from the details of the physical layout.
-
-Take the OpenACS web toolkit, for instance. We could take
-each module's tables and indices and assign them appropriately
-to various tablespaces, then provide a separate .sql file with
-only "create tablespace" statements in it.
-
-By modifying that one central file, the toolkit installation
-could be customized to run anything from a small site (one
-disk with everything on it, ala my own personal webserver at
-birdnotes.net) or a very large site with many spindles, with
-various index and table structures spread out widely hither
-and thither.
-
-Given that the OpenACS datamodel is nearly 10K lines long (including
-many comments, of course), being able to customize an installation
-to such a degree by modifying a single file filled with "create
-tablespaces" would be very attractive.
-
->One way is to play the flip-the-path-ordering game some more,
->and access multiple-segment tables with pathnames like this:
->
-> .../TABLESPACE/RELATION -- first or only segment
-> .../TABLESPACE/N/RELATION -- N'th extension segment
->
->This isn't any harder for md.c to deal with than what we do now,
->but by making the /N subdirectories be symlinks, the dbadmin could
->easily arrange for extension segments to go on different filesystems.
-
-I personally dislike depending on symlinks to move stuff around.
-Among other things, a pg_dump/restore (and presumably future
-backup tools?) can't recreate the disk layout automatically.
-
->We'd still want to create some tools to help the dbadmin with slinging
->all these symlinks around, of course.
-
-OK, if symlinks are simply an implementation detail hidden from the
-dbadmin, and if the physical structure is kept in the db so it can
-be rebuilt if necessary automatically, then I don't mind symlinks.
-
-> But I think it's critical to keep
->the low-level file access protocol simple and reliable, which really
->means minimizing the amount of information the backend needs to know to
->figure out which file to write a page in. With something like the above
->you only need to know the tablespace name (or more likely OID), the
->relation OID (+name or not, depending on outcome of other argument),
->and the offset in the table. No worse than now from the software's
->point of view.
-
-Make the code that creates and otherwise manipulates tablespaces
-do the work, while keeping the low-level file access protocol simple.
-
-Yes, this approach sounds very good to me.
-
-
-
-- Don Baccus, Portland OR
- Nature photos, on-line guides, Pacific Northwest
- Rare Bird Alert Service and other goodies at
- http://donb.photo.net.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA02107
- for ; Fri, 16 Jun 2000 14:55:09 -0400 (EDT)
-Received: from hub.org ([email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id OAA26943 for ; Fri, 16 Jun 2000 14:44:12 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5GIelM05972;
- Fri, 16 Jun 2000 14:40:47 -0400 (EDT)
-Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5GIe5M05692
- for ; Fri, 16 Jun 2000 14:40:05 -0400 (EDT)
-Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
- by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id LAA08667;
- Fri, 16 Jun 2000 11:38:41 -0700 (PDT)
-X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
-Date: Fri, 16 Jun 2000 11:14:35 -0700
-To: Thomas Lockhart ,
- Tom Lane
-From: Don Baccus
-Subject: Re: [HACKERS] Big 7.1 open items
-Cc: Jan Wieck , Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Mime-Version: 1.0
-Content-Type: text/plain; charset="us-ascii"
-Precedence: bulk
-Status: RO
-
-At 04:27 PM 6/16/00 +0000, Thomas Lockhart wrote:
-
->Sorry for being behind here, but to make sure I'm on the right page:
->o tablespaces decouple storage from logical tables
->o a database lives in a default tablespace, unless specified
->o by default, a table will live in the default tablespace
->o (eventually) a table can be split across tablespaces
-
-Or tablespaces across filesystems/mountpoints whatever.
-
->Some thoughts:
->o the ability to split single tables across disks was essential for
->scalability when disks were small. But with RAID, NAS, etc etc isn't
->that a smaller issue now?
-
-Yes for size issues, I should think, especially if you have the
-money for a large RAID subsystem. But for throughput performance,
-control over which spindles particularly busy tables and indices
-go on would still seem to be pretty relevant when they're being
-updated a lot, in order to minimize seek times.
-
-I really can't say how important this is in reality. Oracle-world
-folks still talk about this kind of optimization being important,
-but I'm not personally running any kind of database-backed website
-that's busy enough or contains enough storage to worry about it.
-
->o "tablespaces" would implement our less-developed "with location"
->feature, right? Splitting databases, whole indices and whole tables
->across storage is the biggest win for this work since more users will
->use the feature.
->o location information needs to travel with individual tables anyway.
-
-
-
-- Don Baccus, Portland OR
- Nature photos, on-line guides, Pacific Northwest
- Rare Bird Alert Service and other goodies at
- http://donb.photo.net.
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA02397
- for ; Fri, 16 Jun 2000 15:00:54 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id PAA08247;
- Fri, 16 Jun 2000 15:00:11 -0400 (EDT)
-To: Don Baccus
-cc: Jan Wieck , Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Don Baccus
- message dated "Fri, 16 Jun 2000 10:50:23 -0700"
-Date: Fri, 16 Jun 2000 15:00:10 -0400
-From: Tom Lane
-Status: RO
-
-Don Baccus writes:
->> This isn't any harder for md.c to deal with than what we do now,
->> but by making the /N subdirectories be symlinks, the dbadmin could
->> easily arrange for extension segments to go on different filesystems.
-
-> I personally dislike depending on symlinks to move stuff around.
-> Among other things, a pg_dump/restore (and presumably future
-> backup tools?) can't recreate the disk layout automatically.
-
-Good point, we'd need some way of saving/restoring the tablespace
-structures.
-
->> We'd still want to create some tools to help the dbadmin with slinging
->> all these symlinks around, of course.
-
-> OK, if symlinks are simply an implementation detail hidden from the
-> dbadmin, and if the physical structure is kept in the db so it can
-> be rebuilt if necessary automatically, then I don't mind symlinks.
-
-I'm not sure about keeping it in the db --- creates a bit of a
-chicken-and-egg problem doesn't it? Maybe there needs to be a
-"system database" that has nailed-down pathnames (no tablespaces
-for you baby) and contains the critical installation-wide tables
-like pg_database, pg_user, pg_tablespace. A restore would have
-to restore these tables first anyway.
-
-> Make the code that creates and otherwise manipulates tablespaces
-> do the work, while keeping the low-level file access protocol simple.
-
-Right, that's the bottom line for me.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA03689
- for ; Fri, 16 Jun 2000 16:51:49 -0400 (EDT)
-Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id PAA03409 for ; Fri, 16 Jun 2000 15:48:40 -0400 (EDT)
-Received: by rice.edu
- via sendmail from stdin
- id (Debian Smail3.2.0.102)
-Date: Fri, 16 Jun 2000 14:35:28 -0500
-From: "Ross J. Reedstrom"
-To: Thomas Lockhart
-Cc: Tom Lane , Jan Wieck ,
- Hiroshi Inoue ,
- Bruce Momjian ,
-Subject: Re: [HACKERS] Big 7.1 open items
-Mail-Followup-To: Thomas Lockhart ,
- Tom Lane , Jan Wieck ,
- Hiroshi Inoue ,
- Bruce Momjian ,
-Mime-Version: 1.0
-Content-Type: text/plain; charset=iso-8859-1
-Content-Transfer-Encoding: 8bit
-User-Agent: Mutt/1.0i
-Status: RO
-
-On Fri, Jun 16, 2000 at 04:27:22PM +0000, Thomas Lockhart wrote:
-> > ... But I think it's critical to keep
-> > the low-level file access protocol simple and reliable, which really
-> > means minimizing the amount of information the backend needs to know
-> > to figure out which file to write a page in. With something like the
-> > above you only need to know the tablespace name (or more likely OID),
-> > the relation OID (+name or not, depending on outcome of other
-> > argument), and the offset in the table. No worse than now from the
-> > software's point of view.
-> > Comments?
-
-I think the backend needs a per table token that indicates how
-to get at the physical bits of the file. Whether that's a filename
-alone, filename with path, oid, key to a smgr hash table or something
-else, it's opaque above the smgr routines.
-
-Hmm, now I'm thinking, since the tablespace discussion has been reopened,
-the way to go about coding all this is to reactivate the smgr code: how
-about I leave the existing md smgr as is, and clone it, call it md2 or
-something, and start messing with adding features there?
-
-
->
-> I'm probably missing the context a bit, but imho we should try hard to
-> stay away from symlinks as the general solution for anything.
->
-> Sorry for being behind here, but to make sure I'm on the right page:
-> o tablespaces decouple storage from logical tables
-> o a database lives in a default tablespace, unless specified
-> o by default, a table will live in the default tablespace
-> o (eventually) a table can be split across tablespaces
->
-> Some thoughts:
-> o the ability to split single tables across disks was essential for
-> scalability when disks were small. But with RAID, NAS, etc etc isn't
-> that a smaller issue now?
-> o "tablespaces" would implement our less-developed "with location"
-> feature, right? Splitting databases, whole indices and whole tables
-> across storage is the biggest win for this work since more users will
-> use the feature.
-> o location information needs to travel with individual tables anyway.
-
-I was just thinking that that discussion needed some summation.
-
-Some links to historic discussion:
-
-This one is Vadim saying WAL will need oids names:
-http://www.postgresql.org/mhonarc/pgsql-hackers/1999-11/msg00809.html
-
-A longer discussion kicked off by Don Baccus:
-http://www.postgresql.org/mhonarc/pgsql-hackers/2000-01/msg00510.html
-
-Tom suggesting OIDs to allow rollback:
-http://www.postgresql.org/mhonarc/pgsql-hackers/2000-03/msg00119.html
-
-
-Martin Neumann posted a question on dataspaces:
-
-(can't find it in the official archives: looks like March 2000, 10-29 is
-missing. Here's my copy: don't beat on it! In particular, since I threw
-it together for local access, it's one _big_ index page)
-
-http://cooker.ir.rice.edu/postgresql/msg20257.html
-(in that thread is a post where I mention blindwrites and getting rid
-of GetRawDatabaseInfo)
-
-Martin later posted an RFD on tablespaces:
-
-http://cooker.ir.rice.edu/postgresql/msg20490.html
-
-Here's Horák Daniel with a patch for discussion, implementing dataspaces
-on a per database level:
-
-http://cooker.ir.rice.edu/postgresql/msg20498.html
-
-Ross
---
-Ross J. Reedstrom, Ph.D.,
-NSBRI Research Scientist/Programmer
-Computer and Information Technology Institute
-Rice University, 6100 S. Main St., Houston, TX 77005
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA03692
- for ; Fri, 16 Jun 2000 16:51:50 -0400 (EDT)
-Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id PAA02911 for ; Fri, 16 Jun 2000 15:43:13 -0400 (EDT)
-Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
- by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id MAA11003;
- Fri, 16 Jun 2000 12:41:50 -0700 (PDT)
-X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
-Date: Fri, 16 Jun 2000 12:37:36 -0700
-To: Tom Lane
-From: Don Baccus
-Subject: Re: [HACKERS] Big 7.1 open items
-Cc: Jan Wieck , Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Mime-Version: 1.0
-Content-Type: text/plain; charset="us-ascii"
-Status: RO
-
-At 03:00 PM 6/16/00 -0400, Tom Lane wrote:
-
->> OK, if symlinks are simply an implementation detail hidden from the
->> dbadmin, and if the physical structure is kept in the db so it can
->> be rebuilt if necessary automatically, then I don't mind symlinks.
->
->I'm not sure about keeping it in the db --- creates a bit of a
->chicken-and-egg problem doesn't it?
-
-Not if the tablespace creation precedes the tables stored in it.
-
-> Maybe there needs to be a
->"system database" that has nailed-down pathnames (no tablespaces
->for you baby) and contains the critical installation-wide tables
->like pg_database, pg_user, pg_tablespace. A restore would have
->to restore these tables first anyway.
-
-Oh, I see. Yes, when I've looked into this and have thought about
-it I've assumed that there would always be a known starting point
-which would contain the installation-wide tables.
-
-From a practical point of view, I don't think that's really a
-problem.
-
-I've not looked into how Oracle does this, I assume it builds
-a system tablespace on one of the initial mount points you give
-it when you install the thing. The paths to the mount points
-are stored in specific files known to Oracle, I think. It's
-been over a year (not long enough!) since I've set up Oracle...
-
-
-
-
-- Don Baccus, Portland OR
- Nature photos, on-line guides, Pacific Northwest
- Rare Bird Alert Service and other goodies at
- http://donb.photo.net.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA04168
- for ; Fri, 16 Jun 2000 17:31:03 -0400 (EDT)
-Received: from hub.org ([email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id RAA12122 for ; Fri, 16 Jun 2000 17:09:28 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5GL7WM02231;
- Fri, 16 Jun 2000 17:07:32 -0400 (EDT)
-Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5GL7EM02150
- for ; Fri, 16 Jun 2000 17:07:14 -0400 (EDT)
-Received: by rice.edu
- via sendmail from stdin
- id (Debian Smail3.2.0.102)
-Date: Fri, 16 Jun 2000 16:07:13 -0500
-From: "Ross J. Reedstrom"
-To: Tom Lane
-Subject: Re: [HACKERS] Big 7.1 open items
-Mail-Followup-To: Tom Lane ,
-Mime-Version: 1.0
-Content-Type: text/plain; charset=us-ascii
-User-Agent: Mutt/1.0i
-Precedence: bulk
-Status: RO
-
-On Thu, Jun 15, 2000 at 07:53:52PM -0400, Tom Lane wrote:
-> "Ross J. Reedstrom" writes:
-> > On Thu, Jun 15, 2000 at 03:11:52AM -0400, Tom Lane wrote:
-> >> "Ross J. Reedstrom" writes:
-> >>>> Any strong objections to the mixed relname_oid solution?
-> >>
-> >> Yes!
->
-> > The plan here was to let VACUUM handle renaming the file, since it
-> > will already have all the necessary locks. This shortens the window
-> > of confusion. ALTER TABLE RENAME doesn't happen that often, really -
-> > the relname is there just for human consumption, then.
->
-> Yeah, I've seen tons of discussion of how if we do this, that, and
-> the other thing, and be prepared to fix up some other things in case
-> of crash recovery, we can make it work with filename == relname + OID
-> (where relname tracks logical name, at least at some remove).
->
-> Probably. Assuming nobody forgets anything.
-
-I agree, it seems a major undertaking, at first glance. And second. Even
-third. Especially for someone who hasn't 'earned his spurs' yet, as
-it were.
-
-> I'm just trying to point out that that's a huge amount of pretty
-> delicate mechanism. The amount of work required to make it trustworthy
-> looks to me to dwarf the admin tools that Bruce is complaining about.
-> And we only have a few people competent to do the work. (With all
-> due respect, Ross, if you weren't already aware of the implications
-> for mdblindwrt, I have to wonder what else you missed.)
-
-Ah, you knew that comment would come back to haunt me (I have a
-tendency to think out loud, even if checking and coming back later
-would be better;-) In fact, there's no problem, and never was, since the
-buffer->blind.relname is filled in via RelationGetPhysicalRelationName,
-just like every other path that requires direct file access. I just
-didn't remember that I had in fact checked it (it's been a couple months,
-and I just got back from vacation ;-)
-
-Actually, once I re-checked it, the code looked very familiar. I had
-spent time looking at the blind write code in the context of getting
-rid of the only non-startup use of GetRawDatabaseInfo.
-
-As to missing things: I'm leaning heavily on Bruce's previous
-work for temp tables, to separate the two uses of relname, via the
-RelationGetRelationName and RelationGetPhysicalRelationName. There are
-102 uses of the first in the current code (many in elog messages), and
-only 11 of the second. If I'd had to do the original work of finding
-every use of relname, and categorizing it, I agree I'm not (yet) up to
-it, but I have more confidence in Bruce's (already tested) work.
-
->
-> Filename == OID is so simple, reliable, and straightforward by
-> comparison that I think the decision is a no-brainer.
->
-
-Perhaps. Changing the label of the file on disk still requires finding
-all the code that assumes it knows what that name is, and changing it.
-Same work.
-
-> If we could afford to sink unlimited time into this one issue then
-> it might make sense to do it the hard way, but we have enough
-> important stuff on our TODO list to keep us all busy for years ---
-> I cannot believe that it's an effective use of our time to do this.
->
-
-The joys of Open Development. You've spent a fair amount of time trying
-to convince _me_ not to waste my time. Thanks, but I'm pretty bull headed
-sometimes. Since I've already done some of the work, take a look
-at what I've got, and then tell me I'm wasting my time, o.k.?
-
->
-> > Hmm, what's all this with functions in catalog.c that are only called by
-> > smgr/md.c? seems to me that anything having to do with physical storage
-> > (like the path!) belongs in the smgr abstraction.
->
-> Yeah, there's a bunch of stuff that should have been implemented by
-> adding new smgr entry points, but wasn't. It should be pushed down.
-> (I can't resist pointing out that one of those things is physical
-> relation rename, which will go away and not *need* to be pushed down
-> if we do it the way I want.)
->
-
-Oh, I agree completely. In fact, as I said to Hiroshi last time this came
-up, I think of the field in pg_class as an opaque token, to be filled in
-by the smgr, and only used by code further up to hand back to the smgr
-routines. Same should be true of the buffer->blind struct.
-
-Ross
---
-Ross J. Reedstrom, Ph.D.,
-NSBRI Research Scientist/Programmer
-Computer and Information Technology Institute
-Rice University, 6100 S. Main St., Houston, TX 77005
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA05334
- for ; Fri, 16 Jun 2000 19:30:59 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA19834 for ; Fri, 16 Jun 2000 19:09:59 -0400 (EDT)
-Received: from mcadnote1 (ppm122.noc.fukui.nsk.ne.jp [210.161.188.41])
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id IAA08210; Sat, 17 Jun 2000 08:08:15 +0900
-From: "Hiroshi Inoue"
-To: "Tom Lane" , "Jan Wieck"
-Cc: "Bruce Momjian" ,
- "PostgreSQL-development" ,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Sat, 17 Jun 2000 08:11:08 +0900
-Message-ID:
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-2022-jp"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
-Importance: Normal
-Status: RO
-
-> -----Original Message-----
->
-> > There are also disadvantages.
->
-> > You can run out of space even if there are plenty GB's
-> > free on your disks. You have to create tablespaces
-> > explicitly.
->
-> Not to mention the reverse: if I read this right, you have to suck
-> up your GB's long in advance of actually needing them. That's OK
-> for a machine that's dedicated to Oracle ... not so OK for smaller
-> installations, playpens, etc.
->
-
-I have been uneasy about Oracle-style preallocation.
-It has not been easy for me to estimate the extent size in
-Oracle. Preallocation might cost us the simplicity of environment
-setup, which is one of the biggest advantages of PostgreSQL.
-It seems that we should also provide not_preallocated DATAFILE
-when many_tables_in_a_file storage manager is introduced.
-
-Regards.
-
-Hiroshi Inoue
-
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA05337
- for ; Fri, 16 Jun 2000 19:31:00 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA20335 for ; Fri, 16 Jun 2000 19:18:26 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA09274;
- Fri, 16 Jun 2000 19:16:37 -0400 (EDT)
-To: "Ross J. Reedstrom"
-cc: Thomas Lockhart ,
- Jan Wieck , Hiroshi Inoue ,
- Bruce Momjian ,
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to "Ross J. Reedstrom"
- message dated "Fri, 16 Jun 2000 14:35:28 -0500"
-Date: Fri, 16 Jun 2000 19:16:37 -0400
-From: Tom Lane
-Status: RO
-
-"Ross J. Reedstrom" writes:
-> I think the backend needs a per table token that indicates how
-> to get at the physical bits of the file. Whether that's a filename
-> alone, filename with path, oid, key to a smgr hash table or something
-> else, it's opaque above the smgr routines.
-
-Except to the commands that provide the user interface for tablespaces
-and so forth. And there aren't all that many places that deal with
-physical filenames anyway. It would be a good idea to try to be a
-little stricter about this, but I'm not sure you can make the separation
-a whole lot cleaner than it is now ... with the exception of the obvious
-bogosities like "rename table" being done above the smgr level. (But,
-as I said, I want to see that code go away, not just get moved into
-smgr...)
-
-> Hmm, now I'm thinking, since the tablespace discussion has been reopened,
-> the way to go about coding all this is to reactivate the smgr code: how
-> about I leave the existing md smgr as is, and clone it, call it md2 or
-> something, and start messing with adding features there?
-
-Um, well, you can't have it both ways. If you're going to change/fix
-the assumptions of code above the smgr, then you've got to update md
-at the same time to match your new definition of the smgr interface.
-Won't do much good to have a playpen smgr if the "standard" one is
-broken.
-
-One thing I have been thinking would be a good idea is to take the
-relcache out of the bufmgr/smgr interfaces. The relcache is a
-higher-level concept and ought not be known to bufmgr or smgr; they
-ought to work with some low-level data structure or token for relations.
-We might be able to eliminate the whole concept of "blind write" if we
-do that. There are other problems with the relcache dependency: entries
-in relcache can get blown away at inopportune times due to shared cache
-inval, and it doesn't provide a good home for tokens for multiple
-"versions" of a relation if we go with the fill-a-new-physical-file
-approach to CLUSTER and so on.
-
-Hmm, if you replace relcache in the smgr interfaces with pointers to
-an smgr-maintained data structure, that might be the same thing that
-you are alluding to above about an smgr hash table.
-
-One thing *not* to do is add yet a third layer of data structure on
-top of the ones already maintained in fd.c and md.c. Whatever extra
-data might be needed here should be added to md.c's tables, I think,
-and then the tokens used in the smgr interface would be pointers into
-that table.
-
- regards, tom lane
-
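The idea of an opaque token that points into a single smgr-maintained table can be sketched roughly as follows. This is a Python illustration only, not md.c; `StorageManager`, `open_relation`, `physical_path`, and the `<base>/<tablespace>/<relation>` path format are hypothetical names and assumptions.

```python
class StorageManager:
    """Toy storage manager: callers only ever hold integer tokens."""

    def __init__(self, base_dir: str):
        self._base_dir = base_dir
        self._paths = []   # the one table of per-relation data
        self._tokens = {}  # (tablespace_oid, rel_oid) -> index into _paths

    def open_relation(self, tablespace_oid: int, rel_oid: int) -> int:
        """Return a token for the relation, creating its table entry once."""
        key = (tablespace_oid, rel_oid)
        if key not in self._tokens:
            # Assumed path format; only the storage manager knows it.
            self._paths.append(f"{self._base_dir}/{tablespace_oid}/{rel_oid}")
            self._tokens[key] = len(self._paths) - 1
        return self._tokens[key]

    def physical_path(self, token: int) -> str:
        """Resolve a token to a path; meant for storage-manager internals."""
        return self._paths[token]
```

Code above the storage manager would pass the token back on every call and never build a filename itself, so changing the naming scheme touches only this one table.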
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA05329
- for ; Fri, 16 Jun 2000 19:30:41 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id TAA09320;
- Fri, 16 Jun 2000 19:30:26 -0400 (EDT)
-To: "Hiroshi Inoue"
-cc: "Jan Wieck" ,
- "Bruce Momjian" ,
- "PostgreSQL-development" ,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-In-reply-to:
-References:
-Comments: In-reply-to "Hiroshi Inoue"
- message dated "Sat, 17 Jun 2000 08:11:08 +0900"
-Date: Fri, 16 Jun 2000 19:30:25 -0400
-From: Tom Lane
-Status: ROr
-
-"Hiroshi Inoue" writes:
-> It seems that we should also provide not_preallocated DATAFILE
-> when many_tables_in_a_file storage manager is introduced.
-
-Several people in this thread have been talking like a
-single-physical-file storage manager is in our future, but I can't
-recall anyone saying that they were going to do such a thing or even
-presenting reasons why it'd be a good idea.
-
-Seems to me that physical file per relation is considerably better for
-our purposes. It's easier to figure out what's going on for admin and
-debug work, it means less lock contention among different backends
-appending concurrently to different relations, and it gives the OS a
-better shot at doing effective read-ahead on sequential scans.
-
-So why all the enthusiasm for multi-tables-per-file?
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA07578;
- Fri, 16 Jun 2000 21:01:00 -0400 (EDT)
-Received: from tech.com.au (IDENT:[email protected] [139.130.75.122]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id UAA24724; Fri, 16 Jun 2000 20:39:30 -0400 (EDT)
-Received: from bitmead.com (IDENT:chris@tardis [203.41.180.243])
- by tech.com.au (8.9.3/8.9.3) with ESMTP id KAA21388;
- Sat, 17 Jun 2000 10:39:21 +1000
-Date: Sat, 17 Jun 2000 10:39:16 +1000
-From: Chris Bitmead
-X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.14-5.0 i686)
-X-Accept-Language: en
-MIME-Version: 1.0
-CC: Tom Lane , Hiroshi Inoue ,
- Jan Wieck ,
- Bruce Momjian ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: RO
-
-
-> > So why all the enthusiasm for multi-tables-per-file?
-
-It allows you to use raw partitions which stop the OS double buffering
-and wasting half of memory, as well as removing the overhead of indirect
-blocks in the file system.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA22177;
- Sat, 17 Jun 2000 06:00:59 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id FAA21759; Sat, 17 Jun 2000 05:36:27 -0400 (EDT)
-Received: from mcadnote1 (ppm130.noc.fukui.nsk.ne.jp [210.161.188.49])
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id SAA08383; Sat, 17 Jun 2000 18:35:36 +0900
-From: "Hiroshi Inoue"
-To: "Bruce Momjian" , "Tom Lane"
-Cc: "Jan Wieck" ,
- "Bruce Momjian" ,
- "PostgreSQL-development" ,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Sat, 17 Jun 2000 18:38:29 +0900
-Message-ID:
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="US-ASCII"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
-Importance: Normal
-Status: RO
-
-> -----Original Message-----
-> >
-> > So why all the enthusiasm for multi-tables-per-file?
->
-> No idea. I thought Vadim mentioned it, but I am not sure anymore. I
-> certainly like our current system.
->
-
-Oops, I'm not so enthusiastic about a multi_tables_per_file smgr.
-I believe that Ross and I have taken a practical approach that doesn't
-break the current file_per_table smgr.
-
-However it seems very natural to take multi_tables_per_file
-smgr into account when we consider TABLESPACE concept.
-Because TABLESPACE is an encapsulation,it should have
-a possibility to handle multi_tables_per_file smgr IMHO.
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA02794;
- Sat, 17 Jun 2000 12:31:07 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA07194; Sat, 17 Jun 2000 12:12:53 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA18824;
- Sat, 17 Jun 2000 12:11:18 -0400 (EDT)
-To: "Hiroshi Inoue"
-cc: "Bruce Momjian" , "Jan Wieck" ,
- "Bruce Momjian" ,
- "PostgreSQL-development" ,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-In-reply-to:
-References:
-Comments: In-reply-to "Hiroshi Inoue"
- message dated "Sat, 17 Jun 2000 18:38:29 +0900"
-Date: Sat, 17 Jun 2000 12:11:18 -0400
-From: Tom Lane
-Status: RO
-
-"Hiroshi Inoue" writes:
-> However it seems very natural to take multi_tables_per_file
-> smgr into account when we consider TABLESPACE concept.
-> Because TABLESPACE is an encapsulation,it should have
-> a possibility to handle multi_tables_per_file smgr IMHO.
-
-OK, I see: you're just saying that the tablespace stuff should be
-designed in such a way that it would work with a non-file-per-table
-smgr. Agreed, that'd be a good check of a clean design, and someday
-we might need it...
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA06514
- for ; Sun, 18 Jun 2000 12:30:58 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA04979 for ; Sun, 18 Jun 2000 12:07:44 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA12163;
- Sun, 18 Jun 2000 12:06:29 -0400 (EDT)
-cc: Jan Wieck , Hiroshi Inoue ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Sun, 18 Jun 2000 09:33:44 -0400"
-Date: Sun, 18 Jun 2000 12:06:29 -0400
-From: Tom Lane
-Status: ROr
-
-> ... We could even get fancy and
-> round-robin through all the extents directories, looping around to the
-> beginning when we run out of them. That sounds nice.
-
-That sounds horrible. There's no way to tell which extent directory
-extent N goes into except by scanning the location directory to find
-out how many extent subdirectories there are (so that you can compute
-N modulo number-of-directories). Do you want to pay that price on every
-file open?
-
-Worse, what happens when you add another extent directory? You can't
-find your old extents anymore, that's what, because they're not in the
-right place (N modulo number-of-directories just changed). Since the
-extents are presumably on different volumes, you're talking about
-physical file moves to get them where they should be. You probably
-can't add a new extent without shutting down the entire database while
-you reshuffle files --- at the very least you'd need to get exclusive
-locks on all the tables in that tablespace.
-
-Also, you'll get filename conflicts from multiple extents of a single
-table appearing in one of the recycled extent dirs. You could work
-around it by using the non-modulo'd N as part of the final file name,
-but that just adds more complexity and makes the filename-generation
-machinery that much more closely tied to this specific way of doing
-things.
-
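The relocation problem with round-robin placement is easy to check with a few lines of arithmetic. This is a minimal Python sketch; the function names are made up for illustration, and it assumes placement is exactly "extent N modulo number-of-directories" as described above.

```python
def round_robin_dir(extent_no: int, num_dirs: int) -> int:
    """Round-robin placement: extent N lives in directory N modulo num_dirs."""
    return extent_no % num_dirs


def fixed_dir(extent_no: int) -> int:
    """Fixed placement: extent N always lives in subdirectory N."""
    return extent_no


def misplaced_after_growth(num_extents: int, old_dirs: int, new_dirs: int):
    """List the extents whose round-robin directory changes when dirs are added."""
    return [
        n
        for n in range(num_extents)
        if round_robin_dir(n, old_dirs) != round_robin_dir(n, new_dirs)
    ]
```

For example, `misplaced_after_growth(8, 3, 4)` returns `[3, 4, 5, 6, 7]`: going from three directories to four leaves five of eight extents in the wrong place. With three directories, extents 0 and 3 also land in the same directory, which is the filename conflict mentioned above. Under `fixed_dir` nothing ever moves.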
-The right way to do this is that extent N goes into extents subdirectory
-N, period. If there's no such subdirectory, create one on-the-fly as a
-plain subdirectory of the location directory. The dbadmin can easily
-create secondary extent symlinks *in advance of their being needed*.
-Reorganizing later is much more painful since it requires moving
-physical files, but I think that'd be true no matter what. At least
-we should see to it that adding more space in advance of needing it is
-painless.
-
-It's possible to do it that way (auto-create extent subdir if needed)
-without tying the md.c machinery real closely to a specific filename
-creation procedure: it's just the same sort of thing as install programs
-customarily do. "If you fail to create a file, try creating its
-ancestor directory." We'd have to think about whether it'd be a good
-idea to allow auto-creation of more than one level of directory; offhand
-it seems that needing to make more than one level is probably a sign of
-an erroneous path, not need for another extent subdirectory.
-
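The "if you fail to create a file, try creating its ancestor directory" rule, limited to a single level, might look like the following. This is a Python sketch rather than md.c code, and `open_extent_file` is a hypothetical name.

```python
import os


def open_extent_file(path: str):
    """Open an extent file for appending, creating one missing parent level.

    If the parent directory does not exist, create it as a plain directory
    and retry once. os.mkdir (unlike os.makedirs) fails when the grandparent
    is missing too, which is treated here as a sign of an erroneous path.
    """
    try:
        return open(path, "ab")
    except FileNotFoundError:
        os.mkdir(os.path.dirname(path))
        return open(path, "ab")
```

If the dbadmin has already created the parent as a symlink to another volume, the first `open` succeeds and the fallback never runs.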
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA19951
- for ; Sun, 18 Jun 2000 20:00:59 -0400 (EDT)
-Received: from smtp.pacifier.com (asteroid.pacifier.com [199.2.117.154]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA24345 for ; Sun, 18 Jun 2000 19:50:06 -0400 (EDT)
-Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
- by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id QAA05302;
- Sun, 18 Jun 2000 16:49:27 -0700 (PDT)
-X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
-Date: Sun, 18 Jun 2000 16:43:42 -0700
-To: Bruce Momjian , Tom Lane
-From: Don Baccus
-Subject: Re: [HACKERS] Big 7.1 open items
-Cc: Jan Wieck , Hiroshi Inoue ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-Mime-Version: 1.0
-Content-Type: text/plain; charset="us-ascii"
-Status: ROr
-
-At 06:50 PM 6/18/00 -0400, Bruce Momjian wrote:
->If we eliminate the round-robin idea, what did people think of the rest
->of the ideas?
-
-Why invent new syntax when "create tablespace" is something a lot
-of folks will recognize?
-
-And why not use "create table ... using ... "? In other words,
-Oracle-compatible for this construct? Sure, Postgres doesn't
-have to follow Oraclisms but picking an existing construct means
-at least SOME folks can import a datamodel without having to
-edit it.
-
-Does your proposal break the smgr abstraction, i.e. does it
-preclude later efforts to (say) implement an (optional)
-raw-device storage manager?
-
-
-
-
-- Don Baccus, Portland OR
- Nature photos, on-line guides, Pacific Northwest
- Rare Bird Alert Service and other goodies at
- http://donb.photo.net.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA23880
- for ; Sun, 18 Jun 2000 23:28:12 -0400 (EDT)
-Received: from hub.org ([email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA04627 for ; Sun, 18 Jun 2000 23:24:37 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5J3GQM78526;
- Sun, 18 Jun 2000 23:16:26 -0400 (EDT)
- by hub.org (8.10.1/8.10.1) with ESMTP id e5J3E3M71538
- for ; Sun, 18 Jun 2000 23:14:03 -0400 (EDT)
-Received: (from pgman@localhost)
- by candle.pha.pa.us (8.9.0/8.9.0) id XAA23541;
- Sun, 18 Jun 2000 23:13:44 -0400 (EDT)
-Subject: Re: [HACKERS] Big 7.1 open items
- pm"
-To: Tom Lane
-Date: Sun, 18 Jun 2000 23:13:44 -0400 (EDT)
-CC: Jan Wieck , Hiroshi Inoue ,
- PostgreSQL-development ,
- "Ross J. Reedstrom"
-X-Mailer: ELM [version 2.4ME+ PL77 (25)]
-MIME-Version: 1.0
-Content-Transfer-Encoding: 7bit
-Content-Type: text/plain; charset=US-ASCII
-Precedence: bulk
-Status: RO
-
-My basic proposal is that we optionally allow symlinks when creating
-tablespace directories, and that we interrogate those symlinks during a
-dump so administrators can move tablespaces around without having to
-modify environment variables or system tables.
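-
The idea of interrogating symlinks at dump time, instead of recording their targets in a system table, can be sketched as below. This is a minimal illustration in Python rather than backend code; the function name `symlink_target` and the directory names in the test are hypothetical, and the thread itself does not specify an implementation.

```python
import os


def symlink_target(tablespace_dir):
    """Return the location a tablespace directory points to, or None.

    os.path.islink() examines the directory entry itself (lstat
    semantics), so a plain directory yields None and a symlink yields
    the path stored in the link.
    """
    if os.path.islink(tablespace_dir):
        return os.readlink(tablespace_dir)
    return None
```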
-
-I also suggested creating an extent directory to hold extents, like
-extent/2 and extent/3. This will allow administration for smaller sites
-to be simpler.
-
---
- Bruce Momjian | http://www.op.net/~candle
- + If your life is a hard drive, | 830 Blythe Avenue
- + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA01941
- for
; Mon, 19 Jun 2000 00:31:00 -0400 (EDT)
-Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA06881 for
; Mon, 19 Jun 2000 00:11:39 -0400 (EDT)
-Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
- by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id VAA29138;
- Sun, 18 Jun 2000 21:11:01 -0700 (PDT)
-X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
-Date: Sun, 18 Jun 2000 21:07:48 -0700
-To: Bruce Momjian
, Tom Lane
-From: Don Baccus
-Subject: Re: [HACKERS] Big 7.1 open items
-Cc: Jan Wieck , Hiroshi Inoue ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Mime-Version: 1.0
-Content-Type: text/plain; charset="us-ascii"
-Status: RO
-
-At 11:13 PM 6/18/00 -0400, Bruce Momjian wrote:
->My basic proposal is that we optionally allow symlinks when creating
->tablespace directories, and that we interrogate those symlinks during a
->dump so administrators can move tablespaces around without having to
->modify environment variables or system tables.
-
-If they can move them around from within the db, they'll have no need to
-move them around from outside the db.
-
-I don't quite understand your devotion to using filesystem commands
-outside the database to do database administration.
-
-
-
-- Don Baccus, Portland OR
- Nature photos, on-line guides, Pacific Northwest
- Rare Bird Alert Service and other goodies at
- http://donb.photo.net.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA01981
- for
; Mon, 19 Jun 2000 01:31:01 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA09569 for
; Mon, 19 Jun 2000 01:13:53 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5J4T3M86960;
- Mon, 19 Jun 2000 00:29:04 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5J4RFM80712
- for
; Mon, 19 Jun 2000 00:27:15 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA09517;
- Mon, 19 Jun 2000 00:25:53 -0400 (EDT)
-cc: Jan Wieck , Hiroshi Inoue ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Sun, 18 Jun 2000 23:13:44 -0400"
-Date: Mon, 19 Jun 2000 00:25:52 -0400
-From: Tom Lane
-Precedence: bulk
-Status: ROr
-
-> I also suggested creating an extent directory to hold extents, like
-> extent/2 and extent/3. This will allow administration for smaller sites
-> to be simpler.
-
-I don't see the value in creating an extra level of directory --- seems
-that just adds one more Unix directory-lookup cycle to each file open,
-without any apparent return. What's wrong with extent directory names
-like extent2, extent3, etc?
-
-Obviously the extent dirnames must be chosen so they can't conflict
-with table filenames, but that's easily done. For example, if table
-files are named like 'OID_xxx' then 'extentN' will never conflict.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA01934
- for
; Mon, 19 Jun 2000 00:30:58 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA07814 for
; Mon, 19 Jun 2000 00:29:36 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA09535;
- Mon, 19 Jun 2000 00:28:14 -0400 (EDT)
-To: Don Baccus
-cc: Bruce Momjian
, Jan Wieck ,
- Hiroshi Inoue ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Don Baccus
- message dated "Sun, 18 Jun 2000 21:07:48 -0700"
-Date: Mon, 19 Jun 2000 00:28:14 -0400
-From: Tom Lane
-Status: ROr
-
-Don Baccus writes:
-> If they can move them around from within the db, they'll have no need to
-> move them around from outside the db.
-> I don't quite understand your devotion to using filesystem commands
-> outside the database to do database administration.
-
-Being *able* to use filesystem commands to see/fix what's going on is a
-good thing, particularly from a development/debugging standpoint. But
-I agree we want to have within-the-system admin commands to do the same
-things.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA00799
- for
; Mon, 19 Jun 2000 00:58:38 -0400 (EDT)
-Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA08143 for
; Mon, 19 Jun 2000 00:37:39 -0400 (EDT)
-Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
- by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id VAA00259;
- Sun, 18 Jun 2000 21:36:25 -0700 (PDT)
-X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
-Date: Sun, 18 Jun 2000 21:33:19 -0700
-To: Tom Lane
-From: Don Baccus
-Subject: Re: [HACKERS] Big 7.1 open items
-Cc: Bruce Momjian
, Jan Wieck ,
- Hiroshi Inoue ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Mime-Version: 1.0
-Content-Type: text/plain; charset="us-ascii"
-Status: RO
-
-At 12:28 AM 6/19/00 -0400, Tom Lane wrote:
-
->Being *able* to use filesystem commands to see/fix what's going on is a
->good thing, particularly from a development/debugging standpoint.
-
-Of course it's a crutch for development, but outside of development
-circles few users will know how to use the OS in regard to the
-database.
-
-Assuming PG takes off. Of course, if it remains the realm of the
-dedicated hard-core hacker, I'm wrong.
-
-I have nothing against preserving the ability to use filesystem
-commands if there's no significant costs inherent with this approach.
-I'd view the breaking of smgr abstraction as a significant cost (though
-I agree with Ross that Bruce's proposal shouldn't require that, I
-asked my question to flush Bruce out, if you will, because he's
-devoted to a particular outside-the-db management model).
-
-> But
->I agree we want to have within-the-system admin commands to do the same
->things.
-
-MUST have, I should think.
-
-
-
-- Don Baccus, Portland OR
- Nature photos, on-line guides, Pacific Northwest
- Rare Bird Alert Service and other goodies at
- http://donb.photo.net.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA29988
- for
; Mon, 19 Jun 2000 12:31:16 -0400 (EDT)
-Received: from sd.tpf.co.jp (mail.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA21005 for
; Mon, 19 Jun 2000 12:15:22 -0400 (EDT)
-Received: from mcadnote1 (ppm127.noc.fukui.nsk.ne.jp [210.161.188.46])
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id BAA09828; Tue, 20 Jun 2000 01:14:19 +0900
-From: "Hiroshi Inoue"
-Cc: "Tom Lane" , "Jan Wieck" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "Don Baccus"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Tue, 20 Jun 2000 01:17:14 +0900
-Message-ID:
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="us-ascii"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
-Importance: Normal
-Status: ROr
-
-> -----Original Message-----
->
-> The fact is that symlink information is already stored in the file
-> system. If we store symlink information in the database too, there
-> exists the ability for the two to get out of sync. My point is that I
-> think we can _not_ store symlink information in the database, and query
-> the file system using lstat when required.
->
-
-Hmm, this seems pretty confusing to me.
-I don't understand the necessity of symlinks.
-Directory trees, symlinks, hard links ... are OS standards.
-But I don't think they are fit for DBMS management.
-
-PostgreSQL is a database system, of course. So
-couldn't it handle a more flexible structure than the OS's
-directory tree by itself?
-
-Regards.
-
-Hiroshi Inoue
-
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24419
- for
; Tue, 20 Jun 2000 02:00:59 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA26090 for
; Tue, 20 Jun 2000 01:51:00 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id OAA10171; Tue, 20 Jun 2000 14:50:03 +0900
-From: "Hiroshi Inoue"
-Cc: "Tom Lane" , "Jan Wieck" ,
- "Ross J. Reedstrom" ,
- "Don Baccus" ,
- "PostgreSQL-development"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Tue, 20 Jun 2000 14:52:17 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Importance: Normal
-Status: ROr
-
-> -----Original Message-----
->
-> > > -----Original Message-----
-> > >
-> > > The fact is that symlink information is already stored in the file
-> > > system. If we store symlink information in the database too, there
-> > > exists the ability for the two to get out of sync. My point is that I
-> > > think we can _not_ store symlink information in the database,
-> and query
-> > > the file system using lstat when required.
-> > >
-> > Hmm, this seems pretty confusing to me.
-> > I don't understand the necessity of symlinks.
-> > Directory trees, symlinks, hard links ... are OS standards.
-> > But I don't think they are fit for DBMS management.
-> >
-> > PostgreSQL is a database system, of course. So
-> > couldn't it handle a more flexible structure than the OS's
-> > directory tree by itself?
->
-> Yes, but is anyone suggesting a solution that does not work with
-> symlinks? If not, why not do it that way?
->
-
-Maybe other solutions have been proposed already because
-there have been so many opinions and proposals.
-
-I've felt the TABLE(DATA)SPACE discussion has always been
-divergent. IMHO, one of the main causes is that various factors
-have been discussed at once. Shouldn't we build consensus step by
-step in the TABLE(DATA)SPACE discussion?
-
-IMHO, the first step is to decide the syntax of the CREATE TABLE
-command, not to define TABLE(DATA)SPACE.
-
-Comments ?
-
-Regards.
-
-Hiroshi Inoue
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA15181
- for
; Tue, 20 Jun 2000 10:51:31 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id KAA26466 for
; Tue, 20 Jun 2000 10:37:20 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id KAA29689;
- Tue, 20 Jun 2000 10:36:04 -0400 (EDT)
-cc: Hiroshi Inoue , Jan Wieck ,
- "Ross J. Reedstrom" ,
- Don Baccus ,
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Tue, 20 Jun 2000 09:40:03 -0400"
-Date: Tue, 20 Jun 2000 10:36:04 -0400
-From: Tom Lane
-Status: RO
-
-> Agreed. Seems we have several issues:
-
-> filename contents
-> tablespace implementation
-> tablespace directory layout
-> tablespace commands and syntax
-
-I think we've agreed that the filename must depend on tablespace,
-file version, and file segment number in some fashion --- plus
-the table name/OID of course. Although there's no real consensus
-about exactly how to construct the name, agreeing on the components
-is still a positive step.
-
-A couple of other areas of contention were:
-
- revising smgr interface to be cleaner
- exactly what to store in pg_class
-
-I don't think there's any quibble about the idea of cleaning up smgr,
-but we don't have a complete proposal on the table yet either.
-
-As for the pg_class issue, I still favor storing
- (a) OID of tablespace --- not for file access, but so that
- associated tablespace-table entry can be looked up
- by tablespace management operations
- (b) pathname of file as a column of type "name", including
- a %d to be replaced by segment #
-
-I think Peter was holding out for storing purely numeric tablespace OID
-and table version in pg_class and having a hardwired mapping to pathname
-somewhere in smgr. However, I think that doing it that way gains only
-micro-efficiency compared to passing a "name" around, while using the
-name approach buys us flexibility that's needed for at least some of
-the variants under discussion. Given that the exact filename contents
-are still so contentious, I think it'd be a bad idea to pick an
-implementation that doesn't allow some leeway as to what the filename
-will be. A name also has the advantage that it is a single item that
-can be used to identify the table to smgr, which will help in cleaning
-up the smgr interface.
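-
Option (b) above, a stored pathname containing a %d that is replaced by the segment number, can be illustrated with a minimal sketch. It is written in Python for brevity and is not PostgreSQL code; the function name `segment_path` and the sample template `ts1/OID_1234.%d` are assumptions made only for illustration, since the thread had not settled on a filename format.

```python
def segment_path(path_template, segment_no):
    """Expand a stored pathname template for one file segment.

    path_template is assumed to contain exactly one %d placeholder,
    which is replaced by the segment number.
    """
    if segment_no < 0:
        raise ValueError("segment number must be non-negative")
    return path_template % segment_no


# Hypothetical template; only the %d convention comes from the message.
print(segment_path("ts1/OID_1234.%d", 0))
print(segment_path("ts1/OID_1234.%d", 2))
```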
-
-As for tablespace layout/implementation, the only real proposal I've
-heard is that there be a subdirectory of the database directory for each
-tablespace, and that that have a subdirectory for each segment (extent)
-of its tables --- where any of these subdirectories could be symlinks
-off to a different filesystem. Some unhappiness was raised about
-depending on symlinks for this function, but I didn't hear one single
-concrete reason not to do it, nor an alternative design. Unless someone
-comes up with a counterproposal, I think that that's what the actual
-access mechanism will look like. We still need to talk about what we
-want to store in the SQL-level representation of a tablespace, and what
-sort of tablespace management tools/commands are needed. (Although
-"try to make it look like Oracle" seems to be pretty much the consensus
-for the command level, not all of us know exactly what that means...)
-
-Comments? Anything else that we do have consensus on?
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA25768
- for
; Tue, 20 Jun 2000 12:55:04 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA09949 for
; Tue, 20 Jun 2000 12:41:15 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5KGcCM19112;
- Tue, 20 Jun 2000 12:38:12 -0400 (EDT)
-Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5KGbbM18701
- for
; Tue, 20 Jun 2000 12:37:37 -0400 (EDT)
-Received: from regulus.student.UU.SE ([130.238.5.2]:43625 "EHLO
- regulus.its.uu.se") by merganser.its.uu.se with ESMTP
- id ; Tue, 20 Jun 2000 18:37:05 +0200
-Received: from peter (helo=localhost)
- by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
- id 134R7f-0003wS-00; Tue, 20 Jun 2000 18:43:35 +0200
-Date: Tue, 20 Jun 2000 18:43:35 +0200 (CEST)
-cc: Jan Wieck , Tom Lane ,
- Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=ISO-8859-1
-Content-Transfer-Encoding: 8BIT
-Precedence: bulk
-Status: ROr
-
-Bruce Momjian writes:
-
-> If we have a new CREATE DATABASE LOCATION command, we can say:
->
-> CREATE DATABASE LOCATION dbloc IN '/var/private/pgsql';
-> CREATE DATABASE newdb IN dbloc;
-
-We kind of have this already, with CREATE DATABASE foo WITH LOCATION =
-'bar'; but of course with environment variable kludgery. But it's a start.
-
-> mkdir /var/private/pgsql/dbloc
-> ln -s /var/private/pgsql/dbloc data/base/dbloc
-
-I think the problem with this was that you'd have to do an extra lookup
-into, say, pg_location to resolve this. Some people are talking about
-blind writes, this is not really blind.
-
-> CREATE LOCATION tabloc IN '/var/private/pgsql';
-> CREATE TABLE newtab ... IN tabloc;
-
-Okay, so we'd have "table spaces" and "database spaces". Seems like one
-"space" ought to be enough. I was thinking that the database "space" would
-serve as a default "space" for tables created within it but you could
-still create tables in other "spaces" than where the database really is. In
-fact, the database wouldn't show up at all in the file names anymore,
-which may or may not be a good thing.
-
-I think Tom suggested something more or less like this:
-
-$PGDATA/base/tablespace/segment/table
-
-(leaving the details of "table" aside for now). pg_class would get a
-column storing the table space somehow, say an oid reference to
-pg_location. There would have to be a default tablespace that's created by
-initdb and it's indicated by oid 0. So if you create a simple little table
-"foo" it ends up in
-
-$PGDATA/base/0/0/foo
-
-That is pretty manageable. Now to create a table space you do
-
-CREATE LOCATION "name" AT '/some/where';
-
-which would make an entry in pg_location and, similar to how you
-suggested, create a symlink from
-
-$PGDATA/base/newoid -> /some/where
-
-Then when you create a new table at that new location this gets simply
-noted in pg_class with an oid reference, the rest works completely
-transparently and no lookup outside of pg_class required. The system would
-create the segment 0 subdirectory automatically.
-
-When tables get segmented the system would simply create subdirectories 1,
-2, 3, etc. as needed, just as it created the 0 as needed, no extra code.
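-
The layout described above, $PGDATA/base/tablespace/segment/table with a default tablespace of oid 0, can be sketched as follows. This is an illustrative Python sketch under the assumptions stated in the message, not actual backend code; the function name `table_file_path` and the example oid 17001 are hypothetical.

```python
import posixpath

# Assumed from the message above: initdb creates a default tablespace with oid 0.
DEFAULT_TABLESPACE_OID = 0


def table_file_path(pgdata, table, tablespace_oid=DEFAULT_TABLESPACE_OID, segment=0):
    """Build <pgdata>/base/<tablespace oid>/<segment>/<table>.

    No lookup outside the values already stored with the table is
    needed: the tablespace oid and segment number map directly onto
    directory names.
    """
    return posixpath.join(pgdata, "base", str(tablespace_oid), str(segment), table)


# A simple table "foo" in the default tablespace, segment 0.
print(table_file_path("/usr/local/pgsql/data", "foo"))
```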
-
-pg_dump doesn't need to use lstat or whatever at all because the locations
-are catalogued. Administrators don't even need to know about the linking
-business, they just make sure the target directory exists.
-
-Two more items to ponder:
-
-* per-location transaction logs
-
-* pg_upgrade
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA10307
- for
; Tue, 20 Jun 2000 17:10:55 -0400 (EDT)
-Received: from sd.tpf.co.jp (mail.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id QAA08017 for
; Tue, 20 Jun 2000 16:57:44 -0400 (EDT)
-Received: from mcadnote1 (ppm127.noc.fukui.nsk.ne.jp [210.161.188.46])
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id FAA00867; Wed, 21 Jun 2000 05:56:44 +0900
-From: "Hiroshi Inoue"
-To: "Tom Lane"
, "Bruce Momjian"
-Cc: "Jan Wieck" , "Ross J. Reedstrom" ,
- "Don Baccus" ,
- "PostgreSQL-development"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Wed, 21 Jun 2000 05:59:41 +0900
-Message-ID:
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-2022-jp"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
-Importance: Normal
-Status: RO
-
-> -----Original Message-----
->
-> > Agreed. Seems we have several issues:
->
-> > filename contents
-> > tablespace implementation
-> > tablespace directory layout
-> > tablespace commands and syntax
->
-
-[snip]
-
->
-> Comments? Anything else that we do have consensus on?
->
-
-Before the details of tablespace implementation,
-
-1) How do we change (extend) the syntax of CREATE TABLE?
- Do we only add a table(data)space name with some
- keyword? I.e., do we consider tablespace as an
- abstraction?
-
-To confirm our mutual understanding:
-
-2) Is tablespace defined per PostgreSQL database?
-3) Is the default tablespace defined per database/user or
- for all?
-
-AFAIK in Oracle, 2) is global and 3) is per user.
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA12668;
- Tue, 20 Jun 2000 20:00:58 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA21016; Tue, 20 Jun 2000 19:54:18 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id IAA00974; Wed, 21 Jun 2000 08:52:38 +0900
-From: "Hiroshi Inoue"
-Cc: "Jan Wieck" , "Tom Lane" ,
- "Bruce Momjian" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Wed, 21 Jun 2000 08:54:51 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Importance: Normal
-Status: ROr
-
-> -----Original Message-----
-> From: Peter Eisentraut
->
-> Bruce Momjian writes:
->
-> > If we have a new CREATE DATABASE LOCATION command, we can say:
-> >
-> > CREATE DATABASE LOCATION dbloc IN '/var/private/pgsql';
-> > CREATE DATABASE newdb IN dbloc;
->
-> We kind of have this already, with CREATE DATABASE foo WITH LOCATION =
-> 'bar'; but of course with environment variable kludgery. But it's a start.
->
-> > mkdir /var/private/pgsql/dbloc
-> > ln -s /var/private/pgsql/dbloc data/base/dbloc
->
-> I think the problem with this was that you'd have to do an extra lookup
-> into, say, pg_location to resolve this. Some people are talking about
-> blind writes, this is not really blind.
->
-> > CREATE LOCATION tabloc IN '/var/private/pgsql';
-> > CREATE TABLE newtab ... IN tabloc;
->
-> Okay, so we'd have "table spaces" and "database spaces". Seems like one
-> "space" ought to be enough.
-
-Does your "database space" correspond to current PostgreSQL's database ?
-And is it different from SCHEMA ?
-
-Regards.
-
-Hiroshi Inoue
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA18016;
- Wed, 21 Jun 2000 00:23:47 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA05207; Wed, 21 Jun 2000 00:07:58 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA03002;
- Wed, 21 Jun 2000 00:06:42 -0400 (EDT)
-cc: Hiroshi Inoue
, Peter Eisentraut ,
- Jan Wieck ,
- Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Tue, 20 Jun 2000 23:45:13 -0400"
-Date: Wed, 21 Jun 2000 00:06:42 -0400
-From: Tom Lane
-Status: ROr
-
-> I recommend making a dbname in each directory, then putting the
-> location inside there.
-
-This still seems backwards to me. Why is it better than tablespace
-directory inside database directory?
-
-One significant problem with it is that there's no longer (AFAICS)
-a "default" per-database directory that corresponds to the current
-working directory of backends running in that database. Thus,
-for example, it's not immediately clear where temporary files and
-backend core-dump files will end up. Also, you've just added an
-essential extra level (if not two) to the pathnames that backends will
-use to address files.
-
-There is a great deal to be said for
- ..../database/tablespace/filename
-where .../database/ is the working directory of a backend running in
-that database, so that the relative pathname used by that backend to
-get to a table is just tablespace/filename. I fail to see any advantage
-in reversing the pathname order. If you see one, enlighten me.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA19614
- for
; Wed, 21 Jun 2000 01:00:54 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5L4wA125142;
- Wed, 21 Jun 2000 00:58:10 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5L4vp125043
- for
; Wed, 21 Jun 2000 00:57:51 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id NAA01462; Wed, 21 Jun 2000 13:52:47 +0900
-From: "Hiroshi Inoue"
-To: "Tom Lane"
, "Bruce Momjian"
-Cc: "Peter Eisentraut"
, "Jan Wieck" ,
- "Bruce Momjian" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Wed, 21 Jun 2000 13:55:01 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-2022-jp"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Precedence: bulk
-Status: RO
-
-> -----Original Message-----
->
-> > I recommend making a dbname in each directory, then putting the
-> > location inside there.
->
-> This still seems backwards to me. Why is it better than tablespace
-> directory inside database directory?
->
-> One significant problem with it is that there's no longer (AFAICS)
-> a "default" per-database directory that corresponds to the current
-> working directory of backends running in that database. Thus,
-> for example, it's not immediately clear where temporary files and
-> backend core-dump files will end up. Also, you've just added an
-> essential extra level (if not two) to the pathnames that backends will
-> use to address files.
->
-> There is a great deal to be said for
-> ..../database/tablespace/filename
-
-OK, I seem to have gotten the answer to the question
- Is tablespace defined per PostgreSQL's database ?
-
-You and Bruce
- 1) tablespace is per database
-Peter seems to have the following idea (?? not sure)
- 2) database = tablespace
-My opinion
- 3) database and tablespace are mostly unrelated.
- I assume PostgreSQL's database would correspond
- to the concept of SCHEMA.
-
-It seems we differ from the start.
-Shouldn't we reach an agreement on this first?
-
-Regards.
-
-Hiroshi Inoue
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20523
- for
; Wed, 21 Jun 2000 01:31:12 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA08982 for
; Wed, 21 Jun 2000 01:15:17 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5L5Bp151546;
- Wed, 21 Jun 2000 01:11:51 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5L5BP151324
- for
; Wed, 21 Jun 2000 01:11:25 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id BAA03463;
- Wed, 21 Jun 2000 01:09:52 -0400 (EDT)
-To: Chris Bitmead
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Chris Bitmead
- message dated "Wed, 21 Jun 2000 14:45:01 +1000"
-Date: Wed, 21 Jun 2000 01:09:52 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-Chris Bitmead writes:
-> What I meant is, would you still be able to create tablespaces on
-> systems without symlinks? That would seem to be a desirable feature.
-
-All else being equal, it'd be nice. Since all else is not equal,
-exactly how much sweat are we willing to expend on supporting that
-feature on such systems --- to the exclusion of other features we
-might expend the same sweat on, with more widely useful results?
-
-Bear in mind that everything will still *work* just fine on such a
-platform, you just don't have a way to spread the database across
-multiple filesystems. That's only an issue if the platform has a
-fairly Unixy notion of filesystems ... but no symlinks.
-
-A few messages back someone was opining that we were wasting our time
-thinking about tablespaces at all, because any modern platform can
-create disk-spanning filesystems for itself, so applications don't have
-to worry. I don't buy that argument in general, but I'm quite willing
-to quote it for the *very* few systems that are Unixy enough to run
-Postgres in the first place, but not quite Unixy enough to have
-symlinks.
-
-You gotta draw the line somewhere at what you will support, and
-this particular line seems to me to be entirely reasonable and
-justifiable. YMMV...
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20492
- for
; Wed, 21 Jun 2000 01:30:58 -0400 (EDT)
-Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA09401 for
; Wed, 21 Jun 2000 01:22:50 -0400 (EDT)
-Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
- by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id WAA22395;
- Tue, 20 Jun 2000 22:21:47 -0700 (PDT)
-X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
-Date: Tue, 20 Jun 2000 22:12:48 -0700
-To: "Philip J. Warner"
, "Hiroshi Inoue" ,
- "Tom Lane" ,
-From: Don Baccus
-Subject: RE: [HACKERS] Big 7.1 open items
-Cc: "Jan Wieck" , "Ross J. Reedstrom" ,
- "PostgreSQL-development"
-References:
-Mime-Version: 1.0
-Content-Type: text/plain; charset="us-ascii"
-Status: RO
-
-At 11:22 AM 6/21/00 +1000, Philip J. Warner wrote:
-
->It may be worth considering leaving the CREATE TABLE statement alone.
->Dec/RDB uses a new statement entirely to define where a table goes...
-
-It's worth considering, but on the other hand Oracle users greatly
-outnumber Compaq/RDB users these days...
-
-If there's no SQL92 guidance for implementing a feature, I'm pretty much in
-favor of tracking Oracle, whose SQL dialect is rapidly becoming a
-de-facto standard.
-
-I'm not saying I like the fact, Oracle's a pain in the ass. But when
-adopting existing syntax, might as well adopt that of the crushing
-borg.
-
-
-
-- Don Baccus, Portland OR
- Nature photos, on-line guides, Pacific Northwest
- Rare Bird Alert Service and other goodies at
- http://donb.photo.net.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20508;
- Wed, 21 Jun 2000 01:31:06 -0400 (EDT)
-Received: from huey.jpl.nasa.gov (huey.jpl.nasa.gov [128.149.68.100]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA09355; Wed, 21 Jun 2000 01:22:03 -0400 (EDT)
-Received: from golem.jpl.nasa.gov (hectic-1 [128.149.68.203])
- by huey.jpl.nasa.gov (8.8.8+Sun/8.8.8) with ESMTP id WAA00821;
- Tue, 20 Jun 2000 22:18:38 -0700 (PDT)
-Received: from alumni.caltech.edu (localhost.localdomain [127.0.0.1])
- by golem.jpl.nasa.gov (Postfix) with ESMTP
- id AF4376F51; Wed, 21 Jun 2000 05:19:29 +0000 (UTC)
-Date: Wed, 21 Jun 2000 05:19:29 +0000
-From: Thomas Lockhart
-Organization: Yes
-X-Mailer: Mozilla 4.7 [en] (X11; I; Linux 2.2.14-15mdksmp i686)
-X-Accept-Language: en
-MIME-Version: 1.0
-Cc: Peter Eisentraut
, Jan Wieck ,
- Tom Lane , Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: ROr
-
-> Yes, I didn't like the environment variable stuff. In fact, I would
-> like to not mention the symlink location anywhere in the database, so
-> it can be changed without changing it in the database.
-
-Well, as y'all have noticed, I think there are strong reasons to use
-environment variables to manage locations, and that symlinks are a
-potential portability and robustness problem.
-
-An additional point which has relevance to this whole discussion:
-
-In the future we may allow system resources such as tables to carry names
-which use multi-byte encodings. afaik these encodings are not allowed to
-be used for physical file names, and even if they were the utility of
-using standard operating system utilities like ls goes way down.
-
-istm that from a portability and evolutionary standpoint OID-only file
-names (or at least file names *not* based on relation/class names) is a
-requirement.
-
-Comments?
-
- - Thomas
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA20503
- for
; Wed, 21 Jun 2000 01:31:05 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA09513 for
; Wed, 21 Jun 2000 01:25:18 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id BAA03557;
- Wed, 21 Jun 2000 01:23:58 -0400 (EDT)
-To: "Hiroshi Inoue"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to "Hiroshi Inoue"
- message dated "Wed, 21 Jun 2000 13:55:01 +0900"
-Date: Wed, 21 Jun 2000 01:23:57 -0400
-From: Tom Lane
-Status: ROr
-
-"Hiroshi Inoue" writes:
->> There is a great deal to be said for
->> ..../database/tablespace/filename
-
-> OK, I seem to have gotten the answer for the question
-> Is tablespace defined per PostgreSQL's database ?
-
-Not necessarily --- the tablespace subdirectories could be symlinks
-pointing to the same place (assuming you use OIDs or something to keep
-the table filenames unique even across databases). This is just an
-implementation mechanism; it doesn't foreclose the policy decision
-whether tablespaces are database-local or installation-wide.
-
-(OTOH, pathnames like tablespace/database would pretty much force
-tablespaces to be installation-wide whether you wanted it that way
-or not.)
-
-> My opinion
-> 3) database and tablespace are relatively irrelevant.
-> I assume PostgreSQL's database would correspond
-> to the concept of SCHEMA.
-
-My inclination is that tablespaces should be installation-wide, but
-I'm not completely sold on it. In any case I could see wanting a
-permissions mechanism that would only allow some databases to have
-tables in a particular tablespace.
-
-We do need to think more about how traditional Postgres databases
-fit together with SCHEMA. Maybe we wouldn't even need multiple
-databases per installation if we had SCHEMA done right.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA25698
- for
; Wed, 21 Jun 2000 02:31:00 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id CAA11423 for
; Wed, 21 Jun 2000 02:09:13 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5L5we151226;
- Wed, 21 Jun 2000 01:58:40 -0400 (EDT)
-Received: from wallace.ece.rice.edu (wallace.ece.rice.edu [128.42.12.154])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5L5wE151030
- for
; Wed, 21 Jun 2000 01:58:14 -0400 (EDT)
-Received: by rice.edu
- via sendmail from stdin
- id (Debian Smail3.2.0.102)
-Date: Wed, 21 Jun 2000 00:45:02 -0500
-From: "Ross J. Reedstrom"
-To: Tom Lane
-Cc: Hiroshi Inoue
, Bruce Momjian ,
- Peter Eisentraut
, Jan Wieck ,
-Subject: Re: [HACKERS] Big 7.1 open items
-Mail-Followup-To: Tom Lane ,
- Hiroshi Inoue ,
- Peter Eisentraut
, Jan Wieck ,
-Mime-Version: 1.0
-Content-Type: text/plain; charset=us-ascii
-User-Agent: Mutt/1.0i
-Precedence: bulk
-Status: ROr
-
-On Wed, Jun 21, 2000 at 01:23:57AM -0400, Tom Lane wrote:
-> "Hiroshi Inoue" writes:
->
-> > My opinion
-> > 3) database and tablespace are relatively irrelevant.
-> > I assume PostgreSQL's database would correspond
-> > to the concept of SCHEMA.
->
-> My inclination is that tablespaces should be installation-wide, but
-> I'm not completely sold on it. In any case I could see wanting a
-> permissions mechanism that would only allow some databases to have
-> tables in a particular tablespace.
->
-> We do need to think more about how traditional Postgres databases
-> fit together with SCHEMA. Maybe we wouldn't even need multiple
-> databases per installation if we had SCHEMA done right.
->
-
-The important point I think is that tablespaces are about physical
-storage/namespace, and SCHEMA are about logical namespace: it would make
-sense for tables from multiple schema to live in the same tablespace,
-as well as tables from one schema to be stored in multiple tablespaces.
-
-Ross
---
-Ross J. Reedstrom, Ph.D.,
-NSBRI Research Scientist/Programmer
-Computer and Information Technology Institute
-Rice University, 6100 S. Main St., Houston, TX 77005
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA25704
- for
; Wed, 21 Jun 2000 02:31:02 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id CAA11923 for
; Wed, 21 Jun 2000 02:22:41 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5L6JO196109;
- Wed, 21 Jun 2000 02:19:24 -0400 (EDT)
-Received: from mailo.vtcif.telstra.com.au (mailo.vtcif.telstra.com.au [202.12.144.17])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5L6JB196028
- for
; Wed, 21 Jun 2000 02:19:11 -0400 (EDT)
-Received: (from uucp@localhost) by mailo.vtcif.telstra.com.au (8.8.2/8.6.9) id QAA21128 for
; Wed, 21 Jun 2000 16:19:04 +1000 (EST)
-Received: from maili.vtcif.telstra.com.au(202.12.142.17)
- via SMTP by mailo.vtcif.telstra.com.au, id smtpd08EKgu; Wed Jun 21 16:17:56 2000
-Received: (from uucp@localhost) by maili.vtcif.telstra.com.au (8.8.2/8.6.9) id QAA02825 for
; Wed, 21 Jun 2000 16:17:55 +1000 (EST)
-Received: from localhost(127.0.0.1), claiming to be "mail.cdn.telstra.com.au"
- via SMTP by localhost, id smtpdnjRBD_; Wed Jun 21 16:17:14 2000
-Received: from lunitari.nimrod.itg.telecom.com.au (lunitari.nimrod.itg.telecom.com.au [192.53.254.48]) by mail.cdn.telstra.com.au (8.8.2/8.6.9) with ESMTP id QAA07553 for
; Wed, 21 Jun 2000 16:17:14 +1000 (EST)
-Received: from nimrod.itg.telecom.com.au (majere [192.53.254.45])
- by lunitari.nimrod.itg.telecom.com.au (8.9.1/8.9.3) with ESMTP id QAA05880
- for
; Wed, 21 Jun 2000 16:15:56 +1000 (EST)
-Date: Wed, 21 Jun 2000 16:13:47 +1000
-From: Chris Bitmead
-Organization: IBM Global Services
-X-Mailer: Mozilla 4.6 [en] (X11; I; SunOS 5.6 sun4u)
-X-Accept-Language: en
-MIME-Version: 1.0
-To: PostgreSQL-development
-Subject: Re: [HACKERS] Big 7.1 open items
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Precedence: bulk
-Status: RO
-
-"Ross J. Reedstrom" wrote:
-
-> The important point I think is that tablespaces are about physical
-> storage/namespace, and SCHEMA are about logical namespace: it would make
-> sense for tables from multiple schema to live in the same tablespace,
-> as well as tables from one schema to be stored in multiple tablespaces.
-
-If we accept that argument (which sounds good) then wouldn't we have...
-
-data/base/db1/table1 -> ../../../tablespace/ts1/db1.table1
-data/base/db1/table2 -> ../../../tablespace/ts1/db1.table2
-data/tablespace/ts1/db1.table1
-data/tablespace/ts1/db1.table2
-
-In other words there is a directory for databases, and a directory for
-tablespaces. Database tables are symlinked to the appropriate
-tablespace. So there are multiple databases per tablespace and multiple
-tablespaces per database.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA06055
- for
; Wed, 21 Jun 2000 09:01:00 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id IAA29647 for
; Wed, 21 Jun 2000 08:52:25 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5LCo0112103;
- Wed, 21 Jun 2000 08:50:00 -0400 (EDT)
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5LCnS112011
- for
; Wed, 21 Jun 2000 08:49:28 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id OAA27330;
- Wed, 21 Jun 2000 14:48:44 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Wed, 21 Jun 2000 14:48:44 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C605BA5983@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Hiroshi Inoue'"
-Subject: AW: [HACKERS] Big 7.1 open items
-Date: Wed, 21 Jun 2000 14:48:43 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Precedence: bulk
-Status: RO
-
-
-> > > CREATE LOCATION tabloc IN '/var/private/pgsql';
-> > > CREATE TABLE newtab ... IN tabloc;
-> >
-> > Okay, so we'd have "table spaces" and "database spaces".
-> Seems like one
-> > "space" ought to be enough.
-
-Yes, one space should be enough.
-
->
-> Does your "database space" correspond to current PostgreSQL's
-> database ?
-
-I think we should think of the "database space" as the default "table space"
-for this database.
-
-> And is it different from SCHEMA ?
-
-Please don't mix schema and database, they are two different issues.
-Even Oracle has a database, only in Oracle you are limited to one database
-per instance. We do not want to add this limitation to PostgreSQL.
-
-Andreas
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA06585;
- Wed, 21 Jun 2000 10:01:09 -0400 (EDT)
-Received: from meryl.it.uu.se (
[email protected] [130.238.12.42]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id JAA03592; Wed, 21 Jun 2000 09:38:34 -0400 (EDT)
- by meryl.it.uu.se (8.8.5/8.8.5) with ESMTP id PAA20520;
- Wed, 21 Jun 2000 15:34:34 +0200 (MET DST)
-Received: from localhost (e99re41@localhost) by Ulv.DoCS.UU.SE (8.6.12/8.6.12) with ESMTP id PAA10847; Wed, 21 Jun 2000 15:34:27 +0200
-X-Authentication-Warning: Ulv.DoCS.UU.SE: e99re41 owned process doing -bs
-Date: Wed, 21 Jun 2000 15:34:27 +0200 (MET DST)
-From: Peter Eisentraut
-Reply-To: Peter Eisentraut
-To: Hiroshi Inoue
-cc: Tom Lane
, Bruce Momjian ,
- Jan Wieck ,
- Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=iso-8859-1
-Content-Transfer-Encoding: 8bit
-X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by candle.pha.pa.us id KAA06585
-Status: RO
-
-On Wed, 21 Jun 2000, Hiroshi Inoue wrote:
-
-> Peter seems to have the following idea(?? not sure)
-> 2) database = tablespace
-
-No, I thought that a database would have a table space assigned that would
-serve as the default for newly created tables, but could be overridden. So
-you could group databases onto disks as you want, but a couple of
-particularly big/important/unimportant/etc tables from each database could
-be put on a different disk. At least this seems to be the most flexible
-and conceptually simple solution.
-
-Ideally, directories per database would go away, but then we'd have the
-system tables colliding, since those have the same oid in each database.
-But that's not really important. So essentially you'd have
-
- $PGDATA/base/tablespacesomething/database/tables
-
-In the default tablespace, "tablespacesomething" is an ordinary directory,
-for other tablespaces it symlinks somewhere else. For those browsing
-$PGDATA/base, it all looks the same (unless you have colour ls). For those
-browsing the actual storage location it looks like
-/var/foo/elsewhere/database/tables.
-
-I'm sure you can squeeze the extension segments in there, maybe between
-tablespace and database.
-
-What I think Bruce is saying is that there should be both database spaces
-and table spaces, I think that's too much.
-
-> My opinion
-> 3) database and tablespace are relatively irrelevant.
-> I assume PostgreSQL's database would correspond
-> to the concept of SCHEMA.
-
-A database corresponds to a catalog and a schema corresponds to nothing
-yet.
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA06582;
- Wed, 21 Jun 2000 10:01:08 -0400 (EDT)
-Received: from meryl.it.uu.se (
[email protected] [130.238.12.42]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id JAA04510; Wed, 21 Jun 2000 09:43:48 -0400 (EDT)
- by meryl.it.uu.se (8.8.5/8.8.5) with ESMTP id PAA20730;
- Wed, 21 Jun 2000 15:39:23 +0200 (MET DST)
-Received: from localhost (e99re41@localhost) by Ulv.DoCS.UU.SE (8.6.12/8.6.12) with ESMTP id PAA10853; Wed, 21 Jun 2000 15:39:16 +0200
-X-Authentication-Warning: Ulv.DoCS.UU.SE: e99re41 owned process doing -bs
-Date: Wed, 21 Jun 2000 15:39:16 +0200 (MET DST)
-From: Peter Eisentraut
-Reply-To: Peter Eisentraut
-cc: Jan Wieck , Tom Lane ,
- Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=iso-8859-1
-Content-Transfer-Encoding: 8bit
-X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by candle.pha.pa.us id KAA06582
-Status: ROr
-
-On Tue, 20 Jun 2000, Bruce Momjian wrote:
-
-> What I was suggesting is not to catalog the symlink locations, but to
-> use lstat when dumping, so that admins can move files around using
-> symlinks and not have to update the database.
-
-That surely wouldn't make those happy that are calling for smgr
-abstraction.
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA08120;
- Wed, 21 Jun 2000 11:31:08 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA13232; Wed, 21 Jun 2000 11:08:38 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA04286;
- Wed, 21 Jun 2000 11:07:20 -0400 (EDT)
-cc: Hiroshi Inoue
, Peter Eisentraut ,
- Jan Wieck ,
- Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 21 Jun 2000 00:33:01 -0400"
-Date: Wed, 21 Jun 2000 11:07:20 -0400
-From: Tom Lane
-Status: ROr
-
-> Yes, agreed. I was thinking this:
-> CREATE TABLESPACE loc USING '/var/pgsql'
-> does:
-> ln -s /var/pgsql/dbname/loc data/base/dbname/loc
-> In this way, the database has a view of its main directory, plus a /loc
-> subdirectory for the tablespace. In the other location, we have
-> /var/pgsql/dbname/loc because this allows different databases to use:
-> CREATE TABLESPACE loc USING '/var/pgsql'
-> and they do not collide with each other in /var/pgsql.
-
-But they don't collide anyway, because the dbname is already unique.
-Isn't the extra subdirectory a waste?
-
-Because table files will have installation-wide unique names, there's
-no really good reason to have either level of subdirectory; you could
-just make
- CREATE TABLESPACE loc USING '/var/pgsql'
-do
- ln -s /var/pgsql data/base/dbname/loc
-and it'd still work even if multiple DBs were using the same tablespace.
-
-However, forcing creation of a subdirectory does give you the chance to
-make sure the subdir is owned by postgres and has the right permissions,
-so there's something to be said for that. It might be reasonable to do
- mkdir /var/pgsql/dbname
- chmod 700 /var/pgsql/dbname
- ln -s /var/pgsql/dbname data/base/dbname/loc
-
- regards, tom lane
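
The three shell steps above (mkdir, chmod 700, ln -s) can be sketched in Python. This is only an illustration under stated assumptions: the function name link_tablespace and its parameters are hypothetical, the filesystem is assumed to be POSIX with symlink support, and nothing here is taken from the PostgreSQL source.

```python
import os
import stat
import tempfile


def link_tablespace(location, data_dir, dbname, locname):
    """Create location/dbname with mode 0700 and symlink it as
    data_dir/base/dbname/locname (hypothetical helper)."""
    target = os.path.join(location, dbname)
    os.makedirs(target, exist_ok=True)
    # chmod explicitly: the mode passed to makedirs is subject to the umask.
    os.chmod(target, 0o700)

    db_dir = os.path.join(data_dir, "base", dbname)
    os.makedirs(db_dir, exist_ok=True)

    link = os.path.join(db_dir, locname)
    os.symlink(target, link)
    return link


if __name__ == "__main__":
    # Run inside a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as root:
        link = link_tablespace(
            os.path.join(root, "var_pgsql"),
            os.path.join(root, "data"),
            "dbname",
            "loc",
        )
        print(link, "->", os.readlink(link))
```

Because the subdirectory is created by the same code that makes the link, ownership and permissions can be checked at creation time, which is the advantage argued for above.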
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA08135;
- Wed, 21 Jun 2000 11:31:09 -0400 (EDT)
-Received: from huey.jpl.nasa.gov (huey.jpl.nasa.gov [128.149.68.100]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA15864; Wed, 21 Jun 2000 11:30:06 -0400 (EDT)
-Received: from golem.jpl.nasa.gov (hectic-1 [128.149.68.203])
- by huey.jpl.nasa.gov (8.8.8+Sun/8.8.8) with ESMTP id IAA02881;
- Wed, 21 Jun 2000 08:26:40 -0700 (PDT)
-Received: from alumni.caltech.edu (localhost.localdomain [127.0.0.1])
- by golem.jpl.nasa.gov (Postfix) with ESMTP
- id AB8AE6F51; Wed, 21 Jun 2000 15:27:36 +0000 (UTC)
-Date: Wed, 21 Jun 2000 15:27:36 +0000
-From: Thomas Lockhart
-Organization: Yes
-X-Mailer: Mozilla 4.7 [en] (X11; I; Linux 2.2.14-15mdksmp i686)
-X-Accept-Language: en
-MIME-Version: 1.0
-Cc: Peter Eisentraut
, Jan Wieck ,
- Tom Lane , Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: RO
-
-> Sorry, disagree. Environment variables are a pain to administer, and
-> quite counter-intuitive.
-
-Well, I guess we disagree. But until we have a complete proposed
-solution, we should leave environment variables on the table, since they
-*do* allow some decoupling of logical and physical storage, and *do*
-give the administrator some control over resources *that the admin would
-not otherwise have*.
-
-> > istm that from a portability and evolutionary standpoint OID-only
-> > file names (or at least file names *not* based on relation/class
-> > names) is a requirement.
-> Maybe a requirement at some point for some installations, but I hope
-> not a general requirement.
-
-If a table name can have characters which are not legal for file names,
-then how would you propose to support it? If we are doing a
-restructuring of the storage scheme, this should be taken into account.
-
-lockhart=# create table "one/two" (i int);
-ERROR: cannot create one/two
-
-Why not? It demonstrates an unfortunate linkage between file systems and
-database resources.
-
- - Thomas
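
The linkage shown by the "one/two" example can be illustrated with a small sketch. Both helper names below are hypothetical, and the check covers only the two characters POSIX forbids in a file name component (slash and NUL); a real storage manager would differ.

```python
import os


def name_based_file(db_dir, relname):
    """Map a relation name directly to a file name (hypothetical).

    Fails for names that are legal SQL identifiers but not legal
    file name components.
    """
    if "/" in relname or "\0" in relname:
        raise ValueError(f"cannot create {relname}")
    return os.path.join(db_dir, relname)


def oid_based_file(db_dir, rel_oid):
    """Map a numeric relation OID to a file name (hypothetical).

    The relation's name never reaches the file system, so any
    identifier the database accepts is usable.
    """
    return os.path.join(db_dir, str(int(rel_oid)))


if __name__ == "__main__":
    try:
        name_based_file("data/base/db1", "one/two")
    except ValueError as err:
        print("name-based:", err)
    print("oid-based:", oid_based_file("data/base/db1", 5938171))
```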
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA08164;
- Wed, 21 Jun 2000 11:31:12 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA15786; Wed, 21 Jun 2000 11:29:30 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA04451;
- Wed, 21 Jun 2000 11:28:09 -0400 (EDT)
-To: Thomas Lockhart
-cc: Bruce Momjian
, Peter Eisentraut ,
- Jan Wieck , Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Thomas Lockhart
- message dated "Wed, 21 Jun 2000 05:19:29 -0000"
-Date: Wed, 21 Jun 2000 11:28:09 -0400
-From: Tom Lane
-Status: RO
-
-Thomas Lockhart writes:
-> Well, as y'all have noticed, I think there are strong reasons to use
-> environment variables to manage locations, and that symlinks are a
-> potential portability and robustness problem.
-
-Reasons? Evidence?
-
-> An additional point which has relevance to this whole discussion:
-> In the future we may allow system resources such as tables to carry names
-> which use multi-byte encodings. afaik these encodings are not allowed to
-> be used for physical file names, and even if they were the utility of
-> using standard operating system utilities like ls goes way down.
-
-Good point, although in one sense a string is a string --- as long as
-we don't allow embedded nulls in server-side encodings, we could use
-anything that Postgres thought was a name in a filename, and the OS
-should take it. But if your local ls doesn't show it the way you see
-in Postgres, the usefulness of having the tablename in the filename
-goes way down.
-
-> istm that from a portability and evolutionary standpoint OID-only file
-> names (or at least file names *not* based on relation/class names) is a
-> requirement.
-
-No argument from me ;-). I've been looking for compromise positions
-but I still think that pure numeric filenames are the cleanest solution.
-
-There's something else that should be taken into account: for WAL, the
-log will need to record the table file that each insert/delete/update
-operation affects. To do that with the smgr-token-is-a-pathname
-approach I was suggesting yesterday, I think you have to record the
-database name and pathname in each WAL log entry. That's 64 bytes/log
-entry which is a *lot*. If we bit the bullet and restricted ourselves
-to numeric filenames then the log would need just four numeric values:
- database OID
- tablespace OID
- relation OID
- relation version number
-(this set of 4 values would also be an smgr file reference token).
-16 bytes/log entry looks much better than 64.
-
-At the moment I can recall the following opinions:
-
-Pure OID filenames: Thomas, Tom, Marc, Peter E.
-
-OID+relname filenames: Bruce
-
-Vadim was in the pure-OID camp a few months ago, but I won't presume
-to list him there now since he hasn't been involved in this most
-recent round of discussions. I'm not sure where anyone else stands...
-but at least in terms of the core group it's pretty clear where the
-majority opinion is.
-
- regards, tom lane
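
The size comparison above (16 bytes versus 64 bytes per log entry) can be checked with a short sketch. The layout is an assumption made for illustration: four unsigned 32-bit values for the numeric token, and two fixed 32-byte name fields for the name-based token, chosen so that the total matches the 64 bytes quoted. The function names and the byte order are hypothetical and do not describe any actual WAL record format.

```python
import struct

NAME_FIELD = 32  # assumed width of one fixed-size name field, in bytes


def pack_oid_token(db_oid, tablespace_oid, rel_oid, rel_version):
    """Pack the four numeric values as unsigned 32-bit integers."""
    return struct.pack("<4I", db_oid, tablespace_oid, rel_oid, rel_version)


def pack_name_token(dbname, pathname):
    """Pack database name and pathname as two NUL-padded fixed fields."""
    return struct.pack(
        f"<{NAME_FIELD}s{NAME_FIELD}s", dbname.encode(), pathname.encode()
    )


if __name__ == "__main__":
    oid_token = pack_oid_token(18720, 5938171, 8583727, 1)
    name_token = pack_name_token("db1", "loc/table1")
    print(len(oid_token), "bytes for the numeric token")
    print(len(name_token), "bytes for the name-based token")
```

Under these assumptions the numeric token is a quarter of the size, and it also has a fixed width regardless of how long table or tablespace names become.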
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA09021;
- Wed, 21 Jun 2000 11:51:38 -0400 (EDT)
-Received: from www.wgcr.org (IDENT:
[email protected] [206.74.232.194]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA18613; Wed, 21 Jun 2000 11:51:48 -0400 (EDT)
-Received: from wgcr.org ([206.74.232.197])
- by www.wgcr.org (8.9.3/8.9.3/WGCR) with ESMTP id LAA19124;
- Wed, 21 Jun 2000 11:48:25 -0400
-Date: Wed, 21 Jun 2000 11:48:19 -0400
-From: Lamar Owen
-X-Mailer: Mozilla 4.61 [en] (Win95; I)
-X-Accept-Language: en
-MIME-Version: 1.0
-To: Tom Lane
-CC: Thomas Lockhart ,
- Peter Eisentraut
, Jan Wieck ,
- Hiroshi Inoue ,
- Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: ROr
-
-Tom Lane wrote:
-
-> Thomas Lockhart writes:
-> > Well, as y'all have noticed, I think there are strong reasons to use
-> > environment variables to manage locations, and that symlinks are a
-> > potential portability and robustness problem.
-
-> Reasons? Evidence?
-
-Does Win32 do symlinks these days? I know Win32 does envvars, and Win32
-is currently a supported platform.
-
-I'm not thrilled with either solution -- envvars have their problems
-just as surely as symlinks do.
-
-> At the moment I can recall the following opinions:
-
-> Pure OID filenames: Thomas, Tom, Marc, Peter E.
-
-FWIW, count me here. I have tried administering my system using the
-filenames -- and have been bitten. Better admin tools in the PostgreSQL
-package beat using standard filesystem tools -- the PostgreSQL tools can
-be WAL-aware, transaction-aware, and can provide consistent results.
-Filesystem tools never will be able to provide consistent results for a
-database system that must remain up 24x7, as many if not most PostgreSQL
-installations must.
-
-> OID+relname filenames: Bruce
-
-Sorry Bruce -- I understand and am sympathetic to your position, and, at
-one time, I agreed with it. But not any more.
-
---
-Lamar Owen
-WGCR Internet Radio
-1 Peter 4:11
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA09885
- for
; Wed, 21 Jun 2000 12:10:04 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA04789;
- Wed, 21 Jun 2000 12:10:15 -0400 (EDT)
-cc: Hiroshi Inoue
, Peter Eisentraut ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 21 Jun 2000 11:45:12 -0400"
-Date: Wed, 21 Jun 2000 12:10:15 -0400
-From: Tom Lane
-Status: ROr
-
-> Yes, that is true. My idea is that they may want to create loc1 and
-> loc2 which initially point to the same location, but later may be moved.
-> For example, one tablespace for tables, another for indexes. They may
-> initially point to the same directory, but later be split.
-
-Well, that opens up a completely different issue, which is what about
-moving tables from one tablespace to another?
-
-I think the way you appear to be implying above (shut down the server
-so that you can rearrange subdirectories by hand) is the wrong way to
-go about it. For one thing, lots of people don't want to shut down
-their servers completely for that long, but it's difficult to avoid
-doing so if you want to move files by filesystem commands. For another
-thing, the above approach requires guessing in advance --- maybe long
-in advance --- how you are going to want to repartition your database
-when it gets too big for your existing storage.
-
-The right way to address this problem is to invent a "move table to
-new tablespace" command. This'd be pretty trivial to implement based
-on a file-versioning approach: the new version of the pg_class tuple
-has a new tablespace identifier in it.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA10371
- for
; Wed, 21 Jun 2000 12:30:41 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA22315 for
; Wed, 21 Jun 2000 12:23:18 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5LGJU175424;
- Wed, 21 Jun 2000 12:19:30 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5LGJJ175359
- for
; Wed, 21 Jun 2000 12:19:19 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA04878;
- Wed, 21 Jun 2000 12:17:38 -0400 (EDT)
-cc: Lamar Owen ,
- Thomas Lockhart ,
- Peter Eisentraut
, Jan Wieck ,
- Hiroshi Inoue ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 21 Jun 2000 12:03:12 -0400"
-Date: Wed, 21 Jun 2000 12:17:37 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
->> Sorry Bruce -- I understand and am sympathetic to your position, and, at
->> one time, I agreed with it. But not any more.
-
-> I thought the most recent proposal was to just throw ~16 chars of the
-> file name on the end of the file name, and that should not be used for
-> anything except visibility. WAL would not need to store that. It could
-> just grab the file name that matches the oid/sequence number.
-
-But that's extra complexity in WAL, plus extra complexity in renaming
-tables (if you want the filename to track the logical table name, which
-I expect you would), plus extra complexity in smgr and bufmgr and other
-places.
-
-I think people are coming around to the notion that it's better to keep
-these low-level operations simple, even if we need to expend more work
-on high-level admin tools as a result.
-
-But we do need to remember to expend that effort on tools! Let's not
-drop the ball on that, folks.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA10364
- for
; Wed, 21 Jun 2000 12:30:38 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA22593 for
; Wed, 21 Jun 2000 12:25:58 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA04944;
- Wed, 21 Jun 2000 12:24:44 -0400 (EDT)
-cc: Hiroshi Inoue
, Peter Eisentraut ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 21 Jun 2000 12:14:59 -0400"
-Date: Wed, 21 Jun 2000 12:24:44 -0400
-From: Tom Lane
-Status: ROr
-
->> Well, that opens up a completely different issue, which is what about
->> moving tables from one tablespace to another?
-
-> Are you suggesting that doing dbname/locname is somehow harder to do
-> that? If you are, I don't understand why.
-
-It doesn't make it harder, but it still seems pointless to have the
-extra directory level. Bear in mind that if we go with all-OID
-filenames then you're not going to be looking at "loc1" and "loc2"
-anyway, but at "5938171" and "8583727". It's not much of a convenience
-to the admin to see that, so we might as well save a level of directory
-lookup.
-
-> The general issue of moving tables between tablespaces can be done from
-> in the database. I don't think it is reasonable to shut down the db to
-> do that. However, I can see moving tablespaces to different symlinked
-> locations may require a shutdown.
-
-Only if you insist on doing it outside the database using filesystem
-tools. Another way is to create a new tablespace in the desired new
-location, then move the tables one-by-one to that new tablespace.
-
-I suppose either one might be preferable depending on your access
-patterns --- locking your most critical tables while they're being moved
-might be as bad as a total shutdown.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA11366
- for
; Wed, 21 Jun 2000 13:01:05 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id MAA24726 for
; Wed, 21 Jun 2000 12:47:50 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA05112;
- Wed, 21 Jun 2000 12:46:34 -0400 (EDT)
-cc: Hiroshi Inoue
, Peter Eisentraut ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 21 Jun 2000 12:40:35 -0400"
-Date: Wed, 21 Jun 2000 12:46:34 -0400
-From: Tom Lane
-Status: ROr
-
->>>> Are you suggesting that doing dbname/locname is somehow harder to do
->>>> that? If you are, I don't understand why.
->>
->> It doesn't make it harder, but it still seems pointless to have the
->> extra directory level. Bear in mind that if we go with all-OID
->> filenames then you're not going to be looking at "loc1" and "loc2"
->> anyway, but at "5938171" and "8583727". It's not much of a convenience
->> to the admin to see that, so we might as well save a level of directory
->> lookup.
-
-> Just seems easier to have stuff segregated into separate per-db
-> directories for clarity. Also, as directories get bigger, finding a
-> specific file in there becomes harder. Putting 10 databases all in the
-> same directory seems bad in this regard.
-
-Huh? I wasn't arguing against making a db-specific directory below the
-tablespace point. I was arguing against making *another* directory
-below that one.
-
-> I don't think we want to be using
-> symlinks for tables if we can avoid it.
-
-Agreed, but where did that come from? None of these proposals mentioned
-symlinks for anything but directories, AFAIR.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA13233
- for
; Wed, 21 Jun 2000 14:31:13 -0400 (EDT)
-Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id OAA04201 for
; Wed, 21 Jun 2000 14:11:42 -0400 (EDT)
-Received: from regulus.student.UU.SE ([130.238.5.2]:34923 "EHLO
- regulus.its.uu.se") by merganser.its.uu.se with ESMTP
- id ; Wed, 21 Jun 2000 20:09:46 +0200
-Received: from peter (helo=localhost)
- by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
- id 134p2o-0000Uo-00; Wed, 21 Jun 2000 20:16:10 +0200
-Date: Wed, 21 Jun 2000 20:16:10 +0200 (CEST)
-To: Tom Lane
-cc: Bruce Momjian
, Hiroshi Inoue ,
- Jan Wieck ,
- "Ross J. Reedstrom" ,
- Don Baccus ,
-Subject: Re: [HACKERS] Big 7.1 open items
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=ISO-8859-1
-Content-Transfer-Encoding: 8BIT
-Sender: Peter Eisentraut
-Status: ROr
-
-Tom Lane writes:
-
-> I think Peter was holding out for storing purely numeric tablespace OID
-> and table version in pg_class and having a hardwired mapping to pathname
-> somewhere in smgr. However, I think that doing it that way gains only
-> micro-efficiency compared to passing a "name" around, while using the
-> name approach buys us flexibility that's needed for at least some of
-> the variants under discussion.
-
-But that name can only be a dozen or so characters, contain no slash or
-other funny characters, etc. That's really poor. Then the alternative is
-to have an internal name and an external canonical name. Then you have two
-names to worry about. Also consider that when you store both the table
-space oid and the internal name in pg_class you create redundant data.
-What if you rename the table space? Do you leave the internal name out of
-sync? Then what good is the internal name? I'm just concerned that we are
-creating at the table space level problems similar to that we're trying to
-get rid of at the relation and database level.
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA24147
- for
; Wed, 21 Jun 2000 18:14:18 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id RAA24649 for
; Wed, 21 Jun 2000 17:40:59 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA06031;
- Wed, 21 Jun 2000 17:39:38 -0400 (EDT)
-cc: Peter Eisentraut
, Hiroshi Inoue ,
- Jan Wieck ,
- "Ross J. Reedstrom" ,
- Don Baccus ,
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 21 Jun 2000 14:42:21 -0400"
-Date: Wed, 21 Jun 2000 17:39:38 -0400
-From: Tom Lane
-Status: RO
-
->> But that name can only be a dozen or so characters, contain no slash or
->> other funny characters, etc. That's really poor. Then the alternative is
->> to have an internal name and an external canonical name. Then you have two
->> names to worry about. Also consider that when you store both the table
->> space oid and the internal name in pg_class you create redundant data.
->> What if you rename the table space? Do you leave the internal name out of
->> sync? Then what good is the internal name? I'm just concerned that we are
->> creating at the table space level problems similar to that we're trying to
->> get rid of at the relation and database level.
-
-> Agreed. Having table spaces stored by directories named by oid just
-> seems very complicated for no reason.
-
-Huh? He just gave you two very good reasons: avoid Unix-derived
-limitations on the naming of tablespaces (and tables), and avoid
-problems with renaming tablespaces.
-
-I'm pretty much firmly back in the "OID and nothing but" camp.
-Or perhaps I should say "OID, file version, and nothing but",
-since we still need a version number to do CLUSTER etc.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07570;
- Wed, 21 Jun 2000 22:18:36 -0400 (EDT)
-Received: from sectorbase2.sectorbase.com ([208.48.122.131]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA29965; Wed, 21 Jun 2000 19:07:37 -0400 (EDT)
-Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
- id ; Wed, 21 Jun 2000 15:58:30 -0700
-Message-ID: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1>
-From: "Mikheev, Vadim"
-To: "'Tom Lane'" ,
- Thomas Lockhart
-
- Peter Eisentraut
- Hiroshi Inoue
- ,
- Bruce Momjian ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Wed, 21 Jun 2000 16:00:17 -0700
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2650.21)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: RO
-
-> If we bit the bullet and restricted ourselves to numeric filenames then
-> the log would need just four numeric values:
-> database OID
-> tablespace OID
-
-Is someone going to implement it for 7.1?
-
-> relation OID
-> relation version number
-
-I believe that we can avoid versions using WAL...
-
-> (this set of 4 values would also be an smgr file reference token).
-> 16 bytes/log entry looks much better than 64.
->
-> At the moment I can recall the following opinions:
->
-> Pure OID filenames: Thomas, Tom, Marc, Peter E.
-
-+ me.
-
-But what about LOCATIONs? I object to using the environment and think that
-locations must be stored in pg_control..?
-
-Vadim
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07573;
- Wed, 21 Jun 2000 22:18:38 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id TAA01857; Wed, 21 Jun 2000 19:37:04 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id IAA02627; Thu, 22 Jun 2000 08:35:27 +0900
-From: "Hiroshi Inoue"
-To: "Tom Lane"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "Bruce Momjian" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "Thomas Lockhart"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Thu, 22 Jun 2000 08:37:42 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-2022-jp"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Status: RO
-
-> -----Original Message-----
->
-> No argument from me ;-). I've been looking for compromise positions
-> but I still think that pure numeric filenames are the cleanest solution.
->
-> There's something else that should be taken into account: for WAL, the
-> log will need to record the table file that each insert/delete/update
-> operation affects. To do that with the smgr-token-is-a-pathname
-> approach I was suggesting yesterday, I think you have to record the
-> database name and pathname in each WAL log entry. That's 64 bytes/log
-> entry which is a *lot*. If we bit the bullet and restricted ourselves
-> to numeric filenames then the log would need just four numeric values:
-> database OID
-> tablespace OID
-
-I strongly object to keeping the tablespace OID in the smgr file reference
-token, though we have to keep it for another purpose, of course. I've
-mentioned many times that tablespace (where to store) info should be
-distinguished from *where it is stored* info. Generally a tablespace isn't
-sufficiently restrictive for this purpose: e.g. there was an idea about
-round-robin, and Oracle's tablespaces can have multiple files, etc.
-IMHO, it is misleading to use the tablespace OID as (a part of) the
-reference token.
-
-> relation OID
-> relation version number
-> (this set of 4 values would also be an smgr file reference token).
-> 16 bytes/log entry looks much better than 64.
->
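
The four-value file reference token quoted above can be sketched as a
fixed-size record. This is only an illustration: it assumes each value
is an unsigned 32-bit integer (which gives the 16 bytes mentioned in
the quote), and the names FILE_TOKEN, pack_file_token and
unpack_file_token are hypothetical, not anything from the backend.

```python
import struct

# Assumed layout: four unsigned 32-bit integers, network byte order.
# 4 values * 4 bytes = 16 bytes per token.
FILE_TOKEN = struct.Struct("!IIII")


def pack_file_token(database_oid, tablespace_oid, relation_oid, version):
    """Serialize the four numeric values into one fixed-size token."""
    return FILE_TOKEN.pack(database_oid, tablespace_oid, relation_oid, version)


def unpack_file_token(raw):
    """Recover (database OID, tablespace OID, relation OID, version)."""
    return FILE_TOKEN.unpack(raw)
```

A token built this way has a constant size regardless of how long the
database or table names are, which is the point of the size comparison
in the quoted text.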
-
-Regards.
-
-Hiroshi Inoue
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07540;
- Wed, 21 Jun 2000 22:18:11 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id UAA04100; Wed, 21 Jun 2000 20:15:09 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id JAA02691; Thu, 22 Jun 2000 09:14:15 +0900
-From: "Hiroshi Inoue"
-To: "Mikheev, Vadim"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "Bruce Momjian" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "'Tom Lane'" ,
- "Thomas Lockhart"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Thu, 22 Jun 2000 09:16:30 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1>
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Status: RO
-
-> -----Original Message-----
->
-> > If we bit the bullet and restricted ourselves to numeric filenames then
-> > the log would need just four numeric values:
-> > database OID
-> > tablespace OID
->
-> Is someone going to implement it for 7.1?
->
-> > relation OID
-> > relation version number
->
-> I believe that we can avoid versions using WAL...
->
-
-How do we reconstruct tables in place?
-Is the following right?
-1) save the content of the current table somewhere
-2) shrink the table and related indexes
-3) reload the saved (+ some filtering) content
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07553;
- Wed, 21 Jun 2000 22:18:15 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id UAA05872; Wed, 21 Jun 2000 20:44:21 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id JAA02750; Thu, 22 Jun 2000 09:43:31 +0900
-From: "Hiroshi Inoue"
-To: "Mikheev, Vadim"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "Bruce Momjian" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "'Tom Lane'" ,
- "Thomas Lockhart"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Thu, 22 Jun 2000 09:45:46 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2C@SECTORBASE1>
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Status: RO
-
-> -----Original Message-----
->
-> > > > relation version number
-> > >
-> > > I believe that we can avoid versions using WAL...
-> > >
-> >
-> > How to re-construct tables in place ?
-> > Is the following right ?
-> > 1) save the content of current table to somewhere
-> > 2) shrink the table and related indexes
-> > 3) reload the saved(+some filtering) content
->
-> Or - create tmp file and load with new content;
-> log "intent to relink table file";
-> relink table file; log "file is relinked".
->
-
-It seems to me that the whole content of the table should be
-logged before relinking or shrinking.
-Is my understanding right?
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07504
- for
; Wed, 21 Jun 2000 22:17:58 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id VAA07914 for
; Wed, 21 Jun 2000 21:23:22 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5M1It194420;
- Wed, 21 Jun 2000 21:18:55 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5M1Ig194334
- for
; Wed, 21 Jun 2000 21:18:43 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id KAA02808; Thu, 22 Jun 2000 10:12:45 +0900
-From: "Hiroshi Inoue"
-To: "Tom Lane"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "Bruce Momjian" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "Thomas Lockhart"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Thu, 22 Jun 2000 10:15:01 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-2022-jp"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Precedence: bulk
-Status: RO
-
-> -----Original Message-----
->
-> At the moment I can recall the following opinions:
->
-> Pure OID filenames: Thomas, Tom, Marc, Peter E.
->
-> OID+relname filenames: Bruce
->
-
-Please add my opinion to the list.
-
-Unique-id filename: Hiroshi
- (Unique-id is unrelated to OID/relname).
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07513
- for
; Wed, 21 Jun 2000 22:18:01 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id VAA08502 for
; Wed, 21 Jun 2000 21:33:13 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5M1QS107400;
- Wed, 21 Jun 2000 21:26:28 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5M1QA107223
- for
; Wed, 21 Jun 2000 21:26:10 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id KAA02831; Thu, 22 Jun 2000 10:25:11 +0900
-From: "Hiroshi Inoue"
-To: "Mikheev, Vadim"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "Bruce Momjian" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "'Tom Lane'" ,
- "Thomas Lockhart"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Thu, 22 Jun 2000 10:27:26 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2D@SECTORBASE1>
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Precedence: bulk
-Status: RO
-
-> -----Original Message-----
->
-> > > Or - create tmp file and load with new content;
-> > > log "intent to relink table file";
-> > > relink table file; log "file is relinked".
-> >
-> > It seems to me that whole content of the table should be
-> > logged before relinking or shrinking.
->
-> Why not just fsync tmp files?
->
-
-Probably I've misunderstood *relink*.
-Is *relink* different from *rename*?
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07492;
- Wed, 21 Jun 2000 22:17:51 -0400 (EDT)
-Received: from sectorbase2.sectorbase.com ([208.48.122.131]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id VAA08730; Wed, 21 Jun 2000 21:37:44 -0400 (EDT)
-Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
- id ; Wed, 21 Jun 2000 18:28:36 -0700
-Message-ID: <8F4C99C66D04D4118F580090272A7A23018C2F@SECTORBASE1>
-From: "Mikheev, Vadim"
-To: "'Hiroshi Inoue'"
- Peter Eisentraut
- Bruce Momjian
- ,
- PostgreSQL-development
- "Ross J. Reedstrom" ,
- "'Tom Lane'" ,
- Thomas Lockhart
-
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Wed, 21 Jun 2000 18:30:23 -0700
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2650.21)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: RO
-
-> > > > Or - create tmp file and load with new content;
-> > > > log "intent to relink table file";
-> > > > relink table file; log "file is relinked".
-> > >
-> > > It seems to me that whole content of the table should be
-> > > logged before relinking or shrinking.
-> >
-> > Why not just fsync tmp files?
-> >
->
-> Probably I've misunderstood *relink*.
-> Is *relink* different from *rename*?
-
-I meant something like this - link(table file, tmp2 file); fsync(tmp2 file);
-unlink(table file); link(tmp file, table file); fsync(table file);
-unlink(tmp file). We can do additional logging (with log flush) of these
-steps if required, postpone on-recovery redo of operations till the last
-relink log record / end of log / transaction abort, etc.
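
The relink sequence described above can be sketched with ordinary file
system calls. This is a minimal illustration, not backend code: the
function names relink and fsync_path are hypothetical, the WAL logging
steps are left out, and it assumes all three paths live on the same
file system so that hard links work.

```python
import os


def fsync_path(path):
    """Flush one file's data to disk by opening it and calling fsync."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def relink(table_file, tmp_file, tmp2_file):
    """Swap tmp_file in as table_file, keeping the old data under tmp2_file."""
    os.link(table_file, tmp2_file)   # second name for the old contents
    fsync_path(tmp2_file)
    os.unlink(table_file)            # old data is still reachable via tmp2_file
    os.link(tmp_file, table_file)    # new contents take over the table's name
    fsync_path(table_file)
    os.unlink(tmp_file)
```

After relink() returns, tmp2_file still names the old contents, so an
abort could link it back; removing it would be a separate step once
the transaction is known to have committed.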
-
-Vadim
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA10350
- for
; Wed, 21 Jun 2000 23:22:35 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA13743 for
; Wed, 21 Jun 2000 23:07:50 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id MAA03008; Thu, 22 Jun 2000 12:07:00 +0900
-From: "Hiroshi Inoue"
-To: "Mikheev, Vadim"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "'Tom Lane'" ,
- "Thomas Lockhart"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Thu, 22 Jun 2000 12:09:15 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C2F@SECTORBASE1>
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Status: RO
-
-> -----Original Message-----
->
-> > > > > Or - create tmp file and load with new content;
-> > > > > log "intent to relink table file";
-> > > > > relink table file; log "file is relinked".
-> > > >
-> > > > It seems to me that whole content of the table should be
-> > > > logged before relinking or shrinking.
-> > >
-> > > Why not just fsync tmp files?
-> > >
-> >
-> > Probably I've misunderstood *relink*.
-> > Is *relink* different from *rename*?
->
-> I meant something like this - link(table file, tmp2 file);
-> fsync(tmp2 file);
-> unlink(table file); link(tmp file, table file); fsync(table file);
-> unlink(tmp file).
-
-I see, the old file would be rolled back from the tmp2 file on abort.
-This would work on most platforms.
-But the cygwin port has a flaw: files cannot be unlinked
-if they are open. So *relink* may fail in some cases (including
-rollback cases).
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA10353
- for
; Wed, 21 Jun 2000 23:22:36 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id XAA14206 for
; Wed, 21 Jun 2000 23:16:26 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id XAA07099;
- Wed, 21 Jun 2000 23:14:50 -0400 (EDT)
-To: "Mikheev, Vadim"
-cc: Thomas Lockhart ,
- Peter Eisentraut
, Jan Wieck ,
- Hiroshi Inoue ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-In-reply-to: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1>
-References: <8F4C99C66D04D4118F580090272A7A23018C2B@SECTORBASE1>
-Comments: In-reply-to "Mikheev, Vadim"
- message dated "Wed, 21 Jun 2000 16:00:17 -0700"
-Date: Wed, 21 Jun 2000 23:14:50 -0400
-From: Tom Lane
-Status: RO
-
-"Mikheev, Vadim" writes:
->> relation OID
->> relation version number
-
-> I believe that we can avoid versions using WAL...
-
-I don't think so. You're basically saying that
- 1. create file 'new'
- 2. delete file 'old'
- 3. rename 'new' to 'old'
-is safe as long as you have a redo log to ensure that the rename
-happens even if you crash between steps 2 and 3. But crash is not
-the only hazard. What if step 3 just plain fails? Redo won't help.
-
-I'm having a hard time inventing really plausible examples, but a
-slightly implausible example is that someone chmod's the containing
-directory -w between steps 2 and 3. (Maybe it's not so implausible
-if you assume a crash after step 2 ... someone might have left the
-directory nonwritable while restoring the system.)
-
-If we use file version numbers, then the *only* thing needed to
-make a valid transition between one set of files and another is
-a commit of the update of pg_class that shows the new version number
-in the rel's pg_class tuple. The worst that can happen to you in
-a crash or other failure is that you are unable to get rid of the
-set of files that you don't want anymore. That might waste disk
-space but it doesn't leave the database corrupted.
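
The version-number scheme argued for above can be sketched as follows.
Everything here is an assumption made for illustration: the
"OID.version" file name format, the rel_path and rewrite_relation
names, and the plain dict standing in for the pg_class row that
records the current version.

```python
import os


def rel_path(directory, relation_oid, version):
    """Hypothetical naming scheme: <relation OID>.<version>."""
    return os.path.join(directory, f"{relation_oid}.{version}")


def rewrite_relation(directory, catalog, relation_oid, new_contents):
    """Write a new file version, switch the catalog, then clean up."""
    old_version = catalog[relation_oid]
    new_version = old_version + 1
    with open(rel_path(directory, relation_oid, new_version), "wb") as f:
        f.write(new_contents)
        f.flush()
        os.fsync(f.fileno())
    # The only step that makes the transition valid: the catalog update.
    catalog[relation_oid] = new_version
    try:
        os.unlink(rel_path(directory, relation_oid, old_version))
    except OSError:
        # Failing to remove the old file wastes disk space, nothing more.
        pass
    return new_version
```

No rename is needed at any point, so there is no step that can fail
after the old file is gone but before the new one has its final name.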
-
-> But what about LOCATIONs? I object using environment and think that
-> locations must be stored in pg_control..?
-
-I don't like environment variables for this either; it's just way too
-easy to start the postmaster with wrong environment. It still seems
-to me that relying on subdirectory symlinks is a good way to go.
-pg_control is not so good --- if it gets corrupted, how do you recover?
-symlinks can be recreated by hand if necessary, but...
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA22245
- for
; Thu, 22 Jun 2000 01:01:02 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA18310 for
; Thu, 22 Jun 2000 00:43:00 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5M3US167109;
- Wed, 21 Jun 2000 23:30:28 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5M3U0164115
- for
; Wed, 21 Jun 2000 23:30:00 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id XAA07156;
- Wed, 21 Jun 2000 23:27:10 -0400 (EDT)
-To: "Hiroshi Inoue"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "Thomas Lockhart"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to "Hiroshi Inoue"
- message dated "Thu, 22 Jun 2000 10:15:01 +0900"
-Date: Wed, 21 Jun 2000 23:27:10 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-"Hiroshi Inoue" writes:
-> Please add my opinion to the list.
-> Unique-id filename: Hiroshi
-> (Unique-id is irrelevant to OID/relname).
-
-"Unique ID" is more or less equivalent to "OID + version number",
-right?
-
-I was trying earlier to convince myself that a single unique-ID value
-would be better than OID+version for the smgr interface, because it'd
-certainly be easier to pass around. I failed to convince myself though,
-and the thing that bothered me was this. Suppose you are trying to
-recover a corrupted database manually, and the only information you have
-about which table is which is a somewhat out-of-date listing of OIDs
-versus table names. (Maybe it's out of date because you got it from
-your last backup tape.) If the files are named OID+version you're not
-going to have much trouble seeing which is which, even if some of the
-versions are higher than what was on the tape. But if version-updated
-tables are given entirely new unique IDs, you've got no hope at all of
-telling which one corresponds to what you had in the listing. Maybe
-you can tell by looking through the physical file contents, but
-certainly this way is more fragile from the point of view of data
-recovery.
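
The recovery scenario above can be sketched like this. It assumes a
hypothetical "OID.version" file name format and a stale OID-to-name
listing held in a dict; identify_files is an illustrative name only.

```python
def identify_files(filenames, stale_listing):
    """Map each OID.version file to a table name from an old listing.

    Files whose OID is missing from the listing map to None; names that
    do not look like OID.version are skipped.
    """
    result = {}
    for name in filenames:
        oid_text, _, version_text = name.partition(".")
        if not (oid_text.isdigit() and version_text.isdigit()):
            continue
        # The version may be newer than on the backup; the OID still matches.
        result[name] = stale_listing.get(int(oid_text))
    return result
```

With entirely new unique IDs per rewrite there would be no stable key
to look up in the old listing, which is the fragility described above.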
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA22232;
- Thu, 22 Jun 2000 01:00:59 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id AAA17842; Thu, 22 Jun 2000 00:31:06 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id AAA07254;
- Thu, 22 Jun 2000 00:29:42 -0400 (EDT)
-To: "Hiroshi Inoue"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "Bruce Momjian" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "Thomas Lockhart"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to "Hiroshi Inoue"
- message dated "Thu, 22 Jun 2000 08:37:42 +0900"
-Date: Thu, 22 Jun 2000 00:29:42 -0400
-From: Tom Lane
-Status: RO
-
-"Hiroshi Inoue" writes:
-> I strongly object to keep tablespace OID for smgr file reference token
-> though we have to keep it for another purpose of course. I've mentioned
-> many times tablespace(where to store) info should be distinguished from
-> *where it is stored* info.
-
-Sure. But this proposal assumes that we're relying on symlinks to
-carry the information about physical locations corresponding to
-tablespace OIDs. The backend just needs to know enough to access a
-relation file at a relative pathname like
- tablespaceOID/relationOID
-(ignoring version and segment numbers for now). Under the hood,
-a symlink for tablespaceOID gets the work done.
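
The symlink arrangement described above can be sketched as follows.
The helper names create_tablespace_link and relation_path are
hypothetical, and version and segment numbers are ignored here just as
in the text.

```python
import os


def create_tablespace_link(data_dir, tablespace_oid, physical_location):
    """Create data_dir/<tablespace OID> as a symlink to the real location."""
    link = os.path.join(data_dir, str(tablespace_oid))
    os.symlink(physical_location, link, target_is_directory=True)
    return link


def relation_path(data_dir, tablespace_oid, relation_oid):
    """Relative-style path the backend would open: tablespaceOID/relationOID."""
    return os.path.join(data_dir, str(tablespace_oid), str(relation_oid))
```

Code that opens relation_path() never needs to know the physical
location; moving a tablespace would only mean recreating the symlink.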
-
-Certainly this is not a perfect mechanism. But it is simple, it
-is reliable, it is portable to most of the platforms we care about
-(yeah, I know we have a Win port, but you wouldn't ever recommend
-someone to run a *serious* database on it would you?), and in general
-I think the bang-for-the-buck ratio is enormous. I do not want to
-have to deal with explicit tablespace bookkeeping in the backend,
-but that seems like what we'd have to do in order to improve on
-symlinks.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24025
- for
; Thu, 22 Jun 2000 02:01:02 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA21392 for
; Thu, 22 Jun 2000 01:56:49 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5M5jp143149;
- Thu, 22 Jun 2000 01:45:51 -0400 (EDT)
-Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5M5jT143025
- for
; Thu, 22 Jun 2000 01:45:29 -0400 (EDT)
-Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
- by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id WAA11735;
- Wed, 21 Jun 2000 22:44:28 -0700 (PDT)
-X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
-Date: Wed, 21 Jun 2000 22:41:22 -0700
-To: Chris Bitmead ,
-From: Don Baccus
-Subject: Re: [HACKERS] Big 7.1 open items
-Cc: PostgreSQL-development
-Mime-Version: 1.0
-Content-Type: text/plain; charset="us-ascii"
-Precedence: bulk
-Status: RO
-
-At 01:43 PM 6/22/00 +1000, Chris Bitmead wrote:
-
->I'm wondering if pg_dump should store the location of the tablespace. If
->your machine dies, you get a new machine to re-create the database, you
->may not want the tablespace in the same spot. And text-editing a
->gigabyte file would be extremely painful.
-
-So you don't dump your create tablespace statements, recognizing that on
-a new machine (due to upgrades or crashing) you might assign them to
-different directories/mount points/whatever. That's the reason for
-wanting to hide physical allocation in tablespaces ... the rest of
-your datamodel doesn't need to know.
-
-Or you do dump your tablespaces, and knowing the paths assigned
-to various ones set up your new machine accordingly.
-
-
-
-- Don Baccus, Portland OR
- Nature photos, on-line guides, Pacific Northwest
- Rare Bird Alert Service and other goodies at
- http://donb.photo.net.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24005
- for
; Thu, 22 Jun 2000 02:00:58 -0400 (EDT)
-Received: from smtp.pacifier.com (comet.pacifier.com [199.2.117.155]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA21369 for
; Thu, 22 Jun 2000 01:56:18 -0400 (EDT)
-Received: from desktop (dsl-dhogaza.pacifier.net [207.202.226.68])
- by smtp.pacifier.com (8.9.3/8.9.3pop) with SMTP id WAA12121;
- Wed, 21 Jun 2000 22:55:39 -0700 (PDT)
-X-Mailer: Windows Eudora Pro Version 3.0.1 (32)
-Date: Wed, 21 Jun 2000 22:51:49 -0700
- Chris Bitmead
-From: Don Baccus
-Subject: Re: [HACKERS] Big 7.1 open items
-Cc: PostgreSQL-development
-Mime-Version: 1.0
-Content-Type: text/plain; charset="us-ascii"
-Status: RO
-
-At 12:03 AM 6/22/00 -0400, Bruce Momjian wrote:
-
->If the symlink create fails in CREATE TABLESPACE, it just creates an
->ordinary directory.
-
-Silent surprises - the earmark of truly professional software ...
-
-
-
-- Don Baccus, Portland OR
- Nature photos, on-line guides, Pacific Northwest
- Rare Bird Alert Service and other goodies at
- http://donb.photo.net.
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA24009
- for
; Thu, 22 Jun 2000 02:00:59 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id BAA21277 for
; Thu, 22 Jun 2000 01:54:44 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id OAA03303; Thu, 22 Jun 2000 14:53:52 +0900
-From: "Hiroshi Inoue"
-To: "Tom Lane"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "Thomas Lockhart"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Thu, 22 Jun 2000 14:56:07 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-2022-jp"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Status: RO
-
-> -----Original Message-----
->
-> "Hiroshi Inoue" writes:
-> > I strongly object to keep tablespace OID for smgr file reference token
-> > though we have to keep it for another purpose of course. I've mentioned
-> > many times tablespace(where to store) info should be distinguished from
-> > *where it is stored* info.
->
-> Sure. But this proposal assumes that we're relying on symlinks to
-> carry the information about physical locations corresponding to
-> tablespace OIDs. The backend just needs to know enough to access a
-> relation file at a relative pathname like
-> tablespaceOID/relationOID
-> (ignoring version and segment numbers for now). Under the hood,
-> a symlink for tablespaceOID gets the work done.
->
-
-I think tablespaceOID is an easy substitute for the purpose.
-I don't like depending on a poor directory tree structure in a DBMS
-either.
-
-> Certainly this is not a perfect mechanism. But it is simple, it
-> is reliable, it is portable to most of the platforms we care about
-> (yeah, I know we have a Win port, but you wouldn't ever recommend
-> someone to run a *serious* database on it would you?), and in general
-> I think the bang-for-the-buck ratio is enormous. I do not want to
-> have to deal with explicit tablespace bookkeeping in the backend,
-> but that seems like what we'd have to do in order to improve on
-> symlinks.
->
-
-I've already mentioned this 10 times or so, but unfortunately
-I see no one on my side yet.
-OK, I've given up on the discussion about it. I don't want to waste
-any more of my time.
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA28813
- for
; Thu, 22 Jun 2000 03:31:03 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA23901 for
; Thu, 22 Jun 2000 03:06:47 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA07725;
- Thu, 22 Jun 2000 03:05:00 -0400 (EDT)
-To: Chris Bitmead
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Chris Bitmead
- message dated "Thu, 22 Jun 2000 13:43:56 +1000"
-Date: Thu, 22 Jun 2000 03:05:00 -0400
-From: Tom Lane
-Status: RO
-
-Chris Bitmead writes:
-> I'm wondering if pg_dump should store the location of the tablespace. If
-> your machine dies, you get a new machine to re-create the database, you
-> may not want the tablespace in the same spot. And text-editing a
-> gigabyte file would be extremely painful.
-
-Might make sense to store the tablespace setup separately from the bulk
-of the data, but certainly you want some way to dump that info in a
-restorable form.
-
-I've been thinking lately that the pg_dump shove-it-all-in-one-file
-approach doesn't scale anyway. We ought to start thinking about ways
-to make the standard dump method store schema separately from bulk
-data, for example. That's offtopic for this thread but ought to be
-on the TODO list someplace...
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id DAA28819
- for
; Thu, 22 Jun 2000 03:31:05 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA24751 for
; Thu, 22 Jun 2000 03:29:00 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5M7KP140211;
- Thu, 22 Jun 2000 03:20:25 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5M7Jb139991
- for
; Thu, 22 Jun 2000 03:19:37 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id DAA07785;
- Thu, 22 Jun 2000 03:17:45 -0400 (EDT)
-cc: "Hiroshi Inoue" ,
- "Peter Eisentraut"
, "Jan Wieck" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "Thomas Lockhart"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to "Philip J. Warner"
- message dated "Thu, 22 Jun 2000 16:31:33 +1000"
-Date: Thu, 22 Jun 2000 03:17:45 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-"Philip J. Warner" writes:
->> ... the thing that bothered me was this. Suppose you are trying to
->> recover a corrupted database manually, and the only information you have
->> about which table is which is a somewhat out-of-date listing of OIDs
->> versus table names.
-
-> This worries me a little; in the Dec/RDB world it is a very long time since
-> database backups were done by copying the files. There is a database
-> backup/restore utility which runs while the database is on-line and makes
-> sure a valid snapshot is taken. Backing up storage areas (table spaces)
-> can be done separately by the same utility, and again, it records enough
-> information to ensure integrity. Maybe the thing to do is write a pg_backup
-> utility, which in a first pass could, presumably, be synonymous with pg_dump?
-
-pg_dump already does the consistent-snapshot trick (it just has to run
-inside a single transaction).
-
-> Am I missing something here? Is there a problem with backing up using
-> 'pg_dump | gzip'?
-
-None, as long as your ambition extends no further than restoring your
-data to where it was at your last pg_dump. I was thinking about the
-all-too-common-in-the-real-world scenario where you're hoping to recover
-some data more recent than your last backup from the fractured shards
-of your database...
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA29525
- for
; Thu, 22 Jun 2000 05:01:09 -0400 (EDT)
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id EAA27070 for
; Thu, 22 Jun 2000 04:38:32 -0400 (EDT)
-Received: from peligor.server.lan.at (peligor.server.lan.at [10.8.32.84])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA23252;
- Thu, 22 Jun 2000 10:37:45 +0200
-Received: from zeus (totalctlh1-port029.f000.d0188.sd.spardat.at [10.8.35.226])
- by peligor.server.lan.at (8.9.1/8.9.1) with SMTP id KAA02457;
- Thu, 22 Jun 2000 10:41:04 GMT
-From: Zeugswetter Andreas SB
-To: Chris Bitmead ,
-Subject: Re: Big 7.1 open items
-Date: Thu, 22 Jun 2000 09:49:07 +0200
-X-Mailer: KMail [version 1.0.29.1]
-Content-Type: text/plain
-Cc: PostgreSQL-development
-MIME-Version: 1.0
-Message-Id: <00062210055400.00299@zeus>
-Content-Transfer-Encoding: 8bit
-Status: RO
-
-
-> > pg_dump would recreate a CREATE TABLESPACE command:
-> >
-> > printf("CREATE TABLESPACE %s USING %s", loc, symloc);
-> >
-> > where symloc would be SELECT symloc(loc) and return the value into a
-> > variable that is used by pg_dump. The backend would do the lstat() and
-> > return the value to the client.
->
-> I'm wondering if pg_dump should store the location of the tablespace. If
-> your machine dies, you get a new machine to re-create the database, you
-> may not want the tablespace in the same spot. And text-editing a
-> gigabyte file would be extremely painful.
-
-Yes, that seems like a valid concern that should be kept in mind.
-It should also be possible to restore a pg instance to a different location
-on the same machine.
-Maybe this could be done by adding a utility that dumps all tablespace
-info, which could then be altered as desired.
-
-I still opt for instance-wide tablespaces. People wanting separation can easily
-create different tablespaces for each database, but those that only want to
-separate data and index need only create two tablespaces. A typical installation would
-have 1 to 4 tablespaces (systemtbs, datatbs, indextbs, toasttbs | lobdbs).
-
-I would also switch the directory structure between dbname and extent subdir,
-because that allows fewer symlinks/filesystems, and thus less admin.
-
-Thus you would have:
- tablespace1/extent1/dbname1
- tablespace1/extent2/dbname1
- tablespace1/extent1/dbname2
-
-Andreas
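A rough sketch of the pg_dump idea quoted above, written in Python rather than backend C: read where a tablespace symlink points and emit the proposed statement. The `CREATE TABLESPACE ... USING ...` text is the syntax floated in this thread, not an existing command, and `symloc`, `dump_tablespace` and the `remap` argument are hypothetical names. `remap` illustrates the concern about restoring to a different location without hand-editing a large dump.

```python
import os
import tempfile


def symloc(link_path):
    """Return the path a tablespace symlink points to (hypothetical helper)."""
    return os.readlink(link_path)


def dump_tablespace(name, link_path, remap=None):
    """Build the statement proposed in the thread for one tablespace.

    remap is an optional dict of old location -> new location, so a restore
    can place the tablespace elsewhere without editing the dump by hand.
    """
    location = symloc(link_path)
    if remap:
        location = remap.get(location, location)
    # Syntax as proposed in the thread; an assumption, not a real command.
    return "CREATE TABLESPACE %s USING %s" % (name, location)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        target = os.path.join(base, "disk2")
        os.mkdir(target)
        link = os.path.join(base, "datatbs")
        os.symlink(target, link)
        print(dump_tablespace("datatbs", link))
        print(dump_tablespace("datatbs", link, remap={target: "/new/disk"}))
```

Keeping the tablespace statements in a small separate file, as suggested above, would let an administrator edit locations directly instead of passing a remapping.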
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA29060
- for
; Thu, 22 Jun 2000 04:01:03 -0400 (EDT)
-Received: from acheron.rime.com.au (
[email protected] [139.130.54.222]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id DAA25604 for
; Thu, 22 Jun 2000 03:50:30 -0400 (EDT)
-Received: from oberon (Oberon.rime.com.au [203.8.195.100])
- by acheron.rime.com.au (8.9.3/8.9.3) with SMTP id RAA08811;
- Thu, 22 Jun 2000 17:43:22 +1000
-X-Mailer: QUALCOMM Windows Eudora Pro Version 3.0.5 (32)
-Date: Thu, 22 Jun 2000 17:50:15 +1000
-To: Tom Lane
-From: "Philip J. Warner"
-Subject: Re: [HACKERS] Big 7.1 open items
-Cc: "Hiroshi Inoue" ,
- "Peter Eisentraut"
, "Jan Wieck" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "Thomas Lockhart"
-Mime-Version: 1.0
-Content-Type: text/plain; charset="us-ascii"
-Status: RO
-
-At 03:17 22/06/00 -0400, Tom Lane wrote:
->
->> This worries me a little; in the Dec/RDB world it is a very long time since
->> database backups were done by copying the files. There is a database
->> backup/restore utility which runs while the database is on-line and makes
->> sure a valid snapshot is taken. Backing up storage areas (table spaces)
->> can be done separately by the same utility, and again, it records enough
->> information to ensure integrity. Maybe the thing to do is write a pg_backup
->> utility, which in a first pass could, presumably, be synonymous with
->> pg_dump?
->
->pg_dump already does the consistent-snapshot trick (it just has to run
->inside a single transaction).
->
->> Am I missing something here? Is there a problem with backing up using
->> 'pg_dump | gzip'?
->
->None, as long as your ambition extends no further than restoring your
->data to where it was at your last pg_dump. I was thinking about the
->all-too-common-in-the-real-world scenario where you're hoping to recover
->some data more recent than your last backup from the fractured shards
->of your database...
->
-
-pg_dump is a good basis for any pg_backup utility; perhaps, as you indicated
-elsewhere, more careful formatting of the dump files would make
-table-based restoration possible. In another response, I also suggested
-allowing overrides of placement information in a restore operation; the
-simplest approach would be an 'ignore-storage-parameters' flag. Does this
-sound reasonable? If so, then discussion of file-id based on OID need not
-be too concerned about how db restoration is done.
-
-
-
-
-
-----------------------------------------------------------------
-Philip Warner | __---_____
-Albatross Consulting Pty. Ltd. |----/ - \
-(A.C.N. 008 659 498) | /(@) ______---_
-Tel: (+61) 0500 83 82 81 | _________ \
-Fax: (+61) 0500 83 82 82 | ___________ |
-Http://www.rhyme.com.au | / \|
- | --________--
-PGP key available upon request, | /
-and from pgp5.ai.mit.edu:11371 |/
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA29741
- for
; Thu, 22 Jun 2000 05:31:00 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id FAA28478 for
; Thu, 22 Jun 2000 05:18:37 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5M96W171286;
- Thu, 22 Jun 2000 05:06:32 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5M96A168442
- for
; Thu, 22 Jun 2000 05:06:10 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id SAA03635; Thu, 22 Jun 2000 18:05:02 +0900
-From: "Hiroshi Inoue"
-Cc: "Tom Lane"
, "Bruce Momjian" ,
- "Jan Wieck" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Thu, 22 Jun 2000 18:07:18 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Precedence: bulk
-Status: RO
-
-> -----Original Message-----
->
-> > My opinion
-> > 3) database and tablespace are relatively irrelevant.
-> > I assume PostgreSQL's database would correspond
-> > to the concept of SCHEMA.
->
-> A database corresponds to a catalog and a schema corresponds to nothing
-> yet.
->
-
-Oh, I see your point. However, I've thought that current PostgreSQL's
-database is an incomplete SCHEMA, and I still feel so in reality.
-A catalog per database has seemed needless to me from
-the first.
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id HAA07559
- for
; Thu, 22 Jun 2000 07:31:00 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id HAA02741 for
; Thu, 22 Jun 2000 07:08:29 -0400 (EDT)
-Received: from cadzone ([126.0.1.40] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with SMTP
- id UAA03834; Thu, 22 Jun 2000 20:06:51 +0900
-From: "Hiroshi Inoue"
-To: "Tom Lane"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "Thomas Lockhart"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Thu, 22 Jun 2000 20:09:07 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-2022-jp"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-Importance: Normal
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Status: RO
-
-> -----Original Message-----
->
-> "Hiroshi Inoue" writes:
-> > Please add my opinion to the list.
-> > Unique-id filename: Hiroshi
-> > (Unique-id is irrelevant to OID/relname).
->
-> "Unique ID" is more or less equivalent to "OID + version number",
-> right?
->
-
-Hmm, no one seems to be on my side at this point either.
-OK, I change my mind as follows.
-
- OID except cygwin,unique-id on cygwin
-
-Regards.
-
-Hiroshi Inoue
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA10544
- for
; Thu, 22 Jun 2000 11:31:05 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id LAA23513 for
; Thu, 22 Jun 2000 11:28:53 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id LAA08851;
- Thu, 22 Jun 2000 11:27:30 -0400 (EDT)
-To: "Hiroshi Inoue"
- "Peter Eisentraut"
, "Jan Wieck" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom" ,
- "Thomas Lockhart"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to "Hiroshi Inoue"
- message dated "Thu, 22 Jun 2000 20:09:07 +0900"
-Date: Thu, 22 Jun 2000 11:27:30 -0400
-From: Tom Lane
-Status: RO
-
-"Hiroshi Inoue" writes:
-> OK, I change my mind as follows.
-> OID except cygwin,unique-id on cygwin
-
-We don't really want to do that, do we? That's a huge difference in
-behavior to have in just one port --- especially a port that none of
-the primary developers use (AFAIK anyway). The cygwin port's normal
-state of existence will be "broken", surely, if we go that way.
-
-Besides which, OID alone doesn't give us a possibility of file
-versioning, and as I commented to Vadim I think we will want that,
-WAL or no WAL. So it seems to me the two viable choices are
-unique-id or OID+version-number. Either way, the file-naming behavior
-should be the same across all platforms.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA11892
- for
; Thu, 22 Jun 2000 14:30:59 -0400 (EDT)
-Received: from sectorbase2.sectorbase.com ([208.48.122.131]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id OAA10107 for
; Thu, 22 Jun 2000 14:17:04 -0400 (EDT)
-Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
- id ; Thu, 22 Jun 2000 11:07:59 -0700
-Message-ID: <8F4C99C66D04D4118F580090272A7A23018C31@SECTORBASE1>
-From: "Mikheev, Vadim"
-To: "'Tom Lane'"
-Cc: Thomas Lockhart ,
- Bruce Momjian
- Peter Eisentraut
, Jan Wieck
- ,
- Hiroshi Inoue ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Thu, 22 Jun 2000 11:09:47 -0700
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2650.21)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: RO
-
-> > I believe that we can avoid versions using WAL...
->
-> I don't think so. You're basically saying that
-> 1. create file 'new'
-> 2. delete file 'old'
-> 3. rename 'new' to 'old'
-> is safe as long as you have a redo log to ensure that the rename
-> happens even if you crash between steps 2 and 3. But crash is not
-> the only hazard. What if step 3 just plain fails? Redo won't help.
-
-Ok, ok. Let's use a *unique* file name for each table version.
-But after thinking it over, it seems that I agree with Hiroshi about using
-*some unique id* for file names instead of oid+version: we could use
-just the DB's OID + this unique ID in log records to find the table file - just
-8 bytes.
-
-So, add me to Hiroshi's camp... if Hiroshi is ready to implement the new file
-naming -:)
-
-> > But what about LOCATIONs? I object using environment and think that
-> > locations must be stored in pg_control..?
->
-> I don't like environment variables for this either; it's just way too
-> easy to start the postmaster with wrong environment. It still seems
-> to me that relying on subdirectory symlinks is a good way to go.
-
-I always thought so.
-
-> pg_control is not so good --- if it gets corrupted, how do
-> you recover?
-
-Impossible to recover anyway - pg_control keeps last checkpoint pointer,
-required for recovery. That's why Oracle recommends (requires?) at least
-two copies of control file (and log too).
-But what if log gets corrupted? Or file system (lost symlinks etc)?
-One will have to use backup...
-
-Vadim
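The "just 8 bytes" remark in the message above can be checked with a small sketch: two unsigned 32-bit integers, the database OID and the per-file unique ID, packed back to back. The 32-bit width of the unique ID, the little-endian byte order and the helper names are assumptions made for illustration; this is not a log record format taken from the thread.

```python
import struct

# Two unsigned 32-bit integers in a fixed byte order: 4 + 4 = 8 bytes.
# The byte order and the 32-bit unique ID are assumptions for this sketch.
FILE_REF = struct.Struct("<II")


def pack_file_ref(db_oid, file_id):
    """Encode (database OID, unique file ID) as it might appear in a log record."""
    return FILE_REF.pack(db_oid, file_id)


def unpack_file_ref(data):
    """Decode the 8-byte reference back into (database OID, unique file ID)."""
    return FILE_REF.unpack(data)


if __name__ == "__main__":
    ref = pack_file_ref(18720, 40213)
    print(len(ref), unpack_file_ref(ref))
```

With OID+version naming, the same reference would need a relation OID and a version number in addition to the database OID, which is presumably the saving being pointed at.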
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA19684
- for
; Thu, 22 Jun 2000 18:37:34 -0400 (EDT)
-Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id SAA02841 for
; Thu, 22 Jun 2000 18:31:53 -0400 (EDT)
-Received: from regulus.student.UU.SE ([130.238.5.2]:37596 "EHLO
- regulus.its.uu.se") by merganser.its.uu.se with ESMTP
- id ; Fri, 23 Jun 2000 00:29:48 +0200
-Received: from peter (helo=localhost)
- by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
- id 135FaG-00062q-00; Fri, 23 Jun 2000 00:36:28 +0200
-Date: Fri, 23 Jun 2000 00:36:28 +0200 (CEST)
-To: Tom Lane
-cc: Hiroshi Inoue
, Bruce Momjian ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=ISO-8859-1
-Content-Transfer-Encoding: 8BIT
-Sender: Peter Eisentraut
-Status: RO
-
-Tom Lane writes:
-
-> In my mind the point of the "database" concept is to provide a domain
-> within which custom datatypes and functions are available.
-
-Quoth SQL99:
-
-"A user-defined type is a schema object"
-
-"An SQL-invoked routine is an element of an SQL-schema"
-
-I have yet to see anything in SQL that's a per-catalog object. Some things
-are global, like users, but everything else is per-schema.
-
-The way I see it is that schemas are required to be a logical hierarchy,
-whereas implementations may see catalogs as a physical division (as indeed
-this implementation does).
-
-> So I think we will still want "database" = "span of applicability of
-> system catalogs"
-
-Yes, because the system catalogs would live in a schema of their own.
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA29267
- for
; Mon, 26 Jun 2000 04:09:59 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA35550;
- Mon, 26 Jun 2000 10:09:14 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Mon, 26 Jun 2000 10:09:14 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C605BA598B@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Tom Lane'" , Hiroshi Inoue
- Peter Eisentraut
- PostgreSQL-development
,
- "Ross J. Reedstrom" ,
- Thomas Lockhart
-
-Subject: [HACKERS] File versioning (was: Big 7.1 open items)
-Date: Mon, 26 Jun 2000 10:09:13 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: RO
-
-
-> Besides which, OID alone doesn't give us a possibility of file
-> versioning, and as I commented to Vadim I think we will want that,
-> WAL or no WAL. So it seems to me the two viable choices are
-> unique-id or OID+version-number. Either way, the file-naming behavior
-> should be the same across all platforms.
-
-I do not think the only problem of a failing rename of "temp" to "new"
-on startup rollforward is issue enough to justify the additional complexity
-a version implies.
-Why not simply abort startup of the postmaster in such an event and let the
-DBA fix it? There can be no data loss.
-
-If, e.g., the permissions of the directory are insufficient, we will want to
-abort startup anyway, no?
-
-Andreas
-
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id FAA29616
- for
; Mon, 26 Jun 2000 05:32:03 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id LAA27288;
- Mon, 26 Jun 2000 11:31:08 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Mon, 26 Jun 2000 11:31:08 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C605BA598F@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Hiroshi Inoue'"
, Peter Eisentraut ,
- Tom Lane
-Cc: Bruce Momjian
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: AW: [HACKERS] Big 7.1 open items
-Date: Mon, 26 Jun 2000 11:31:06 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: RO
-
-
-> > > In my mind the point of the "database" concept is to provide a domain
-> > > within which custom datatypes and functions are available.
-> >
->
-> AFAIK few users understand it and many users have wondered
-> why we couldn't issue cross "database" queries.
-
-Imho the same issue is access to tables on another machine.
-If we "fix" that, access to another db on the same instance is just
-a variant of the above.
-
->
-> > Quoth SQL99:
-> >
-> > "A user-defined type is a schema object"
-> >
-> > "An SQL-invoked routine is an element of an SQL-schema"
-> >
-> > I have yet to see anything in SQL that's a per-catalog object. Some things
-> > are global, like users, but everything else is per-schema.
-
-Yes.
-
-> So why is a system catalog needed per "database"?
-
-I like to use different databases on a development machine,
-because it makes testing easier. The only thing that
-needs to be changed is the connect statement. All other statements
-including schema qualified tablenames stay exactly the same for
-each developer even though each has his own database,
-and his own version of functions.
-I have yet to see an installation that doesn't have at least one program
-that needs access to more than one schema.
-
-On production machines we (using Informix) use different databases
-for different products, because it reduces the possibility of accessing
-the wrong tables, since the syntax for accessing tables in other db's
-is different (dbname[@instancename]:"owner".tabname in Informix).
-The schema does not help us, since most of our programs access
-tables from more than one schema.
-
-And again someone wanting Oracle'ish behavior will only create one
-database per instance.
-
-Andreas
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id BAA08810
- for
; Mon, 3 Jul 2000 01:57:49 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e635u5S69222;
- Mon, 3 Jul 2000 01:56:05 -0400 (EDT)
-Received: from po.seiren.co.jp (po.seiren.co.jp [203.138.223.10])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5QA5d124120
- for
; Mon, 26 Jun 2000 06:05:41 -0400 (EDT)
-Received: from mcadnote1 ([210.161.188.23]) by po.seiren.co.jp
- (post.office MTA v1.9.3 ID# 0100012-16224) with SMTP id AAA59;
- Mon, 26 Jun 2000 19:04:51 +0900
-From: "Hiroshi Inoue"
-To: "Zeugswetter Andreas SB" ,
- "Peter Eisentraut"
, "Tom Lane"
-Cc: "Bruce Momjian"
, "Jan Wieck" ,
- "PostgreSQL-development"
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Mon, 26 Jun 2000 19:08:26 +0900
-Message-ID:
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="Windows-1252"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
-Importance: Normal
-In-Reply-To: <219F68D65015D011A8E000006F8590C605BA598F@sdexcsrv1.f000.d0188.sd.spardat.at>
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
-Precedence: bulk
-Status: RO
-
-> -----Original Message-----
-> From: Zeugswetter Andreas SB
->
-> > > > In my mind the point of the "database" concept is to provide a domain
-> > > > within which custom datatypes and functions are available.
-> > >
-> >
-> > AFAIK few users understand it and many users have wondered
-> > why we couldn't issue cross "database" queries.
->
-> Imho the same issue is access to tables on another machine.
-> If we "fix" that, access to another db on the same instance is just
-> a variant of the above.
->
-
-What is the difference between SCHEMA and your "database"?
-I myself am confused about them.
-
-Regards.
-
-Hiroshi Inoue
-
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA07354
- for
; Mon, 26 Jun 2000 06:50:24 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id MAA41146;
- Mon, 26 Jun 2000 12:50:11 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Mon, 26 Jun 2000 12:50:11 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C605BA5991@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Hiroshi Inoue'" ,
- Peter Eisentraut
-Cc: Bruce Momjian
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: AW: [HACKERS] Big 7.1 open items
-Date: Mon, 26 Jun 2000 12:50:10 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="windows-1252"
-Status: RO
-
-> > > > > In my mind the point of the "database" concept is to provide a domain
-> > > > > within which custom datatypes and functions are available.
-> > > >
-> > >
-> > > AFAIK few users understand it and many users have wondered
-> > > why we couldn't issue cross "database" queries.
-> >
-> > Imho the same issue is access to tables on another machine.
-> > If we "fix" that, access to another db on the same instance is just
-> > a variant of the above.
-> >
->
-> What is the difference between SCHEMA and your "database"?
-> I myself am confused about them.
-
-Think of it as a hierarchy:
- instance -> database -> schema -> object
-
-- "instance" corresponds to one postmaster
-- "database" as in current implementation
-- "schema" name corresponds to the owner of the object,
-except that a corresponding db or OS user does not need to exist in
-some of the implementations I know.
-- "object" is one of table, index, function ...
-
-The database is what you connect to in your connect statement,
-you then see all schemas inside this database only. Access to another
-database would need an explicitly created synonym or different syntax.
-The default "schema" name is usually the logged in user name
-(although I don't like this approach, I like Informix's approach where
-the schema need not be specified if tabname is unique (and tabname
-is unique per db unless you specify database mode ansi)).
-All other schemas have to be explicitly named ("schemaname".tabname).
-
-Oracle has exactly this layout, only you are restricted to one database
-per instance.
-(They even have a "create database .." statement, although it is somewhat
-analogous to our initdb).
-
-Andreas
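The lookup rule described above (the default schema is the logged-in user name, and any other schema must be named explicitly) can be sketched in a few lines. This is only an illustration of that one rule: the `resolve` function and the dictionary standing in for one database's catalog are hypothetical, and the Informix-style "unqualified if unique" variant is not modelled.

```python
def resolve(name, login_user, schemas):
    """Resolve a table reference inside the database we are connected to.

    schemas maps a schema name to the set of object names it contains.
    Unqualified names are looked up in the schema named after the login
    user; "schemaname".tabname selects another schema explicitly.
    """
    if "." in name:
        schema, obj = name.split(".", 1)
        schema = schema.strip('"')
    else:
        schema, obj = login_user, name
    if obj not in schemas.get(schema, set()):
        raise LookupError("no object %s in schema %s" % (obj, schema))
    return schema, obj


if __name__ == "__main__":
    catalog = {"andreas": {"orders"}, "tom": {"orders", "parts"}}
    print(resolve("orders", "andreas", catalog))
    print(resolve('"tom".parts', "andreas", catalog))
```

Objects in another database never appear in `schemas` at all, which matches the point that cross-database access would need a synonym or a different syntax.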
-
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id HAA07648
- for
; Mon, 26 Jun 2000 07:51:12 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id NAA40848;
- Mon, 26 Jun 2000 13:50:56 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Mon, 26 Jun 2000 13:50:55 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C605BA5993@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Mikheev, Vadim'" ,
- "'Tom Lane'"
-
-Cc: Thomas Lockhart ,
- Bruce Momjian
- Peter Eisentraut
, Jan Wieck
- ,
- Hiroshi Inoue ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: AW: [HACKERS] Big 7.1 open items
-Date: Mon, 26 Jun 2000 13:50:55 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: RO
-
-Vadim wrote:
-> Impossible to recover anyway - pg_control keeps last
-> checkpoint pointer, required for recovery.
-
-Why not put this info in the tx log itself?
-
-> That's why Oracle recommends (requires?) at least
-> two copies of control file ....
-
-This is one of the most stupid design issues Oracle has.
-I suggest you look at the tx log design of Informix.
-(No Informix dba fears to pull the power cord on his servers,
-ask the same of an Oracle dba, they even fear
-"shutdown immediate" on a heavily used db)
-
-Andreas
-
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA07760
- for
; Mon, 26 Jun 2000 08:02:05 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id OAA74134;
- Mon, 26 Jun 2000 14:01:17 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Mon, 26 Jun 2000 14:01:17 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C605BA5994@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: Zeugswetter Andreas SB ,
- "'Mikheev, Vadim'" ,
- "'Tom Lane'"
-
-Cc: Thomas Lockhart ,
- Bruce Momjian
- Peter Eisentraut
, Jan Wieck
- ,
- Hiroshi Inoue ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: AW: [HACKERS] Big 7.1 open items
-Date: Mon, 26 Jun 2000 14:01:15 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: RO
-
-I wrote:
-> Vadim wrote:
-> > Impossible to recover anyway - pg_control keeps last
-> > checkpoint pointer, required for recovery.
->
-> Why not put this info in the tx log itself?
->
-> > That's why Oracle recommends (requires?) at least
-> > two copies of control file ....
->
-> This is one of the most stupid design issues Oracle has.
-
-The problem is that if you want to switch to a no-fsync environment
-(here I also mean the tx log),
-but the possibility of losing a write is still there, you cannot sync
-writes to two or more different files. Only one file, the tx log itself, is
-allowed to carry last-minute information.
-
-Thus you also need to log changes to pg_control in the tx log.
-
-Andreas
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA11148
- for
; Mon, 26 Jun 2000 10:42:06 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id KAA17018;
- Mon, 26 Jun 2000 10:42:31 -0400 (EDT)
-To: Zeugswetter Andreas SB
-cc: Hiroshi Inoue
, Bruce Momjian ,
- Peter Eisentraut
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom" ,
- Thomas Lockhart
-Subject: Re: [HACKERS] File versioning (was: Big 7.1 open items)
-In-reply-to: <219F68D65015D011A8E000006F8590C605BA598B@sdexcsrv1.f000.d0188.sd.spardat.at>
-References: <219F68D65015D011A8E000006F8590C605BA598B@sdexcsrv1.f000.d0188.sd.spardat.at>
-Comments: In-reply-to Zeugswetter Andreas SB
- message dated "Mon, 26 Jun 2000 10:09:13 +0200"
-Date: Mon, 26 Jun 2000 10:42:31 -0400
-From: Tom Lane
-Status: RO
-
-Zeugswetter Andreas SB writes:
-> I do not think the only problem of a failing rename of "temp" to "new"
-> on startup rollforward is issue enough to justify the additional complexity
-> a version implies.
-
-If that were the only reason for it then I wouldn't feel it was so
-essential. However, it will also let us fix CLUSTER, vacuuming of
-indexes, ALTER TABLE DROP COLUMN with physical removal of the column,
-etc etc. Making the world safe for rollbackable RENAME/DROP/TRUNCATE
-TABLE is just one of the benefits.
-
-Versioning also eliminates a whole host of problems at the bufmgr/smgr
-level that are caused by having to cope with relation files getting
-renamed out from under you. We have painfully eliminated some of these
-problems over the past couple of years by ad-hoc, ugly techniques like
-flushing the buffer cache when doing a rename. But who's to say there
-are not more such bugs left?
-
-In short, I think versioning is far *less* complex, not to mention more
-reliable, than the kluges we need to use to work around the lack of it.
-
- regards, tom lane
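A minimal sketch of why versioned file names avoid the rename hazard discussed in this thread: a rewrite (as CLUSTER or a physical DROP COLUMN would need) writes a brand-new file, and commit only changes which version the catalog points at, so no file is ever renamed. The `OID.version` file name format, the `Relation` class standing in for a catalog row, and the `rewrite` helper are all assumptions for illustration, not the naming scheme the developers settled on.

```python
import os
import tempfile


class Relation:
    """Stand-in for a catalog row that records the current file version."""

    def __init__(self, directory, oid):
        self.directory = directory
        self.oid = oid
        self.version = 0

    def path(self, version=None):
        if version is None:
            version = self.version
        # "OID.version" is an assumed naming format for this sketch.
        return os.path.join(self.directory, "%d.%d" % (self.oid, version))


def rewrite(rel, new_contents, commit=True):
    """Write the next version of the relation file, then commit or abort."""
    new_version = rel.version + 1
    new_path = rel.path(new_version)
    with open(new_path, "wb") as f:
        f.write(new_contents)
    if not commit:
        # Abort: drop the new file; the old version was never touched.
        os.unlink(new_path)
        return rel.version
    old_path = rel.path()
    rel.version = new_version  # the commit point: the catalog names the new file
    if os.path.exists(old_path):
        os.unlink(old_path)    # the old version can go once nothing refers to it
    return rel.version


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        rel = Relation(d, 16384)
        with open(rel.path(), "wb") as f:
            f.write(b"old")
        rewrite(rel, b"new")
        print(sorted(os.listdir(d)))
```

If the final unlink fails, the only damage is a leftover file that nothing refers to, whereas a failed rename in the create/delete/rename sequence leaves the table with no file under its expected name.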
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA02022
- for
; Mon, 26 Jun 2000 18:30:54 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5QMMa123238;
- Mon, 26 Jun 2000 18:22:37 -0400 (EDT)
-Received: from sectorbase2.sectorbase.com ([208.48.122.131])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5QMMJ123161
- for
; Mon, 26 Jun 2000 18:22:19 -0400 (EDT)
-Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
- id ; Mon, 26 Jun 2000 15:13:48 -0700
-Message-ID: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1>
-From: "Mikheev, Vadim"
-To: "'Tom Lane'"
-Cc: "'Hiroshi Inoue'" ,
- Thomas Lockhart
- ,
- Peter Eisentraut
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Mon, 26 Jun 2000 15:15:39 -0700
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2650.21)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Precedence: bulk
-Status: RO
-
-> > Do we need *both* database & tablespace to find table file ?!
-> > Imho, database shouldn't be used...
->
-> That'd work fine for me, but I think Bruce was arguing for paths that
-> included the database name. We'd end up with paths that go something
-> like
-> ..../data/tablespaces/TABLESPACEOID/RELATIONOID
-> (plus some kind of decoration for segment and version), so you'd have
-> a hard time telling which files in a tablespace belong to which
-> database. Doesn't bother me a whole lot, personally --- if one wants
-
-We could create /data/databases/DATABASEOID/ and create soft-links to
-table-files. This way different tables of the same database could be in
-different tablespaces. /data/database path would be used in production
-and /data/tablespace path would be used in recovery.
-
-Vadim
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA02118
- for
; Mon, 26 Jun 2000 18:47:52 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id SAA19579;
- Mon, 26 Jun 2000 18:48:22 -0400 (EDT)
-To: "Mikheev, Vadim"
-cc: "'Hiroshi Inoue'" ,
- Thomas Lockhart ,
- Peter Eisentraut
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-In-reply-to: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1>
-References: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1>
-Comments: In-reply-to "Mikheev, Vadim"
- message dated "Mon, 26 Jun 2000 15:15:39 -0700"
-Date: Mon, 26 Jun 2000 18:48:22 -0400
-From: Tom Lane
-Status: RO
-
-"Mikheev, Vadim" writes:
-> We could create /data/databases/DATABASEOID/ and create soft-links to
-> table-files. This way different tables of the same database could be in
-> different tablespaces. /data/database path would be used in production
-> and /data/tablespace path would be used in recovery.
-
-Why would you want to do it that way? Having a different access path
-for recovery than for normal operation strikes me as just asking for
-trouble ;-)
-
-The symlinks wouldn't do any good for what Bruce had in mind anyway
-(IIRC, he wanted to get useful per-database numbers from "du").
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA04481
- for
; Mon, 26 Jun 2000 23:37:51 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5R1nx169365;
- Mon, 26 Jun 2000 21:50:00 -0400 (EDT)
-Received: from sectorbase2.sectorbase.com ([208.48.122.131])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5R1mt169094
- for
; Mon, 26 Jun 2000 21:48:55 -0400 (EDT)
-Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
- id ; Mon, 26 Jun 2000 18:40:19 -0700
-Message-ID: <8F4C99C66D04D4118F580090272A7A23018C38@SECTORBASE1>
-From: "Mikheev, Vadim"
-To: "'Tom Lane'"
-Cc: "'Hiroshi Inoue'" ,
- Thomas Lockhart
- ,
- Peter Eisentraut
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Mon, 26 Jun 2000 18:42:10 -0700
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2650.21)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Precedence: bulk
-Status: RO
-
-> > We could create /data/databases/DATABASEOID/ and create
-> > soft-links to table-files. This way different tables of
-> > the same database could be in different tablespaces.
-> > /data/database path would be used in production
-> > and /data/tablespace path would be used in recovery.
->
-> Why would you want to do it that way? Having a different access path
-> for recovery than for normal operation strikes me as just asking for
-> trouble ;-)
-
-I just think that *databases* (schemas) must be used for *logical* grouping
-of tables, not for *physical* grouping. "Where to store a table" is a
-tablespace-related kind of thing!
-
-> The symlinks wouldn't do any good for what Bruce had in mind anyway
-> (IIRC, he wanted to get useful per-database numbers from "du").
-
-Imho, the ability to put different tables/indices (of the same database)
-into different tablespaces (disks) is much more useful than the ability to
-use du/ls for administration purposes -:)
-
-Also, I think that we *must* move away from OS-driven disk space
-allocation anyway. Currently, the way we extend table files breaks the WAL
-rule (nothing must go to disk until logged). + we have to move tuples
-from end of file to top to shrink a relation - not a perfect way to reuse
-empty space. +... +... +...
-
-Vadim
-
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA05264
- for
; Tue, 27 Jun 2000 00:05:11 -0400 (EDT)
-Received: from tpf.co.jp ([126.0.1.56] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP
- id NAA01123; Tue, 27 Jun 2000 13:04:26 +0900
-Date: Tue, 27 Jun 2000 13:07:28 +0900
-From: Hiroshi Inoue
-X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U)
-X-Accept-Language: ja
-MIME-Version: 1.0
-To: Tom Lane
-CC: "Mikheev, Vadim" ,
- Thomas Lockhart ,
- Peter Eisentraut
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-References: <8F4C99C66D04D4118F580090272A7A23018C36@SECTORBASE1> <
[email protected]>
-Content-Type: text/plain; charset=iso-2022-jp
-Content-Transfer-Encoding: 7bit
-Status: ROr
-
-Tom Lane wrote:
-
->
-> The symlinks wouldn't do any good for what Bruce had in mind anyway
-> (IIRC, he wanted to get useful per-database numbers from "du").
-
-Our database design seems to be in the opposite direction
-if it is restricted for the convenience of command calls.
-
-Regards.
-
-Hiroshi Inoue
-
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA21305
- for
; Tue, 27 Jun 2000 10:07:48 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5RDUh185923;
- Tue, 27 Jun 2000 09:30:43 -0400 (EDT)
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5RDTB183147
- for
; Tue, 27 Jun 2000 09:29:12 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id PAA41830;
- Tue, 27 Jun 2000 15:27:07 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Tue, 27 Jun 2000 15:27:06 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C605BA5999@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Tom Lane'" ,
- "Mikheev, Vadim"
-
-Cc: "'Hiroshi Inoue'" ,
- Thomas Lockhart
- ,
- Peter Eisentraut
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: AW: [HACKERS] Big 7.1 open items
-Date: Tue, 27 Jun 2000 15:27:03 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Precedence: bulk
-Status: RO
-
-
-> That'd work fine for me, but I think Bruce was arguing for paths that
-> included the database name. We'd end up with paths that go something
-> like
-> ..../data/tablespaces/TABLESPACEOID/RELATIONOID
-> (plus some kind of decoration for segment and version), so you'd have
-> a hard time telling which files in a tablespace belong to which
-> database.
-
-Well, as long as we have the file-per-object layout it probably makes sense
-to have "speaking paths". But I see no real problem with:
-
-..../data/tablespacename/dbname/RELATIONOID[.dat|.idx]
-
-RELATIONOID standing for whatever the consensus will be.
-I do not really see an argument for using a tablespaceoid instead of
-its [maybe mangled] name.
-
-Andreas
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA21468
- for
; Tue, 27 Jun 2000 10:28:38 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5REOa111784;
- Tue, 27 Jun 2000 10:24:36 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5REOG109445
- for
; Tue, 27 Jun 2000 10:24:16 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id KAA09575;
- Tue, 27 Jun 2000 10:23:48 -0400 (EDT)
-To: Zeugswetter Andreas SB
-cc: "Mikheev, Vadim" ,
- "'Hiroshi Inoue'" ,
- Thomas Lockhart ,
- Peter Eisentraut
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: AW: [HACKERS] Big 7.1 open items
-In-reply-to: <219F68D65015D011A8E000006F8590C605BA5999@sdexcsrv1.f000.d0188.sd.spardat.at>
-References: <219F68D65015D011A8E000006F8590C605BA5999@sdexcsrv1.f000.d0188.sd.spardat.at>
-Comments: In-reply-to Zeugswetter Andreas SB
- message dated "Tue, 27 Jun 2000 15:27:03 +0200"
-Date: Tue, 27 Jun 2000 10:23:48 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-Zeugswetter Andreas SB writes:
-> I do not really see an argument for using a tablespaceoid instead of
-> it's [maybe mangled] name.
-
-Eliminating filesystem-based restrictions on names, for one.
-For example we'd not have to forbid slashes and (probably) backquotes
-in tablespace names if we did this, and we'd not have to worry about
-filesystem-induced limits on name lengths. Renaming a tablespace
-would also be trivial instead of nigh impossible.
-
-It might be that using tablespace names as directory names is worth
-enough from the admin point of view to make the above restrictions
-acceptable. But it's a tradeoff, and not one with an obvious choice
-IMHO.
-
- regards, tom lane
-
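A minimal C sketch of the OID-based layout discussed above, assuming the
..../data/tablespaces/TABLESPACEOID/RELATIONOID form quoted earlier in the
thread. The helper name relation_path and the datadir argument are
hypothetical and are not PostgreSQL code; the segment and version decoration
mentioned in the thread is left out.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "<datadir>/tablespaces/<tablespace OID>/<relation OID>"
 * into buf. Returns 0 on success, -1 if the path does not fit in buf.
 */
static int
relation_path(char *buf, size_t buflen, const char *datadir,
              unsigned int tablespace_oid, unsigned int relation_oid)
{
    int n;

    assert(buf != NULL && datadir != NULL);
    n = snprintf(buf, buflen, "%s/tablespaces/%u/%u",
                 datadir, tablespace_oid, relation_oid);
    /* snprintf returns the length it wanted to write; reject truncation */
    return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

Because only numbers appear in the path, a tablespace name with slashes or
an unusual length never reaches the filesystem, and renaming a tablespace
would not have to touch any directory.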
-Received: from sectorbase2.sectorbase.com ([208.48.122.131])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA28715
- for
; Tue, 27 Jun 2000 14:01:07 -0400 (EDT)
-Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
- id ; Tue, 27 Jun 2000 10:53:03 -0700
-Message-ID: <8F4C99C66D04D4118F580090272A7A23018C39@SECTORBASE1>
-From: "Mikheev, Vadim"
- Hiroshi Inoue
-
-Cc: Tom Lane ,
- Thomas Lockhart
- ,
- Peter Eisentraut
, Jan Wieck ,
- PostgreSQL-development
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Tue, 27 Jun 2000 10:54:55 -0700
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2650.21)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: ROr
-
-> > > The symlinks wouldn't do any good for what Bruce had in
-> > > mind anyway (IIRC, he wanted to get useful per-database
-> > > numbers from "du").
-> >
-> > Our database design seems to be in the opposite direction
-> > if it is restricted for the convenience of command calls.
->
-> Well, I don't see any reason not to use tablespace/database
-> rather than just tablespace. Seems having fewer files in each directory
-
-Once again - ability to use different tablespaces (disks) for tables/indices
-in the same schema. Schemas must not dictate where to store objects <-
-bad design.
-
-> will be a little faster, and if we can make administration easier,
-> why not?
-
-Because you'll not be able to use du/ls once we implement the new smgr anyway.
-
-And, btw, what are we going to implement tablespaces for? Just to have
-fewer files in each dir ?!
-
-Vadim
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA28748
- for
; Tue, 27 Jun 2000 14:03:34 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5RI1h139788;
- Tue, 27 Jun 2000 14:01:44 -0400 (EDT)
-Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5RI1I138791
- for
; Tue, 27 Jun 2000 14:01:18 -0400 (EDT)
-Received: from regulus.student.UU.SE ([130.238.5.2]:59174 "EHLO
- regulus.its.uu.se") by merganser.its.uu.se with ESMTP
- id ; Tue, 27 Jun 2000 20:00:50 +0200
-Received: from peter (helo=localhost)
- by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
- id 136zlm-0003zn-00; Tue, 27 Jun 2000 20:07:34 +0200
-Date: Tue, 27 Jun 2000 20:07:34 +0200 (CEST)
-To: "Mikheev, Vadim"
-cc: "'Hiroshi Inoue'" , "'Tom Lane'" ,
- Thomas Lockhart ,
- Bruce Momjian
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-In-Reply-To: <8F4C99C66D04D4118F580090272A7A23018C35@SECTORBASE1>
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=ISO-8859-1
-Content-Transfer-Encoding: 8BIT
-Precedence: bulk
-Status: RO
-
-Mikheev, Vadim writes:
-
-> Do we need *both* database & tablespace to find table file ?!
-> Imho, database shouldn't be used...
-
-Then the system tables from different databases would collide.
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
-Received: from sectorbase2.sectorbase.com ([208.48.122.131])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA04820
- for
; Tue, 27 Jun 2000 15:28:24 -0400 (EDT)
-Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
- id ; Tue, 27 Jun 2000 12:20:20 -0700
-Message-ID: <8F4C99C66D04D4118F580090272A7A23018C3A@SECTORBASE1>
-From: "Mikheev, Vadim"
-Cc: Hiroshi Inoue , Tom Lane ,
- Thomas Lockhart ,
- Peter Eisentraut
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Tue, 27 Jun 2000 12:22:13 -0700
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2650.21)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: ROr
-
-> > > Well, I don't see any reason not to use tablespace/database
-> > > rather than just tablespace. Seems having fewer files in
-> > > each directory
-> >
-> > Once again - ability to use different tablespaces (disks)
-> > for tables/indices in the same schema. Schemas must not dictate
-> > where to store objects <- bad design.
->
-> I am suggesting this symlink:
->
-> ln -s data/base/testdb/myspace /var/myspace/testdb
->
-> rather than:
->
-> ln -s data/base/testdb/myspace /var/myspace
->
-> Tablespaces still sit inside database directories, it is just that it
-> points to a subdirectory of myspace, rather than myspace itself.
-^^^^^^^^^^^
-
-Didn't you mean
-
-ln -s /var/myspace/testdb data/base/testdb/myspace
-
-?
-
-I thought that you don't like symlinks from data/base/... This is
-how I understood Tom's words:
-
-> The symlinks wouldn't do any good for what Bruce had in mind anyway
-> (IIRC, he wanted to get useful per-database numbers from "du").
-
-Vadim
-
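The argument order Vadim corrects above can be sketched in C with the POSIX
symlink() call, which takes its arguments in the same order as "ln -s":
the location being pointed at first, the name of the new link second. The
helper names link_tablespace and read_link_target are hypothetical; the
paths in the usage note are the ones from the message.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: create linkpath as a symbolic link pointing at target,
 * i.e. the equivalent of "ln -s target linkpath".
 */
static int
link_tablespace(const char *target, const char *linkpath)
{
    assert(target != NULL && linkpath != NULL);
    return symlink(target, linkpath);
}

/*
 * Hypothetical helper: read back where linkpath points. readlink() does not
 * NUL-terminate its result, so the terminator is added here.
 */
static int
read_link_target(const char *linkpath, char *buf, size_t buflen)
{
    ssize_t n;

    assert(buf != NULL && buflen > 0);
    n = readlink(linkpath, buf, buflen - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}
```

With this helper, link_tablespace("/var/myspace/testdb",
"data/base/testdb/myspace") matches the corrected command: the entry under
data/base/testdb is the link, and the directory under /var/myspace is where
the files actually live.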
-Received: from sectorbase2.sectorbase.com ([208.48.122.131])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA05148
- for
; Tue, 27 Jun 2000 15:43:30 -0400 (EDT)
-Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
- id ; Tue, 27 Jun 2000 12:35:41 -0700
-Message-ID: <8F4C99C66D04D4118F580090272A7A23018C3C@SECTORBASE1>
-From: "Mikheev, Vadim"
-Cc: "'Peter Eisentraut'"
,
- "'Hiroshi Inoue'"
- ,
- "'Tom Lane'" ,
- Thomas Lockhart
- ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Tue, 27 Jun 2000 12:37:34 -0700
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2650.21)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: ROr
-
-> > > Then the system tables from different databases would collide.
-> >
-> > Actually, if we're going to use unique-ids for file names
-> > then we have to know how to get system file names anyway.
-> > Hm, OID+VERSION would make our life easier... Hiroshi?
->
-> I assume we were going to have a pg_class.relversion to do that, but
- ^^^^^^^^
-PG_CLASS_OID.VERSION_ID...
-
-Just a clarification -:)
-
-> that is per-database because pg_class is per-database.
-
-Vadim
-
-Received: from sectorbase2.sectorbase.com ([208.48.122.131])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA05452
- for
; Tue, 27 Jun 2000 15:48:30 -0400 (EDT)
-Received: by SECTORBASE2 with Internet Mail Service (5.5.2650.21)
- id ; Tue, 27 Jun 2000 12:40:42 -0700
-Message-ID: <8F4C99C66D04D4118F580090272A7A23018C3D@SECTORBASE1>
-From: "Mikheev, Vadim"
-Cc: "'Peter Eisentraut'"
,
- "'Hiroshi Inoue'"
- ,
- "'Tom Lane'" ,
- Thomas Lockhart
- ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: RE: [HACKERS] Big 7.1 open items
-Date: Tue, 27 Jun 2000 12:42:35 -0700
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2650.21)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: ROr
-
-> I actually meant I thought we were going to have a pg_class column
-> called relversion that held the currently active version for that
-> relation.
->
-> Yes, the file name will be pg_class_oid.version_id.
->
-> Is that OK?
-
-We recently discussed pure *unique-id* file names...
-
-Vadim
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA08565
- for
; Tue, 27 Jun 2000 17:03:32 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5RL2B155891;
- Tue, 27 Jun 2000 17:02:11 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5RL10155419
- for
; Tue, 27 Jun 2000 17:01:00 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA11135;
- Tue, 27 Jun 2000 17:00:12 -0400 (EDT)
-cc: "Mikheev, Vadim" ,
- "'Hiroshi Inoue'" ,
- Thomas Lockhart ,
- Bruce Momjian
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Peter Eisentraut
- message dated "Tue, 27 Jun 2000 20:07:34 +0200"
-Date: Tue, 27 Jun 2000 17:00:11 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-Peter Eisentraut
writes:
-> Mikheev, Vadim writes:
->> Do we need *both* database & tablespace to find table file ?!
->> Imho, database shouldn't be used...
-
-> Then the system tables from different databases would collide.
-
-I've been assuming that we would create a separate tablespace for
-each database, which would be the location of that database's
-system tables. It's probably also the default tablespace for user
-tables created in that database, though it wouldn't have to be.
-
-There should also be a known tablespace for the installation-wide tables
-(pg_shadow et al).
-
-With this approach tablespace+relation would indeed be a sufficient
-identifier. We could even eliminate the knowledge that certain
-tables are installation-wide from the bufmgr and below (currently
-that knowledge is hardwired in places that I'd rather didn't know
-about it...)
-
- regards, tom lane
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA09638
- for
; Tue, 27 Jun 2000 17:18:48 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA11377;
- Tue, 27 Jun 2000 17:19:31 -0400 (EDT)
-cc: "Mikheev, Vadim" ,
- "'Hiroshi Inoue'" ,
- Thomas Lockhart ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Tue, 27 Jun 2000 15:52:40 -0400"
-Date: Tue, 27 Jun 2000 17:19:31 -0400
-From: Tom Lane
-Status: ROr
-
-> Well, that would allow us to mix database files in the same directory,
-> if we wanted to do that. My opinion it is better to keep databases in
-> separate directories in each tablespace for clarity and performance
-> reasons.
-
-One reason not to do that is that we'd still have to special-case
-the system-wide relations. If it's just tablespace and OID in the
-path, then the system-wide rels look just the same as any other rel
-as far as the low-level stuff is concerned. That would be nice.
-
-My feeling about the "clarity and performance" issue is that if a
-dbadmin wants to keep track of database contents separately, he can
-put different databases' tables into different tablespaces to start
-with. If he puts several tables into one tablespace, he's saying
-he doesn't care about distinguishing their space usage. There's
-no reason for us to force an additional level of directory lookup
-to be done whether the admin wants it or not.
-
- regards, tom lane
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA09909
- for
; Tue, 27 Jun 2000 17:29:33 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id RAA13026;
- Tue, 27 Jun 2000 17:30:18 -0400 (EDT)
-cc: "Mikheev, Vadim" ,
- "'Hiroshi Inoue'" ,
- Thomas Lockhart ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Tue, 27 Jun 2000 17:23:49 -0400"
-Date: Tue, 27 Jun 2000 17:30:17 -0400
-From: Tom Lane
-Status: RO
-
-> Yes, good point about pg_shadow. They don't have databases. How do we
-> get multiple pg_class tables in the same directory? Is the
-> pg_class.relversion file a number like 1,2,3,4, or does it come out of
-> some global counter like oid. If so, we could put them in the same
-> directory.
-
-I think we could get away with insisting that each database store its
-pg_class and friends in a separate tablespace (physically distinct
-directory) from any other database. That gets around the OID conflict.
-
-It's still an open question whether OID+version is better than
-unique-ID for naming files that belong to different versions of the
-same relation. I can see arguments on both sides.
-
- regards, tom lane
-
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA12791
- for
; Tue, 27 Jun 2000 19:13:28 -0400 (EDT)
-Received: from tpf.co.jp ([126.0.1.56] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP
- id IAA01830; Wed, 28 Jun 2000 08:13:26 +0900
-Date: Wed, 28 Jun 2000 08:16:27 +0900
-From: Hiroshi Inoue
-X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U)
-X-Accept-Language: ja
-MIME-Version: 1.0
-To: Tom Lane
- "Mikheev, Vadim" ,
- Thomas Lockhart ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Content-Type: text/plain; charset=iso-2022-jp
-Content-Transfer-Encoding: 7bit
-Status: RO
-
-Tom Lane wrote:
-
-> > Yes, good point about pg_shadow. They don't have databases. How do we
-> > get multiple pg_class tables in the same directory? Is the
-> > pg_class.relversion file a number like 1,2,3,4, or does it come out of
-> > some global counter like oid. If so, we could put them in the same
-> > directory.
->
-> I think we could get away with insisting that each database store its
-> pg_class and friends in a separate tablespace (physically distinct
-> directory) from any other database. That gets around the OID conflict.
->
-> It's still an open question whether OID+version is better than
-> unique-ID for naming files that belong to different versions of the
-> same relation. I can see arguments on both sides.
->
-
-I don't insist on unique-ID. My main point has always been
-transactional control of file allocation changes.
-However, *VERSION(_ID)* may be misleading because it would not
-mean the version of pg_class tuples.
-
-Regards.
-
-Hiroshi Inoue
-
-
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA11316
- for
; Wed, 28 Jun 2000 12:10:58 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id MAA15790;
- Wed, 28 Jun 2000 12:11:40 -0400 (EDT)
-cc: "Mikheev, Vadim" ,
- "'Hiroshi Inoue'" ,
- Thomas Lockhart ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Bruce Momjian
- message dated "Wed, 28 Jun 2000 10:25:21 -0400"
-Date: Wed, 28 Jun 2000 12:11:40 -0400
-From: Tom Lane
-Status: ROr
-
-> If we put multiple database tables in the same directory, have we
-> considered how to drop databases? Right now we do rm -rf:
-
-rm -rf will no longer work in a tablespaces environment anyway.
-(Even if you kept symlinks underneath the DB directory, rm -rf
-wouldn't follow them.)
-
-DROP DATABASE will have to be implemented honestly: run through
-pg_class and do a regular DROP on each user table.
-
-Once you've got rid of the user tables, rm -rf should suffice to
-get rid of the "home tablespace" as I've been calling it, with
-all the system tables therein.
-
-Now that you mention it, this is another reason why system tables for
-each database have to live in a separate tablespace directory: there's
-no other good way to do that final stage of DROP DATABASE. The
-DROP-each-table approach doesn't work for system tables (somewhere along
-about the point where you drop pg_attribute, DROP TABLE itself would
-stop working ;-)).
-
-However I do see a bit of a problem here: since DROP DATABASE is
-ordinarily executed by a backend that's running in a different database,
-how's it going to read pg_class of the target database? Perhaps it will
-be necessary to fire up a sub-backend that runs in the target DB for
-long enough to kill all the user tables. Looking messy...
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA27612
- for
; Wed, 28 Jun 2000 19:53:27 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5SNqG142069;
- Wed, 28 Jun 2000 19:52:17 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5SNp7137729
- for
; Wed, 28 Jun 2000 19:51:07 -0400 (EDT)
-Received: from tpf.co.jp ([126.0.1.56] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP
- id IAA03041; Thu, 29 Jun 2000 08:50:01 +0900
-Date: Thu, 29 Jun 2000 08:53:03 +0900
-From: Hiroshi Inoue
-X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U)
-X-Accept-Language: ja
-MIME-Version: 1.0
-To: Tom Lane
- "Mikheev, Vadim" ,
- Thomas Lockhart ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Content-Type: text/plain; charset=iso-2022-jp
-Content-Transfer-Encoding: 7bit
-Precedence: bulk
-Status: RO
-
-Tom Lane wrote:
-
-> "Hiroshi Inoue" writes:
-> > Why do we have to have system tables per *database* ?
-> > Is there anything wrong with global system tables ?
-> > And how about adding dbid to pg_class,pg_proc etc ?
->
-> We could, but I think I'd vote against it on two grounds:
->
-> 1. Reliability. If something corrupts pg_class, do you want to
-> lose your whole installation, or just one database?
->
-> 2. Increased locking overhead/loss of concurrency. Currently, there
-> is very little lock contention between backends running in different
-> databases. A shared pg_class will be a single point of locking (as
-> well as a single point of failure) for the whole installation.
-
-Isn't the current design of PG's *database* there so that dropdb can
-use "rm -rf", rather than for reasons 1 and 2 above?
-If we couldn't rely on our db itself and our locking mechanism were
-poor, we could start different postmasters for different *database*s.
-
-
-> It would solve the DROP DATABASE problem kind of nicely, but really
-> it'd just be downgrading DROP DATABASE to a DROP SCHEMA operation...
->
-
-What is our *DATABASE*?
-Is it clear to everyone?
-To me, at least, it's a vague concept.
-Could you please tell me what kinds of objects are our *DATABASE*
-objects but could not be schema objects?
-
-Regards.
-
-Hiroshi Inoue
-
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA28321
- for
; Thu, 29 Jun 2000 10:39:57 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5T7nr158743;
- Thu, 29 Jun 2000 03:49:53 -0400 (EDT)
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5T7io146030
- for
; Thu, 29 Jun 2000 03:44:51 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id JAA46266;
- Thu, 29 Jun 2000 09:43:20 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Thu, 29 Jun 2000 09:43:20 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C605BA59A8@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-Cc: "Mikheev, Vadim" ,
- Hiroshi Inoue
- , Tom Lane ,
- Thomas Lockhart
- ,
- Peter Eisentraut
, Jan Wieck ,
- PostgreSQL-development
- "Ross J. Reedstrom"
-Subject: AW: AW: [HACKERS] Big 7.1 open items
-Date: Thu, 29 Jun 2000 09:43:14 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="windows-1252"
-Precedence: bulk
-Status: RO
-
-
-> > ln -s data/base/testdb/myspace/extent1 /var/myspace/extent1/testdb
->
-> The idea was to put the main files in the directory, and create Extent2,
-> Extent3 directories for the extents.
-
-The reasoning was that the database subdir should be below the extentdir,
-so that creating a different fs for each extent would be easier and would
-not depend on the database name.
-
-It is easy to create fs for:
- /var/myspace
-or
- /var/myspace[/extent1]
- /var/myspace/extent2
-but not if it has dbname in it.
-
-Andreas
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA25201
- for
; Thu, 29 Jun 2000 06:34:44 -0400 (EDT)
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id GAA00379 for
; Thu, 29 Jun 2000 06:35:30 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id MAA33950;
- Thu, 29 Jun 2000 12:33:42 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Thu, 29 Jun 2000 12:33:42 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C605BA59AC@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Tom Lane'"
- Peter Eisentraut
- "Mikheev, Vadim" ,
- "'Hiroshi Inoue'" ,
- Thomas Lockhart
- ,
- Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: AW: AW: [HACKERS] Big 7.1 open items
-Date: Thu, 29 Jun 2000 12:33:39 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Status: RO
-
-
-> > > I think I would prefer the ability to place more than one
-> > database into
-> > > the same tablespace.
-> >
-> > You can put user tables from multiple databases into the same
-> > tablespace, under this proposal. Just not system tables.
->
-> Yes, but then it is only half baked.
-
-Half baked or not, I think I am starting to like it.
-I think I would restrict such an automagically created tablespace
-(tblspace name = db name) to only contain tables from this database.
-
-Andreas
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA08070
- for
; Thu, 29 Jun 2000 13:24:35 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5THLf102550;
- Thu, 29 Jun 2000 13:21:41 -0400 (EDT)
-Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5THL1197262
- for
; Thu, 29 Jun 2000 13:21:01 -0400 (EDT)
-Received: from regulus.student.UU.SE ([130.238.5.2]:50625 "EHLO
- regulus.its.uu.se") by merganser.its.uu.se with ESMTP
- id ; Thu, 29 Jun 2000 19:20:28 +0200
-Received: from peter (helo=localhost)
- by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
- id 137i5r-0000BK-00; Thu, 29 Jun 2000 19:27:15 +0200
-Date: Thu, 29 Jun 2000 19:27:15 +0200 (CEST)
-To: Hiroshi Inoue
-cc: Zeugswetter Andreas SB ,
- "'Mikheev, Vadim'" ,
-Subject: Re: AW: [HACKERS] Big 7.1 open items
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=ISO-8859-1
-Content-Transfer-Encoding: 8BIT
-Precedence: bulk
-Status: RO
-
-Hiroshi Inoue writes:
-
-> According to your another posting,your *database* hierarchy is
-> instance -> database -> schema -> object
-> like Oracle.
->
-> However SQL92 seems to have another hierarchy:
-> cluster -> catalog -> schema -> object
-> and dot notation catalog.schema.object could be used.
-
-FYI:
-
-An "instance" is a "cluster". I don't know where the word instance came
-from, the docs sometimes call it "installation" or "site", which is even
-worse. I have been using "database cluster" for the latest documentation
-work. My dictionary defines a cluster as "a group of things gathered or
-occurring closely together", which is what this is. Call it a "data area"
-or an "initdb'ed thing", etc.
-
-A "catalog" can be equated with our "database". The method of creating
-catalogs is implementation defined, so our CREATE DATABASE command is in
-perfect compliance with the standard. We don't support the
-catalog.schema.object notation but that notation only makes sense when you
-can access more than one catalog at a time. We don't allow that and SQL
-doesn't require it. We could allow that notation and throw an error when
-the catalog name doesn't match the current database, but that's mere
-cosmetic work.
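
A minimal sketch of the "cosmetic" check described above, assuming a plain unquoted three-part name. The function name `check_qualified_name` and the `name_status` codes are hypothetical and are not part of any PostgreSQL source; quoted identifiers are deliberately ignored.

```c
#include <string.h>

/* Hypothetical result codes for this sketch; not PostgreSQL's. */
enum name_status { NAME_OK = 0, NAME_BAD_SYNTAX, NAME_WRONG_CATALOG };

/*
 * Split "catalog.schema.object" and verify that the catalog part equals
 * current_db.  On success the schema and object parts are copied out;
 * each output buffer must hold partlen bytes.
 */
enum name_status
check_qualified_name(const char *qualified, const char *current_db,
                     char *schema, char *object, size_t partlen)
{
    const char *dot1 = strchr(qualified, '.');
    const char *dot2 = dot1 ? strchr(dot1 + 1, '.') : NULL;
    size_t catlen, schlen, objlen;

    /* Exactly two dots are required. */
    if (dot1 == NULL || dot2 == NULL || strchr(dot2 + 1, '.') != NULL)
        return NAME_BAD_SYNTAX;

    catlen = (size_t) (dot1 - qualified);
    schlen = (size_t) (dot2 - dot1 - 1);
    objlen = strlen(dot2 + 1);
    if (catlen == 0 || schlen == 0 || objlen == 0 ||
        schlen >= partlen || objlen >= partlen)
        return NAME_BAD_SYNTAX;

    /* Only the current database may be named as the catalog. */
    if (strlen(current_db) != catlen ||
        strncmp(qualified, current_db, catlen) != 0)
        return NAME_WRONG_CATALOG;

    memcpy(schema, dot1 + 1, schlen);
    schema[schlen] = '\0';
    memcpy(object, dot2 + 1, objlen);
    object[objlen] = '\0';
    return NAME_OK;
}
```

With the current database set to "testdb", the name "testdb.peter.accounts" would be accepted and "otherdb.peter.accounts" rejected with the wrong-catalog code.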
-
-In entry level SQL 92, a "schema" is essentially the same as table
-ownership. You can execute the command CREATE SCHEMA AUTHORIZATION
-"peter", which means that user "peter" (where he came from is
-"implementation-defined") can now create tables under his name. There is
-no such thing as a table owner, there's the "containing schema" and its
-owner. The tables "peter" creates can then be referenced by the dotted
-notation. But it is not correct to equate this with CREATE USER. Even if
-there was no schema for "peter" he could still connect and query other
-people's tables.
-
-Moving beyond SQL 92 you can also create schemas with a different name
-than your user name. This is merely a little more naming flexibility.
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
-Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA00202
- for
; Thu, 29 Jun 2000 19:25:39 -0400 (EDT)
-Received: from regulus.student.UU.SE ([130.238.5.2]:52854 "EHLO
- regulus.its.uu.se") by merganser.its.uu.se with ESMTP
- id ; Fri, 30 Jun 2000 01:25:27 +0200
-Received: from peter (helo=localhost)
- by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
- id 137nnA-00023q-00; Fri, 30 Jun 2000 01:32:20 +0200
-Date: Fri, 30 Jun 2000 01:32:20 +0200 (CEST)
-To: Tom Lane
-cc: "Mikheev, Vadim" ,
- "'Hiroshi Inoue'" ,
- Thomas Lockhart ,
- Bruce Momjian
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=ISO-8859-1
-Content-Transfer-Encoding: 8BIT
-Sender: Peter Eisentraut
-Status: RO
-
-Tom Lane writes:
-
-> You can put *user* tables from more than one database into a table space.
-> The restriction is just on *system* tables.
-
-I think my understanding as a user would be that a table space represents
-a storage location. If I want to put a table/object/entire database on a
-fancy disk somewhere I create a table space for it there. But if I want to
-store all my stuff under /usr/local/pgsql/data then I wouldn't expect to
-have to create more than one table space. So the table spaces become at
-that point affected by the logical hierarchy: I must make sure to have
-enough table spaces to have many databases.
-
-More specifically, what would the user interface to this look like?
-Clearly there has to be some sort of CREATE TABLESPACE command. Now does
-CREATE DATABASE imply a CREATE TABLESPACE? I think not. Do you have to
-create a table space before creating each database? I think not.
-
-> We could avoid it along the lines you suggest (name table files like
-> DBOID.RELOID.VERSION instead of just RELOID.VERSION) but is it really
-> worth it?
-
-I only intended that for pg_class and other bootstrap-sort-of tables,
-maybe all system tables. Normal heap files could look like RELOID.VERSION,
-whereas system tables would look like "name.DBOID". Clearly there's no
-market for renaming system tables or dropping any of their columns. We're
-obviously going to have to treat pg_class special anyway.
-
-> Vadim's concerned about every byte that has to go into the WAL log,
-> and I think he's got a good point.
-
-True. But if you only do it for the system tables then it might take less
-space than keeping track of lots of table spaces that are unneeded. :-)
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA00852
- for
; Thu, 29 Jun 2000 20:12:38 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5TNwm184774;
- Thu, 29 Jun 2000 19:58:48 -0400 (EDT)
-Received: from sd.tpf.co.jp (sd.tpf.co.jp [210.161.239.34])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5TNvD180670
- for
; Thu, 29 Jun 2000 19:57:14 -0400 (EDT)
-Received: from tpf.co.jp ([126.0.1.56] (may be forged))
- by sd.tpf.co.jp (2.5 Build 2640 (Berkeley 8.8.6)/8.8.4) with ESMTP
- id IAA04081; Fri, 30 Jun 2000 08:56:46 +0900
-Date: Fri, 30 Jun 2000 08:59:49 +0900
-From: Hiroshi Inoue
-X-Mailer: Mozilla 4.73 [ja] (Windows NT 5.0; U)
-X-Accept-Language: ja
-MIME-Version: 1.0
-CC: Zeugswetter Andreas SB ,
- "'Mikheev, Vadim'" ,
-Subject: Re: AW: [HACKERS] Big 7.1 open items
-Content-Type: text/plain; charset=iso-2022-jp
-Content-Transfer-Encoding: 7bit
-Precedence: bulk
-Status: RO
-
-Peter Eisentraut wrote:
-
-> Hiroshi Inoue writes:
->
-> > According to your another posting,your *database* hierarchy is
-> > instance -> database -> schema -> object
-> > like Oracle.
-> >
-> > However SQL92 seems to have another hierarchy:
-> > cluster -> catalog -> schema -> object
-> > and dot notation catalog.schema.object could be used.
->
-> FYI:
-
-Thanks.
-I'm asking everyone what our *DATABASE* is.
-Unlike you, I can't see any decisive feature in our *DATABASE*.
-
->
->
-> An "instance" is a "cluster". I don't know where the word instance came
-
-I found the word in Oracle.
-IMHO, it corresponds to our initdb'ed thing (what one postmaster controls).
-
->
-> from, the docs sometimes call it "installation" or "site", which is even
-> worse. I have been using "database cluster" for the latest documentation
-> work. My dictionary defines a cluster as "a group of things gathered or
-> occurring closely together", which is what this is. Call it a "data area"
-> or an "initdb'ed thing", etc.
->
-
-SQL92 seems to say that a cluster corresponds to the target of a connection
-and has no name (after the connection is established). Isn't that the same
-as our *DATABASE*?
-
->
-> A "catalog" can be equated with our "database". The method of creating
-> catalogs is implementation defined, so our CREATE DATABASE command is in
-> perfect compliance with the standard. We don't support the
-> catalog.schema.object notation but that notation only makes sense when you
-> can access more than one catalog at a time.
-
-Yes, the essential point is that we can't access more than one catalog.
-This means that we have only one (unnamed) "catalog" per "cluster".
-
-> We don't allow that and SQL
-> doesn't require it. We could allow that notation and throw an error when
-> the catalog name doesn't match the current database, but that's mere
-> cosmetic work.
->
-> In entry level SQL 92, a "schema" is essentially the same as table
-> ownership. You can execute the command CREATE SCHEMA AUTHORIZATION
-> "peter", which means that user "peter" (where he came from is
-> "implementation-defined") can now create tables under his name. There is
-> no such thing as a table owner, there's the "containing schema" and its
-> owner. The tables "peter" creates can then be referenced by the dotted
-> notation. But it is not correct to equate this with CREATE USER. Even if
-> there was no schema for "peter" he could still connect and query other
-> people's tables.
->
-
-I've used *username* "schema"s in Oracle for a long time, but I've never
-thought that this is the essence of "schema". If I understand correctly, the
-concept of "catalog" wasn't necessarily important while "schema"
-= "user". A conflict between "schema" names is equivalent to a conflict
-between "user" names if "schema" = "user". IMHO, SQL92 required the
-concept of "catalog" because "schema" was changed to be
-independent of "user".
-
-Anyway, in current PG "cluster":"catalog":"schema" = 1:1:1(0), and
-our *DATABASE* is the only confusing concept in the hierarchy.
-
-Regards,
-
-Hiroshi Inoue
-
-
-
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA00958
- for
; Thu, 29 Jun 2000 20:42:55 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id UAA02520;
- Thu, 29 Jun 2000 20:43:32 -0400 (EDT)
-cc: "Mikheev, Vadim" ,
- "'Hiroshi Inoue'" ,
- Thomas Lockhart ,
- Bruce Momjian
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Peter Eisentraut
- message dated "Fri, 30 Jun 2000 01:32:20 +0200"
-Date: Thu, 29 Jun 2000 20:43:32 -0400
-From: Tom Lane
-Status: RO
-
-Peter Eisentraut
writes:
-> Tom Lane writes:
->> You can put *user* tables from more than one database into a table space.
->> The restriction is just on *system* tables.
-
-> More specifically, what would the user interface to this look like?
-> Clearly there has to be some sort of CREATE TABLESPACE command. Now does
-> CREATE DATABASE imply a CREATE TABLESPACE? I think not. Do you have to
-> create a table space before creating each database? I think not.
-
-I would say that CREATE DATABASE just implicitly creates a new
-tablespace that's physically located right under the toplevel data
-directory of the installation, no symlink. What's wrong with that?
-You need not keep anything except the system tables of the DB there
-if you don't want to. In practice, for someone who doesn't need to
-worry about tablespaces (because they put the installation on a disk
-with enough room for their purposes), the whole thing acts exactly
-the same as it does now.
-
->> We could avoid it along the lines you suggest (name table files like
->> DBOID.RELOID.VERSION instead of just RELOID.VERSION) but is it really
->> worth it?
-
-> I only intended that for pg_class and other bootstrap-sort-of tables,
-> maybe all system tables. Normal heap files could look like RELOID.VERSION,
-> whereas system tables would look like "name.DBOID".
-
-That would imply that the very bottom levels of the system know all
-about which tables are system tables and which are not (and, if you
-are really going to insist on the "name" part of that, that they
-know what name goes with each system-table OID). I'd prefer to avoid
-that. The less the smgr knows about the upper levels of the system,
-the better.
-
-> Clearly there's no market for renaming system tables or dropping any
-> of their columns.
-
-No, but there is a market for compacting indexes on system relations,
-and I haven't heard a good proposal for doing index compaction in place.
-So we need versioning for system indexes.
-
->> Vadim's concerned about every byte that has to go into the WAL log,
->> and I think he's got a good point.
-
-> True. But if you only do it for the system tables then it might take less
-> space than keeping track of lots of table spaces that are unneeded. :-)
-
-Again, WAL should not need to distinguish system and user tables.
-
-And as for the keeping track, the tablespace OID will simply replace the
-database OID in the log and in the smgr interfaces. There's no "extra"
-cost, except maybe by comparison to a system with neither tablespaces
-nor multiple databases.
-
- regards, tom lane
-
-Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA02996
- for
; Sat, 1 Jul 2000 10:39:10 -0400 (EDT)
-Received: from regulus.student.UU.SE ([130.238.5.2]:50862 "EHLO
- regulus.its.uu.se") by merganser.its.uu.se with ESMTP
- id ; Sat, 1 Jul 2000 16:56:49 +0200
-Received: from peter (helo=localhost)
- by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
- id 138Oo3-0003UQ-00; Sat, 01 Jul 2000 17:03:43 +0200
-Date: Sat, 1 Jul 2000 17:03:42 +0200 (CEST)
-To: Tom Lane
-cc: "Mikheev, Vadim" ,
- "'Hiroshi Inoue'" ,
- Thomas Lockhart ,
- Bruce Momjian
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=ISO-8859-1
-Content-Transfer-Encoding: 8BIT
-Sender: Peter Eisentraut
-Status: RO
-
-Tom Lane writes:
-
-> In practice, for someone who doesn't need to worry about tablespaces
-> (because they put the installation on a disk with enough room for
-> their purposes), the whole thing acts exactly the same as it does now.
-
-But I'd venture the guess that for someone who wants to use tablespaces it
-wouldn't work as expected. Table spaces should represent a physical
-storage location. Creation of table spaces should be a restricted
-operation, possibly more than, but at least differently from, databases.
-Eventually, table spaces probably will have attributes, such as
-optimization parameters (random_page_cost). This will not work as expected
-if you intermix them with the databases.
-
-I'd expect that if I have three disks and 50 databases, then I make three
-tablespaces and assign the databases to them. I'll bet lunch that if we
-don't do it that way that before long people will come along and ask for
-something that does work this way.
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA03777
- for
; Sat, 1 Jul 2000 13:21:38 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e61He8S63312;
- Sat, 1 Jul 2000 13:40:08 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e61Hd7S58820
- for
; Sat, 1 Jul 2000 13:39:07 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id NAA22822;
- Sat, 1 Jul 2000 13:37:21 -0400 (EDT)
-cc: "Mikheev, Vadim" ,
- "'Hiroshi Inoue'" ,
- Thomas Lockhart ,
- Bruce Momjian
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-Comments: In-reply-to Peter Eisentraut
- message dated "Sat, 01 Jul 2000 17:03:42 +0200"
-Date: Sat, 01 Jul 2000 13:37:21 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-Peter Eisentraut
writes:
-> I'd expect that if I have three disks and 50 databases, then I make three
-> tablespaces and assign the databases to them.
-
-In our last installment, you were complaining that you didn't want to
-be bothered with that ;-)
-
-But I don't see any reason why CREATE DATABASE couldn't take optional
-parameters indicating where to create the new DB's default tablespace.
-We already have a LOCATION option for it that does something close to
-that.
-
-Come to think of it, it would probably make sense to adapt the existing
-notion of "location" (cf initlocation script) into something meaning
-"directory that users are allowed to create tablespaces (including
-databases) in". If there were an explicit table of allowed locations,
-it could be used to address the protection issues you raise --- for
-example, a location could be restricted so that only some users could
-create tablespaces/databases in it. $PGDATA/data would be just the
-first location in every installation.
-
- regards, tom lane
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA14294
- for
; Sun, 2 Jul 2000 11:16:51 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e62FGqS51200;
- Sun, 2 Jul 2000 11:16:52 -0400 (EDT)
-Received: from merganser.its.uu.se (merganser.its.uu.se [130.238.6.236])
- by hub.org (8.10.1/8.10.1) with ESMTP id e62FGaS50925
- for
; Sun, 2 Jul 2000 11:16:36 -0400 (EDT)
-Received: from regulus.student.UU.SE ([130.238.5.2]:52424 "EHLO
- regulus.its.uu.se") by merganser.its.uu.se with ESMTP
- id ; Sun, 2 Jul 2000 17:15:57 +0200
-Received: from peter (helo=localhost)
- by regulus.its.uu.se with local-esmtp (Exim 3.02 #2)
- id 138lZz-0001VD-00; Sun, 02 Jul 2000 17:22:43 +0200
-Date: Sun, 2 Jul 2000 17:22:43 +0200 (CEST)
-To: Tom Lane
-cc: "Mikheev, Vadim" ,
- "'Hiroshi Inoue'" ,
- Thomas Lockhart ,
- Bruce Momjian
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: Re: [HACKERS] Big 7.1 open items
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=ISO-8859-1
-Content-Transfer-Encoding: 8BIT
-Precedence: bulk
-Status: RO
-
-Tom Lane writes:
-
-> Come to think of it, it would probably make sense to adapt the existing
-> notion of "location" (cf initlocation script) into something meaning
-> "directory that users are allowed to create tablespaces (including
-> databases) in".
-
-This is what I've been trying to push all along. But note that this
-mechanism does allow multiple databases per location. :)
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA16088
- for
; Mon, 3 Jul 2000 04:30:05 -0400 (EDT)
-Received: from gandalf.it-austria.net (gandalf.it-austria.net [213.150.1.65]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id EAA19031 for
; Mon, 3 Jul 2000 04:30:07 -0400 (EDT)
-Received: from sdexcgtw01.f000.d0188.sd.spardat.at (sdgtw.sd.spardat.at [172.18.1.16])
- by gandalf.it-austria.net (xxx/xxx) with ESMTP id KAA28416;
- Mon, 3 Jul 2000 10:28:06 +0200
-Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2448.0)
- id ; Mon, 3 Jul 2000 10:28:06 +0200
-Message-ID: <219F68D65015D011A8E000006F8590C605BA59B0@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Hiroshi Inoue'" ,
- Peter Eisentraut
-Cc: Bruce Momjian
, Jan Wieck ,
- PostgreSQL-development
,
- "Ross J. Reedstrom"
-Subject: AW: [HACKERS] Big 7.1 open items
-Date: Mon, 3 Jul 2000 10:28:05 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="windows-1252"
-Status: RO
-
-
-> > > > > In my mind the point of the "database" concept is to
-> > > provide a domain
-> > > > > within which custom datatypes and functions are available.
-> > > >
-> > >
-> > > AFAIK few users understand it and many users have wondered
-> > > why we couldn't issue cross "database" queries.
-> >
-> > Imho the same issue is access to tables on another machine.
-> > If we "fix" that, access to another db on the same instance is just
-> > a variant of the above.
-> >
->
-> What is the difference between SCHEMA and your "database"?
-> I myself am confused about them.
-
-"my *database*" corresponds to the current database, which is created with
-"create database" in postgresql. It corresponds to the catalog concept in
-SQL99.
-
-The schema is below the database. Access to different schemas with one
-connection is mandatory. Access to different catalogs (databases) with one
-connection is not mandatory, but should imho be solved analogously to access
-to another catalog on a different (SQL99) cluster. This would be a very
-nifty feature.
-
-Andreas
-
- by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA02116
- for
; Fri, 16 Jun 2000 14:55:13 -0400 (EDT)
-Received: from hub.org (
[email protected] [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.12 $) with ESMTP id NAA21581 for
; Fri, 16 Jun 2000 13:53:58 -0400 (EDT)
-Received: from hub.org (majordom@localhost [127.0.0.1])
- by hub.org (8.10.1/8.10.1) with SMTP id e5GHpqN06086;
- Fri, 16 Jun 2000 13:51:52 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
- by hub.org (8.10.1/8.10.1) with ESMTP id e5GHpcN05946
- for
; Fri, 16 Jun 2000 13:51:39 -0400 (EDT)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss2.sss.pgh.pa.us (8.9.3/8.9.3) with ESMTP id NAA07945
- for
; Fri, 16 Jun 2000 13:51:38 -0400 (EDT)
-Subject: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename
-Date: Fri, 16 Jun 2000 13:51:37 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-After further thought I think there's a lot of merit in Hiroshi's
-opinion that physical file names should not be tied to relation OID.
-If we use a separately generated value for the file name, we can
-solve a lot of problems pretty nicely by means of "table versioning".
-
-For example: VACUUM can't compact indexes at the moment, and what it
-does do (scan the index and delete unused entries) is really slow.
-The right thing to do is for it to generate an all-new index file,
-but how do we do that without creating a risk of leaving the index
-corrupted if we crash partway through? The answer is to build the
-new index in a new physical file. But how do we install the new
-file as the real index atomically, when it might span multiple
-segments? If the physical file name is decoupled from the relation's
-name *and* OID then there is no problem: the atomic event that makes
-the new file(s) the real table contents is the commit of the new
-pg_class row with the new value for the physical filename.
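
The versioning idea above can be modeled roughly as follows. This is an in-memory illustration only: `rel_entry` stands in for a pg_class row, the field names and the "OID.VERSION" file-name format are assumptions made for the sketch, and the real atomicity would come from the transaction commit of the catalog row, which this model does not implement.

```c
#include <stdio.h>
#include <string.h>

#define PHYSNAME_LEN 64

/* Stand-in for one pg_class row; the field names are hypothetical. */
struct rel_entry {
    unsigned int oid;               /* never changes */
    unsigned int version;           /* bumped on each rebuild */
    char physname[PHYSNAME_LEN];    /* current physical file name */
};

/* A rebuild in progress: the new file exists but is not yet visible. */
struct pending_rebuild {
    unsigned int new_version;
    char new_physname[PHYSNAME_LEN];
};

/* Step 1: pick a fresh file name; the rebuilt data would be written there. */
void
begin_rebuild(const struct rel_entry *rel, struct pending_rebuild *p)
{
    p->new_version = rel->version + 1;
    snprintf(p->new_physname, sizeof p->new_physname, "%u.%u",
             rel->oid, p->new_version);
}

/* Step 2: commit.  Updating the row is the single step that switches files. */
void
commit_rebuild(struct rel_entry *rel, const struct pending_rebuild *p)
{
    rel->version = p->new_version;
    memcpy(rel->physname, p->new_physname, sizeof rel->physname);
}
```

If the rebuild is abandoned or the backend crashes before `commit_rebuild`, the row still names the old file, so the only cleanup needed is removing the unreferenced new file.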
-
-Aside from possible improvements in VACUUM, this would let us do a
-robust implementation of CLUSTER, and we could do the "really change
-the table" variant of ALTER TABLE DROP COLUMN the same way if anyone
-wants to do it.
-
-The only cost is that we need an additional column in pg_class to
-hold the physical file name. That's not so bad, especially when
-you remember that we'd surely need to add something to pg_class for
-tablespace support anyway.
-
-If we bite that bullet, then we could also do something to satisfy
-Bruce about having legible file names ;-). The column in pg_class
-could perfectly well be a string, not a pure number, and that means
-that we can throw in the relname (truncated to fit of course). So
-the thing would act a lot like the original-relname-plus-OID variant
-that's been discussed so far. (Original relname because ALTER TABLE
-RENAME would *not* change the physical file name. But we could
-think about a form of VACUUM that creates a whole new table by
-versioning, and that would presumably bring the physical name back
-in sync with the logical relname.)
-
-Here is a sketch of a concrete proposal. I see no need to have
-separate pg_class columns for tablespace and physical relname;
-instead, I suggest there be a column of type NAME that is the
-file pathname (relative to the database directory). Further,
-instead of the existing convention of appending .N to the base
-file name to make extension segment names, I propose that we
-always have a segment number in the physical file name, and that
-the pg_class entry be required to contain a "%d" somewhere that
-indicates where. The actual filename is manufactured by
- sprintf(tempbuf, value_from_pg_class_column, segment_number);
-
-As an example, the arrangement I was suggesting earlier today
-about segments in different subdirectories of a tablespace
-could be implemented by assigning physical filenames like
-
- tablespace/%d/12345_relname
-
-where the 12345 is a value generated separately from the table's OID.
-(We would still use the OID counter to produce these numbers, and
-in fact there's no reason not to use the table's OID as the initial
-unique ID for the physical filename. The point is just that the
-physical filename doesn't have to remain forever equal to the
-relation's OID.)
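
A small sketch of how a low-level routine might expand such a template. The helper name `segment_path` is hypothetical, and it uses `snprintf` rather than the bare `sprintf` shown above so that an over-long path is reported instead of overflowing the buffer. It also rejects templates that do not contain exactly one "%d", since the template is used directly as a format string.

```c
#include <stdio.h>
#include <string.h>

/*
 * Expand a physical-name template containing exactly one "%d" into the
 * path of one segment.  Returns 0 on success, or -1 if the template is
 * malformed or the result does not fit in buf.
 */
int
segment_path(char *buf, size_t buflen, const char *tmpl, int segno)
{
    const char *spec = strstr(tmpl, "%d");
    int n;

    /* Require exactly one "%d" and no other '%' so the format is safe. */
    if (spec == NULL || strchr(tmpl, '%') != spec ||
        strchr(spec + 1, '%') != NULL)
        return -1;

    n = snprintf(buf, buflen, tmpl, segno);
    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

For the template "tablespace/%d/12345_relname" and segment number 2, the buffer receives "tablespace/2/12345_relname". The routine needs no knowledge of the layout policy beyond the position of the "%d".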
-
-If we use type NAME for this string then the tablespace part of the path
-would have to be kept to no more than ~ 15 characters, but that seems
-workable enough. (Anybody who really didn't like that could recompile
-with larger NAMEDATALEN. Doesn't seem worth inventing a separate type.)
-
-As Hiroshi pointed out, one of the best aspects of this approach
-is that the physical table layout policy doesn't have to be hard-wired
-into low-level file access routines. The low-level routines don't
-need to know much of anything about the format of the pathname,
-they just stuff in the right segment number and use the name. The
-layout policy need only be known to one single routine that generates
-the strings that go into pg_class. So it'd be really easy to change.
-
-One thing we'd have to work out is that the critical system tables
-(eg, pg_class itself, as well as its indexes) would have to have
-predictable physical names. Otherwise there's no way for a new
-backend to bootstrap itself up ... it can't very well read pg_class
-to find out where pg_class is. A brute-force solution is to forbid
-reversioning of the critical tables, but I suspect we can find a
-less restrictive answer.
-
-This seems like it'd satisfy all the concerns that have been raised.
-Comments?
-
- regards, tom lane
-
-To: Chris Bitmead
-Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename
-Comments: In-reply-to Chris Bitmead
- message dated "Sat, 17 Jun 2000 10:50:10 +1000"
-Date: Fri, 16 Jun 2000 21:12:29 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-Chris Bitmead writes:
-> At least on UNIX, couldn't you use a hard-link and change the name in
-> pg_class immediately? Let the brain-dead operating systems use the
-> vacuum method.
-
-Hmm ... maybe, but it doesn't seem worth the portability headache to
-me. We do have an NT port that we don't want to break, and I don't
-think RENAME TABLE is worth the trouble of testing/supporting two
-implementations.
-
-Even on Unix, aren't there filesystems that don't do hard links?
-Not that I'd recommend running Postgres on such a volume, but...
-
- regards, tom lane
-
-From: "Hiroshi Inoue"
-To: "Tom Lane"
-Subject: RE: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename
-Date: Sat, 17 Jun 2000 18:38:53 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-2022-jp"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
-Importance: Normal
-Precedence: bulk
-Status: RO
-
-> -----Original Message-----
-> Behalf Of Tom Lane
->
-> After further thought I think there's a lot of merit in Hiroshi's
-> opinion that physical file names should not be tied to relation OID.
-> If we use a separately generated value for the file name, we can
-> solve a lot of problems pretty nicely by means of "table versioning".
->
-> For example: VACUUM can't compact indexes at the moment, and what it
-> does do (scan the index and delete unused entries) is really slow.
-> The right thing to do is for it to generate an all-new index file,
-> but how do we do that without creating a risk of leaving the index
-> corrupted if we crash partway through? The answer is to build the
-> new index in a new physical file. But how do we install the new
-> file as the real index atomically, when it might span multiple
-> segments? If the physical file name is decoupled from the relation's
-> name *and* OID then there is no problem: the atomic event that makes
-> the new file(s) the real table contents is the commit of the new
-> pg_class row with the new value for the physical filename.
->
-> Aside from possible improvements in VACUUM, this would let us do a
-> robust implementation of CLUSTER, and we could do the "really change
-> the table" variant of ALTER TABLE DROP COLUMN the same way if anyone
-> wants to do it.
->
-
-Yes, I've wondered how we could implement a column_is_really_dropped
-ALTER TABLE DROP COLUMN feature without this kind of mechanism.
-
-> The only cost is that we need an additional column in pg_class to
-> hold the physical file name. That's not so bad, especially when
-> you remember that we'd surely need to add something to pg_class for
-> tablespace support anyway.
->
-> If we bite that bullet, then we could also do something to satisfy
-> Bruce about having legible file names ;-). The column in pg_class
-> could perfectly well be a string, not a pure number, and that means
-> that we can throw in the relname (truncated to fit of course). So
-> the thing would act a lot like the original-relname-plus-OID variant
-> that's been discussed so far. (Original relname because ALTER TABLE
-> RENAME would *not* change the physical file name. But we could
-> think about a form of VACUUM that creates a whole new table by
-> versioning, and that would presumably bring the physical name back
-> in sync with the logical relname.)
->
-> As Hiroshi pointed out, one of the best aspects of this approach
-> is that the physical table layout policy doesn't have to be hard-wired
-> into low-level file access routines. The low-level routines don't
-> need to know much of anything about the format of the pathname,
-> they just stuff in the right segment number and use the name. The
-> layout policy need only be known to one single routine that generates
-> the strings that go into pg_class. So it'd be really easy to change.
->
-
-Ross's approach is fundamentally the same, though he is using a relname+OID
-naming rule. I've said his trial is the most practical one.
-
-> One thing we'd have to work out is that the critical system tables
-> (eg, pg_class itself, as well as its indexes) would have to have
-> predictable physical names.
-
-The only requirement on a relation's filename is uniqueness.
-So giving system tables fixed names doesn't introduce any inconsistency.
-For system relations that wouldn't be so bad, because CLUSTER/
-ALTER TABLE DROP COLUMN ... would (maybe) be unnecessary.
-But for system indexes, it is preferable that VACUUM/REINDEX
-be able to rebuild them safely. System indexes currently never shrink.
-
-Regards.
-
-Hiroshi Inoue
-
-Date: Sat, 17 Jun 2000 15:01:53 +0200 (CEST)
-To: Tom Lane
-Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated
- filename
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=ISO-8859-1
-Content-Transfer-Encoding: 8BIT
-Precedence: bulk
-Status: RO
-
-Tom Lane writes:
-
-> tablespace/%d/12345_relname
-
-Throwing table spaces and relation names into one pot doesn't excite me
-very much. For example, before long people will want to
-
-* Query what tables are in what space (without using string operations)
-Consider for example creating a new table and choosing where to put it.
-
-* Rename table spaces
-
-* Assign attributes of some sort to table spaces (permissions, etc.)
-
-* Use table space names with more than 15 characters. :)
-
-Somehow table spaces need to be catalogued. You could still make the
-physical file name 'tablespaceoid/rest' without actually having to look up
-anything, although that depends on your symlink idea which is still under
-discussion.
-
-Then, why are all nth segments of tables in one directory in that
-proposal?
-
-Also, you said before that an old relname (after rename) is worse than
-none at all. I couldn't agree more.
-
-Why not use OID.[SEGMENT.]VERSION for the physical relname (different
-order possible)? That way you at least have some guaranteed correspondence
-between files and tables. Version could probably be an INT2, so you save
-some space.
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
-To: "Hiroshi Inoue"
-Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename
-Comments: In-reply-to "Hiroshi Inoue"
- message dated "Sat, 17 Jun 2000 18:38:53 +0900"
-Date: Sat, 17 Jun 2000 12:24:56 -0400
-From: Tom Lane
-Precedence: bulk
-Status: RO
-
-"Hiroshi Inoue" writes:
->> One thing we'd have to work out is that the critical system tables
->> (eg, pg_class itself, as well as its indexes) would have to have
->> predictable physical names.
-
-> The only limitation of the relation filename is the uniqueness.
-> So it doesn't introduce any inconsistency that system tables
-> have fixed name.
-> As for system relations it wouldn't be so bad because CLUSTER/
-> ALTER TABLE DROP COLUMN ... would be unnecessary(maybe).
-> But as for system indexes,it is preferable that VACUUM/REINDEX
-> could rebuild them safely. System indexes never shrink currently.
-
-Right, it's the index-shrinking business that has me worried.
-Most of the other reasons for swapping in a new file don't apply
-to system tables, but that one does.
-
-One possibility is to say that system *tables* can't be reversioned
-(at least not the critical ones) but system *indexes* can be.
-Then we'd have to use your ignore-system-indexes stuff during backend
-startup, until we'd found out where the indexes are. Might be too big
-a time penalty however... not sure. Shared cache inval of a system
-index could be a little tricky too; I don't think the catcache routines
-are prepared to fall back to a non-index scan, are they?
-
-On the whole it might be better to cheat by using a side data structure
-like the pg_internal.init file, that a backend could consult to find out
-where the indexes are now.
-
- regards, tom lane
-
-Date: Sun, 18 Jun 2000 23:24:26 +0200 (CEST)
-To: Tom Lane
-cc: PostgreSQL Development
-Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated
- filename
-MIME-Version: 1.0
-Content-Type: TEXT/PLAIN; charset=ISO-8859-1
-Content-Transfer-Encoding: 8BIT
-Precedence: bulk
-Status: RO
-
-Tom Lane writes:
-
-> I don't think it's a good idea to have to consult pg_tablespace to find
-> out where a table actually is --- I think the pathname (or smgr access
-> token as Ross would call it ;-)) ought to be determinable from just the
-> pg_class entry.
-
-That's why I suggested the table space oid. That would be readily
-available from pg_class.
-
-
-> Tablespaces can have logical names stored in pg_tablespace; they just
-> can't contribute more than a dozen or so characters to file pathnames
-> under the implementation I'm proposing. That doesn't seem too
-> unreasonable; the pathname part can be some sort of abbreviated name.
-
-Since the abbreviated name is really only used internally it might as well
-be the oid. Otherwise you create a weird functional dependency like the
-pg_shadow.usesysid field that's just an extra layer of maintenance.
-
-
-> this implementation mechanism will support either policy choice ---
-> original relname in the filename, or just a numeric ID for the
-> filename
-
-But when you look at a file name `12345_accounts_recei' you know neither
-
-* whether the table name was really `accounts_recei' or whether the name
-was truncated
-
-* whether the table still has that name, whatever it was
-
-* what table this is at all
-
-So in the aggregate you really know less than nothing. :-)
-
-
-> > Why not use OID.[SEGMENT.]VERSION for the physical relname (different
-> > order possible)?
->
-> Doesn't give you a manageable way to split segments across different
-> disks.
-
-Okay, so maybe ${base}/TABLESPACEOID/SEGMENT/RELOID.VERSION.
-
-This doesn't need any catalog lookup outside of pg_class, yet it's still
-easy to resolve to human-readable names by simple admin tools (SELECT *
-FROM pg_foo WHERE oid = xxx). VERSION would be unique within a conceptual
-relation, so you could even see how many times the relation was altered in
-major ways (kind of).
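
A small sketch of how the `${base}/TABLESPACEOID/SEGMENT/RELOID.VERSION` layout could be built and read back without any catalog lookup. The function names and the sample OIDs are hypothetical; only the path shape comes from the proposal above.

```python
from pathlib import PurePosixPath


def relation_file(base, tablespace_oid, segment, rel_oid, version):
    """Build ${base}/TABLESPACEOID/SEGMENT/RELOID.VERSION."""
    return (PurePosixPath(base) / str(tablespace_oid) / str(segment)
            / f"{rel_oid}.{version}")


def parse_relation_file(path):
    """Recover the numeric parts an admin tool would then look up by oid."""
    parts = PurePosixPath(path).parts
    rel_oid, version = parts[-1].split(".")
    return {
        "tablespace_oid": int(parts[-3]),
        "segment": int(parts[-2]),
        "rel_oid": int(rel_oid),
        "version": int(version),
    }


path = relation_file("/data/base", 16384, 1, 18721, 3)
print(path)
print(parse_relation_file(path))
```

Because every component is numeric, the file name never goes stale after a rename; resolving it to a readable name is left to a query by oid.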
-
-
---
-Peter Eisentraut Sernanders väg 10:115
-http://yi.org/peter-e/ Sweden
-
-
-From: "Hiroshi Inoue"
-To: "Chris Bitmead" , "Tom Lane"
-Cc: "Peter Eisentraut"
-Subject: RE: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated filename
-Date: Mon, 19 Jun 2000 09:24:56 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="ISO-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Importance: Normal
-Precedence: bulk
-Status: RO
-
-> -----Original Message-----
-> Behalf Of Chris Bitmead
->
-> Tom Lane wrote:
->
-> > > Also, you said before that an old relname (after rename) is worse than
-> > > none at all. I couldn't agree more.
-> >
-> > I'm not the one who wants relnames in the physical names ;-). However,
-> > this implementation mechanism will support either policy choice ---
-> > original relname in the filename, or just a numeric ID for the filename
-> > --- and that seems like a good sign to me.
-> >
-> > > Why not use OID.[SEGMENT.]VERSION for the physical relname (different
-> > > order possible)?
->
-> Unless VERSION is globally unique like an oid is, having RELNAME.VERSION
-> would be a problem if you created a table with the same name as a
-> recently renamed table.
->
-
-In my proposal (relname+unique-id), the unique-id is globally unique
-and the relname is only for the DBA's convenience. I've said many times
-that we should be as free as possible from file naming rules.
-I myself don't care what relation files are named, except that the names
-should be globally unique. I had to offer my opinion on file naming
-because people have been so enthusiastic about globally_not_unique
-file naming.
-
-Regards.
-
-Hiroshi Inoue
-
-
-Date: Sat, 17 Jun 2000 10:50:10 +1000
-From: Chris Bitmead
-X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.14-5.0 i686)
-X-Accept-Language: en
-MIME-Version: 1.0
-To: Tom Lane
-Subject: Re: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated
- filename
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Precedence: bulk
-Status: ROr
-
-Tom Lane wrote:
- So
-> the thing would act a lot like the original-relname-plus-OID variant
-> that's been discussed so far. (Original relname because ALTER TABLE
-> RENAME would *not* change the physical file name. But we could
-> think about a form of VACUUM that creates a whole new table by
-> versioning, and that would presumably bring the physical name back
-> in sync with the logical relname.)
-
-At least on UNIX, couldn't you use a hard-link and change the name in
-pg_class immediately? Let the brain-dead operating systems use the
-vacuum method.
-
-From: "Hiroshi Inoue"
-Cc: "PostgreSQL Development",
- "Tom Lane"
-Subject: RE: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generatedfilename
-Date: Mon, 19 Jun 2000 13:52:34 +0900
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="ISO-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
-X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
-Importance: Normal
-Precedence: bulk
-Status: RO
-
-> -----Original Message-----
-> Behalf Of Peter Eisentraut
->
-> Tom Lane writes:
->
-> > I don't think it's a good idea to have to consult pg_tablespace to find
-> > out where a table actually is --- I think the pathname (or smgr access
-> > token as Ross would call it ;-)) ought to be determinable from just the
-> > pg_class entry.
->
-> That's why I suggested the table space oid. That would be readily
-> available from pg_class.
->
-
-It seems to me that the following two questions, 1) and 2), have always
-been mixed up. IMHO, they should be clearly distinguished.
-
-1) Where the table is stored
- Currently PostgreSQL relies on a relname -> filename mapping
- rule to access *existing* relations and doesn't keep this
- information in its database. Our (Tom, Ross, me) proposal is to
- keep the information (a token) in pg_class and provide a standard
- transactional control mechanism for changes to table file
- allocation. By doing so we would be free from any table
- allocation (naming) rule.
- Isn't this the kind of thing we should have had from the start?
-
-2) Where to store the table
- Yes, TABLE(DATA)SPACE should encapsulate this concept.
-
-I want a decision about 1) first. Ross has already tried it without
-2).
-
-Comments?
-
-As for 2), everyone seems to have their own opinion and the discussion
-has always been divergent. Please don't discard 1) along with it.
-
-Regards.
-
-Hiroshi Inoue
-
-
-Message-ID: <219F68D65015D011A8E000006F8590C605BA5978@sdexcsrv1.f000.d0188.sd.spardat.at>
-From: Zeugswetter Andreas SB
-To: "'Tom Lane'", Peter Eisentraut
-Subject: AW: [HACKERS] OK, OK, Hiroshi's right: use a seperately-generated
- filename
-Date: Mon, 19 Jun 2000 15:46:22 +0200
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.5.2448.0)
-Content-Type: text/plain;
- charset="iso-8859-1"
-Precedence: bulk
-Status: RO
-
-
-> It's better than *all* segments of tables in one directory, which is
-> what you get if the segment number is just a component of a flat file
-> name. We have to have a better answer than that for people who need
-> to cope with tables bigger than a disk. Perhaps someone can
-> think of a
-> better answer than subdirectory-per-segment-number, but I think that
-> will work well enough; and it doesn't add any complexity for file
-> access.
-
-I do not see this connection between a filesystem and a disk.
-Modern systems can join more than one disk into
-one filesystem.
-
-Also, if we think about separating large tables into smaller parts,
-we imho want something where the optimizer knows
-what data it will find in which part of the table.
-
-Andreas
-
-Date: Mon, 10 Jul 2000 10:03:54 -0400
-From: Mike Mascari
-Organization: Mascari Development Inc
-X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.5-15 i586)
-X-Accept-Language: en
-MIME-Version: 1.0
-CC: Tom Lane, Philip Warner,
- Chris Bitmead
-Subject: Re: [HACKERS] Re: [GENERAL] PostgreSQL vs. MySQL
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Precedence: bulk
-Status: ROr
-
-Bruce Momjian wrote:
->
-> > And of course the major problem with *that* is how do you get the
-> > connection request to arrive at a backend that's been prestarted in
-> > the right database? If you don't commit to a database then there's
-> > not a whole lot of prestarting that can be done.
-> >
-> > It occurs to me that this'd get a whole lot more feasible if one
-> > postmaster == one database, which is something we *could* do if we
-> > implemented schemas. Hiroshi's been arguing that the current hard
-> > separation between databases in an installation should be done away
-> > with in favor of schemas, and I'm starting to see his point...
->
-> This is interesting. You believe schema's would allow a pool of
-> backends to connect to any database? That would clearly be a win.
-
-I'm just curious, but did a consensus ever develop on schemas? It
-seemed that the schemas/tablespace thread just ran out of steam.
-For what it's worth, I like the idea of:
-
-1. PostgreSQL installation -> SQL cluster of catalogs
-2. PostgreSQL database -> SQL catalog
-3. PostgreSQL schema -> SQL schema
-
-This correlates nicely with the current representation of
-DATABASE. People can run multiple SQL clusters by running
-multiple postmasters on different ports. Today, most people
-achieve a logical separation of data by issuing multiple CREATE
-DATABASE commands. But under the above, most sites would run with
-a single PostgreSQL database (SQL catalog), since:
-
-"Catalogs are named collections of schemas in an SQL-environment"
-
-This would mirror the behavior of Oracle, where most people run
-with a single Oracle SID. The logical separation would be
-achieved with SCHEMA's a level under the current DATABASE (a.k.a.
-catalog). This eliminates the problem of using softlinks and
-creating various subdirectories to mirror *logical* partitioning
-of data. It also alleviates the problem people currently
-encounter when they've built their data model around multiple
-DATABASE's but learn later that they need access to more than one
-simultaneously. Instead, they'll model their design around
-multiple SCHEMA's which exist within a single DATABASE instance.
-
-It seems that the discussion of tablespaces shouldn't be mixed
-with SCHEMA's except to note that a DATABASE (catalog) should
-have a default TABLESPACE whose path matches the current one:
-
-../pgsql/data/base/
-
-Later, users might be able to create a hierarchy of default
-TABLESPACE's where the location of the object is found with logic
-like:
-
-1. Is there an object-specified tablespace?
- (ex: CREATE TABLE payroll IN TABLESPACE...)
-2. Is there a user-specified default tablespace?
- (ex: CREATE USER mike DEFAULT TABLESPACE...)
-3. Is there a schema-specified default tablespace?
- (ex: CREATE SCHEMA accounting DEFAULT TABLESPACE...)
-4. Use the catalog-default tablespace
- (ex: CREATE DATABASE postgres DEFAULT LOCATION '/home/pgsql')
-
-with the last example creating the system tablespace,
-'system_tablespace', with '/home/pgsql' as the location.
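
The lookup order listed above can be sketched as a simple first-match fallback. The function and parameter names are hypothetical, and `None` stands for "no tablespace specified at this level"; none of this is existing PostgreSQL syntax or code.

```python
from typing import Optional


def resolve_tablespace(object_ts: Optional[str],
                       user_default: Optional[str],
                       schema_default: Optional[str],
                       catalog_default: str) -> str:
    """Return the first tablespace specified, from most to least specific."""
    for candidate in (object_ts, user_default, schema_default):
        if candidate is not None:
            return candidate
    # Nothing more specific was given: use the catalog (database) default.
    return catalog_default


# No object-level or user-level choice, so the schema default wins.
print(resolve_tablespace(None, None, "accounting_ts", "system_tablespace"))
```

The catalog default is the only required argument, which matches the idea that every DATABASE has a default TABLESPACE to fall back on.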
-
-Anyways, it seems a consensus should be developed on the whole
-Cluster/Catalog/Schema scenario.
-
-Mike Mascari
-
- "'Hiroshi Inoue'" ,
- "'Ross J. Reedstrom'" ,
- "'Mike Mascari'" , ,
- "'Tom Lane'" ,
- "'Zeugswetter Andreas SB'" ,
- "'The Hermit Hacker'" ,
- "'Don Baccus'" ,
- "'Thomas Lockhart'" ,
- "'Chris Bitmead'" ,
- "'Lamar Owen'" ,
- "'Vadim Mikheev'"
-Subject: Tablespaces - checkout SAP DB
-Date: Mon, 16 Apr 2001 02:56:04 +1000
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="iso-8859-1"
-Content-Transfer-Encoding: 7bit
-X-Priority: 3 (Normal)
-X-MSMail-Priority: Normal
-X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2910.0)
-X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
-Importance: Normal
-Status: RO
-
-Hi everyone,
-
-Sorry about the long To list - this is to everyone I noticed commenting in:
-http://www.postgresql.org/docs/pgsql/doc/TODO.detail/tablespaces
-
-I strongly recommend checking out the approach used in SAP DB:
-
-http://www.sap.com/solutions/technology/sapdb/sap_db_documentation.htm
-
-Their glossy 2 page brochure emphasizes the way they handle
-tablespaces as the strongest point for ease of administration:
-
-http://www.sap.com/solutions/technology/sapdb/pdf/50033321.pdf
-
-Directory distribution explained in:
-http://www.sap.com/solutions/technology/sapdb/pdf/directorydistrib_72eng.pdf
-
-Architecture and tablespace/devspace concepts explained in:
-
-http://www.sap.com/solutions/technology/sapdb/pdf/dbmgui_73eng.pdf
-(721K)
-
-A good short overview can be obtained from the Glossary:
-
-http://www.sap.com/solutions/technology/sapdb/sap_db_glossary.htm
-(not .pdf - ordinary html)
-
-vvvvvvv
-data devspace
-
-The user data (tables, indexes) and the SQL catalog are stored in the data
-devspaces. A table or an index needs one page (minimum); a table can use all
-the data devspaces that is the whole database (maximum). A table increases
-or decreases in size automatically without administrative intervention.
-
-As a rule, a database internal striping algorithm distributes the data
-belonging to a table evenly across all the data devspaces. An assignment of
-tables to data devspaces is not possible nor is it necessary.
-
-When installing the database instance you can configure one or more data
-devspaces and while the database is running you can also add new data
-devspaces. The disk storage space defined by all the data devspaces is the
-total size of the database.
-
-devspace
-
-This term denotes a physical disk or part of a physical disk. This can be a
-raw device or a file.
-
-log devspace
-
-What is recorded in a log devspace is all the changes in the contents of the
-database, to enable the contents to be recovered or restored after hardware
-faults. The complete log can consist of a number of devspaces. You can
-define the number of log devspaces required when installing the database
-instance and can add new log devspaces even while the database is operating.
-To ensure that the data on the database is kept safe, you have the option of
-mirroring the log devspace(s) (set parameter LOG_MODE to DUAL).
-
-In log backups the contents of the log devspace(s) is copied to a file and
-the space originally occupied by it is released for log data. The backup
-files are numbered by the system in sequence. The selected size of the
-archive log devspace should therefore be sufficient for all the changes
-occurring between two backups to be recorded there.
-
-serverdb
-
-A Serverdb consists of the system devspace, one or more log devspaces, and
-one or more data devspaces.
-
-For security and performance reasons, each devspace type should be kept on a
-different disk. The log devspaces of a serverdb can also be mirrored to
-obtain a higher degree of availability. The disks used should present
-uniform performance data (especially access speeds) because this is the only
-way that equal usage of the devspaces can be achieved. If necessary, a
-database instance can be expanded by additional data devspaces while the
-database is running.
-
-The devspace usage level of a database instance is therefore a critical
-parameter of database operation and must be monitored. If the data devspaces
-become full, database operation stops. Further data devspaces can be defined
-in this state to allow database operation to continue.
-
-system devspace
-
-The restart information and the mapping of the logical page numbers to
-physical page addresses are administered in the system devspace. The size of
-the system devspace therefore depends directly on the database size and is
-determined by the database kernel.
-^^^^^^^^^^^^^^^^^
-
-The concept of just flexibly assigning space to databases,
-with only two types of space that should be kept on
-different spindle sets, plus the ability to add space
-*while running*, is what justifies their claim to much
-easier admin than Oracle.
-
-Many Postgresql sites run with far too few spindles anyway
-and don't have DBAs with a clue what to do with tablespaces.
-Now that SAP DB is also open source, making it easy for them
-could be critically important.
-
-I'm not even subscribed to pgsql-hackers and don't understand
-the internals enough to have any view on whether it's possible
-or how.
-
-But if it is possible to present similar *concepts* to DBAs
-from the "outside", with whatever actually goes on internally,
-that would be really *great*.
-
-Once the internals are done, others could more easily add
-admin tools and documentation comparable to SAP DB. Given
-the overwhelming advantages of PostgreSQL from all other
-points of view, this could be critically important.
-
-I was surprised to find no discussion of comparisons with
-SAP DB and what could be learned from its source release
-in a quick search of the web site and mailing lists.
-
-Seeya, Albert
-
-
-Received: from server1.pgsql.org (server1.pgsql.org [64.39.15.238])
- by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id f7REVIF27112
- for
; Mon, 27 Aug 2001 10:31:18 -0400 (EDT)
-Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
- by server1.pgsql.org (8.11.6/8.11.6) with ESMTP id f7REVkq86991;
- Mon, 27 Aug 2001 09:31:47 -0500 (CDT)
-Received: from svana.org (svana.org [210.9.66.30])
- by postgresql.org (8.11.3/8.11.4) with ESMTP id f7RDcEf82291
- for
; Mon, 27 Aug 2001 09:38:15 -0400 (EDT)
-Received: from kleptog by svana.org with local (Exim 3.12 #1 (Debian))
- id 15bMal-0000Ac-00; Mon, 27 Aug 2001 23:38:15 +1000
-Date: Mon, 27 Aug 2001 23:38:15 +1000
-From: Martijn van Oosterhout
-Subject: Re: [GENERAL] raw partition
-Reply-To: Martijn van Oosterhout
-MIME-Version: 1.0
-Content-Type: text/plain; charset=us-ascii
-Content-Disposition: inline
-User-Agent: Mutt/1.2.5i
-Precedence: bulk
-Status: OR
-
-> On Mon, Aug 27, 2001 at 12:46:16AM -0700, Jeff Davis wrote:
-> > On Sunday 26 August 2001 09:54 am, you wrote:
-> >
-> > Obviously, if done properly, it couldn't hurt. However, is it really worth
-> > the extra trouble to set it up, and more so, to debug an extra form that disk
->
-> I think it's only a matter of getting rid
-> of file system layer.
-
-But that won't work. Postgres currently stores each table in its own file.
-Thus, to implement raw access postgres would have to implement its own
-filesystem within the raw partition.
-
-By using the filesystems built into the OS, it can take advantage of
-filesystem smarts already there. Not to mention people just being able to use
-normal system commands to view what's there e.g. symlinks to relocate
-tables. I believe that filesystem technology within the OS will advance much
-faster than anything the postgres developers could come up with.
-
-For example, by running your database on an ext3 partition, all file
-metadata is automatically journalled, with no additional effort from the
-postgres developers. You could even choose to journal all database access
-(though I have no idea how that interacts with WAL).
-
-> > marginal utility for integrated functionality? Consider this: should postgres
-> > be its own OS; bootable and everything (get rid of all that OS overhead)? I
->
-> file system overhead is all, I think.
-> The only thing I am sure about is that
-> whether pg (and developers) will have to be
-> aware of the disk technology since it is
-> evolving continuously. Or is there another
-> layer provided by the OS: a layer
-> between physical disk and the filesystem?
-> That layer will have to understand UDMA technology,
-> SCSI technology? I have no idea.
-
-Well, a raw partition provided by the OS would hide such details. However,
-postgres would have to make assumptions about what kind of access patterns
-are optimal. The kernel is in a much better position to make such decisions
-about resource usage. Which is precisely why we have OSes in the first place.
-
-> > oracle allows this behaviour you speak of, but I have never used it. Does
-> > someone have experience (or benchmarks or whatever) with oracle's
-> > implementation?
->
-> I have never used an oracle
-
-I believe (someone correct me if I'm wrong) that even when used on a
-filesystem, oracle still places all its tables in a single file i.e. it has
-a filesystem layer built in. I think that's why it's a clear win for oracle
-because you *are* actually removing a layer.
-
-IMHO it's something postgres should stay well away from.
---
-Martijn van Oosterhout
-http://svana.org/kleptog/
-> It would be nice if someone came up with a certification system that
-> actually separated those who can barely regurgitate what they crammed over
-> the last few weeks from those who command secret ninja networking powers.
-
----------------------------(end of broadcast)---------------------------
-TIP 2: you can get off all lists at once with the unregister command
-
-Return-path:
-Received: from dual.buttafuoco.net (vsat-148-63-214-126.c004.g4.mrt.starband.net [148.63.214.126])
- by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g23KYjM24547
- for
; Sun, 3 Mar 2002 15:34:52 -0500 (EST)
-Received: from buttafuoco.net (dual [127.0.0.1])
- by dual.buttafuoco.net (8.11.2/8.11.2) with ESMTP id g23KYaF05729;
- Sun, 3 Mar 2002 15:34:36 -0500
-From: "Jim Buttafuoco"
-cc: Vadim Mikheev ,
-Subject: Re: [HACKERS] Status of index location patch
-Date: Sun, 3 Mar 2002 15:34:36 -0500
-X-Mailer: Open WebMail 1.62 20020220
-X-OriginatingIP: 192.1.3.22 (jim)
-MIME-Version: 1.0
-Content-Type: text/plain; charset=iso-8859-1
-Status: ORr
-
-Bruce,
-
-I stopped all work on this since people seemed confused about the
-tablespace/location words. I don't think enough of the "core" team likes
-this idea. Am I wrong here? Did I explain the patch well enough?
-
-Please let me know, I still am planning on doing it for internal use. I
-would prefer that it was a standard feature. If you think I should still
-pursue this, let me know what I need to do to get it off the ground.
-
-Thanks for your help
-Jim
-
-
-
-> Jim, do you have an updated patch that you would like applied for 7.3?
->
-> ---------------------------------------------------------------------------
->
-> Jim Buttafuoco wrote:
-> > Vadim,
-> >
-> > I guess I am still confused...
-> >
-> > In dbcommands.c resolve_alt_dbpath() takes the db oid as an argument.
-> > This number is used to "find" the directory where the data files live.
-> > All the patch does is put the indexes into a "db oid"_index directory
-> > instead of "db oid"
-> >
-> >
-> > This is for tables snprintf(ret, len, "%s/base/%u", prefix, dboid);
-> > This is for indexes snprintf(ret, len, "%s/base/%u_index", prefix,
-> > dboid);
-> >
-> > And in catalog.c
-> > tables: sprintf(path, "%s/base/%u/%u", DataDir, rnode.tblNode,
-> > rnode.relNode);
-> > indexes: sprintf(path, "%s/base/%u_index/%u", DataDir,
-> > rnode.tblNode,rnode.relNode);
-> >
-> > Can you explain how I would get the tblNode for an existing database
-> > index files if it doesn't have the same OID as the database entry in
-> > pg_databases.
-> >
-> > Jim
-> >
-> >
-> > > > Just wondering what is the status of this patch. It seems from
-> > comments
-> > > > that people like the idea. I have also looked in the archives for
-> > other
-> > > > people looking for this kind of feature and have found a lot of
-> > interest.
-> > > >
-> > > > If you think it is a good idea for 7.2, let me know what needs to be
-> > > > changed and I will work on it this weekend.
-> > >
-> > > Just change index' dir naming as was already discussed.
-> > >
-> > > Vadim
-> > >
-> > >
-> >
-> >
-> >
-> > ---------------------------(end of broadcast)---------------------------
-> > TIP 2: you can get off all lists at once with the unregister command
-> >
->
-> --
-> Bruce Momjian | http://candle.pha.pa.us
-> + If your life is a hard drive, | 830 Blythe Avenue
-> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
-
-
-
-
-Return-path:
-Received: from golem.fourpalms.org (www.fourpalms.org [64.3.68.148])
- by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g25E2oY04958
- for
; Tue, 5 Mar 2002 09:02:50 -0500 (EST)
-Received: from fourpalms.org (localhost.localdomain [127.0.0.1])
- by golem.fourpalms.org (Postfix) with ESMTP
- id CACDD1BC83; Tue, 5 Mar 2002 06:02:47 -0800 (PST)
-Date: Tue, 05 Mar 2002 06:02:47 -0800
-From: Thomas Lockhart
-Organization: Yes
-X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.8-34.1mdksmp i686)
-X-Accept-Language: en
-MIME-Version: 1.0
-To: Tom Lane
-Subject: Re: Storage Location Patch Proposal for V7.3
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-...
-> Forward compatibility to a future tablespace implementation.
-> If we do this, we'll be stuck with supporting this feature set,
-> not to mention this syntax; neither of which have garnered any
-> support from the assembled hackers.
-
-The feature set (in some incarnation) is exactly something we should
-have. "Tablespace" could mean almost anything, since (I recall that) we
-are not slavishly copying the Oracle features having a similar name. The
-syntax (or something similar) seems acceptable to me. I haven't looked
-at the implementation itself.
-
-So, I'll guess that the particular objection to this implementation is
-along the lines of wanting to be able to manage tablespaces/locations as
-a single entity? So that one could issue commands like (forgive the
-syntax) "move tablespace xxx to yyy;" and be able to yank the entire
-contents from one place to another in a single line?
-
-Jim's patches don't explicitly tie the pieces residing in a single
-location together. Is that the objection? In all other respects (and
-perhaps in all respects period) it seems to be a good starting point at
-least.
-
-I know that you have said that you want to look at "tablespaces" for
-7.3. If we get there with a feature set we all find acceptable, then
-great. If we don't, then Jim's subset of features would be great to
-have.
-
-Comments?
-
- - Thomas
-
-Received: from postgresql.org (postgresql.org [64.49.215.8])
- by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g271okY15943
- for
; Wed, 6 Mar 2002 20:50:46 -0500 (EST)
-Received: from postgresql.org (postgresql.org [64.49.215.8])
- by postgresql.org (Postfix) with SMTP
- id 220A3475B48; Wed, 6 Mar 2002 20:49:59 -0500 (EST)
-Received: from dual.buttafuoco.net (vsat-148-63-214-126.c004.g4.mrt.starband.net [148.63.214.126])
- by postgresql.org (Postfix) with ESMTP id 4D925475881
- for
; Wed, 6 Mar 2002 20:44:51 -0500 (EST)
-Received: from buttafuoco.net (dual [127.0.0.1])
- by dual.buttafuoco.net (8.11.2/8.11.2) with ESMTP id g271ihm25853
- for
; Wed, 6 Mar 2002 20:44:43 -0500
-From: "Jim Buttafuoco"
-Subject: [HACKERS] Storage Location / Tablespaces (try 3)
-Date: Wed, 6 Mar 2002 20:44:43 -0500
-X-Mailer: Open WebMail 1.62 20020220
-X-OriginatingIP: 192.1.3.22 (jim)
-MIME-Version: 1.0
-Content-Type: text/plain; charset=iso-8859-1
-Precedence: bulk
-Status: OR
-
-Me again, I have some more details on my storage location patch
-
-
-
-This patch would allow the system admin (DBA) to specify the location of
-databases, tables/indexes and temporary objects (temp tables and temp sort
-space) independent of the database/system default location. This patch would
-replace the current "LOCATION" code.
-
-Please let me know if you have any questions/comments. I would like to see
-this feature make 7.3. I believe it will take about 1 month of coding and
-testing after I get started.
-
-Thanks
-Jim
-
-==============================================================================
-Storage Location Patch (Try 3)
-
-
-(If people like TABLESPACE instead of LOCATION then s/LOCATION/TABLESPACE/g
-below)
-
-
-This patch would add the following NEW commands
-----------------------------------------------------
- CREATE LOCATION name PATH 'dbpath';
- DROP LOCATION name;
-
-where dbpath is any directory that the postgresql backend can write to.
-(I know this is how Oracle works, don't know about the other major db systems)
-
-The following NEW GLOBAL system table would be added.
------------------------------------------------------
-PG_LOCATION
-(
- LOC_NAME name,
- LOC_PATH text -- This should be able to take any path name.
-);
-(initdb would add (PGDATA,'/usr/local/pgsql/data'))
-
-The following system tables would need to be modified
------------------------------------------------------
-PG_DATABASE drop datpath
- add DATA_LOC_NAME name or DATA_LOC_OID OID
- add INDEX_LOC_NAME name or INDEX_LOC_OID OID
- add TEMP_LOC_NAME name or TEMP_LOC_OID OID
-PG_CLASS to add LOC_NAME name or LOC_OID OID
-
-DATA_LOC_* and INDEX_LOC_* would default to PGDATA if not specified.
-
-(I like *LOC_NAME better but I believe the rest of the systems tables use OID)
-
-
-The following command syntax would be modified
-------------------------------------------------------
-CREATE DATABASE WITH DATA_LOCATION=XXX INDEX_LOCATION=YYY TEMP_LOCATION=ZZZ
-CREATE TABLE aaa (...) WITH LOCATION=XXX;
-CREATE TABLE bbb (c1 text primary key location CCC) WITH LOCATION=XXX;
-CREATE TABLE ccc (c2 text unique location CCC) WITH LOCATION=XXX;
-CREATE INDEX XXX on SAMPLE (C2) WITH LOCATION BBB;
-
-
-
-Now for an example
-------------------------------------------------------
-First:
- postgresql is installed at /usr/local/pgsql
- userid postgres
- the postgres user also is the owner of /pg01 /pg02 /pg03
-
-the dba executes the following script
-CREATE LOCATION pg01 PATH '/pg01';
-CREATE LOCATION pg02 PATH '/pg02';
-CREATE LOCATION pg03 PATH '/pg03';
-CREATE LOCATION bigdata PATH '/bigdata';
-CREATE LOCATION bigidx PATH '/bigidx';
-\q
-
-PG_LOCATION now has
-pg01 | /pg01
-pg02 | /pg02
-pg03 | /pg03
-bigdata | /bigdata
-bigidx | /bigidx
-
-Now the following command is run
-CREATE DATABASE jim1 WITH DATA_LOCATION='pg01' INDEX_LOCATION='pg02'
-TEMP_LOCATION='pg03'
--- OID of 'jim1' tuple is 1786146
-
-on disk the directories look like this
-/pg01/1786146 <<-- Default DATA Location
-/pg02/1786146 <<-- Default INDEX Location
-/pg03/1786146 <<-- Default Temp Location
-
-All files from the above directories will have symbolic links to
-/usr/local/pgsql/data/base/1786146/
-
-
-
-Now the system will have 1 BIG table that will get its own disk for data and
-its own disk for index
-create table big (a text,b text ..., primary key (a,b) location 'bigidx');
-
-oid of big table is 1786150
-oid of big table primary key index is 1786151
-
-on disk directories look like this
-/bigdata/1786146/1786150
-/bigidx/1786146/1786151
-/usr/local/pgsql/data/base/1786146/1786150 symbolic link to
-/bigdata/1786146/1786150
-/usr/local/pgsql/data/base/1786146/1786151 symbolic link to
-/bigidx/1786146/1786151
-
-
-
-The symbolic links will enable the rest of the software to be location
-independent.
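
The symlink layout in the example above can be sketched as follows. This is
only an illustration of the idea, not code from the patch: the function name
place_relation and its parameters are hypothetical, the "base/<db oid>"
layout is taken from the example, and a POSIX filesystem with symlink
support is assumed.

```python
import os
import tempfile


def place_relation(pgdata, location_path, db_oid, rel_oid):
    """Create a relation file under a storage location and symlink it
    into PGDATA/base/<db_oid>/ so other code can keep using the usual path.

    Hypothetical helper: names and layout follow the example in the mail.
    """
    # Real file lives at <location>/<db oid>/<relation oid>.
    real_dir = os.path.join(location_path, str(db_oid))
    os.makedirs(real_dir, exist_ok=True)
    real_file = os.path.join(real_dir, str(rel_oid))
    open(real_file, "ab").close()

    # The default database directory only holds a symlink to it.
    base_dir = os.path.join(pgdata, "base", str(db_oid))
    os.makedirs(base_dir, exist_ok=True)
    link = os.path.join(base_dir, str(rel_oid))
    os.symlink(real_file, link)
    return link, real_file


if __name__ == "__main__":
    # Use a scratch directory instead of real /bigdata and /bigidx mounts.
    root = tempfile.mkdtemp()
    pgdata = os.path.join(root, "pgsql", "data")
    link, real = place_relation(pgdata, os.path.join(root, "bigidx"),
                                1786146, 1786151)
    print(link, "->", os.readlink(link))
```

Code that opens the path under PGDATA/base/<db oid>/ reaches the file on the
other disk without knowing which location holds it.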
-
-
-
-
-Received: from postgresql.org (postgresql.org [64.49.215.8])
- by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g27NP4Q10967
- for
; Thu, 7 Mar 2002 18:25:05 -0500 (EST)
-Received: from postgresql.org (postgresql.org [64.49.215.8])
- by postgresql.org (Postfix) with SMTP
- id 74CC94761DE; Thu, 7 Mar 2002 17:50:44 -0500 (EST)
-Received: from sss.pgh.pa.us (unknown [192.204.191.242])
- by postgresql.org (Postfix) with ESMTP id 712F0476101
- for
; Thu, 7 Mar 2002 17:47:04 -0500 (EST)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id g27MkaS15710;
- Thu, 7 Mar 2002 17:46:41 -0500 (EST)
-cc: "Zeugswetter Andreas SB SD" ,
-Subject: Re: [HACKERS] Storage Location / Tablespaces (try 3)
-Comments: In-reply-to "Jim Buttafuoco"
- message dated "Thu, 07 Mar 2002 16:05:19 -0500"
-Date: Thu, 07 Mar 2002 17:46:36 -0500
-From: Tom Lane
-Precedence: bulk
-Status: OR
-
-"Jim Buttafuoco" writes:
-> My first try passed the tablespace OID around but someone pointed out that the
-> WAL code doesn't know what the tablespace OID is or what its location is.
-
-The low-level file access code (including WAL references) names tables
-by two OIDs, which currently are database OID and relfilenode (the
-latter is NOT to be considered equivalent to table OID, even though it
-presently always is equal).
-
-I believe that the correct implementation approach is to revise things
-so that the low-level name of a table is tablespace OID + relfilenode;
-this physical table name would in concept be completely distinct from
-the logical table identification (database OID + table OID). The file
-reference path would become something like
-"$PGDATA/base/tablespaceoid/relfilenode", where tablespaceoid might
-reference a symlink to a directory instead of a plain directory.
-Tablespace management then consists of setting up those symlinks
-correctly, and there is essentially zero impact on the low-level access
-code.
-
-The hard part of this is that we are probably being sloppy in some
-places about the difference between physical and logical table
-identifications. Those places will need to be found and fixed.
-This needs to happen anyway, of course, since the point of introducing
-relfilenode was to allow table versioning, which we still want.
-
-Vadim suggested long ago that bufmgr, smgr, and below should have
-nothing to do with referencing files by relcache entries; they should
-only deal in physical file identifiers. That requires some tedious but
-(in principle) straightforward API changes.
-
-BTW, if tablespaces can be shared by databases then DROP DATABASE
-becomes rather tricky: how do you zap the correct files out of a shared
-tablespace, keeping in mind that you are not logged into the doomed
-database and can't look at its catalogs? The best idea I've seen for
-this so far is:
-
-1. Access path for tables is really
- $PGDATA/base/databaseoid/tablespaceoid/relfilenode.
-(BTW, we could save some work if we chdir'd into
-$PGDATA/base/databaseoid at backend start and then used only relative
-tablespaceoid/relfilenode paths. Right now we tend to use absolute
-paths because the bootstrap code doesn't do that chdir; which seems
-like a stupid solution...)
-
-2. A shared tablespace directory contains a subdirectory for each database
-that has files in the tablespace. Thus, the actual filesystem location
-of a table is something like
-
/databaseoid/relfilenode
-The symlink from a database's $PGDATA/base/databaseoid/ directory to
-the tablespace points at
/databaseoid. The first attempt to
-create a table in a tablespace from a particular database will create
-the hard subdirectory and set up the symlink; or perhaps that should be
-done by an explicit tablespace management operation to "connect" the
-database to the tablespace.
-
-3. To drop a database, we examine the symlinks in its
-$PGDATA/base/databaseoid/ and rm -rf each referenced tablespace
-subdirectory before rm -rf'ing $PGDATA/base/databaseoid.
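
Steps 1 to 3 above can be sketched like this. It is a rough model of the
proposal, not backend code: connect_tablespace and drop_database are
hypothetical names, OIDs are plain integers, and a POSIX filesystem with
symlinks is assumed.

```python
import os
import shutil


def connect_tablespace(pgdata, ts_location, db_oid, ts_oid):
    """Steps 1 and 2: create <ts_location>/<db_oid>/ and point the symlink
    PGDATA/base/<db_oid>/<ts_oid> at it."""
    target = os.path.join(ts_location, str(db_oid))
    os.makedirs(target, exist_ok=True)
    db_dir = os.path.join(pgdata, "base", str(db_oid))
    os.makedirs(db_dir, exist_ok=True)
    link = os.path.join(db_dir, str(ts_oid))
    if not os.path.islink(link):
        os.symlink(target, link)
    return link


def drop_database(pgdata, db_oid):
    """Step 3: remove every tablespace subdirectory referenced by a symlink,
    then remove the database directory itself. No catalog access needed."""
    db_dir = os.path.join(pgdata, "base", str(db_oid))
    for entry in os.listdir(db_dir):
        path = os.path.join(db_dir, entry)
        if os.path.islink(path):
            # Delete the per-database subdirectory inside the tablespace.
            shutil.rmtree(os.path.realpath(path), ignore_errors=True)
    # rmtree unlinks the (now dangling) symlinks rather than following them.
    shutil.rmtree(db_dir)
```

Because each database has its own subdirectory inside a shared tablespace,
dropping one database leaves the files of the others untouched.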
-
- regards, tom lane
-
-
-Received: from postgresql.org (postgresql.org [64.49.215.8])
- by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g2P7uGa20556
- for
; Mon, 25 Mar 2002 02:56:16 -0500 (EST)
-Received: from postgresql.org (postgresql.org [64.49.215.8])
- by postgresql.org (Postfix) with SMTP id D28B2475B61
- for
; Mon, 25 Mar 2002 02:56:17 -0500 (EST)
-Received: from sss.pgh.pa.us (unknown [192.204.191.242])
- by postgresql.org (Postfix) with ESMTP id EB3244758E9
- for
; Mon, 25 Mar 2002 02:55:54 -0500 (EST)
-Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
- by sss.pgh.pa.us (8.11.4/8.11.4) with ESMTP id g2P7toS17527;
- Mon, 25 Mar 2002 02:55:50 -0500 (EST)
-Subject: Re: [GENERAL] Large Object Location in 7.3
-Comments: In-reply-to Bruce Momjian
- message dated "Sun, 24 Mar 2002 14:32:16 -0500"
-Date: Mon, 25 Mar 2002 02:55:50 -0500
-From: Tom Lane
-Precedence: bulk
-Status: OR
-
-> Richard Emberson wrote:
->> I expect (actually hope) to have thousands and thousands of blob/clobs
->> in the db I am designing.
->> I would like such largeobjects to be stored in their own file system.
-
-> Sure, find the oid of pg_largeobject and symlink that to another file
-> system. You need to do that for the toast table and any indexes for the table
-> too.
-
-If Richard's envisioning more than 1GB of large objects, I don't think
-he's going to be very satisfied with manual symlinking.
-
-This does bring up an interesting point: the tablespace schemes we've
-discussed so far don't allow system catalogs to be moved out of the
-default tablespace for a database. That doesn't bother me for most
-of the system catalogs ... but pg_largeobject seems like it might be
-an exception.
-
- regards, tom lane
-
----------------------------(end of broadcast)---------------------------
-
-Received: from relay2.pgsql.com (host-64-117-225-159.altec1.com [64.117.225.159] (may be forged))
- by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id h4JN4Dv08477
- for
; Mon, 19 May 2003 19:04:14 -0400 (EDT)
-Received: from postgresql.org (unknown [64.117.224.193])
- by relay2.pgsql.com (Postfix) with ESMTP id 3EC59FA439
- for
; Mon, 19 May 2003 18:39:59 -0400 (EDT)
-Received: from localhost (unknown [64.117.224.193])
- by developer.postgresql.org (Postfix) with ESMTP id 7BFFA92617A
- for
; Mon, 19 May 2003 18:39:07 -0400 (EDT)
-Received: from developer.postgresql.org ([64.117.224.193])
- by localhost (developer.postgresql.org [64.117.224.193:10024]) (amavisd-new)
- with ESMTP id 47742-05 for
;
- Mon, 19 May 2003 18:39:01 -0400 (EDT)
-Received: from smxsat1.smxs.net (smxsat1.smxs.net [213.150.10.1])
- by developer.postgresql.org (Postfix) with ESMTP id F30679255EE
- for
; Mon, 19 May 2003 04:46:14 -0400 (EDT)
-Received: from m01x1.s-mxs.net [10.3.55.201]
- by smxsat1.smxs.net
- over TLS secured channel
- with XWall v3.26e ;
- Mon, 19 May 2003 10:46:11 +0200
-Received: from m0102.s-mxs.net [10.3.55.2]
- by m01x1.s-mxs.net
- with XWall v3.26 ;
- Mon, 19 May 2003 10:46:10 +0200
-Received: from m0114.s-mxs.net ([10.3.55.14]) by m0102.s-mxs.net with Microsoft SMTPSVC(5.0.2195.5329);
- Mon, 19 May 2003 10:46:04 +0200
-content-class: urn:content-classes:message
-MIME-Version: 1.0
-Content-Type: text/plain;
- charset="us-ascii"
-Subject: Re: [HACKERS] Feature suggestions (long)
-X-MimeOLE: Produced By Microsoft Exchange V6.0.6434.0
-Date: Mon, 19 May 2003 10:46:04 +0200
-Thread-Topic: [HACKERS] Feature suggestions (long)
-Thread-Index: AcMciTRmS8S5HY34Q62Cd5TpVM44pwBUcmeQ
-From: "Zeugswetter Andreas SB SD"
-To: "Martijn van Oosterhout" ,
-X-OriginalArrivalTime: 19 May 2003 08:46:04.0672 (UTC) FILETIME=[152F7800:01C31DE3]
-X-Virus-Scanned: by amavisd-new
-Precedence: bulk
-Content-Transfer-Encoding: 8bit
-X-MIME-Autoconverted: from quoted-printable to 8bit by candle.pha.pa.us id h4JN4Dv08477
-Status: OR
-
-> Partitions
-> ==========
-
-> Next stage would be teaching the planner. The conditions would be
-> pseudo-constraints on the partitions. Hence if the conditions and the
-> constraints form a non-intersecting set, you can skip that partition
-> altogether.
-
-Make that "normal check constraints", and make the planner consider
-constraints, and I think that by itself combined with the current
-featureset will be much more powerful than any of the "partitioning"
-features out there.
-(This is mainly needed to optimize selects on the big union all view)
-
-Imho if a dba starts to partition, he usually needs to be more involved
-than the average user, so I think he should be able to cope with
-complexity. What imho would help, is a tool that generates a suggested
-rule set, indexes and actions, which the dba can review and apply. I do
-not think new SQL syntax would really help, since that would somehow
-hide the great existing power of the rule system. A tool would teach the
-dba, and empower him to use it.
-
-And yes, creating several smaller tables and adding the appropriate
-rules usually makes the VLDB life much easier compared to growing single
-tables into the TB range.
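
The pruning idea discussed above (skip a member table of the union all view
when its check constraint cannot intersect the query condition) can be
sketched as below. This is a toy model under stated assumptions, not planner
code: each table is assumed to carry a constraint of the form
CHECK (key >= lo AND key < hi), and the names Partition and prune are made up
for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    name: str
    lo: int  # models CHECK (key >= lo AND key < hi)
    hi: int


def prune(partitions, q_lo, q_hi):
    """Keep only partitions whose CHECK range intersects the half-open
    query range [q_lo, q_hi); the rest need not be scanned at all."""
    return [p for p in partitions if p.lo < q_hi and q_lo < p.hi]


if __name__ == "__main__":
    parts = [
        Partition("sales_2001", 2001, 2002),
        Partition("sales_2002", 2002, 2003),
        Partition("sales_2003", 2003, 2004),
    ]
    # WHERE year = 2002 is the range [2002, 2003).
    print([p.name for p in prune(parts, 2002, 2003)])
```

A real planner would have to prove non-intersection for arbitrary
expressions; the range test here only covers the simplest case.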
-
-Andreas
-
----------------------------(end of broadcast)---------------------------
-TIP 5: Have you checked our extensive FAQ?
-
-http://www.postgresql.org/docs/faqs/FAQ.html
-
-Received: from relay3.pgsql.com (relay3.pgsql.com [64.117.224.149])
- by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id h4KF8Br13455
- for
; Tue, 20 May 2003 11:08:16 -0400 (EDT)
-Received: from postgresql.org (unknown [64.117.224.193])
- by relay3.pgsql.com (Postfix) with ESMTP id 6754111262F2
- for
; Tue, 20 May 2003 15:08:07 +0000 (GMT)
-Received: from localhost (unknown [64.117.224.193])
- by developer.postgresql.org (Postfix) with ESMTP id 5119A924FA2
- for
; Tue, 20 May 2003 11:02:38 -0400 (EDT)
-Received: from developer.postgresql.org ([64.117.224.193])
- by localhost (developer.postgresql.org [64.117.224.193:10024]) (amavisd-new)
- with ESMTP id 79611-01 for
;
- Tue, 20 May 2003 11:02:34 -0400 (EDT)
-Received: from flake.decibel.org (flake.decibel.org [66.143.173.58])
- by developer.postgresql.org (Postfix) with SMTP id C9F22924FA0
- for
; Tue, 20 May 2003 11:02:29 -0400 (EDT)
-Received: (qmail 20461 invoked by uid 1001); 20 May 2003 15:02:24 -0000
-Date: Tue, 20 May 2003 10:02:24 -0500
-From: "Jim C. Nasby"
-To: Martijn van Oosterhout
-cc: Zeugswetter Andreas SB SD ,
-Subject: Re: [HACKERS] Feature suggestions (long)
-MIME-Version: 1.0
-Content-Type: text/plain; charset=us-ascii
-Content-Disposition: inline
-User-Agent: Mutt/1.4.1i
-X-Operating-System: FreeBSD 4.8-RELEASE i386
-X-Distributed: Join the Effort! http://www.distributed.net
-X-Virus-Scanned: by amavisd-new
-Precedence: bulk
-Status: OR
-
-On Tue, May 20, 2003 at 12:40:00AM +1000, Martijn van Oosterhout wrote:
-> Anyway, the general trend seems to be against the idea so I may as well go
-> think of something else :)
-
-I'm disappointed to hear that. Having no way to effectively partition
-data is a real pain in pgsql, and your proposal would address that. Yes,
-you can build it yourself by creating the view and all the rules by
-hand, but that has a lot of drawbacks:
-
-It's completely PGSQL specific
-It leaves no possibility for performance improvements down the road
-It's a lot of code to write
-You have to manually maintain it all every time you need to add a new
-partition (in your example, at the start of every year).
-
-I don't know what the policies for patches are, but I'd hope that the
-core team would consider adding this functionality, especially since a
-first-round implementation can be done entirely with rules (or so it
-seems).
-
-I certainly understand that development time is a very limited resource,
-and I'm willing to work on this (though I'm not a C coder). Even if no
-one can commit to this right now, can't it be added to the todo list?
---
-Member: Triangle Fraternity, Sports Car Club of America
-Give your computer some brain candy! www.distributed.net Team #1828
-
-Windows: "Where do you want to go today?"
-Linux: "Where do you want to go tomorrow?"
-FreeBSD: "Are you guys coming, or what?"
-
-
-Received: from relay3.pgsql.com (relay3.pgsql.com [64.117.224.149])
- by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id h4KFU5r15234
- for
; Tue, 20 May 2003 11:30:06 -0400 (EDT)
-Received: from postgresql.org (unknown [64.117.224.193])
- by relay3.pgsql.com (Postfix) with ESMTP id 1A18211260D9
- for
; Tue, 20 May 2003 15:30:02 +0000 (GMT)
-Received: from localhost (unknown [64.117.224.193])
- by developer.postgresql.org (Postfix) with ESMTP id 80D32925003
- for
; Tue, 20 May 2003 11:24:16 -0400 (EDT)
-Received: from developer.postgresql.org ([64.117.224.193])
- by localhost (developer.postgresql.org [64.117.224.193:10024]) (amavisd-new)
- with ESMTP id 85756-06 for
;
- Tue, 20 May 2003 11:24:13 -0400 (EDT)
-Received: from svana.org (svana.org [203.20.62.76])
- by developer.postgresql.org (Postfix) with ESMTP id 532D6923324
- for
; Tue, 20 May 2003 11:24:11 -0400 (EDT)
-Received: from kleptog by svana.org with local (Exim 3.35 #1 (Debian))
- id 19I8xs-0001rc-00; Wed, 21 May 2003 01:23:44 +1000
-Date: Wed, 21 May 2003 01:23:44 +1000
-From: Martijn van Oosterhout
-To: "Jim C. Nasby"
-cc: Zeugswetter Andreas SB SD ,
-Subject: Re: [HACKERS] Feature suggestions (long)
-Reply-To: Martijn van Oosterhout
-MIME-Version: 1.0
-Content-Type: multipart/signed; micalg=pgp-sha1;
- protocol="application/pgp-signature"; boundary="ZYOWEO2dMm2Af3e3"
-Content-Disposition: inline
-User-Agent: Mutt/1.3.28i
-X-PGP-Key-ID: Length=1024; ID=0x0DC67BE6
-X-PGP-Key-Fingerprint: 295F A899 A81A 156D B522 48A7 6394 F08A 0DC6 7BE6
-X-PGP-Key-URL:
-X-Virus-Scanned: by amavisd-new
-Precedence: bulk
-Status: OR
-
---ZYOWEO2dMm2Af3e3
-Content-Type: text/plain; charset=us-ascii
-Content-Disposition: inline
-Content-Transfer-Encoding: quoted-printable
-
-On Tue, May 20, 2003 at 10:02:24AM -0500, Jim C. Nasby wrote:
-> On Tue, May 20, 2003 at 12:40:00AM +1000, Martijn van Oosterhout wrote:
-> > Anyway, the general trend seems to be against the idea so I may as well go
-> > think of something else :)
->
-> I'm disappointed to hear that. Having no way to effectively partition
-> data is a real pain in pgsql, and your proposal would address that. Yes,
-> you can build it yourself by creating the view and all the rules by
-> hand, but that has a lot of drawbacks:
-
-I agree, there is a lot of potential here. And I don't believe it would be
-too much work as most of the infrastructure is already there. At this stage
-I'm just wondering if it will go on the TODO list. I propose that the
-following items be added:
-
- * Improve the planner to take CHECK constraints into account to prune the plan.
- * Allow a single index to index multiple tables (also for inherited PRIMARY KEYS)
- * Allow partitioning of table into multiple subtables
-
-The first two items would be useful in their own right. With them the final
-one would be straightforward. I'd be prepared to put some effort into this
-if there is some indication it would be accepted.
-
-> I don't know what the policies for patches are, but I'd hope that the
-> core team would consider adding this functionality, especially since a
-> first-round implementation can be done entirely with rules (or so it
-> seems).
-
-Well, I think the policy is 'if you write the code you have a better chance
-to have it accepted' :) So, if it's likely to be accepted then we only need
-to find someone to code it. Given the other priorities currently I think
-waiting for the core team to write it would be futile (unless you can
-convince someone like IBM to give the core team money to write it).
-
-Right now I'd be happy if the anonymous CVS server would talk to me :)
-
-By the way, has anyone given thought to user-defined storage managers? Apart
-from allowing backward compatible table access, you could implement a simple
-version of partitioning that doesn't take advantage of planner tricks.
-
-Have a nice day,
--- 
-Martijn van Oosterhout http://svana.org/kleptog/
-> "the West won the world not by the superiority of its ideas or values or
-> religion but rather by its superiority in applying organized violence.
-> Westerners often forget this fact, non-Westerners never do."
-> - Samuel P. Huntington
-
---ZYOWEO2dMm2Af3e3
-
---ZYOWEO2dMm2Af3e3--
-
-Subject: Re: [HACKERS] [GENERAL] Physical Database Configuration
-Date: Thu, 26 Jun 2003 12:50:36 -0400
-From: Tom Lane
-Precedence: bulk
-Status: OR
-
-> Being able to zap a database with one or more 'rm -rf' commands assumes
-> that there will be files from just ONE database permitted in any given
-> tablespace, and ONLY files from that database.
-
-I said no such thing. Look at the structure again:
-
-$PGDATA/base/dboid/...stuff...
-
-sometablespace/dboid/...stuff...
-
-othertablespace/dboid/...stuff...
-
-DROPDB needs to nuke the dboid/ subdirectory under each tablespace's root
-directory. The other design simplifies DROPDB at the cost of increased
-complexity for every other tablespace management operation, since you'd
-need to cope with a symlink in each database for each tablespace.
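
As a rough illustration of what DROPDB has to do under this layout, here is
a minimal Python sketch. It assumes each tablespace is simply a root
directory holding one subdirectory per database OID; the function names,
the directory names, and the OIDs 16384 and 16385 are made up for the
example and are not taken from any PostgreSQL source.

```python
import os
import shutil
import tempfile
from typing import Iterable, List


def database_dirs(tablespace_roots: Iterable[str], dboid: int) -> List[str]:
    """One <root>/<dboid> directory per tablespace root."""
    return [os.path.join(root, str(dboid)) for root in tablespace_roots]


def drop_database_files(tablespace_roots: Iterable[str], dboid: int) -> List[str]:
    """Remove the database's directory from every tablespace that has one."""
    removed = []
    for path in database_dirs(tablespace_roots, dboid):
        if os.path.isdir(path):  # a database need not use every tablespace
            shutil.rmtree(path)
            removed.append(path)
    return removed


# Build a throwaway layout with three tablespaces and two databases.
with tempfile.TemporaryDirectory() as scratch:
    roots = [os.path.join(scratch, name)
             for name in ("base", "sometablespace", "othertablespace")]
    for root in roots:
        for dboid in (16384, 16385):
            os.makedirs(os.path.join(root, str(dboid)))

    removed = drop_database_files(roots, 16384)
    print(len(removed))                                      # 3
    print(os.path.isdir(os.path.join(roots[0], "16385")))    # True
```

Files belonging to other databases in the same tablespace are untouched,
so nothing here requires a tablespace to hold only one database.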
-
-Also, this scheme is at least theoretically amenable to a symlink-free
-implementation, though I personally don't give a darn whether
-tablespaces are supported on Windows and thus wouldn't expend the extra
-effort needed to keep track of full paths. I'd want
-$PGDATA/tablespaces/tboid to be a symlink to the root of the tablespace
-with a given OID, and then the actual pathname used to access a table in
-tablespace tboid, database dboid, table filenode rfoid would look like
- $PGDATA/tablespaces/tboid/dboid/rfoid
-But a Windoze version could in theory keep track of tablespace locations
-directly, and replace the first part of this path with the actual
-tablespace location. If we put tablespaces under directories then the
-facility has zero functionality without symlinks, because you couldn't
-actually do anything to segregate stuff within a database across
-different devices.
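
The two ways of resolving a table's file can be sketched in a few lines of
Python. This is a hypothetical helper, not PostgreSQL code: relation_path(),
the locations mapping, and every OID and directory name below are invented
for illustration. Without a mapping, the path goes through
$PGDATA/tablespaces/tboid (which would be a symlink); with a mapping, the
tracked tablespace location replaces that first part.

```python
import posixpath
from typing import Dict, Optional


def relation_path(pgdata: str, tboid: int, dboid: int, rfoid: int,
                  locations: Optional[Dict[int, str]] = None) -> str:
    """Build the path to a table file for tablespace/database/filenode OIDs."""
    if locations is None:
        # Symlink variant: $PGDATA/tablespaces/tboid points at the real root.
        root = posixpath.join(pgdata, "tablespaces", str(tboid))
    else:
        # Symlink-free variant: look up the tablespace's real location.
        root = locations[tboid]
    return posixpath.join(root, str(dboid), str(rfoid))


pgdata = "/usr/local/pgsql/data"

print(relation_path(pgdata, 5001, 16384, 17000))
# /usr/local/pgsql/data/tablespaces/5001/16384/17000

print(relation_path(pgdata, 5001, 16384, 17000, {5001: "/mnt/disk2/pg"}))
# /mnt/disk2/pg/16384/17000
```

Either way the dboid/rfoid tail is identical, so only the lookup of the
tablespace root differs between the two variants.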
-
-BTW, we'd probably remove $PGDATA/base in favor of $PGDATA/tablespaces/N
-for some fixed-in-advance N that is the system tablespace, and we'd
-require all system catalogs to live in this tablespace --- certainly at
-least pg_class and its indexes. Otherwise you have circularity problems
-in finding the catalogs ...
-
- regards, tom lane
-
-