Remove subquery.

author Bruce Momjian

Mon, 29 Jan 2001 17:52:47 +0000 (17:52 +0000)

committer Bruce Momjian

Mon, 29 Jan 2001 17:52:47 +0000 (17:52 +0000)
diff --git a/doc/TODO b/doc/TODO
index 2fc015adda90aa48c589c547ec9d4df897571b4c..443b2f28f79ef2b5880478262a5804b52c815d3e 100644
--- a/doc/TODO
+++ b/doc/TODO
@@ -292,7 +292,6 @@ MISC
  * -Make oid use oidin/oidout not int4in/int4out in pg_type.h (Tom)
  * Improve Subplan list handling
  * Allow Subplans to use efficient joins(hash, merge) with upper variable
-  [subquery]
  * -use fmgr_info()/fmgr_faddr() instead of fmgr() calls in high-traffic
    places, like GROUP BY, UNIQUE, index processing, etc.
  * improve dynamic memory allocation by introducing tuple-context memory
diff --git a/doc/TODO.detail/subquery b/doc/TODO.detail/subquery
deleted file mode 100644
index cdc55c8..0000000
--- a/doc/TODO.detail/subquery
+++ /dev/null
@@ -1,9706 +0,0 @@
-From [email protected] Fri Aug  6 00:02:02 1999
-Received: from sunpine.krs.ru (SunPine.krs.ru [195.161.16.37])
-   by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA22890
-   for ; Fri, 6 Aug 1999 00:02:00 -0400 (EDT)
-Received: from krs.ru (dune.krs.ru [195.161.16.38])
-   by sunpine.krs.ru (8.8.8/8.8.8) with ESMTP id MAA23302;
-   Fri, 6 Aug 1999 12:01:59 +0800 (KRSS)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 06 Aug 1999 12:01:57 +0800
-From: Vadim Mikheev 
-Organization: OJSC Rostelecom (Krasnoyarsk)
-X-Mailer: Mozilla 4.5 [en] (X11; I; FreeBSD 3.0-RELEASE i386)
-X-Accept-Language: ru, en
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: Tom Lane , [email protected]
-Subject: Re: [HACKERS] Idea for speeding up uncorrelated subqueries
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: RO
-
-Bruce Momjian wrote:
-> 
-> Isn't it something that takes only a few hours to implement.  We can't
-> keep telling people to us EXISTS, especially because most SQL people
-> think correlated queries are slower that non-correlated ones.  Can we
-> just on-the-fly rewrite the query to use exists?
-
-This seems easy to implement. We could look does subquery have
-aggregates or not before calling union_planner() in
-subselect.c:_make_subplan() and rewrite it (change 
-slink->subLinkType from IN to EXISTS and add quals).
-
-Without caching implemented IN-->EXISTS rewriting always
-has sence.
-
-After implementation of caching we probably should call union_planner()
-for both original/modified subqueries and compare costs/sizes
-of EXISTS/IN_with_caching plans and maybe even make
-decision what plan to use after parent query is planned
-and we know for how many parent rows subplan will be executed.
-
-Vadim
-
-From [email protected] Fri Aug  6 00:15:23 1999
-Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2])
-   by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA23058
-   for ; Fri, 6 Aug 1999 00:15:22 -0400 (EDT)
-Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
-   by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id AAA06786;
-   Fri, 6 Aug 1999 00:14:50 -0400 (EDT)
-To: Bruce Momjian 
-cc: Vadim Mikheev , [email protected]
-Subject: Re: [HACKERS] Idea for speeding up uncorrelated subqueries 
-In-reply-to: Your message of Thu, 5 Aug 1999 23:31:01 -0400 (EDT) 
-             <[email protected]> 
-Date: Fri, 06 Aug 1999 00:14:50 -0400
-Message-ID: <[email protected]>
-From: Tom Lane 
-Status: RO
-
-Bruce Momjian  writes:
-> Isn't it something that takes only a few hours to implement.  We can't
-> keep telling people to us EXISTS, especially because most SQL people
-> think correlated queries are slower that non-correlated ones.  Can we
-> just on-the-fly rewrite the query to use exists?
-
-I was just about to suggest exactly that.  The "IN (subselect)"
-notation seems to be a lot more intuitive --- at least, people
-keep coming up with it --- so why not rewrite it to the EXISTS
-form, if we can handle that more efficiently?
-
-           regards, tom lane
-
-From [email protected] Thu Dec  5 10:30:53 1996
-Received: from abs.net ([email protected] [207.114.0.130]) by candle.pha.pa.us (8.8.3/8.7.3) with ESMTP id KAA06591 for ; Thu, 5 Dec 1996 10:30:43 -0500 (EST)
-Received: from aixssd.UUCP (nobody@localhost) by abs.net (8.8.3/8.7.3) with UUCP id KAA01387 for [email protected]; Thu, 5 Dec 1996 10:13:56 -0500 (EST)
-Received: by aixssd (AIX 3.2/UCB 5.64/4.03)
-          id AA36963; Thu, 5 Dec 1996 10:10:24 -0500
-Received: by ceodev (AIX 4.1/UCB 5.64/4.03)
-          id AA34942; Thu, 5 Dec 1996 10:07:56 -0500
-Date: Thu, 5 Dec 1996 10:07:56 -0500
-From: [email protected] (Darren King)
-Message-Id: <9612051507.AA34942@ceodev>
-To: [email protected]
-Subject: Subselect info.
-Mime-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Content-Md5: jaWdPH2KYtdr7ESzqcOp5g==
-Status: OR
-
-> Any of them deal with implementing subselects?
-
-There's a white paper at the www.sybase.com that might
-help a little.  It's just a copy of a presentation
-given by the optimizer guru there.  Nothing code-wise,
-but he gives a few ways of flattening them with temp
-tables, etc...
-
-Darren 
-
-From [email protected] Thu Aug 21 23:42:50 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id XAA04109
-   for ; Thu, 21 Aug 1997 23:42:43 -0400 (EDT)
-Received: from www.krasnet.ru (localhost [127.0.0.1]) by www.krasnet.ru (8.7.5/8.7.3) with SMTP id MAA04399; Fri, 22 Aug 1997 12:04:31 +0800 (KRD)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 22 Aug 1997 12:04:31 +0800
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.1.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-Subject: Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> Considering the complexity of the primary/secondary changes you are
-> making, I believe subselects will be easier than that.
-
-I don't do changes for P/F keys - just thinking...
-Yes, I think that impl of referential integrity is
-more complex work.
-
-As for subselects:
-
-in plannodes.h
-
-typedef struct Plan {
-...
-    struct Plan         *lefttree;
-    struct Plan         *righttree;
-} Plan;
-
-/* ----------------
- *  these are are defined to avoid confusion problems with "left"
-                                   ^^^^^^^^^^^^^^^^^^
- *  and "right" and "inner" and "outer".  The convention is that   
- *  the "left" plan is the "outer" plan and the "right" plan is
- *  the inner plan, but these make the code more readable.
- * ----------------
- */
-#define innerPlan(node)         (((Plan *)(node))->righttree)
-#define outerPlan(node)         (((Plan *)(node))->lefttree)
-
-First thought is avoid any confusions by re-defining
-
-#define rightPlan(node)         (((Plan *)(node))->righttree)
-#define leftPlan(node)          (((Plan *)(node))->lefttree)
-
-and change all occurrences of 'outer' & 'inner' in code
-to 'left' & 'inner' ones:
-
-this will allow to use 'outer' & 'inner' things for subselects
-latter, without confusion. My hope is that we may change Executor
-very easy by adding outer/inner plans/TupleSlots to
-EState, CommonState, JoinState, etc and by doing node
-processing in right order.
-
-Subselects are mostly Planner problem.
-
-Unfortunately, I havn't time at the moment: CHECK/DEFAULT...
-
-Vadim
-
-From [email protected] Fri Aug 22 00:00:59 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id AAA04354
-   for ; Fri, 22 Aug 1997 00:00:51 -0400 (EDT)
-Received: from www.krasnet.ru (localhost [127.0.0.1]) by www.krasnet.ru (8.7.5/8.7.3) with SMTP id MAA04425; Fri, 22 Aug 1997 12:22:37 +0800 (KRD)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 22 Aug 1997 12:22:37 +0800
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.1.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-Subject: Re: subselects
-References: <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Vadim B. Mikheev wrote:
-> 
-> this will allow to use 'outer' & 'inner' things for subselects
-> latter, without confusion. My hope is that we may change Executor
-
-Or may be use 'high' & 'low' for subselecs (to avoid confusion
-with outter hoins).
-
-> very easy by adding outer/inner plans/TupleSlots to
-> EState, CommonState, JoinState, etc and by doing node
-> processing in right order.
-             ^^^^^^^^^^^^^^
-Rule is easy:
-1. Uncorrelated subselect - do 'low' plan node first
-2. Correlated             - do left/right first
-
-- just some flag in structures.
-
-Vadim
-
-From [email protected] Thu Oct 30 17:02:30 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id RAA09682
-   for ; Thu, 30 Oct 1997 17:02:28 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id QAA20688; Thu, 30 Oct 1997 16:58:40 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 30 Oct 1997 16:58:24 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id QAA20615 for pgsql-hackers-outgoing; Thu, 30 Oct 1997 16:58:17 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id QAA20495 for ; Thu, 30 Oct 1997 16:57:54 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id QAA07726
-   for [email protected]; Thu, 30 Oct 1997 16:50:29 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] subselects
-To: [email protected] (PostgreSQL-development)
-Date: Thu, 30 Oct 1997 16:50:29 -0500 (EST)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-The only thing I have to add to what I had written earlier is that I
-think it is best to have these subqueries executed as early in query
-execution as possible.
-
-Every piece of the backend: parser, optimizer, executor, is designed to
-work on a single query.  The earlier we can split up the queries, the
-better those pieces will work at doing their job.  You want to be able
-to use the parser and optimizer on each part of the query separately, if
-you can.
-
-
-Forwarded message:
-> I have done some thinking about subselects.  There are basically two
-> issues:
- > 
->  Does the query return one row or several rows?  This can be
->  determined by seeing if the user uses equals on 'IN' to join the
->  subquery. 
-> 
->  Is the query correlated, meaning "Does the subquery reference
->  values from the outer query?"
-> 
-> (We already have the third type of subquery, the INSERT...SELECT query.)
-> 
-> So we have these four combinations:
-> 
->  1) one row, no correlation
->  2) multiple rows, no correlation
->  3) one row, correlated
->  4) multiple rows, correlated
-> 
-> 
-> With #1, we can execute the subquery, get the value, replace the
-> subquery with the constant returned from the subquery, and execute the
-> outer query.
-> 
-> With #2, we can execute the subquery and put the result into a temporary
-> table.  We then rewrite the outer query to access the temporary table
-> and replace the subquery with the column name from the temporary table. 
-> We probabally put an index on the temp. table, which has only one
-> column, because a subquery can only return one column.  We remove the
-> temp. table after query execution.
-> 
-> With #3 and #4, we potentially need to execute the subquery for every
-> row returned by the outer query.  Performance would be horrible for
-> anything but the smallest query.  Another way to handle this is to
-> execute the subquery WITHOUT using any of the outer-query columns to
-> restrict the WHERE clause, and add those columns used to join the outer
-> variables into the target list of the subquery.  So for query:
-> 
->  select t1.name
->  from tab t1
->  where t1.age = (select max(t2.age)
->              from tab2
->              where tab2.name = t1.name)
-> 
-> Execute the subquery and put it in a temporary table:
-> 
->  select t2.name, max(t2.age)
->  into table temp999
->  from tab2
->  where tab2.name = t1.name
-> 
->  create index i_temp999 on temp999 (name)
-> 
-> Then re-write the outer query:
-> 
->  select t1.name
->  from tab t1, temp999
->  where t1.age = temp999.age and
->        t1.name = temp999.name
-> 
-> The only problem here is that the subselect is running for all entries
-> in tab2, even if the outer query is only going to need a few rows. 
-> Determining whether to execute the subquery each time, or create a temp.
-> table is often difficult to determine.  Even some non-correlated
-> subqueries are better to execute for each row rather the pre-execute the
-> entire subquery, expecially if the outer query returns few rows.
-> 
-> One requirement to handle these issues is better column statistics,
-> which I am working on.
-> 
-
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Fri Oct 31 22:30:58 1997
-Received: from renoir.op.net ([email protected] [206.84.208.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id WAA15643
-   for ; Fri, 31 Oct 1997 22:30:56 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id WAA24379 for ; Fri, 31 Oct 1997 22:06:08 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id WAA15503; Fri, 31 Oct 1997 22:03:40 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 31 Oct 1997 22:01:38 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id WAA14136 for pgsql-hackers-outgoing; Fri, 31 Oct 1997 22:01:29 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id WAA13866 for ; Fri, 31 Oct 1997 22:00:53 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id VAA14566;
-   Fri, 31 Oct 1997 21:37:06 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] subselects
-To: [email protected] (Bruce Momjian)
-Date: Fri, 31 Oct 1997 21:37:06 +1900 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Bruce Momjian" at Oct 30, 97 04:50:29 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-One more issue I thought of.  You can have multiple subselects in a
-single query, and subselects can have their own subselects.
-
-This makes it particularly important that we define a system that always
-is able to process the subselect BEFORE the upper select.  This will
-allow use to handle all these cases without limitations.
-
-> 
-> The only thing I have to add to what I had written earlier is that I
-> think it is best to have these subqueries executed as early in query
-> execution as possible.
-> 
-> Every piece of the backend: parser, optimizer, executor, is designed to
-> work on a single query.  The earlier we can split up the queries, the
-> better those pieces will work at doing their job.  You want to be able
-> to use the parser and optimizer on each part of the query separately, if
-> you can.
-> 
-
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Sun Nov  2 10:33:33 1997
-Received: from sid.trust.ee (sid.trust.ee [194.204.23.180])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id KAA27619
-   for ; Sun, 2 Nov 1997 10:32:04 -0500 (EST)
-Received: from sid.trust.ee (wink.trust.ee [194.204.23.184])
-   by sid.trust.ee (8.8.5/8.8.5) with ESMTP id RAA02233;
-   Sun, 2 Nov 1997 17:30:11 +0200
-Message-ID: <[email protected]>
-Date: Sun, 02 Nov 1997 17:27:57 +0200
-From: Hannu Krosing 
-X-Mailer: Mozilla 4.02 [en] (Win95; I)
-MIME-Version: 1.0
-To: [email protected]
-CC: [email protected]
-Subject: Re: [HACKERS] subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-> Date: Fri, 31 Oct 1997 21:37:06 +1900 (EST)
-> From: Bruce Momjian 
-> Subject: Re: [HACKERS] subselects
->
-> One more issue I thought of.  You can have multiple subselects in a
-> single query, and subselects can have their own subselects.
->
-> This makes it particularly important that we define a system that always
-> is able to process the subselect BEFORE the upper select.  This will
-> allow use to handle all these cases without limitations.
-
-This would severely limit what subselects can be used for as you can't useany of the fields in the upper select in a
-search criteria for the subselect,
-for example you can't do
-
-update parts p1
-set parts.current_id = (
-    select new_id
-    from parts p2
-    where p1.old_id = p2.new_id);or
-
-select id, price, (select sum(price) from parts p2 where p1.id=p2.id) as totalprice
-from parts p1;
-
-there may be of course ways to rewrite these queries (which the optimiser should do
-if it can) but IMHO, these kinds of subselects should still be allowed
-
-> > The only thing I have to add to what I had written earlier is that I
-> > think it is best to have these subqueries executed as early in query
-> > execution as possible.
-> >
-> > Every piece of the backend: parser, optimizer, executor, is designed to
-> > work on a single query.  The earlier we can split up the queries, the
-> > better those pieces will work at doing their job.  You want to be able
-> > to use the parser and optimizer on each part of the query separately, if
-> > you can.
-> >
->
-
-Hannu
-
-
-From [email protected] Sun Nov  2 21:30:59 1997
-Received: from renoir.op.net ([email protected] [206.84.208.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id VAA14831
-   for ; Sun, 2 Nov 1997 21:30:57 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id VAA19683 for ; Sun, 2 Nov 1997 21:20:13 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by www.krasnet.ru (8.8.7/8.7.3) with SMTP id JAA17259; Mon, 3 Nov 1997 09:22:38 +0700 (KRS)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Mon, 03 Nov 1997 09:22:38 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: [HACKERS] subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > > One more issue I thought of.  You can have multiple subselects in a
-> > > single query, and subselects can have their own subselects.
-> > >
-> > > This makes it particularly important that we define a system that always
-> > > is able to process the subselect BEFORE the upper select.  This will
-> > > allow use to handle all these cases without limitations.
-> >
-> > This would severely limit what subselects can be used for as you can't useany of the fields in the upper select in a
-> > search criteria for the subselect,
-> > for example you can't do
-> >
-> > update parts p1
-> > set parts.current_id = (
-> >     select new_id
-> >     from parts p2
-> >     where p1.old_id = p2.new_id);or
-> >
-> > select id, price, (select sum(price) from parts p2 where p1.id=p2.id) as totalprice
-> > from parts p1;
-> >
-> > there may be of course ways to rewrite these queries (which the optimiser should do
-> > if it can) but IMHO, these kinds of subselects should still be allowed
-> 
-> I hadn't even gotten to this point yet, but it is a good thing to keep
-> in mind.
-> 
-> In these cases, as in correlated subqueries in the where clause, we will
-> create a temporary table, and add the proper join fields and tables to
-> the clauses.  Our version of UPDATE accepts a FROM section, and we will
-> certainly use this for this purpose.
-
-We can't replace subselect with join if there is aggregate
-in subselect.
-
-Actually, I don't see any problems if we going to process subselect
-like sql-funcs: non-correlated subselects can be emulated by
-funcs without args, for correlated subselects parser (analyze.c)
-has to change all upper query references to $1, $2,...
-
-Vadim
-
-From [email protected] Mon Nov  3 06:07:12 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id GAA27433
-   for ; Mon, 3 Nov 1997 06:07:03 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by www.krasnet.ru (8.8.7/8.7.3) with SMTP id SAA18519; Mon, 3 Nov 1997 18:09:44 +0700 (KRS)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Mon, 03 Nov 1997 18:09:43 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected]
-Subject: Re: [HACKERS] subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> >
-> > > In these cases, as in correlated subqueries in the where clause, we will
-> > > create a temporary table, and add the proper join fields and tables to
-> > > the clauses.  Our version of UPDATE accepts a FROM section, and we will
-> > > certainly use this for this purpose.
-> >
-> > We can't replace subselect with join if there is aggregate
-> > in subselect.
-> 
-> I got lost here.  Why can't we handle aggregates?
-
-Sorry, I missed using of temp tables. Sybase uses joins (without
-temp tables) for non-correlated subqueries:
-
-    A noncorrelated subquery can be evaluated as if it were an independent query.
-    Conceptually, the results of the subquery are substituted in the main statement, or
-    outer query. This is not how SQL Server actually processes statements with
-    subqueries. Noncorrelated subqueries can be alternatively stated as joins and
-    are processed as joins by SQL Server. 
-
-but this is not possible if there are aggregates in subquery.
-
-> 
-> My idea was this.  This is a non-correlated subquery.
-...
-No problems with it...
-
-> 
-> Here is a correlated example:
-> 
->         select *
->         from table_a
->         where table_a.col_a in (select table_b.col_b
->                         from table_b
->                         where table_b.col_b = table_a.col_c)
-> 
-> rewrite as:
-> 
->         select distinct table_b.col_b, table_a.col_c -- the distinct is needed
->         into table_sub
->         from table_a, table_b
-
-First, could we add 'where table_b.col_b = table_a.col_c' here ?
-Just to avoid Cartesian results ? I hope we can.
-
-Note that for query
-
-        select *
-        from table_a
-        where table_a.col_a in (select table_b.col_b * table_a.col_c
-                        from table_b)
-
-it's better to do
-
-   select distinct table_a.col_a
-   into table table_sub
-   from table_b, table_a
-        where table_a.col_a = table_b.col_b * table_a.col_c
-
-once again - to avoid Cartesians.
-
-But what could we do for
-
-        select *
-        from table_a
-        where table_a.col_a = (select max(table_b.col_b * table_a.col_c)
-                        from table_b)
-???
-   select max(table_b.col_b * table_a.col_c), table_a.col_a
-   into table table_sub
-   from table_b, table_a
-        group by table_a.col_a
-
-first tries to sort sizeof(table_a) * sizeof(table_b) tuples...
-For tables big and small with 100 000 and 1000 tuples 
-
-select max(x*y), x from big, small group by x
-
-"ate" all free 140M in my file system after 20 minutes (just for
-sorting - nothing more) and was killed...
-
-select x from big where x = cor(x);
-(cor(int4) is 'select max($1*y) from small') takes 20 minutes -
-this is bad too.
-
-> >
-> > Actually, I don't see any problems if we going to process subselect
-> > like sql-funcs: non-correlated subselects can be emulated by
-> > funcs without args, for correlated subselects parser (analyze.c)
-> > has to change all upper query references to $1, $2,...
-> 
-> Yes, logically, they are SQL functions, but aren't we going to see
-> terrible performance in such circumstances.  My experience is that when
-  ^^^^^^^^^^^^^^^^^^^^
-You're right.
-
-> people are given subselects, they start to do huge jobs with them.
-> 
-> In fact, the final solution may be to have both methods available, and
-> switch between them depending on the size of the query sets.  Each
-> method has its advantages.  The function example lets the outside query
-> be executed, and only calls the subquery when needed.
-> 
-> For large tables where the subselect is small and is the entire WHERE
-> restriction, the SQL function gets call much too often.  A simple join
-> of the subquery result and the large table would be much better.  This
-> method also allows for sort/merge join of the subquery results, and
-> index use.
-
-...keep thinking...
-
-Vadim
-
-From [email protected] Mon Nov  3 11:01:01 1997
-Received: from renoir.op.net ([email protected] [206.84.208.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03633
-   for ; Mon, 3 Nov 1997 11:00:59 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id KAA12174 for ; Mon, 3 Nov 1997 10:49:42 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id KAA26203; Mon, 3 Nov 1997 10:33:32 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 03 Nov 1997 10:31:43 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id KAA25514 for pgsql-hackers-outgoing; Mon, 3 Nov 1997 10:31:36 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id KAA25449 for ; Mon, 3 Nov 1997 10:31:23 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id KAA02262;
-   Mon, 3 Nov 1997 10:25:34 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Mon, 3 Nov 1997 10:25:34 -0500 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Nov 3, 97 06:09:43 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> Sorry, I missed using of temp tables. Sybase uses joins (without
-> temp tables) for non-correlated subqueries:
-> 
->     A noncorrelated subquery can be evaluated as if it were an independent query.
->     Conceptually, the results of the subquery are substituted in the main statement, or
->     outer query. This is not how SQL Server actually processes statements with
->     subqueries. Noncorrelated subqueries can be alternatively stated as joins and
->     are processed as joins by SQL Server. 
-> 
-> but this is not possible if there are aggregates in subquery.
-> 
-> > 
-> > My idea was this.  This is a non-correlated subquery.
-> ...
-> No problems with it...
-> 
-> > 
-> > Here is a correlated example:
-> > 
-> >         select *
-> >         from table_a
-> >         where table_a.col_a in (select table_b.col_b
-> >                         from table_b
-> >                         where table_b.col_b = table_a.col_c)
-> > 
-> > rewrite as:
-> > 
-> >         select distinct table_b.col_b, table_a.col_c -- the distinct is needed
-> >         into table_sub
-> >         from table_a, table_b
-> 
-> First, could we add 'where table_b.col_b = table_a.col_c' here ?
-> Just to avoid Cartesian results ? I hope we can.
-
-Yes, of course.  I forgot that line here.  We can also be fancy and move
-some of the outer where restrictions on table_a into the subquery.
-
-I think the classic subquery for this would be if someone wanted all
-customer names that had invoices in the past month:
-
-select custname
-from customer
-where custid in (select order.custid
-        from order
-        where order.date >= "09/01/97" and
-              order.date <= "09/30/97"
-
-In this case, the subquery can use an index on 'date' to quickly
-evaluate the query, and the resulting temp table can quickly be joined
-to the customer table.  If we used SQL functions, every customer would
-have an order query evaluated for it, and there may be no multi-column
-index on customer and date, or even if there is, this could be many
-query executions.
-
-
-> 
-> Note that for query
-> 
->         select *
->         from table_a
->         where table_a.col_a in (select table_b.col_b * table_a.col_c
->                         from table_b)
-> 
-> it's better to do
-> 
->  select distinct table_a.col_a
->  into table table_sub
->  from table_b, table_a
->         where table_a.col_a = table_b.col_b * table_a.col_c
-
-Yes, I had not thought of cases where they are doing correlated column
-arithmetic, but it looks like this would work.
-
-> 
-> once again - to avoid Cartesians.
-> 
-> But what could we do for
-> 
->         select *
->         from table_a
->         where table_a.col_a = (select max(table_b.col_b * table_a.col_c)
->                         from table_b)
-
-OK, who wrote this horrible query. :-)
-
-Without a join of table_b and table_a, even an SQL function would die on
-this.  You have to take the current value table_a.col_c, and multiply by
-every value of table_b.col_b to get the maximum.
-
-Trying to do a temp table on this is certainly going to be a cartesian
-product, but using an SQL function is also going to be a cartesian
-product, except that the product is generated in small pieces instead of
-in one big query.  The SQL function example may eventually complete, but
-it will take forever to do so in cases where the temp table would bomb.
-
-I can recommend some SQL books for anyone go sends in a bug report on
-this query. :-)
-
-
-
-> ???
->  select max(table_b.col_b * table_a.col_c), table_a.col_a
->  into table table_sub
->  from table_b, table_a
->         group by table_a.col_a
-> 
-> first tries to sort sizeof(table_a) * sizeof(table_b) tuples...
-> For tables big and small with 100 000 and 1000 tuples 
-> 
-> select max(x*y), x from big, small group by x
-> 
-> "ate" all free 140M in my file system after 20 minutes (just for
-> sorting - nothing more) and was killed...
-> 
-> select x from big where x = cor(x);
-> (cor(int4) is 'select max($1*y) from small') takes 20 minutes -
-> this is bad too.
-
-Again, my feeling is that in cases where the temp table would bomb, the
-SQL function will be so slow that neither will be acceptable.
-
-> 
-> > >
-> > > Actually, I don't see any problems if we going to process subselect
-> > > like sql-funcs: non-correlated subselects can be emulated by
-> > > funcs without args, for correlated subselects parser (analyze.c)
-> > > has to change all upper query references to $1, $2,...
-> > 
-> > Yes, logically, they are SQL functions, but aren't we going to see
-> > terrible performance in such circumstances.  My experience is that when
->   ^^^^^^^^^^^^^^^^^^^^
-> You're right.
-> 
-> > people are given subselects, they start to do huge jobs with them.
-> > 
-> > In fact, the final solution may be to have both methods available, and
-> > switch between them depending on the size of the query sets.  Each
-> > method has its advantages.  The function example lets the outside query
-> > be executed, and only calls the subquery when needed.
-> > 
-> > For large tables where the subselect is small and is the entire WHERE
-> > restriction, the SQL function gets called much too often.  A simple join
-> > of the subquery result and the large table would be much better.  This
-> > method also allows for sort/merge join of the subquery results, and
-> > index use.
-> 
-> ...keep thinking...
-> 
-> Vadim
-> 
-
-
--- 
-Bruce Momjian
-[email protected]
-
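The point made in the message above, that the join form and the per-row function form both compute the same |table_a| x |table_b| products, can be sketched as follows. This is a minimal illustration, assuming SQLite (through Python's sqlite3 module) as a stand-in for the backend discussed in the thread; the row values are invented, and a correlated scalar subquery stands in for the cor() SQL function.

```python
import sqlite3

# Assumption: SQLite stands in for the backend; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (col_a INTEGER, col_c INTEGER);
    CREATE TABLE table_b (col_b INTEGER);
    INSERT INTO table_a VALUES (1, 2), (2, 3), (3, -1);
    INSERT INTO table_b VALUES (10), (20), (-5);
""")

# One big query: cross join of both tables, then GROUP BY.
joined = conn.execute("""
    SELECT max(table_b.col_b * table_a.col_c), table_a.col_a
    FROM table_b, table_a
    GROUP BY table_a.col_a
    ORDER BY table_a.col_a
""").fetchall()

# Piecewise form: the subquery is re-evaluated once per table_a row,
# much like calling an SQL function with table_a.col_c as its argument.
per_row = conn.execute("""
    SELECT (SELECT max(table_b.col_b * table_a.col_c) FROM table_b),
           table_a.col_a
    FROM table_a
    ORDER BY table_a.col_a
""").fetchall()

# Either way, every (table_a row, table_b row) pair is multiplied once.
pairs = conn.execute("SELECT count(*) FROM table_a, table_b").fetchone()[0]
print(joined, per_row, pairs)
```

The two forms agree here because col_a is unique in the sample data; with duplicate col_a values the GROUP BY form would merge rows while the per-row form would not.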
-
-From [email protected] Thu Nov 20 00:09:18 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id AAA05239
-   for ; Thu, 20 Nov 1997 00:09:11 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id XAA13776; Wed, 19 Nov 1997 23:59:53 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 19 Nov 1997 23:58:49 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id XAA13599 for pgsql-hackers-outgoing; Wed, 19 Nov 1997 23:58:43 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id XAA13512 for ; Wed, 19 Nov 1997 23:58:16 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id XAA03103
-   for [email protected]; Wed, 19 Nov 1997 23:57:44 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] subselect
-To: [email protected] (PostgreSQL-development)
-Date: Wed, 19 Nov 1997 23:57:44 -0500 (EST)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-I am going to overhaul all the /parser files, and I may give subselects
-a try while I am in there.  This is where it is going to have to be done.
-
-Two things I think I need are:
-
-   temp tables that go away at the end of a statement, so if the
-query elog's out, the temp file gets destroyed
-
-   how do I implement "not in":
-
-       select * from a where x not in (select y from b)
-
-Using <> is not going to work because that returns multiple copies of a,
-one for every row of b that doesn't equal.  It is like we need not equals,
-but without returning multiple rows.
-
-Any ideas?
-
--- 
-Bruce Momjian
-[email protected]
-
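The problem described above, that a plain join on <> is not a substitute for NOT IN, can be reproduced in a few lines. This is a sketch under assumptions: SQLite (via Python's sqlite3) stands in for the backend, and the contents of a and b are invented. The NOT EXISTS form is shown only as one common alternative; it matches NOT IN only when no NULLs are involved.

```python
import sqlite3

# Assumption: SQLite stands in for the backend; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (y INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3);
""")

def column(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# What the user asked for: rows of a with no equal value in b.
not_in = column("SELECT x FROM a WHERE x NOT IN (SELECT y FROM b) ORDER BY x")

# The naive join: one output row per (a, b) pair that differs, so rows
# that should be excluded reappear and matching rows are duplicated.
neq_join = column("SELECT a.x FROM a, b WHERE a.x <> b.y ORDER BY a.x")

# An anti-join style formulation that returns each a row at most once.
not_exists = column(
    "SELECT x FROM a WHERE NOT EXISTS "
    "(SELECT 1 FROM b WHERE b.y = a.x) ORDER BY x"
)

print(not_in, neq_join, not_exists)
```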
-
-From [email protected] Thu Nov 20 10:00:59 1997
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id KAA22019
-   for ; Thu, 20 Nov 1997 10:00:56 -0500 (EST)
-Received: from golem.jpl.nasa.gov ([email protected] [128.149.70.168]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id JAA21662 for ; Thu, 20 Nov 1997 09:52:55 -0500 (EST)
-Received: from alumni.caltech.edu (localhost [127.0.0.1])
-   by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id GAA22754;
-   Thu, 20 Nov 1997 06:27:21 GMT
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Thu, 20 Nov 1997 06:27:21 +0000
-From: "Thomas G. Lockhart" 
-Organization: Caltech/JPL
-X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.30 i686)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: [HACKERS] subselect
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-> I am going to overhaul all the /parser files
-
-??
-
-> , and I may give subselects
-> a try while I am in there.  This is where it is going to have to be done.
-
-A first cut at the subselect syntax is already in gram.y. I'm sure that the
-e-mail you had sent which collected several items regarding subselects
-covers some of this topic. I've been thinking about subselects also, and
-had thought that there must be some existing mechanisms in the backend
-which can be used to help implement subselects. It seems to me that UNION
-might be a good thing to implement first, because it has a fairly
-well-defined set of behaviors:
-
-  select a union select b;
-
-chooses elements from a and from b and then sorts/uniques the result.
-
-  select a union all select b;
-
-chooses elements from a, sorts/uniques, and then adds all elements from b.
-
-  select a union select b union all select c;
-
-evaluates left to right, and first evaluates a union b, sorts/uniques, and
-then evaluates
-
-  (result) union all select c;
-
-There are several types of subselects. Examples of some are:
-
-1) select a.f from a union select b.f from b order by 1;
-Needs temporary table(s), optional sort/unique, final order by.
-
-2) select a.f from a where a.f in (select b.f from b);
-Needs temporary table(s). "in" can be first implemented by count(*) > 0 but
-it would perform better to have the backend return after the first
-match.
-
-3) select a.f from a where exists (select b.f from b where b.f = a.f);
-Need to do the select and do a subselect on _each_ of the returned values?
-Again could use count(*) to help implement.
-
-This brings up the point that perhaps the backend needs a row-counting
-atomic operation and count(*) could be re-implemented using that. At the
-moment count(*) is transformed to a select of OID columns and does not
-quite work on table joins.
-
-I would think that outer joins could use some of these support routines
-also.
-
-                                                       - Tom
-
-> Two things I think I need are:
->
->         temp tables that go away at the end of a statement, so if the
-> query elog's out, the temp file gets destroyed
->
->         how do I implement "not in":
->
->                 select * from a where x not in (select y from b)
->
-> Using <> is not going to work because that returns multiple copies of a,
-> one for every one that doesn't equal.  It is like we need not equals,
-> but don't return multiple rows.
->
-> Any ideas?
->
-> --
-> Bruce Momjian
-> [email protected]
-
-
-
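The UNION behaviors and the count(*) emulation of IN described in the message above can be checked against a real engine. This is a minimal sketch, assuming SQLite (via Python's sqlite3) as a stand-in; tables a, b and c and their contents are invented. In SQLite, UNION removes duplicates, UNION ALL keeps every row from both inputs, and compound selects group from left to right.

```python
import sqlite3

# Assumption: SQLite stands in for the backend; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (f INTEGER);
    CREATE TABLE b (f INTEGER);
    CREATE TABLE c (f INTEGER);
    INSERT INTO a VALUES (1), (1), (2);
    INSERT INTO b VALUES (2), (3);
    INSERT INTO c VALUES (3), (3);
""")

def column(sql):
    """Run a query and return its first column, sorted for comparison."""
    return sorted(row[0] for row in conn.execute(sql))

# UNION: duplicates are removed across both inputs.
union = column("SELECT f FROM a UNION SELECT f FROM b")

# UNION ALL: every row from both inputs is kept.
union_all = column("SELECT f FROM a UNION ALL SELECT f FROM b")

# Mixed chain: evaluated as (a UNION b) UNION ALL c.
mixed = column("SELECT f FROM a UNION SELECT f FROM b UNION ALL SELECT f FROM c")

# IN, and its emulation with a correlated count(*) > 0 test.
in_rows = column("SELECT a.f FROM a WHERE a.f IN (SELECT b.f FROM b)")
count_rows = column(
    "SELECT a.f FROM a WHERE (SELECT count(*) FROM b WHERE b.f = a.f) > 0"
)

print(union, union_all, mixed, in_rows, count_rows)
```

The count(*) form has to scan every matching row of b, which is the inefficiency the message points out; a test that stops at the first match avoids it.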
-
-From [email protected] Mon Dec 22 00:49:03 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id AAA13311
-   for ; Mon, 22 Dec 1997 00:49:01 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id AAA11930; Mon, 22 Dec 1997 00:45:41 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 22 Dec 1997 00:45:17 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id AAA11756 for pgsql-hackers-outgoing; Mon, 22 Dec 1997 00:45:14 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.5/8.7.5) with ESMTP id AAA11624 for ; Mon, 22 Dec 1997 00:44:57 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id AAA11605
-   for [email protected]; Mon, 22 Dec 1997 00:45:23 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] subselects
-To: [email protected] (PostgreSQL-development)
-Date: Mon, 22 Dec 1997 00:45:23 -0500 (EST)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-OK, a few questions:
-
-   Should we use sortmerge, so we can use our psort as temp tables,
-or do we use hashunique?
-
-   How do we pass the query to the optimizer?  How do we represent
-the range table for each, and the links between them in correlated
-subqueries?
-
-I have to think about this.  Comments are welcome.
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Mon Dec 22 02:01:27 1997
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id CAA20608
-   for ; Mon, 22 Dec 1997 02:01:25 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id BAA25136 for ; Mon, 22 Dec 1997 01:37:29 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id BAA25289; Mon, 22 Dec 1997 01:31:18 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 22 Dec 1997 01:30:45 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id BAA23854 for pgsql-hackers-outgoing; Mon, 22 Dec 1997 01:30:35 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.5/8.7.5) with ESMTP id BAA22847 for ; Mon, 22 Dec 1997 01:30:15 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id BAA17354
-   for [email protected]; Mon, 22 Dec 1997 01:05:04 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] subselects (fwd)
-To: [email protected] (PostgreSQL-development)
-Date: Mon, 22 Dec 1997 01:05:03 -0500 (EST)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Forwarded message:
-> OK, a few questions:
-> 
->  Should we use sortmerge, so we can use our psort as temp tables,
-> or do we use hashunique?
-> 
->  How do we pass the query to the optimizer?  How do we represent
-> the range table for each, and the links between them in correlated
-> subqueries?
-> 
-> I have to think about this.  Comments are welcome.
-
-One more thing.  I guess I am seeing subselects as a different thing
-than temp tables.  I can see people wanting to put indexes on their temp
-tables, so I think they will need more system catalog support.  For
-subselects, I think we can just stuff them into psort, perhaps, and do
-the unique as we unload them.
-
-Seems like a natural to me.
-
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Tue Dec 23 04:01:07 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id EAA08876
-   for ; Tue, 23 Dec 1997 04:00:57 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id QAA23042;
-   Tue, 23 Dec 1997 16:08:56 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 23 Dec 1997 16:08:56 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: [HACKERS] subselects (fwd)
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> Forwarded message:
-> > OK, a few questions:
-> >
-> >       Should we use sortmerge, so we can use our psort as temp tables,
-> > or do we use hashunique?
-> >
-> >       How do we pass the query to the optimizer?  How do we represent
-> > the range table for each, and the links between them in correlated
-> > subqueries?
-> >
-> > I have to think about this.  Comments are welcome.
-> 
-> One more thing.  I guess I am seeing subselects as a different thing
-> than temp tables.  I can see people wanting to put indexes on their temp
-> tables, so I think they will need more system catalog support.  For
-                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-What's the difference between temp tables and temp indices ?
-Both of them are handled via catalog cache...
-
-Vadim
-
-From [email protected] Sat Jan  3 04:01:00 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id EAA28565
-   for ; Sat, 3 Jan 1998 04:00:58 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id DAA19242 for ; Sat, 3 Jan 1998 03:47:07 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id QAA21017;
-   Sat, 3 Jan 1998 16:08:55 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sat, 03 Jan 1998 16:08:51 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian ,
-        "Thomas G. Lockhart" 
-Subject: Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> With UNIONs done, how are things going with you on subselects?  UNIONs
-> are much easier than subselects.
-> 
-> I am stumped on how to record the subselect query information in the
-> parser and stuff.
-
-   So am I. We definitely need an EXISTS node and maybe an IN one.
-Also, we have to support the ANY and ALL modifiers of comparison operators
-(it would be nice to support ANY and ALL for all operators returning
-bool: >, =, ..., like, ~ and so on). Note that IN is the same as
-= ANY (NOT IN ==> <> ALL) assuming that '=' means EQUAL for all data types,
-and so we could avoid an IN node, but I'm not sure that I like such an
-assumption: postgres is an OO-like system allowing operators to be overridden,
-and so '=' can, in theory, mean not EQUAL but something else (someday
-we could allow specifying the "meaning" of an operator in CREATE OPERATOR) -
-in short, I would like an IN node.
-   Also, I would suggest nodes for ANY and ALL.
-   (I need a few days to think more about how to record this stuff...)
-
-> 
-> Please let me know what I can do to help, if anything.
-
-Thanks. As I remember, Tom also wished to work here. Tom ?
-
-Bye,
-   Vadim
-
-P.S. I'll be "on-line" Jan 5.
-
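The equivalences mentioned above (IN as = ANY, NOT IN as <> ALL, and ANY/ALL applied to any boolean operator) can be sketched with SQL's three-valued logic modelled in plain Python. Everything here is illustrative: the function names are hypothetical, None plays the role of NULL/unknown, and the sketch assumes '=' really means equality, which is exactly the assumption the message questions.

```python
import operator
from typing import Any, Callable, Iterable, Optional

def sql_cmp(op: Callable[[Any, Any], bool], left: Any, right: Any) -> Optional[bool]:
    """Apply a comparison; a NULL (None) operand makes the result unknown."""
    if left is None or right is None:
        return None
    return op(left, right)

def cmp_any(op: Callable[[Any, Any], bool], left: Any, rows: Iterable[Any]) -> Optional[bool]:
    """left <op> ANY (rows): TRUE if any comparison is TRUE, else
    unknown if any was unknown, else FALSE (also FALSE for no rows)."""
    saw_unknown = False
    for value in rows:
        result = sql_cmp(op, left, value)
        if result is True:
            return True
        if result is None:
            saw_unknown = True
    return None if saw_unknown else False

def cmp_all(op: Callable[[Any, Any], bool], left: Any, rows: Iterable[Any]) -> Optional[bool]:
    """left <op> ALL (rows): FALSE if any comparison is FALSE, else
    unknown if any was unknown, else TRUE (also TRUE for no rows)."""
    saw_unknown = False
    for value in rows:
        result = sql_cmp(op, left, value)
        if result is False:
            return False
        if result is None:
            saw_unknown = True
    return None if saw_unknown else True

def sql_in(left: Any, rows: Iterable[Any]) -> Optional[bool]:
    """IN expressed as = ANY."""
    return cmp_any(operator.eq, left, rows)

def sql_not_in(left: Any, rows: Iterable[Any]) -> Optional[bool]:
    """NOT IN expressed as <> ALL."""
    return cmp_all(operator.ne, left, rows)
```

Because the operator is a parameter, the same two helpers cover > ANY, <= ALL and user-defined predicates; a dedicated IN node would only be needed if IN were to mean something other than = ANY.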
-From [email protected] Mon Jan  5 07:30:51 1998
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id HAA05466
-   for ; Mon, 5 Jan 1998 07:30:49 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id HAA04700; Mon, 5 Jan 1998 07:22:06 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 05 Jan 1998 07:21:45 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id HAA02846 for pgsql-hackers-outgoing; Mon, 5 Jan 1998 07:21:35 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by hub.org (8.8.5/8.7.5) with ESMTP id HAA00903 for ; Mon, 5 Jan 1998 07:20:57 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id TAA24278;
-   Mon, 5 Jan 1998 19:36:06 +0700 (KRS)
-   (envelope-from [email protected])
-Message-ID: <[email protected]>
-Date: Mon, 05 Jan 1998 19:35:59 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: [HACKERS] subselect
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> I was thinking about subselects, and how to attach the two queries.
-> 
-> What if the subquery makes a range table entry in the outer query, and
-> the query is set up like the UNION queries where we put the scans in a
-> row, but in the case we put them over/under each other.
-> 
-> And we push a temp table into the catalog cache that represents the
-> result of the subquery, then we could join to it in the outer query as
-> though it was a real table.
-> 
-> Also, can't we do the correlated subqueries by adding the proper
-> target/output columns to the subquery, and have the outer query
-> reference those columns in the subquery range table entry.
-
-Yes, this is a way to handle subqueries: by joining to a temp table.
-After getting the plan we could change the temp table access path to a
-Material node. On the other hand, it could be useful to let the optimizer
-know about the cost of temp table creation (have to think more about it)...
-Unfortunately, not all subqueries can be handled by "normal" joins: NOT IN
-is one example of this - joining by <> will give us invalid results.
-Setting a special NOT EQUAL flag is not enough: the subquery plan must
-always be the inner one in this case. The same goes for handling the ALL modifier.
-Note that we generally can't use aggregates here: we can't add MAX to the
-subquery in the case of > ALL (subquery), because > ALL should return FALSE
-if the subquery returns NULL(s) but aggregates don't take NULLs into account.
-
-> 
-> Maybe I can write up a sample of this?  Vadim, would this help?  Is this
-> the point we are stuck at?
-
-Personally, I was stuck by holidays -:)
-Now I can spend ~ 8 hours ~ each day for development...
-
-Vadim
-
-
-From [email protected] Mon Jan  5 10:45:30 1998
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id KAA10769
-   for ; Mon, 5 Jan 1998 10:45:28 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id KAA17823; Mon, 5 Jan 1998 10:32:00 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 05 Jan 1998 10:31:45 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id KAA17757 for pgsql-hackers-outgoing; Mon, 5 Jan 1998 10:31:38 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.5/8.7.5) with ESMTP id KAA17727 for ; Mon, 5 Jan 1998 10:31:06 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id KAA10375;
-   Mon, 5 Jan 1998 10:28:48 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] subselect
-To: [email protected] (Vadim B. Mikheev)
-Date: Mon, 5 Jan 1998 10:28:48 -0500 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Jan 5, 98 07:35:59 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> Yes, this is a way to handle subqueries by joining to temp table.
-> After getting plan we could change temp table access path to
-> node material. On the other hand, it could be useful to let optimizer
-> know about cost of temp table creation (have to think more about it)...
-> Unfortunately, not all subqueries can be handled by "normal" joins: NOT IN
-> is one example of this - joining by <> will give us invalid results.
-> Setting special NOT EQUAL flag is not enough: subquery plan must be
-> always inner one in this case. The same for handling ALL modifier.
-> Note that we generally can't use aggregates here: we can't add MAX to
-> subquery in the case of > ALL (subquery), because > ALL should return FALSE
-> if subquery returns NULL(s) but aggregates don't take NULLs into account.
-
-OK, here are my ideas.  First, I think you have to handle subselects in
-the outer node because a subquery could have its own subquery.  Also, we
-now have a field in Aggreg to allow us to 'usenulls'.
-
-OK, here it is.  I recommend we pass the outer and subquery through
-the parser and optimizer separately.
-
-We parse the subquery first.  If the subquery is not correlated, it
-should parse fine.  If it is correlated, any columns we find in the
-subquery that are not already in the FROM list, we add the table to the
-subquery FROM list, and add the referenced column to the target list of
-the subquery.
-
-When we are finished parsing the subquery, we create a catalog cache
-entry for it called 'sub1' and make its fields match the target
-list of the subquery.
-
-In the outer query, we add 'sub1' to its FROM list, and change
-the subquery reference to point to the new range table entry.  We also add
-WHERE clauses to do any correlated joins.
-
-Here is a simple example:
-
-   select *
-   from taba
-   where col1 = (select col2
-             from tabb)
-
-This is not correlated, and the subquery parses easily.  We create a
-'sub1' catalog cache entry, and add 'sub1' to the outer query FROM
-clause.  We also replace 'col1 = (subquery)' with 'col1 = sub1.col2'.
-
-Here is a more complex correlated subquery:
-
-   select *
-   from taba
-   where col1 = (select col2
-             from tabb
-             where taba.col3 = tabb.col4)
-
-Here we must add 'taba' to the subquery's FROM list, and add col3 to the
-target list of the subquery.  After we parse the subquery, add 'sub1' to
-the FROM list of the outer query, change 'col1 = (subquery)' to 'col1 =
-sub1.col2', and add to the outer WHERE clause 'AND taba.col3 = sub1.col3'.
-The optimizer will do the correlation for us.
-
-In the optimizer, we can plan the subquery first, then the outer query,
-and then replace all 'sub1' references in the outer query to use the
-subquery plan.
-
-I realize merging the two plans and doing IN and NOT IN is the
-real challenge, but I hoped this would give us a start.
-
-What do you think?
-
--- 
-Bruce Momjian
-[email protected]
-
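The correlated-subquery rewrite proposed above can be tried out as a join against a derived table named sub1. This is only a sketch under assumptions: SQLite (via Python's sqlite3) stands in for the backend, a derived table stands in for the proposed catalog cache entry, and the data is invented so that the subquery returns at most one row per outer row. With several matching rows, or duplicate col3 values in taba, the join form would return extra rows, so the two queries are not equivalent in general.

```python
import sqlite3

# Assumption: SQLite stands in for the backend; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taba (col1 INTEGER, col3 INTEGER);
    CREATE TABLE tabb (col2 INTEGER, col4 INTEGER);
    INSERT INTO taba VALUES (10, 100), (20, 200), (30, 300);
    INSERT INTO tabb VALUES (10, 100), (99, 200);
""")

# Original form: the subquery refers to the outer row through taba.col3.
correlated = conn.execute("""
    SELECT col1, col3
    FROM taba
    WHERE col1 = (SELECT col2
                  FROM tabb
                  WHERE taba.col3 = tabb.col4)
    ORDER BY col1
""").fetchall()

# Rewritten form: taba is added to the subquery's FROM list and col3 to
# its target list; the outer query then joins to the result as 'sub1'.
rewritten = conn.execute("""
    SELECT taba.col1, taba.col3
    FROM taba,
         (SELECT tabb.col2 AS col2, taba.col3 AS col3
          FROM tabb, taba
          WHERE taba.col3 = tabb.col4) AS sub1
    WHERE taba.col1 = sub1.col2
      AND taba.col3 = sub1.col3
    ORDER BY taba.col1
""").fetchall()

print(correlated, rewritten)
```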
-
-From [email protected] Mon Jan  5 15:02:46 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id PAA28690
-   for ; Mon, 5 Jan 1998 15:02:44 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id OAA08811 for ; Mon, 5 Jan 1998 14:28:43 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id CAA24904;
-   Tue, 6 Jan 1998 02:56:00 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 06 Jan 1998 02:55:57 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected]
-Subject: Re: [HACKERS] subselect
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > always inner one in this case. The same for handling ALL modifier.
-> > Note that we generally can't use aggregates here: we can't add MAX to
-> > subquery in the case of > ALL (subquery), because > ALL should return FALSE
-> > if subquery returns NULL(s) but aggregates don't take NULLs into account.
-> 
-> OK, here are my ideas.  First, I think you have to handle subselects in
-> the outer node because a subquery could have its own subquery.  Also, we
-
-I hope that this doesn't matter: if the results of a subquery (with/without sub-subqueries)
-go into a temp table then this table will be re-scanned for each outer tuple.
-
-> now have a field in Aggreg to all us to 'usenulls'.
-                                           ^^^^^^^^
- This can't help:
-
-vac=> select * from x;
-y
--
-1
-2
-3
- <<< this is NULL
-(4 rows)
-
-vac=> select max(y) from x;
-max
----
-  3
-
-==> we can't replace 
-
-select * from A where A.a > ALL (select y from x);
-                                 ^^^^^^^^^^^^^^^
-           (NULL will be returned and so A.a > ALL is FALSE - this is what 
-            Sybase does, is it right ?)
-with
-
-select * from A where A.a > (select max(y) from x);
-                             ^^^^^^^^^^^^^^^^^^^^
-simply because we lose knowledge about NULLs here.
-
-Also, I would like to handle ANY and ALL modifiers for all bool
-operators, either built-in or user-defined, for all data types -
-isn't PostgreSQL an OO-like RDBMS? -:)
-
-> OK, here it is.  I recommend we pass the outer and subquery through
-> the parser and optimizer separately.
-
-I don't like this. I would like to get the parse-tree from the parser for
-the entire query and let the optimizer (on its upper level) decide how to rewrite
-the parse-tree, what plans to produce and how these plans should be
-merged. Note that I don't object to your methods below, only to where
-the handling is placed. I don't understand why we should add a
-new part to the system which will do the optimizer's work (parse-tree -->
-execution plan) and deal with optimizer nodes. Imho, the upper optimizer
-level is a nice place to do this.
-
-> 
-> We parse the subquery first.  If the subquery is not correlated, it
-> should parse fine.  If it is correlated, any columns we find in the
-> subquery that are not already in the FROM list, we add the table to the
-> subquery FROM list, and add the referenced column to the target list of
-> the subquery.
-> 
-> When we are finished parsing the subquery, we create a catalog cache
-> entry for it called 'sub1' and make its fields match the target
-> list of the subquery.
-> 
-> In the outer query, we add 'sub1' to its target list, and change
-> the subquery reference to point to the new range table.  We also add
-> WHERE clauses to do any correlated joins.
-...
-> Here is a more complex correlated subquery:
-> 
->         select *
->         from taba
->         where col1 = (select col2
->                       from tabb
->                       where taba.col3 = tabb.col4)
-> 
-> Here we must add 'taba' to the subquery's FROM list, and add col3 to the
-> target list of the subquery.  After we parse the subquery, add 'sub1' to
-> the FROM list of the outer query, change 'col1 = (subquery)' to 'col1 =
-> sub1.col2', and add to the outer WHERE clause 'AND taba.col3 = sub1.col3'.
-> THe optimizer will do the correlation for us.
-> 
-> In the optimizer, we can parse the subquery first, then the outer query,
-> and then replace all 'sub1' references in the outer query to use the
-> subquery plan.
-> 
-> I realize making merging the two plans and doing IN and NOT IN is the
-                   ^^^^^^^^^^^^^^^^^^^^^
-This is very easy to do! As I already said, we just have to replace the sub1
-access path (SeqScan of sub1) with a SeqScan of a Material node holding the
-subquery plan.
-
-> real challenge, but I hoped this would give us a start.
-
-A decision about how to record the subquery stuff in the parse-tree
-would be a very good start -:)
-
-BTW, note that for _expression_ subqueries (which are introduced without
-IN, EXISTS, ALL, ANY - this follows Sybase's naming) - as in your examples -
-we have to check that the subquery returns a single tuple...
-
-Vadim
-
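The objection above, that max(y) cannot replace > ALL once the subquery returns a NULL, can be reproduced as follows. This is a sketch under assumptions: SQLite (via Python's sqlite3) stands in for the backend, and since SQLite has no ALL quantifier the > ALL test is emulated with NOT EXISTS over the rows for which the comparison is not true. Under standard SQL three-valued logic the > ALL result for such a row is unknown rather than FALSE, but a WHERE clause rejects the row either way.

```python
import sqlite3

# Assumption: SQLite stands in for the backend; tables follow the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (a INTEGER);
    CREATE TABLE x (y INTEGER);
    INSERT INTO A VALUES (2), (5);
    INSERT INTO x VALUES (1), (2), (3), (NULL);
""")

# The aggregate skips the NULL row entirely.
max_y = conn.execute("SELECT max(y) FROM x").fetchone()[0]

# Rewrite with max(): the NULL is forgotten, so a = 5 qualifies.
with_max = [a for (a,) in conn.execute(
    "SELECT a FROM A WHERE a > (SELECT max(y) FROM x) ORDER BY a"
)]

# Emulation of a > ALL (SELECT y FROM x): a row qualifies only if no
# x row makes the comparison false or unknown.
with_all = [a for (a,) in conn.execute("""
    SELECT a FROM A
    WHERE NOT EXISTS (SELECT 1 FROM x
                      WHERE A.a IS NULL OR x.y IS NULL OR NOT (A.a > x.y))
    ORDER BY a
""")]

print(max_y, with_max, with_all)
```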
-From [email protected] Mon Jan  5 20:31:03 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id UAA06836
-   for ; Mon, 5 Jan 1998 20:31:01 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id TAA29980 for ; Mon, 5 Jan 1998 19:56:05 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id TAA28044; Mon, 5 Jan 1998 19:06:11 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 05 Jan 1998 19:03:16 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id TAA27203 for pgsql-hackers-outgoing; Mon, 5 Jan 1998 19:03:02 -0500 (EST)
-Received: from clio.trends.ca ([email protected] [209.47.148.2]) by hub.org (8.8.8/8.7.5) with ESMTP id TAA27049 for ; Mon, 5 Jan 1998 19:02:30 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67])
-   by clio.trends.ca (8.8.8/8.8.8) with ESMTP id RAA09337
-   for ; Mon, 5 Jan 1998 17:31:04 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id RAA02675;
-   Mon, 5 Jan 1998 17:16:40 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] subselect
-To: [email protected] (Vadim B. Mikheev)
-Date: Mon, 5 Jan 1998 17:16:40 -0500 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Jan 6, 98 05:18:11 am
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> > I am confused.  Do you want one flat query and want to pass the whole
-> > thing into the optimizer?  That brings up some questions:
-> 
-> No. I just want to follow Tom's way: I would like to see new
-> SubSelect node as shortened version of struct Query (or use
-> Query structure for each subquery - no matter for me), some 
-> subquery-related stuff added to Query (and SubSelect) to help
-> optimizer to start, and see
-
-OK, so you want the subquery to actually be INSIDE the outer query
-expression.  Do they share a common range table?  If they don't, we
-could very easily just fly through when processing the WHERE clause, and
-start a new query using a new query structure for the subquery.  Believe
-me, you don't want a separate SubQuery-type, just re-use Query for it. 
-It allows you to call all the normal query stuff with a consistent
-structure.
-
-The parser will need to know it is in a subquery, so it can add the
-proper target columns to the subquery, or are you going to do that in
-the optimizer?  You can do it in the optimizer, and join the range table
-references there too.
-
-> 
-> typedef struct A_Expr
-> {
->     NodeTag     type;
->     int         oper;           /* type of operation
->                                  * {OP,OR,AND,NOT,ISNULL,NOTNULL} */
->     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
->             IN, NOT IN, ANY, ALL, EXISTS here,
-> 
->     char       *opname;         /* name of operator/function */
->     Node       *lexpr;          /* left argument */
->     Node       *rexpr;          /* right argument */
->     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
->             and SubSelect (Query) here (as possible case).
-> 
-> One thought to follow this way: RULEs (and so - VIEWs) are handled by using
-> Query - how else can we implement VIEWs on selects with subqueries ?
-
-Views are stored as nodeout structures, and are merged into the query's
-from list, target list, and where clause.  I am working out
-readfunc,outfunc now to make sure they are up-to-date with all the
-current fields.
-
-> 
-> BTW, is
-> 
-> select * from A where (select TRUE from B);
-> 
-> valid syntax ?
-
-I don't think so.
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Mon Jan  5 17:01:54 1998
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id RAA02066
-   for ; Mon, 5 Jan 1998 17:01:47 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id FAA25063;
-   Tue, 6 Jan 1998 05:18:13 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 06 Jan 1998 05:18:11 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected]
-Subject: Re: [HACKERS] subselect
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > > OK, here it is.  I recommend we pass the outer and subquery through
-> > > the parser and optimizer separately.
-> >
-> > I don't like this. I would like to get parse-tree from parser for
-> > entire query and let optimizer (on upper level) decide how to rewrite
-> > parse-tree and what plans to produce and how these plans should be
-> > merged. Note, that I don't object your methods below, but only where
-> > to place handling of this. I don't understand why should we add
-> > new part to the system which will do optimizer' work (parse-tree -->
-> > execution plan) and deal with optimizer nodes. Imho, upper optimizer
-> > level is nice place to do this.
-> 
-> I am confused.  Do you want one flat query and want to pass the whole
-> thing into the optimizer?  That brings up some questions:
-
-No. I just want to follow Tom's way: I would like to see new
-SubSelect node as shortened version of struct Query (or use
-Query structure for each subquery - no matter for me), some 
-subquery-related stuff added to Query (and SubSelect) to help
-optimizer to start, and see
-
-typedef struct A_Expr
-{
-    NodeTag     type;
-    int         oper;           /* type of operation
-                                 * {OP,OR,AND,NOT,ISNULL,NOTNULL} */
-    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-            IN, NOT IN, ANY, ALL, EXISTS here,
-
-    char       *opname;         /* name of operator/function */
-    Node       *lexpr;          /* left argument */
-    Node       *rexpr;          /* right argument */
-    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-            and SubSelect (Query) here (as possible case).
-
-One thought to follow this way: RULEs (and so - VIEWs) are handled by using
-Query - how else can we implement VIEWs on selects with subqueries ?
-
-BTW, is
-
-select * from A where (select TRUE from B);
-
-valid syntax ?
-
-Vadim
-
-From [email protected] Mon Jan  5 18:00:57 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id SAA03296
-   for ; Mon, 5 Jan 1998 18:00:55 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id RAA20716 for ; Mon, 5 Jan 1998 17:22:21 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id FAA25094;
-   Tue, 6 Jan 1998 05:49:02 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 06 Jan 1998 05:48:58 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Goran Thyni 
-CC: [email protected], [email protected]
-Subject: Re: [HACKERS] subselect
-References: <[email protected]> <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Goran Thyni wrote:
-> 
-> Vadim,
-> 
->    Unfortunately, not all subqueries can be handled by "normal" joins: NOT IN
->    is one example of this - joining by <> will give us invalid results.
-> 
-> What is your approach towards this problem?
-
-Actually, this is a problem of the ALL modifier (NOT IN is _not_equal_ ALL),
-so we need not just a NOT EQUAL flag but an ALL node with a modified
-operator.
-
-After that, one way is to put the subquery into the inner plan of a join
-node, to be sure that for each outer tuple all corresponding subquery tuples
-are tested with the modified operator (this will require either changing
-the code of all join nodes or adding a new plan type - we'll see),
-and another way is ... suggested by you:
-
-> I got an idea that one could reverse the order,
-> that is execute the outer first into a temptable
-> and delete from that according to the result of the
-> subquery and then return it.
-> Probably this is too raw and slow. ;-)
-
-This will be faster in some cases (when the subquery returns many results
-and there are "not so many" results from the outer query) - thanks for the idea!
-
-> 
->    Personally, I was stuck by holidays -:)
->    Now I can spend ~ 8 hours ~ each day for development...
-> 
-> Oh, isn't it christmas eve right now in Russia?
-
-Due to historic reasons New Year is a mu-u-u-uch more popular
-holiday in Russia -:)
-
-Vadim
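
A minimal sketch of the point made above, that NOT IN is "<> ALL" and cannot be replaced by a plain join on <>. The function names are hypothetical, values are plain ints, and NULL handling is deliberately ignored; this only illustrates the semantics and is not planner or executor code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* x NOT IN (inner), read as x <> ALL (inner):
 * true only if x differs from every inner value.
 * Assumption: no NULLs, which would need three-valued logic. */
bool sketch_not_in_all(int x, const int *inner, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (inner[i] == x)
            return false;   /* one equal value is enough to reject x */
    }
    return true;            /* also true for an empty inner set */
}

/* What a plain join on <> would do instead: emit one output row for
 * every inner value that differs from x. */
size_t sketch_ne_join_rows(int x, const int *inner, size_t n)
{
    size_t rows = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (inner[i] != x)
            rows++;
    }
    return rows;
}
```

With inner = {1, 2} and x = 1, the <> join still emits a row (because 1 <> 2) although 1 NOT IN (1, 2) is false; with an empty inner set the join emits nothing although NOT IN holds. That is why every inner tuple has to be tested for one outer tuple.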
-
-From [email protected] Mon Jan  5 18:00:59 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id SAA03300
-   for ; Mon, 5 Jan 1998 18:00:57 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id RAA21652 for ; Mon, 5 Jan 1998 17:42:15 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id GAA25129;
-   Tue, 6 Jan 1998 06:10:05 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 06 Jan 1998 06:09:56 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected]
-Subject: Re: [HACKERS] subselect
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > > I am confused.  Do you want one flat query and want to pass the whole
-> > > thing into the optimizer?  That brings up some questions:
-> >
-> > No. I just want to follow Tom's way: I would like to see new
-> > SubSelect node as shortened version of struct Query (or use
-> > Query structure for each subquery - no matter for me), some
-> > subquery-related stuff added to Query (and SubSelect) to help
-> > optimizer to start, and see
-> 
-> OK, so you want the subquery to actually be INSIDE the outer query
-> expression.  Do they share a common range table?  If they don't, we
-               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-No.
-
-> could very easily just fly through when processing the WHERE clause, and
-> start a new query using a new query structure for the subquery.  Believe
-   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-... and filling some subquery-related stuff in upper query structure -
-still don't know what exactly this could be -:)
-
-> me, you don't want a separate SubQuery-type, just re-use Query for it.
-> It allows you to call all the normal query stuff with a consistent
-> structure.
-
-No objections.
-
-> 
-> The parser will need to know it is in a subquery, so it can add the
-> proper target columns to the subquery, or are you going to do that in
-
-I don't think we need that, but a list of correlation clauses
-could be a good thing - after all, the parser has to check all column
-references...
-
-> the optimizer.  You can do it in the optimizer, and join the range table
-> references there too.
-
-Yes.
-
-> > typedef struct A_Expr
-> > {
-> >     NodeTag     type;
-> >     int         oper;           /* type of operation
-> >                                  * {OP,OR,AND,NOT,ISNULL,NOTNULL} */
-> >     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-> >             IN, NOT IN, ANY, ALL, EXISTS here,
-> >
-> >     char       *opname;         /* name of operator/function */
-> >     Node       *lexpr;          /* left argument */
-> >     Node       *rexpr;          /* right argument */
-> >     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-> >             and SubSelect (Query) here (as possible case).
-> >
-> > One thought to follow this way: RULEs (and so - VIEWs) are handled by using
-> > Query - how else can we implement VIEWs on selects with subqueries ?
-> 
-> Views are stored as nodeout structures, and are merged into the query's
-> from list, target list, and where clause.  I am working out
-> readfunc,outfunc now to make sure they are up-to-date with all the
-> current fields.
-
-Nice! This stuff has been out-of-date for too long.
-
-> > BTW, is
-> >
-> > select * from A where (select TRUE from B);
-> >
-> > valid syntax ?
-> 
-> I don't think so.
-
-And so, *rexpr can be of Query type only when oper is one of OP, IN, NOT IN,
-ANY, ALL, EXISTS - fine.
-
-(Time to sleep -:)
-
-Vadim
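
A minimal sketch of the rule reached above: the right argument of an A_Expr may be a subselect only for OP, IN, NOT IN, ANY, ALL and EXISTS. All type and function names here are hypothetical stand-ins invented for illustration; they are not the actual parser node definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the PostgreSQL nodes. */
typedef enum SketchNodeTag { SK_T_VAR, SK_T_SUBSELECT } SketchNodeTag;

typedef struct SketchNode
{
    SketchNodeTag type;
} SketchNode;

typedef enum SketchExprKind
{
    /* kinds listed in the quoted A_Expr comment */
    SK_OP, SK_OR, SK_AND, SK_NOT, SK_ISNULL, SK_NOTNULL,
    /* kinds proposed in the thread for subselects */
    SK_IN, SK_NOT_IN, SK_ANY, SK_ALL, SK_EXISTS
} SketchExprKind;

typedef struct SketchAExpr
{
    SketchExprKind oper;    /* type of operation */
    const char *opname;     /* name of operator/function */
    SketchNode *lexpr;      /* left argument */
    SketchNode *rexpr;      /* right argument, possibly a subselect */
} SketchAExpr;

/* True for the kinds that may carry a subselect on the right side. */
bool sketch_kind_allows_subselect(SketchExprKind kind)
{
    switch (kind)
    {
        case SK_OP:
        case SK_IN:
        case SK_NOT_IN:
        case SK_ANY:
        case SK_ALL:
        case SK_EXISTS:
            return true;
        default:
            return false;
    }
}

/* Reject a subselect under AND, OR, NOT, ISNULL or NOTNULL. */
bool sketch_aexpr_is_valid(const SketchAExpr *expr)
{
    assert(expr != NULL);
    if (expr->rexpr != NULL && expr->rexpr->type == SK_T_SUBSELECT)
        return sketch_kind_allows_subselect(expr->oper);
    return true;
}
```

Under this check a bare "where (select TRUE from B)" has no operator kind that accepts it, which matches the "I don't think so" answer above.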
-
-From [email protected] Thu Jan  8 23:10:50 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id XAA09707
-   for ; Thu, 8 Jan 1998 23:10:48 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id XAA19334 for ; Thu, 8 Jan 1998 23:08:49 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id XAA14375; Thu, 8 Jan 1998 23:03:29 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 08 Jan 1998 23:03:10 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id XAA14345 for pgsql-hackers-outgoing; Thu, 8 Jan 1998 23:03:06 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.8/8.7.5) with ESMTP id XAA14008 for ; Thu, 8 Jan 1998 23:00:50 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id WAA09243;
-   Thu, 8 Jan 1998 22:55:03 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Thu, 8 Jan 1998 22:55:03 -0500 (EST)
-Cc: [email protected] (PostgreSQL-development)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Vadim, I know you are still thinking about subselects, but I have some
-more clarification that may help.
-
-We have to add phantom range table entries to correlated subselects so
-they will pass the parser.  We might as well add those fields to the
-target list of the subquery at the same time:
-
-   select *
-   from taba
-   where col1 = (select col2
-             from tabb
-             where taba.col3 = tabb.col4)
-
-becomes:
-
-   select *
-   from taba
-   where col1 = (select col2, tabb.col4 <---
-             from tabb, taba  <---
-             where taba.col3 = tabb.col4)
-
-We add a field to TargetEntry and RangeTblEntry to mark the fact that it
-was entered as a correlation entry:
-
-   bool    isCorrelated;
-
-Second, we need to hook the subselect to the main query.  I recommend we
-add two fields to Query for this:
-
-   Query *parentQuery;
-   List *subqueries;
-
-The parentQuery pointer is used to resolve field names in the correlated
-subquery.
-
-   select *
-   from taba
-   where col1 = (select col2, tabb.col4 <---
-             from tabb, taba  <---
-             where taba.col3 = tabb.col4)
-
-In the query above, the subquery can be easily parsed, and we add the
-subquery to the parent's subqueries list.
-
-In the parent query, to parse the WHERE clause, we create a new operator
-type, called IN or NOT_IN, or ALL, where the left side is a Var, and the
-right side is an index to a slot in the subqueries List.
-
-We can then do the rest in the upper optimizer.
-
--- 
-Bruce Momjian
-[email protected]
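
A minimal sketch of how a parentQuery pointer could be used to resolve a table name inside a correlated subquery. SketchQuery, its fixed-size range table and sketch_resolve_rel are hypothetical and only illustrate the lookup; the subqueries list is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_RELS 4

/* Hypothetical, heavily reduced stand-in for Query. */
typedef struct SketchQuery
{
    struct SketchQuery *parentQuery;            /* NULL for the outermost query */
    const char *rangeTable[SKETCH_MAX_RELS];    /* relation names in FROM */
    size_t nrels;                               /* number of entries in use */
} SketchQuery;

/* Look for relname in this query's range table, then in each enclosing
 * query.  Returns how many levels up it was found (0 = this query),
 * or -1 if no enclosing query knows the relation. */
int sketch_resolve_rel(const SketchQuery *query, const char *relname)
{
    int levels_up = 0;

    for (; query != NULL; query = query->parentQuery, levels_up++)
    {
        for (size_t i = 0; i < query->nrels; i++)
        {
            if (strcmp(query->rangeTable[i], relname) == 0)
                return levels_up;
        }
    }
    return -1;
}
```

For the taba/tabb example, resolving "taba" from inside the subquery returns 1, which marks taba.col3 as a correlated reference without copying taba into the subquery's own range table.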
-
-
-From [email protected] Fri Jan  9 10:01:01 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id KAA27305
-   for ; Fri, 9 Jan 1998 10:00:59 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id JAA21583 for ; Fri, 9 Jan 1998 09:52:17 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id WAA01623;
-   Fri, 9 Jan 1998 22:10:25 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 09 Jan 1998 22:10:06 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> Vadim, I know you are still thinking about subselects, but I have some
-> more clarification that may help.
-> 
-> We have to add phantom range table entries to correlated subselects so
-> they will pass the parser.  We might as well add those fields to the
-> target list of the subquery at the same time:
-> 
->         select *
->         from taba
->         where col1 = (select col2
->                       from tabb
->                       where taba.col3 = tabb.col4)
-> 
-> becomes:
-> 
->         select *
->         from taba
->         where col1 = (select col2, tabb.col4 <---
->                       from tabb, taba  <---
->                       where taba.col3 = tabb.col4)
-> 
-> We add a field to TargetEntry and RangeTblEntry to mark the fact that it
-> was entered as a correlation entry:
-> 
->         bool    isCorrelated;
-
-No, I would rather not add anything in the parser. Example:
-
-        select *
-        from tabA
-        where col1 = (select col2
-                      from tabB
-                      where tabA.col3 = tabB.col4
-                      and exists (select * 
-                                  from tabC 
-                                  where tabB.colX = tabC.colX and
-                                        tabC.colY = tabA.col2)
-                     )
-
-: a column of tabA is referenced in the sub-subselect
-(is it allowed by the standard ?) - in this case it's better
-not to add tabA to the 1st subselect but to add tabA to the second one
-and change tabA.col3 in the 1st to reference col3 in the 2nd subquery's temp table -
-this gives us a 2-table join in the 1st subquery instead of a 3-table join.
-(And I'm still not sure that using temp tables is the best that can be
-done in all cases...)
-
-Instead of using isCorrelated in TE & RTE we can add 
-
-Index varlevel;
-
-to Var node to reflect the (sub)query from which this Var comes
-(i.e. whose range table is used to find the var's relation via varno). Upmost query
-will have varlevel = 0, all its (direct) children - varlevel = 1 and so on.
-                        ^^^                         ^^^^^^^^^^^^
-(I don't see problems with distinguishing Vars of different children
-on the same level...)
-
-> 
-> Second, we need to hook the subselect to the main query.  I recommend we
-> add two fields to Query for this:
-> 
->         Query *parentQuery;
->         List *subqueries;
-
-Agreed. And maybe Index queryLevel.
-
-> In the parent query, to parse the WHERE clause, we create a new operator
-> type, called IN or NOT_IN, or ALL, where the left side is a Var, and the
-                                               ^^^^^^^^^^^^^^^^^^
-No. We have to handle (a,b,c) OP (select x, y, z ...) and 
-'_a_constant_' OP (select ...) - I don't know whether the latter is in the
-standard, but Sybase has it.
-
-Well,
-
-typedef enum OpType
-{
-    OP_EXPR, FUNC_EXPR, OR_EXPR, AND_EXPR, NOT_EXPR
-
-+ OP_EXISTS, OP_ALL, OP_ANY
-
-} OpType;
-
-typedef struct Expr
-{
-    NodeTag     type;
-    Oid         typeOid;        /* oid of the type of this expr */
-    OpType      opType;         /* type of the op */
-    Node       *oper;           /* could be Oper or Func */
-    List       *args;           /* list of argument nodes */
-} Expr;
-
-OP_EXISTS: oper is NULL, lfirst(args) is SubSelect (index in subqueries
-           List, following your suggestion)
-
-OP_ALL, OP_ANY:
-
-oper is a List of Oper nodes. We need a list because the data types of
-a, b, c (above) can differ, and so the Oper nodes will differ too.
-
-lfirst(args) is List of expression nodes (Const, Var, Func ?, a + b ?) -
-left side of subquery' operator.
-lsecond(args) is SubSelect.
-
-Note that there are no OP_IN, OP_NOTIN in OpType-s for Expr. We need
-IN, NOTIN in A_Expr (parser node), but the parser has to transform both
-of them into the corresponding ANY and ALL. At the moment we can do:
-
-IN --> = ANY, NOT IN --> <> ALL
-
-but this will be a "known bug": it breaks the OO-nature of Postgres, because
-operators can be overridden and '=' can mean  s o m e t h i n g (not equality).
-Example: box data type. For boxes, = means equality of _areas_ and =~
-means that boxes are the same ==> =~ ANY should be used for IN.
-
-> right side is an index to a slot in the subqueries List.
-
-Vadim
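
A minimal sketch of the IN --> = ANY, NOT IN --> <> ALL transformation described above. The enum and struct names are hypothetical, and the operator is carried as a plain string; as the mail notes, hard-wiring '=' is a known limitation for types such as box, where another operator expresses sameness.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser-level kinds (the A_Expr side). */
typedef enum SketchParseKind { SK_PARSE_IN, SK_PARSE_NOT_IN } SketchParseKind;

/* Hypothetical planner-level kinds, mirroring the proposed OpType additions. */
typedef enum SketchOpType { SK_OP_EXISTS, SK_OP_ALL, SK_OP_ANY } SketchOpType;

typedef struct SketchSubOp
{
    SketchOpType opType;    /* ANY or ALL */
    const char *opname;     /* operator applied to each subquery row */
} SketchSubOp;

/* Rewrite IN as "= ANY" and NOT IN as "<> ALL".
 * Assumption: '=' and '<>' are the right operators for the data type. */
SketchSubOp sketch_rewrite_in(SketchParseKind kind)
{
    SketchSubOp result;

    if (kind == SK_PARSE_IN)
    {
        result.opType = SK_OP_ANY;
        result.opname = "=";
    }
    else
    {
        result.opType = SK_OP_ALL;
        result.opname = "<>";
    }
    return result;
}
```

Keeping only ANY and ALL after this step means later stages never see IN or NOT IN, which is the reason no OP_IN or OP_NOTIN appears in the proposed OpType list.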
-
-From [email protected] Fri Jan  9 17:44:04 1998
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id RAA24779
-   for ; Fri, 9 Jan 1998 17:44:01 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id RAA20728; Fri, 9 Jan 1998 17:32:34 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 09 Jan 1998 17:32:19 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id RAA20503 for pgsql-hackers-outgoing; Fri, 9 Jan 1998 17:32:15 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.8/8.7.5) with ESMTP id RAA20008 for ; Fri, 9 Jan 1998 17:31:24 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id RAA24282;
-   Fri, 9 Jan 1998 17:31:41 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] Re: subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Fri, 9 Jan 1998 17:31:41 -0500 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Jan 9, 98 10:10:06 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> 
-> Bruce Momjian wrote:
-> > 
-> > Vadim, I know you are still thinking about subselects, but I have some
-> > more clarification that may help.
-> > 
-> > We have to add phantom range table entries to correlated subselects so
-> > they will pass the parser.  We might as well add those fields to the
-> > target list of the subquery at the same time:
-> > 
-> >         select *
-> >         from taba
-> >         where col1 = (select col2
-> >                       from tabb
-> >                       where taba.col3 = tabb.col4)
-> > 
-> > becomes:
-> > 
-> >         select *
-> >         from taba
-> >         where col1 = (select col2, tabb.col4 <---
-> >                       from tabb, taba  <---
-> >                       where taba.col3 = tabb.col4)
-> > 
-> > We add a field to TargetEntry and RangeTblEntry to mark the fact that it
-> > was entered as a correlation entry:
-> > 
-> >         bool    isCorrelated;
-> 
-> No, I don't like to add anything in parser. Example:
-> 
->         select *
->         from tabA
->         where col1 = (select col2
->                       from tabB
->                       where tabA.col3 = tabB.col4
->                       and exists (select * 
->                                   from tabC 
->                                   where tabB.colX = tabC.colX and
->                                         tabC.colY = tabA.col2)
->                      )
-> 
-> : a column of tabA is referenced in sub-subselect 
-
-This is a strange case that I don't think we need to handle in our first
-implementation.
-
-> (is it allowable by standards ?) - in this case it's better 
-> to don't add tabA to 1st subselect but add tabA to second one
-> and change tabA.col3 in 1st to reference col3 in 2nd subquery temp table -
-> this gives us 2-tables join in 1st subquery instead of 3-tables join.
-> (And I'm still not sure that using temp tables is best of what can be 
-> done in all cases...)
-
-I don't see any use for temp tables in subselects anymore.  After having
-implemented UNIONS, I now see how much can be done in the upper
-optimizer.  I see you just putting the subquery PLAN into the proper
-place in the plan tree, with some proper JOIN nodes for IN, NOT IN.
-
-> 
-> Instead of using isCorrelated in TE & RTE we can add 
-> 
-> Index varlevel;
-
-OK.  Sounds good.
-
-> 
-> to Var node to reflect (sub)query from where this Var is come
-> (where is range table to find var's relation using varno). Upmost query
-> will have varlevel = 0, all its (direct) children - varlevel = 1 and so on.
->                         ^^^                         ^^^^^^^^^^^^
-> (I don't see problems with distinguishing Vars of different children
-> on the same level...)
-> 
-> > 
-> > Second, we need to hook the subselect to the main query.  I recommend we
-> > add two fields to Query for this:
-> > 
-> >         Query *parentQuery;
-> >         List *subqueries;
-> 
-> Agreed. And maybe Index queryLevel.
-
-Sure.  If it helps.
-
-> 
-> > In the parent query, to parse the WHERE clause, we create a new operator
-> > type, called IN or NOT_IN, or ALL, where the left side is a Var, and the
->                                                ^^^^^^^^^^^^^^^^^^
-> No. We have to handle (a,b,c) OP (select x, y, z ...) and 
-> '_a_constant_' OP (select ...) - I don't know is last in standards,
-> Sybase has this.
-
-I have never seen this in my eight years of SQL.  Perhaps we can leave
-this for later, maybe much later.
-
-> 
-> Well,
-> 
-> typedef enum OpType
-> {
->     OP_EXPR, FUNC_EXPR, OR_EXPR, AND_EXPR, NOT_EXPR
-> 
-> + OP_EXISTS, OP_ALL, OP_ANY
-> 
-> } OpType;
-> 
-> typedef struct Expr
-> {
->     NodeTag     type;
->     Oid         typeOid;        /* oid of the type of this expr */
->     OpType      opType;         /* type of the op */
->     Node       *oper;           /* could be Oper or Func */
->     List       *args;           /* list of argument nodes */
-> } Expr;
-> 
-> OP_EXISTS: oper is NULL, lfirst(args) is SubSelect (index in subqueries
->            List, following your suggestion)
-> 
-> OP_ALL, OP_ANY:
-> 
-> oper is List of Oper nodes. We need in list because of data types of
-> a, b, c (above) can be different and so Oper nodes will be different too.
-> 
-> lfirst(args) is List of expression nodes (Const, Var, Func ?, a + b ?) -
-> left side of subquery' operator.
-> lsecond(args) is SubSelect.
-> 
-> Note, that there are no OP_IN, OP_NOTIN in OpType-s for Expr. We need in
-> IN, NOTIN in A_Expr (parser node), but both of them have to be transferred
-> by parser into corresponding ANY and ALL. At the moment we can do:
-> 
-> IN --> = ANY, NOT IN --> <> ALL
-> 
-> but this will be "known bug": this breaks OO-nature of Postgres, because of
-> operators can be overridden and '=' can mean  s o m e t h i n g (not equality).
-> Example: box data type. For boxes, = means equality of _areas_ and =~
-> means that boxes are the same ==> =~ ANY should be used for IN.
-
-That is interesting, to use =~ for ANY.
-
-Yes, but how many operators take a SUBQUERY as an operand?  This is a
-special case to me.
-
-I think I see where you are trying to go.  You want subselects to behave
-like any other operator, with a subselect type, and you do all the
-subselect handling in the optimizer, with special Nodes and actions.
-
-I think this may be just too much of a leap.  We have such clean query
-logic for single queries, I can't imagine having an operator that has a
-Query operand, and trying to get everything to properly handle it. 
-UNIONS were very easy to implement as a List off of Query, with some
-foreach()'s in rewrite and the high optimizer.
-
-Subselects are SQL standard, and are never going to be over-ridden by a
-user.  Same with UNION.  They want UNION, they get UNION.  They want
-Subselect, we are going to spin through the Query structure and give
-them what they want.
-
-The complexities of subselects and correlated queries and range tables
-and stuff are so bizarre that trying to get it to work inside the type
-system could be a huge project.
-
-> 
-> > right side is an index to a slot in the subqueries List.
-
-I guess the question is what can we have by February 1?
-
-I have been reading some postings, and it seems to me that subselects
-are the litmus test for many evaluators when deciding if a database
-engine is full-featured.
-
-Sorry to be so straightforward, but I want to keep hashing this around
-until we get a conclusion, so coding can start.
-
-My suggestions have been, I believe, trying to get subselects working
-with the fullest functionality by adding the least amount of code, and
-keeping the logic clean.
-
-Have you checked out the UNION code?  It is very small, but it works.  I
-think it could make a good sample for subselects.
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Sat Jan 10 12:00:51 1998
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id MAA28742
-   for ; Sat, 10 Jan 1998 12:00:43 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id AAA05684;
-   Sun, 11 Jan 1998 00:19:10 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sun, 11 Jan 1998 00:19:08 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected], "Thomas G. Lockhart" 
-Subject: Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > No, I don't like to add anything in parser. Example:
-> >
-> >         select *
-> >         from tabA
-> >         where col1 = (select col2
-> >                       from tabB
-> >                       where tabA.col3 = tabB.col4
-> >                       and exists (select *
-> >                                   from tabC
-> >                                   where tabB.colX = tabC.colX and
-> >                                         tabC.colY = tabA.col2)
-> >                      )
-> >
-> > : a column of tabA is referenced in sub-subselect
-> 
-> This is a strange case that I don't think we need to handle in our first
-> implementation.
-
-I don't know whether this is a strange case or not :)
-But I would like to know whether this is allowed by the standard - can someone
-comment on this ?
-And I don't see problems with handling this...
-
-> 
-> > (is it allowable by standards ?) - in this case it's better
-> > to don't add tabA to 1st subselect but add tabA to second one
-> > and change tabA.col3 in 1st to reference col3 in 2nd subquery temp table -
-> > this gives us 2-tables join in 1st subquery instead of 3-tables join.
-> > (And I'm still not sure that using temp tables is best of what can be
-> > done in all cases...)
-> 
-> I don't see any use for temp tables in subselects anymore.  After having
-> implemented UNIONS, I now see how much can be done in the upper
-> optimizer.  I see you just putting the subquery PLAN into the proper
-> place in the plan tree, with some proper JOIN nodes for IN, NOT IN.
-
-When I spoke of temp tables, I meant the tables created by a Material node
-for the subquery plan. This is one of two ways: run the subquery once for all
-possible upper plan tuples and then just join the result table with the upper
-query. The other way is to re-run the subquery for each upper query tuple,
-without a temp table but perhaps with some caching of results.
-Actually, there is a special case, when the subquery can alternatively be
-formulated as joins, but this is just a special case.
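The two evaluation strategies described above can be sketched in C. This is only an illustration under stated assumptions: the tables are plain int arrays, the names `run_subquery`, `count_in_materialized` and `count_in_rerun` are hypothetical, and nothing here is taken from the PostgreSQL executor. The subquery in the sketch is uncorrelated, so both strategies return the same answer and differ only in how often the subquery runs; a correlated subquery would force the second strategy unless its results can be cached.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ROWS 64

/* Counts how often the stand-in subquery is executed. */
static int subquery_runs = 0;

/* Hypothetical stand-in for "select col2 from tabB": copies tabB into out. */
static size_t
run_subquery(const int *tabB, size_t nb, int *out)
{
    size_t i;

    assert(nb <= MAX_ROWS);
    subquery_runs++;
    for (i = 0; i < nb; i++)
        out[i] = tabB[i];
    return nb;
}

/* Returns 1 if value occurs in rows[0..n-1], else 0. */
static int
contains(const int *rows, size_t n, int value)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (rows[i] == value)
            return 1;
    return 0;
}

/* Strategy 1: run the subquery once, keep the result, probe it per outer tuple. */
static int
count_in_materialized(const int *tabA, size_t na, const int *tabB, size_t nb)
{
    int    temp[MAX_ROWS];
    size_t nt = run_subquery(tabB, nb, temp);
    size_t i;
    int    hits = 0;

    for (i = 0; i < na; i++)
        hits += contains(temp, nt, tabA[i]);
    return hits;
}

/* Strategy 2: re-run the subquery for every outer tuple, no stored result. */
static int
count_in_rerun(const int *tabA, size_t na, const int *tabB, size_t nb)
{
    int    temp[MAX_ROWS];
    size_t i;
    int    hits = 0;

    for (i = 0; i < na; i++)
    {
        size_t nt = run_subquery(tabB, nb, temp);

        hits += contains(temp, nt, tabA[i]);
    }
    return hits;
}
```

With four outer rows, the first function executes the subquery once and the second executes it four times, which is the cost difference the message is weighing.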
-
-> > > In the parent query, to parse the WHERE clause, we create a new operator
-> > > type, called IN or NOT_IN, or ALL, where the left side is a Var, and the
-> >                                                ^^^^^^^^^^^^^^^^^^
-> > No. We have to handle (a,b,c) OP (select x, y, z ...) and
-> > '_a_constant_' OP (select ...) - I don't know is last in standards,
-> > Sybase has this.
-> 
-> I have never seen this in my eight years of SQL.  Perhaps we can leave
-> this for later, maybe much later.
-
-Are you talking about (a, b, c) or about 'a_constant' ?
-Again, can someone comment on whether they are in the standard or not ?
-Tom ?
-If yes then please add parser support for them now...
-
-> > Note, that there are no OP_IN, OP_NOTIN in OpType-s for Expr. We need in
-> > IN, NOTIN in A_Expr (parser node), but both of them have to be transferred
-> > by parser into corresponding ANY and ALL. At the moment we can do:
-> >
-> > IN --> = ANY, NOT IN --> <> ALL
-> >
-> > but this will be "known bug": this breaks OO-nature of Postgres, because of
-> > operators can be overrided and '=' can mean  s o m e t h i n g (not equality).
-> > Example: box data type. For boxes, = means equality of _areas_ and =~
-> > means that boxes are the same ==> =~ ANY should be used for IN.
-> 
-> That is interesting, to use =~ for ANY.
-> 
-> Yes, but how many operators take a SUBQUERY as an operand.  This is a
-> special case to me.
-> 
-> I think I see where you are trying to go.  You want subselects to behave
-> like any other operator, with a subselect type, and you do all the
-> subselect handling in the optimizer, with special Nodes and actions.
-> 
-> I think this may be just too much of a leap.  We have such clean query
-> logic for single queries, I can't imagine having an operator that has a
-> Query operand, and trying to get everything to properly handle it.
-> UNIONS were very easy to implement as a List off of Query, with some
-> foreach()'s in rewrite and the high optimizer.
-> 
-> Subselects are SQL standard, and are never going to be over-ridden by a
-> user.  Same with UNION.  They want UNION, they get UNION.  They want
-> Subselect, we are going to spin through the Query structure and give
-> them what they want.
-> 
-> The complexities of subselects and correlated queries and range tables
-> and stuff is so bizarre that trying to get it to work inside the type
-> system could be a huge project.
-
-PostgreSQL is a robust, next-generation, Object-Relational DBMS (ORDBMS),
-derived from the Berkeley Postgres database management system. While
-PostgreSQL retains the powerful object-relational data model, rich data types and
-           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-easy extensibility of Postgres, it replaces the PostQuel query language with an
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-extended subset of SQL.
-^^^^^^^^^^^^^^^^^^^^^^
-
-Should we tell users that subselects will work for standard data types only ?
-I don't see why a subquery can't be used with ~, ~*, @@, ... operators, do you ?
-Is there a difference between handling = ANY and ~ ANY ? I don't see any.
-Currently we can't get IN working properly for boxes (and maybe for others too)
-and I don't want to try to resolve these problems now, but hope that someday
-we'll be able to do this. For the moment, just convert IN into = ANY and
-NOT IN into <> ALL in the parser.
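The rewrite proposed here (IN becomes = ANY, NOT IN becomes <> ALL) can be illustrated with a small C sketch. The type and function names (`Quantifier`, `SubselectOp`, `rewrite_in`, `eval_quantified`) are hypothetical and exist only for this example; the evaluator works on plain ints and ignores SQL NULL semantics entirely.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical quantifier tags; not names from the PostgreSQL source. */
typedef enum { QUANT_ANY, QUANT_ALL } Quantifier;

typedef struct
{
    const char *opname;  /* operator applied between the left side and each row */
    Quantifier  quant;   /* ANY: some row must match; ALL: every row must match */
} SubselectOp;

/* IN --> = ANY, NOT IN --> <> ALL. is_not is nonzero for NOT IN. */
static SubselectOp
rewrite_in(int is_not)
{
    SubselectOp result;

    result.opname = is_not ? "<>" : "=";
    result.quant = is_not ? QUANT_ALL : QUANT_ANY;
    return result;
}

/* Evaluate "lhs OP ANY/ALL (rows)" for ints; only "=" and "<>" are handled. */
static int
eval_quantified(SubselectOp op, int lhs, const int *rows, int nrows)
{
    int i;

    assert(nrows >= 0);
    for (i = 0; i < nrows; i++)
    {
        int match = (strcmp(op.opname, "=") == 0) ? (lhs == rows[i])
                                                  : (lhs != rows[i]);

        if (op.quant == QUANT_ANY && match)
            return 1;           /* one matching row is enough for ANY */
        if (op.quant == QUANT_ALL && !match)
            return 0;           /* one failing row is enough to reject ALL */
    }
    /* No row decided it: ANY is false, ALL is true (also for an empty result). */
    return op.quant == QUANT_ALL;
}
```

Because the operator is carried as a name rather than hard-wired, the same structure could in principle hold `=~` or `~`, which is the extensibility concern raised in the message.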
-
-(BTW, do you know how DISTINCT is implemented ? It doesn't use = but
-uses the type_out funcs and strcmp()... DISTINCT is a standard SQL thing...)
-
-> >
-> > > right side is an index to a slot in the subqueries List.
-> 
-> I guess the question is what can we have by February 1?
-> 
-> I have been reading some postings, and it seems to me that subselects
-> are the litmus test for many evaluators when deciding if a database
-> engine is full-featured.
-> 
-> Sorry to be so straightforward, but I want to keep hashing this around
-> until we get a conclusion, so coding can start.
-> 
-> My suggestions have been, I believe, trying to get subselects working
-> with the fullest functionality by adding the least amount of code, and
-> keeping the logic clean.
-> 
-> Have you checked out the UNION code?  It is very small, but it works.  I
-> think it could make a good sample for subselects.
-
-There is a big difference between subqueries and queries in a UNION -
-there are no dependencies between UNION queries.
-
-Ok, opened issues:
-
-1. Is using upper query' vars in all subquery levels in standard ?
-2. Is (a, b, c) OP (subselect) in standard ?
-3. What types of expressions (Var, Const, ...) are allowed on the left
-   side of operator with subquery on the right ?
-4. What types of operators should we support (=, >, ..., like, ~, ...) ?
-   (My vote for all boolean operators).
-
-And - did we get consensus on the representation of subqueries in Query,
-Expr and Var ?
-I would like to have something done in the parser around Jan 17 to get
-subqueries working by Feb 1. I vote for support of all standard
-things (1. - 3.) in the parser right now - if there is no time
-to implement something like (a, b, c) then the optimizer will call
-elog(WARN) (oh, sorry, - elog(ERROR)).
-
-Vadim
-
-From [email protected] Sat Jan 10 12:31:05 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id MAA29045
-   for ; Sat, 10 Jan 1998 12:31:01 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id MAA23364 for ; Sat, 10 Jan 1998 12:22:30 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id AAA05725;
-   Sun, 11 Jan 1998 00:41:22 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sun, 11 Jan 1998 00:41:19 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: [HACKERS] subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> OK, a few questions:
-> 
->         Should we use sortmerge, so we can use our psort as temp tables,
-> or do we use hashunique?
-> 
->         How do we pass the query to the optimizer?  How do we represent
-> the range table for each, and the links between them in correlated
-> subqueries?
-
-My suggestion is just use varlevel in Var and don't put upper query'
-relations into subquery range table.
-
-Vadim
-
-From [email protected] Sat Jan 10 13:01:00 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id NAA29357
-   for ; Sat, 10 Jan 1998 13:00:58 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id MAA24030 for ; Sat, 10 Jan 1998 12:40:02 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id AAA05741;
-   Sun, 11 Jan 1998 00:58:56 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sun, 11 Jan 1998 00:58:52 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian ,
-        PostgreSQL-development 
-Subject: Re: [HACKERS] subselects
-References: <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Vadim B. Mikheev wrote:
-> 
-> Bruce Momjian wrote:
-> >
-> > OK, a few questions:
-> >
-> >         Should we use sortmerge, so we can use our psort as temp tables,
-> > or do we use hashunique?
-> >
-> >         How do we pass the query to the optimizer?  How do we represent
-> > the range table for each, and the links between them in correlated
-> > subqueries?
-> 
-> My suggestion is just use varlevel in Var and don't put upper query'
-> relations into subquery range table.
-
-Hmm... Sorry, it seems that I replied to a very old message - forget it.
-
-Vadim
-
-From [email protected] Sat Jan 10 13:30:59 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id NAA29664
-   for ; Sat, 10 Jan 1998 13:30:56 -0500 (EST)
-Received: from golem.jpl.nasa.gov ([email protected] [128.149.70.168]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id NAA25109 for ; Sat, 10 Jan 1998 13:05:09 -0500 (EST)
-Received: from alumni.caltech.edu (localhost [127.0.0.1])
-   by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id SAA03623;
-   Sat, 10 Jan 1998 18:01:03 GMT
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sat, 10 Jan 1998 18:01:03 +0000
-From: "Thomas G. Lockhart" 
-Organization: Caltech/JPL
-X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.30 i686)
-MIME-Version: 1.0
-To: "Vadim B. Mikheev" 
-CC: Bruce Momjian , [email protected]
-Subject: Re: subselects
-References: <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-> > > Note, that there are no OP_IN, OP_NOTIN in OpType-s for Expr. We need in
-> > > IN, NOTIN in A_Expr (parser node), but both of them have to be transferred
-> > > by parser into corresponding ANY and ALL. At the moment we can do:
-> > >
-> > > IN --> = ANY, NOT IN --> <> ALL
-> > >
-> > > but this will be "known bug": this breaks OO-nature of Postgres, because of
-> > > operators can be overrided and '=' can mean  s o m e t h i n g (not equality).
-> > > Example: box data type. For boxes, = means equality of _areas_ and =~
-> > > means that boxes are the same ==> =~ ANY should be used for IN.
-> >
-> > That is interesting, to use =~ for ANY.
-
-If I understand the discussion, I would think it is fine to make an assumption about
-which operator is used to implement a subselect expression. If someone remaps an
-operator to mean something different, then they will get a different result (or a
-nonsensical one) from a subselect.
-
-I'd be happy to remap existing operators to fit into a convention which would work
-with subselects (especially if I got to help choose :).
-
-> > Subselects are SQL standard, and are never going to be over-ridden by a
-> > user.  Same with UNION.  They want UNION, they get UNION.  They want
-> > Subselect, we are going to spin through the Query structure and give
-> > them what they want.
->
-> PostgreSQL is a robust, next-generation, Object-Relational DBMS (ORDBMS),
-> derived from the Berkeley Postgres database management system. While
-> PostgreSQL retains the powerful object-relational data model, rich data types and
->            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-> easy extensibility of Postgres, it replaces the PostQuel query language with an
-> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-> extended subset of SQL.
-> ^^^^^^^^^^^^^^^^^^^^^^
->
-> Should we say users that subselect will work for standard data types only ?
-> I don't see why subquery can't be used with ~, ~*, @@, ... operators, do you ?
-> Is there difference between handling = ANY and ~ ANY ? I don't see any.
-> Currently we can't get IN working properly for boxes (and may be for others too)
-> and I don't like to try to resolve these problems now, but hope that someday
-> we'll be able to do this. At the moment - just convert IN into = ANY and
-> NOT IN into <> ALL in parser.
->
-> (BTW, do you know how DISTINCT is implemented ? It doesn't use = but
-> use type_out funcs and uses strcmp()... DISTINCT is standard SQL thing...)
-
-?? I didn't know that. Wouldn't we want it to eventually use "=" through a sorted
-list? That would give more consistent behavior...
-
-> > I have been reading some postings, and it seems to me that subselects
-> > are the litmus test for many evaluators when deciding if a database
-> > engine is full-featured.
-> >
-> > Sorry to be so straightforward, but I want to keep hashing this around
-> > until we get a conclusion, so coding can start.
-> >
-> > My suggestions have been, I believe, trying to get subselects working
-> > with the fullest functionality by adding the least amount of code, and
-> > keeping the logic clean.
-> >
-> > Have you checked out the UNION code?  It is very small, but it works.  I
-> > think it could make a good sample for subselects.
->
-> There is big difference between subqueries and queries in UNION -
-> there are not dependences between UNION queries.
->
-> Ok, opened issues:
->
-> 1. Is using upper query' vars in all subquery levels in standard ?
-
-I'm not certain. Let me know if you do not get an answer from someone else and I will
-research it.
-
-> 2. Is (a, b, c) OP (subselect) in standard ?
-
-Yes. In fact, it _is_ the standard, and "a OP (subselect)" is a special case where
-the parens are allowed to be omitted from a one element list.
-
-> 3. What types of expressions (Var, Const, ...) are allowed on the left
->    side of operator with subquery on the right ?
-
-I think most expressions are allowed. The "constant OP (subselect)" case you were
-asking about is just a simplified case since "(a, b, constant) OP (subselect)" where
-a and b are column references should be allowed. Of course, our optimizer could
-perhaps change this to "(a, b) OP (subselect where x = constant)", or for the first
-example "EXISTS (subselect where x = constant)".
-
-> 4. What types of operators should we support (=, >, ..., like, ~, ...) ?
->    (My vote for all boolean operators).
-
-Sounds good. But I'll vote with Bruce (and I'll bet you already agree) that it is
-important to get an initial implementation for v6.3 which covers a little, some, or
-all of the usual SQL subselect constructs. If we have to revisit this for v6.4 then
-we will have the benefit of feedback from others in practical applications which
-always uncovers new things to consider.
-
-> And - did we get consensus on presentation subqueries stuff in Query,
-> Expr and Var ?
-> I would like to have something done in parser near Jan 17 to get
-> subqueries working by Feb 1. I vote for support of all standard
-> things (1. - 3.) in parser right now - if there will be no time
-> to implement something like (a, b, c) then optimizer will call
-> elog(WARN) (oh, sorry, - elog(ERROR)).
-
-Great. I'd like to help with the remaining parser issues; at the moment "row_expr"
-does the right thing with expression comparisons but just parses and then ignores
-subselect expressions. Let me know what structures you want passed back and I'll put
-them in, or if you prefer put in the first one and I'll go through and clean up and
-add the rest.
-
-                                                  - Tom
-
-
-From [email protected] Sat Jan 10 15:00:58 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id PAA00728
-   for ; Sat, 10 Jan 1998 15:00:56 -0500 (EST)
-Received: from golem.jpl.nasa.gov ([email protected] [128.149.70.168]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id OAA28438 for ; Sat, 10 Jan 1998 14:35:19 -0500 (EST)
-Received: from alumni.caltech.edu (localhost [127.0.0.1])
-   by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id TAA06002;
-   Sat, 10 Jan 1998 19:31:30 GMT
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sat, 10 Jan 1998 19:31:29 +0000
-From: "Thomas G. Lockhart" 
-Organization: Caltech/JPL
-X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.30 i686)
-MIME-Version: 1.0
-To: "Vadim B. Mikheev" 
-CC: Bruce Momjian , [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-> Are you saying about (a, b, c) or about 'a_constant' ?
-> Again, can someone comment on are they in standards or not ?
-> Tom ?
-> If yes then please add parser' support for them now...
-
-As I mentioned a few minutes ago in my last message, I parse the row descriptors and
-the subselects but for subselect expressions (e.g. "(a,b) OP (subselect)") I currently
-ignore the result. I didn't want to pass things back as lists until something in the
-backend was ready to receive them.
-
-If it is OK, I'll go ahead and start passing back a list of expressions when a row
-descriptor is present. So, what you will find is lexpr or rexpr in the A_Expr node
-being a list rather than an atomic node.
-
-Also, I can start passing back the subselect expression as the rexpr; right now the
-parser calls elog() and quits.
-
-btw, to implement "(a,b,c) OP (d,e,f)" I made a new routine in the parser called
-makeRowExpr() which breaks this up into a sequence of "and" and/or "or" expressions.
-If lists are handled farther back, this routine should move to there also and the
-parser will just pass the lists. Note that some assumptions have to be made about the
-meaning of "(a,b) OP (c,d)", since usually we only have knowledge of the behavior of
-"a OP c". Easy for the standard SQL operators, unknown for others, but maybe it is OK
-to disallow those cases or to look for specific appearance of the operator to guess
-the behavior (e.g. if the operator has "<" or "=" or ">" then build as "and"s and if
-it has "<>" or "!" then build as "or"s).
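The expansion of "(a,b,c) OP (d,e,f)" described above might look roughly like the following C sketch. It is not the actual makeRowExpr() routine: it builds a text string instead of parse nodes, and the names `row_connective` and `expand_row_expr` are hypothetical. The connective is guessed from the operator spelling as the message suggests, and "<>" has to be tested first because it also contains "<" and ">".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Guess the connective from the operator spelling: "<>" or "!" means OR,
 * anything else (e.g. "=", "<", ">") means AND. */
static const char *
row_connective(const char *op)
{
    if (strstr(op, "<>") != NULL || strchr(op, '!') != NULL)
        return "OR";
    return "AND";
}

/* Expand "(l1,...,ln) OP (r1,...,rn)" into "(l1 OP r1) CONN (l2 OP r2) ...".
 * Returns 0 on success, -1 if buf is too small. */
static int
expand_row_expr(const char *const *lhs, const char *const *rhs, int n,
                const char *op, char *buf, size_t buflen)
{
    const char *conn = row_connective(op);
    size_t      used = 0;
    int         i;

    assert(n > 0 && buflen > 0);
    buf[0] = '\0';
    for (i = 0; i < n; i++)
    {
        int w;

        if (i > 0)
            w = snprintf(buf + used, buflen - used, " %s (%s %s %s)",
                         conn, lhs[i], op, rhs[i]);
        else
            w = snprintf(buf + used, buflen - used, "(%s %s %s)",
                         lhs[i], op, rhs[i]);
        if (w < 0 || (size_t) w >= buflen - used)
            return -1;          /* output would have been truncated */
        used += (size_t) w;
    }
    return 0;
}
```

For "=" this yields a chain of ANDs and for "<>" a chain of ORs. Treating "<" and ">" as plain AND chains follows the heuristic in the message and is an assumption of the sketch, not a statement about standard row comparison.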
-
-Let me know what you want...
-
-                                                       - Tom
-
-
-From [email protected] Sun Jan 11 01:01:55 1998
-Received: from golem.jpl.nasa.gov ([email protected] [128.149.70.168])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA11953
-   for ; Sun, 11 Jan 1998 01:01:51 -0500 (EST)
-Received: from alumni.caltech.edu (localhost [127.0.0.1])
-   by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id FAA23797;
-   Sun, 11 Jan 1998 05:58:01 GMT
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sun, 11 Jan 1998 05:58:01 +0000
-From: "Thomas G. Lockhart" 
-Organization: Caltech/JPL
-X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.30 i686)
-MIME-Version: 1.0
-To: "Vadim B. Mikheev" 
-CC: Bruce Momjian , [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]> <[email protected]>
-Content-Type: multipart/mixed; boundary="------------D8B38A0D1F78A10C0023F702"
-Status: OR
-
-This is a multi-part message in MIME format.
---------------D8B38A0D1F78A10C0023F702
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-
-Here are context diffs of gram.y and keywords.c; sorry about sending the full files.
-These start sending lists of arguments toward the backend from the parser to
-implement row descriptors and subselects.
-
-They should apply OK even over Bruce's recent changes...
-
-                                             - Tom
-
---------------D8B38A0D1F78A10C0023F702
-Content-Type: text/plain; charset=us-ascii; name="gram.y.patch"
-Content-Transfer-Encoding: 7bit
-Content-Disposition: inline; filename="gram.y.patch"
-
-*** ../src/backend/parser/gram.y.orig  Sat Jan 10 05:44:36 1998
---- ../src/backend/parser/gram.y   Sat Jan 10 19:29:37 1998
-***************
-*** 195,200 ****
---- 195,201 ----
-               having_clause
-  %type  row_descriptor, row_list
-  %type  row_expr
-+ %type       RowOp, row_opt
-  %type  OptCreateAs, CreateAsList
-  %type  CreateAsElement
-  %type     NumConst
-***************
-*** 242,248 ****
-   */
-  
-  /* Keywords (in SQL92 reserved words) */
-! %token   ACTION, ADD, ALL, ALTER, AND, AS, ASC,
-       BEGIN_TRANS, BETWEEN, BOTH, BY,
-       CASCADE, CAST, CHAR, CHARACTER, CHECK, CLOSE, COLLATE, COLUMN, COMMIT, 
-       CONSTRAINT, CREATE, CROSS, CURRENT, CURRENT_DATE, CURRENT_TIME, 
---- 243,249 ----
-   */
-  
-  /* Keywords (in SQL92 reserved words) */
-! %token   ACTION, ADD, ALL, ALTER, AND, ANY, AS, ASC,
-       BEGIN_TRANS, BETWEEN, BOTH, BY,
-       CASCADE, CAST, CHAR, CHARACTER, CHECK, CLOSE, COLLATE, COLUMN, COMMIT, 
-       CONSTRAINT, CREATE, CROSS, CURRENT, CURRENT_DATE, CURRENT_TIME, 
-***************
-*** 258,264 ****
-       ON, OPTION, OR, ORDER, OUTER_P,
-       PARTIAL, POSITION, PRECISION, PRIMARY, PRIVILEGES, PROCEDURE, PUBLIC,
-       REFERENCES, REVOKE, RIGHT, ROLLBACK,
-!      SECOND_P, SELECT, SET, SUBSTRING,
-       TABLE, TIME, TIMESTAMP, TO, TRAILING, TRANSACTION, TRIM,
-       UNION, UNIQUE, UPDATE, USING,
-       VALUES, VARCHAR, VARYING, VERBOSE, VERSION, VIEW,
---- 259,265 ----
-       ON, OPTION, OR, ORDER, OUTER_P,
-       PARTIAL, POSITION, PRECISION, PRIMARY, PRIVILEGES, PROCEDURE, PUBLIC,
-       REFERENCES, REVOKE, RIGHT, ROLLBACK,
-!      SECOND_P, SELECT, SET, SOME, SUBSTRING,
-       TABLE, TIME, TIMESTAMP, TO, TRAILING, TRANSACTION, TRIM,
-       UNION, UNIQUE, UPDATE, USING,
-       VALUES, VARCHAR, VARYING, VERBOSE, VERSION, VIEW,
-***************
-*** 2853,2866 ****
-  /* Expressions using row descriptors
-   * Define row_descriptor to allow yacc to break the reduce/reduce conflict
-   *  with singleton expressions.
-   */
-  row_expr: '(' row_descriptor ')' IN '(' SubSelect ')'
-               {
-!                  $$ = NULL;
-               }
-       | '(' row_descriptor ')' NOT IN '(' SubSelect ')'
-               {
-!                  $$ = NULL;
-               }
-       | '(' row_descriptor ')' '=' '(' row_descriptor ')'
-               {
---- 2854,2878 ----
-  /* Expressions using row descriptors
-   * Define row_descriptor to allow yacc to break the reduce/reduce conflict
-   *  with singleton expressions.
-+  *
-+  * Note that "SOME" is the same as "ANY" in syntax.
-+  * - thomas 1998-01-10
-   */
-  row_expr: '(' row_descriptor ')' IN '(' SubSelect ')'
-               {
-!                  $$ = makeA_Expr(OP, "=any", (Node *)$2, (Node *)$6);
-               }
-       | '(' row_descriptor ')' NOT IN '(' SubSelect ')'
-               {
-!                  $$ = makeA_Expr(OP, "<>any", (Node *)$2, (Node *)$7);
-!              }
-!      | '(' row_descriptor ')' RowOp row_opt '(' SubSelect ')'
-!              {
-!                  char *opr;
-!                  opr = palloc(strlen($4)+strlen($5)+1);
-!                  strcpy(opr, $4);
-!                  strcat(opr, $5);
-!                  $$ = makeA_Expr(OP, opr, (Node *)$2, (Node *)$7);
-               }
-       | '(' row_descriptor ')' '=' '(' row_descriptor ')'
-               {
-***************
-*** 2880,2885 ****
---- 2892,2907 ----
-               }
-       ;
-  
-+ RowOp:  '='                      { $$ = "="; }
-+      | '<'                   { $$ = "<"; }
-+      | '>'                   { $$ = ">"; }
-+      ;
-+ 
-+ row_opt:  ALL                    { $$ = "all"; }
-+      | ANY                   { $$ = "any"; }
-+      | SOME                  { $$ = "any"; }
-+      ;
-+ 
-  row_descriptor:  row_list ',' a_expr
-               {
-                   $$ = lappend($1, $3);
-***************
-*** 3432,3441 ****
-       ;
-  
-  in_expr:  SubSelect
-!              {
-!                  elog(ERROR,"IN (SUBSELECT) not yet implemented");
-!                  $$ = $1;
-!              }
-       | in_expr_nodes
-               {   $$ = $1; }
-       ;
---- 3454,3460 ----
-       ;
-  
-  in_expr:  SubSelect
-!              {   $$ = makeA_Expr(OP, "=", saved_In_Expr, (Node *)$1); }
-       | in_expr_nodes
-               {   $$ = $1; }
-       ;
-***************
-*** 3449,3458 ****
-       ;
-  
-  not_in_expr:  SubSelect
-!              {
-!                  elog(ERROR,"NOT IN (SUBSELECT) not yet implemented");
-!                  $$ = $1;
-!              }
-       | not_in_expr_nodes
-               {   $$ = $1; }
-       ;
---- 3468,3474 ----
-       ;
-  
-  not_in_expr:  SubSelect
-!              {   $$ = makeA_Expr(OP, "<>", saved_In_Expr, (Node *)$1); }
-       | not_in_expr_nodes
-               {   $$ = $1; }
-       ;
-
---------------D8B38A0D1F78A10C0023F702
-Content-Type: text/plain; charset=us-ascii; name="keywords.c.patch"
-Content-Transfer-Encoding: 7bit
-Content-Disposition: inline; filename="keywords.c.patch"
-
-*** ../src/backend/parser/keywords.c.orig  Mon Jan  5 07:51:33 1998
---- ../src/backend/parser/keywords.c   Sat Jan 10 19:22:07 1998
-***************
-*** 39,44 ****
---- 39,45 ----
-   {"alter", ALTER},
-   {"analyze", ANALYZE},
-   {"and", AND},
-+  {"any", ANY},
-   {"append", APPEND},
-   {"archive", ARCHIVE},
-   {"as", AS},
-***************
-*** 178,183 ****
---- 179,185 ----
-   {"set", SET},
-   {"setof", SETOF},
-   {"show", SHOW},
-+  {"some", SOME},
-   {"stdin", STDIN},
-   {"stdout", STDOUT},
-   {"substring", SUBSTRING},
-
---------------D8B38A0D1F78A10C0023F702--
-
-
-From [email protected] Sun Jan 11 01:31:13 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA12255
-   for ; Sun, 11 Jan 1998 01:31:10 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id BAA20396 for ; Sun, 11 Jan 1998 01:10:48 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id BAA22176; Sun, 11 Jan 1998 01:03:15 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 11 Jan 1998 01:02:34 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id BAA22151 for pgsql-hackers-outgoing; Sun, 11 Jan 1998 01:02:26 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.8/8.7.5) with ESMTP id BAA22077 for ; Sun, 11 Jan 1998 01:01:05 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id AAA11801;
-   Sun, 11 Jan 1998 00:59:23 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] Re: subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Sun, 11 Jan 1998 00:59:23 -0500 (EST)
-Cc: [email protected], [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Jan 11, 98 00:19:08 am
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> I would like to have something done in parser near Jan 17 to get
-> subqueries working by Feb 1. I vote for support of all standard
-> things (1. - 3.) in parser right now - if there will be no time
-> to implement something like (a, b, c) then optimizer will call
-> elog(WARN) (oh, sorry, - elog(ERROR)).
-
-First, let me say I am glad we are still on schedule for Feb 1.  I was
-panicking because I thought we wouldn't make it in time.
-
-
-> > > (is it allowable by standards ?) - in this case it's better
-> > > to don't add tabA to 1st subselect but add tabA to second one
-> > > and change tabA.col3 in 1st to reference col3 in 2nd subquery temp table -
-> > > this gives us 2-tables join in 1st subquery instead of 3-tables join.
-> > > (And I'm still not sure that using temp tables is best of what can be
-> > > done in all cases...)
-> > 
-> > I don't see any use for temp tables in subselects anymore.  After having
-> > implemented UNIONS, I now see how much can be done in the upper
-> > optimizer.  I see you just putting the subquery PLAN into the proper
-> > place in the plan tree, with some proper JOIN nodes for IN, NOT IN.
-> 
-> When saying about temp tables, I meant tables created by node Material
-> for subquery plan. This is one of two ways - run subquery once for all
-> possible upper plan tuples and then just join result table with upper
-> query. Another way is re-run subquery for each upper query tuple,
-> without temp table but may be with caching results by some ways.
-> Actually, there is special case - when subquery can be alternatively 
-> formulated as joins, - but this is just special case.
-
-This is interesting.  It really only applies for correlated subqueries,
-and certainly it may help sometimes to just evaluate the subquery for
-valid values that are going to come from the upper query rather than for all
-possible values.  Perhaps we can use the 'cost' value of each query to
-decide how to handle this.
-
-> 
-> > > > In the parent query, to parse the WHERE clause, we create a new operator
-> > > > type, called IN or NOT_IN, or ALL, where the left side is a Var, and the
-> > >                                                ^^^^^^^^^^^^^^^^^^
-> > > No. We have to handle (a,b,c) OP (select x, y, z ...) and
-> > > '_a_constant_' OP (select ...) - I don't know is last in standards,
-> > > Sybase has this.
-> > 
-> > I have never seen this in my eight years of SQL.  Perhaps we can leave
-> > this for later, maybe much later.
-> 
-> Are you saying about (a, b, c) or about 'a_constant' ?
-> Again, can someone comment on are they in standards or not ?
-> Tom ?
-> If yes then please add parser' support for them now...
-
-OK, Thomas says it is, so we will put in as much code as we can to handle
-it.
-
-> Should we say users that subselect will work for standard data types only ?
-> I don't see why subquery can't be used with ~, ~*, @@, ... operators, do you ?
-> Is there difference between handling = ANY and ~ ANY ? I don't see any.
-> Currently we can't get IN working properly for boxes (and may be for others too)
-> and I don't like to try to resolve these problems now, but hope that someday
-> we'll be able to do this. At the moment - just convert IN into = ANY and
-> NOT IN into <> ALL in parser.
-
-OK.
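-
-The rewrite of IN into = ANY and NOT IN into <> ALL can be sketched in
-Python as below. The helper names are hypothetical, the operator is
-passed in as a function so that any boolean operator is handled the same
-way, and SQL NULL (three-valued) semantics are deliberately ignored.
-
```python
import operator


def any_sublink(left, op, subquery_rows):
    """left OP ANY (subquery): true if OP holds for at least one row."""
    return any(op(left, row) for row in subquery_rows)


def all_sublink(left, op, subquery_rows):
    """left OP ALL (subquery): true if OP holds for every row."""
    return all(op(left, row) for row in subquery_rows)


def in_sublink(left, subquery_rows):
    """x IN (subquery) rewritten as x = ANY (subquery)."""
    return any_sublink(left, operator.eq, subquery_rows)


def not_in_sublink(left, subquery_rows):
    """x NOT IN (subquery) rewritten as x <> ALL (subquery)."""
    return all_sublink(left, operator.ne, subquery_rows)
```
-
-Because the operator is just a parameter, any_sublink(5, operator.gt, rows)
-works the same way as the equality case, which is the point made above
-about = ANY versus ~ ANY.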
-
-> 
-> (BTW, do you know how DISTINCT is implemented ? It doesn't use = but
-> use type_out funcs and uses strcmp()... DISTINCT is standard SQL thing...)
-
-I did not know that either.
-
-> There is big difference between subqueries and queries in UNION - 
-> there are not dependences between UNION queries.
-
-Yes, I know UNIONS are trivial compared to subselects.
-
-> 
-> Ok, opened issues:
-> 
-> 1. Is using upper query' vars in all subquery levels in standard ?
-> 2. Is (a, b, c) OP (subselect) in standard ?
-> 3. What types of expressions (Var, Const, ...) are allowed on the left
->    side of operator with subquery on the right ?
-> 4. What types of operators should we support (=, >, ..., like, ~, ...) ?
->    (My vote for all boolean operators).
-> 
-> And - did we get consensus on presentation subqueries stuff in Query,
-> Expr and Var ?
-
-OK, here are my concrete ideas on changes and structures.
-
-I think we all agreed that Query needs new fields:
-
-        Query *parentQuery;
-        List *subqueries;
-
-Maybe query level too, but I don't think so (see later ideas on Var).
-
-We need a new Node structure, call it Sublink:
-
-   int     linkType;   /* IN, NOTIN, ANY, EXISTS, OPERATOR... */
-   Oid     operator;   /* subquery must return single row */
-   List    *lefthand;  /* parent stuff */
-   Node    *subquery;  /* represents nodes from parser */
-   Index   Subindex;   /* filled in to index Query->subqueries */
-
-Of course, the names are just suggestions.  Every time we run through
-the parsenodes of a query to create a Query* structure, when we do the
-WHERE clause, if we come upon one of these Sublink nodes (created in the
-parser), we move the supplied Query* in Sublink->subquery to a local
-List variable, and we set Sublink->Subindex to equal the index of the
-new query, i.e. is it the first subquery we found, 1, or the second, 2,
-etc.
-
-After we have created the parent Query structure, we run through our
-local List variable of subquery parsenodes we created above, and add
-Query* entries to Query->subqueries.  In each subquery Query*, we set
-the parentQuery pointer.
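-
-As a rough illustration of the two passes just described, here is a Python
-sketch using dataclasses in place of the C node structures. The class and
-field names mirror the proposal but are otherwise hypothetical, and
-clearing the Sublink's subquery after moving it is an assumption about
-what "move" means.
-
```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass(eq=False)
class Query:
    where: List[Any] = field(default_factory=list)
    parent_query: Optional["Query"] = None
    subqueries: List["Query"] = field(default_factory=list)


@dataclass(eq=False)
class Sublink:
    link_type: str               # IN, NOTIN, ANY, EXISTS, OPERATOR...
    lefthand: List[str]          # parent-side expressions
    subquery: Optional[Query]    # filled in by the parser
    subindex: int = 0            # 1-based index into Query.subqueries


def attach_subqueries(parent: Query) -> Query:
    # Pass 1: walk the WHERE clause, moving each subquery to a local list
    # and recording its position (1 for the first found, 2 for the second).
    found = []
    for node in parent.where:
        if isinstance(node, Sublink):
            found.append(node.subquery)
            node.subindex = len(found)
            node.subquery = None  # assumption: "move" clears the source
    # Pass 2: add the collected entries to the parent and link them back.
    for sub in found:
        sub.parent_query = parent
        parent.subqueries.append(sub)
    return parent
```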
-
-Also, when parsing the subqueries, we need to keep track of correlated
-references.  I recommend we add a field to the Var structure:
-
-   Index   sublevel;   /* range table reference:
-                  = 0  current level of query
-                  < 0  parent above this many levels
-                  > 0  index into subquery list
-                */
-
-This way, a Var node with sublevel 0 is the current level, and is true
-in most cases.  This helps us not have to change much code.  sublevel =
--1 means it references the range table in the parent query. sublevel =
--2 means the parent's parent. sublevel = 2 means it references the range
-table of the second entry in Query->subqueries.  Varno and varattno are
-still meaningful.  Of course, we can't reference variables in the
-subqueries from the parent in the parser code, but Vadim may want to.
-
-When doing a Var lookup in the parser, we look in the current level
-first, but if not found, if it is a subquery, we can look at the parent
-and parent's parent to set the sublevel, varno, and varattno properly.
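-
-The lookup order can be sketched in Python as follows. Scope and
-lookup_var are hypothetical names, a range table is modelled as a list of
-(relation name, column list) pairs, and sublevel is treated as a plain
-signed integer so that the negative values described above are
-representable.
-
```python
class Scope:
    """One query level with its own range table and a link to its parent."""

    def __init__(self, range_table, parent=None):
        self.range_table = range_table  # list of (relname, [column names])
        self.parent = parent


def lookup_var(scope, column):
    """Return (sublevel, varno, varattno) for a column name.

    sublevel is 0 for the current level, -1 for the parent, -2 for the
    parent's parent; varno and varattno are 1-based positions."""
    sublevel = 0
    while scope is not None:
        for varno, (_relname, columns) in enumerate(scope.range_table, start=1):
            if column in columns:
                return (sublevel, varno, columns.index(column) + 1)
        # Not found here: climb to the enclosing query, one level further up.
        scope = scope.parent
        sublevel -= 1
    raise LookupError("column not found at any query level: " + column)
```
-
-The current level is always searched first, so an inner column shadows a
-same-named column further up.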
-
-We create no phantom range table entries in the subquery, and no phantom
-target list entries.   We can leave that all for the upper optimizer.
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Tue Dec  9 12:14:09 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id MAA16186
-   for ; Tue, 9 Dec 1997 12:14:05 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id MAA17524; Tue, 9 Dec 1997 12:05:31 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Tue, 09 Dec 1997 12:05:01 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id MAA17316 for pgsql-hackers-outgoing; Tue, 9 Dec 1997 12:04:55 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id MAA17304 for ; Tue, 9 Dec 1997 12:04:40 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id MAA15973;
-   Tue, 9 Dec 1997 12:05:03 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] Items for 6.3
-To: [email protected] (Thomas G. Lockhart)
-Date: Tue, 9 Dec 1997 12:05:03 -0500 (EST)
-Cc: [email protected], [email protected]
-In-Reply-To: <[email protected]> from "Thomas G. Lockhart" at Dec 9, 97 06:44:14 am
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> 
-> Bruce Momjian wrote:
-> 
-> > Here are the items I think would make 6.3 a truly great release:
-> >
-> >         subselects
-> >         outer joins
-> 
-> These two would be sufficient (along with the changes already in the
-> tree) to address the most visible deficiencies in SQL functionality.
-> 
-> >         temp tables
-> >         fix "Reliability" items attached to specific queries
-> 
-> Sure, why not?
-
-We will need temp tables for subselects anyway.
-
-I could implement them, but again we come up against the problem of
-storing these plans and executing them later.  We need to do some of the
-temp table stuff in the optimizer because the plan could be passed with
-a temp table, and we can't bind the temp name to a real name in the
-parser, especially if we save those plans in system tables that other
-backends can execute.  Multiple backends would be using the same temp
-name.
-
-At the same time, we need some temp stuff in the parser so the parser
-can recognize the temp table and its fields when it sees it.
-
-The hardest part is:
-
-select * into tmp mytmp from z where x=y;
-select * from mytmp;
-
-If they are passed together, and we have to plan them both, before
-either is executed, you have to make the parser aware of the fields in
-mytmp, even though you have not executed the select yet, you are just
-storing the plan.
-
-This was Vadim's point about not doing subselects in the parser.
-
-> 
-> >         postmaster sync's pglog, giving almost fsync reliability with
-> >                 no-fsync performance
-> 
-> OK to save for v6.4.
-> 
-> Could we try to do the subselect/join/union features for 6.3? I know you
-> have been looking at it, and found the deepest parts of the backend to
-> be a bit murky. I'm not familiar with that area at all, but perhaps we
-> could divert Vadim for a week or two or three when he has some time.
-> Especially if we trade him for help on his favorite topics for v6.4??
-> 
-
-Sure.  I may be able to do some of the pglog change myself, though Vadim
-has some definite ideas on this.
-
-As for Vadim, trading help is a good idea, but what trade can we make? 
-He can do most of these tough things without us, and in 1/4 the time. 
-We can't even see where to start them.
-
-Basically, without Vadim, this project would have really major problems.
-
-He certainly likes working on PostgreSQL, so he must be busy with other
-things.
-
-It is not fair to keep counting on Vadim to do all these tough jobs.  We
-really need to get other people up to Vadim's level of ability. 
-Unfortunately, the odds of this happening are very slim.
-
-This leaves me scratching my head.
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Fri Dec 19 00:08:21 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id AAA25029
-   for ; Fri, 19 Dec 1997 00:08:13 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id MAA11825;
-   Fri, 19 Dec 1997 12:13:15 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 19 Dec 1997 12:13:09 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: "Thomas G. Lockhart" 
-CC: Bruce Momjian ,
-        PostgreSQL-development 
-Subject: Re: [HACKERS] Items for 6.3
-References: <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Thomas G. Lockhart wrote:
-> 
-> Could we try to do the subselect/join/union features for 6.3? I know you
-> have been looking at it, and found the deepest parts of the backend to
-> be a bit murky. I'm not familiar with that area at all, but perhaps we
-> could divert Vadim for a week or two or three when he has some time.
-                                          ^^^^^
-More realistic... And this is for initial release only: tuning performance
-of subselects is very hard, long work.
-
-Ok - I'm ready to do subselects for 6.3 but this means that foreign keys
-may appear in 6.4 only. And I'll need help: could someone add support
-for them in parser ? Not handling - but parsing and common checking.
-Also, it would be nice to have better temp tables implementation 
-(without affecting pg_class etc) - node material need in query-level 
-temp tables anyway. I'd really like to see temp table files created
-only when its data must go to disk due to local buffer pool is full
-and can't more keep table data in memory. Also, local buffer manager
-should be re-written to use hash table (like shared bufmgr) for buffer search,
-not sequential scan as now (this is item for TODO) - this will speed up
-things and allow to use more than 64 local buffers.
-
-I'm still sure that handling subselects in parser is not right way.
-And the main problem is not in execution plans (we could use tricks
-to resolve this) but in performance. Example:
-
-select b from big where b in (select s from small);
-
-If there is no duplicates in small then this is the same as
-
-select b from big, small where b = s;
-
-Without index on big postgres does seq scan of big and uses hashjoin with
-hash on small. Using temp table makes query only 20% slower (in my test). 
-But with index on big postgres uses nestloop with seq scan of small and 
-index scan of big => select run faster and temp table stuff makes query 
-2.5 times slower! In the case of duplicates in small, handling in parser 
-will use distinct (and so - sorting). But using hashjoin plan distinct 
-may be avoided! Who can analyze this ? Optimizer only. He can be smart
-to check is there unique index on small or not. If not - what is more 
-costless: nestloop with sorting or slower hashjoin without sorting. 
-Only optimizer can find best way to execute query, parser can't.
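-
-The equivalence claim above, and the way duplicates in small break it, can
-be checked with a small script. This sketch uses Python's sqlite3 module
-rather than PostgreSQL, with made-up rows, so it only demonstrates the
-result sets, not the plans or timings mentioned in the message.
-
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE big (b INTEGER);
    CREATE TABLE small (s INTEGER);
    INSERT INTO big VALUES (1), (2), (3);
    INSERT INTO small VALUES (2), (2), (3);  -- note the duplicate 2
    """
)


def rows(sql):
    """Run a query and return its single column as a sorted list."""
    return sorted(r[0] for r in conn.execute(sql))


# The subselect form: each big row appears at most once.
in_form = rows("SELECT b FROM big WHERE b IN (SELECT s FROM small)")

# The plain join: a big row is repeated once per matching small row.
join_form = rows("SELECT b FROM big, small WHERE b = s")

# Removing duplicates from small first restores the equivalence.
distinct_join = rows(
    "SELECT b FROM big, (SELECT DISTINCT s FROM small) AS small2 WHERE b = s"
)
```
-
-Here in_form and distinct_join agree while join_form contains an extra 2,
-which is why some form of duplicate elimination (a sort, a unique index on
-small, or a hash without duplicates) is needed before the join form can be
-substituted.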
-
-> Especially if we trade him for help on his favorite topics for v6.4??
-
-Ok, I'd like to see shared catalog cache implemented in 6.4... -:)
-
-Vadim
-
-From [email protected] Fri Dec 19 00:58:54 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id AAA25460
-   for ; Fri, 19 Dec 1997 00:58:52 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id AAA27667; Fri, 19 Dec 1997 00:54:39 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 19 Dec 1997 00:54:09 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id AAA27633 for pgsql-hackers-outgoing; Fri, 19 Dec 1997 00:54:04 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.5/8.7.5) with ESMTP id AAA27623 for ; Fri, 19 Dec 1997 00:53:53 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id AAA25415;
-   Fri, 19 Dec 1997 00:53:15 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] Items for 6.3
-To: [email protected] (Vadim B. Mikheev)
-Date: Fri, 19 Dec 1997 00:53:15 -0500 (EST)
-Cc: [email protected], [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Dec 19, 97 12:13:09 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> 
-> Thomas G. Lockhart wrote:
-> > 
-> > Could we try to do the subselect/join/union features for 6.3? I know you
-> > have been looking at it, and found the deepest parts of the backend to
-> > be a bit murky. I'm not familiar with that area at all, but perhaps we
-> > could divert Vadim for a week or two or three when he has some time.
->                                           ^^^^^
-> More realistic... And this is for initial release only: tuning performance
-> of subselects is very hard, long work.
-> 
-> Ok - I'm ready to do subselects for 6.3 but this means that foreign keys
-
-Great.
-
-> may appear in 6.4 only. And I'll need help: could someone add support
-> for them in parser ? Not handling - but parsing and common checking.
-> Also, it would be nice to have better temp tables implementation 
-> (without affecting pg_class etc) - node material need in query-level 
-> temp tables anyway. I'd really like to see temp table files created
-> only when its data must go to disk due to local buffer pool is full
-> and can't more keep table data in memory. Also, local buffer manager
-> should be re-written to use hash table (like shared bufmgr) for buffer search,
-> not sequential scan as now (this is item for TODO) - this will speed up
-> things and allow to use more than 64 local buffers.
-> 
-> I'm still sure that handling subselects in parser is not right way.
-> And the main problem is not in execution plans (we could use tricks
-> to resolve this) but in performance. Example:
-> 
-> select b from big where b in (select s from small);
-> 
-> If there is no duplicates in small then this is the same as
-> 
-> select b from big, small where b = s;
-> 
-> Without index on big postgres does seq scan of big and uses hashjoin with
-> hash on small. Using temp table makes query only 20% slower (in my test). 
-> But with index on big postgres uses nestloop with seq scan of small and 
-> index scan of big => select run faster and temp table stuff makes query 
-> 2.5 times slower! In the case of duplicates in small, handling in parser 
-> will use distinct (and so - sorting). But using hashjoin plan distinct 
-> may be avoided! Who can analyze this ? Optimizer only. He can be smart
-> to check is there unique index on small or not. If not - what is more 
-> costless: nestloop with sorting or slower hashjoin without sorting. 
-> Only optimizer can find best way to execute query, parser can't.
-> 
-
-OK, let me comment on this.  Let's take your example:
-
->  select b from big where b in (select s from small);
-> 
->  If there is no duplicates in small then this is the same as
-> 
->  select b from big, small where b = s;
-
-My idea was to do this:
-
-   select distinct s into temp table small2 from small;
-   select b from big,small2 where b = s;
-
-And let the optimizer decide how to do the join.  Is this what you are
-saying?
-
-The problem I see is that the temp table is already distinct, and was
-sorted to do that, but you can't pass that information into the
-optimizer.  Is that the problem with using the parser?
-
-But you want the temp table never to hit disk unless it has to, but that
-will not work unless we do a really good job with temp tables.
-
-Also NOT IN will need some type of non-join operator, perhaps a flag in
-the Plan to say "look for a match, but only output if you find it."  How
-do we do that?
-
-We definitely need temp tables, and I think we can stuff it into the
-cache as LOCAL, which will make it usable without adding to pg_class.
-
-Perhaps if we create a special Plan in the optimizer called IN, and we
-have the outer and inner queries as plans, and work that plan into the
-executor.
-
-The problem with that is we need to specify a way to join the two plans,
-and the same logic that determines what type of join to do can do this too.
-Maybe that's why you wanted stuff done in the optimizer and not the
-parser.
-
-At least now, I understand enough to come up with ideas, and can
-understand what you are saying.
-
-> > Especially if we trade him for help on his favorite topics for v6.4??
-> 
-> Ok, I'd like to see shared catalog cache implemented in 6.4... -:)
-> 
-> Vadim
-> 
-
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Fri Dec 19 01:00:58 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA25512
-   for ; Fri, 19 Dec 1997 01:00:56 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id AAA28102; Fri, 19 Dec 1997 00:56:52 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 19 Dec 1997 00:56:40 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id AAA28077 for pgsql-hackers-outgoing; Fri, 19 Dec 1997 00:56:36 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.5/8.7.5) with ESMTP id AAA28065 for ; Fri, 19 Dec 1997 00:56:19 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id AAA25436;
-   Fri, 19 Dec 1997 00:55:56 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] Items for 6.3
-To: [email protected] (Vadim B. Mikheev)
-Date: Fri, 19 Dec 1997 00:55:56 -0500 (EST)
-Cc: [email protected], [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Dec 19, 97 12:13:09 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> select b from big where b in (select s from small);
-> 
-> If there is no duplicates in small then this is the same as
-> 
-> select b from big, small where b = s;
-
-I think I see the problem you are describing now.  If we put the
-subselect into a temp table, we can't use the existing index on small.s,
-even if there is one, or if sorting was involved in creating the temp
-table.
-
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Fri Dec 19 01:34:26 1997
-Received: from golem.jpl.nasa.gov ([email protected] [128.149.70.168])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA25750
-   for ; Fri, 19 Dec 1997 01:34:23 -0500 (EST)
-Received: from alumni.caltech.edu (localhost [127.0.0.1])
-   by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id GAA15234;
-   Fri, 19 Dec 1997 06:29:45 GMT
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 19 Dec 1997 06:29:45 +0000
-From: "Thomas G. Lockhart" 
-Organization: Caltech/JPL
-X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.30 i686)
-MIME-Version: 1.0
-To: "Vadim B. Mikheev" 
-CC: Bruce Momjian ,
-        PostgreSQL-development 
-Subject: Re: [HACKERS] Items for 6.3
-References: <[email protected]> <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-> > Could we try to do the subselect/join/union features for 6.3? I know you
-> > have been looking at it, and found the deepest parts of the backend to
-> > be a bit murky. I'm not familiar with that area at all, but perhaps we
-> > could divert Vadim for a week or two or three when he has some time.
->                                           ^^^^^
-> More realistic... And this is for initial release only: tuning performance
-> of subselects is very hard, long work.
->
-> Ok - I'm ready to do subselects for 6.3 but this means that foreign keys
-> may appear in 6.4 only. And I'll need help: could someone add support
-> for them in parser ? Not handling - but parsing and common checking.
-
-Yes, I've already added subselect syntax in the parser, but we will need to
-modify or add to the parse tree nodes to push that past the parser into the
-backend. I'm happy to focus on that, since I understand those pieces pretty well.
-There are several places where "subselect syntax" is used: subselects and unions
-come to mind right away. If you have an opinion on how the parse nodes should be
-structured I can start with that, or I can just put something in and then modify
-it as you need later. Do you see unions as being similar to subselects, or are
-they a separate problem? To me, they seem like a simpler case since (perhaps) not
-as much optimization and internal reorganizing needs to happen.
-
-> Also, it would be nice to have better temp tables implementation
-> (without affecting pg_class etc) - node material need in query-level
-> temp tables anyway. I'd really like to see temp table files created
-> only when its data must go to disk due to local buffer pool is full
-> and can't more keep table data in memory.
-
-This sounds very desirable. I noticed that there are, or used to be, multiple
-storage managers. Could a manager for temporary storage be written which stores
-things in memory until it gets too big and then go to disk? Could that manager
-use the mm and md managers internally? Or is all of that at too low a level to be
-helpful for this problem?
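-
-One way to picture "in memory until it gets too big, then go to disk" is
-the Python sketch below. SpillingTempStore and its row-count threshold are
-hypothetical, and pickle plus tempfile stand in for real page and file
-handling; it says nothing about how the mm or md managers work.
-
```python
import pickle
import tempfile


class SpillingTempStore:
    """Keeps rows in a list; creates a temp file only once the list is full."""

    def __init__(self, max_rows_in_memory):
        self.max_rows_in_memory = max_rows_in_memory
        self._memory = []
        self._file = None  # created lazily, on the first overflow

    @property
    def on_disk(self):
        return self._file is not None

    def append(self, row):
        if self._file is None:
            if len(self._memory) != self.max_rows_in_memory:
                self._memory.append(row)
                return
            # Memory budget exhausted: create the file and flush what we hold.
            self._file = tempfile.TemporaryFile()
            for buffered in self._memory:
                pickle.dump(buffered, self._file)
            self._memory = []
        self._file.seek(0, 2)  # always write at the end of the file
        pickle.dump(row, self._file)

    def scan(self):
        """Yield all rows in insertion order, from memory or from the file."""
        if self._file is None:
            yield from list(self._memory)
            return
        self._file.seek(0)
        while True:
            try:
                yield pickle.load(self._file)
            except EOFError:
                return
```
-
-A small temporary result never creates a file at all, which is the
-behaviour asked for in the quoted text.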
-
-SQL92 has the concept of transaction-only and session-only tables and variables.
-Could an implementation of "temporary tables" be used to implement this feature
-at the same time (or form the basis for it later)? It seems like none of these
-non-permanent tables need to go to any of the pg_ tables, since other backends do
-not need to see them and they are allowed to disappear at the end of the session
-(or at a crash). We would just need the "table manager" to cache information on
-temporary stuff before looking at the permanent tables (??).
-
-> Also, local buffer manager
-> should be re-written to use hash table (like shared bufmgr) for buffer search,
-> not sequential scan as now (this is item for TODO) - this will speed up
-> things and allow to use more than 64 local buffers.
->
-> I'm still sure that handling subselects in parser is not right way.
-> And the main problem is not in execution plans (we could use tricks
-> to resolve this) but in performance.
-
-Seems to me that the subselect needs to stay untransformed (i.e. executable but
-non-optimized) so that an optimizer can independently decide how to transform for
-faster execution. That way, in the first implementation we have reliable but
-stupid execution, but then can add a subselect optimizer which looks for cases
-which can be transformed to run faster.
-
-> > Especially if we trade him for help on his favorite topics for v6.4??
->
-> Ok, I'd like to see shared catalog cache implemented in 6.4... -:)
-
-Sure. (Tell me what it is later :)
-
-                                              - Tom
-
-
-
-From [email protected] Fri Dec 19 06:23:14 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id GAA27849
-   for ; Fri, 19 Dec 1997 06:22:46 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id SAA12239;
-   Fri, 19 Dec 1997 18:28:13 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 19 Dec 1997 18:28:12 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected], [email protected]
-Subject: Re: [HACKERS] Items for 6.3
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> OK, let me comment on this.  Let's take your example:
-> 
-> >       select b from big where b in (select s from small);
-> >
-> >       If there is no duplicates in small then this is the same as
-> >
-> >       select b from big, small where b = s;
-> 
-> My idea was to do this:
-> 
->         select distinct s into temp table small2 from small;
->         select b from big,small2 where b = s;
-> 
-> And let the optimizer decide how to do the join.  Is this what you are
-> saying?
-> 
-> The problem I see is that the temp table is already distinct, and was
-> sorted to do that, but you can't pass that information into the
-> optimizer.  Is that the problem with using the parser?
-
-No. I said that in some cases we can avoid distinct at all: if either
-unique index on small exists or by using hashjoin plans with !new!
-HashUnique node (there was mistake in my prev description - not Hash,
-but HashUnique on small should be used, - HashUnique is hash table
-without duplicates, just another way to implement distinct, without
-sorting). This new node can also be useful for "normal" queries
-(without subselects).
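-
-A minimal Python sketch of the idea, assuming HashUnique means "build a
-hash table on small that drops duplicates as it goes, then probe it once
-per row of big". The function name is hypothetical and a Python set plays
-the role of the hash table.
-
```python
def hash_unique_join(outer_rows, inner_rows):
    """Evaluate b IN (select s from small) without sorting small."""
    # Build phase: inserting into a set discards duplicates, so no
    # separate DISTINCT (and no sort) is needed on the inner side.
    hash_unique = set()
    for s in inner_rows:
        hash_unique.add(s)
    # Probe phase: each outer row is emitted at most once per occurrence.
    return [b for b in outer_rows if b in hash_unique]
```
-
-For example, hash_unique_join([1, 2, 3, 2], [2, 2, 3]) keeps both outer
-2s but is unaffected by the duplicate 2 on the inner side.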
-
-My example is very simple. I just want to say that by handling subqueries
-in optimizer we will have more chances to do better optimization. Maybe not
-now, but later. I'm sure that subqueries require some specific optimization
-and this is not task of parser.
-
-> 
-> But you want the temp table never to hit disk unless it has to, but that
-> will not work unless we do a really good job with temp tables.
-
-Of course.
-
-> 
-> Also NOT IN will need some type of non-join operator, perhaps a flag in
-> the Plan to say "look for a match, but only output if you find it."  How
-                                                           ^^
-                                                          don't ?
-> do we do that?
-
-Just as you said - by using of some flag.
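-
-A sketch of what such a flag could look like, in Python. The negate
-parameter is a hypothetical stand-in for the Plan flag, and SQL NULL
-handling for NOT IN is ignored here.
-
```python
def hash_probe(outer_rows, inner_rows, negate=False):
    """Probe a hash of inner_rows with each outer row.

    negate=False: output the row only if a match is found (IN).
    negate=True:  output the row only if no match is found (NOT IN)."""
    table = set(inner_rows)
    return [b for b in outer_rows if (b in table) != negate]
```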
-
-> 
-> We definitely need temp tables, and I think we can stuff it into the
-> cache as LOCAL, which will make it usable without adding to pg_class.
-
-We have Relation->rd_istemp flag... Just change it from bool to int:
-0 -> is not temp, 1 -> session level temp table, etc...
-
-Vadim
-
-From [email protected] Fri Dec 19 08:09:11 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id IAA00349
-   for ; Fri, 19 Dec 1997 08:09:05 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id UAA12377;
-   Fri, 19 Dec 1997 20:14:25 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 19 Dec 1997 20:14:15 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: "Thomas G. Lockhart" 
-CC: Bruce Momjian ,
-        PostgreSQL-development 
-Subject: Re: [HACKERS] Items for 6.3
-References: <[email protected]> <[email protected]> <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Thomas G. Lockhart wrote:
-> 
-> > Ok - I'm ready to do subselects for 6.3 but this means that foreign keys
-> > may appear in 6.4 only. And I'll need help: could someone add support
-> > for them in parser ? Not handling - but parsing and common checking.
-> 
-> Yes, I've already added subselect syntax in the parser, but we will need to
-> modify or add to the parse tree nodes to push that past the parser into the
-> backend. I'm happy to focus on that, since I understand those pieces pretty well.
-
-Nice!
-
-> There are several places where "subselect syntax" is used: subselects and unions
-> come to mind right away. If you have an opinion on how the parse nodes should be
-> structured I can start with that, or I can just put something in and then modify
-                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-It's ok for me.
-
-> it as you need later. Do you see unions as being similar to subselects, or are
-> they a separate problem? To me, they seem like a simpler case since (perhaps) not
-> as much optimization and internal reorganizing needs to happen.
-
-I didn't think about unions at all... Yes, it's simpler to implement.
-BTW, I recall Bruce mentioned that unions are used for selects from
-superclass and all descendant classes (select ... from table* ) - maybe
-something is already implemented ? Bruce ?
-
-> 
-> > Also, it would be nice to have better temp tables implementation
-> > (without affecting pg_class etc) - node material need in query-level
-> > temp tables anyway. I'd really like to see temp table files created
-> > only when its data must go to disk due to local buffer pool is full
-> > and can't more keep table data in memory.
-> 
-> This sounds very desirable. I noticed that there are, or used to be, multiple
-> storage managers. Could a manager for temporary storage be written which stores
-> things in memory until it gets too big and then go to disk? Could that manager
-> use the mm and md managers internally? Or is all of that at too low a level to be
-> helpful for this problem?
-
-mm uses shmem... This feature could be implemented in local bufmgr
-directly: when requested buffer is not found in pool and there is no free, 
-!dirty buffer then try to find some dirty buffer of created relation, flush 
-it to disk and use (exception below); if no such buffer -> create some relation 
-(and flush 1st block); exception: also create some relation if # of buffers 
-occupied by already created relations is too small (just to do not break
-buffering of created relations).
-(Note, that using some additional in-memory storage manager will cause
-keeping some buffers in-memory twice - in local pool and in manager.
-The way above is using local bufmgr as storage manager).
-
-> >
-> > I'm still sure that handling subselects in parser is not right way.
-> > And the main problem is not in execution plans (we could use tricks
-> > to resolve this) but in performance.
-> 
-> Seems to me that the subselect needs to stay untransformed (i.e. executable but
-> non-optimized) so that an optimizer can independently decide how to transform for
-> faster execution. That way, in the first implementation we have reliable but
-> stupid execution, but then can add a subselect optimizer which looks for cases
-> which can be transformed to run faster.
-
-Yes, I believe that this is right way.
-
-> 
-> > > Especially if we trade him for help on his favorite topics for v6.4??
-> >
-> > Ok, I'd like to see shared catalog cache implemented in 6.4... -:)
-> 
-> Sure. (Tell me what it is later :)
-
-Ok -:)
-
-Vadim
-
-From [email protected] Tue Dec 23 04:01:21 1997
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id EAA08884
-   for ; Tue, 23 Dec 1997 04:01:18 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id DAA24250 for ; Tue, 23 Dec 1997 03:57:12 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id QAA23028;
-   Tue, 23 Dec 1997 16:04:25 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 23 Dec 1997 16:04:23 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected], [email protected]
-Subject: Re: [HACKERS] Items for 6.3
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> >
-> > I didn't think about unions at all... Yes, it's simpler to implement.
-> > BTW, I recall Bruce mentioned that unions are used for selects from
-> > superclass and all descendant classes (select ... from table* ) - maybe
-> > something is already implemented ? Bruce ?
-> 
-> Yes, it is already there.  See optimizer/prep/prepunion.c, and see the
-> call to it from optimizer/plan/planner.c.  The current source tree has a
-> cleaned up version that will be easier to understand.  Basically, if
-> there are any inherited tables, it calls prepunion, and cycles
-> through each inherited table, copying the Query plan, and calling the
-> planner() for each one, then it returns to the planner() to do sorting
-> and uniqueness.  I am working on fixing aggregates.
-
-Could you try with unions?
-I would like to concentrate on a single thing: subqueries.
-
-> 
-> > mm uses shmem... This feature could be implemented in the local bufmgr
-> > directly: when the requested buffer is not found in the pool and there is no
-> > free, !dirty buffer, then try to find some dirty buffer of a created relation,
-> > flush it to disk and use it (exception below); if no such buffer -> create some
-> > relation (and flush 1st block); exception: also create some relation if the
-> > number of buffers occupied by already created relations is too small (so as
-> > not to break buffering of created relations).
-> > (Note that using an additional in-memory storage manager would keep some
-> > buffers in memory twice: in the local pool and in the manager.
-> > The approach above uses the local bufmgr as the storage manager.)
-> 
-> In the psort code, we do a nice job of keeping the stuff in files or
-> memory.  Seems to work well.  Can we use that somehow?  Perhaps make it
-> a separate module, or just force a psort rather than a hash!
-
-I would like not to be restricted to psort only, but to use whatever is better
-in each case. I can even foresee using indices on temp tables: we could
-put data in the index without putting data in the table itself!
-In any case, we can leave in-memory tables for the future.
-
-Vadim
-
-
-From [email protected] Thu Dec  5 10:30:53 1996
-Received: from abs.net ([email protected] [207.114.0.130]) by candle.pha.pa.us (8.8.3/8.7.3) with ESMTP id KAA06591 for ; Thu, 5 Dec 1996 10:30:43 -0500 (EST)
-Received: from aixssd.UUCP (nobody@localhost) by abs.net (8.8.3/8.7.3) with UUCP id KAA01387 for [email protected]; Thu, 5 Dec 1996 10:13:56 -0500 (EST)
-Received: by aixssd (AIX 3.2/UCB 5.64/4.03)
-          id AA36963; Thu, 5 Dec 1996 10:10:24 -0500
-Received: by ceodev (AIX 4.1/UCB 5.64/4.03)
-          id AA34942; Thu, 5 Dec 1996 10:07:56 -0500
-Date: Thu, 5 Dec 1996 10:07:56 -0500
-From: [email protected] (Darren King)
-Message-Id: <9612051507.AA34942@ceodev>
-To: [email protected]
-Subject: Subselect info.
-Mime-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Content-Md5: jaWdPH2KYtdr7ESzqcOp5g==
-Status: OR
-
-> Any of them deal with implementing subselects?
-
-There's a white paper at www.sybase.com that might
-help a little.  It's just a copy of a presentation
-given by the optimizer guru there.  Nothing code-wise,
-but he gives a few ways of flattening them with temp
-tables, etc...
-
-Darren 
-
-From [email protected] Thu Aug 21 23:42:50 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id XAA04109
-   for ; Thu, 21 Aug 1997 23:42:43 -0400 (EDT)
-Received: from www.krasnet.ru (localhost [127.0.0.1]) by www.krasnet.ru (8.7.5/8.7.3) with SMTP id MAA04399; Fri, 22 Aug 1997 12:04:31 +0800 (KRD)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 22 Aug 1997 12:04:31 +0800
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.1.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-Subject: Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> Considering the complexity of the primary/secondary changes you are
-> making, I believe subselects will be easier than that.
-
-I'm not making changes for P/F keys - just thinking...
-Yes, I think that implementing referential integrity is
-more complex work.
-
-As for subselects:
-
-in plannodes.h
-
-typedef struct Plan {
-...
-    struct Plan         *lefttree;
-    struct Plan         *righttree;
-} Plan;
-
-/* ----------------
- *  these are are defined to avoid confusion problems with "left"
-                                   ^^^^^^^^^^^^^^^^^^
- *  and "right" and "inner" and "outer".  The convention is that   
- *  the "left" plan is the "outer" plan and the "right" plan is
- *  the inner plan, but these make the code more readable.
- * ----------------
- */
-#define innerPlan(node)         (((Plan *)(node))->righttree)
-#define outerPlan(node)         (((Plan *)(node))->lefttree)
-
-First thought is to avoid any confusion by re-defining
-
-#define rightPlan(node)         (((Plan *)(node))->righttree)
-#define leftPlan(node)          (((Plan *)(node))->lefttree)
-
-and changing all occurrences of 'outer' & 'inner' in code
-to 'left' & 'right' ones:
-
-this will allow us to use 'outer' & 'inner' things for subselects
-later, without confusion. My hope is that we may change the Executor
-very easily by adding outer/inner plans/TupleSlots to
-EState, CommonState, JoinState, etc. and by doing node
-processing in the right order.
-
-Subselects are mostly a Planner problem.
-
-Unfortunately, I haven't time at the moment: CHECK/DEFAULT...
-
-Vadim
-
-From [email protected] Fri Aug 22 00:00:59 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id AAA04354
-   for ; Fri, 22 Aug 1997 00:00:51 -0400 (EDT)
-Received: from www.krasnet.ru (localhost [127.0.0.1]) by www.krasnet.ru (8.7.5/8.7.3) with SMTP id MAA04425; Fri, 22 Aug 1997 12:22:37 +0800 (KRD)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 22 Aug 1997 12:22:37 +0800
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.1.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-Subject: Re: subselects
-References: <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Vadim B. Mikheev wrote:
-> 
-> this will allow us to use 'outer' & 'inner' things for subselects
-> later, without confusion. My hope is that we may change the Executor
-
-Or maybe use 'high' & 'low' for subselects (to avoid confusion
-with outer joins).
-
-> very easily by adding outer/inner plans/TupleSlots to
-> EState, CommonState, JoinState, etc. and by doing node
-> processing in the right order.
-             ^^^^^^^^^^^^^^^^^^
-The rule is easy:
-1. Uncorrelated subselect - do 'low' plan node first
-2. Correlated             - do left/right first
-
-- just a flag in the structures.
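A minimal sketch of the rule above, assuming a hypothetical SubselectNode
structure (not a real PostgreSQL node type) whose 'low' plan computes
max($1 * y) over an inner relation. The names SubselectNode, run_low_plan and
exec_subselect_node are illustrative only. The flag decides whether the 'low'
plan runs once before the upper plan, or once per upper row.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a plan node that owns one subselect. */
typedef struct SubselectNode {
    const int *outer_rows;   /* rows produced by the left/right (upper) plan */
    size_t     n_outer;
    const int *inner_rows;   /* rows scanned by the 'low' (subselect) plan */
    size_t     n_inner;
    int        correlated;   /* the flag: 1 = correlated, 0 = uncorrelated */
    size_t     low_runs;     /* how many times the 'low' plan was executed */
} SubselectNode;

/* 'low' plan: select max(param * y) from inner, where param plays the
 * role of $1.  The uncorrelated case passes param = 1. */
static long run_low_plan(SubselectNode *node, int param)
{
    assert(node->n_inner > 0);   /* max() over an empty set is not modelled */
    node->low_runs++;
    long best = (long) param * node->inner_rows[0];
    for (size_t i = 1; i < node->n_inner; i++) {
        long v = (long) param * node->inner_rows[i];
        if (v > best)
            best = v;
    }
    return best;
}

/* Returns the number of upper rows x that satisfy x = (subselect). */
size_t exec_subselect_node(SubselectNode *node)
{
    size_t matches = 0;
    long   cached = 0;

    node->low_runs = 0;
    if (!node->correlated)
        cached = run_low_plan(node, 1);   /* rule 1: 'low' plan first, once */

    for (size_t i = 0; i < node->n_outer; i++) {
        int x = node->outer_rows[i];
        /* rule 2: the upper plan drives; 'low' plan is re-run for each row */
        long sub = node->correlated ? run_low_plan(node, x) : cached;
        if ((long) x == sub)
            matches++;
    }
    return matches;
}
```

With four upper rows, the uncorrelated case executes the 'low' plan once and
the correlated case executes it four times, which is the cost difference the
rest of this thread is concerned with.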
-
-Vadim
-
-From [email protected] Thu Oct 30 17:02:30 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id RAA09682
-   for ; Thu, 30 Oct 1997 17:02:28 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id QAA20688; Thu, 30 Oct 1997 16:58:40 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 30 Oct 1997 16:58:24 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id QAA20615 for pgsql-hackers-outgoing; Thu, 30 Oct 1997 16:58:17 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id QAA20495 for ; Thu, 30 Oct 1997 16:57:54 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id QAA07726
-   for [email protected]; Thu, 30 Oct 1997 16:50:29 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] subselects
-To: [email protected] (PostgreSQL-development)
-Date: Thu, 30 Oct 1997 16:50:29 -0500 (EST)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-The only thing I have to add to what I had written earlier is that I
-think it is best to have these subqueries executed as early in query
-execution as possible.
-
-Every piece of the backend: parser, optimizer, executor, is designed to
-work on a single query.  The earlier we can split up the queries, the
-better those pieces will work at doing their job.  You want to be able
-to use the parser and optimizer on each part of the query separately, if
-you can.
-
-
-Forwarded message:
-> I have done some thinking about subselects.  There are basically two
-> issues:
-> 
->  Does the query return one row or several rows?  This can be
->  determined by seeing if the user uses equals or 'IN' to join the
->  subquery. 
-> 
->  Is the query correlated, meaning "Does the subquery reference
->  values from the outer query?"
-> 
-> (We already have the third type of subquery, the INSERT...SELECT query.)
-> 
-> So we have these four combinations:
-> 
->  1) one row, no correlation
->  2) multiple rows, no correlation
->  3) one row, correlated
->  4) multiple rows, correlated
-> 
-> 
-> With #1, we can execute the subquery, get the value, replace the
-> subquery with the constant returned from the subquery, and execute the
-> outer query.
-> 
-> With #2, we can execute the subquery and put the result into a temporary
-> table.  We then rewrite the outer query to access the temporary table
-> and replace the subquery with the column name from the temporary table. 
-> We probably put an index on the temp. table, which has only one
-> column, because a subquery can only return one column.  We remove the
-> temp. table after query execution.
-> 
-> With #3 and #4, we potentially need to execute the subquery for every
-> row returned by the outer query.  Performance would be horrible for
-> anything but the smallest query.  Another way to handle this is to
-> execute the subquery WITHOUT using any of the outer-query columns to
-> restrict the WHERE clause, and add those columns used to join the outer
-> variables into the target list of the subquery.  So for query:
-> 
->  select t1.name
->  from tab t1
->  where t1.age = (select max(t2.age)
->              from tab2 t2
->              where t2.name = t1.name)
-> 
-> Execute the subquery, without the correlation in its WHERE clause, and
-> put the result in a temporary table:
-> 
->  select t2.name, max(t2.age) as age
->  into table temp999
->  from tab2 t2
->  group by t2.name
-> 
->  create index i_temp999 on temp999 (name)
-> 
-> Then re-write the outer query:
-> 
->  select t1.name
->  from tab t1, temp999
->  where t1.age = temp999.age and
->        t1.name = temp999.name
-> 
-> The only problem here is that the subselect is running for all entries
-> in tab2, even if the outer query is only going to need a few rows. 
-> Whether to execute the subquery each time or to create a temp. table
-> is often difficult to determine.  Even some non-correlated
-> subqueries are better executed for each row rather than pre-executing the
-> entire subquery, especially if the outer query returns few rows.
-> 
-> One requirement to handle these issues is better column statistics,
-> which I am working on.
-> 
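A minimal sketch of the two strategies for case #3 in the forwarded message,
assuming small in-memory arrays in place of tab and tab2. Row,
per_row_subquery and temp_table_join are hypothetical names, and a linear
search stands in for the index on the temporary table. Both functions count
the rows of tab whose age equals the maximum age recorded in tab2 for the
same name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Row { const char *name; int age; } Row;

/* Strategy A: re-execute the correlated subquery for every outer row. */
size_t per_row_subquery(const Row *tab, size_t n_tab,
                        const Row *tab2, size_t n_tab2,
                        size_t *subquery_runs)
{
    size_t matches = 0;
    *subquery_runs = 0;
    for (size_t i = 0; i < n_tab; i++) {
        int found = 0, max_age = 0;
        (*subquery_runs)++;                  /* one full scan of tab2 per row */
        for (size_t j = 0; j < n_tab2; j++) {
            if (strcmp(tab2[j].name, tab[i].name) != 0)
                continue;
            if (!found || tab2[j].age > max_age)
                max_age = tab2[j].age;
            found = 1;
        }
        if (found && tab[i].age == max_age)  /* no tab2 rows: NULL, no match */
            matches++;
    }
    return matches;
}

/* Strategy B: one pass over tab2 builds temp(name, max age), then join. */
size_t temp_table_join(const Row *tab, size_t n_tab,
                       const Row *tab2, size_t n_tab2,
                       Row *temp, size_t temp_cap)
{
    size_t n_temp = 0, matches = 0;

    for (size_t j = 0; j < n_tab2; j++) {    /* the GROUP BY name step */
        size_t k = 0;
        while (k < n_temp && strcmp(temp[k].name, tab2[j].name) != 0)
            k++;
        if (k == n_temp) {
            assert(n_temp < temp_cap);
            temp[n_temp++] = tab2[j];
        } else if (tab2[j].age > temp[k].age) {
            temp[k].age = tab2[j].age;
        }
    }
    for (size_t i = 0; i < n_tab; i++)       /* join on name and age */
        for (size_t k = 0; k < n_temp; k++)
            if (tab[i].age == temp[k].age &&
                strcmp(tab[i].name, temp[k].name) == 0)
                matches++;
    return matches;
}
```

Strategy B scans tab2 once regardless of how many outer rows there are, but it
aggregates every name in tab2, including names the outer query never asks
about. That is the trade-off described at the end of the forwarded message.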
-
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Fri Oct 31 22:30:58 1997
-Received: from renoir.op.net ([email protected] [206.84.208.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id WAA15643
-   for ; Fri, 31 Oct 1997 22:30:56 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id WAA24379 for ; Fri, 31 Oct 1997 22:06:08 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id WAA15503; Fri, 31 Oct 1997 22:03:40 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 31 Oct 1997 22:01:38 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id WAA14136 for pgsql-hackers-outgoing; Fri, 31 Oct 1997 22:01:29 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id WAA13866 for ; Fri, 31 Oct 1997 22:00:53 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id VAA14566;
-   Fri, 31 Oct 1997 21:37:06 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] subselects
-To: [email protected] (Bruce Momjian)
-Date: Fri, 31 Oct 1997 21:37:06 +1900 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Bruce Momjian" at Oct 30, 97 04:50:29 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-One more issue I thought of.  You can have multiple subselects in a
-single query, and subselects can have their own subselects.
-
-This makes it particularly important that we define a system that always
-is able to process the subselect BEFORE the upper select.  This will
-allow us to handle all these cases without limitations.
-
-> 
-> The only thing I have to add to what I had written earlier is that I
-> think it is best to have these subqueries executed as early in query
-> execution as possible.
-> 
-> Every piece of the backend: parser, optimizer, executor, is designed to
-> work on a single query.  The earlier we can split up the queries, the
-> better those pieces will work at doing their job.  You want to be able
-> to use the parser and optimizer on each part of the query separately, if
-> you can.
-> 
-
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Sun Nov  2 10:33:33 1997
-Received: from sid.trust.ee (sid.trust.ee [194.204.23.180])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id KAA27619
-   for ; Sun, 2 Nov 1997 10:32:04 -0500 (EST)
-Received: from sid.trust.ee (wink.trust.ee [194.204.23.184])
-   by sid.trust.ee (8.8.5/8.8.5) with ESMTP id RAA02233;
-   Sun, 2 Nov 1997 17:30:11 +0200
-Message-ID: <[email protected]>
-Date: Sun, 02 Nov 1997 17:27:57 +0200
-From: Hannu Krosing 
-X-Mailer: Mozilla 4.02 [en] (Win95; I)
-MIME-Version: 1.0
-To: [email protected]
-CC: [email protected]
-Subject: Re: [HACKERS] subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-> Date: Fri, 31 Oct 1997 21:37:06 +1900 (EST)
-> From: Bruce Momjian 
-> Subject: Re: [HACKERS] subselects
->
-> One more issue I thought of.  You can have multiple subselects in a
-> single query, and subselects can have their own subselects.
->
-> This makes it particularly important that we define a system that always
-> is able to process the subselect BEFORE the upper select.  This will
-> allow us to handle all these cases without limitations.
-
-This would severely limit what subselects can be used for, as you can't use
-any of the fields in the upper select in a search criterion for the subselect;
-for example you can't do
-
-update parts p1
-set parts.current_id = (
-    select new_id
-    from parts p2
-    where p1.old_id = p2.new_id);
-
-or
-
-select id, price, (select sum(price) from parts p2 where p1.id=p2.id) as totalprice
-from parts p1;
-
-There may of course be ways to rewrite these queries (which the optimiser should do
-if it can) but IMHO, these kinds of subselects should still be allowed.
-
-> > The only thing I have to add to what I had written earlier is that I
-> > think it is best to have these subqueries executed as early in query
-> > execution as possible.
-> >
-> > Every piece of the backend: parser, optimizer, executor, is designed to
-> > work on a single query.  The earlier we can split up the queries, the
-> > better those pieces will work at doing their job.  You want to be able
-> > to use the parser and optimizer on each part of the query separately, if
-> > you can.
-> >
->
-
-Hannu
-
-
-From [email protected] Sun Nov  2 21:30:59 1997
-Received: from renoir.op.net ([email protected] [206.84.208.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id VAA14831
-   for ; Sun, 2 Nov 1997 21:30:57 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id VAA19683 for ; Sun, 2 Nov 1997 21:20:13 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by www.krasnet.ru (8.8.7/8.7.3) with SMTP id JAA17259; Mon, 3 Nov 1997 09:22:38 +0700 (KRS)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Mon, 03 Nov 1997 09:22:38 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: [HACKERS] subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > > One more issue I thought of.  You can have multiple subselects in a
-> > > single query, and subselects can have their own subselects.
-> > >
-> > > This makes it particularly important that we define a system that always
-> > > is able to process the subselect BEFORE the upper select.  This will
-> > > allow us to handle all these cases without limitations.
-> >
-> > This would severely limit what subselects can be used for, as you can't use
-> > any of the fields in the upper select in a search criterion for the subselect;
-> > for example you can't do
-> >
-> > update parts p1
-> > set parts.current_id = (
-> >     select new_id
-> >     from parts p2
-> >     where p1.old_id = p2.new_id);
-> >
-> > or
-> >
-> > select id, price, (select sum(price) from parts p2 where p1.id=p2.id) as totalprice
-> > from parts p1;
-> >
-> > there may be of course ways to rewrite these queries (which the optimiser should do
-> > if it can) but IMHO, these kinds of subselects should still be allowed
-> 
-> I hadn't even gotten to this point yet, but it is a good thing to keep
-> in mind.
-> 
-> In these cases, as in correlated subqueries in the where clause, we will
-> create a temporary table, and add the proper join fields and tables to
-> the clauses.  Our version of UPDATE accepts a FROM section, and we will
-> certainly use this for this purpose.
-
-We can't replace a subselect with a join if there is an aggregate
-in the subselect.
-
-Actually, I don't see any problems if we are going to process subselects
-like sql-funcs: non-correlated subselects can be emulated by
-funcs without args; for correlated subselects the parser (analyze.c)
-has to change all upper query references to $1, $2,...
-
-Vadim
-
-From [email protected] Mon Nov  3 06:07:12 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id GAA27433
-   for ; Mon, 3 Nov 1997 06:07:03 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by www.krasnet.ru (8.8.7/8.7.3) with SMTP id SAA18519; Mon, 3 Nov 1997 18:09:44 +0700 (KRS)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Mon, 03 Nov 1997 18:09:43 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected]
-Subject: Re: [HACKERS] subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> >
-> > > In these cases, as in correlated subqueries in the where clause, we will
-> > > create a temporary table, and add the proper join fields and tables to
-> > > the clauses.  Our version of UPDATE accepts a FROM section, and we will
-> > > certainly use this for this purpose.
-> >
-> > We can't replace a subselect with a join if there is an aggregate
-> > in the subselect.
-> 
-> I got lost here.  Why can't we handle aggregates?
-
-Sorry, I missed the use of temp tables. Sybase uses joins (without
-temp tables) for non-correlated subqueries:
-
-    A noncorrelated subquery can be evaluated as if it were an independent query.
-    Conceptually, the results of the subquery are substituted in the main statement, or
-    outer query. This is not how SQL Server actually processes statements with
-    subqueries. Noncorrelated subqueries can be alternatively stated as joins and
-    are processed as joins by SQL Server. 
-
-but this is not possible if there are aggregates in the subquery.
-
-> 
-> My idea was this.  This is a non-correlated subquery.
-...
-No problems with it...
-
-> 
-> Here is a correlated example:
-> 
->         select *
->         from table_a
->         where table_a.col_a in (select table_b.col_b
->                         from table_b
->                         where table_b.col_b = table_a.col_c)
-> 
-> rewrite as:
-> 
->         select distinct table_b.col_b, table_a.col_c -- the distinct is needed
->         into table_sub
->         from table_a, table_b
-
-First, could we add 'where table_b.col_b = table_a.col_c' here ?
-Just to avoid Cartesian results ? I hope we can.
-
-Note that for query
-
-        select *
-        from table_a
-        where table_a.col_a in (select table_b.col_b * table_a.col_c
-                        from table_b)
-
-it's better to do
-
-   select distinct table_a.col_a
-   into table table_sub
-   from table_b, table_a
-        where table_a.col_a = table_b.col_b * table_a.col_c
-
-once again - to avoid Cartesians.
-
-But what could we do for
-
-        select *
-        from table_a
-        where table_a.col_a = (select max(table_b.col_b * table_a.col_c)
-                        from table_b)
-???
-   select max(table_b.col_b * table_a.col_c), table_a.col_a
-   into table table_sub
-   from table_b, table_a
-        group by table_a.col_a
-
-first tries to sort sizeof(table_a) * sizeof(table_b) tuples...
-For tables big and small with 100 000 and 1000 tuples 
-
-select max(x*y), x from big, small group by x
-
-"ate" all free 140M in my file system after 20 minutes (just for
-sorting - nothing more) and was killed...
-
-select x from big where x = cor(x);
-(cor(int4) is 'select max($1*y) from small') takes 20 minutes -
-this is bad too.
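A minimal sketch comparing the two routes measured above, assuming small
integer arrays in place of big and small and assuming the x values in big are
distinct (so each row is its own GROUP BY group). Cost, group_by_route and
function_route are hypothetical names. Both routes compute the same
|big| * |small| products; they differ in how many intermediate tuples exist at
once. For the 100 000 and 1000 tuple tables quoted above, that is
100 000 * 1000 = 100 000 000 products either way.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Cost {
    size_t products;      /* x*y values computed */
    size_t peak_tuples;   /* intermediate tuples held at the same time */
    size_t matches;       /* rows of big with x = max(x*y) */
} Cost;

/* Temp-table route: materialize big x small, then take the max per x.
 * The scratch array stands in for the sort input that filled the disk. */
Cost group_by_route(const int *big, size_t n_big,
                    const int *small, size_t n_small,
                    long *scratch, size_t scratch_cap)
{
    Cost c = {0, 0, 0};
    assert(n_small > 0 && n_big * n_small <= scratch_cap);

    for (size_t i = 0; i < n_big; i++)
        for (size_t j = 0; j < n_small; j++) {
            scratch[i * n_small + j] = (long) big[i] * small[j];
            c.products++;
        }
    c.peak_tuples = n_big * n_small;

    for (size_t i = 0; i < n_big; i++) {
        long best = scratch[i * n_small];
        for (size_t j = 1; j < n_small; j++)
            if (scratch[i * n_small + j] > best)
                best = scratch[i * n_small + j];
        if (best == big[i])
            c.matches++;
    }
    return c;
}

/* Function route: cor(x) scans small once per row of big and keeps
 * only the running maximum. */
Cost function_route(const int *big, size_t n_big,
                    const int *small, size_t n_small)
{
    Cost c = {0, 1, 0};
    assert(n_small > 0);

    for (size_t i = 0; i < n_big; i++) {
        long best = (long) big[i] * small[0];
        c.products++;
        for (size_t j = 1; j < n_small; j++) {
            long v = (long) big[i] * small[j];
            c.products++;
            if (v > best)
                best = v;
        }
        if (best == big[i])
            c.matches++;
    }
    return c;
}
```

The sketch only counts work. It says nothing about which route PostgreSQL
should pick, and it ignores the per-call overhead of an SQL function.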
-
-> >
-> > Actually, I don't see any problems if we are going to process subselects
-> > like sql-funcs: non-correlated subselects can be emulated by
-> > funcs without args; for correlated subselects the parser (analyze.c)
-> > has to change all upper query references to $1, $2,...
-> 
-> Yes, logically, they are SQL functions, but aren't we going to see
-> terrible performance in such circumstances?  My experience is that when
-  ^^^^^^^^^^^^^^^^^^^^
-You're right.
-
-> people are given subselects, they start to do huge jobs with them.
-> 
-> In fact, the final solution may be to have both methods available, and
-> switch between them depending on the size of the query sets.  Each
-> method has its advantages.  The function example lets the outside query
-> be executed, and only calls the subquery when needed.
-> 
-> For large tables where the subselect is small and is the entire WHERE
-> restriction, the SQL function gets called much too often.  A simple join
-> of the subquery result and the large table would be much better.  This
-> method also allows for sort/merge join of the subquery results, and
-> index use.
-
-...keep thinking...
-
-Vadim
-
-From [email protected] Mon Nov  3 11:01:01 1997
-Received: from renoir.op.net ([email protected] [206.84.208.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03633
-   for ; Mon, 3 Nov 1997 11:00:59 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id KAA12174 for ; Mon, 3 Nov 1997 10:49:42 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id KAA26203; Mon, 3 Nov 1997 10:33:32 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 03 Nov 1997 10:31:43 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id KAA25514 for pgsql-hackers-outgoing; Mon, 3 Nov 1997 10:31:36 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id KAA25449 for ; Mon, 3 Nov 1997 10:31:23 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id KAA02262;
-   Mon, 3 Nov 1997 10:25:34 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Mon, 3 Nov 1997 10:25:34 -0500 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Nov 3, 97 06:09:43 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> Sorry, I missed the use of temp tables. Sybase uses joins (without
-> temp tables) for non-correlated subqueries:
-> 
->     A noncorrelated subquery can be evaluated as if it were an independent query.
->     Conceptually, the results of the subquery are substituted in the main statement, or
->     outer query. This is not how SQL Server actually processes statements with
->     subqueries. Noncorrelated subqueries can be alternatively stated as joins and
->     are processed as joins by SQL Server. 
-> 
-> but this is not possible if there are aggregates in the subquery.
-> 
-> > 
-> > My idea was this.  This is a non-correlated subquery.
-> ...
-> No problems with it...
-> 
-> > 
-> > Here is a correlated example:
-> > 
-> >         select *
-> >         from table_a
-> >         where table_a.col_a in (select table_b.col_b
-> >                         from table_b
-> >                         where table_b.col_b = table_a.col_c)
-> > 
-> > rewrite as:
-> > 
-> >         select distinct table_b.col_b, table_a.col_c -- the distinct is needed
-> >         into table_sub
-> >         from table_a, table_b
-> 
-> First, could we add 'where table_b.col_b = table_a.col_c' here ?
-> Just to avoid Cartesian results ? I hope we can.
-
-Yes, of course.  I forgot that line here.  We can also be fancy and move
-some of the outer where restrictions on table_a into the subquery.
-
-I think the classic subquery for this would be if someone wanted all
-customer names that had invoices in the past month:
-
-select custname
-from customer
-where custid in (select order.custid
-        from order
-        where order.date >= '09/01/97' and
-              order.date <= '09/30/97')
-
-In this case, the subquery can use an index on 'date' to quickly
-evaluate the query, and the resulting temp table can quickly be joined
-to the customer table.  If we used SQL functions, every customer would
-have an order query evaluated for it, and there may be no multi-column
-index on customer and date, or even if there is, this could be many
-query executions.
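A minimal sketch of the temp-table route for the customer query above,
assuming in-memory arrays and dates stored as YYYYMMDD integers. Order,
build_temp_custids and count_customers_in are hypothetical names, and linear
scans stand in for the index on 'date' and for the join method. The subquery
runs once, keeps each custid only once, and the result is then joined to the
customer list.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Order { int custid; int date; } Order;   /* date as YYYYMMDD */

/* Run the subquery once: distinct custids with an order in [from, to].
 * Returns the number of ids written to temp (the "temp table"). */
size_t build_temp_custids(const Order *orders, size_t n_orders,
                          int from, int to, int *temp, size_t temp_cap)
{
    size_t n_temp = 0;
    for (size_t i = 0; i < n_orders; i++) {
        if (orders[i].date < from || orders[i].date > to)
            continue;
        size_t k = 0;
        while (k < n_temp && temp[k] != orders[i].custid)
            k++;
        if (k == n_temp) {               /* DISTINCT: keep each custid once */
            assert(n_temp < temp_cap);
            temp[n_temp++] = orders[i].custid;
        }
    }
    return n_temp;
}

/* Join customer ids against the temp table; counts qualifying customers. */
size_t count_customers_in(const int *custids, size_t n_cust,
                          const int *temp, size_t n_temp)
{
    size_t matches = 0;
    for (size_t i = 0; i < n_cust; i++)
        for (size_t k = 0; k < n_temp; k++)
            if (custids[i] == temp[k]) {
                matches++;
                break;                   /* IN matches a customer at most once */
            }
    return matches;
}
```

The orders array is scanned once here, whereas evaluating the subquery as an
SQL function would scan it once per customer.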
-
-
-> 
-> Note that for query
-> 
->         select *
->         from table_a
->         where table_a.col_a in (select table_b.col_b * table_a.col_c
->                         from table_b)
-> 
-> it's better to do
-> 
->  select distinct table_a.col_a
->  into table table_sub
->  from table_b, table_a
->         where table_a.col_a = table_b.col_b * table_a.col_c
-
-Yes, I had not thought of cases where they are doing correlated column
-arithmetic, but it looks like this would work.
-
-> 
-> once again - to avoid Cartesians.
-> 
-> But what could we do for
-> 
->         select *
->         from table_a
->         where table_a.col_a = (select max(table_b.col_b * table_a.col_c)
->                         from table_b)
-
-OK, who wrote this horrible query? :-)
-
-Without a join of table_b and table_a, even an SQL function would die on
-this.  You have to take the current value table_a.col_c, and multiply by
-every value of table_b.col_b to get the maximum.
-
-Trying to do a temp table on this is certainly going to be a cartesian
-product, but using an SQL function is also going to be a cartesian
-product, except that the product is generated in small pieces instead of
-in one big query.  The SQL function example may eventually complete, but
-it will take forever to do so in cases where the temp table would bomb.
-
-I can recommend some SQL books for anyone who sends in a bug report on
-this query. :-)
-
-
-
-> ???
->  select max(table_b.col_b * table_a.col_c), table_a.col_a
->  into table table_sub
->  from table_b, table_a
->         group by table_a.col_a
-> 
-> first tries to sort sizeof(table_a) * sizeof(table_b) tuples...
-> For tables big and small with 100 000 and 1000 tuples 
-> 
-> select max(x*y), x from big, small group by x
-> 
-> "ate" all free 140M in my file system after 20 minutes (just for
-> sorting - nothing more) and was killed...
-> 
-> select x from big where x = cor(x);
-> (cor(int4) is 'select max($1*y) from small') takes 20 minutes -
-> this is bad too.
-
-Again, my feeling is that in cases where the temp table would bomb, the
-SQL function will be so slow that neither will be acceptable.
-
-> 
-> > >
-> > > Actually, I don't see any problems if we are going to process subselects
-> > > like sql-funcs: non-correlated subselects can be emulated by
-> > > funcs without args, for correlated subselects parser (analyze.c)
-> > > has to change all upper query references to $1, $2,...
-> > 
-> > Yes, logically, they are SQL functions, but aren't we going to see
-> > terrible performance in such circumstances.  My experience is that when
->   ^^^^^^^^^^^^^^^^^^^^
-> You're right.
-> 
-> > people are given subselects, they start to do huge jobs with them.
-> > 
-> > In fact, the final solution may be to have both methods available, and
-> > switch between them depending on the size of the query sets.  Each
-> > method has its advantages.  The function example lets the outside query
-> > be executed, and only calls the subquery when needed.
-> > 
-> > For large tables where the subselect is small and is the entire WHERE
-> > restriction, the SQL function gets called much too often.  A simple join
-> > of the subquery result and the large table would be much better.  This
-> > method also allows for sort/merge join of the subquery results, and
-> > index use.
-> 
-> ...keep thinking...
-> 
-> Vadim
-> 
-
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Thu Nov 20 00:09:18 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id AAA05239
-   for ; Thu, 20 Nov 1997 00:09:11 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id XAA13776; Wed, 19 Nov 1997 23:59:53 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 19 Nov 1997 23:58:49 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id XAA13599 for pgsql-hackers-outgoing; Wed, 19 Nov 1997 23:58:43 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id XAA13512 for ; Wed, 19 Nov 1997 23:58:16 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id XAA03103
-   for [email protected]; Wed, 19 Nov 1997 23:57:44 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] subselect
-To: [email protected] (PostgreSQL-development)
-Date: Wed, 19 Nov 1997 23:57:44 -0500 (EST)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-I am going to overhaul all the /parser files, and I may give subselects
-a try while I am in there.  This is where it is going to have to be done.
-
-Two things I think I need are:
-
-   temp tables that go away at the end of a statement, so if the
-query elog's out, the temp file gets destroyed
-
-   how do I implement "not in":
-
-       select * from a where x not in (select y from b)
-
-Using <> is not going to work because that returns multiple copies of a,
-one for every one that doesn't equal.  It is like we need not equals,
-but don't return multiple rows.
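A small sketch of the problem described above, using Python's built-in sqlite3 module as a stand-in for the backend (an assumption; the sample rows are also made up). Joining with <> yields one row per non-matching pair, so it both repeats rows and keeps rows that NOT IN would reject. A NOT EXISTS anti-join gives the NOT IN answer here because neither table contains NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (y INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3);
""")

# The intended result: rows of a whose x matches no y at all.
not_in = conn.execute(
    "SELECT x FROM a WHERE x NOT IN (SELECT y FROM b) ORDER BY x"
).fetchall()

# Naive join on <>: one output row for every (x, y) pair that differs.
ne_join = conn.execute(
    "SELECT x FROM a, b WHERE x <> y ORDER BY x"
).fetchall()

# Anti-join: keep a row only if no matching b row exists.
# Equivalent to NOT IN only when no NULLs are involved.
anti_join = conn.execute(
    "SELECT x FROM a WHERE NOT EXISTS "
    "(SELECT 1 FROM b WHERE b.y = a.x) ORDER BY x"
).fetchall()

print(not_in, ne_join, anti_join)
```

Adding DISTINCT to the <> join would remove the repeated 1 but would still return 2 and 3, so the problem is not only one of duplicates.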
-
-Any ideas?
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Thu Nov 20 10:00:59 1997
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id KAA22019
-   for ; Thu, 20 Nov 1997 10:00:56 -0500 (EST)
-Received: from golem.jpl.nasa.gov ([email protected] [128.149.70.168]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id JAA21662 for ; Thu, 20 Nov 1997 09:52:55 -0500 (EST)
-Received: from alumni.caltech.edu (localhost [127.0.0.1])
-   by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id GAA22754;
-   Thu, 20 Nov 1997 06:27:21 GMT
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Thu, 20 Nov 1997 06:27:21 +0000
-From: "Thomas G. Lockhart" 
-Organization: Caltech/JPL
-X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.30 i686)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: [HACKERS] subselect
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-> I am going to overhaul all the /parser files
-
-??
-
-> , and I may give subselects
-> a try while I am in there.  This is where it is going to have to be done.
-
-A first cut at the subselect syntax is already in gram.y. I'm sure that the
-e-mail you had sent which collected several items regarding subselects
-covers some of this topic. I've been thinking about subselects also, and
-had thought that there must be some existing mechanisms in the backend
-which can be used to help implement subselects. It seems to me that UNION
-might be a good thing to implement first, because it has a fairly
-well-defined set of behaviors:
-
-  select a union select b;
-
-chooses elements from a and from b and then sorts/uniques the result.
-
-  select a union all select b;
-
-chooses all elements from a and then adds all elements from b, keeping duplicates.
-
-  select a union select b union all select c;
-
-evaluates left to right, and first evaluates a union b, sorts/uniques, and
-then evaluates
-
-  (result) union all select c;
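The behaviors listed above can be checked on toy data. This is a sketch using Python's built-in sqlite3 module, which is an assumption standing in for the backend under discussion; the tables and rows are invented. In SQLite, UNION removes duplicates, UNION ALL keeps every row from both sides, and a chain of compound operators groups from left to right.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (f INTEGER);
    CREATE TABLE b (f INTEGER);
    INSERT INTO a VALUES (1), (1), (2);
    INSERT INTO b VALUES (2), (3);
""")

# UNION: duplicates are removed from the combined result.
union_rows = conn.execute(
    "SELECT f FROM a UNION SELECT f FROM b ORDER BY 1"
).fetchall()

# UNION ALL: every row from both inputs is kept.
union_all_rows = conn.execute(
    "SELECT f FROM a UNION ALL SELECT f FROM b ORDER BY 1"
).fetchall()

# Mixed chain: (a UNION b) is de-duplicated first, then all of a is appended.
mixed_rows = conn.execute(
    "SELECT f FROM a UNION SELECT f FROM b "
    "UNION ALL SELECT f FROM a ORDER BY 1"
).fetchall()

print(union_rows, union_all_rows, mixed_rows)
```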
-
-There are several types of subselects. Examples of some are:
-
-1) select a.f from a union select b.f from b order by 1;
-Needs temporary table(s), optional sort/unique, final order by.
-
-2) select a.f from a where a.f in (select b.f from b);
-Needs temporary table(s). "in" can first be implemented by count(*) > 0, but
-it would perform better to have the backend return after the first
-match.
-
-3) select a.f from a where exists (select b.f from b where b.f = a.f);
-Need to do the select and do a subselect on _each_ of the returned values?
-Again could use count(*) to help implement.
-
-This brings up the point that perhaps the backend needs a row-counting
-atomic operation and count(*) could be re-implemented using that. At the
-moment count(*) is transformed to a select of OID columns and does not
-quite work on table joins.
-
-I would think that outer joins could use some of these support routines
-also.
-
-                                                       - Tom
-
-> Two things I think I need are:
->
->         temp tables that go away at the end of a statement, so if the
-> query elog's out, the temp file gets destroyed
->
->         how do I implement "not in":
->
->                 select * from a where x not in (select y from b)
->
-> Using <> is not going to work because that returns multiple copies of a,
-> one for every one that doesn't equal.  It is like we need not equals,
-> but don't return multiple rows.
->
-> Any ideas?
->
-> --
-> Bruce Momjian
-> [email protected]
-
-
-
-
-From [email protected] Mon Dec 22 00:49:03 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id AAA13311
-   for ; Mon, 22 Dec 1997 00:49:01 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id AAA11930; Mon, 22 Dec 1997 00:45:41 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 22 Dec 1997 00:45:17 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id AAA11756 for pgsql-hackers-outgoing; Mon, 22 Dec 1997 00:45:14 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.5/8.7.5) with ESMTP id AAA11624 for ; Mon, 22 Dec 1997 00:44:57 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id AAA11605
-   for [email protected]; Mon, 22 Dec 1997 00:45:23 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] subselects
-To: [email protected] (PostgreSQL-development)
-Date: Mon, 22 Dec 1997 00:45:23 -0500 (EST)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-OK, a few questions:
-
-   Should we use sortmerge, so we can use our psort as temp tables,
-or do we use hashunique?
-
-   How do we pass the query to the optimizer?  How do we represent
-the range table for each, and the links between them in correlated
-subqueries?
-
-I have to think about this.  Comments are welcome.
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Mon Dec 22 02:01:27 1997
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id CAA20608
-   for ; Mon, 22 Dec 1997 02:01:25 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id BAA25136 for ; Mon, 22 Dec 1997 01:37:29 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id BAA25289; Mon, 22 Dec 1997 01:31:18 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 22 Dec 1997 01:30:45 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id BAA23854 for pgsql-hackers-outgoing; Mon, 22 Dec 1997 01:30:35 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.5/8.7.5) with ESMTP id BAA22847 for ; Mon, 22 Dec 1997 01:30:15 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id BAA17354
-   for [email protected]; Mon, 22 Dec 1997 01:05:04 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] subselects (fwd)
-To: [email protected] (PostgreSQL-development)
-Date: Mon, 22 Dec 1997 01:05:03 -0500 (EST)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Forwarded message:
-> OK, a few questions:
-> 
->  Should we use sortmerge, so we can use our psort as temp tables,
-> or do we use hashunique?
-> 
->  How do we pass the query to the optimizer?  How do we represent
-> the range table for each, and the links between them in correlated
-> subqueries?
-> 
-> I have to think about this.  Comments are welcome.
-
-One more thing.  I guess I am seeing subselects as a different thing
-than temp tables.  I can see people wanting to put indexes on their temp
-tables, so I think they will need more system catalog support.  For
-subselects, I think we can just stuff them into psort, perhaps, and do
-the unique as we unload them.
-
-Seems like a natural to me.
-
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Tue Dec 23 04:01:07 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id EAA08876
-   for ; Tue, 23 Dec 1997 04:00:57 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id QAA23042;
-   Tue, 23 Dec 1997 16:08:56 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 23 Dec 1997 16:08:56 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: [HACKERS] subselects (fwd)
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> Forwarded message:
-> > OK, a few questions:
-> >
-> >       Should we use sortmerge, so we can use our psort as temp tables,
-> > or do we use hashunique?
-> >
-> >       How do we pass the query to the optimizer?  How do we represent
-> > the range table for each, and the links between them in correlated
-> > subqueries?
-> >
-> > I have to think about this.  Comments are welcome.
-> 
-> One more thing.  I guess I am seeing subselects as a different thing
-> than temp tables.  I can see people wanting to put indexes on their temp
-> tables, so I think they will need more system catalog support.  For
-                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-What's the difference between temp tables and temp indices ?
-Both of them are handled via catalog cache...
-
-Vadim
-
-From [email protected] Sat Jan  3 04:01:00 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id EAA28565
-   for ; Sat, 3 Jan 1998 04:00:58 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id DAA19242 for ; Sat, 3 Jan 1998 03:47:07 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id QAA21017;
-   Sat, 3 Jan 1998 16:08:55 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sat, 03 Jan 1998 16:08:51 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian ,
-        "Thomas G. Lockhart" 
-Subject: Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> With UNIONs done, how are things going with you on subselects?  UNIONs
-> are much easier than subselects.
-> 
-> I am stumped on how to record the subselect query information in the
-> parser and stuff.
-
-   So am I. We definitely need an EXISTS node and maybe an IN one.
-Also, we have to support the ANY and ALL modifiers of comparison operators
-(it would be nice to support ANY and ALL for all operators returning
-bool: >, =, ..., like, ~ and so on). Note that IN is the same as
-= ANY (NOT IN ==> <> ALL) assuming that '=' means EQUAL for all data types,
-so we could avoid an IN node, but I'm not sure that I like such an
-assumption: postgres is an OO-like system that allows operators to be
-overridden, so '=' can, in theory, mean not EQUAL but something else
-(someday we could allow the "meaning" of an operator to be specified in
-CREATE OPERATOR) - in short, I would like an IN node.
-   Also, I would suggest nodes for ANY and ALL.
-   (I need a few days to think more about recording this stuff...)
-
-> 
-> Please let me know what I can do to help, if anything.
-
-Thanks. As I remember, Tom also wished to work here. Tom ?
-
-Bye,
-   Vadim
-
-P.S. I'll be "on-line" Jan 5.
-
-From [email protected] Mon Jan  5 07:30:51 1998
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id HAA05466
-   for ; Mon, 5 Jan 1998 07:30:49 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id HAA04700; Mon, 5 Jan 1998 07:22:06 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 05 Jan 1998 07:21:45 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id HAA02846 for pgsql-hackers-outgoing; Mon, 5 Jan 1998 07:21:35 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by hub.org (8.8.5/8.7.5) with ESMTP id HAA00903 for ; Mon, 5 Jan 1998 07:20:57 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id TAA24278;
-   Mon, 5 Jan 1998 19:36:06 +0700 (KRS)
-   (envelope-from [email protected])
-Message-ID: <[email protected]>
-Date: Mon, 05 Jan 1998 19:35:59 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: [HACKERS] subselect
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> I was thinking about subselects, and how to attach the two queries.
-> 
-> What if the subquery makes a range table entry in the outer query, and
-> the query is set up like the UNION queries where we put the scans in a
-> row, but in this case we put them over/under each other.
-> 
-> And we push a temp table into the catalog cache that represents the
-> result of the subquery, then we could join to it in the outer query as
-> though it was a real table.
-> 
-> Also, can't we do the correlated subqueries by adding the proper
-> target/output columns to the subquery, and have the outer query
-> reference those columns in the subquery range table entry.
-
-Yes, this is a way to handle subqueries by joining to a temp table.
-After getting the plan we could change the temp table access path to a
-Material node. On the other hand, it could be useful to let the optimizer
-know about the cost of temp table creation (have to think more about it)...
-Unfortunately, not all subqueries can be handled by "normal" joins: NOT IN
-is one example of this - joining by <> will give us invalid results.
-Setting a special NOT EQUAL flag is not enough: the subquery plan must
-always be the inner one in this case. The same goes for handling the ALL
-modifier. Note that we generally can't use aggregates here: we can't add
-MAX to the subquery in the case of > ALL (subquery), because > ALL should
-return FALSE if the subquery returns NULL(s) but aggregates don't take
-NULLs into account.
-
-> 
-> Maybe I can write up a sample of this?  Vadim, would this help?  Is this
-> the point we are stuck at?
-
-Personally, I was stuck by holidays -:)
-Now I can spend ~ 8 hours ~ each day for development...
-
-Vadim
-
-
-From [email protected] Mon Jan  5 10:45:30 1998
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id KAA10769
-   for ; Mon, 5 Jan 1998 10:45:28 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id KAA17823; Mon, 5 Jan 1998 10:32:00 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 05 Jan 1998 10:31:45 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id KAA17757 for pgsql-hackers-outgoing; Mon, 5 Jan 1998 10:31:38 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.5/8.7.5) with ESMTP id KAA17727 for ; Mon, 5 Jan 1998 10:31:06 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id KAA10375;
-   Mon, 5 Jan 1998 10:28:48 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] subselect
-To: [email protected] (Vadim B. Mikheev)
-Date: Mon, 5 Jan 1998 10:28:48 -0500 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Jan 5, 98 07:35:59 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> Yes, this is a way to handle subqueries by joining to temp table.
-> After getting plan we could change temp table access path to
-> node material. On the other hand, it could be useful to let optimizer
-> know about cost of temp table creation (have to think more about it)...
-> Unfortunately, not all subqueries can be handled by "normal" joins: NOT IN
-> is one example of this - joining by <> will give us invalid results.
-> Setting special NOT EQUAL flag is not enough: subquery plan must be
-> always inner one in this case. The same for handling ALL modifier.
-> Note, that we generally can't use aggregates here: we can't add MAX to
-> subquery in the case of > ALL (subquery), because > ALL should return FALSE
-> if subquery returns NULL(s) but aggregates don't take NULLs into account.
-
-OK, here are my ideas.  First, I think you have to handle subselects in
-the outer node because a subquery could have its own subquery.  Also, we
-now have a field in Aggreg to allow us to 'usenulls'.
-
-OK, here it is.  I recommend we pass the outer and subquery through
-the parser and optimizer separately.
-
-We parse the subquery first.  If the subquery is not correlated, it
-should parse fine.  If it is correlated, any columns we find in the
-subquery that are not already in the FROM list, we add the table to the
-subquery FROM list, and add the referenced column to the target list of
-the subquery.
-
-When we are finished parsing the subquery, we create a catalog cache
-entry for it called 'sub1' and make its fields match the target
-list of the subquery.
-
-In the outer query, we add 'sub1' to its FROM list, and change
-the subquery reference to point to the new range table.  We also add
-WHERE clauses to do any correlated joins.
-
-Here is a simple example:
-
-   select *
-   from taba
-   where col1 = (select col2
-             from tabb)
-
-This is not correlated, and the subquery parses easily.  We create a
-'sub1' catalog cache entry, and add 'sub1' to the outer query FROM
-clause.  We also replace 'col1 = (subquery)' with 'col1 = sub1.col2'.
-
-Here is a more complex correlated subquery:
-
-   select *
-   from taba
-   where col1 = (select col2
-             from tabb
-             where taba.col3 = tabb.col4)
-
-Here we must add 'taba' to the subquery's FROM list, and add col3 to the
-target list of the subquery.  After we parse the subquery, add 'sub1' to
-the FROM list of the outer query, change 'col1 = (subquery)' to 'col1 =
-sub1.col2', and add to the outer WHERE clause 'AND taba.col3 = sub1.col3'.
-The optimizer will do the correlation for us.
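A sketch of this rewrite on toy data, run with Python's built-in sqlite3 module (an assumption; the thread concerns the PostgreSQL parser and optimizer, and the sample rows are invented). The derived table `sub1` stands in for the proposed catalog cache entry: it is the subquery with 'taba' added to its FROM list and col3 added to its target list. A DISTINCT is added here so the join cannot multiply outer rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taba (col1 INTEGER, col3 INTEGER);
    CREATE TABLE tabb (col2 INTEGER, col4 INTEGER);
    INSERT INTO taba VALUES (10, 1), (20, 2), (30, 3);
    INSERT INTO tabb VALUES (10, 1), (99, 2), (30, 3);
""")

# Original correlated form: the subquery refers to the outer taba.col3.
correlated = conn.execute("""
    SELECT col1, col3 FROM taba
    WHERE col1 = (SELECT col2 FROM tabb WHERE taba.col3 = tabb.col4)
    ORDER BY col1
""").fetchall()

# Rewritten form: sub1 carries col3 so the outer query can join on it.
rewritten = conn.execute("""
    SELECT taba.col1, taba.col3
    FROM taba,
         (SELECT DISTINCT tabb.col2, taba.col3
          FROM tabb, taba
          WHERE taba.col3 = tabb.col4) AS sub1
    WHERE taba.col1 = sub1.col2
      AND taba.col3 = sub1.col3
    ORDER BY taba.col1
""").fetchall()

print(correlated, rewritten)
```

The two forms agree on this data because each outer row matches at most one tabb row. The join form does not check that the subquery yields a single tuple, so that check would have to be made separately.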
-
-In the optimizer, we can parse the subquery first, then the outer query,
-and then replace all 'sub1' references in the outer query to use the
-subquery plan.
-
-I realize merging the two plans and doing IN and NOT IN is the
-real challenge, but I hoped this would give us a start.
-
-What do you think?
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Mon Jan  5 15:02:46 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id PAA28690
-   for ; Mon, 5 Jan 1998 15:02:44 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id OAA08811 for ; Mon, 5 Jan 1998 14:28:43 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id CAA24904;
-   Tue, 6 Jan 1998 02:56:00 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 06 Jan 1998 02:55:57 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected]
-Subject: Re: [HACKERS] subselect
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > always inner one in this case. The same for handling ALL modifier.
-> > Note, that we generally can't use aggregates here: we can't add MAX to
-> > subquery in the case of > ALL (subquery), because > ALL should return FALSE
-> > if subquery returns NULL(s) but aggregates don't take NULLs into account.
-> 
-> OK, here are my ideas.  First, I think you have to handle subselects in
-> the outer node because a subquery could have its own subquery.  Also, we
-
-I hope that this doesn't matter: if the results of a subquery (with/without sub-subqueries)
-go into a temp table then this table will be re-scanned for each outer tuple.
-
-> now have a field in Aggreg to all us to 'usenulls'.
-                                           ^^^^^^^^
- This can't help:
-
-vac=> select * from x;
-y
--
-1
-2
-3
- <<< this is NULL
-(4 rows)
-
-vac=> select max(y) from x;
-max
----
-  3
-
-==> we can't replace 
-
-select * from A where A.a > ALL (select y from x);
-                                 ^^^^^^^^^^^^^^^
-           (NULL will be returned and so A.a > ALL is FALSE - this is what 
-            Sybase does, is it right ?)
-with
-
-select * from A where A.a > (select max(y) from x);
-                             ^^^^^^^^^^^^^^^^^^^^
-simply because we lose knowledge about NULLs here.
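A short sketch of why the two forms differ. SQLite has no ALL quantifier, so the helper `gt_all` below is a hypothetical Python model of `value > ALL (rows)` under SQL three-valued logic, with None standing for NULL; only max() is run through sqlite3. Under standard three-valued logic the quantified comparison comes out UNKNOWN rather than FALSE when a NULL is present and no comparison fails, but either way a WHERE clause rejects the row, while the max() form accepts it.

```python
import sqlite3


def gt_all(value, rows):
    """Model of `value > ALL (rows)`: True, False, or None for UNKNOWN."""
    saw_unknown = False
    for y in rows:
        if value is None or y is None:
            saw_unknown = True      # comparison with NULL is UNKNOWN
        elif not value > y:
            return False            # one failed comparison decides it
    return None if saw_unknown else True


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (y INTEGER);
    INSERT INTO x VALUES (1), (2), (3), (NULL);
""")

ys = [row[0] for row in conn.execute("SELECT y FROM x")]
max_y = conn.execute("SELECT max(y) FROM x").fetchone()[0]  # NULL is ignored

a = 5
via_max = a > max_y        # the max() rewrite accepts the row
via_all = gt_all(a, ys)    # the quantified form does not

print(max_y, via_max, via_all)
```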
-
-Also, I would like to handle ANY and ALL modifiers for all bool
-operators, either built-in or user-defined, for all data types -
-isn't PostgreSQL an OO-like RDBMS? -:)
-
-> OK, here it is.  I recommend we pass the outer and subquery through
-> the parser and optimizer separately.
-
-I don't like this. I would like to get the parse-tree from the parser for
-the entire query and let the optimizer (on the upper level) decide how to
-rewrite the parse-tree, what plans to produce and how these plans should be
-merged. Note that I don't object to your methods below, only to where
-the handling of this is placed. I don't understand why we should add a
-new part to the system which will do the optimizer's work (parse-tree -->
-execution plan) and deal with optimizer nodes. Imho, the upper optimizer
-level is a nice place to do this.
-
-> 
-> We parse the subquery first.  If the subquery is not correlated, it
-> should parse fine.  If it is correlated, any columns we find in the
-> subquery that are not already in the FROM list, we add the table to the
-> subquery FROM list, and add the referenced column to the target list of
-> the subquery.
-> 
-> When we are finished parsing the subquery, we create a catalog cache
-> entry for it called 'sub1' and make its fields match the target
-> list of the subquery.
-> 
-> In the outer query, we add 'sub1' to its target list, and change
-> the subquery reference to point to the new range table.  We also add
-> WHERE clauses to do any correlated joins.
-...
-> Here is a more complex correlated subquery:
-> 
->         select *
->         from taba
->         where col1 = (select col2
->                       from tabb
->                       where taba.col3 = tabb.col4)
-> 
-> Here we must add 'taba' to the subquery's FROM list, and add col3 to the
-> target list of the subquery.  After we parse the subquery, add 'sub1' to
-> the FROM list of the outer query, change 'col1 = (subquery)' to 'col1 =
-> sub1.col2', and add to the outer WHERE clause 'AND taba.col3 = sub1.col3'.
-> The optimizer will do the correlation for us.
-> 
-> In the optimizer, we can parse the subquery first, then the outer query,
-> and then replace all 'sub1' references in the outer query to use the
-> subquery plan.
-> 
-> I realize making merging the two plans and doing IN and NOT IN is the
-                   ^^^^^^^^^^^^^^^^^^^^^
-This is very easy to do! As I already said, we just have to replace the sub1
-access path (SeqScan of sub1) with a SeqScan of a Material node with the
-subquery plan.
-
-> real challenge, but I hoped this would give us a start.
-
-A decision about how to record the subquery stuff in the parse-tree
-would be a very good start -:)
-
-BTW, note that for _expression_ subqueries (which are introduced without
-IN, EXISTS, ALL, ANY - this follows Sybase's naming) - as in your examples -
-we have to check that the subquery returns a single tuple...
-
-Vadim
-
-From [email protected] Mon Jan  5 20:31:03 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id UAA06836
-   for ; Mon, 5 Jan 1998 20:31:01 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id TAA29980 for ; Mon, 5 Jan 1998 19:56:05 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id TAA28044; Mon, 5 Jan 1998 19:06:11 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 05 Jan 1998 19:03:16 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id TAA27203 for pgsql-hackers-outgoing; Mon, 5 Jan 1998 19:03:02 -0500 (EST)
-Received: from clio.trends.ca ([email protected] [209.47.148.2]) by hub.org (8.8.8/8.7.5) with ESMTP id TAA27049 for ; Mon, 5 Jan 1998 19:02:30 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67])
-   by clio.trends.ca (8.8.8/8.8.8) with ESMTP id RAA09337
-   for ; Mon, 5 Jan 1998 17:31:04 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id RAA02675;
-   Mon, 5 Jan 1998 17:16:40 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] subselect
-To: [email protected] (Vadim B. Mikheev)
-Date: Mon, 5 Jan 1998 17:16:40 -0500 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Jan 6, 98 05:18:11 am
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> > I am confused.  Do you want one flat query and want to pass the whole
-> > thing into the optimizer?  That brings up some questions:
-> 
-> No. I just want to follow Tom's way: I would like to see new
-> SubSelect node as shortened version of struct Query (or use
-> Query structure for each subquery - no matter for me), some 
-> subquery-related stuff added to Query (and SubSelect) to help
-> optimizer to start, and see
-
-OK, so you want the subquery to actually be INSIDE the outer query
-expression.  Do they share a common range table?  If they don't, we
-could very easily just fly through when processing the WHERE clause, and
-start a new query using a new query structure for the subquery.  Believe
-me, you don't want a separate SubQuery-type, just re-use Query for it. 
-It allows you to call all the normal query stuff with a consistent
-structure.
-
-The parser will need to know it is in a subquery, so it can add the
-proper target columns to the subquery, or are you going to do that in
-the optimizer?  You can do it in the optimizer, and join the range table
-references there too.
-
-> 
-> typedef struct A_Expr
-> {
->     NodeTag     type;
->     int         oper;           /* type of operation
->                                  * {OP,OR,AND,NOT,ISNULL,NOTNULL} */
->     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
->             IN, NOT IN, ANY, ALL, EXISTS here,
-> 
->     char       *opname;         /* name of operator/function */
->     Node       *lexpr;          /* left argument */
->     Node       *rexpr;          /* right argument */
->     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
->             and SubSelect (Query) here (as possible case).
-> 
-> One thought to follow this way: RULEs (and so - VIEWs) are handled by using
-> Query - how else can we implement VIEWs on selects with subqueries ?
-
-Views are stored as nodeout structures, and are merged into the query's
-from list, target list, and where clause.  I am working out
-readfunc,outfunc now to make sure they are up-to-date with all the
-current fields.
-
-> 
-> BTW, is
-> 
-> select * from A where (select TRUE from B);
-> 
-> valid syntax ?
-
-I don't think so.
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Mon Jan  5 17:01:54 1998
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id RAA02066
-   for ; Mon, 5 Jan 1998 17:01:47 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id FAA25063;
-   Tue, 6 Jan 1998 05:18:13 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 06 Jan 1998 05:18:11 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected]
-Subject: Re: [HACKERS] subselect
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > > OK, here it is.  I recommend we pass the outer and subquery through
-> > > the parser and optimizer separately.
-> >
-> > I don't like this. I would like to get parse-tree from parser for
-> > entire query and let optimizer (on upper level) decide how to rewrite
-> > parse-tree and what plans to produce and how these plans should be
-> > merged. Note, that I don't object your methods below, but only where
-> > to place handling of this. I don't understand why should we add
-> > new part to the system which will do optimizer' work (parse-tree -->
-> > execution plan) and deal with optimizer nodes. Imho, upper optimizer
-> > level is nice place to do this.
-> 
-> I am confused.  Do you want one flat query and want to pass the whole
-> thing into the optimizer?  That brings up some questions:
-
-No. I just want to follow Tom's way: I would like to see new
-SubSelect node as shortened version of struct Query (or use
-Query structure for each subquery - no matter for me), some 
-subquery-related stuff added to Query (and SubSelect) to help
-optimizer to start, and see
-
-typedef struct A_Expr
-{
-    NodeTag     type;
-    int         oper;           /* type of operation
-                                 * {OP,OR,AND,NOT,ISNULL,NOTNULL} */
-    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-            IN, NOT IN, ANY, ALL, EXISTS here,
-
-    char       *opname;         /* name of operator/function */
-    Node       *lexpr;          /* left argument */
-    Node       *rexpr;          /* right argument */
-    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-            and SubSelect (Query) here (as possible case).
-
-One thought to follow this way: RULEs (and so - VIEWs) are handled by using
-Query - how else can we implement VIEWs on selects with subqueries ?
-
-BTW, is
-
-select * from A where (select TRUE from B);
-
-valid syntax ?
-
-Vadim
-
-From [email protected] Mon Jan  5 18:00:57 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id SAA03296
-   for ; Mon, 5 Jan 1998 18:00:55 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id RAA20716 for ; Mon, 5 Jan 1998 17:22:21 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id FAA25094;
-   Tue, 6 Jan 1998 05:49:02 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 06 Jan 1998 05:48:58 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Goran Thyni 
-CC: [email protected], [email protected]
-Subject: Re: [HACKERS] subselect
-References: <[email protected]> <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Goran Thyni wrote:
-> 
-> Vadim,
-> 
->    Unfortunately, not all subqueries can be handled by "normal" joins: NOT IN
->    is one example of this - joining by <> will give us invalid results.
-> 
-> What is your approach towards this problem?
-
-Actually, this is a problem of the ALL modifier (NOT IN is _not_equal_ ALL),
-so we need not just a NOT EQUAL flag but some ALL node
-with a modified operator.
-
-After that, one way is to put the subquery into the inner plan of a join node
-to be sure that for an outer tuple all corresponding subquery tuples
-will be tested with the modified operator (this will require either
-changing the code of all join nodes or adding a new plan type - we'll see)
-and another way is ... suggested by you:
-
-> I got an idea that one could reverse the order,
-> that is execute the outer first into a temptable
-> and delete from that according to the result of the
-> subquery and then return it.
-> Probably this is too raw and slow. ;-)
-
-This will be faster in some cases (when the subquery returns many results
-and there are "not so many" results from the outer query) - thanks for the idea!
-
-> 
->    Personally, I was held up by the holidays -:)
->    Now I can spend ~ 8 hours ~ each day for development...
-> 
-> Oh, isn't it Christmas Eve right now in Russia?
-
-Due to historic reasons New Year is a mu-u-u-uch more popular
-holiday in Russia -:)
-
-Vadim
-
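A minimal C sketch of the point made in the message above: NOT IN behaves like "<> ALL", so it cannot be planned as an ordinary join on <>. The function names are hypothetical, plain integers stand in for tuples, and NULL handling is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* x <> ALL (subquery): true only if x differs from every subquery value.
 * This is the NOT IN behaviour (ignoring NULLs in this sketch). */
bool not_equal_all(int x, const int *sub, size_t n)
{
    assert(sub != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (x == sub[i])
            return false;   /* one match is enough to reject x */
    return true;
}

/* What a plain join on <> computes instead: the outer value is kept
 * as soon as ANY subquery value differs from it. */
bool joined_by_not_equal(int x, const int *sub, size_t n)
{
    assert(sub != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (x != sub[i])
            return true;
    return false;
}
```

For an outer value 2 and subquery values {1, 2, 3}, the first function rejects the row while the second keeps it, which is the invalid result the thread refers to.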
-From [email protected] Mon Jan  5 18:00:59 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id SAA03300
-   for ; Mon, 5 Jan 1998 18:00:57 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id RAA21652 for ; Mon, 5 Jan 1998 17:42:15 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id GAA25129;
-   Tue, 6 Jan 1998 06:10:05 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 06 Jan 1998 06:09:56 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected]
-Subject: Re: [HACKERS] subselect
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > > I am confused.  Do you want one flat query and want to pass the whole
-> > > thing into the optimizer?  That brings up some questions:
-> >
-> > No. I just want to follow Tom's way: I would like to see new
-> > SubSelect node as shortened version of struct Query (or use
-> > Query structure for each subquery - no matter for me), some
-> > subquery-related stuff added to Query (and SubSelect) to help
-> > optimizer to start, and see
-> 
-> OK, so you want the subquery to actually be INSIDE the outer query
-> expression.  Do they share a common range table?  If they don't, we
-               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-No.
-
-> could very easily just fly through when processing the WHERE clause, and
-> start a new query using a new query structure for the subquery.  Believe
-   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-... and filling some subquery-related stuff in upper query structure -
-still don't know what exactly this could be -:)
-
-> me, you don't want a separate SubQuery-type, just re-use Query for it.
-> It allows you to call all the normal query stuff with a consistent
-> structure.
-
-No objections.
-
-> 
-> The parser will need to know it is in a subquery, so it can add the
-> proper target columns to the subquery, or are you going to do that in
-
-I don't think we need that, but a list of correlation clauses
-could be a good thing - after all, the parser has to check all column
-references...
-
-> the optimizer.  You can do it in the optimizer, and join the range table
-> references there too.
-
-Yes.
-
-> > typedef struct A_Expr
-> > {
-> >     NodeTag     type;
-> >     int         oper;           /* type of operation
-> >                                  * {OP,OR,AND,NOT,ISNULL,NOTNULL} */
-> >     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-> >             IN, NOT IN, ANY, ALL, EXISTS here,
-> >
-> >     char       *opname;         /* name of operator/function */
-> >     Node       *lexpr;          /* left argument */
-> >     Node       *rexpr;          /* right argument */
-> >     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-> >             and SubSelect (Query) here (as possible case).
-> >
-> > One thought to follow this way: RULEs (and so - VIEWs) are handled by using
-> > Query - how else can we implement VIEWs on selects with subqueries ?
-> 
-> Views are stored as nodeout structures, and are merged into the query's
-> from list, target list, and where clause.  I am working out
-> readfunc,outfunc now to make sure they are up-to-date with all the
-> current fields.
-
-Nice! This stuff has been out of date for too long.
-
-> > BTW, is
-> >
-> > select * from A where (select TRUE from B);
-> >
-> > valid syntax ?
-> 
-> I don't think so.
-
-And so *rexpr can be of Query type only when oper is one of OP, IN, NOT IN,
-ANY, ALL, EXISTS - good.
-
-(Time to sleep -:)
-
-Vadim
-
-From [email protected] Thu Jan  8 23:10:50 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id XAA09707
-   for ; Thu, 8 Jan 1998 23:10:48 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id XAA19334 for ; Thu, 8 Jan 1998 23:08:49 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id XAA14375; Thu, 8 Jan 1998 23:03:29 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 08 Jan 1998 23:03:10 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id XAA14345 for pgsql-hackers-outgoing; Thu, 8 Jan 1998 23:03:06 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.8/8.7.5) with ESMTP id XAA14008 for ; Thu, 8 Jan 1998 23:00:50 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id WAA09243;
-   Thu, 8 Jan 1998 22:55:03 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Thu, 8 Jan 1998 22:55:03 -0500 (EST)
-Cc: [email protected] (PostgreSQL-development)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Vadim, I know you are still thinking about subselects, but I have some
-more clarification that may help.
-
-We have to add phantom range table entries to correlated subselects so
-they will pass the parser.  We might as well add those fields to the
-target list of the subquery at the same time:
-
-   select *
-   from taba
-   where col1 = (select col2
-             from tabb
-             where taba.col3 = tabb.col4)
-
-becomes:
-
-   select *
-   from taba
-   where col1 = (select col2, tabb.col4 <---
-             from tabb, taba  <---
-             where taba.col3 = tabb.col4)
-
-We add a field to TargetEntry and RangeTblEntry to mark the fact that it
-was entered as a correlation entry:
-
-   bool    isCorrelated;
-
-Second, we need to hook the subselect to the main query.  I recommend we
-add two fields to Query for this:
-
-   Query *parentQuery;
-   List *subqueries;
-
-The parentQuery pointer is used to resolve field names in the correlated
-subquery.
-
-   select *
-   from taba
-   where col1 = (select col2, tabb.col4 <---
-             from tabb, taba  <---
-             where taba.col3 = tabb.col4)
-
-In the query above, the subquery can be easily parsed, and we add the
-subquery to the parent's subqueries list.
-
-In the parent query, to parse the WHERE clause, we create a new operator
-type, called IN or NOT_IN, or ALL, where the left side is a Var, and the
-right side is an index to a slot in the subqueries List.
-
-We can then do the rest in the upper optimizer.
-
--- 
-Bruce Momjian
-[email protected]
-
-
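A rough C sketch of the two proposed Query fields from the message above, assuming a heavily simplified Query: a fixed-size array stands in for the List, the range table is just an array of table names, and add_subquery() and resolve_table() are hypothetical helpers, not existing backend functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SUBQUERIES 4

/* Hypothetical, heavily simplified stand-in for the real Query node. */
typedef struct Query
{
    const char  **rtable;       /* range table: table names in FROM */
    size_t        rtable_len;
    struct Query *parentQuery;  /* enclosing query, NULL at top level */
    struct Query *subqueries[MAX_SUBQUERIES];   /* stands in for List * */
    size_t        nsubqueries;
} Query;

/* Hook sub under parent; returns its slot index in parent->subqueries,
 * which is what the right side of the IN/ALL operator would store. */
int add_subquery(Query *parent, Query *sub)
{
    assert(parent->nsubqueries < MAX_SUBQUERIES);
    sub->parentQuery = parent;
    parent->subqueries[parent->nsubqueries] = sub;
    return (int) parent->nsubqueries++;
}

/* Resolve a table name by searching this query's range table first,
 * then walking up the parentQuery chain. Returns NULL if not found. */
Query *resolve_table(Query *q, const char *name)
{
    for (; q != NULL; q = q->parentQuery)
        for (size_t i = 0; i < q->rtable_len; i++)
            if (strcmp(q->rtable[i], name) == 0)
                return q;
    return NULL;
}
```

With taba in the outer range table and tabb in the subquery's, a lookup of taba from inside the subquery resolves to the outer Query through parentQuery.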
-From [email protected] Fri Jan  9 10:01:01 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id KAA27305
-   for ; Fri, 9 Jan 1998 10:00:59 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id JAA21583 for ; Fri, 9 Jan 1998 09:52:17 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id WAA01623;
-   Fri, 9 Jan 1998 22:10:25 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 09 Jan 1998 22:10:06 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> Vadim, I know you are still thinking about subselects, but I have some
-> more clarification that may help.
-> 
-> We have to add phantom range table entries to correlated subselects so
-> they will pass the parser.  We might as well add those fields to the
-> target list of the subquery at the same time:
-> 
->         select *
->         from taba
->         where col1 = (select col2
->                       from tabb
->                       where taba.col3 = tabb.col4)
-> 
-> becomes:
-> 
->         select *
->         from taba
->         where col1 = (select col2, tabb.col4 <---
->                       from tabb, taba  <---
->                       where taba.col3 = tabb.col4)
-> 
-> We add a field to TargetEntry and RangeTblEntry to mark the fact that it
-> was entered as a correlation entry:
-> 
->         bool    isCorrelated;
-
-No, I would rather not add anything in the parser. Example:
-
-        select *
-        from tabA
-        where col1 = (select col2
-                      from tabB
-                      where tabA.col3 = tabB.col4
-                      and exists (select * 
-                                  from tabC 
-                                  where tabB.colX = tabC.colX and
-                                        tabC.colY = tabA.col2)
-                     )
-
-: a column of tabA is referenced in the sub-subselect
-(is this allowed by the standard ?) - in this case it's better
-not to add tabA to the 1st subselect but to add tabA to the second one
-and change tabA.col3 in the 1st to reference col3 in the 2nd subquery's temp table -
-this gives us a 2-table join in the 1st subquery instead of a 3-table join.
-(And I'm still not sure that using temp tables is the best that can be
-done in all cases...)
-
-Instead of using isCorrelated in TE & RTE we can add 
-
-Index varlevel;
-
-to the Var node to identify the (sub)query this Var comes from
-(i.e. which range table to search for the Var's relation using varno). The topmost query
-will have varlevel = 0, all its direct children have varlevel = 1 and so on.
-                        ^^^                          ^^^^^^^^^^^^
-(I don't see problems with distinguishing Vars of different children
-on the same level...)
-
-> 
-> Second, we need to hook the subselect to the main query.  I recommend we
-> add two fields to Query for this:
-> 
->         Query *parentQuery;
->         List *subqueries;
-
-Agreed. And maybe Index queryLevel.
-
-> In the parent query, to parse the WHERE clause, we create a new operator
-> type, called IN or NOT_IN, or ALL, where the left side is a Var, and the
-                                               ^^^^^^^^^^^^^^^^^^
-No. We have to handle (a,b,c) OP (select x, y, z ...) and
-'_a_constant_' OP (select ...) - I don't know whether the latter is in the
-standard, but Sybase has it.
-
-Well,
-
-typedef enum OpType
-{
-    OP_EXPR, FUNC_EXPR, OR_EXPR, AND_EXPR, NOT_EXPR
-
-+ OP_EXISTS, OP_ALL, OP_ANY
-
-} OpType;
-
-typedef struct Expr
-{
-    NodeTag     type;
-    Oid         typeOid;        /* oid of the type of this expr */
-    OpType      opType;         /* type of the op */
-    Node       *oper;           /* could be Oper or Func */
-    List       *args;           /* list of argument nodes */
-} Expr;
-
-OP_EXISTS: oper is NULL, lfirst(args) is SubSelect (index in subqueries
-           List, following your suggestion)
-
-OP_ALL, OP_ANY:
-
-oper is a List of Oper nodes. We need a list because the data types of
-a, b, c (above) can differ, and so the Oper nodes will differ too.
-
-lfirst(args) is a List of expression nodes (Const, Var, Func ?, a + b ?) -
-the left side of the subquery's operator.
-lsecond(args) is the SubSelect.
-
-Note that there are no OP_IN, OP_NOTIN in the OpType-s for Expr. We need
-IN, NOTIN in A_Expr (the parser node), but the parser has to transform both
-of them into the corresponding ANY and ALL. At the moment we can do:
-
-IN --> = ANY, NOT IN --> <> ALL
-
-but this will be a "known bug": it breaks the OO nature of Postgres, because
-operators can be overridden and '=' can mean  s o m e t h i n g (not equality).
-Example: box data type. For boxes, = means equality of _areas_ and =~
-means that boxes are the same ==> =~ ANY should be used for IN.
-
-> right side is an index to a slot in the subqueries List.
-
-Vadim
-
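A small C sketch of the parser-side transformation described in the message above, where IN and NOT IN exist only at the parser level and become "= ANY" and "<> ALL". The ParserSubOp and SubExpr types and the transform_subop() helper are hypothetical; the hard-wired "=" is exactly the "known bug" mentioned for types such as box.

```c
#include <assert.h>
#include <stddef.h>

/* Subquery operators as the parser sees them (hypothetical names). */
typedef enum { P_IN, P_NOT_IN, P_ANY, P_ALL, P_EXISTS } ParserSubOp;

/* Subquery operators after transformation: no IN / NOT IN any more. */
typedef enum { OP_EXISTS, OP_ALL, OP_ANY } SubOpType;

typedef struct SubExpr
{
    SubOpType   opType;
    const char *opname;     /* comparison operator; NULL for EXISTS */
} SubExpr;

SubExpr transform_subop(ParserSubOp op, const char *opname)
{
    SubExpr e = { OP_EXISTS, NULL };

    switch (op)
    {
        case P_IN:          /* IN --> = ANY (hard-wired "=", see above) */
            e.opType = OP_ANY;
            e.opname = "=";
            break;
        case P_NOT_IN:      /* NOT IN --> <> ALL */
            e.opType = OP_ALL;
            e.opname = "<>";
            break;
        case P_ANY:
        case P_ALL:         /* keep the operator the user wrote */
            assert(opname != NULL);
            e.opType = (op == P_ANY) ? OP_ANY : OP_ALL;
            e.opname = opname;
            break;
        case P_EXISTS:      /* no operator, only the subquery */
            break;
    }
    return e;
}
```

A type-aware version would look up the equality operator for the left-hand type instead of assuming "=", which is how "=~ ANY" could be chosen for boxes.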
-From [email protected] Fri Jan  9 17:44:04 1998
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id RAA24779
-   for ; Fri, 9 Jan 1998 17:44:01 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id RAA20728; Fri, 9 Jan 1998 17:32:34 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 09 Jan 1998 17:32:19 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id RAA20503 for pgsql-hackers-outgoing; Fri, 9 Jan 1998 17:32:15 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.8/8.7.5) with ESMTP id RAA20008 for ; Fri, 9 Jan 1998 17:31:24 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id RAA24282;
-   Fri, 9 Jan 1998 17:31:41 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] Re: subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Fri, 9 Jan 1998 17:31:41 -0500 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Jan 9, 98 10:10:06 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> 
-> Bruce Momjian wrote:
-> > 
-> > Vadim, I know you are still thinking about subselects, but I have some
-> > more clarification that may help.
-> > 
-> > We have to add phantom range table entries to correlated subselects so
-> > they will pass the parser.  We might as well add those fields to the
-> > target list of the subquery at the same time:
-> > 
-> >         select *
-> >         from taba
-> >         where col1 = (select col2
-> >                       from tabb
-> >                       where taba.col3 = tabb.col4)
-> > 
-> > becomes:
-> > 
-> >         select *
-> >         from taba
-> >         where col1 = (select col2, tabb.col4 <---
-> >                       from tabb, taba  <---
-> >                       where taba.col3 = tabb.col4)
-> > 
-> > We add a field to TargetEntry and RangeTblEntry to mark the fact that it
-> > was entered as a correlation entry:
-> > 
-> >         bool    isCorrelated;
-> 
-> No, I would rather not add anything in the parser. Example:
-> 
->         select *
->         from tabA
->         where col1 = (select col2
->                       from tabB
->                       where tabA.col3 = tabB.col4
->                       and exists (select * 
->                                   from tabC 
->                                   where tabB.colX = tabC.colX and
->                                         tabC.colY = tabA.col2)
->                      )
-> 
-> : a column of tabA is referenced in sub-subselect 
-
-This is a strange case that I don't think we need to handle in our first
-implementation.
-
-> (is it allowable by standards ?) - in this case it's better 
-> to don't add tabA to 1st subselect but add tabA to second one
-> and change tabA.col3 in 1st to reference col3 in 2nd subquery temp table -
-> this gives us 2-tables join in 1st subquery instead of 3-tables join.
-> (And I'm still not sure that using temp tables is best of what can be 
-> done in all cases...)
-
-I don't see any use for temp tables in subselects anymore.  After having
-implemented UNIONS, I now see how much can be done in the upper
-optimizer.  I see you just putting the subquery PLAN into the proper
-place in the plan tree, with some proper JOIN nodes for IN, NOT IN.
-
-> 
-> Instead of using isCorrelated in TE & RTE we can add 
-> 
-> Index varlevel;
-
-OK.  Sounds good.
-
-> 
-> to the Var node to identify the (sub)query this Var comes from
-> (i.e. which range table to search for the Var's relation using varno). The topmost query
-> will have varlevel = 0, all its direct children have varlevel = 1 and so on.
->                         ^^^                          ^^^^^^^^^^^^
-> (I don't see problems with distinguishing Vars of different children
-> on the same level...)
-> 
-> > 
-> > Second, we need to hook the subselect to the main query.  I recommend we
-> > add two fields to Query for this:
-> > 
-> >         Query *parentQuery;
-> >         List *subqueries;
-> 
-> Agreed. And maybe Index queryLevel.
-
-Sure.  If it helps.
-
-> 
-> > In the parent query, to parse the WHERE clause, we create a new operator
-> > type, called IN or NOT_IN, or ALL, where the left side is a Var, and the
->                                                ^^^^^^^^^^^^^^^^^^
-> No. We have to handle (a,b,c) OP (select x, y, z ...) and
-> '_a_constant_' OP (select ...) - I don't know whether the latter is in the
-> standard, but Sybase has it.
-
-I have never seen this in my eight years of SQL.  Perhaps we can leave
-this for later, maybe much later.
-
-> 
-> Well,
-> 
-> typedef enum OpType
-> {
->     OP_EXPR, FUNC_EXPR, OR_EXPR, AND_EXPR, NOT_EXPR
-> 
-> + OP_EXISTS, OP_ALL, OP_ANY
-> 
-> } OpType;
-> 
-> typedef struct Expr
-> {
->     NodeTag     type;
->     Oid         typeOid;        /* oid of the type of this expr */
->     OpType      opType;         /* type of the op */
->     Node       *oper;           /* could be Oper or Func */
->     List       *args;           /* list of argument nodes */
-> } Expr;
-> 
-> OP_EXISTS: oper is NULL, lfirst(args) is SubSelect (index in subqueries
->            List, following your suggestion)
-> 
-> OP_ALL, OP_ANY:
-> 
-> oper is a List of Oper nodes. We need a list because the data types of
-> a, b, c (above) can differ, and so the Oper nodes will differ too.
-> 
-> lfirst(args) is a List of expression nodes (Const, Var, Func ?, a + b ?) -
-> the left side of the subquery's operator.
-> lsecond(args) is the SubSelect.
-> 
-> Note that there are no OP_IN, OP_NOTIN in the OpType-s for Expr. We need
-> IN, NOTIN in A_Expr (the parser node), but the parser has to transform both
-> of them into the corresponding ANY and ALL. At the moment we can do:
-> 
-> IN --> = ANY, NOT IN --> <> ALL
-> 
-> but this will be a "known bug": it breaks the OO nature of Postgres, because
-> operators can be overridden and '=' can mean  s o m e t h i n g (not equality).
-> Example: box data type. For boxes, = means equality of _areas_ and =~
-> means that boxes are the same ==> =~ ANY should be used for IN.
-
-That is interesting, to use =~ for ANY.
-
-Yes, but how many operators take a SUBQUERY as an operand?  This is a
-special case to me.
-
-I think I see where you are trying to go.  You want subselects to behave
-like any other operator, with a subselect type, and you do all the
-subselect handling in the optimizer, with special Nodes and actions.
-
-I think this may be just too much of a leap.  We have such clean query
-logic for single queries, I can't imagine having an operator that has a
-Query operand, and trying to get everything to properly handle it. 
-UNIONS were very easy to implement as a List off of Query, with some
-foreach()'s in rewrite and the high optimizer.
-
-Subselects are SQL standard, and are never going to be over-ridden by a
-user.  Same with UNION.  They want UNION, they get UNION.  They want
-Subselect, we are going to spin through the Query structure and give
-them what they want.
-
-The complexities of subselects and correlated queries and range tables
-and stuff is so bizarre that trying to get it to work inside the type
-system could be a huge project.
-
-> 
-> > right side is an index to a slot in the subqueries List.
-
-I guess the question is what can we have by February 1?
-
-I have been reading some postings, and it seems to me that subselects
-are the litmus test for many evaluators when deciding if a database
-engine is full-featured.
-
-Sorry to be so straightforward, but I want to keep hashing this around
-until we get a conclusion, so coding can start.
-
-My suggestions have been, I believe, trying to get subselects working
-with the fullest functionality by adding the least amount of code, and
-keeping the logic clean.
-
-Have you checked out the UNION code?  It is very small, but it works.  I
-think it could make a good sample for subselects.
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Sat Jan 10 12:00:51 1998
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id MAA28742
-   for ; Sat, 10 Jan 1998 12:00:43 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id AAA05684;
-   Sun, 11 Jan 1998 00:19:10 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sun, 11 Jan 1998 00:19:08 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected], "Thomas G. Lockhart" 
-Subject: Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > No, I don't like to add anything in parser. Example:
-> >
-> >         select *
-> >         from tabA
-> >         where col1 = (select col2
-> >                       from tabB
-> >                       where tabA.col3 = tabB.col4
-> >                       and exists (select *
-> >                                   from tabC
-> >                                   where tabB.colX = tabC.colX and
-> >                                         tabC.colY = tabA.col2)
-> >                      )
-> >
-> > : a column of tabA is referenced in sub-subselect
-> 
-> This is a strange case that I don't think we need to handle in our first
-> implementation.
-
-I don't know whether this is a strange case or not :)
-But I would like to know whether the standard allows it - can someone
-comment on this?
-And I don't see any problems with handling it...
-
-> 
-> > (is it allowable by standards ?) - in this case it's better
-> > to don't add tabA to 1st subselect but add tabA to second one
-> > and change tabA.col3 in 1st to reference col3 in 2nd subquery temp table -
-> > this gives us 2-tables join in 1st subquery instead of 3-tables join.
-> > (And I'm still not sure that using temp tables is best of what can be
-> > done in all cases...)
-> 
-> I don't see any use for temp tables in subselects anymore.  After having
-> implemented UNIONS, I now see how much can be done in the upper
-> optimizer.  I see you just putting the subquery PLAN into the proper
-> place in the plan tree, with some proper JOIN nodes for IN, NOT IN.
-
-When I spoke of temp tables, I meant the tables created by a Material node
-for the subquery plan. This is one of two ways: run the subquery once for
-all possible upper plan tuples and then just join the result table with
-the upper query. The other way is to re-run the subquery for each upper
-query tuple, without a temp table but perhaps with some caching of results.
-There is also the case where the subquery can alternatively be
-formulated as a join, but that is just a special case.
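The two evaluation strategies described above can be illustrated with a minimal C sketch. It is not PostgreSQL code: it assumes an uncorrelated "outer.x IN (subquery)" predicate over plain int arrays, and the function names count_in_rerun and count_in_materialized are hypothetical. The scans counter records how many times the "subquery" is executed under each strategy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort()/bsearch() over int values. */
int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/*
 * Strategy 1: re-run the subquery for every outer tuple.
 * Returns how many outer values appear in inner[]; *scans receives the
 * number of subquery executions (one scan of inner[] per outer tuple).
 */
int count_in_rerun(const int *outer, size_t n_outer,
                   const int *inner, size_t n_inner, size_t *scans)
{
    int matches = 0;

    *scans = 0;
    for (size_t i = 0; i < n_outer; i++)
    {
        (*scans)++;
        for (size_t j = 0; j < n_inner; j++)
        {
            if (inner[j] == outer[i])
            {
                matches++;
                break;
            }
        }
    }
    return matches;
}

/*
 * Strategy 2: run the subquery once, keep its result in a temporary
 * (materialized) sorted array, and probe that array for each outer tuple.
 */
int count_in_materialized(const int *outer, size_t n_outer,
                          const int *inner, size_t n_inner, size_t *scans)
{
    int matches = 0;
    int *temp;

    *scans = 1;                     /* the subquery runs exactly once */
    if (n_inner == 0)
        return 0;

    temp = malloc(n_inner * sizeof(int));
    assert(temp != NULL);
    memcpy(temp, inner, n_inner * sizeof(int));
    qsort(temp, n_inner, sizeof(int), cmp_int);

    for (size_t i = 0; i < n_outer; i++)
    {
        if (bsearch(&outer[i], temp, n_inner, sizeof(int), cmp_int) != NULL)
            matches++;
    }
    free(temp);
    return matches;
}
```

Both functions return the same count; they differ only in how often the inner result is produced, which is the trade-off a cost estimate would have to weigh.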
-
-> > > In the parent query, to parse the WHERE clause, we create a new operator
-> > > type, called IN or NOT_IN, or ALL, where the left side is a Var, and the
-> >                                                ^^^^^^^^^^^^^^^^^^
-> > No. We have to handle (a,b,c) OP (select x, y, z ...) and
-> > '_a_constant_' OP (select ...) - I don't know is last in standards,
-> > Sybase has this.
-> 
-> I have never seen this in my eight years of SQL.  Perhaps we can leave
-> this for later, maybe much later.
-
-Are you talking about (a, b, c) or about 'a_constant'?
-Again, can someone comment on whether they are in the standard or not?
-Tom?
-If yes, then please add parser support for them now...
-
-> > Note, that there are no OP_IN, OP_NOTIN in OpType-s for Expr. We need in
-> > IN, NOTIN in A_Expr (parser node), but both of them have to be transferred
-> > by parser into corresponding ANY and ALL. At the moment we can do:
-> >
-> > IN --> = ANY, NOT IN --> <> ALL
-> >
-> > but this will be "known bug": this breaks OO-nature of Postgres, because of
-> > operators can be overrided and '=' can mean  s o m e t h i n g (not equality).
-> > Example: box data type. For boxes, = means equality of _areas_ and =~
-> > means that boxes are the same ==> =~ ANY should be used for IN.
-> 
-> That is interesting, to use =~ for ANY.
-> 
-> Yes, but how many operators take a SUBQUERY as an operand.  This is a
-> special case to me.
-> 
-> I think I see where you are trying to go.  You want subselects to behave
-> like any other operator, with a subselect type, and you do all the
-> subselect handling in the optimizer, with special Nodes and actions.
-> 
-> I think this may be just too much of a leap.  We have such clean query
-> logic for single queries, I can't imagine having an operator that has a
-> Query operand, and trying to get everything to properly handle it.
-> UNIONS were very easy to implement as a List off of Query, with some
-> foreach()'s in rewrite and the high optimizer.
-> 
-> Subselects are SQL standard, and are never going to be over-ridden by a
-> user.  Same with UNION.  They want UNION, they get UNION.  They want
-> Subselect, we are going to spin through the Query structure and give
-> them what they want.
-> 
-> The complexities of subselects and correlated queries and range tables
-> and stuff is so bizarre that trying to get it to work inside the type
-> system could be a huge project.
-
-PostgreSQL is a robust, next-generation, Object-Relational DBMS (ORDBMS),
-derived from the Berkeley Postgres database management system. While
-PostgreSQL retains the powerful object-relational data model, rich data types and
-           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-easy extensibility of Postgres, it replaces the PostQuel query language with an
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-extended subset of SQL.
-^^^^^^^^^^^^^^^^^^^^^^
-
-Should we tell users that subselects will work for standard data types only?
-I don't see why a subquery can't be used with the ~, ~*, @@, ... operators, do you?
-Is there any difference between handling = ANY and ~ ANY? I don't see any.
-Currently we can't get IN working properly for boxes (and maybe for others too),
-and I don't want to try to resolve these problems now, but I hope that someday
-we'll be able to do this. For the moment, just convert IN into = ANY and
-NOT IN into <> ALL in the parser.
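The IN / NOT IN mapping can be sketched in C as follows. This is an illustration only, not backend code: it assumes int operands, ignores SQL NULL handling (three-valued logic), and uses a hypothetical binary_op function pointer as a stand-in for an Oper node. Because the operator is just a parameter, "= ANY" and any other boolean operator with ANY go through the same path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A boolean binary operator over int operands (hypothetical stand-in for Oper). */
typedef bool (*binary_op) (int left, int right);

bool op_eq(int l, int r) { return l == r; }
bool op_ne(int l, int r) { return l != r; }
bool op_lt(int l, int r) { return l < r; }

/* left OP ANY (rows): true if OP holds for at least one subquery row. */
bool eval_any(int left, binary_op op, const int *rows, size_t n)
{
    assert(op != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (op(left, rows[i]))
            return true;
    }
    return false;
}

/* left OP ALL (rows): true if OP holds for every row (true for an empty set). */
bool eval_all(int left, binary_op op, const int *rows, size_t n)
{
    assert(op != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (!op(left, rows[i]))
            return false;
    }
    return true;
}

/* IN is rewritten as "= ANY". */
bool eval_in(int left, const int *rows, size_t n)
{
    return eval_any(left, op_eq, rows, n);
}

/* NOT IN is rewritten as "<> ALL". */
bool eval_not_in(int left, const int *rows, size_t n)
{
    return eval_all(left, op_ne, rows, n);
}
```

Swapping op_eq for a type-specific "same" operator (as the box example would require) changes only the function pointer passed in, not the ANY/ALL machinery.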
-
-(BTW, do you know how DISTINCT is implemented? It doesn't use = but
-uses the type_out funcs and strcmp()... DISTINCT is a standard SQL thing...)
-
-> >
-> > > right side is an index to a slot in the subqueries List.
-> 
-> I guess the question is what can we have by February 1?
-> 
-> I have been reading some postings, and it seems to me that subselects
-> are the litmus test for many evaluators when deciding if a database
-> engine is full-featured.
-> 
-> Sorry to be so straightforward, but I want to keep hashing this around
-> until we get a conclusion, so coding can start.
-> 
-> My suggestions have been, I believe, trying to get subselects working
-> with the fullest functionality by adding the least amount of code, and
-> keeping the logic clean.
-> 
-> Have you checked out the UNION code?  It is very small, but it works.  I
-> think it could make a good sample for subselects.
-
-There is a big difference between subqueries and the queries in a UNION:
-there are no dependencies between UNION queries.
-
-OK, open issues:
-
-1. Is using the upper query's vars at all subquery levels in the standard?
-2. Is (a, b, c) OP (subselect) in the standard?
-3. What types of expressions (Var, Const, ...) are allowed on the left
-   side of an operator with a subquery on the right?
-4. What types of operators should we support (=, >, ..., like, ~, ...)?
-   (My vote is for all boolean operators.)
-
-And did we reach consensus on how to represent subqueries in Query,
-Expr and Var?
-I would like to have something done in the parser around Jan 17 to get
-subqueries working by Feb 1. I vote for supporting all the standard
-things (1. - 3.) in the parser right now; if there is no time
-to implement something like (a, b, c), then the optimizer will call
-elog(WARN) (oh, sorry, elog(ERROR)).
-
-Vadim
-
-From [email protected] Sat Jan 10 12:31:05 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id MAA29045
-   for ; Sat, 10 Jan 1998 12:31:01 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id MAA23364 for ; Sat, 10 Jan 1998 12:22:30 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id AAA05725;
-   Sun, 11 Jan 1998 00:41:22 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sun, 11 Jan 1998 00:41:19 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: [HACKERS] subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> OK, a few questions:
-> 
->         Should we use sortmerge, so we can use our psort as temp tables,
-> or do we use hashunique?
-> 
->         How do we pass the query to the optimizer?  How do we represent
-> the range table for each, and the links between them in correlated
-> subqueries?
-
-My suggestion is to just use varlevel in Var and not put the upper query's
-relations into the subquery range table.
-
-Vadim
-
-From [email protected] Sat Jan 10 13:01:00 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id NAA29357
-   for ; Sat, 10 Jan 1998 13:00:58 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id MAA24030 for ; Sat, 10 Jan 1998 12:40:02 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id AAA05741;
-   Sun, 11 Jan 1998 00:58:56 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sun, 11 Jan 1998 00:58:52 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian ,
-        PostgreSQL-development 
-Subject: Re: [HACKERS] subselects
-References: <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Vadim B. Mikheev wrote:
-> 
-> Bruce Momjian wrote:
-> >
-> > OK, a few questions:
-> >
-> >         Should we use sortmerge, so we can use our psort as temp tables,
-> > or do we use hashunique?
-> >
-> >         How do we pass the query to the optimizer?  How do we represent
-> > the range table for each, and the links between them in correlated
-> > subqueries?
-> 
-> My suggestion is just use varlevel in Var and don't put upper query'
-> relations into subquery range table.
-
-Hmm... Sorry, it seems that I replied to a very old message - forget it.
-
-Vadim
-
-From [email protected] Sat Jan 10 13:30:59 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id NAA29664
-   for ; Sat, 10 Jan 1998 13:30:56 -0500 (EST)
-Received: from golem.jpl.nasa.gov ([email protected] [128.149.70.168]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id NAA25109 for ; Sat, 10 Jan 1998 13:05:09 -0500 (EST)
-Received: from alumni.caltech.edu (localhost [127.0.0.1])
-   by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id SAA03623;
-   Sat, 10 Jan 1998 18:01:03 GMT
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sat, 10 Jan 1998 18:01:03 +0000
-From: "Thomas G. Lockhart" 
-Organization: Caltech/JPL
-X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.30 i686)
-MIME-Version: 1.0
-To: "Vadim B. Mikheev" 
-CC: Bruce Momjian , [email protected]
-Subject: Re: subselects
-References: <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-> > > Note, that there are no OP_IN, OP_NOTIN in OpType-s for Expr. We need in
-> > > IN, NOTIN in A_Expr (parser node), but both of them have to be transferred
-> > > by parser into corresponding ANY and ALL. At the moment we can do:
-> > >
-> > > IN --> = ANY, NOT IN --> <> ALL
-> > >
-> > > but this will be "known bug": this breaks OO-nature of Postgres, because of
-> > > operators can be overrided and '=' can mean  s o m e t h i n g (not equality).
-> > > Example: box data type. For boxes, = means equality of _areas_ and =~
-> > > means that boxes are the same ==> =~ ANY should be used for IN.
-> >
-> > That is interesting, to use =~ for ANY.
-
-If I understand the discussion, I would think it is fine to make an assumption about
-which operator is used to implement a subselect expression. If someone remaps an
-operator to mean something different, then they will get a different result (or a
-nonsensical one) from a subselect.
-
-I'd be happy to remap existing operators to fit into a convention which would work
-with subselects (especially if I got to help choose :).
-
-> > Subselects are SQL standard, and are never going to be over-ridden by a
-> > user.  Same with UNION.  They want UNION, they get UNION.  They want
-> > Subselect, we are going to spin through the Query structure and give
-> > them what they want.
->
-> PostgreSQL is a robust, next-generation, Object-Relational DBMS (ORDBMS),
-> derived from the Berkeley Postgres database management system. While
-> PostgreSQL retains the powerful object-relational data model, rich data types and
->            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-> easy extensibility of Postgres, it replaces the PostQuel query language with an
-> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-> extended subset of SQL.
-> ^^^^^^^^^^^^^^^^^^^^^^
->
-> Should we say users that subselect will work for standard data types only ?
-> I don't see why subquery can't be used with ~, ~*, @@, ... operators, do you ?
-> Is there difference between handling = ANY and ~ ANY ? I don't see any.
-> Currently we can't get IN working properly for boxes (and may be for others too)
-> and I don't like to try to resolve these problems now, but hope that someday
-> we'll be able to do this. At the moment - just convert IN into = ANY and
-> NOT IN into <> ALL in parser.
->
-> (BTW, do you know how DISTINCT is implemented ? It doesn't use = but
-> use type_out funcs and uses strcmp()... DISTINCT is standard SQL thing...)
-
-?? I didn't know that. Wouldn't we want it to eventually use "=" through a sorted
-list? That would give more consistent behavior...
-
-> > I have been reading some postings, and it seems to me that subselects
-> > are the litmus test for many evaluators when deciding if a database
-> > engine is full-featured.
-> >
-> > Sorry to be so straightforward, but I want to keep hashing this around
-> > until we get a conclusion, so coding can start.
-> >
-> > My suggestions have been, I believe, trying to get subselects working
-> > with the fullest functionality by adding the least amount of code, and
-> > keeping the logic clean.
-> >
-> > Have you checked out the UNION code?  It is very small, but it works.  I
-> > think it could make a good sample for subselects.
->
-> There is big difference between subqueries and queries in UNION -
-> there are not dependences between UNION queries.
->
-> Ok, opened issues:
->
-> 1. Is using upper query' vars in all subquery levels in standard ?
-
-I'm not certain. Let me know if you do not get an answer from someone else and I will
-research it.
-
-> 2. Is (a, b, c) OP (subselect) in standard ?
-
-Yes. In fact, it _is_ the standard, and "a OP (subselect)" is a special case where
-the parens are allowed to be omitted from a one element list.
-
-> 3. What types of expressions (Var, Const, ...) are allowed on the left
->    side of operator with subquery on the right ?
-
-I think most expressions are allowed. The "constant OP (subselect)" case you were
-asking about is just a simplified case since "(a, b, constant) OP (subselect)" where
-a and b are column references should be allowed. Of course, our optimizer could
-perhaps change this to "(a, b) OP (subselect where x = constant)", or for the first
-example "EXISTS (subselect where x = constant)".
-
-> 4. What types of operators should we support (=, >, ..., like, ~, ...) ?
->    (My vote for all boolean operators).
-
-Sounds good. But I'll vote with Bruce (and I'll bet you already agree) that it is
-important to get an initial implementation for v6.3 which covers a little, some, or
-all of the usual SQL subselect constructs. If we have to revisit this for v6.4 then
-we will have the benefit of feedback from others in practical applications which
-always uncovers new things to consider.
-
-> And - did we get consensus on presentation subqueries stuff in Query,
-> Expr and Var ?
-> I would like to have something done in parser near Jan 17 to get
-> subqueries working by Feb 1. I vote for support of all standard
-> things (1. - 3.) in parser right now - if there will be no time
-> to implement something like (a, b, c) then optimizer will call elog(WARN) (oh,
-> sorry, - elog(ERROR)).
-
-Great. I'd like to help with the remaining parser issues; at the moment "row_expr"
-does the right thing with expression comparisons but just parses then ignores
-subselect expressions. Let me know what structures you want passed back and I'll put
-them in, or if you prefer put in the first one and I'll go through and clean up and
-add the rest.
-
-                                                  - Tom
-
-
-From [email protected] Sat Jan 10 15:00:58 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id PAA00728
-   for ; Sat, 10 Jan 1998 15:00:56 -0500 (EST)
-Received: from golem.jpl.nasa.gov ([email protected] [128.149.70.168]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id OAA28438 for ; Sat, 10 Jan 1998 14:35:19 -0500 (EST)
-Received: from alumni.caltech.edu (localhost [127.0.0.1])
-   by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id TAA06002;
-   Sat, 10 Jan 1998 19:31:30 GMT
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sat, 10 Jan 1998 19:31:29 +0000
-From: "Thomas G. Lockhart" 
-Organization: Caltech/JPL
-X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.30 i686)
-MIME-Version: 1.0
-To: "Vadim B. Mikheev" 
-CC: Bruce Momjian , [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-> Are you saying about (a, b, c) or about 'a_constant' ?
-> Again, can someone comment on are they in standards or not ?
-> Tom ?
-> If yes then please add parser' support for them now...
-
-As I mentioned a few minutes ago in my last message, I parse the row descriptors and
-the subselects but for subselect expressions (e.g. "(a,b) OP (subselect)") I currently
-ignore the result. I didn't want to pass things back as lists until something in the
-backend was ready to receive them.
-
-If it is OK, I'll go ahead and start passing back a list of expressions when a row
-descriptor is present. So, what you will find is lexpr or rexpr in the A_Expr node
-being a list rather than an atomic node.
-
-Also, I can start passing back the subselect expression as the rexpr; right now the
-parser calls elog() and quits.
-
-btw, to implement "(a,b,c) OP (d,e,f)" I made a new routine in the parser called
-makeRowExpr() which breaks this up into a sequence of "and" and/or "or" expressions.
-If lists are handled farther back, this routine should move to there also and the
-parser will just pass the lists. Note that some assumptions have to be made about the
-meaning of "(a,b) OP (c,d)", since usually we only have knowledge of the behavior of
-"a OP c". Easy for the standard SQL operators, unknown for others, but maybe it is OK
-to disallow those cases or to look for specific appearance of the operator to guess
-the behavior (e.g. if the operator has "<" or "=" or ">" then build as "and"s and if
-it has "<>" or "!" then build as "or"s).
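The row expansion described above can be sketched as a small C helper. This is not the actual makeRowExpr(): it works on strings rather than parse nodes, the name expand_row_expr is hypothetical, and it applies only the guess stated in the message (an operator containing "<>" or "!" is joined with OR, anything else with AND).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Expand "(l1, l2, ...) OP (r1, r2, ...)" into pairwise comparisons.
 * Operators containing "<>" or "!" are joined with OR, all others with AND.
 * Returns 0 on success, or -1 if the result does not fit in out[].
 */
int expand_row_expr(const char *op,
                    const char *const *left, const char *const *right,
                    size_t n, char *out, size_t out_size)
{
    const char *joiner;
    size_t used = 0;

    assert(n > 0 && out_size > 0);

    /* Test "<>" before anything else, since it also contains "<". */
    if (strstr(op, "<>") != NULL || strchr(op, '!') != NULL)
        joiner = " OR ";
    else
        joiner = " AND ";

    out[0] = '\0';
    for (size_t i = 0; i < n; i++)
    {
        int written = snprintf(out + used, out_size - used, "%s%s %s %s",
                               i > 0 ? joiner : "", left[i], op, right[i]);

        if (written < 0 || (size_t) written >= out_size - used)
            return -1;              /* output truncated */
        used += (size_t) written;
    }
    return 0;
}
```

For example, "(a, b) = (c, d)" expands to "a = c AND b = d", while "(a, b) <> (c, d)" expands to "a <> c OR b <> d". Operators whose row-wise meaning is not known would need to be rejected or handled separately, as the message notes.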
-
-Let me know what you want...
-
-                                                       - Tom
-
-
-From [email protected] Sun Jan 11 01:01:55 1998
-Received: from golem.jpl.nasa.gov ([email protected] [128.149.70.168])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA11953
-   for ; Sun, 11 Jan 1998 01:01:51 -0500 (EST)
-Received: from alumni.caltech.edu (localhost [127.0.0.1])
-   by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id FAA23797;
-   Sun, 11 Jan 1998 05:58:01 GMT
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sun, 11 Jan 1998 05:58:01 +0000
-From: "Thomas G. Lockhart" 
-Organization: Caltech/JPL
-X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.30 i686)
-MIME-Version: 1.0
-To: "Vadim B. Mikheev" 
-CC: Bruce Momjian , [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]> <[email protected]>
-Content-Type: multipart/mixed; boundary="------------D8B38A0D1F78A10C0023F702"
-Status: OR
-
-This is a multi-part message in MIME format.
---------------D8B38A0D1F78A10C0023F702
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-
-Here are context diffs of gram.y and keywords.c; sorry about sending the full files.
-These start sending lists of arguments toward the backend from the parser to
-implement row descriptors and subselects.
-
-They should apply OK even over Bruce's recent changes...
-
-                                             - Tom
-
---------------D8B38A0D1F78A10C0023F702
-Content-Type: text/plain; charset=us-ascii; name="gram.y.patch"
-Content-Transfer-Encoding: 7bit
-Content-Disposition: inline; filename="gram.y.patch"
-
-*** ../src/backend/parser/gram.y.orig  Sat Jan 10 05:44:36 1998
---- ../src/backend/parser/gram.y   Sat Jan 10 19:29:37 1998
-***************
-*** 195,200 ****
---- 195,201 ----
-               having_clause
-  %type  row_descriptor, row_list
-  %type  row_expr
-+ %type       RowOp, row_opt
-  %type  OptCreateAs, CreateAsList
-  %type  CreateAsElement
-  %type     NumConst
-***************
-*** 242,248 ****
-   */
-  
-  /* Keywords (in SQL92 reserved words) */
-! %token   ACTION, ADD, ALL, ALTER, AND, AS, ASC,
-       BEGIN_TRANS, BETWEEN, BOTH, BY,
-       CASCADE, CAST, CHAR, CHARACTER, CHECK, CLOSE, COLLATE, COLUMN, COMMIT, 
-       CONSTRAINT, CREATE, CROSS, CURRENT, CURRENT_DATE, CURRENT_TIME, 
---- 243,249 ----
-   */
-  
-  /* Keywords (in SQL92 reserved words) */
-! %token   ACTION, ADD, ALL, ALTER, AND, ANY, AS, ASC,
-       BEGIN_TRANS, BETWEEN, BOTH, BY,
-       CASCADE, CAST, CHAR, CHARACTER, CHECK, CLOSE, COLLATE, COLUMN, COMMIT, 
-       CONSTRAINT, CREATE, CROSS, CURRENT, CURRENT_DATE, CURRENT_TIME, 
-***************
-*** 258,264 ****
-       ON, OPTION, OR, ORDER, OUTER_P,
-       PARTIAL, POSITION, PRECISION, PRIMARY, PRIVILEGES, PROCEDURE, PUBLIC,
-       REFERENCES, REVOKE, RIGHT, ROLLBACK,
-!      SECOND_P, SELECT, SET, SUBSTRING,
-       TABLE, TIME, TIMESTAMP, TO, TRAILING, TRANSACTION, TRIM,
-       UNION, UNIQUE, UPDATE, USING,
-       VALUES, VARCHAR, VARYING, VERBOSE, VERSION, VIEW,
---- 259,265 ----
-       ON, OPTION, OR, ORDER, OUTER_P,
-       PARTIAL, POSITION, PRECISION, PRIMARY, PRIVILEGES, PROCEDURE, PUBLIC,
-       REFERENCES, REVOKE, RIGHT, ROLLBACK,
-!      SECOND_P, SELECT, SET, SOME, SUBSTRING,
-       TABLE, TIME, TIMESTAMP, TO, TRAILING, TRANSACTION, TRIM,
-       UNION, UNIQUE, UPDATE, USING,
-       VALUES, VARCHAR, VARYING, VERBOSE, VERSION, VIEW,
-***************
-*** 2853,2866 ****
-  /* Expressions using row descriptors
-   * Define row_descriptor to allow yacc to break the reduce/reduce conflict
-   *  with singleton expressions.
-   */
-  row_expr: '(' row_descriptor ')' IN '(' SubSelect ')'
-               {
-!                  $$ = NULL;
-               }
-       | '(' row_descriptor ')' NOT IN '(' SubSelect ')'
-               {
-!                  $$ = NULL;
-               }
-       | '(' row_descriptor ')' '=' '(' row_descriptor ')'
-               {
---- 2854,2878 ----
-  /* Expressions using row descriptors
-   * Define row_descriptor to allow yacc to break the reduce/reduce conflict
-   *  with singleton expressions.
-+  *
-+  * Note that "SOME" is the same as "ANY" in syntax.
-+  * - thomas 1998-01-10
-   */
-  row_expr: '(' row_descriptor ')' IN '(' SubSelect ')'
-               {
-!                  $$ = makeA_Expr(OP, "=any", (Node *)$2, (Node *)$6);
-               }
-       | '(' row_descriptor ')' NOT IN '(' SubSelect ')'
-               {
-!                  $$ = makeA_Expr(OP, "<>any", (Node *)$2, (Node *)$7);
-!              }
-!      | '(' row_descriptor ')' RowOp row_opt '(' SubSelect ')'
-!              {
-!                  char *opr;
-!                  opr = palloc(strlen($4)+strlen($5)+1);
-!                  strcpy(opr, $4);
-!                  strcat(opr, $5);
-!                  $$ = makeA_Expr(OP, opr, (Node *)$2, (Node *)$7);
-               }
-       | '(' row_descriptor ')' '=' '(' row_descriptor ')'
-               {
-***************
-*** 2880,2885 ****
---- 2892,2907 ----
-               }
-       ;
-  
-+ RowOp:  '='                      { $$ = "="; }
-+      | '<'                   { $$ = "<"; }
-+      | '>'                   { $$ = ">"; }
-+      ;
-+ 
-+ row_opt:  ALL                    { $$ = "all"; }
-+      | ANY                   { $$ = "any"; }
-+      | SOME                  { $$ = "any"; }
-+      ;
-+ 
-  row_descriptor:  row_list ',' a_expr
-               {
-                   $$ = lappend($1, $3);
-***************
-*** 3432,3441 ****
-       ;
-  
-  in_expr:  SubSelect
-!              {
-!                  elog(ERROR,"IN (SUBSELECT) not yet implemented");
-!                  $$ = $1;
-!              }
-       | in_expr_nodes
-               {   $$ = $1; }
-       ;
---- 3454,3460 ----
-       ;
-  
-  in_expr:  SubSelect
-!              {   $$ = makeA_Expr(OP, "=", saved_In_Expr, (Node *)$1); }
-       | in_expr_nodes
-               {   $$ = $1; }
-       ;
-***************
-*** 3449,3458 ****
-       ;
-  
-  not_in_expr:  SubSelect
-!              {
-!                  elog(ERROR,"NOT IN (SUBSELECT) not yet implemented");
-!                  $$ = $1;
-!              }
-       | not_in_expr_nodes
-               {   $$ = $1; }
-       ;
---- 3468,3474 ----
-       ;
-  
-  not_in_expr:  SubSelect
-!              {   $$ = makeA_Expr(OP, "<>", saved_In_Expr, (Node *)$1); }
-       | not_in_expr_nodes
-               {   $$ = $1; }
-       ;
-
---------------D8B38A0D1F78A10C0023F702
-Content-Type: text/plain; charset=us-ascii; name="keywords.c.patch"
-Content-Transfer-Encoding: 7bit
-Content-Disposition: inline; filename="keywords.c.patch"
-
-*** ../src/backend/parser/keywords.c.orig  Mon Jan  5 07:51:33 1998
---- ../src/backend/parser/keywords.c   Sat Jan 10 19:22:07 1998
-***************
-*** 39,44 ****
---- 39,45 ----
-   {"alter", ALTER},
-   {"analyze", ANALYZE},
-   {"and", AND},
-+  {"any", ANY},
-   {"append", APPEND},
-   {"archive", ARCHIVE},
-   {"as", AS},
-***************
-*** 178,183 ****
---- 179,185 ----
-   {"set", SET},
-   {"setof", SETOF},
-   {"show", SHOW},
-+  {"some", SOME},
-   {"stdin", STDIN},
-   {"stdout", STDOUT},
-   {"substring", SUBSTRING},
-
---------------D8B38A0D1F78A10C0023F702--
-
-
-From [email protected] Sun Jan 11 01:31:13 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA12255
-   for ; Sun, 11 Jan 1998 01:31:10 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id BAA20396 for ; Sun, 11 Jan 1998 01:10:48 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id BAA22176; Sun, 11 Jan 1998 01:03:15 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 11 Jan 1998 01:02:34 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id BAA22151 for pgsql-hackers-outgoing; Sun, 11 Jan 1998 01:02:26 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.8/8.7.5) with ESMTP id BAA22077 for ; Sun, 11 Jan 1998 01:01:05 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id AAA11801;
-   Sun, 11 Jan 1998 00:59:23 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] Re: subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Sun, 11 Jan 1998 00:59:23 -0500 (EST)
-Cc: [email protected], [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Jan 11, 98 00:19:08 am
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> I would like to have something done in parser near Jan 17 to get
-> subqueries working by Feb 1. I vote for support of all standard
-> things (1. - 3.) in parser right now - if there will be no time
-> to implement something like (a, b, c) then optimizer will call
-> elog(WARN) (oh, sorry, - elog(ERROR)).
-
-First, let me say I am glad we are still on schedule for Feb 1.  I was
-panicking because I thought we wouldn't make it in time.
-
-
-> > > (is it allowed by the standard ?) - in this case it's better
-> > > not to add tabA to the 1st subselect but to add tabA to the second one
-> > > and change tabA.col3 in the 1st to reference col3 in the 2nd subquery temp table -
-> > > this gives us a 2-table join in the 1st subquery instead of a 3-table join.
-> > > (And I'm still not sure that using temp tables is the best that can be
-> > > done in all cases...)
-> > 
-> > I don't see any use for temp tables in subselects anymore.  After having
-> > implemented UNIONS, I now see how much can be done in the upper
-> > optimizer.  I see you just putting the subquery PLAN into the proper
-> > place in the plan tree, with some proper JOIN nodes for IN, NOT IN.
-> 
-> When talking about temp tables, I meant tables created by the Material node
-> for a subquery plan. This is one of two ways: run the subquery once for all
-> possible upper plan tuples and then just join the result table with the upper
-> query. The other way is to re-run the subquery for each upper query tuple,
-> without a temp table but maybe with caching of results in some way.
-> Actually, there is a special case, when the subquery can alternatively
-> be formulated as joins, but this is just a special case.
-
-This is interesting.  It really only applies to correlated subqueries,
-and certainly it may sometimes help to evaluate the subquery only for
-the values that are actually going to come from the upper query rather
-than for all possible values.  Perhaps we can use the 'cost' value of
-each query to decide how to handle this.
-
-> 
-> > > > In the parent query, to parse the WHERE clause, we create a new operator
-> > > > type, called IN or NOT_IN, or ALL, where the left side is a Var, and the
-> > >                                                ^^^^^^^^^^^^^^^^^^
-> > > No. We have to handle (a,b,c) OP (select x, y, z ...) and
-> > > '_a_constant_' OP (select ...) - I don't know whether the last is in
-> > > the standard, but Sybase has it.
-> > 
-> > I have never seen this in my eight years of SQL.  Perhaps we can leave
-> > this for later, maybe much later.
-> 
-> Are you talking about (a, b, c) or about 'a_constant' ?
-> Again, can someone comment on whether they are in the standard or not ?
-> Tom ?
-> If yes, then please add parser support for them now...
-
-OK, Thomas says it is, so we will put in as much code as we can to handle
-it.
-
-> Should we tell users that subselects will work for standard data types only ?
-> I don't see why a subquery can't be used with ~, ~*, @@, ... operators, do you ?
-> Is there a difference between handling = ANY and ~ ANY ? I don't see any.
-> Currently we can't get IN working properly for boxes (and maybe for others too)
-> and I don't want to try to resolve these problems now, but hope that someday
-> we'll be able to do this. At the moment, just convert IN into = ANY and
-> NOT IN into <> ALL in the parser.
-
-OK.
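
The IN to = ANY and NOT IN to <> ALL rewrite discussed above can be illustrated with a minimal C sketch. The names BoolOp, op_eq, op_ne, sublink_any and sublink_all are hypothetical, the subquery result is modelled as a plain array of ints, and NULL handling is ignored entirely; this is not PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Any boolean operator can be plugged in, as suggested in the thread. */
typedef bool (*BoolOp)(int left, int right);

static bool op_eq(int l, int r) { return l == r; }
static bool op_ne(int l, int r) { return l != r; }

/* left OP ANY (rows): true if the operator holds for at least one row. */
static bool sublink_any(int left, BoolOp op, const int *rows, size_t n)
{
    assert(op != NULL);
    for (size_t i = 0; i < n; i++)
        if (op(left, rows[i]))
            return true;
    return false;
}

/* left OP ALL (rows): true if the operator holds for every row. */
static bool sublink_all(int left, BoolOp op, const int *rows, size_t n)
{
    assert(op != NULL);
    for (size_t i = 0; i < n; i++)
        if (!op(left, rows[i]))
            return false;
    return true;
}
```

Under these assumptions, `x IN (rows)` is `sublink_any(x, op_eq, ...)` and `x NOT IN (rows)` is `sublink_all(x, op_ne, ...)`; the same two loops would serve any other boolean operator.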
-
-> 
-> (BTW, do you know how DISTINCT is implemented ? It doesn't use = but
-> uses the type_out funcs and strcmp()... DISTINCT is a standard SQL thing...)
-
-I did not know that either.
-
-> There is a big difference between subqueries and queries in a UNION:
-> there are no dependencies between UNION queries.
-
-Yes, I know UNIONS are trivial compared to subselects.
-
-> 
-> Ok, open issues:
-> 
-> 1. Is using the upper query's vars at all subquery levels in the standard ?
-> 2. Is (a, b, c) OP (subselect) in the standard ?
-> 3. What types of expressions (Var, Const, ...) are allowed on the left
->    side of an operator with a subquery on the right ?
-> 4. What types of operators should we support (=, >, ..., like, ~, ...) ?
->    (My vote is for all boolean operators).
-> 
-> And - did we reach consensus on how to represent subqueries in Query,
-> Expr and Var ?
-
-OK, here are my concrete ideas on changes and structures.
-
-I think we all agreed that Query needs new fields:
-
-        Query *parentQuery;
-        List *subqueries;
-
-Maybe query level too, but I don't think so (see later ideas on Var).
-
-We need a new Node structure, call it Sublink:
-
-   int     linkType;   /* IN, NOTIN, ANY, EXISTS, OPERATOR... */
-   Oid     operator;   /* subquery must return single row */
-   List    *lefthand;  /* parent stuff */
-   Node    *subquery;  /* represents nodes from parser */
-   Index   Subindex;   /* filled in to index Query->subqueries */
-
-Of course, the names are just suggestions.  Every time we run through
-the parsenodes of a query to create a Query* structure, when we do the
-WHERE clause, if we come upon one of these Sublink nodes (created in the
-parser), we move the supplied Query* in Sublink->subquery to a local
-List variable, and we set Sublink->Subindex to equal the index of the
-new query, i.e. is it the first subquery we found, 1, or the second, 2,
-etc.
-
-After we have created the parent Query structure, we run through our
-local List variable of subquery parsenodes we created above, and add
-Query* entries to Query->subqueries.  In each subquery Query*, we set
-the parentQuery pointer.
-
-Also, when parsing the subqueries, we need to keep track of correlated
-references.  I recommend we add a field to the Var structure:
-
-   Index   sublevel;   /* range table reference:
-                  = 0  current level of query
-                  < 0  parent above this many levels
-                  > 0  index into subquery list
-                */
-
-This way, a Var node with sublevel 0 is the current level, and is true
-in most cases.  This helps us not have to change much code.  sublevel =
--1 means it references the range table in the parent query. sublevel =
--2 means the parent's parent. sublevel = 2 means it references the range
-table of the second entry in Query->subqueries.  Varno and varattno are
-still meaningful.  Of course, we can't reference variables in the
-subqueries from the parent in the parser code, but Vadim may want to.
-
-When doing a Var lookup in the parser, we look in the current level
-first, but if not found, if it is a subquery, we can look at the parent
-and parent's parent to set the sublevel, varno, and varattno properly.
-
-We create no phantom range table entries in the subquery, and no phantom
-target list entries.   We can leave that all for the upper optimizer.
-
--- 
-Bruce Momjian
-[email protected]
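
The Var lookup proposed in the message above (current level first, then the parent, then the parent's parent) can be sketched in C as follows. The Query and Var types here are hypothetical, heavily simplified stand-ins for the real structures: each query level is reduced to a flat list of column names, and sublevel is a plain int because the proposal relies on negative values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a query level. */
typedef struct Query {
    struct Query *parentQuery;  /* NULL for the top-level query */
    const char  **columns;      /* column names visible at this level */
    int           ncolumns;
} Query;

/* Hypothetical, simplified stand-in for a variable reference. */
typedef struct Var {
    int sublevel;   /* 0 = current level, -1 = parent, -2 = parent's parent */
    int varattno;   /* 1-based column position at that level, 0 = not found */
} Var;

/* Look in the current level first, then walk up through the parents. */
static Var resolve_var(const Query *q, const char *name)
{
    Var v = {0, 0};
    int level = 0;

    assert(name != NULL);
    for (; q != NULL; q = q->parentQuery, level--) {
        for (int i = 0; i < q->ncolumns; i++) {
            if (strcmp(q->columns[i], name) == 0) {
                v.sublevel = level;
                v.varattno = i + 1;
                return v;
            }
        }
    }
    return v;   /* varattno == 0 signals that the name was not found */
}
```

In this sketch a reference resolved in the subquery itself keeps sublevel 0, so code that only deals with single-level queries would not need to change, which is the property the proposal is after.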
-
-
-From [email protected] Fri Nov 28 16:34:03 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id QAA17454
-   for ; Fri, 28 Nov 1997 16:33:59 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id QAA10553; Fri, 28 Nov 1997 16:20:03 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 28 Nov 1997 16:17:50 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id QAA10116 for pgsql-hackers-outgoing; Fri, 28 Nov 1997 16:17:45 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id QAA09997 for ; Fri, 28 Nov 1997 16:17:26 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id QAA17309
-   for [email protected]; Fri, 28 Nov 1997 16:18:08 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] querytrees and multiple statements
-To: [email protected] (PostgreSQL-development)
-Date: Fri, 28 Nov 1997 16:18:08 -0500 (EST)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Currently, if a query string arrives that has multiple sql statements in
-it, the parser breaks it down into separate queries, analyzes each one,
-then executes them in order.  (psql automatically breaks things down
-into separate queries, so this will not work there.)  The problem is
-that if the first query creates a table, and the second query goes to
-access it, the parser analysis fails because the table is not yet
-created.  See the attached pginterface source for an example.  The real
-problem is that all the queries in the string are analyzed first, then
-executed, rather than having one analyzed then executed, then the next.
-
-I am going to have trouble with subselects and temp tables.  I want to
-pull out the subselect, change it into a SELECT ... INTO TEMP, and add it to
-the QueryTree before the outer select.  But when the outer select is analyzed
-by the parser, the temp table doesn't exist yet, and that will cause an
-error.
-
-Currently postgres.c does each step on all queries before moving to the
-next step.  Does anyone know what the ramifications would be if I
-changed this to do the full set of operations on each statement first
-before moving to the next?
-
----------------------------------------------------------------------------
-
-
-/*
- * pgnulltest.c
- *
-*/
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-int main(int argc, char **argv)
-{
-   char query[4000];
-   int i;
-   
-   if (argc != 2)
-       halt("Usage:  %s database\n",argv[0]);
-
-   connectdb(argv[1],NULL,NULL,NULL,NULL);
-
-   sprintf(query,"create table test(x int); select x from test;");
-   doquery(query);
-
-   disconnectdb();
-   return 0;
-}
-
-
--- 
-Bruce Momjian
-[email protected]
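
The ordering problem described in the message above can be reproduced with a small C simulation. Everything here is hypothetical: Stmt, Catalog, analyze, execute and the two run_batch_* functions only model "analysis fails if the table does not exist yet" and are not taken from postgres.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { STMT_CREATE, STMT_SELECT } StmtKind;
typedef struct Stmt { StmtKind kind; const char *table; } Stmt;

#define MAX_TABLES 8
typedef struct Catalog { const char *tables[MAX_TABLES]; int ntables; } Catalog;

static bool catalog_has(const Catalog *c, const char *name)
{
    for (int i = 0; i < c->ntables; i++)
        if (strcmp(c->tables[i], name) == 0)
            return true;
    return false;
}

/* Analysis of a SELECT succeeds only if its table is already in the catalog. */
static bool analyze(const Catalog *c, const Stmt *s)
{
    return s->kind == STMT_CREATE || catalog_has(c, s->table);
}

/* Executing a CREATE registers the table; a SELECT changes nothing here. */
static void execute(Catalog *c, const Stmt *s)
{
    assert(c != NULL && s != NULL);
    if (s->kind == STMT_CREATE && c->ntables < MAX_TABLES)
        c->tables[c->ntables++] = s->table;
}

/* Analyze every statement first, then execute them all. */
static bool run_batch_phased(Catalog *c, const Stmt *stmts, int n)
{
    for (int i = 0; i < n; i++)
        if (!analyze(c, &stmts[i]))
            return false;
    for (int i = 0; i < n; i++)
        execute(c, &stmts[i]);
    return true;
}

/* Analyze and execute each statement before moving to the next. */
static bool run_batch_interleaved(Catalog *c, const Stmt *stmts, int n)
{
    for (int i = 0; i < n; i++) {
        if (!analyze(c, &stmts[i]))
            return false;
        execute(c, &stmts[i]);
    }
    return true;
}
```

With a batch equivalent to "create table test(x int); select x from test;", the phased variant fails during analysis of the SELECT, while the interleaved variant succeeds because the CREATE has already run.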
-
-
-From [email protected] Sat Nov 29 05:01:01 1997
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id FAA27942
-   for ; Sat, 29 Nov 1997 05:00:58 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id EAA13666 for ; Sat, 29 Nov 1997 04:35:08 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by www.krasnet.ru (8.8.7/8.7.3) with SMTP id QAA17107; Sat, 29 Nov 1997 16:38:58 +0700 (KRS)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sat, 29 Nov 1997 16:38:57 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: [HACKERS] querytrees and multiple statements
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> Currently, if a query string arrives that has multiple sql statements in
-> it, the parser breaks it down into separate queries, analyzes each one,
-> then executes them in order.  (psql automatically breaks things down
-> into separate queries, so this will not work there.)  The problem is
-> that if the first query creates a table, and the second query goes to
-> access it, the parser analysis fails because the table is not yet
-> created.  See the attached pginterface source for an example.  The real
-> problem is that all the queries in the string are analyzed first, then
-> executed, rather than having one analyzed then executed, then the next.
-> 
-> I am going to have trouble with subselects and temp tables.  I want to
-> pull out the subselect, change it into a SELECT ... INTO TEMP, and add it to
-> the QueryTree before the outer select.  But when the outer select is analyzed
-> by the parser, the temp table doesn't exist yet, and that will cause an
-> error.
-> 
-> Currently postgres.c does each step on all queries before moving to the
-> next step.  Does anyone know what the ramifications would be if I
-> changed this to do the full set of operations on each statement first
-> before moving to the next?
-
-This will break the ability to prepare a plan (parser + optimizer) for later
-execution. This ability is used by RULEs (and so by VIEWs) and will be
-used by PL(s)...
-
-Please, take a look at nodeMaterial.c:
-
-/*-------------------------------------------------------------------------
- *
- * nodeMaterial.c--
- *    Routines to handle materialization nodes.
-...
-/*
- * INTERFACE ROUTINES
- *      ExecMaterial            - generate a temporary relation
-                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-(I'm still very busy. Hope to return soon.)
-
-Vadim
-
-From [email protected] Sun Nov 30 02:30:56 1997
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id CAA15439
-   for ; Sun, 30 Nov 1997 02:30:55 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id CAA17743 for ; Sun, 30 Nov 1997 02:27:40 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by www.krasnet.ru (8.8.7/8.7.3) with SMTP id OAA18937; Sun, 30 Nov 1997 14:32:14 +0700 (KRS)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sun, 30 Nov 1997 14:32:14 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected]
-Subject: Re: [HACKERS] querytrees and multiple statements
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > This will break the ability to prepare a plan (parser + optimizer) for later
-> > execution. This ability is used by RULEs (and so by VIEWs) and will be
-> > used by PL(s)...
-> >
-> > Please, take a look at nodeMaterial.c:
-> >
-> > /*-------------------------------------------------------------------------
-> >  *
-> >  * nodeMaterial.c--
-> >  *    Routines to handle materialization nodes.
-> > ...
-> > /*
-> >  * INTERFACE ROUTINES
-> >  *      ExecMaterial            - generate a temporary relation
-> >                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-> 
-> I understand what you are saying here.  The temp table has transaction
-> scope, and breaking each query into multiple commands, each with its own
-> transaction scope will cause the temp table to go away.
-
-No. I just said that there will be no ability to prepare queries with
-subselects for later execution: there will be no ability to get an execution plan which
-could be passed to executor to get results without additional parser/planner
-invocations. This ability is used by SQL-functions and SPI_prepare()/SPI_execp()
-(==> PLs). RULEs don't use execution plan, but use parsed query tree (stored
-in pg_rewrite) -> I foresee problems with VIEWs on queries with subselects.
-
-Ability to have execution plans seems important to me. Other DBMS-es use
-this for stored procedures and views.
-
-Vadim
-
-From [email protected] Mon Dec  1 01:30:57 1997
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA10903
-   for ; Mon, 1 Dec 1997 01:30:55 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id BAA26262 for ; Mon, 1 Dec 1997 01:21:28 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id BAA05263; Mon, 1 Dec 1997 01:02:12 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 01 Dec 1997 01:00:12 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id BAA03357 for pgsql-hackers-outgoing; Mon, 1 Dec 1997 01:00:07 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id AAA03290 for ; Mon, 1 Dec 1997 00:59:45 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id AAA10395;
-   Mon, 1 Dec 1997 00:57:07 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] querytrees and multiple statements
-To: [email protected] (Vadim B. Mikheev)
-Date: Mon, 1 Dec 1997 00:57:07 -0500 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Nov 30, 97 02:32:14 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> 
-> No. I just said that there will be no ability to prepare queries with
-> subselects for later execution: there will be no ability to get an execution plan which
-> could be passed to executor to get results without additional parser/planner
-> invocations. This ability is used by SQL-functions and SPI_prepare()/SPI_execp()
-> (==> PLs). RULEs don't use execution plan, but use parsed query tree (stored
-> in pg_rewrite) -> I foresee problems with VIEWs on queries with subselects.
-> 
-> Ability to have execution plans seems important to me. Other DBMS-es use
-> this for stored procedures and views.
-> 
-> Vadim
-> 
-
-I see what you are saying about other people calling pg_plan().  pg_plan
-returns the query rewritten, and a plan, and some areas use that.  I
-will have to make sure I honor that functionality in any changes I make
-to it.  I will think more about this.  I may have to add an 'execute me'
-flag to it.  However, I am unsure how I am going to generate 'just a
-plan or rewritten query structure' without actually running the query
-and having the temp table created so the rest can be parsed.
-
-
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Mon Dec  1 02:00:58 1997
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id CAA11221
-   for ; Mon, 1 Dec 1997 02:00:57 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id BAA26994 for ; Mon, 1 Dec 1997 01:55:19 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id BAA23269; Mon, 1 Dec 1997 01:47:13 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 01 Dec 1997 01:45:31 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id BAA22653 for pgsql-hackers-outgoing; Mon, 1 Dec 1997 01:45:25 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by hub.org (8.8.5/8.7.5) with ESMTP id BAA22590 for ; Mon, 1 Dec 1997 01:45:13 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by www.krasnet.ru (8.8.7/8.7.3) with SMTP id NAA21318; Mon, 1 Dec 1997 13:49:58 +0700 (KRS)
-Message-ID: <[email protected]>
-Date: Mon, 01 Dec 1997 13:49:58 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected]
-Subject: Re: [HACKERS] querytrees and multiple statements
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> >
-> > No. I just said that there will be no ability to prepare queries with
-> > subselects for later execution: there will be no ability to get an execution plan which
-> > could be passed to executor to get results without additional parser/planner
-> > invocations. This ability is used by SQL-functions and SPI_prepare()/SPI_execp()
-> > (==> PLs). RULEs don't use execution plan, but use parsed query tree (stored
-> > in pg_rewrite) -> I foresee problems with VIEWs on queries with subselects.
-> >
-> > Ability to have execution plans seems important to me. Other DBMS-es use
-> > this for stored procedures and views.
-> >
-> > Vadim
-> >
-> 
-> I see what you are saying about other people calling pg_plan().  pg_plan
-> returns the query rewritten, and a plan, and some areas use that.  I
-> will have to make sure I honor that functionality in any changes I make
-> to it.  I will think more about this.  I may have to add an 'execute me'
-> flag to it.  However, I am unsure how I am going to generate 'just a
-> plan or rewritten query structure' without actually running the query
-> and having the temp table created so the rest can be parsed.
-
-That's why I suggest trying nodeMaterial(): this could allow us to handle
-subqueries at the optimizer level and get a single execution plan for
-a single user query.
-
-Vadim
-
-
-From [email protected] Mon Dec  1 02:46:23 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id CAA11762
-   for ; Mon, 1 Dec 1997 02:46:21 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id CAA11681; Mon, 1 Dec 1997 02:35:00 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 01 Dec 1997 02:33:17 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id CAA11451 for pgsql-hackers-outgoing; Mon, 1 Dec 1997 02:33:09 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id CAA11110 for ; Mon, 1 Dec 1997 02:32:10 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id CAA11574;
-   Mon, 1 Dec 1997 02:32:45 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] querytrees and multiple statements
-To: [email protected] (Vadim B. Mikheev)
-Date: Mon, 1 Dec 1997 02:32:45 -0500 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Dec 1, 97 01:49:58 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> 
-> Bruce Momjian wrote:
-> > 
-> > >
-> > > No. I just said that there will be no ability to prepare queries with
-> > > subselects for later execution: there will be no ability to get an execution plan which
-> > > could be passed to executor to get results without additional parser/planner
-> > > invocations. This ability is used by SQL-functions and SPI_prepare()/SPI_execp()
-> > > (==> PLs). RULEs don't use execution plan, but use parsed query tree (stored
-> > > in pg_rewrite) -> I foresee problems with VIEWs on queries with subselects.
-> > >
-> > > Ability to have execution plans seems important to me. Other DBMS-es use
-> > > this for stored procedures and views.
-> > >
-> > > Vadim
-> > >
-> > 
-> > I see what you are saying about other people calling pg_plan().  pg_plan
-> > returns the query rewritten, and a plan, and some areas use that.  I
-> > will have to make sure I honor that functionality in any changes I make
-> > to it.  I will think more about this.  I may have to add an 'execute me'
-> > flag to it.  However, I am unsure how I am going to generate 'just a
-> > plan or rewritten query structure' without actually running the query
-> > and having the temp table created so the rest can be parsed.
-> 
-> That's why I suggest trying nodeMaterial(): this could allow us to handle
-> subqueries at the optimizer level and get a single execution plan for
-> a single user query.
-
-Can you give me more details on this?  I realize I can create an empty
-tmp table to get through the parser analysis stuff, but how do I do
-something in nodeMaterial?
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Tue Dec  2 00:04:05 1997
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id AAA00350
-   for ; Tue, 2 Dec 1997 00:03:58 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by www.krasnet.ru (8.8.7/8.7.3) with SMTP id MAA22889; Tue, 2 Dec 1997 12:09:57 +0700 (KRS)
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 02 Dec 1997 12:09:56 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: "Vadim B. Mikheev" , [email protected]
-Subject: Re: [HACKERS] querytrees and multiple statements
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> >
-> > That's why I suggest trying nodeMaterial(): this could allow us to handle
-> > subqueries at the optimizer level and get a single execution plan for
-> > a single user query.
-> 
-> Can you give me more details on this?  I realize I can create an empty
-> tmp table to get through the parser analysis stuff, but how do I do
-> something in nodeMaterial?
-
- *      ExecMaterial
- *
- *      The first time this is called, ExecMaterial retrieves tuples
- *      this node's outer subplan and inserts them into a temporary
-                          ^^^^^^^
-
- *      relation.  After this is done, a flag is set indicating that
- *      the subplan has been materialized.  Once the relation is
- *      materialized, the first tuple is then returned.  Successive
- *      calls to ExecMaterial return successive tuples from the temp 
- *      relation.
-
-As you see, this node materializes some plan results into a temp relation:
-instead of doing SELECT ... INTO temp FROM ... WHERE ... you could
-create a Material node using the plan for 'SELECT ... FROM ... WHERE ...' as
-its subplan. A SeqScan of this materialized relation can be used in any
-join plan just like a scan of a normal relation, e.g. a NESTLOOP plan:
-
-   NESTLOOP
-       SeqScan A
-       SeqScan B
-
-becomes
-
-   NESTLOOP
-       SeqScan
-           Material
-               ...subplan here...
-       SeqScan B (or other Material)
-
-and so on...
-
-Vadim
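
The behaviour quoted from the ExecMaterial comment (drain the subplan into a temporary relation on the first call, then return tuples from that relation) can be sketched in C like this. Subplan, Material, material_next and material_rescan are hypothetical names, tuples are plain ints, and a fixed-size array stands in for the temporary relation; the real executor node is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TUPLES 16

/* Hypothetical subplan: yields the integers next..limit, one per call. */
typedef struct Subplan {
    int next;
    int limit;
    int calls;      /* how many times the subplan was actually pulled */
} Subplan;

static bool subplan_next(Subplan *sp, int *out)
{
    sp->calls++;
    if (sp->next > sp->limit)
        return false;
    *out = sp->next++;
    return true;
}

/* Hypothetical Material node: buffers the subplan output on first use. */
typedef struct Material {
    Subplan *outer;
    int      temp[MAX_TUPLES];  /* stands in for the temporary relation */
    size_t   ntuples;
    size_t   pos;
    bool     materialized;
} Material;

static bool material_next(Material *m, int *out)
{
    assert(m != NULL && out != NULL);
    if (!m->materialized) {
        int t;
        /* First call: pull everything from the subplan into the buffer. */
        while (m->ntuples < MAX_TUPLES && subplan_next(m->outer, &t))
            m->temp[m->ntuples++] = t;
        m->materialized = true;
    }
    if (m->pos >= m->ntuples)
        return false;
    *out = m->temp[m->pos++];
    return true;
}

/* Restart the scan; the subplan is not run again. */
static void material_rescan(Material *m)
{
    m->pos = 0;
}
```

The point of the design is visible in the `calls` counter: however many times the inner side of a nested loop rescans the Material node, the subplan underneath is only executed once.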
-
-From [email protected] Tue Dec  2 01:28:02 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA02313
-   for ; Tue, 2 Dec 1997 01:28:00 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id BAA00346; Tue, 2 Dec 1997 01:03:55 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Tue, 02 Dec 1997 01:03:04 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id BAA28750 for pgsql-hackers-outgoing; Tue, 2 Dec 1997 01:02:57 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id BAA28254 for ; Tue, 2 Dec 1997 01:02:38 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id BAA01042;
-   Tue, 2 Dec 1997 01:02:15 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] querytrees and multiple statements
-To: [email protected] (Vadim B. Mikheev)
-Date: Tue, 2 Dec 1997 01:02:15 -0500 (EST)
-Cc: [email protected], [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Dec 2, 97 12:09:56 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> 
-> Bruce Momjian wrote:
-> > 
-> > >
-> > > That's why I suggest trying nodeMaterial(): this could allow us to handle
-> > > subqueries at the optimizer level and get a single execution plan for
-> > > a single user query.
-> > 
-> > Can you give me more details on this?  I realize I can create an empty
-> > tmp table to get through the parser analysis stuff, but how do I do
-> > something in nodeMaterial?
-> 
->  *      ExecMaterial
->  *
->  *      The first time this is called, ExecMaterial retrieves tuples
->  *      this node's outer subplan and inserts them into a temporary
->                           ^^^^^^^
-> 
->  *      relation.  After this is done, a flag is set indicating that
->  *      the subplan has been materialized.  Once the relation is
->  *      materialized, the first tuple is then returned.  Successive
->  *      calls to ExecMaterial return successive tuples from the temp 
->  *      relation.
-> 
-> As you see, this node materializes some plan results into a temp relation:
-> instead of doing SELECT ... INTO temp FROM ... WHERE ... you could
-> create a Material node using the plan for 'SELECT ... FROM ... WHERE ...' as
-> its subplan. A SeqScan of this materialized relation can be used in any
-> join plan just like a scan of a normal relation, e.g. a NESTLOOP plan:
-> 
->  NESTLOOP
->      SeqScan A
->      SeqScan B
-> 
-> becomes
-> 
->  NESTLOOP
->      SeqScan
->          Material
->              ...subplan here...
->      SeqScan B (or other Material)
-> 
-> and so on...
-
-The problem now is that I don't understand much about what happens
-inside the optimizer or executor.  I am sure you are correct that we can
-have the subselect as a subnode, and if you think that is best, then it
-is.
-
-This pretty much stops me in developing subselects.  I have the concepts
-down of what has to happen, but I can not implement it.  It will take me
-several months to learn how the optimizer and executor work in enough
-detail to implement this.
-
-I usually allot 2-3 days a month for PostgreSQL development.
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Thu Oct 30 01:30:59 1997
-Received: from renoir.op.net ([email protected] [206.84.208.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA17986
-   for ; Thu, 30 Oct 1997 01:30:58 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id BAA27090 for ; Thu, 30 Oct 1997 01:19:49 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id BAA28901; Thu, 30 Oct 1997 01:16:38 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 30 Oct 1997 01:16:17 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id BAA28673 for pgsql-hackers-outgoing; Thu, 30 Oct 1997 01:16:10 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by hub.org (8.8.5/8.7.5) with ESMTP id BAA27557 for ; Thu, 30 Oct 1997 01:15:27 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by www.krasnet.ru (8.8.7/8.7.3) with SMTP id NAA20275; Thu, 30 Oct 1997 13:16:10 +0700 (KRS)
-Message-ID: <[email protected]>
-Date: Thu, 30 Oct 1997 13:16:09 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: PostgreSQL Developers List 
-Subject: [HACKERS] Subqueries?
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Hi!
-
-Bruce, did you begin with them ?
-I agreed that subqueries should be implemented like SQL-funcs, but
-I would suggest not using CREATE FUNCTION, which is quite bad for
-performance, but using some new node (VirtualFunc or SubQuery or) and
-handling such nodes the way sql-funcs are handled in function.c
-(but without parser/planner invocation on each call - should be
-fixed!). Also, non-correlated subqueries returning a single result
-can't be replaced in the parser/planner by a constant node: rules (and so
-views), spi and PL use _prepared_ plans...
-It seems that this is not hard work...
-
-Vadim
-
-
-From [email protected] Thu Oct 30 16:31:59 1997
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id QAA07360
-   for ; Thu, 30 Oct 1997 16:31:49 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.5/8.7.5) with SMTP id QAA11483; Thu, 30 Oct 1997 16:27:11 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 30 Oct 1997 16:26:14 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.5/8.7.5) id QAA11163 for pgsql-hackers-outgoing; Thu, 30 Oct 1997 16:26:07 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [206.84.210.195]) by hub.org (8.8.5/8.7.5) with ESMTP id QAA10874 for ; Thu, 30 Oct 1997 16:25:12 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id QAA06370;
-   Thu, 30 Oct 1997 16:07:52 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] Subqueries?
-To: [email protected] (Vadim B. Mikheev)
-Date: Thu, 30 Oct 1997 16:07:51 -0500 (EST)
-Cc: [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Oct 30, 97 01:16:09 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> 
-> Hi!
-> 
-> Bruce, did you begin with them ?
-> I agree that subqueries should be implemented like SQL-funcs, but
-> I would suggest not using CREATE FUNCTION - this is quite bad for
-> performance - and instead using some new node (VirtualFunc or SubQuery
-> or ...) and handling such nodes the way sql-funcs are handled in
-> function.c (but without parser/planner invocation on each call - that
-> should be fixed!). Also, uncorrelated subqueries returning a single
-> result can't be replaced by a constant node in the parser/planner:
-> rules (and so views), SPI and PL use _prepared_ plans...
-> It seems that this is not hard work...
-> 
-> Vadim
-> 
-> 
-
-OK, here is what I have collected over the months about subqueries.
-The Sybase whitepaper is also attached.
-
-This should get us thinking about how to implement each subquery type,
-what operations need to be performed, and in what order.
-
----------------------------------------------------------------------------
-
-From: Bruce Momjian 
-Subject: Re: [PG95-DEV] Need info on other databases.
-To: [email protected]
-Date: Fri, 22 Nov 1996 12:49:24 -0500 (EST)
-
-> 
-> 
-> What I'm specifically interested in is the SQL-92 spec
-> for the ANSI things that postgres95 is missing and the
-> syntax/limitations on systems like Informix, Sybase,
-> Microsoft, et.al...
-> 
-> Any technical info such as performance hits, disabling
-> the use of indices, stuff like that would be _greatly_
-> appreciated.  I have a decent understanding of this for
-> Oracle, but not for any other systems.  I want to get
-> an idea of the work load of adding the IN, BETWEEN/AND
-> and HAVING clauses.
-
-I have done some thinking about subselects.  There are basically two
-issues:
-
-   Does the query return one row or several rows?  This can be
-   determined by seeing if the user uses equals or 'IN' to join the
-   subquery. 
-
-   Is the query correlated, meaning "Does the subquery reference
-   values from the outer query?"
-
-(We already have the third type of subquery, the INSERT...SELECT query.)
-
-So we have these four combinations:
-
-   1) one row, no correlation
-   2) multiple rows, no correlation
-   3) one row, correlated
-   4) multiple rows, correlated
-
-
-With #1, we can execute the subquery, get the value, replace the
-subquery with the constant returned from the subquery, and execute the
-outer query.
-
-With #2, we can execute the subquery and put the result into a temporary
-table.  We then rewrite the outer query to access the temporary table
-and replace the subquery with the column name from the temporary table. 
-We would probably put an index on the temp. table, which has only one
-column, because a subquery can only return one column.  We remove the
-temp. table after query execution.
-
-With #3 and #4, we potentially need to execute the subquery for every
-row returned by the outer query.  Performance would be horrible for
-anything but the smallest query.  Another way to handle this is to
-execute the subquery WITHOUT using any of the outer-query columns to
-restrict the WHERE clause, and add those columns used to join the outer
-variables into the target list of the subquery.  So for query:
-
-   select t1.name
-   from tab t1
-   where t1.age = (select max(t2.age)
-               from tab2 t2
-               where t2.name = t1.name)
-
-Execute the subquery and put it in a temporary table:
-
-   select t2.name, max(t2.age) as age
-   into table temp999
-   from tab2 t2
-   group by t2.name
-
-   create index i_temp999 on temp999 (name)
-
-Then re-write the outer query:
-
-   select t1.name
-   from tab t1, temp999
-   where t1.age = temp999.age and
-         t1.name = temp999.name
-
-The only problem here is that the subselect is running for all entries
-in tab2, even if the outer query is only going to need a few rows. 
-Deciding whether to execute the subquery each time or to create a temp.
-table is often difficult.  Even some non-correlated subqueries are
-better executed for each row rather than pre-executing the entire
-subquery, especially if the outer query returns few rows.
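The temp-table rewrite described above can be checked on a small data set. This is a minimal sketch using Python's sqlite3 module rather than the Postgres code under discussion; the sample rows are invented, and it assumes the worktable is built with a GROUP BY on the join column and no outer-query reference. It only shows that the correlated form and the flattened form return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab  (name TEXT, age INTEGER);
    CREATE TABLE tab2 (name TEXT, age INTEGER);
    -- Invented sample rows, for illustration only.
    INSERT INTO tab  VALUES ('ann', 30), ('bob', 25), ('cat', 40);
    INSERT INTO tab2 VALUES ('ann', 30), ('ann', 22), ('bob', 31), ('cat', 40);
""")

# Correlated form: the subquery is evaluated per outer row.
correlated = conn.execute("""
    SELECT t1.name
    FROM tab t1
    WHERE t1.age = (SELECT max(t2.age) FROM tab2 t2 WHERE t2.name = t1.name)
    ORDER BY t1.name
""").fetchall()

# Flattened form: run the subquery once for all names, keep the join
# column in the target list, and index the worktable on it.
conn.executescript("""
    CREATE TABLE temp999 AS
        SELECT t2.name AS name, max(t2.age) AS age
        FROM tab2 t2
        GROUP BY t2.name;
    CREATE INDEX i_temp999 ON temp999 (name);
""")
flattened = conn.execute("""
    SELECT t1.name
    FROM tab t1, temp999
    WHERE t1.age = temp999.age AND t1.name = temp999.name
    ORDER BY t1.name
""").fetchall()

# Remove the worktable after query execution, as the text describes.
conn.execute("DROP TABLE temp999")

print(correlated, flattened)
```

The worktable is computed for every name in tab2, which is the cost the following paragraph points out.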
-
-One requirement to handle these issues is better column statistics,
-which I am working on.
-
-------------------------------------------------------------------------------
-
-Date: Thu, 5 Dec 1996 10:07:56 -0500
-From: [email protected] (Darren King)
-To: [email protected]
-Subject: Subselect info.
-
-> Any of them deal with implementing subselects?
-
-There's a white paper at the www.sybase.com that might
-help a little.  It's just a copy of a presentation
-given by the optimizer guru there.  Nothing code-wise,
-but he gives a few ways of flattening them with temp
-tables, etc...
-
-Darren 
-
-------------------------------------------------------------------------------
-
-Date: Fri, 22 Aug 1997 12:04:31 +0800
-From: "Vadim B. Mikheev" 
-To: Bruce Momjian 
-Subject: Re: subselects
-
-Bruce Momjian wrote:
-> 
-> Considering the complexity of the primary/secondary changes you are
-> making, I believe subselects will be easier than that.
-
-I don't do changes for P/F keys - just thinking...
-Yes, I think that impl of referential integrity is
-more complex work.
-
-As for subselects:
-
-in plannodes.h
-
-typedef struct Plan {
-...
-    struct Plan         *lefttree;
-    struct Plan         *righttree;
-} Plan;
-
-/* ----------------
- *  these are are defined to avoid confusion problems with "left"
-                                   ^^^^^^^^^^^^^^^^^^
- *  and "right" and "inner" and "outer".  The convention is that   
- *  the "left" plan is the "outer" plan and the "right" plan is
- *  the inner plan, but these make the code more readable.
- * ----------------
- */
-#define innerPlan(node)         (((Plan *)(node))->righttree)
-#define outerPlan(node)         (((Plan *)(node))->lefttree)
-
-First thought is to avoid any confusion by re-defining
-
-#define rightPlan(node)         (((Plan *)(node))->righttree)
-#define leftPlan(node)          (((Plan *)(node))->lefttree)
-
-and change all occurrences of 'outer' & 'inner' in code
-to 'left' & 'right' ones:
-
-this will allow us to use 'outer' & 'inner' for subselects
-later, without confusion. My hope is that we can change the Executor
-very easily by adding outer/inner plans/TupleSlots to
-EState, CommonState, JoinState, etc. and by doing node
-processing in the right order.
-
-Subselects are mostly a Planner problem.
-
-Unfortunately, I don't have time at the moment: CHECK/DEFAULT...
-
-Vadim
-
-------------------------------------------------------------------------------
-
-Date: Fri, 22 Aug 1997 12:22:37 +0800
-From: "Vadim B. Mikheev" 
-To: Bruce Momjian 
-Subject: Re: subselects
-
-Vadim B. Mikheev wrote:
-> 
-> this will allow to use 'outer' & 'inner' things for subselects
-> later, without confusion. My hope is that we may change Executor
-
-Or maybe use 'high' & 'low' for subselects (to avoid confusion
-with outer joins).
-
-> very easy by adding outer/inner plans/TupleSlots to
-> EState, CommonState, JoinState, etc and by doing node
-> processing in right order.
-             ^^^^^^^^^^^^^^
-Rule is easy:
-1. Uncorrelated subselect - do 'low' plan node first
-2. Correlated             - do left/right first
-
-- just some flag in structures.
-
-Vadim
-
-
----------------------------------------------------------------------------
-
-Performance Tips for Transact-SQL
-
-Slides from a presentation by Jeff Lichtman
-
-----------------------------------------------------------------------------
-
-Table of Contents
-
-Overview
-> versus >=
-Exists Versus Not Exists
-Exists Versus Not Exists II
-Correlated Subqueries with Restrictive Outer Joins
-Correlated Subqueries with Restrictive Outer Joins Example
-Correlated Subqueries with Restrictive Outer Joins III
-Correlated Subqueries with Restrictive Outer Joins IV
-Correlated Subqueries with Restrictive Outer Joins V
-Correlated Subqueries with Restrictive Outer Joins Example
-Creating Tables in Stored Procedures
-Creating Tables in Stored Procedures Example
-Variables versus Parameters in Where Clause
-Variables versus Parameters in Where Clause Example
-Count versus Exists
-Count versus Exists II
-Or versus Union
-Or versus Union Example
-MAX and MIN Aggregates
-MAX and MIN Aggregates II
-MAX and MIN Aggregates Example
-MAX and MIN Aggregates III
-Joins and Datatypes
-Joins and Datatypes Example
-Joins and Datatypes II
-Joins and Datatypes III
-Parameters and Datatypes
-Parameters and Datatypes Example
-Summary
-----------------------------------------------------------------------------
-
-Overview
-
-   * Goal Is to Learn Some Tips to Help You Improve the Performance of Your
-     Queries.
-   * Emphasis Is on Queries, Not on Schema.
-   * Many Tips Are Not Related to Query Optimizer.
-   * Tips Are Based on Actual Customer Cases Seen by SQL Server Development
-     Engineer.
-   * These Tips Are Intended As Suggestions and Guidelines, Not Absolute
-     Rules.
-   * Some of These Tips Could Become Obsolete As Sybase Improves the SQL
-     Server.
-
-----------------------------------------------------------------------------
-
-> versus >=
-
-Given the query:
-
-select * from tab where x > 3
-
-with an index on x. This query works by using the index to find the first
-value where x = 3, and scanning forward.
-
-Suppose there are many rows in tab where x = 3.
-
-In this case, the server has to scan many pages before finding the first row
-where x > 3.
-
-It is more efficient to write the query like this:
-
-select * from tab where x >= 4
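The two predicates are interchangeable only when x holds integers; for any other type, x > 3 also admits values such as 3.5. A minimal sketch with Python's sqlite3 module and invented sample values confirms that both forms select the same rows for an integer column. It does not reproduce the Sybase index-scan behaviour the slide describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (x INTEGER)")
# Invented sample values with several rows where x = 3.
conn.executemany("INSERT INTO tab VALUES (?)",
                 [(v,) for v in (1, 3, 3, 3, 4, 5, 7)])
conn.execute("CREATE INDEX i_tab_x ON tab (x)")

gt = conn.execute("SELECT x FROM tab WHERE x > 3 ORDER BY x").fetchall()
ge = conn.execute("SELECT x FROM tab WHERE x >= 4 ORDER BY x").fetchall()

print(gt == ge)
```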
-
-----------------------------------------------------------------------------
-
-Exists Versus Not Exists
-
-In subqueries and IF statements, EXISTS and IN are faster than NOT EXISTS
-and NOT IN.
-
-With IF statements, one can easily avoid NOT EXISTS:
-
-if not exists (select * from ...)
-begin /* Statement group 1 */
-...
-end else begin /* Statement group 2 */
-...
-end
-
-can be re-written as:
-
-if exists (select * from ...)
-begin /* Statement group 2 */
-...
-end else begin /* Statement group 1 */
-...
-end
-
-----------------------------------------------------------------------------
-
-Exists versus Not Exists (cont.)
-
-Even without an ELSE clause, it is possible to avoid NOT EXISTS in IF
-statements:
-
-if not exists (select * from ...)
-begin
-               /* Statement group */
-               ...
-end
-...
-
-can be re-written as:
-
-if exists (select * from ...)
-begin
-     goto exists_label
-end
-/* Statement group */
-...
-exists_label:
-...
-
-----------------------------------------------------------------------------
-
-Correlated Subqueries with Restrictive Outer Joins
-
-   * SQL Server Processes Subqueries "Inside-Out"
-   * For Correlated Subqueries, It Creates a Worktable Containing Subquery
-     Results
-   * The Worktable Is Grouped on the Correlation Columns
-
-----------------------------------------------------------------------------
-
-Correlated Subqueries with Restrictive Outer Joins
-
-For example:
-
-select w from outer where x =
-     (select sum(a) from inner
-      where inner.b = outer.z)
-
-becomes:
-
-select outer.z, summ = sum(inner.a)
-into #work
-from outer, inner
-where inner.b = outer.z
-group by outer.z
-
-select outer.w
-from outer, #work
-where outer.z = #work.z
-and outer.x = #work.summ
-
-----------------------------------------------------------------------------
-
-Correlated Subqueries with Restrictive Outer Joins (cont.)
-
-The SQL Server copies search clauses from the outer query to the subquery to
-improve performance:
-
-select w from outer
-where y = 1
-and x = (select sum(a)
-     from inner
-     where inner.b = outer.z)
-
-becomes:
-
-select outer.z, summ = sum(inner.a)
-into #work
-from outer, inner
-where inner.b = outer.z and outer.y = 1
-group by outer.z
-
-select outer.w
-from outer, #work
-where outer.z = #work.z and outer.y = 1 and outer.x = #work.summ
-
-----------------------------------------------------------------------------
-
-Correlated Subqueries with Restrictive Outer Joins (cont.)
-
-   * The SQL Server Does Not Copy Join Clauses Into Correlated Subqueries As
-     It Does With Search Clauses.
-   * Copying Search Clauses Will Always Make the Query Run Faster, but
-     Copying a Join Clause Might Make It Run Slower.
-   * Copying the Join Clause Is Beneficial Only If the Join Clause Is Very
-     Restrictive.
-   * Only the Query Optimizer Knows Whether a Join Clause Is Restrictive,
-     but the SQL Server Breaks the Query Into Steps Before Optimization.
-   * Since You Know Your Data, You Can Copy Join Clauses Into Subqueries
-     When You Know It Will Help.
-
-----------------------------------------------------------------------------
-
-Correlated Subqueries with Restrictive Outer Joins (cont.)
-
-An example of when to copy join clause:
-
-select *
-from huge_tab, single_row_tab
-where huge_tab.unique_column = single_row_tab.a
-and huge_tab.b = (select sum(c)
-       from inner
-       where huge_tab.d = inner.e)
-
-should be re-written as:
-
-select *
-from huge_tab, single_row_tab
-where huge_tab.unique_column = single_row_tab.a
-and huge_tab.b = (select sum(c)
-        from inner
-        where huge_tab.d = inner.e
-        and huge_tab.unique_column = single_row_tab.a)
-
-----------------------------------------------------------------------------
-
-Correlated Subqueries with Restrictive Outer Joins (cont.)
-
-An example of when not to copy join clause:
-
-select *
-from huge_tab, single_row_tab
-where huge_tab.many_duplicates_in_column = single_row_tab.a and
-single_row_tab.b = (select sum(c)
-     from inner
-     where single_row_tab.d = inner.e)
-
-Should not be re-written as:
-
-select *
-from huge_tab, single_row_tab
-where huge_tab.many_duplicates_in_column = single_row_tab.a and
-single_row_tab.b = (select sum(c)
-      from inner
-      where single_row_tab.d = inner.e
-      and huge_tab.many_duplicates_in_column = single_row_tab.a)
-
-----------------------------------------------------------------------------
-
-Creating Tables in Stored Procedures
-
-   * When You Create a Table in the Same Stored Procedure Where It Is Used,
-     the Query Optimizer Cannot Know How Big the Table Is.
-   * The Optimizer Assumes That Any Such Table Has 10 Data Pages and 100
-     Rows.
-   * If the Table Is Really Big, This Assumption Can Lead the Optimizer to
-     Choose a Sub-Optimal Query Plan.
-   * In Cases Like This, It Is Better to Create the Table Outside the
-     Procedure, Which Allows the Optimizer to See How Large the Table Is.
-
-----------------------------------------------------------------------------
-
-Creating Tables in Stored Procedures (cont)
-
-For example:
-
-create proc p as
-      select * into #huge_result from ...
-      select * from tab, #huge_result where
- ...
-
-can be re-written as:
-
-create proc p as
-      select * into #huge_result from ...
-      exec s
-create proc s as
-      select * from tab, #huge_result where
- ...
-
-----------------------------------------------------------------------------
-
-Variables versus Parameters in Where Clause
-
-   * The Query Optimizer Cannot Predict the Value of a Declared Variable.
-   * The Query Optimizer Does Know the Value of a Parameter to a Stored
-     Procedure at Compile Time.
-   * Knowing the Values in the WHERE Clause of a Query Can Help the
-     Optimizer Make Better Choices.
-   * To Avoid Putting Variables Into WHERE Clauses, One Can Split up Stored
-     Procedures.
-
-----------------------------------------------------------------------------
-
-Variables versus Parameters in Where Clause (cont)
-
-For example:
-
-create procedure p as
-       declare @x int
-       select @x = col from tab where ...
-       select * from tab2 where col2 = @x
-
-can be re-written as:
-
-create procedure p as
-       declare @x int
-       select @x = col from tab where ...
-       exec s @x
-create procedure s @x int as
-       select * from tab2 where col2 = @x
-
-----------------------------------------------------------------------------
-
-Count versus Exists
-
-It is possible to use the COUNT aggregate in a subquery to do an existence
-check:
-
-select * from tab where 0 <
-        (select count(*) from tab2 where ...)
-
-It is possible to write this same query using EXISTS (or IN):
-
-select * from tab where exists
-       (select * from tab2 where ...)
-
-----------------------------------------------------------------------------
-
-Count versus Exists (cont)
-
-   * Using COUNT to Do an Existence Check Is Slower Than Using EXISTS.
-   * When You Use COUNT, the SQL Server Does Not Know That You Are Doing an
-     Existence Check. It Counts All of the Matching Values.
-   * When You Use EXISTS, the SQL Server Knows You Are Doing an Existence
-     Check, So It Stops Looking When It Finds the First Matching Value.
-   * The Same Applies to Using COUNT Instead of IN or ANY.
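The equivalence of the two existence checks can be confirmed on a small case. The sketch below uses Python's sqlite3 module; the tables, the `ref` column, and the correlation `tab2.ref = tab.id` are hypothetical stand-ins for the elided `where ...` clauses. It checks only that both forms return the same rows and says nothing about their relative speed on any server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab  (id INTEGER);
    CREATE TABLE tab2 (ref INTEGER);   -- hypothetical correlation column
    INSERT INTO tab  VALUES (1), (2), (3);
    INSERT INTO tab2 VALUES (1), (1), (3);
""")

# Existence check via COUNT: every matching row is counted.
count_rows = conn.execute("""
    SELECT id FROM tab
    WHERE 0 < (SELECT count(*) FROM tab2 WHERE tab2.ref = tab.id)
    ORDER BY id
""").fetchall()

# Existence check via EXISTS: one matching row is enough.
exists_rows = conn.execute("""
    SELECT id FROM tab
    WHERE EXISTS (SELECT * FROM tab2 WHERE tab2.ref = tab.id)
    ORDER BY id
""").fetchall()

print(count_rows, exists_rows)
```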
-
-----------------------------------------------------------------------------
-
-Or versus Union
-
-   * The SQL Server Cannot Optimize Join Clauses That Are Linked With OR.
-   * The SQL Server Can Optimize Selects That Are Linked With UNION.
-   * The Result of OR Is Somewhat Like the Result of UNION, Except For the
-     Treatment of Duplicate Rows and Empty Tables.
-
-----------------------------------------------------------------------------
-
-Or versus Union (cont)
-
-For example:
-
-select * from tab1, tab2
-where tab1.a = tab2.b
-or tab1.x = tab2.y
-
-can be re-written as:
-
-select * from tab1, tab2
-where tab1.a = tab2.b
-union all
-select * from tab1, tab2
-where tab1.x = tab2.y
-
-You can use UNION instead of UNION ALL if you want to eliminate duplicates,
-but this will eliminate all duplicates. It may not be possible to get
-exactly the same set of duplicates from the re-written query.
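The duplicate behaviour is easiest to see by counting rows. In this sketch (Python's sqlite3 module, invented data) one tab1 row is stored twice and one row pair satisfies both join conditions, so the OR form, the UNION ALL form, and the UNION form each return a different number of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (a INTEGER, x INTEGER);
    CREATE TABLE tab2 (b INTEGER, y INTEGER);
    -- (1, 10) is stored twice on purpose; it matches tab2 (1, 10) on
    -- both a = b and x = y.
    INSERT INTO tab1 VALUES (1, 10), (1, 10), (2, 20);
    INSERT INTO tab2 VALUES (1, 10), (2, 99), (7, 20);
""")

or_rows = conn.execute("""
    SELECT * FROM tab1, tab2
    WHERE tab1.a = tab2.b OR tab1.x = tab2.y
""").fetchall()

union_all_rows = conn.execute("""
    SELECT * FROM tab1, tab2 WHERE tab1.a = tab2.b
    UNION ALL
    SELECT * FROM tab1, tab2 WHERE tab1.x = tab2.y
""").fetchall()

union_rows = conn.execute("""
    SELECT * FROM tab1, tab2 WHERE tab1.a = tab2.b
    UNION
    SELECT * FROM tab1, tab2 WHERE tab1.x = tab2.y
""").fetchall()

# OR keeps the genuine duplicate once per stored row, UNION ALL repeats
# pairs that satisfy both conditions, UNION collapses everything.
print(len(or_rows), len(union_all_rows), len(union_rows))
```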
-----------------------------------------------------------------------------
-
-MAX and MIN Aggregates
-
-   * The SQL Server Uses Special Optimizations for the MAX and MIN
-     Aggregates When There Is an Index on the Aggregated Column.
-   * For MIN, It Stops the Scan on the First Qualifying Row.
-   * For MAX, It Goes Directly to the End of the Index to Find the Last Row.
-   * The Optimization Is Not Applied If:
-        o The Expression Inside the MAX or MIN Is Anything but a Column
-        o The Column Inside the MAX or MIN Is Not the First Column of an
-          Index
-        o There Is Another Aggregate in the Query
-        o There Is a GROUP BY Clause
-   * In Addition, the MAX Optimization Is Not Applied If There Is a WHERE
-     Clause.
-
-----------------------------------------------------------------------------
-
-MAX and MIN Aggregates (cont)
-
-If you have an optimizable MAX or MIN aggregate, it can pay to put it in a
-query separate from other aggregates. For example:
-
-select max(x), min(x) from tab
-
-will result in a full scan of tab, even if there is an index on x. The query
-can be re-written as:
-
-select max(x) from tab
-select min(x) from tab
-
-This can result in using the index twice, rather than scanning the entire
-table once.
-----------------------------------------------------------------------------
-
-MAX and MIN Aggregates (cont)
-
-The MIN optimization can backfire if the where clause is highly selective.
-For example:
-
-select min(index_col)
-from tab
-where
-       col_in_other_index = "value only at end of first index"
-
-The MIN optimization will result in a nearly complete scan of the entire
-index.
-
-This is counter-intuitive. The more selective the WHERE clause, the slower
-the query.
-----------------------------------------------------------------------------
-
-MAX and MIN Aggregates (cont)
-
-In cases like this, it can pay to disable the MIN optimization by combining
-it with another aggregate:
-
-select min(index_col), max(index_col)
-from tab
-where
-       col_in_other_index = "value only at end of first index"
-
-This convinces the optimizer not to use the MIN optimization, so it chooses
-the next best plan, which might be the other index.
-----------------------------------------------------------------------------
-
-Joins and Datatypes
-
-   * When Joining Between Two Columns of Different Datatypes, One of the
-     Columns Must Be Converted to the Type of the Other.
-   * The Commands Reference Manual Shows the Hierarchy of Types.
-   * The Column Whose Type Is Lower in the Hierarchy Is the One That Is
-     Converted.
-   * The Query Optimizer Cannot Choose an Index on the Column That Is
-     Converted.
-
-----------------------------------------------------------------------------
-
-Joins and Datatypes (cont)
-
-For example:
-
-select *
-from tab1, tab2
-where tab1.float_column = tab2.int_column
-
-In this case, no index on tab2.int_column can be used, because int is lower
-in the hierarchy than float.
-
-Note that CHAR NULL is really VARCHAR, and BINARY NULL is really VARBINARY.
-
-Joining CHAR NOT NULL with CHAR NULL involves a conversion (BINARY too).
-----------------------------------------------------------------------------
-
-Joins and Datatypes (cont)
-
-It's best to avoid datatype problems in joins by designing the schema
-accordingly.
-
-If a join between different datatypes is unavoidable, and it hurts
-performance, you can force the conversion to be on the other side of the
-join.
-
-For example:
-
-select *
-from tab1, tab2
-where tab1.char_column = convert(char(75),tab2.varchar_column)
-
-----------------------------------------------------------------------------
-
-Joins and Datatypes (cont)
-
-Be careful! This tactic can change the meaning of the query.
-
-For example:
-
-select *
-from tab1, tab2
-where tab1.int_column = convert(int, tab2.float_column)
-
-This will not return the same results as the join without the convert. It
-can be salvaged by adding:
-
-and tab2.float_column = convert(int, tab2.float_column)
-
-This assumes that all values in tab2.float_column can be converted to int.
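The change in meaning and the salvage clause can be reproduced on a small case. This sketch uses Python's sqlite3 module, with CAST standing in for Transact-SQL's convert (an assumption: SQLite's CAST truncates a REAL toward zero, and Sybase's convert may round differently). The sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (int_column INTEGER);
    CREATE TABLE tab2 (float_column REAL);
    INSERT INTO tab1 VALUES (3), (4);
    INSERT INTO tab2 VALUES (3.0), (3.7), (4.0);
""")

base = "SELECT int_column, float_column FROM tab1, tab2 WHERE "
order = " ORDER BY int_column, float_column"

# Original join: 3 never equals 3.7.
plain = conn.execute(
    base + "tab1.int_column = tab2.float_column" + order).fetchall()

# Conversion forced onto the float side: 3.7 becomes 3 and now matches.
converted = conn.execute(
    base + "tab1.int_column = CAST(tab2.float_column AS INTEGER)"
    + order).fetchall()

# Extra clause keeps only floats that survive the conversion unchanged.
salvaged = conn.execute(
    base + "tab1.int_column = CAST(tab2.float_column AS INTEGER) "
    "AND tab2.float_column = CAST(tab2.float_column AS INTEGER)"
    + order).fetchall()

print(plain, converted, salvaged)
```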
-----------------------------------------------------------------------------
-
-Parameters and Datatypes
-
-   * The Query Optimizer Can Use the Values of Parameters to Stored
-     Procedures to Help Determine Costs.
-   * If a Parameter Is Not of the Same Type As the Column in The WHERE
-     Clause That It Is Being Compared to, the Server Has to Convert the
-     Parameter.
-   * The Optimizer Cannot Use the Value of a Converted Parameter.
-   * It Pays to Make Sure That Parameters Have the Same Type As the Columns
-     They Are Compared To.
-
-----------------------------------------------------------------------------
-
-Parameters and Datatypes (cont)
-
-For example:
-
-create proc p @x varchar(30) as
-select * from tab where char_column = @x
-
-may get a poorer query plan than:
-
-create proc p @x char(30) as
-select * from tab where char_column = @x
-
-Remember that CHAR NULL is really VARCHAR, and BINARY NULL is really
-VARBINARY.
-----------------------------------------------------------------------------
-
-Summary
-
-   * How you write your queries can make a big difference in performance.
-   * Two different queries that do the same thing may perform differently.
-   * There are few absolutes to improving performance, but the tips given
-     here can help.
-   * These tips are not all there is to know about performance.
-
-About the Author
-
-Jeff Lichtman has worked at Sybase since 1987. In 1994, he was given the new
-position of architect of query processing for SQL Server. He is informally
-known as Sybase's optimizer guru.
-
-For more info send email to [email protected]
-
-Copyright 1995 © Sybase, Inc. All Rights Reserved.
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Sun Jan 11 23:49:44 1998
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id XAA19252
-   for ; Sun, 11 Jan 1998 23:49:02 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id MAA08095;
-   Mon, 12 Jan 1998 12:09:24 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Mon, 12 Jan 1998 12:09:20 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected], [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> We need a new Node structure, call it Sublink:
-> 
->         int     linkType        (IN, NOTIN, ANY, EXISTS, OPERATOR...)
->         Oid     operator        /* subquery must return single row */
->         List    *lefthand;      /* parent stuff */
->         Node    *subquery;      /* represents nodes from parser */
->         Index   Subindex;       /* filled in to index Query->subqueries */
-
-Ok, I agree that it's better to have a new node and not put subquery stuff
-into the Expr node.
-
-int linkType
-        is one of EXISTS, ANY, ALL, EXPR. EXPR is for the case of expression
-        subqueries (following Sybase naming) which must return single row -
-        (a, b, c) = (subquery).
-        Note again that there is no linkType for IN and NOTIN here. 
-        The user's IN and NOT IN must be converted to = ANY and <> ALL by the parser.
-
-We don't need Oid operator! In all cases we need
-
-List *oper
-        list of Oper nodes for each of a, b, c, ... and operator (=, ...)
-        corresponding to data type of a, b, c, ...
-
-List *lefthand
-        is list of Var/Const nodes - representation of (a, b, c, ...)
-
-What is Node *subquery ?
-In the optimizer we need either Subindex (to get the subquery from
-Query->subqueries when we are in a Sublink) or Node *subquery inside the
-Sublink itself.
-BTW, after some thought I don't see how Query->subqueries will be useful.
-So, maybe just add bool hassubqueries to Query (and Query *parentQuery)
-and use Query *subquery in Sublink, but not Subindex ?
-
-> 
-> Also, when parsing the subqueries, we need to keep track of correlated
-> references.  I recommend we add a field to the Var structure:
-> 
->         Index   sublevel;       /* range table reference:
->                                    = 0  current level of query
->                                    < 0  parent above this many levels
->                                    > 0  index into subquery list
->                                  */
-> 
-> This way, a Var node with sublevel 0 is the current level, and is true
-> in most cases.  This helps us not have to change much code.  sublevel =
-> -1 means it references the range table in the parent query. sublevel =
-> -2 means the parent's parent. sublevel = 2 means it references the range
-> table of the second entry in Query->subqueries.  Varno and varattno are
-> still meaningful.  Of course, we can't reference variables in the
-> subqueries from the parent in the parser code, but Vadim may want to.
-                                                     ^^^^^^^^^^^^^^^^^
-No. So, just use sublevel >= 0: 0 - current level, 1 - one level up, ...
-sublevel is for optimizer only - executor will not use it.
-
-> 
-> When doing a Var lookup in the parser, we look in the current level
-> first, but if not found, if it is a subquery, we can look at the parent
-> and parent's parent to set the sublevel, varno, and varattno properly.
-> 
-> We create no phantom range table entries in the subquery, and no phantom
-> target list entries.   We can leave that all for the upper optimizer.
-
-Ok.
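The lookup described above can be sketched in a few lines. This is an illustration only, not the parser code: `resolve_var` and the list-of-scopes representation are hypothetical, and the sketch follows Vadim's convention that sublevel is 0 for the current query and counts upward toward the parents.

```python
def resolve_var(column, scopes):
    """Find a column by searching the current query first, then its parents.

    scopes[0] is the range table of the current (sub)query, scopes[1] that
    of its parent, and so on.  Each range table is a list of
    (table_name, [column names]) entries.  Returns a hypothetical
    (sublevel, varno, varattno) triple with 1-based varno and varattno.
    """
    for sublevel, range_table in enumerate(scopes):
        for varno, (_table, columns) in enumerate(range_table, start=1):
            if column in columns:
                return sublevel, varno, columns.index(column) + 1
    raise LookupError(f"column {column!r} not found at any query level")


# Subquery over inner_t nested in a parent query over outer_t.
scopes = [
    [("inner_t", ["a", "b"])],        # current level (the subquery)
    [("outer_t", ["w", "x", "z"])],   # parent query
]

local_ref = resolve_var("b", scopes)       # found in the subquery itself
correlated_ref = resolve_var("z", scopes)  # found one level up
print(local_ref, correlated_ref)
```

A nonzero sublevel marks the reference as correlated, and no phantom range table entry is added to the subquery.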
-
-Vadim
-
-From [email protected] Mon Jan 12 08:06:41 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id IAA00786
-   for ; Mon, 12 Jan 1998 08:06:39 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id EAA12270 for ; Mon, 12 Jan 1998 04:16:10 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id QAA08460;
-   Mon, 12 Jan 1998 16:34:54 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Mon, 12 Jan 1998 16:34:45 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: "Thomas G. Lockhart" 
-CC: Bruce Momjian , [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]> <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Thomas G. Lockhart wrote:
-> 
-> btw, to implement "(a,b,c) OP (d,e,f)" I made a new routine in the parser called
-> makeRowExpr() which breaks this up into a sequence of "and" and/or "or" expressions.
-> If lists are handled farther back, this routine should move to there also and the
-> parser will just pass the lists. Note that some assumptions have to be made about the
-> meaning of "(a,b) OP (c,d)", since usually we only have knowledge of the behavior of
-> "a OP c". Easy for the standard SQL operators, unknown for others, but maybe it is OK
-> to disallow those cases or to look for specific appearance of the operator to guess
-> the behavior (e.g. if the operator has "<" or "=" or ">" then build as "and"s and if
-> it has "<>" or "!" then build as "or"s.
-
-Oh, god! I never thought about this!
-Ok, I have to agree:
-
-1. Only <, <=, =, >, >=, <> are allowed with subselects
-2. Use ORs for <>, and so we need bool useor in SubLink
-   for <>, <> ANY and <> ALL:
-
-typedef struct SubLink {
-   NodeTag     type;
-   int     linkType; /* EXISTS, ALL, ANY, EXPR */
-   bool        useor;    /* TRUE for <> */
-   List            *lefthand; /* List of Var/Const nodes on the left */
-   List            *oper;     /* List of Oper nodes */
-   Query           *subquery; /* */
-} SubLink;
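The rule above (ANDs for every allowed operator except <>, which uses ORs) can be sketched as a small helper. The function name and the string-based output are hypothetical and only illustrate how the useor flag would be derived; this is not the makeRowExpr() routine.

```python
ALLOWED_OPERATORS = ("<", "<=", "=", ">", ">=", "<>")


def expand_row_comparison(left, op, right):
    """Expand (a, b, ...) OP (c, d, ...) into pairwise comparisons.

    Returns (useor, expression): useor is True only for <>, in which
    case the pairwise terms are joined with OR instead of AND.
    """
    if op not in ALLOWED_OPERATORS:
        raise ValueError(f"operator {op!r} is not allowed with subselects")
    if len(left) != len(right):
        raise ValueError("both sides must have the same number of columns")
    useor = op == "<>"
    joiner = " OR " if useor else " AND "
    terms = [f"{l} {op} {r}" for l, r in zip(left, right)]
    return useor, "(" + joiner.join(terms) + ")"


equal = expand_row_comparison(["a", "b"], "=", ["c", "d"])
not_equal = expand_row_comparison(["a", "b"], "<>", ["c", "d"])
print(equal, not_equal)
```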
-
-Vadim
-
-From [email protected] Mon Jan 12 08:06:53 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id IAA00814
-   for ; Mon, 12 Jan 1998 08:06:51 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id EAA12449 for ; Mon, 12 Jan 1998 04:26:03 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id EAA01671; Mon, 12 Jan 1998 04:17:59 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 12 Jan 1998 04:17:29 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id EAA01651 for pgsql-hackers-outgoing; Mon, 12 Jan 1998 04:17:23 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by hub.org (8.8.8/8.7.5) with ESMTP id EAA01633 for ; Mon, 12 Jan 1998 04:16:44 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id QAA08460;
-   Mon, 12 Jan 1998 16:34:54 +0700 (KRS)
-   (envelope-from [email protected])
-Message-ID: <[email protected]>
-Date: Mon, 12 Jan 1998 16:34:45 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: "Thomas G. Lockhart" 
-CC: Bruce Momjian , [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]> <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Thomas G. Lockhart wrote:
-> 
-> btw, to implement "(a,b,c) OP (d,e,f)" I made a new routine in the parser called
-> makeRowExpr() which breaks this up into a sequence of "and" and/or "or" expressions.
-> If lists are handled farther back, this routine should move to there also and the
-> parser will just pass the lists. Note that some assumptions have to be made about the
-> meaning of "(a,b) OP (c,d)", since usually we only have knowledge of the behavior of
-> "a OP c". Easy for the standard SQL operators, unknown for others, but maybe it is OK
-> to disallow those cases or to look for specific appearance of the operator to guess
-> the behavior (e.g. if the operator has "<" or "=" or ">" then build as "and"s and if
-> it has "<>" or "!" then build as "or"s).
-
-Oh, god! I never thought about this!
-Ok, I have to agree:
-
-1. Only <, <=, =, >, >=, <> are allowed with subselects
-2. Use ORs for <>, so we need a bool useor in SubLink
-   for <>, <> ANY and <> ALL:
-
-typedef struct SubLink {
-   NodeTag     type;
-   int     linkType; /* EXISTS, ALL, ANY, EXPR */
-   bool        useor;    /* TRUE for <> */
-   List            *lefthand; /* List of Var/Const nodes on the left */
-   List            *oper;     /* List of Oper nodes */
-   Query           *subquery; /* */
-} SubLink;
-
-Vadim
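
A minimal sketch of the rule in point 2 above, assuming integer columns and plain C function pointers in place of the real Oper nodes. The names row_op_uses_or, row_compare, int_eq and int_ne are made up for illustration and are not PostgreSQL functions. Only the = and <> cases are shown: per-column results are ANDed for =, and ORed when useor is set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the parser would set useor only for "<>". */
bool row_op_uses_or(const char *opname)
{
    return strcmp(opname, "<>") == 0;
}

/* Illustrative per-column operators standing in for Oper nodes. */
bool int_eq(int a, int b) { return a == b; }
bool int_ne(int a, int b) { return a != b; }

/*
 * Evaluate (l[0], ..., l[n-1]) OP (r[0], ..., r[n-1]) by applying op to
 * each column pair and joining the results with AND, or with OR when
 * useor is true.
 */
bool row_compare(const int *l, const int *r, size_t n,
                 bool (*op)(int, int), bool useor)
{
    bool result = !useor;   /* AND starts from true, OR from false */

    for (size_t i = 0; i < n; i++) {
        bool match = op(l[i], r[i]);
        result = useor ? (result || match) : (result && match);
    }
    return result;
}
```

With these stand-ins, (1,2) = (1,3) is false because the second column differs, while (1,2) <> (1,3) is true because one differing column is enough once the results are ORed.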
-
-
-From [email protected] Mon Jan 12 08:06:38 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id IAA00783
-   for ; Mon, 12 Jan 1998 08:06:36 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id EAA12377 for ; Mon, 12 Jan 1998 04:21:55 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id QAA08470;
-   Mon, 12 Jan 1998 16:40:49 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Mon, 12 Jan 1998 16:40:48 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: "Thomas G. Lockhart" 
-CC: Bruce Momjian , [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]> <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Thomas G. Lockhart wrote:
-> 
-> btw, to implement "(a,b,c) OP (d,e,f)" I made a new routine in the parser called
-> makeRowExpr() which breaks this up into a sequence of "and" and/or "or" expressions.
-> If lists are handled farther back, this routine should move to there also and the
-> parser will just pass the lists. Note that some assumptions have to be made about the
-> meaning of "(a,b) OP (c,d)", since usually we only have knowledge of the behavior of
-> "a OP c". Easy for the standard SQL operators, unknown for others, but maybe it is OK
-> to disallow those cases or to look for specific appearance of the operator to guess
-> the behavior (e.g. if the operator has "<" or "=" or ">" then build as "and"s and if
-> it has "<>" or "!" then build as "or"s).
-
-Sorry, I forgot something: is (a, b) OP (x, y) in the standard?
-If not, then I suggest we don't implement it at all and allow
-(a, b) OP [ANY|ALL] (subselect) only.
-
-Vadim
-
-From [email protected] Tue Jan 13 09:30:58 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id JAA28551
-   for ; Tue, 13 Jan 1998 09:30:56 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id JAA26483 for ; Tue, 13 Jan 1998 09:21:36 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id VAA04356;
-   Tue, 13 Jan 1998 21:20:31 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Tue, 13 Jan 1998 21:20:25 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected], [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Ok. I don't see how Query->subqueries could help me, but I foresee
-that Query->sublinks can do it. Could you add this?
-
-Bruce Momjian wrote:
-> 
-> >
-> > What is Node *subquery ?
-> > In the optimizer we need either a Subindex (to get the subquery from Query->subqueries
-> > when being in Sublink) or a Node *subquery inside Sublink itself.
-> > BTW, after some thought I don't see how Query->subqueries will be useful.
-> > So, maybe just add bool hassubqueries to Query (and Query *parentQuery)
-> > and use Query *subquery in Sublink, but not subindex ?
-> 
-> OK, I originally created it because the parser would have trouble
-> filling in a List* field in SelectStmt while it was parsing a WHERE
-> clause.  I decided to just stick the SelectStmt* into Sublink->subquery.
-> 
-> While we are going through the parse output to fill in the Query*, I
-> thought we should move the actual subquery parse output to a separate
-> place, and once the Query* was completed, spin through the saved
-> subquery parse list and stuff Query->subqueries with a list of Query*
-> for the subqueries.  I thought this would be easier, because we would
-> then have all the subqueries in a nice list that we can manage easier.
-> 
-> In fact, we can fill Query->subqueries with SelectStmt* as we process
-> the WHERE clause, then convert them to Query* at the end.
-> 
-> If you would rather keep the subquery Query* entries in the Sublink
-> structure, we can do that.  The only issue I see is that when you want
-> to get to them, you have to wade through the WHERE clause to find them.
-> For example, we will have to run the subquery Query* through the rewrite
-> system.  Right now, for UNION, I have a nice union List* in Query, and I
-> just spin through it in postgres.c for each Union query.  If we keep the
-> subquery Query* inside Sublink, we have to have some logic to go through
-> and find them.
-> 
-> If we just have an Index in Sublink to the Query->subqueries, we can use
-> the nth() macro to find them quite easily.
-> 
-> But it is up to you.  I really don't know how you are going to handle
-> things like:
-> 
->         select *
->         from taba
->         where x = 3 and y = 5 and (z=6 or q in (select g from tabb ))
-
-No problems.
-
-> 
-> My logic was to break the problem down to single queries as much as
-> possible, so we would be breaking the problem up into pieces.  Whatever
-> is easier for you.
-
-Vadim
-
-From [email protected] Tue Jan 13 10:32:35 1998
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id KAA29523
-   for ; Tue, 13 Jan 1998 10:32:33 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id KAA03743; Tue, 13 Jan 1998 10:32:13 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Tue, 13 Jan 1998 10:31:57 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id KAA03708 for pgsql-hackers-outgoing; Tue, 13 Jan 1998 10:31:51 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.8/8.7.5) with ESMTP id KAA03628 for ; Tue, 13 Jan 1998 10:31:20 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id JAA28747;
-   Tue, 13 Jan 1998 09:48:00 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] Re: subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Tue, 13 Jan 1998 09:48:00 -0500 (EST)
-Cc: [email protected], [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Jan 13, 98 09:20:25 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> 
-> Ok. I don't see how Query->subqueries could help me, but I foresee
-> that Query->sublinks can do it. Could you add this?
-
-OK, so instead of moving the query out of the SubLink structure, you
-want the Query* in the Sublink structure, and a List* of SubLink
-pointers in the query structure?
-
-   Query
-   {
-       ...
-       List *sublink;  /* list of pointers to Sublinks */
-       ...
-   }
-
-I can do that.  Let me know.
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Tue Jan 13 22:23:46 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id WAA08806
-   for ; Tue, 13 Jan 1998 22:23:45 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id WAA11486 for ; Tue, 13 Jan 1998 22:09:55 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id KAA05660;
-   Wed, 14 Jan 1998 10:09:07 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Wed, 14 Jan 1998 10:09:02 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected], [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> >
-> > Ok. I don't see how Query->subqueries could help me, but I foresee
-> > that Query->sublinks can do it. Could you add this?
-> 
-> OK, so instead of moving the query out of the SubLink structure, you
-> want the Query* in the Sublink structure, and a List* of SubLink
-> pointers in the query structure?
-
-Yes.
-
-> 
->         Query
->         {
->                 ...
->                 List *sublink;  /* list of pointers to Sublinks */
->                 ...
->         }
-> 
-> I can do that.  Let me know.
-
-Thanks!
-
-Are there any open issues?
-
-Vadim
-
-From [email protected] Thu Jan 15 19:00:40 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id TAA21676
-   for ; Thu, 15 Jan 1998 19:00:39 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id SAA23948 for ; Thu, 15 Jan 1998 18:35:59 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id SAA27814; Thu, 15 Jan 1998 18:32:40 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 15 Jan 1998 18:32:20 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id SAA27668 for pgsql-hackers-outgoing; Thu, 15 Jan 1998 18:32:08 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.8/8.7.5) with ESMTP id SAA27425 for ; Thu, 15 Jan 1998 18:31:32 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id SAA12920;
-   Thu, 15 Jan 1998 18:18:32 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] Re: subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Thu, 15 Jan 1998 18:18:31 -0500 (EST)
-Cc: [email protected], [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Jan 14, 98 10:09:02 am
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> 
-> Bruce Momjian wrote:
-> > 
-> > >
-> > > Ok. I don't see how Query->subqueries could help me, but I foresee
-> > > that Query->sublinks can do it. Could you add this?
-> > 
-> > OK, so instead of moving the query out of the SubLink structure, you
-> > want the Query* in the Sublink structure, and a List* of SubLink
-> > pointers in the query structure?
-> 
-> Yes.
-> 
-> > 
-> >         Query
-> >         {
-> >                 ...
-> >                 List *sublink;  /* list of pointers to Sublinks */
-> >                 ...
-> >         }
-> > 
-> > I can do that.  Let me know.
-> 
-> Thanks!
-> 
-> Are there any open issues?
-
-OK, what do you need me to do?  Do you want me to create the Sublink
-support stuff, fill them in in the parser, and pass them through the
-rewrite section and into the optimizer?  I will prepare a list of
-changes.
-
-
--- 
-Bruce Momjian
-[email protected]
-
-
-From [email protected] Thu Jan 15 19:00:38 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id TAA21663
-   for ; Thu, 15 Jan 1998 19:00:36 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id SAA23925 for ; Thu, 15 Jan 1998 18:35:42 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id SAA27796; Thu, 15 Jan 1998 18:32:37 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 15 Jan 1998 18:31:52 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id SAA27463 for pgsql-hackers-outgoing; Thu, 15 Jan 1998 18:31:37 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.8/8.7.5) with ESMTP id SAA27167 for ; Thu, 15 Jan 1998 18:31:06 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id SAA26747;
-   Thu, 15 Jan 1998 18:26:42 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: Re: [HACKERS] Re: subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Thu, 15 Jan 1998 18:26:41 -0500 (EST)
-Cc: [email protected], [email protected]
-In-Reply-To: <[email protected]> from "Vadim B. Mikheev" at Jan 12, 98 04:34:45 pm
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> typedef struct SubLink {
->  NodeTag     type;
->  int     linkType; /* EXISTS, ALL, ANY, EXPR */
->  bool        useor;    /* TRUE for <> */
->  List            *lefthand; /* List of Var/Const nodes on the left */
->  List            *oper;     /* List of Oper nodes */
->  Query           *subquery; /* */
-> } SubLink;
-
-OK, we add this structure above.  During parsing, *subquery actually
-will hold Node *parsetree, not Query *.
-
-And add to Query:
-
-   bool    hasSubLinks;
-
-Also need a function to return a List* of SubLink*.  I just did a
-similar thing with Aggreg*.  And Var gets:
-
-   int uplevels;
-
-Is that it?
-
-
--- 
-Bruce Momjian
-[email protected]
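
A rough sketch of the "function to return a List* of SubLink*" idea mentioned above, assuming a tiny stand-in expression tree instead of the real parser nodes. The Node layout, the tag names and collect_sublinks are hypothetical, and a fixed array is used in place of a List*. It walks the WHERE tree and records every SubLink it meets, which is the work that a hasSubLinks flag lets the caller skip when there is nothing to find.

```c
#include <stddef.h>

/* Simplified stand-in tags; the real node tags differ. */
typedef enum { T_Var, T_Const, T_BoolExpr, T_SubLink } NodeTag;

typedef struct Node {
    NodeTag      type;
    struct Node *left;    /* children, used by T_BoolExpr only */
    struct Node *right;
} Node;

/*
 * Walk the expression tree rooted at n and store pointers to the SubLink
 * nodes found in out[] (at most max entries). Returns the total number of
 * SubLinks in the tree, which may exceed max.
 */
size_t collect_sublinks(Node *n, Node **out, size_t max)
{
    if (n == NULL)
        return 0;

    if (n->type == T_SubLink) {
        if (max > 0)
            out[0] = n;
        return 1;
    }

    if (n->type != T_BoolExpr)
        return 0;            /* leaf that is not a SubLink */

    size_t found = collect_sublinks(n->left, out, max);
    size_t used = found < max ? found : max;

    found += collect_sublinks(n->right, out + used, max - used);
    return found;
}
```

For a tree shaped like "x = 3 and y = 5 and (z = 6 or q in (select ...))" the walk returns a single entry, the SubLink under the OR.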
-
-
-From [email protected] Fri Jan 16 04:36:05 1998
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id EAA09604
-   for ; Fri, 16 Jan 1998 04:36:03 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id EAA07040; Fri, 16 Jan 1998 04:35:27 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 16 Jan 1998 04:35:18 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id EAA06936 for pgsql-hackers-outgoing; Fri, 16 Jan 1998 04:35:13 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by hub.org (8.8.8/8.7.5) with ESMTP id EAA06823 for ; Fri, 16 Jan 1998 04:34:22 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id QAA10384;
-   Fri, 16 Jan 1998 16:34:15 +0700 (KRS)
-   (envelope-from [email protected])
-Message-ID: <[email protected]>
-Date: Fri, 16 Jan 1998 16:34:15 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected], [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> > typedef struct SubLink {
-> >       NodeTag         type;
-> >       int             linkType; /* EXISTS, ALL, ANY, EXPR */
-> >       bool            useor;    /* TRUE for <> */
-> >       List            *lefthand; /* List of Var/Const nodes on the left */
-> >       List            *oper;     /* List of Oper nodes */
-> >       Query           *subquery; /* */
-> > } SubLink;
-> 
-> OK, we add this structure above.  During parsing, *subquery actually
-> will hold Node *parsetree, not Query *.
-            ^^^^^^^^^^^^^^^
-But the optimizer will get a Query node here, yes?
-
-> 
-> And add to Query:
-> 
->         bool    hasSubLinks;
-> 
-> Also need a function to return a List* of SubLink*.  I just did a
-> similar thing with Aggreg*.  And Var gets:
-> 
->         int uplevels;
-> 
-> Is that it?
-
-Yes.
-
-Vadim
-
-
-From [email protected] Fri Jan 16 04:36:21 1998
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id EAA09607
-   for ; Fri, 16 Jan 1998 04:36:06 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id QAA10396;
-   Fri, 16 Jan 1998 16:37:21 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Fri, 16 Jan 1998 16:37:20 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: [email protected], [email protected]
-Subject: Re: [HACKERS] Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> >
-> > Are there any open issues?
-> 
-> OK, what do you need me to do?  Do you want me to create the Sublink
-> support stuff, fill them in in the parser, and pass them through the
-> rewrite section and into the optimizer?  I will prepare a list of
-> changes.
-
-Please do this. I'm ready to start coding things in the optimizer.
-
-Vadim
-
-From [email protected] Sun Jan 18 07:32:52 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id HAA14786
-   for ; Sun, 18 Jan 1998 07:32:51 -0500 (EST)
-Received: from www.krasnet.ru ([193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id HAA29385 for ; Sun, 18 Jan 1998 07:25:55 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id TAA15780;
-   Sun, 18 Jan 1998 19:27:14 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Sun, 18 Jan 1998 19:27:09 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: "Thomas G. Lockhart" 
-CC: Bruce Momjian ,
-        PostgreSQL-development 
-Subject: Re: [HACKERS] subselects coding started
-References: <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Thomas G. Lockhart wrote:
-> 
-> Bruce Momjian wrote:
-> 
-> > OK, I have created the SubLink structure with supporting routines, and
-> > have added code to create the SubLink structures in the parser, and have
-> > added Query->hasSubLink.
-> >
-> > I changed gram.y to support:
-> >
-> >         (x,y,z) OP (subselect)
-> >
-> > where OP is any operator.  Is that right, or are we doing only certain
-> > ones, and if so, do we limit it in the parser?
-> 
-> Seems like we would want to pass most operators and expressions through
-> gram.y, and then call elog() in either the transformation or in the
-> optimizer if it is an operator which can't be supported.
-
-Not in the optimizer, in the parser, please.
-Remember that for <> SubLink->useor must be TRUE and this is parser work
-(the optimizer doesn't know about "=", "<>", etc., only about Oper nodes).
-
-IN ("=" ANY) and NOT IN ("<>" ALL) transformations are also parser work.
-
-Vadim
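
A small sketch of the parser-side mapping described above, assuming a simplified record in place of the real SubLink node. LinkKind, SubLinkSpec and make_in_sublink are illustrative names only. IN becomes "=" ANY and NOT IN becomes "<>" ALL, with useor derived from the operator name so the optimizer never has to inspect it.

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in for the ANY / ALL values of SubLink->linkType. */
typedef enum { LINK_ANY, LINK_ALL } LinkKind;

/* Simplified stand-in for the fields the parser fills in. */
typedef struct {
    LinkKind    kind;
    const char *opname;   /* operator applied to each column pair */
    bool        useor;    /* true only for "<>" */
} SubLinkSpec;

/* Translate "x IN (subselect)" or "x NOT IN (subselect)". */
SubLinkSpec make_in_sublink(bool negated)
{
    SubLinkSpec s;

    s.kind = negated ? LINK_ALL : LINK_ANY;
    s.opname = negated ? "<>" : "=";
    s.useor = strcmp(s.opname, "<>") == 0;
    return s;
}
```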
-
-From [email protected] Sun Jan 18 21:08:59 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id VAA00825
-   for ; Sun, 18 Jan 1998 21:08:57 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id TAA25254 for ; Sun, 18 Jan 1998 19:18:24 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id TAA06912; Sun, 18 Jan 1998 19:17:01 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 18 Jan 1998 19:11:05 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id TAA06322 for pgsql-hackers-outgoing; Sun, 18 Jan 1998 19:11:01 -0500 (EST)
-Received: from clio.trends.ca ([email protected] [209.47.148.2]) by hub.org (8.8.8/8.7.5) with ESMTP id TAA06144 for ; Sun, 18 Jan 1998 19:10:31 -0500 (EST)
-Received: from www.krasnet.ru ([193.125.44.86])
-   by clio.trends.ca (8.8.8/8.8.8) with ESMTP id HAA12383
-   for ; Sun, 18 Jan 1998 07:28:38 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id TAA15780;
-   Sun, 18 Jan 1998 19:27:14 +0700 (KRS)
-   (envelope-from [email protected])
-Message-ID: <[email protected]>
-Date: Sun, 18 Jan 1998 19:27:09 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: "Thomas G. Lockhart" 
-CC: Bruce Momjian ,
-        PostgreSQL-development 
-Subject: Re: [HACKERS] subselects coding started
-References: <[email protected]> <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Thomas G. Lockhart wrote:
-> 
-> Bruce Momjian wrote:
-> 
-> > OK, I have created the SubLink structure with supporting routines, and
-> > have added code to create the SubLink structures in the parser, and have
-> > added Query->hasSubLink.
-> >
-> > I changed gram.y to support:
-> >
-> >         (x,y,z) OP (subselect)
-> >
-> > where OP is any operator.  Is that right, or are we doing only certain
-> > ones, and if so, do we limit it in the parser?
-> 
-> Seems like we would want to pass most operators and expressions through
-> gram.y, and then call elog() in either the transformation or in the
-> optimizer if it is an operator which can't be supported.
-
-Not in the optimizer, in the parser, please.
-Remember that for <> SubLink->useor must be TRUE and this is parser work
-(the optimizer doesn't know about "=", "<>", etc., only about Oper nodes).
-
-IN ("=" ANY) and NOT IN ("<>" ALL) transformations are also parser work.
-
-Vadim
-
-
-From [email protected] Sun Jan 18 23:59:08 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id XAA10497
-   for ; Sun, 18 Jan 1998 23:59:07 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id XAA06941 for ; Sun, 18 Jan 1998 23:44:32 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id LAA16745
-   for ; Mon, 19 Jan 1998 11:46:28 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Mon, 19 Jan 1998 11:46:27 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-Subject: Re: SubLink->oper
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> In SubLink->oper, do you want the oid of the pg_operator, or the oid of
-> the pg_proc assigned to the operator?
-> 
-> Currently, I am giving you the oid of pg_operator.
-
-No! I need Oper nodes here. For "normal" operators the parser
-returns an Expr node with opType = OP_EXPR and the corresponding Oper
-in Node *oper. Nearly the same for SubLink: I need an Oper node
-for each pair of Var/Const from the left side and target entry from
-the subquery.
-
-Vadim
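
A sketch of the pairing asked for above, assuming plain integer type ids in place of the Var/Const nodes and the subquery target entries. OperStub and build_oper_list are hypothetical; a real Oper node would carry the resolved operator, which is not modelled here. The point is only that one entry is produced per (left-hand item, target entry) pair, and that the two sides must have the same length.

```c
#include <stddef.h>

/* Stand-in for an Oper node: records which two types it compares. */
typedef struct {
    int left_type;    /* type id of the left-hand Var/Const */
    int right_type;   /* type id of the subquery target entry */
} OperStub;

/*
 * Build one OperStub per column pair into out[], which must have room
 * for nleft entries. Returns the number of pairs, or -1 when the
 * left-hand list and the subquery target list differ in length.
 */
int build_oper_list(const int *left_types, size_t nleft,
                    const int *target_types, size_t ntargets,
                    OperStub *out)
{
    if (nleft != ntargets)
        return -1;

    for (size_t i = 0; i < nleft; i++) {
        out[i].left_type = left_types[i];
        out[i].right_type = target_types[i];
    }
    return (int) nleft;
}
```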
-
-From [email protected] Mon Jan 19 01:02:23 1998
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA24036
-   for ; Mon, 19 Jan 1998 01:02:21 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id BAA13913; Mon, 19 Jan 1998 01:02:16 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 19 Jan 1998 01:01:41 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id BAA13824 for pgsql-hackers-outgoing; Mon, 19 Jan 1998 01:01:34 -0500 (EST)
-Received: from candle.pha.pa.us ([email protected] [209.152.195.67]) by hub.org (8.8.8/8.7.5) with ESMTP id BAA13699 for ; Mon, 19 Jan 1998 01:00:59 -0500 (EST)
-Received: (from maillist@localhost)
-   by candle.pha.pa.us (8.8.5/8.8.5) id AAA23866;
-   Mon, 19 Jan 1998 00:54:49 -0500 (EST)
-From: Bruce Momjian 
-Message-Id: <[email protected]>
-Subject: [HACKERS] subselects
-To: [email protected] (Vadim B. Mikheev)
-Date: Mon, 19 Jan 1998 00:54:49 -0500 (EST)
-Cc: [email protected] (PostgreSQL-development)
-X-Mailer: ELM [version 2.4 PL25]
-MIME-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-
-OK, I have added code to allow the SubLinks to make it to the optimizer.
-
-I implemented ParseState->parentParseState, but not parentQuery, because
-the parentParseState is much more valuable to me, and Vadim thought it
-might be useful, but was not positive.  Also, keeping that parentQuery
-pointer valid through rewrite may be difficult, so I dropped it. 
-ParseState is only valid in the parser.
-
-I have not yet:
-
-   handled correlated subquery column references
-   added Var->sublevels_up
-   gotten this to work in the rewrite system
-   added full CopyNode support
-
-I will address these in the next few days.
-
--- 
-Bruce Momjian
-[email protected]
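
One way the pending Var->sublevels_up item could work together with ParseState->parentParseState, sketched under heavy assumptions: ParseScope is a made-up stand-in for ParseState that holds only a parent pointer and the column names visible at that level, and find_sublevels_up is a hypothetical helper. A correlated reference is resolved by walking outward through the parents and counting the levels crossed.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for ParseState: one scope per (sub)query being parsed. */
typedef struct ParseScope {
    struct ParseScope  *parent;    /* enclosing query, NULL at top level */
    const char *const  *columns;   /* column names visible at this level */
    size_t              ncolumns;
} ParseScope;

/*
 * Look for colname in the current scope, then in each enclosing scope.
 * Returns how many levels up it was found (0 means the current query),
 * or -1 if no scope knows the column.
 */
int find_sublevels_up(const ParseScope *scope, const char *colname)
{
    int levels = 0;

    for (; scope != NULL; scope = scope->parent, levels++) {
        for (size_t i = 0; i < scope->ncolumns; i++) {
            if (strcmp(scope->columns[i], colname) == 0)
                return levels;
        }
    }
    return -1;
}
```

A subquery that mentions a column of its own table gets 0, while a reference to a column of the enclosing query gets 1, which is what marks the subquery as correlated.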
-
-
-From [email protected] Mon Jan 19 01:32:54 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA24335
-   for ; Mon, 19 Jan 1998 01:32:52 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id BAA10610 for ; Mon, 19 Jan 1998 01:23:02 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id NAA16879
-   for ; Mon, 19 Jan 1998 13:25:28 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Mon, 19 Jan 1998 13:25:22 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-Subject: Re: SubLink->oper
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> >
-> > Bruce Momjian wrote:
-> > >
-> > > In SubLink->oper, do you want the oid of the pg_operator, or the oid of
-> > > the pg_proc assigned to the operator?
-> > >
-> > > Currently, I am giving you the oid of pg_operator.
-> >
-> > No! I need Oper nodes here. For "normal" operators the parser
-> > returns an Expr node with opType = OP_EXPR and the corresponding Oper
-> > in Node *oper. Nearly the same for SubLink: I need an Oper node
-> > for each pair of Var/Const from the left side and target entry from
-> > the subquery.
-> >
-> > Vadim
-> >
-> 
-> OK, can I give you an Oper* for each field.
-
-Nice! But what's this:
-
-typedef struct SubLink
-{
-struct Query;
-^^^^^^^^^^^^^
-    NodeTag     type;
-
-Vadim
-
-From [email protected] Mon Jan 19 01:34:39 1998
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA24346
-   for ; Mon, 19 Jan 1998 01:34:33 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id NAA16904;
-   Mon, 19 Jan 1998 13:37:42 +0700 (KRS)
-   (envelope-from [email protected])
-Sender: [email protected]
-Message-ID: <[email protected]>
-Date: Mon, 19 Jan 1998 13:37:41 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> OK, I have added code to allow the SubLinks to make it to the optimizer.
-> 
-> I implemented ParseState->parentParseState, but not parentQuery, because
-> the parentParseState is much more valuable to me, and Vadim thought it
-> might be useful, but was not positive.  Also, keeping that parentQuery
-> pointer valid through rewrite may be difficult, so I dropped it.
-> ParseState is only valid in the parser.
-> 
-> I have not yet:
-> 
->         handled correlated subquery column references
->         added Var->sublevels_up
->         gotten this to work in the rewrite system
->         added full CopyNode support
-> 
-> I will address these in the next few days.
-
-Nice! I'm starting with non-correlated subqueries...
-
-Vadim
-
-From [email protected] Mon Jan 19 01:35:50 1998
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id BAA24362
-   for ; Mon, 19 Jan 1998 01:35:48 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id BAA17531; Mon, 19 Jan 1998 01:35:39 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 19 Jan 1998 01:35:33 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id BAA17460 for pgsql-hackers-outgoing; Mon, 19 Jan 1998 01:35:28 -0500 (EST)
-Received: from www.krasnet.ru (www.krasnet.ru [193.125.44.86]) by hub.org (8.8.8/8.7.5) with ESMTP id BAA17323 for ; Mon, 19 Jan 1998 01:35:03 -0500 (EST)
-Received: from sable.krasnoyarsk.su (www.krasnet.ru [193.125.44.86])
-   by www.krasnet.ru (8.8.7/8.8.7) with ESMTP id NAA16904;
-   Mon, 19 Jan 1998 13:37:42 +0700 (KRS)
-   (envelope-from [email protected])
-Message-ID: <[email protected]>
-Date: Mon, 19 Jan 1998 13:37:41 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: [HACKERS] Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> OK, I have added code to allow the SubLinks to make it to the optimizer.
-> 
-> I implemented ParseState->parentParseState, but not parentQuery, because
-> the parentParseState is much more valuable to me, and Vadim thought it
-> might be useful, but was not positive.  Also, keeping that parentQuery
-> pointer valid through rewrite may be difficult, so I dropped it.
-> ParseState is only valid in the parser.
-> 
-> I have not done:
-> 
->         correlated subquery column references
->         added Var->sublevels_up
->         gotten this to work in the rewrite system
->         have not added full CopyNode support
-> 
-> I will address these in the next few days.
-
-Nice! I'm starting with non-correlated subqueries...
-
-Vadim
-
-
-From [email protected] Wed Jan 21 04:00:59 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id EAA14981
-   for ; Wed, 21 Jan 1998 04:00:56 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id DAA02432 for ; Wed, 21 Jan 1998 03:46:22 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id DAA12583; Wed, 21 Jan 1998 03:45:43 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 21 Jan 1998 03:44:07 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id DAA12288 for pgsql-hackers-outgoing; Wed, 21 Jan 1998 03:44:02 -0500 (EST)
-Received: from gandalf.sd.spardat.at (gandalf.telecom.at [194.118.26.84]) by hub.org (8.8.8/8.7.5) with ESMTP id DAA12263 for ; Wed, 21 Jan 1998 03:43:18 -0500 (EST)
-Received: from sdgtw.sd.spardat.at (sdgtw.sd.spardat.at [172.18.99.31])
-   by gandalf.sd.spardat.at (8.8.8/8.8.8) with ESMTP id JAA38408
-   for ; Wed, 21 Jan 1998 09:42:55 +0100
-Received: by sdgtw.sd.spardat.at with Internet Mail Service (5.0.1458.49)
-   id ; Wed, 21 Jan 1998 09:42:55 +0100
-Message-ID: <[email protected]>
-From: Zeugswetter Andreas DBT 
-To: "'[email protected]'" 
-Subject: [HACKERS] Re: subselects
-Date: Wed, 21 Jan 1998 09:42:52 +0100
-X-Priority: 3
-MIME-Version: 1.0
-X-Mailer: Internet Mail Service (5.0.1458.49)
-Content-Type: text/plain
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Bruce wrote:
-> I have completed adding Var.varlevelsup, and have added code to the
-> parser to properly set the field.  It will allow correlated references
-> in the WHERE clause, but not in the target list.
-
-select i2.ip1, i1.ip4 from nameip i1 where ip1 = (select ip1 from nameip
-i2);
-   522: Table (i2) not selected in query.
-select i1.ip4 from nameip i1 where ip1 = (select i1.ip1 from nameip i2);
-   284: A subquery has returned not exactly one row.
-select i1.ip4 from nameip i1 where ip1 = (select i1.ip1 from nameip i2
-where name='zeus');
- 2 row(s) retrieved.
-
-Informix allows correlated references in the target list. It also allows
-subselects in the target list as in:
-select i1.ip4, (select i1.ip1 from nameip i2) from nameip i1;
-   284: A subquery has returned not exactly one row.
-select i1.ip4, (select i1.ip1 from nameip i2 where name='zeus') from
-nameip i1;
- 2 row(s) retrieved.
-
-Is this what you were looking for ?
-
-Andreas
-
-
-From [email protected] Wed Jan 21 05:31:02 1998
-Received: from renoir.op.net ([email protected] [209.152.193.4])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id FAA15884
-   for ; Wed, 21 Jan 1998 05:31:01 -0500 (EST)
-Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.14 $) with ESMTP id FAA04709 for ; Wed, 21 Jan 1998 05:16:16 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id FAA05191; Wed, 21 Jan 1998 05:15:42 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 21 Jan 1998 05:14:02 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id FAA04951 for pgsql-hackers-outgoing; Wed, 21 Jan 1998 05:13:57 -0500 (EST)
-Received: from dune.krasnet.ru (www.krasnet.ru [193.125.44.86]) by hub.org (8.8.8/8.7.5) with ESMTP id FAA04610 for ; Wed, 21 Jan 1998 05:12:18 -0500 (EST)
-Received: from sable.krasnoyarsk.su (dune.krasnet.ru [193.125.44.86])
-   by dune.krasnet.ru (8.8.7/8.8.7) with ESMTP id RAA01918;
-   Wed, 21 Jan 1998 17:10:24 +0700 (KRS)
-   (envelope-from [email protected])
-Message-ID: <[email protected]>
-Date: Wed, 21 Jan 1998 17:10:22 +0700
-From: "Vadim B. Mikheev" 
-Organization: ITTS (Krasnoyarsk)
-X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
-MIME-Version: 1.0
-To: Bruce Momjian 
-CC: PostgreSQL-development 
-Subject: [HACKERS] Re: subselects
-References: <[email protected]>
-Content-Type: text/plain; charset=us-ascii
-Content-Transfer-Encoding: 7bit
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-Bruce Momjian wrote:
-> 
-> We are only going to have subselects in the WHERE clause, not in the
-> target list, right?
-> 
-> The standard says we can have them either place, but I didn't think we
-> were implementing the target list subselects.
-> 
-> Is that correct?
-
-Yes, this is right for 6.3. I hope that we'll support subselects in 
-target list, FROM, etc in future.
-
-BTW, I'm going to implement subselects in the (let's say) "natural" way:
-not by substituting parent query relations into the subselect and so on,
-but by executing (correlated) subqueries for each upper query row
-(maybe with caching of results in a hash table for better performance).
-This is a much cleaner way, and it is much clearer how to do it.
-It resembles the SQL-function way, but functions start/run/stop the Executor
-each time they are called, and this hurts performance.
-
-Vadim
-
-
-From [email protected] Wed Jan 21 10:02:02 1998
-Received: from hub.org (hub.org [209.47.148.200])
-   by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id KAA20456
-   for ; Wed, 21 Jan 1998 10:02:01 -0500 (EST)
-Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id KAA06778; Wed, 21 Jan 1998 10:02:13 -0500 (EST)
-Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 21 Jan 1998 10:00:41 -0500 (EST)
-Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id KAA06544 for pgsql-hackers-outgoing; Wed, 21 Jan 1998 10:00:37 -0500 (EST)
-Received: from u1.abs.net ([email protected] [207.114.0.131]) by hub.org (8.8.8/8.7.5) with ESMTP id KAA06326 for ; Wed, 21 Jan 1998 10:00:03 -0500 (EST)
-Received: from insightdist.com (nobody@localhost)
-   by u1.abs.net (8.8.5/8.8.5) with UUCP id JAA08009
-   for [email protected]; Wed, 21 Jan 1998 09:40:29 -0500 (EST)
-X-Authentication-Warning: u1.abs.net: nobody set sender to insightdist.com!darrenk using -f
-Received: by insightdist.com (AIX 3.2/UCB 5.64/4.03)
-          id AA33174; Wed, 21 Jan 1998 09:26:09 -0500
-Received: by ceodev (AIX 4.1/UCB 5.64/4.03)
-          id AA36452; Wed, 21 Jan 1998 09:13:05 -0500
-Date: Wed, 21 Jan 1998 09:13:05 -0500
-From: [email protected] (Darren King)
-Message-Id: <9801211413.AA36452@ceodev>
-To: [email protected]
-Subject: Re: [HACKERS] subselects
-Mime-Version: 1.0
-Content-Type: text/plain; charset=US-ASCII
-Content-Transfer-Encoding: 7bit
-Content-Md5: 4wI6dUsUAXei+yg3JycjGw==
-Sender: [email protected]
-Precedence: bulk
-Status: OR
-
-> We are only going to have subselects in the WHERE clause, not in the
-> target list, right?
-> 
-> The standard says we can have them either place, but I didn't think we
-> were implementing the target list subselects.
-> 
-> Is that correct?
-
-What about the HAVING clause?  Currently not in, but someone here wants
-to take a stab at it.
-
-Doesn't seem that tough...loops over the tuples returned from the group
-by node and checks the expression such as "x > 5" or "x = (subselect)".
-
-The cost analysis in the optimizer could be tricky come to think of it.
-If a subselect has a HAVING, would have to have a formula to determine
-the selectiveness.  Hmmm...
-
-darrenk
-
-
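Vadim's message above proposes running a correlated subquery once per upper query row, possibly caching results in a hash table. The following is a minimal Python sketch of that idea only, not PostgreSQL's executor code. Every name in it (`select_with_correlated_subquery`, `correlation_key`, `avg_salary`, the `emp` rows) is hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, Hashable, List

Row = Dict[str, object]


def select_with_correlated_subquery(
    outer_rows: List[Row],
    correlation_key: Callable[[Row], Hashable],
    subquery: Callable[[Hashable], object],
    predicate: Callable[[Row, object], bool],
) -> List[Row]:
    """Evaluate the subquery for each outer row, caching by correlation value."""
    cache: Dict[Hashable, object] = {}  # hash table of subquery results
    selected: List[Row] = []
    for row in outer_rows:
        key = correlation_key(row)  # outer values the subquery refers to
        if key not in cache:
            cache[key] = subquery(key)  # run the subquery only on a cache miss
        if predicate(row, cache[key]):
            selected.append(row)
    return selected


# Hypothetical data: employees earning more than their department's average.
emp = [
    {"name": "a", "dept": 1, "salary": 10},
    {"name": "b", "dept": 1, "salary": 30},
    {"name": "c", "dept": 2, "salary": 20},
    {"name": "d", "dept": 2, "salary": 20},
]
subquery_runs = 0


def avg_salary(dept):
    """Stand-in for the correlated subquery; counts how often it runs."""
    global subquery_runs
    subquery_runs += 1
    salaries = [e["salary"] for e in emp if e["dept"] == dept]
    return sum(salaries) / len(salaries)


above_avg = select_with_correlated_subquery(
    emp,
    correlation_key=lambda r: r["dept"],
    subquery=avg_salary,
    predicate=lambda r, avg: r["salary"] > avg,
)
```

With four outer rows and two distinct department values, the subquery stand-in runs twice rather than four times. The cache is only valid if the subquery result depends on nothing but the correlation values, which this sketch assumes.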