- <Chapter ID="query">
- <TITLE>The Query Language</TITLE>
+ <chapter id="query">
+ <title>The Query Language</title>
- <Para>
- The <ProductName>Postgres</ProductName> query language is a variant of
- the <Acronym>SQL3</Acronym> draft next-generation standard. It
+ <para>
+ The <productname>Postgres</productname> query language is a variant of
+ the <acronym>SQL3</acronym> draft next-generation standard. It
has many extensions such as an extensible type system,
inheritance, functions and production rules. These are
- features carried over from the original <ProductName>Postgres</ProductName> query
- language, <ProductName>PostQuel</ProductName>. This section provides an overview
- of how to use <ProductName>Postgres</ProductName>
- <Acronym>SQL</Acronym> to perform simple operations.
+ features carried over from the original <productname>Postgres</productname> query
+ language, <productname>PostQuel</productname>. This section provides an overview
+ of how to use <productname>Postgres</productname>
+ <acronym>SQL</acronym> to perform simple operations.
This manual is only intended to give you an idea of our
- flavor of <Acronym>SQL</Acronym> and is in no way a complete tutorial on
- <Acronym>SQL</Acronym>. Numerous books have been written on
- <Acronym>SQL</Acronym>, including
+ flavor of <acronym>SQL</acronym> and is in no way a complete tutorial on
+ <acronym>SQL</acronym>. Numerous books have been written on
+ <acronym>SQL</acronym>, including
[MELT93] and [DATE97].
You should be aware that some language features
- are extensions to the <Acronym>ANSI</Acronym> standard.
- </Para>
+ are extensions to the <acronym>ANSI</acronym> standard.
+ </para>
- <Sect1>
- <Title>Interactive Monitor</Title>
+ <sect1>
+ <title>Interactive Monitor</title>
- <Para>
+ <para>
In the examples that follow, we assume that you have
created the mydb database as described in the previous
- subsection and have started <Application>psql</Application>.
+ subsection and have started <application>psql</application>.
Examples in this manual can also be found in
- <FileName>/usr/local/pgsql/src/tutorial/</FileName>. Refer to the
- <FileName>README</FileName> file in that directory for how to use them. To
+ <filename>/usr/local/pgsql/src/tutorial/</filename>. Refer to the
+ <filename>README</filename> file in that directory for how to use them. To
start the tutorial, do the following:
- <ProgramListing>
+ <programlisting>
% cd /usr/local/pgsql/src/tutorial
% psql -s mydb
Welcome to the POSTGRESQL interactive sql monitor:
You are currently connected to the database: postgres
mydb=> \i basics.sql
- </ProgramListing>
- </Para>
+ </programlisting>
+ </para>
- <Para>
- The <Literal>\i</Literal> command reads in queries from the specified
- file. The <Literal>-s</Literal> option puts you in single-step mode, which
+ <para>
+ The <literal>\i</literal> command reads in queries from the specified
+ file. The <literal>-s</literal> option puts you in single-step mode, which
pauses before sending a query to the backend. Queries
- in this section are in the file <FileName>basics.sql</FileName>.
- </Para>
+ in this section are in the file <filename>basics.sql</filename>.
+ </para>
- <Para>
- <Application>psql</Application>
- has a variety of <Literal>\d</Literal> commands for showing system information.
+ <para>
+ <application>psql</application>
+ has a variety of <literal>\d</literal> commands for showing system information.
Consult these commands for more details;
- for a listing, type <Literal>\?</Literal> at the <Application>psql</Application> prompt.
- </Para>
+ for a listing, type <literal>\?</literal> at the <application>psql</application> prompt.
+ </para>
- <Sect1>
- <Title>Concepts</Title>
+ <sect1>
+ <title>Concepts</title>
- <Para>
- The fundamental notion in <ProductName>Postgres</ProductName> is that of a class,
+ <para>
+ The fundamental notion in <productname>Postgres</productname> is that of a class,
which is a named collection of object instances. Each
instance has the same collection of named attributes,
and each attribute is of a specific type. Furthermore,
- each instance has a permanent <FirstTerm>object identifier</FirstTerm>
- (<Acronym>OID</Acronym>)
+ each instance has a permanent <firstterm>object identifier</firstterm>
+ (<acronym>OID</acronym>)
that is unique throughout the installation. Because
- <Acronym>SQL</Acronym> syntax refers to tables, we will use the terms
- <FirstTerm>table</FirstTerm> and <FirstTerm>class</FirstTerm> interchangeably.
- Likewise, an <Acronym>SQL</Acronym> <FirstTerm>row</FirstTerm> is an
- <FirstTerm>instance</FirstTerm> and <Acronym>SQL</Acronym> <FirstTerm>columns</FirstTerm>
- are <FirstTerm>attributes</FirstTerm>.
+ <acronym>SQL</acronym> syntax refers to tables, we will use the terms
+ <firstterm>table</firstterm> and <firstterm>class</firstterm> interchangeably.
+ Likewise, an <acronym>SQL</acronym> <firstterm>row</firstterm> is an
+ <firstterm>instance</firstterm> and <acronym>SQL</acronym> <firstterm>columns</firstterm>
+ are <firstterm>attributes</firstterm>.
As previously discussed, classes are grouped into
databases, and a collection of databases managed by a
- single <Application>postmaster</Application> process constitutes an installation
+ single <application>postmaster</application> process constitutes an installation
or site.
- </Para>
+ </para>
- <Sect1>
- <Title>Creating a New Class</Title>
+ <sect1>
+ <title>Creating a New Class</title>
- <Para>
+ <para>
You can create a new class by specifying the class
name, along with all attribute names and their types:
- <ProgramListing>
+ <programlisting>
CREATE TABLE weather (
city varchar(80),
temp_lo int, -- low temperature
temp_hi int, -- high temperature
prcp real, -- precipitation
date date
);
- </ProgramListing>
- </Para>
+ </programlisting>
+ </para>
- <Para>
+ <para>
Note that both keywords and identifiers are case-insensitive; identifiers can become
case-sensitive by surrounding them with double-quotes as allowed
- by <Acronym>SQL92</Acronym>.
- <ProductName>Postgres</ProductName> <Acronym>SQL</Acronym> supports the usual
- <Acronym>SQL</Acronym> types <Type>int</Type>,
- <Type>float</Type>, <Type>real</Type>, <Type>smallint</Type>, <Type>char(N)</Type>,
- <Type>varchar(N)</Type>, <Type>date</Type>, <Type>time</Type>,
- and <Type>timestamp</Type>, as well as other types of general utility and
+ by <acronym>SQL92</acronym>.
+ <productname>Postgres</productname> <acronym>SQL</acronym> supports the usual
+ <acronym>SQL</acronym> types <type>int</type>,
+ <type>float</type>, <type>real</type>, <type>smallint</type>, <type>char(N)</type>,
+ <type>varchar(N)</type>, <type>date</type>, <type>time</type>,
+ and <type>timestamp</type>, as well as other types of general utility and
a rich set of geometric types. As we will
- see later, <ProductName>Postgres</ProductName> can be customized with an
+ see later, <productname>Postgres</productname> can be customized with an
arbitrary number of
user-defined data types. Consequently, type names are
not syntactical keywords, except where required to support special
- cases in the <Acronym>SQL92</Acronym> standard.
- So far, the <ProductName>Postgres</ProductName> <Command>create</Command> command
+ cases in the <acronym>SQL92</acronym> standard.
+ So far, the <productname>Postgres</productname> <command>CREATE</command> command
looks exactly like
the command used to create a table in a traditional
relational system. However, we will presently see that
classes have properties that are extensions of the
relational model.
- </Para>
+ </para>
- <Sect1>
- <Title>Populating a Class with Instances</Title>
+ <sect1>
+ <title>Populating a Class with Instances</title>
- <Para>
- The <Command>insert</Command> statement is used to populate a class with
+ <para>
+ The <command>insert</command> statement is used to populate a class with
instances:
- <ProgramListing>
+ <programlisting>
INSERT INTO weather
VALUES ('San Francisco', 46, 50, 0.25, '11/27/1994');
- </ProgramListing>
- </Para>
+ </programlisting>
+ </para>
- <Para>
- You can also use the <Command>copy</Command> command to load large
- amounts of data from flat (<Acronym>ASCII</Acronym>) files.
+ <para>
+ You can also use the <command>copy</command> command to load large
+ amounts of data from flat (<acronym>ASCII</acronym>) files.
This is usually faster because the data is read (or written) as a single atomic
transaction directly to or from the target table. An example would be:
- <ProgramListing>
-COPY INTO weather FROM '/home/user/weather.txt'
+ <programlisting>
+COPY weather FROM '/home/user/weather.txt'
USING DELIMITERS '|';
- </ProgramListing>
+ </programlisting>
where the path name for the source file must be available to the backend server
machine, not the client, since the backend server reads the file directly.
- </Para>
+ </para>
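<para>
As an illustrative sketch, the <command>insert</command> statement can also
name the target attributes explicitly, so the values need not follow the
order used when the class was created. This assumes that weather has the
attributes city, temp_lo, temp_hi, prcp and date; the values shown are
example data only:
<programlisting>
-- Name the attributes explicitly; values are matched by position in the list.
INSERT INTO weather (city, temp_lo, temp_hi, prcp, date)
VALUES ('San Francisco', 43, 57, 0.0, '11/29/1994');

-- Attributes may be listed in any order; prcp is omitted and is left empty (NULL).
INSERT INTO weather (date, city, temp_hi, temp_lo)
VALUES ('11/29/1994', 'Hayward', 54, 37);
</programlisting>
Listing the attributes costs a little typing but keeps the statement
correct if the class is later recreated with a different attribute order.
</para>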
- <Sect1>
- <Title>Querying a Class</Title>
+ <sect1>
+ <title>Querying a Class</title>
- <Para>
+ <para>
The weather class can be queried with normal relational
- selection and projection queries. A <Acronym>SQL</Acronym> <Command>select</Command>
+ selection and projection queries. A <acronym>SQL</acronym> <command>select</command>
statement is used to do this. The statement is divided into
a target list (the part that lists the attributes to be
returned) and a qualification (the part that specifies
any restrictions). For example, to retrieve all the
rows of weather, type:
- <ProgramListing>
-SELECT * FROM WEATHER;
- </ProgramListing>
+ <programlisting>
+SELECT * FROM weather;
+ </programlisting>
and the output should be:
- <ProgramListing>
+ <programlisting>
+--------------+---------+---------+------+------------+
|city | temp_lo | temp_hi | prcp | date |
+--------------+---------+---------+------+------------+
+--------------+---------+---------+------+------------+
|Hayward | 37 | 54 | | 11-29-1994 |
+--------------+---------+---------+------+------------+
- </ProgramListing>
+ </programlisting>
You may specify any arbitrary expressions in the target list. For example, you can do:
- <ProgramListing>
+ <programlisting>
SELECT city, (temp_hi+temp_lo)/2 AS temp_avg, date FROM weather;
- </ProgramListing>
- </Para>
+ </programlisting>
+ </para>
- <Para>
+ <para>
Arbitrary Boolean operators
- (<Command>and</Command>, <Command>or</Command> and <Command>not</Command>) are
+ (<command>and</command>, <command>or</command> and <command>not</command>) are
allowed in the qualification of any query. For example,
- <ProgramListing>
+ <programlisting>
SELECT * FROM weather
WHERE city = 'San Francisco'
AND prcp > 0.0;
+--------------+---------+---------+------+------------+
|San Francisco | 46 | 50 | 0.25 | 11-27-1994 |
+--------------+---------+---------+------+------------+
- </ProgramListing>
- </Para>
+ </programlisting>
+ </para>
- <Para>
+ <para>
As a final note, you can request that the results of a
- select be returned in <FirstTerm>sorted order</FirstTerm>
- or with <FirstTerm>duplicate instances</FirstTerm> removed:
+ select be returned in <firstterm>sorted order</firstterm>
+ or with <firstterm>duplicate instances</firstterm> removed:
- <ProgramListing>
+ <programlisting>
SELECT DISTINCT city
FROM weather
ORDER BY city;
- </ProgramListing>
- </Para>
+ </programlisting>
+ </para>
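<para>
As an illustrative sketch, sorting can use more than one attribute. Here the
rows are ordered by city and, within each city, by low temperature; the
choice of attributes is an assumption made for this example:
<programlisting>
-- Sort by city first, then by temp_lo within each city.
SELECT city, temp_lo, date
FROM weather
ORDER BY city, temp_lo;
</programlisting>
</para>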
- <Sect1>
- <Title>Redirecting SELECT Queries</Title>
+ <sect1>
+ <title>Redirecting SELECT Queries</title>
- <Para>
+ <para>
Any select query can be redirected to a new class:
- <ProgramListing>
+ <programlisting>
SELECT * INTO TABLE temp FROM weather;
- </ProgramListing>
- </Para>
+ </programlisting>
+ </para>
- <Para>
- This forms an implicit <Command>create</Command> command, creating a new
+ <para>
+ This forms an implicit <command>create</command> command, creating a new
class temp with the attribute names and types specified
- in the target list of the <Command>select into</Command> command. We can
+ in the target list of the <command>select into</command> command. We can
then, of course, perform any operations on the resulting
class that we can perform on other classes.
- </Para>
+ </para>
- <Sect1>
- <Title>Joins Between Classes</Title>
+ <sect1>
+ <title>Joins Between Classes</title>
- <Para>
+ <para>
Thus far, our queries have only accessed one class at a
time. Queries can access multiple classes at once, or
access the same class in such a way that multiple
instances of the class are being processed at the same time.
For example, to find all the records that are in the temperature
range of other records, we need to compare the temp_lo and temp_hi
attributes of each weather instance to the temp_lo and
temp_hi attributes of all other weather instances.
- <Note>
- <Para>
+ <note>
+ <para>
This is only a conceptual model. The actual join may
be performed in a more efficient manner, but this is invisible to the user.
- </Para>
- </Note>
+ </para>
+ </note>
We can do this with the following query:
- <ProgramListing>
+ <programlisting>
SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high,
W2.city, W2.temp_lo AS low, W2.temp_hi AS high
FROM weather W1, weather W2
WHERE W1.temp_lo < W2.temp_lo
AND W1.temp_hi > W2.temp_hi;
+--------------+-----+------+---------------+-----+------+
|San Francisco | 37 | 54 | San Francisco | 46 | 50 |
+--------------+-----+------+---------------+-----+------+
- </ProgramListing>
+ </programlisting>
- <Note>
- <Para>
+ <note>
+ <para>
The semantics of such a join are
that the qualification
is a truth expression defined for the Cartesian product of
the classes indicated in the query. For those instances in
the Cartesian product for which the qualification is true,
- <ProductName>Postgres</ProductName> computes and returns the
+ <productname>Postgres</productname> computes and returns the
values specified in the target list.
- <ProductName>Postgres</ProductName> <Acronym>SQL</Acronym>
+ <productname>Postgres</productname> <acronym>SQL</acronym>
does not assign any meaning to
duplicate values in such expressions.
- This means that <ProductName>Postgres</ProductName>
+ This means that <productname>Postgres</productname>
sometimes recomputes the same target list several times;
this frequently happens when Boolean expressions are connected
with an "or". To remove such duplicates, you must use
- the <Command>select distinct</Command> statement.
- </Para>
- </Note>
- </Para>
+ the <command>select distinct</command> statement.
+ </para>
+ </note>
+ </para>
- <Para>
+ <para>
In this case, both W1 and W2 are surrogates for an
instance of the class weather, and both range over all
instances of the class. (In the terminology of most
- database systems, W1 and W2 are known as <FirstTerm>range variables</FirstTerm>.)
+ database systems, W1 and W2 are known as <firstterm>range variables</firstterm>.)
A query can contain an arbitrary number of
class names and surrogates.
- </Para>
+ </para>
- <Sect1>
- <Title>Updates</Title>
+ <sect1>
+ <title>Updates</title>
- <Para>
+ <para>
You can update existing instances using the update command.
Suppose you discover that the temperature readings are
all off by 2 degrees as of Nov 28; you may update the
data as follows:
- <ProgramListing>
+ <programlisting>
UPDATE weather
SET temp_hi = temp_hi - 2, temp_lo = temp_lo - 2
WHERE date > '11/28/1994';
- </ProgramListing>
- </Para>
+ </programlisting>
+ </para>
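<para>
Before running an update of this kind, it can help to preview the rows it
would touch. A minimal sketch, reusing the qualification from the update
shown above:
<programlisting>
-- List the rows that the UPDATE would change (same WHERE clause).
SELECT city, temp_lo, temp_hi, date
FROM weather
WHERE date > '11/28/1994';
</programlisting>
If the result contains rows that should not be changed, tighten the
qualification before issuing the update.
</para>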
- <Sect1>
- <Title>Deletions</Title>
+ <sect1>
+ <title>Deletions</title>
- <Para>
- Deletions are performed using the <Command>delete</Command> command:
- <ProgramListing>
+ <para>
+ Deletions are performed using the <command>delete</command> command:
+ <programlisting>
DELETE FROM weather WHERE city = 'Hayward';
- </ProgramListing>
+ </programlisting>
All weather records belonging to Hayward are removed.
One should be wary of queries of the form
- <ProgramListing>
+ <programlisting>
DELETE FROM classname;
- </ProgramListing>
+ </programlisting>
- Without a qualification, <Command>delete</Command> will simply
+ Without a qualification, <command>delete</command> will simply
remove all instances of the given class, leaving it
empty. The system will not request confirmation before
doing this.
- </Para>
+ </para>
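<para>
Because no confirmation is requested, one cautious pattern is to run the
deletion inside an explicit transaction block so that it can be inspected
and undone. This is a sketch that assumes the session is not already inside
a transaction:
<programlisting>
BEGIN;
DELETE FROM weather WHERE city = 'Hayward';
-- Inspect what remains before deciding.
SELECT * FROM weather;
-- Undo the deletion; use COMMIT instead to make it permanent.
ROLLBACK;
</programlisting>
</para>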
- <Sect1>
- <Title>Using Aggregate Functions</Title>
+ <sect1>
+ <title>Using Aggregate Functions</title>
- <Para>
+ <para>
Like most other query languages,
- <ProductName>PostgreSQL</ProductName> supports
+ <productname>PostgreSQL</productname> supports
aggregate functions.
An aggregate function computes a single result from multiple input rows.
For example, there are aggregates to compute the
- <Function>count</Function>, <Function>sum</Function>,
- <Function>avg</Function> (average), <Function>max</Function> (maximum) and
- <Function>min</Function> (minimum) over a set of instances.
- </Para>
+ <function>count</function>, <function>sum</function>,
+ <function>avg</function> (average), <function>max</function> (maximum) and
+ <function>min</function> (minimum) over a set of instances.
+ </para>
- <Para>
+ <para>
It is important to understand the interaction between aggregates and
- SQL's <Command>where</Command> and <Command>having</Command> clauses.
- The fundamental difference between <Command>where</Command> and
- <Command>having</Command> is this: <Command>where</Command> selects
+ SQL's <command>where</command> and <command>having</command> clauses.
+ The fundamental difference between <command>where</command> and
+ <command>having</command> is this: <command>where</command> selects
input rows before groups and aggregates are computed (thus, it controls
which rows go into the aggregate computation), whereas
- <Command>having</Command> selects group rows after groups and
+ <command>having</command> selects group rows after groups and
aggregates are computed. Thus, the
- <Command>where</Command> clause may not contain aggregate functions;
+ <command>where</command> clause may not contain aggregate functions;
it makes no sense to try to use an aggregate to determine which rows
will be inputs to the aggregates. On the other hand,
- <Command>having</Command> clauses always contain aggregate functions.
- (Strictly speaking, you are allowed to write a <Command>having</Command>
+ <command>having</command> clauses always contain aggregate functions.
+ (Strictly speaking, you are allowed to write a <command>having</command>
clause that doesn't use aggregates, but it's wasteful; the same condition
- could be used more efficiently at the <Command>where</Command> stage.)
- </Para>
+ could be used more efficiently at the <command>where</command> stage.)
+ </para>
- <Para>
+ <para>
As an example, we can find the highest low-temperature reading anywhere
with
- <ProgramListing>
+ <programlisting>
SELECT max(temp_lo) FROM weather;
- </ProgramListing>
+ </programlisting>
If we want to know which city (or cities) that reading occurred in,
we might try
- <ProgramListing>
+ <programlisting>
SELECT city FROM weather WHERE temp_lo = max(temp_lo);
- </ProgramListing>
+ </programlisting>
but this will not work since the aggregate max() can't be used in
- <Command>where</Command>. However, as is often the case, the query can be
+ <command>where</command>. However, as is often the case, the query can be
restated to accomplish the intended result; here by using a
- <FirstTerm>subselect</FirstTerm>:
- <ProgramListing>
+ <firstterm>subselect</firstterm>:
+ <programlisting>
SELECT city FROM weather WHERE temp_lo = (SELECT max(temp_lo) FROM weather);
- </ProgramListing>
+ </programlisting>
This is OK because the sub-select is an independent computation that
computes its own aggregate separately from what's happening in the outer
select.
- </Para>
+ </para>
- <Para>
+ <para>
Aggregates are also very useful in combination with
- <FirstTerm>group by</FirstTerm> clauses. For example, we can get the
+ <firstterm>group by</firstterm> clauses. For example, we can get the
maximum low temperature observed in each city with
- <ProgramListing>
+ <programlisting>
SELECT city, max(temp_lo)
FROM weather
GROUP BY city;
- </ProgramListing>
+ </programlisting>
which gives us one output row per city. We can filter these grouped
- rows using <Command>having</Command>:
- <ProgramListing>
+ rows using <command>having</command>:
+ <programlisting>
SELECT city, max(temp_lo)
FROM weather
GROUP BY city
HAVING min(temp_lo) < 0;
- </ProgramListing>
+ </programlisting>
which gives us the same results for only the cities that have some
below-zero readings. Finally, if we only care about cities whose
names begin with 'P', we might do
- <ProgramListing>
+ <programlisting>
SELECT city, max(temp_lo)
FROM weather
WHERE city like 'P%'
GROUP BY city
HAVING min(temp_lo) < 0;
- </ProgramListing>
+ </programlisting>
Note that we can apply the city-name restriction in
- <Command>where</Command>, since it needs no aggregate. This is
- more efficient than adding the restriction to <Command>having</Command>,
+ <command>where</command>, since it needs no aggregate. This is
+ more efficient than adding the restriction to <command>having</command>,
because we avoid doing the grouping and aggregate calculations
- for all rows that fail the <Command>where</Command> check.
- </Para>
+ for all rows that fail the <command>where</command> check.
+ </para>
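<para>
As a further sketch, several aggregates can be computed for each group in a
single query. The output names readings and avg_low are hypothetical labels
chosen for this example:
<programlisting>
-- One row per city: number of readings and average low temperature.
SELECT city, count(*) AS readings, avg(temp_lo) AS avg_low
FROM weather
GROUP BY city;
</programlisting>
</para>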
- </Chapter>
+ </chapter>