In SQL, to retrieve data stored in our tables, we use the SELECT statement. The result of this statement is always in the form of a table that we can view with our database client software or use with programming languages to build dynamic web pages or desktop applications. While the result may look like a table, it is not stored in the database like the named tables are. The result of a SELECT statement can also be used as part of another statement.
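As a rough illustration of these two points, the sketch below uses Python's standard-library sqlite3 module with an in-memory database. The two-column customers table and the alias nearby are assumptions made only for this example. It shows that a query result arrives as named columns plus rows, and that one SELECT can feed another statement.

```python
import sqlite3

# Throwaway in-memory database; the two-column table is an assumption
# made for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (cLastName TEXT, cZipCode TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Jewett", "92708"), ("Monge", "90840"), ("Dick", "90840")],
)

# The result of a SELECT is itself table-shaped: named columns plus rows.
cursor = conn.execute("SELECT cLastName, cZipCode FROM customers")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(columns, rows)

# The result of one SELECT used as part of another statement:
# the inner query's result plays the role of a table in the outer FROM.
(count,) = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT * FROM customers WHERE cZipCode = '90840') AS nearby"
).fetchone()
print(count)
```

The result rows exist only in the program's memory; nothing new is stored in the database by running either query.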
Basic syntax of the SELECT statement
The basic syntax consists of four clauses as shown in the figure below. While SQL keywords are not case sensitive, by convention many database developers write them in uppercase to improve readability.
SELECT {attribute}+
FROM {table}+
[ WHERE {boolean predicate to pick rows} ]
[ ORDER BY {attribute}+ ];
The four basic clauses of a SQL SELECT statement.
Of the four clauses, only the first two are required. The two shown in square brackets are optional. When you start learning to build queries, it is helpful to follow a specific step-by-step sequence, look at the data after each modification to the query, and be sure that you understand the results at each step. This iterative refinement will allow you to home in on just the right SQL statement to retrieve the desired information. Below is a summary of the clauses.
- The SELECT clause allows us to specify a comma-separated list of attribute names corresponding to the columns that are to be retrieved. You can use an asterisk character, *, to retrieve all the columns.
- In queries where all the data is found in one table, the FROM clause is where we specify the name of the table from which to retrieve rows. In other articles we will use it to retrieve rows from multiple tables.
- The WHERE clause is used to constrain which rows to retrieve. We do this by specifying a boolean predicate that compares the values of table columns to literal values or to other columns.
- The ORDER BY clause gives us a way to order the display of the rows in the result of the statement.
The example in the next section shows in more detail how to retrieve information with the SELECT statement.
SQL Example: customers in a specified zip code
We’ll build a list of customers who live in a specific zip code area, showing their first and last names and phone numbers and listing them in alphabetical order by last name. A company might want to do this to initiate a marketing campaign to customers in this area. In this example, we’ll use zip code 90840. Listed below are the refinement steps we take to arrive at the statement that will retrieve what we need.
- Start by retrieving all of the relevant data; in this case, that is all data of every customer. In our database all of this is stored in only one table, so that table is specified in the FROM clause. Since we want to retrieve all columns from this table, instead of naming each of them individually, we can use the abbreviation symbol * to indicate that all columns are to be retrieved. That completes the recipe for our SQL statement which is shown below; note, we have no use for the two optional clauses in this initial statement. In the same figure below, you will also find the result of this query executed on a tiny database.
SELECT * FROM customers;
Customers
Tom | Jewett | 714-555-1212 | 10200 Slater | 92708
Alvaro | Monge | 562-333-4141 | 2145 Main | 90840
Wayne | Dick | 562-777-3030 | 1250 Bellflower | 90840
SQL statement to retrieve all customers and the result set
While the result of a query is known as a result set, the result is not in fact always a set. The result could be a multiset, that is, a collection of rows that can have duplicate rows.
- Clearly we need a refinement step, as the query retrieves all customers while we are only interested in those who live in zip code 90840. We need to specify in the statement that the only rows to retrieve from the database are those that meet this criterion. Such qualifying criteria are specified in the WHERE clause using boolean expressions. Our first statement is thus refined as shown in the figure below.
SELECT * FROM customers WHERE cZipCode = '90840';
Customers in zip code 90840
Alvaro | Monge | 562-333-4141 | 2145 Main | 90840
Wayne | Dick | 562-777-3030 | 1250 Bellflower | 90840
Refinement #2 to retrieve desired customers.
Note that SQL syntax requires the use of single quotes around literal strings like '90840'. While not illustrated in this example and unlike SQL keywords, literal strings and strings stored in the database are case sensitive; thus, two strings that differ only in the case of their letters are different strings.
- We need just a couple more refinements. While we are now retrieving only the customers we desire, we are also retrieving every column from the table, yet not all are needed. We need a way to pick the attributes [columns] we want. This is done by listing them in the SELECT clause, with the column names separated by commas. The figure below shows this refinement and its corresponding result set.
SELECT cLastName, cFirstName, cPhone FROM customers WHERE cZipCode = '90840';
Columns from SELECT
Monge | Alvaro | 562-333-4141
Dick | Wayne | 562-777-3030
Refinement #3 to retrieve specific columns.
Note that changing the order of the columns [like showing the last name first] does not change the meaning of the results.
- For practical purposes, the previous refinement is all that we need. To make the result set more appealing to a human, however, we may want to order it. Imagine having a result set that is 100 times the size of what we are showing here! It would be better to display the result sorted alphabetically by the name of the customer. In SQL, you can use the ORDER BY clause to specify the order in which to retrieve the results. Once again, this ordering does not change the meaning of the results; the result set does not change, and all that changes is the order in which the rows are displayed. This final refinement and its result are shown below.
SELECT cLastName, cFirstName, cPhone FROM customers WHERE cZipCode = '90840' ORDER BY cLastName ASC, cFirstName ASC;
Rows in order
Dick | Wayne | 562-777-3030
Monge | Alvaro | 562-333-4141
Refinement #4 to order the rows in the result.
The keyword ASC is used to order the rows in ascending values, which is the default ordering, so the keyword is not necessary and is shown here for completeness. To order rows in descending values, use the keyword DESC. In the statement above, rows are first ordered in ascending value of the last name; in case of ties [two or more customers with the same last name], the rows are then ordered in ascending value of the first name.
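The four refinements can be replayed against a throwaway database. The following is a minimal sketch in Python using the standard-library sqlite3 module with an in-memory database. The three customers are those shown in the figures; the column name cStreet and the helper name run are assumptions made for illustration, since the article does not name the street column.

```python
import sqlite3

# In-memory database holding the three customers from the figures.
# The column name cStreet is an assumption; the article does not name it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "cFirstName TEXT, cLastName TEXT, cPhone TEXT, cStreet TEXT, cZipCode TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [
        ("Tom", "Jewett", "714-555-1212", "10200 Slater", "92708"),
        ("Alvaro", "Monge", "562-333-4141", "2145 Main", "90840"),
        ("Wayne", "Dick", "562-777-3030", "1250 Bellflower", "90840"),
    ],
)


def run(sql):
    """Hypothetical helper: execute one statement and return all rows."""
    return conn.execute(sql).fetchall()


# Refinement 1: every column of every customer.
step1 = run("SELECT * FROM customers")

# Refinement 2: only the rows for zip code 90840.
step2 = run("SELECT * FROM customers WHERE cZipCode = '90840'")

# Refinement 3: only the columns we need.
step3 = run(
    "SELECT cLastName, cFirstName, cPhone FROM customers "
    "WHERE cZipCode = '90840'"
)

# Refinement 4: rows ordered by last name, then first name.
step4 = run(
    "SELECT cLastName, cFirstName, cPhone FROM customers "
    "WHERE cZipCode = '90840' "
    "ORDER BY cLastName ASC, cFirstName ASC"
)

for row in step4:
    print(row)
```

Inspecting step1 through step4 in turn mirrors the advice to look at the data after each modification. Only step4 has a guaranteed row order, because it is the only statement with an ORDER BY clause.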
Retrieval with relational algebra
SQL is a declarative language. As such, SQL is used to declare what is to be retrieved from the database. In our SQL statement above, we did not specify how to retrieve the result. In an imperative language, we do specify the steps to take to solve a problem, such as how to retrieve a result from a database. Thus, it is the responsibility of the database system to determine how to retrieve what is declared in SQL. In relational database systems, this is commonly done by translating SQL into Relational Algebra.
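To make the contrast concrete, the sketch below retrieves the same rows twice in Python: once imperatively, with an explicit loop that spells out how to scan, filter, and sort, and once declaratively, by handing a SQL statement to SQLite through the standard-library sqlite3 module. The two-column sample data and the variable names imperative and declarative are illustrative assumptions.

```python
import sqlite3

# Sample rows (last name, zip code) taken from the article's figures.
customers = [("Jewett", "92708"), ("Monge", "90840"), ("Dick", "90840")]

# Imperative: we spell out HOW to find the rows, step by step.
imperative = []
for last_name, zip_code in customers:
    if zip_code == "90840":
        imperative.append(last_name)
imperative.sort()

# Declarative: we state WHAT we want; the database system decides how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (cLastName TEXT, cZipCode TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT cLastName FROM customers "
        "WHERE cZipCode = '90840' ORDER BY cLastName"
    )
]

print(imperative == declarative)
```

Both approaches produce the same list, but only the first commits to a particular scanning and sorting strategy; the SQL version leaves that choice to the database system.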
Like all algebras, relational algebra, abbreviated RA, applies operators to operands to produce results of the same type as the operands. RA operands are relations and thus the results are also relations. Furthermore, like all algebras, the results of operators can be used as operands in building more complex expressions. We introduce two of the RA operators following the example and refinements above for SQL.
RA operators: σ and π
To retrieve a single relation in RA, we only need to use its name. The common notation in the relational model is to use uppercase letters for relation schemes [R, S, T, U, etc] and lowercase letters for relations [r, s, t, u, etc]. Thus, the simplest RA expression, which retrieves all columns and every row of a relation, is just the name of the relation: r
The two RA operators introduced here are σ, the select operator, and π, the project operator.
- The select [RA] operator, specified by the symbol σ, picks tuples that satisfy a predicate, thus serving a similar purpose to the SQL WHERE clause. This RA select operator σ is unary, taking a single relation or RA expression as its operand. The predicate, θ, which specifies the tuples that are required, is written as a subscript of the operator, giving the syntax σ_θ(e), where e is an RA expression.
The scheme of the result of σ_θ(r) is R, the same scheme we started with, since the entire tuple is selected as long as it satisfies the predicate. The result of this operation includes all tuples of relation r that satisfy the predicate θ, that is, those for which θ evaluates to true.
- The project [RA] operator, specified by the symbol π, picks attributes, confusingly like the SQL SELECT clause. It is also a unary operator that takes a single relation or expression as its operand, and the attributes to retrieve are specified as a subscheme, X [a subset of the scheme of its operand]. The syntax is π_X(e) where, as before, e is an RA expression. Following are additional properties of the project operator.
- For X to be a subscheme of R, it must be a subset of the attributes in R that preserves the assignment rule from R [that is, each attribute of X must have the same domain as its corresponding attribute in R].
- The scheme of the result of π_X(r) is X. The tuples resulting from this operation are tuples of the original relation, r, but cut down to the attributes contained in X.
- If X is a super key of r, then there will be the same number of tuples in the result as there were to begin with in r. If X is not a super key of r, then any duplicate [non-distinct] tuples are eliminated from the result, ensuring the result is always a set. This is unlike SQL, where the result of a SELECT statement can be a multiset.
- As with other algebras, we can use function composition, applying the project operator to the result of the select operator described above, to get: π_X(σ_θ(r))
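The two operators and their composition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a relation is modelled as a frozenset of tuples (so duplicates cannot occur), a scheme is a tuple of attribute names, and the function names select and project, as well as the attribute name cStreet, are hypothetical.

```python
# Scheme R and relation r for the customers in the figures.
# A relation is modelled as a set of tuples, so duplicates cannot occur.
# The attribute name cStreet is an assumption.
R = ("cFirstName", "cLastName", "cPhone", "cStreet", "cZipCode")
r = frozenset({
    ("Tom", "Jewett", "714-555-1212", "10200 Slater", "92708"),
    ("Alvaro", "Monge", "562-333-4141", "2145 Main", "90840"),
    ("Wayne", "Dick", "562-777-3030", "1250 Bellflower", "90840"),
})


def select(theta, scheme, relation):
    """sigma_theta: keep the tuples for which the predicate theta is true.

    theta receives each tuple as a dict keyed by attribute name.
    The scheme of the result is unchanged.
    """
    kept = frozenset(t for t in relation if theta(dict(zip(scheme, t))))
    return scheme, kept


def project(X, scheme, relation):
    """pi_X: cut every tuple down to the attributes in the subscheme X.

    Building a frozenset removes any duplicate tuples automatically.
    """
    positions = [scheme.index(a) for a in X]
    cut = frozenset(tuple(t[i] for i in positions) for t in relation)
    return tuple(X), cut


# sigma_{cZipCode = '90840'}(r)
S, s = select(lambda t: t["cZipCode"] == "90840", R, r)

# pi_{cLastName, cFirstName, cPhone}(sigma_{cZipCode = '90840'}(r))
X, result = project(("cLastName", "cFirstName", "cPhone"), S, s)

# Projecting on a non-superkey collapses duplicates: both remaining
# customers share one zip code, so a single tuple is left.
Z, zips = project(("cZipCode",), S, s)

print(sorted(result))
print(zips)
```

In SQL, by contrast, SELECT cZipCode FROM customers WHERE cZipCode = '90840' would return two identical rows unless DISTINCT is added, which is the set versus multiset difference noted above.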
RA Example: customers in a specified zip code
Given the above RA syntax, we can now use RA to create expressions that match the SQL statements from above which retrieve the customers who live in zip code 90840.