IDS 410: 8 Advanced SQL

Processing Multiple Tables

SQL has two ways to use SELECT for combining data from related tables. The most frequently used relational operation, which brings together data from two or more related tables into one resultant table is called a join.

Join: A relational operation that causes two tables with a common domain to be combined into a single table or view.

An important rule of thumb in forming join conditions is the following: there should be one condition within the WHERE clause for each pair of tables being joined.

Four types of joins are discussed next:

Equi-join: A join in which the joining condition is based on equality between values in the common columns. Common columsn appear (redundnatly) in the result table

Query: What are the names of all customers who have placed orders?

 SELECT CUSTOMER_T.CUSTOMER_ID, ORDER_T.CUSTOMER_ID, CUSTOMER_NAME, ORDER_ID
   FROM CUSTOMER_T, ORDER_T
   WHERE CUSTOMER_T.CUSTOMER_ID = ORDER_T.CUSTOMER_ID;

In MS ACCESS, the syntax for the above query is as follows:

 SELECT Customer_t.Customer_ID, Order_t.Customer_ID, Customer_Name,    
   Order_ID
   FROM Customer_t INNER JOIN Order_t ON
   Customer_t.Customer_ID = Order_t.Customer_ID;

Natural-join: same as equi-join, except that one of the duplicate columns is eliminated in the result table. The natural join is the most commonly used form of join operation.

Query: For each customer who has placed an order, what is the customer’s name and order ID?

 SELECT CUSTOMER_NAME, ORDER_ID
     FROM CUSTOMER_T, ORDER_T
     WHERE CUSTOMER_T.CUSTOMER_ID = ORDER_T.CUSTOMER_ID;

Another multiple join query example: Assemble all information necessary to create an invoice for order number 1006

SELECT CUSTOMER_T.CUSTOMER_ID, CUSTOMER_NAME, CUSTOMER_ADDRESS, 
CITY, SATE, POSTAL_CODE, ORDER_T.ORDER_ID, ORDER_DATE, 
QUANTITY, PRODUCT_NAME, UNIT_PRICE,      (QUANTITY * UNIT_PRICE)
     FROM CUSTOMER_T, ORDER_T, ORDER_LINE_T, PRODUCT_T
     WHERE CUSTOMER_T.CUSTOMER_ID = ORDER_T.CUSTOMER_ID AND 
     ORDER_T.ORDER_ID = ORDER_LINE_T.ORDER_ID AND
     ORDER_LINE_T.PRODUCT_ID = PRODUCT_T.PRODUCT_ID AND
     ORDER_T.ORDER_ID = 1006;

The same multiple join example: assemble all information necessary to create an invoice for order number 1006. Use MS Access SQL syntax.

SELECT CUSTOMER_T.CUSTOMER_ID, CUSTOMER_NAME, CUSTOMER_ADDRESS, 
CITY, SATE, POSTAL_CODE, ORDER_T.ORDER_ID, ORDER_DATE, 
QUANTITY, PRODUCT_NAME, UNIT_PRICE,      (QUANTITY * UNIT_PRICE)
     FROM ((CUSTOMER_T inner join ORDER_T on CUSTOMER_T.CUSTOMER_ID = ORDER_T.CUSTOMER_ID)      inner join ORDER_LINE_T on ORDER_T.ORDER_ID = ORDER_LINE_T.ORDER_ID) inner      join PRODUCT_T on ORDER_LINE_T.PRODUCT_ID = PRODUCT_T.PRODUCT_ID
     WHERE ORDER_T.ORDER_ID = 1006;

Outer join: a join in which rows that do not have matching values in common columns are nevertheless included in the result table.

In joining two tables, we often find that a row in one table does not have a matching row in the other table. For example, several CUSTOMER_ID numbers do not appear in the ORDER_T table (see Figure 7-3 on pages 263). We can assume that this is because those customers have not placed orders. As a result, the equi-join and natural join do not include all of the customers shown in CUSTOMER_T. The organization may be very interested in identifying those customers who have not placed orders. Using an outer join, we can produce the information about these customers.

Query: List customer name, customer ID, and order ID for all customers listed in the CUSTOMER table. Include the customer ID and name even if there is no order available for that customer.

 SELECT CUSTOMER_T.CUSTOMER_ID, CUSTOMER_NAME, ORDER_ID
     FROM CUSTOMER_T LEFT OUTER JOIN ORDER_T
     WHERE CUSTOMER_T.CUSTOMER_ID = ORDER_T.CUSTOMER_ID;

Alternatively:

 SELECT CUSTOMER_T.CUSTOMER_ID, CUSTOMER_NAME, ORDER_ID
     FROM ORDER_T RIGHT OUTER JOIN CUSTOMER_T
     WHERE CUSTOMER_T.CUSTOMER_ID = ORDER_T.CUSTOMER_ID;

It is also possible to request a FULL OUTER JOIN, in which case, all rows would be matched and returned, including rows that do not have a match in the other table.

Union Join: Specified in SQL-99; includes all data from each table that is joined. Not to be confused with the UNION command used to join multiple SELECT statements, covered later in this chapter.

Subqueries

The preceding SQL examples illustrate one of the two basic approaches for joining two tables: the joining technique. The other approach is the subquery technique, which involves placing an inner query (SELECT, FROM, WHERE) within the WHERE or HAVING clause of another (outer) query.

Subquery: placing an inner query (SELECT statement) inside an outer query

Options:

In a condition of the WHERE clause
As a “table” of the FROM clause
Within the HAVING clause

Subqueries can be: non-correlated (execute once for the entire outer query) or correlated (execute once for each row returned by the outer query).

Sometimes either the joining or the subqueries technique may be used to accomplish the same result; at other times, only a join or a subquery will work. The joining technique is useful when data from several relations are to be retrieved and displayed, and the relationships are not necessarily nested. The subquery technique may be used when we only need to display data from the table in the outer query.

Query: What is the name and address of the customer who placed order number 1008?

 SELECT CUSTOMER_NAME, CUSTOMER_ADDRESS, CITY, STATE, POSTAL_CODE
     FROM CUSTOMER_T, ORDER_T
     WHERE CUSTOMER_T.CUSTOMER_ID = ORDER_T.CUSTOMER_ID AND ORDER_ID = 1008;
     Using subqueries to accomplish the same result:
     SELECT CUSTOMER_NAME, CUSTOMER_ADDRESS, CITY, STATE, POSTAL_CODE
     FROM CUSTOMER_T
     WHERE CUSTOMER_T.CUSTOMER_ID =
     (SELECT ORDER_T.CUSTOMER_ID
     FROM ORDER_T
     WHERE ORDER_ID = 1008);

Query: Which customers have placed orders?

 SELECT CUSTOMER_NAME
     FROM CUSTOMER_T
     WHERE CUSTOMER_ID IN
     ( SELECT DISTINCT CUSTOMER_ID
     FROM ORDER_T);

Query: Which customers have not placed any orders?

 SELECT CUSTOMER_NAME
     FROM CUSTOMER_T
     WHERE CUSTOMER_ID NOT IN
     ( SELECT DISTINCT CUSTOMER_ID
     FROM ORDER_T);

Note that the qualifiers NOT, ANY, and ALL may be used in front of IN or with logical operators such as =, >, and <.

Query: Which customers have not placed any orders for computer desks?

 SELECT CUSTOMER_NAME
     FROM CUSTOMER_T
     WHERE CUSTOMER_ID NOT IN
     ( SELECT CUSTOMER_ID
     FROM ORDER_T, ORDER_LINE_T, PRODUCT_T
     WHERE ORDER_T.ORDER_ID = ORDER_LINE_T.ORDER_ID AND
     ORDER_LINE_T.PRODUCT_ID = PRODUCT_T.PRODUCT_ID AND
     PRODUCT_NAME = ‘Computer Desk’
     );

Two other conditions associated with using subqueries are EXISTS and NOT EXISTS. (See p. 302)

Correlated Subqueries

Correlated Subqueries: In SQL, a subquery in which processing the inner query depends on data from the outer query. That is, the inner query is computed for each outer row (as opposed to being computed only once for all rows processed in the outer query.

Does make use of data from the outer query
Executes once for each row of the outer query
Can make use of the EXISTS operator

Correlated subquery. Query: List the details about the product with the highest unit price.

 SELECT PRODUCT_NAME, PRODUCT_FINISH, UNIT_PRICE
     FROM PRODUCT_T PA
     WHERE UNIT_PRICE > ALL
     ( SELECT UNIT_PRICE FROM PRODUCT_T PB
     WHERE PB.PRODUCT_ID != PA.PRODUCT_ID);

Alternatively, we can use a non-correlated query (discussed next) to get the same result:

 SELECT PRODUCT_NAME, PRODUCT_FINISH, UNIT_PRICE
     FROM PRODUCT_T
     WHERE UNIT_PRICE = (SELECT MAX (UNIT_PRICE) FROM PRODUCT_T);

Query: What are the order numbers for all orders that have included furniture finished in natural ash?

SELECT DISTINCT ORDER_ID FROM ORDER_LINE_T
     WHERE EXISTS
     (SELECT * FROM PRODUCT_T 
     WHERE PRODUCT_ID = ORDER_LINE_T.PRODUCT_ID 
     AND PRODUCT_FINISH = 'Natural ash');

Non-correlated subqueries

Do not depend on data from the outer query
Executes once for the entire outer query

Using Derived Tables

Subqueries are not limited to inclusion in the WHERE clause. They may also be used in the FROM clause, creating a temporary derived table that is used in the query. Creating a derived table that has an aggregate value in it, such as MAX, AVG, or MIN allow the aggregate to be used in the WHERE clause.

Query: Which products have a unit price that is higher than the average unit price?

 SELECT Product_Name, Unit_Price, AvgPrice
     FROM (SELECT AVG (Unit_Price) AvgPrice FROM Product_T), Product_T
     WHERE Unit_Price > AvgPrice;

Combining Queries

The UNION clause is used to combine the output from multiple queries together into a single result table. In order to use the UNION clause, each query involved must output the same number of columns (book says rows, Sunjaya says columns!), and they must be UNION-compatible. This means that the output from each query for each column should be of compatible data types. When performing a union where output for a column will merge two different datatypes, it is safest to use the CAST command to control the data conversion ourselves. For example, the DATE datatype in ORDER_T might need to be converted to a text datatype:

SELECT CAST(ORDER_DATE AS CHAR) FROM ORDET_T;

Query: Determine the customer(s) who have purchased the largest quantity of any Pine Valley product, and the customer(s) who have purchased the smallest quantity, and return the results in one table.

SELECT C1.Customer_ID, Customer_Name, Ordered_Quantity, 'Largest Quantity' Quantity
     FROM Customer_t C1, Order_t O1, Order_Line_t Q1
     WHERE C1.Customer_ID = O1.Customer_ID AND
     O1.Order_ID = Q1.Order_ID AND Ordered_Quantity = 
     (SELECT MAX (Ordered_Quantity) FROM Order_Line_t)
 UNION
     SELECT C1.Customer_ID, Customer_Name, Ordered_Quantity, 'Smallest Quantity' Quantity
     FROM Customer_t C1, Order_t O1, Order_Line_t Q1
     WHERE C1.Customer_ID = O1.Customer_ID AND
     O1.Order_ID = Q1.Order_ID AND Ordered_Quantity = 
     (SELECT MIN (Ordered_Quantity) FROM Order_Line_t)
     ORDERED BY Ordered_Quantity;

Conditional Expressions

Establishing IF-THEN-ELSE logical processing within SQL can (now) be accomplished using the CASE keyword in a statement. The CASE form can be constructed using either an expression that equates to a value or a predicate:

The predicate form is based on three-value logic, (true, false, don't know) but allows for more complex operations
The value expression form requires a match to the value expression

NULLIF and COALESCE are the keywords associated with the other two forms of the CASE expression.

Ensuring Transaction Integrity

Transaction: a discrete unit of work that must be completely processed or not processed at all. Or, the complete set of closely related update commands that must all be done, or none of them done, for the database to remain valid.

When a single SQL command constitutes a transaction, some RDBMSs will automatically commit or rollback after the command is run. With user-defined transactions with multiple SQL commands however, there is need for explicit commands to either entirely commit (COMMIT WORK) or entire rollback (ROLLBACK WORK) the transaction. Most systems will have a BEGIN TRANSACTION and END TRANSACTION (or COMMIT WORK) commands, which are used to mark the boundaries of a logical unit of work. There is also an AUTOCOMMIT (ON/OFF) command in some RDBMSs which specified whether changes are made permanently after each data modification command (ON) or only when work is explicitly made permanent (OFF) by the COMMIT WORK command.

(BEGIN TRANSACTION creates a log file and starts recording changes (insertions, deletions, and updates) to the database in this file. END TRANSACTION or COMMIT WORK takes the contents of the log file and starts recording all changes to the database. ROLLBACK WORK asks SQL to empty the log file.)

Further, some SQl systems have concurrency controls that handle the updating of a shared database by concurrent users. These can journalize database changes, so that a database can be recovered after abnormal termination in the middle of a transaction. They can also undo erroneous transactions.

Data Dictionary Facilities

RDMSs store database definition information in system-created tables; we can consider these system tables a data dictionary, and can be accessed using SQL SELECT statements that can generate reports about system usage, user priviledges, constraints, etc.

SQL-99 Enhancements and Extensions to SQL

p. 309

Triggers and Routines

Triggers

Trigger: A named set of SQL statements that are considered (triggered) when a data modification (INSERT, UPDATE, DELETE) occurs. If a condition stated within the trigger is met, then a prescribed action is taken.

Triggers are database objects that are stored in the database and controlled by the DBMS; the code required to create them is stored in only one location and is administered centrally (they execute against all applications that access the database). This promotes stronger data integrity and consistency of use within the database.

Triggers can also cascade, causing other triggers to fire; thus, a single request from a client can result in a series of integrity or logic checks being performed in the server without causing extensive network traffic between the client and server. Triggers can also be used to ensure referential integrity, enforce business rules, create audit trails, replicate tables, or activate a procedure (routine).

Both triggers and routines consist of procedural code. However, while trigger codes runs automatically whenever the triggering event occurs, routines do not run automatically; they have to be called in order to operate.

Triggers have three parts:

The event
The condition
The action

Simplified Oracle PL/SQL trigger syntax:

 CREATE [OR REPLACE] TRIGGER trigger_name
     {BEFORE|AFTER}{INSERT|DELETE|UPDATE} ON table_name
     [FOR EACH ROW [WHEN (trigger_condition)]]
     Trigger_body_here;

A simple example of a trigger written in PL/SQL:

CREATE OR REPLACE TRIGGER Order_ID_BIR
     BEFORE INSERT ON Order_t
     FOR EACH ROW
     BEGIN
     SELECT ID_Sequence.NextVal
     INTO :New.Order_ID
     FROM DUAL;
     END Order_ID_BIR;

In the preceding example, the trigger will automatically insert the order number whenever a new order is added. BIR is part of a trigger naming convention and stands for Before Insert Row. Triggers may occur before or after the statement that aroused the trigger is executed.

Routines

Routine: a stored block of code that have to be called in order to operate. It is like a trigger, except that it does not run automatically.

SQL-invoked routines can be either procedures or functions.

Function: A stored subroutine that returns one value and has only input parameters.

Procedure: A collection of procedural and SQL statements that are assigned a unique name within the schema and stored in the database. They may have input parameters, output parameters, and parameters that are both input and output parameters.

Advantages of SQL-invoked routines:

Flexibility. Routines may be used in more situations that constrainsts or triggers, which are limited to data modification circumstances
Efficiency. Routines can be carefully crafted and optimized to run more quickly than generic SQL statements
Sharability. Routines may be cached on the server and made available to all user, so that they do not have to be rewritten
Appplicability. Routines may apply to the entire database rather tahn being kept with one application. (This advantage is a corollary to sharability)

In order to build a simple procedure that will set a sale price, the existing PRODUCT_T table is altered by adding a new column, Sale_Price, that will hold the sale price for the product:

 ALTER TABLE PRODUCT_T
     ADD (SALE_PRICE DECIMAL (6, 2));

The following simple procedure will execute two SQL statements. Products with a Unit_Price of $400 or higher are discounted 10 percent, and products with a Unit_Price of less than $400 are discounted 15 percent. The Oracle code module that will create and store the procedure named PRODUCT_LINE_SALE is as follows:

CREATE OR REPLACE PROCEDURE Product_Line_Sale
     AS BEGIN
     UPDATE PRODUCT_T
     SET Sale_Price = 0.90 * Unit_Price
     WHERE Unit_Price >= 400;
     UPDATE PRODUCT_T
     SET Sale_Price = 0.85 * Unit_Price
     WHERE Unit_Price < 400;
     END;

To run the procedure in Oracle, use this command:

 SQL> EXEC Product_Line_Sale

Embedded SQL and Dynamic SQL

Embedded SQL

SQL was originally created to handle database access alone, and did not have the flow control or the other structures necessary to create an application.

Embedded SQL: The process of including hard-coded SQL statements in a program written in another programming language, such as C, Java, Visual Basic, ASP, etc.

Programmers write an API (application programming interface) to achieve the interface between languages. The Open Database Connectivity (ODBC) standard is the most commonly used API today. Java Database Connectivity (JDBC) is an industry standard used for connecting fom Java. It is not yet an ISO standard).

To embed SQL, we place the SQL statement within the source code of the host program, preded with the phrase EXEC SQL. A precompiler will convert the SQL statement(s) that the host language can interpret, generating a database request module (DBRM). Once the host file is compiled into object code, a linker program links the DBMS calls in the host language to a DBMS library. A program usually called BIND processes the SQL statements in the access module, parsing and validating each statemetn and deriving an access plan that attempts to optimize the statements' execution.

Dynamic Execution

Dynamic SQL: The process of making an application capable of generating specific SQL code on the fly (by deriving the precise SQL statement at runtime), as the application is processing.

Most programmers write an API, such as ODBC, which can then be passed through any ODBC-compliant database. Dynamic SQL is central to most Internet applications. The developer is able to create a more flexible application because the exact SQL query is determined at runtime, including the number of parameters to be passed, which tables will be accessed, etc. Dynamic SQL is very useful where an SQL statement shell will be used repeatedly, with different parameter values being inserted each time it executes.

Conclusion

The next few years will be a period in which SQL-99 compliant products exist side-by-side with older but entrenched versions (created by various vendors independently). We need to be aware of these possibilities and deal with them.