CH 05
Modifying Data
Objectives
• Learn to use Transact-SQL INSERT, UPDATE, and DELETE
statements.
• Learn to use bulk copy to insert rows.
• Understand how transaction isolation levels affect performance and
concurrency.
• Learn to avoid locking and blocking problems.
• Use snapshot isolation to improve concurrency.
The files associated with this chapter are located in the following folders:
{Install Folder}\Modifying
{Install Folder}\ModifyingDataLab
Modifying Data
The SQL standard draws a distinction between Data Manipulation Language
(DML) and Data Definition Language (DDL). SELECT statements are
considered to be part of DML, even though you might not think of selecting
data as a form of “manipulation.”
When you retrieve data with a SELECT statement, a result set is returned.
However, any further actions you take on that data (besides just looking at it)
involve data manipulation queries that use one of the following three
statements: INSERT, UPDATE, or DELETE. SQL queries that use these
statements are also sometimes known as action queries.
Later in this chapter, you’ll learn how to improve performance and avoid the
locking and blocking issues that can occur during updates. The first sections
of the chapter cover how to use Transact-SQL to modify data.
Inserting Data
The INSERT statement adds new rows to a table, either single rows or
multiple rows. The basic syntax of an INSERT statement is:
See ModifyingData.sql
The simplest INSERT statement adds a single row to a particular table. Use the
following query to add a new category called ‘Vitamins’ to the Categories
table:
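A minimal sketch of such a statement, assuming the Northwind sample database
and its dbo.Categories table, where CategoryID is an identity column:

```sql
USE Northwind;

-- Add one row. CategoryID is generated by SQL Server, so it is not listed.
INSERT INTO dbo.Categories (CategoryName)
VALUES ('Vitamins');
```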
With only one value to add to the Categories table, the syntax of the INSERT
statement reads: INSERT INTO followed by the name of the table the row
should be added to, and then, in parentheses, the list of columns that the data
goes into. For this statement, only one column, CategoryName, receives new
data.
After the column list comes the keyword VALUES, which indicates that the
next set of values in parentheses contains the data for the row. Each value must
be compatible with the data type of the column it goes into, and the columns and
values must be in the same order. Since CategoryName is an nvarchar column, the
‘Vitamins’ value must be delimited by single quotes, indicating that it is a
string value compatible with the nvarchar data type.
TIP: When working with Unicode data types, such as nvarchar and ntext, it is
safest to preface all literal strings with an uppercase N. This indicates that the
string contains Unicode characters. Without the N prefix, the string is
converted to the default code page of the database, which may not be able to
handle certain characters. Here’s how the INSERT statement above would
look, using this syntax:
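With the N prefix, the statement might be written as in this sketch, which
again assumes the Northwind dbo.Categories table:

```sql
-- The N prefix marks the literal as a Unicode (nvarchar) string.
INSERT INTO dbo.Categories (CategoryName)
VALUES (N'Vitamins');
```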
The INTO keyword is optional, although INTO is part of the SQL-92 standard
and helps make the SQL statement clearer. The column list is also optional; if
you leave off the list, you must supply a value for every column in the VALUES
list, in the order that the columns are defined in the table. The following
query would insert the values 1 and Topeka in a table with two columns of int
and varchar data types:
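As an illustration only, the sketch below uses a made-up table named
dbo.Cities; the table and column names are assumptions, not part of the
sample database:

```sql
-- Hypothetical two-column table used only for this illustration.
CREATE TABLE dbo.Cities
(
    CityID   int         NOT NULL,
    CityName varchar(50) NOT NULL
);

-- No column list: the values are matched to the columns by position.
INSERT INTO dbo.Cities
VALUES (1, 'Topeka');

-- Remove the illustration table again.
DROP TABLE dbo.Cities;
```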
Also, it is best to use a column list every time you create an INSERT
statement, even if you don’t need one. Explicit code is always easier
to understand, and a column list also protects you against errors that
could occur later if a new column is added to the table.
Rows Affected
When an INSERT statement executes, no result set is returned. As with all
action queries, the only result you get back from SQL Server is a message like:
(1 row(s) affected)
The message lets you know how many rows were changed by your action
query. This information is also returned to client applications. You can
suppress the return of rows affected by setting NOCOUNT ON:
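For example, the following sketch turns the message off around a single
insert. The category name 'Herbs' is made up for the illustration:

```sql
SET NOCOUNT ON;    -- suppress the "row(s) affected" message

INSERT INTO dbo.Categories (CategoryName)
VALUES (N'Herbs');

SET NOCOUNT OFF;   -- restore the default behavior
```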
@@IDENTITY                   SCOPE_IDENTITY               IDENT_CURRENT
Returns the last identity    Returns the last identity    Returns the last identity
value generated in any       value only within the        value generated for a
table in the current         current execution scope.     specific table in any
session.                                                  session and any scope.
Table 1. The three statements that return identity column values.
@@IDENTITY and IDENT_CURRENT may not return the value you expect
for a single-table insert. They are affected by triggers and other inserts. Use
SCOPE_IDENTITY when you want to retrieve the identity column value for
the table you are performing the insert on. Here is the syntax for the three
methods:
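One way to call the three methods right after an insert is sketched below.
The category name 'Minerals' and the column aliases are made up for the
illustration:

```sql
INSERT INTO dbo.Categories (CategoryName)
VALUES (N'Minerals');

SELECT @@IDENTITY                      AS LastInSession;  -- any table, this session
SELECT SCOPE_IDENTITY()                AS LastInScope;    -- this scope only
SELECT IDENT_CURRENT('dbo.Categories') AS LastForTable;   -- this table, any session
```

With no triggers on the table and no other sessions inserting rows, all three
return the same value.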
Single quotes within the data would confuse SQL Server, since it uses single
quotes to identify where strings begin and end. When you place two single
quotes together, SQL Server recognizes that the quote is actually within the
string. The following example shows two single quotes, not a double quote,
around ‘Sunshine’:
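A sketch of such an insert follows. The product name matches the one removed
by the cleanup script at the end of the chapter; inserting only ProductName
assumes that the other Products columns allow NULL or have defaults, as they
do in the standard Northwind schema:

```sql
-- The doubled single quotes store the name as: Pure 'Sunshine' Orange Juice
INSERT INTO dbo.Products (ProductName)
VALUES ('Pure ''Sunshine'' Orange Juice');
```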
TIP: Date values also require single quote delimiters. If you use parameterized
queries to perform data modification from .NET code, then you don’t need to
worry about single quotes in your strings. This is only a concern when you
are executing SQL statements with embedded string values.
The following query inserts all of the rows in the Products table with a
CategoryID of 1 into the Beverages table.
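A multi-row insert of this kind could look like the following sketch. The
structure of dbo.Beverages shown here is an assumption made for the
illustration; create the table only if it does not already exist:

```sql
-- Assumed structure for the target table (hypothetical column choice).
CREATE TABLE dbo.Beverages
(
    ProductID   int          NOT NULL,
    ProductName nvarchar(40) NOT NULL,
    UnitPrice   money        NULL
);

-- The SELECT statement supplies the rows in place of a VALUES list.
INSERT INTO dbo.Beverages (ProductID, ProductName, UnitPrice)
SELECT ProductID, ProductName, UnitPrice
FROM dbo.Products
WHERE CategoryID = 1;
```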
Figure 3. The result set returned from the SELECT statement in the multi-row
INSERT query.
NOTE When you create a new table with SELECT INTO, the new table
schema will match the original schema, including identity
columns. The Produce table will have a ProductID that is an
identity column. The nullability of columns is also copied from the
source, but indexes, constraints, and triggers are not copied.
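A SELECT INTO statement that creates the Produce table might look like this
sketch. It assumes that CategoryID 7 identifies the Produce category, as it
does in the standard Northwind data:

```sql
-- Creates dbo.Produce and fills it in a single statement.
SELECT *
INTO dbo.Produce
FROM dbo.Products
WHERE CategoryID = 7;
```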
Temporary Tables
Although the SELECT INTO syntax allows you to create tables in a database,
you probably won’t want to use this technique very often in applications.
Allowing users to create tables at will introduces problems with backups,
security, and database clutter. But sometimes you’ll want to create temporary,
or temp, tables.
You can create temp tables so that they are accessible only by the connection
that creates them. Once that connection ends, the tables are automatically
destroyed, which eliminates all of the problems with persistent tables. Temp
tables are stored in the tempdb database and are identified by the first
character of their names: the pound symbol (#).
NOTE Temp tables created in stored procedures are destroyed when the
stored procedure terminates.
Namespace conflicts won’t occur, since every connection can have its own
#Produce in tempdb. Each #Produce table in tempdb has a suffix with an
underscore and the connection information to keep them separate. To select
data from the temp table, use the following query:
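The sketch below shows one way the #Produce temp table could be created and
then queried. The column choice and the CategoryID value are assumptions:

```sql
-- Build the temp table with SELECT INTO (assumed column choice).
SELECT ProductID, ProductName, UnitPrice
INTO #Produce
FROM dbo.Products
WHERE CategoryID = 7;

-- Query it like any other table while the connection stays open.
SELECT ProductID, ProductName, UnitPrice
FROM #Produce;
```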
Another way to create temp tables is to use the standard CREATE TABLE
syntax, as shown in the following DDL statement:
The following query inserts data into the temp table. Note that the data types
you explicitly define in the temp table must match the data types defined in the
table you are selecting data from:
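The two steps can be sketched as follows. The temp table name #Seafood and
its column choice are made up for the illustration; the data types mirror
the corresponding columns in the Northwind Products table:

```sql
-- Define the temp table explicitly (hypothetical name and columns).
CREATE TABLE #Seafood
(
    ProductID   int          NOT NULL,
    ProductName nvarchar(40) NOT NULL,
    UnitPrice   money        NULL
);

-- Fill it from Products; the data types match the source columns.
INSERT INTO #Seafood (ProductID, ProductName, UnitPrice)
SELECT ProductID, ProductName, UnitPrice
FROM dbo.Products
WHERE CategoryID = 8;

-- Optional: drop the table instead of waiting for the connection to close.
DROP TABLE #Seafood;
```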
You can do virtually everything with a temporary table that you can do with a
standard table: create indexes, create defaults and rules, modify the table, and
use the temporary table anywhere that you use a table. Any objects created for
the temporary table, such as indexes, are automatically dropped when the
temporary table is dropped.
Temporary tables can be used in stored procedures, in which case the table is
handled slightly differently. Temp tables in stored procedures are dropped
when the stored procedure ends, and do not persist for the duration of the
connection.
In practical usage, you will find that global temp tables are rarely called for,
since temp tables are generally used only by the connection that creates them.
The following example creates a new table named Condiments and uses a table
variable named @NewValues to capture the identity column values when data
is inserted into the table.
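--Assumed definitions; column types follow dbo.Products
CREATE TABLE dbo.Condiments
(ProductID int IDENTITY(1,1) NOT NULL, ProductName nvarchar(40) NOT NULL);
DECLARE @NewValues TABLE (ProductID int, ProductName nvarchar(40));
--Insert the rows and capture the generated identity values
INSERT INTO dbo.Condiments (ProductName)
OUTPUT inserted.ProductID, inserted.ProductName INTO @NewValues
SELECT ProductName
FROM dbo.Products
WHERE CategoryID = 2;
--Display the identity values
SELECT ProductID, ProductName FROM @NewValues;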
The first few rows of the result set are shown in Figure 4.
This example uses the BULK INSERT statement to load data from a text file
into SQL Server. The Products.txt file is a tab-delimited text file containing
product information. The data is in the following format. Note that there is no
header row in the text file.
1001 'Chai' 18
1002 'Chang' 19
The BULK INSERT syntax allows you to set a variety of options for inserting
the data and is fully documented in SQL Server Books Online. This example
uses a simple format specifying a tab field terminator and a newline row
terminator.
The Transact-SQL BULK INSERT statement specifies the location of the table
using the three-part naming syntax. The FROM clause specifies the name and
location of the source file, as shown here.
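A sketch of the statement is shown below. The target table dbo.NewProducts,
its column definitions, and the file path are assumptions; substitute the
folder where your copy of Products.txt is stored:

```sql
-- Assumed target table; column names and types are hypothetical.
CREATE TABLE dbo.NewProducts
(
    ProductID   int          NOT NULL,
    ProductName nvarchar(40) NOT NULL,
    UnitPrice   money        NULL
);

BULK INSERT Northwind.dbo.NewProducts      -- three-part name
FROM 'C:\Data\Products.txt'                -- hypothetical path
WITH
(
    FIELDTERMINATOR = '\t',                -- tab between fields
    ROWTERMINATOR   = '\n'                 -- newline at the end of each row
);
```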
TIP: SQL Server Integration Services (SSIS) is a new component of SQL Server
2005 that replaces the old Data Transformation Services (DTS). SSIS
provides comprehensive support for importing data from a variety of sources,
as well as support for exporting and transforming data.
Updating Data
The Transact-SQL UPDATE statement allows you to modify existing data.
The basic syntax for updating data is:
UPDATE tablename
SET column1 = value1, column2 = value2...
However, if you execute such a statement, the values in all of the rows in the
table are updated. The WHERE clause shown here restricts the number of rows
that the statement affects.
UPDATE table_name
SET [table_name].column_name = expression
WHERE search_condition
UPDATE dbo.Categories
SET
Description = 'Pills that are good for you'
WHERE CategoryName = 'Vitamins';
Note the various elements of the UPDATE statement. First, it specifies the
name of the table that contains the data to be modified. The SET clause lists all
changes to be made, specifying the column name and the value to set the
column to. You can change multiple values by listing each column name and the
value to set it to. Use a comma to separate each pair.
UPDATE dbo.Products
SET
UnitPrice = UnitPrice * 1.1,
ReorderLevel = ReorderLevel + 5
WHERE CategoryID = 2
TIP: Always develop your data modification queries as SELECT queries first.
Accidentally changing all the customer addresses to the same value can cause
tremendous problems if you do not have a current backup available. By
testing the WHERE clause of your UPDATE statement, or any other action
query, you’ll be certain that you’re affecting only the rows you want to affect.
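For instance, the price update shown earlier could be developed in two steps,
as in this sketch:

```sql
-- Step 1: preview the rows that the WHERE clause selects.
SELECT ProductID, ProductName, UnitPrice, ReorderLevel
FROM dbo.Products
WHERE CategoryID = 2;

-- Step 2: reuse exactly the same WHERE clause in the action query.
UPDATE dbo.Products
SET UnitPrice    = UnitPrice * 1.1,
    ReorderLevel = ReorderLevel + 5
WHERE CategoryID = 2;
```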
UPDATE dbo.Beverages
SET
dbo.Beverages.UnitPrice = dbo.Products.UnitPrice
FROM dbo.Beverages JOIN dbo.Products
ON dbo.Beverages.ProductID = dbo.Products.ProductID;
UPDATE
{ <object> }
SET
{ column_name .WRITE ( expression, @Offset, @Length ) }
SELECT DocumentSummary
FROM Production.Document
WHERE DocumentID = 7;
To change the word “important” to the word “critical” you need to specify the
offset and the length of the string to be replaced.
UPDATE Production.Document
SET DocumentSummary .WRITE (N'critical',6,9)
WHERE DocumentID = 7;
TIP: The CHARINDEX and PATINDEX functions are useful for calculating the
offset, or number of characters from the start of the string to the value to be
replaced.
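The sketch below computes the offset with CHARINDEX instead of hard-coding
it. It assumes the same Production.Document row used in the example above.
CHARINDEX counts positions from 1, while the .WRITE offset counts from 0, so
1 is subtracted from the result:

```sql
DECLARE @Offset bigint, @Length bigint;

-- Length of the text to be replaced.
SET @Length = LEN(N'important');

-- CHARINDEX returns 0 when the word is not found, which makes @Offset -1.
SELECT @Offset = CHARINDEX(N'important', DocumentSummary) - 1
FROM Production.Document
WHERE DocumentID = 7;

-- Replace the word only if it was found.
UPDATE Production.Document
SET DocumentSummary .WRITE (N'critical', @Offset, @Length)
WHERE DocumentID = 7
  AND @Offset >= 0;
```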
UPDATE Production.Document
SET DocumentSummary .WRITE (N'important',6,8)
WHERE DocumentID = 7 ;
Deleting Data
The DELETE statement can delete a single row or multiple rows, depending
on the selectivity of your WHERE clause. The syntax of the DELETE
statement is:
If you omit the optional WHERE clause, then all the rows in the table will be
deleted.
If you want to remove all of the rows from the Beverages table, simply omit
the WHERE clause:
The result of this DELETE query is that all rows in the Beverages table are
deleted.
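Both forms are sketched below, assuming the dbo.Beverages table used earlier
in the chapter. The price threshold in the first statement is made up for
the illustration:

```sql
-- With a WHERE clause, only the matching rows are deleted.
DELETE FROM dbo.Beverages
WHERE UnitPrice > 100;

-- Without a WHERE clause, every row in the table is deleted.
DELETE FROM dbo.Beverages;
```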
In addition, identity columns are reset to the seed value defined for the column
when you use TRUNCATE. When you use DELETE, the identity seed is not
reset.
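As a sketch, the following statements truncate the Produce table created
earlier with SELECT INTO and then report its current identity value. The use
of DBCC CHECKIDENT here is an addition for illustration:

```sql
-- Removes every row and resets the identity column to its seed value.
TRUNCATE TABLE dbo.Produce;

-- Reports the current identity value without changing it.
DBCC CHECKIDENT ('dbo.Produce', NORESEED);
```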
TIP: The term concurrency refers to the ability of multiple users or applications to
access the same data. In relation to updates, concurrency conflicts occur when
multiple users select data and then try to update the data. One problem is that
locks on the data can delay the updates. Another problem is that users may be
allowed to update data that has changed since they last inspected the data.
Applications often prevent this by adding a WHERE clause to the update to
ensure that the update will succeed only if the data in the database still has the
values that were retrieved by that user. As the following section demonstrates,
locks can also affect performance for SELECT queries by other users, which
is another type of concurrency problem.
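The WHERE clause technique described in this tip can be sketched as follows.
The variable names, the new category name, and the message text are made up
for the illustration:

```sql
DECLARE @CategoryID int, @OriginalName nvarchar(15);

-- Values the user retrieved before editing (hypothetical).
SET @CategoryID   = 1;
SET @OriginalName = N'Beverages';

-- The update succeeds only if the name is still what the user saw.
UPDATE dbo.Categories
SET CategoryName = N'Drinks'
WHERE CategoryID   = @CategoryID
  AND CategoryName = @OriginalName;

-- Zero rows affected means another user changed or deleted the row.
IF @@ROWCOUNT = 0
    PRINT 'The row was changed by another user. No update was made.';
```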
READ COMMITTED
READ COMMITTED issues shared locks when data is being read. Shared
locks do not prevent other processes from reading the same data. When data is
modified, exclusive locks are issued that remain in effect until the
transaction is committed or rolled back. It is important to understand that
while data is locked for modification, READ COMMITTED blocks that data from
being selected by preventing other processes from obtaining shared locks.
READ UNCOMMITTED
READ UNCOMMITTED is the least restrictive isolation level. It does not
issue shared locks itself or honor exclusive locks placed by other processes.
It allows dirty reads: a query can return changes that another transaction has
not yet committed and may later roll back.
As you’ll learn, the new snapshot isolation level provides a more robust
alternative to using READ UNCOMMITTED.
REPEATABLE READ
REPEATABLE READ prevents other processes from updating data that the
transaction has read until the transaction using REPEATABLE READ completes,
but it does not prevent other processes from inserting new rows. The inserted
rows are known as phantom rows, because they can appear when a query is
repeated inside a transaction that started before they were inserted.
SERIALIZABLE
The SERIALIZABLE isolation level is the most restrictive isolation level of
all. It ensures that if a query is reissued inside the same transaction, existing
rows won’t look any different and new rows won’t suddenly appear. It
employs a range of locks that prevents edits, deletions, or insertions until the
transaction is complete.
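An isolation level is chosen per connection with SET TRANSACTION ISOLATION
LEVEL. The sketch below uses REPEATABLE READ against the Northwind
Categories table as an example and then restores the default:

```sql
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN TRANSACTION;
    SELECT CategoryName FROM dbo.Categories WHERE CategoryID = 1;

    -- The shared lock on the row is held until the transaction ends,
    -- so repeating the query returns the same value.
    SELECT CategoryName FROM dbo.Categories WHERE CategoryID = 1;
COMMIT TRANSACTION;

-- Return the connection to the default level.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```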
Blocking
Blocking occurs when locks are held for too long. When a transaction is
blocked, it just sits there and waits. Over a network, blocked transactions will
either hang or eventually time out with an error message.
Deadlocks
A deadlock occurs when two separate processes are each holding a resource
that the other needs. Neither process will release the resource it is
holding until the resource held by the other becomes available. Unless one of
the processes is forced to yield, they will stay deadlocked forever. Deadlocks are difficult to
simulate in a development environment; they only seem to appear when your
database is running and many users attempt to complete the same operation at
the same time.
SQL Server uses an interval to determine which processes are running and
which are being blocked. If this interval passes twice and a process is still
blocked, SQL Server chooses a deadlock victim. The victim’s transaction is
rolled back, and error code 1205 is returned to the client application. Your
error handling routine in the client application can test for error 1205 and
resubmit or cancel the query. If SQL Server did not select a deadlock victim,
then eventually your server would run out of available processes and crash.
To avoid deadlocks, access objects in the same order every time you access
them, so that if a process is blocked it goes into a wait queue until the process
holding the lock is complete.
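The text describes testing for error 1205 in the client application. The
same idea can be sketched in Transact-SQL with TRY...CATCH, as shown below.
The retry limit and the choice of statements are assumptions made for the
illustration:

```sql
DECLARE @Retries int, @Msg nvarchar(2048);
SET @Retries = 3;                          -- hypothetical retry limit

WHILE @Retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        -- Access the tables in the same order in every transaction.
        UPDATE dbo.Categories
        SET Description = 'Pills that are good for you'
        WHERE CategoryName = 'Vitamins';

        UPDATE dbo.Products
        SET ReorderLevel = ReorderLevel + 5
        WHERE CategoryID = 2;

        COMMIT TRANSACTION;
        SET @Retries = 0;                  -- success: leave the loop
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;

        IF ERROR_NUMBER() = 1205
            SET @Retries = @Retries - 1;   -- deadlock victim: try again
        ELSE
        BEGIN
            SET @Msg = ERROR_MESSAGE();
            SET @Retries = 0;              -- other errors: stop retrying
            RAISERROR('%s', 16, 1, @Msg);  -- report the original message
        END
    END CATCH
END
```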
Try It Out!
Follow these steps to see how the READ COMMITTED isolation level affects
concurrency. This example uses an explicit transaction.
1. Execute the following statement in a query window. Note that you are
beginning a transaction, which locks data until the transaction either
commits or rolls back. This transaction uses the default READ
COMMITTED isolation level.
BEGIN TRANSACTION
UPDATE dbo.Categories
SET CategoryName = 'Supplements'
WHERE CategoryName = 'Vitamins'
USE Northwind;
SELECT CategoryName FROM dbo.Categories;
3. You will see that the query is blocked by looking at the status bar at
the bottom of the query window, as shown in Figure 7. This icon will
continue to display indefinitely until the UPDATE statement in the
first window is either committed or rolled back.
5. Start the transaction in the first window again. Switch to the second
query window and execute this statement:
7. Return to the first query window and roll back the update. Execute the
query in the second window again, and you will see the original value
displayed.
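The exercise can be summarized in the following sketch. The statement for the
second window is an assumption: it uses READ UNCOMMITTED, which is consistent
with the query returning without waiting and with the original value
reappearing after the rollback. Run each part in the window named in the
comments:

```sql
-- Window 1: start a transaction and leave it open.
BEGIN TRANSACTION;
UPDATE dbo.Categories
SET CategoryName = 'Supplements'
WHERE CategoryName = 'Vitamins';

-- Window 2 (separate connection): read without waiting for the lock.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT CategoryName FROM dbo.Categories;   -- includes the uncommitted change

-- Window 1: undo the change.
ROLLBACK TRANSACTION;
```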
Under snapshot isolation, read operations do not request shared locks on the
data, so transactions that modify data do not block transactions that read
data, and transactions that read data do not block transactions that write
data, as they normally would under the default READ COMMITTED isolation level.
The term “snapshot” reflects the fact that all queries in the transaction see the
same version, or snapshot, of the database, based on the state of the database at
the moment the transaction begins. No locks are acquired on the underlying
data rows or data pages in a snapshot transaction, which permits transactions to
execute without being blocked by a prior uncompleted transaction.
NOTE Two write operations will block each other even while running
under row versioning-based isolation levels because two write
operations cannot modify the same data at the same time.
When a user or application retrieves data, it automatically gets the last saved
version of each row.
This allows snapshot isolation when it is explicitly invoked, but the default
READ COMMITTED transaction isolation level remains in effect for implicit
transactions that do not specify snapshot isolation.
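A sketch of enabling snapshot isolation for the Northwind database and then
invoking it explicitly in a transaction:

```sql
-- Allow snapshot transactions in this database.
ALTER DATABASE Northwind
SET ALLOW_SNAPSHOT_ISOLATION ON;

USE Northwind;

-- Request snapshot isolation explicitly for this connection.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRANSACTION;
    -- Reads the last committed version of each row without shared locks.
    SELECT CategoryName FROM dbo.Categories;
COMMIT TRANSACTION;

-- Return the connection to the default level.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```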
Try It Out!
Follow these steps to see how snapshot isolation works.
BEGIN TRANSACTION
UPDATE dbo.TestSnapshot
SET TestValue = 200
WHERE TestID = 1;
5. Note that the query wasn’t blocked even though the row is in the
middle of an uncommitted transaction. The last saved value (100) is
displayed in the results pane. Recall that if you use READ
COMMIT TRANSACTION;
--CLEANUP
USE Northwind;
IF OBJECT_ID('dbo.Beverages') IS NOT NULL DROP TABLE dbo.Beverages;
IF OBJECT_ID('dbo.Produce') IS NOT NULL DROP TABLE dbo.Produce;
IF OBJECT_ID('dbo.Condiments') IS NOT NULL DROP TABLE dbo.Condiments;
IF OBJECT_ID('dbo.NewProducts') IS NOT NULL DROP TABLE dbo.NewProducts;
IF OBJECT_ID('dbo.TestSnapshot') IS NOT NULL DROP TABLE dbo.TestSnapshot;
DELETE FROM dbo.Categories WHERE CategoryID > 8;
DELETE FROM dbo.Products WHERE ProductName='Pure ''Sunshine'' Orange Juice';
Summary
• Executing DML statements is the most efficient way to add, edit, or
delete rows in SQL Server tables.
• Add new rows using the INSERT statement.
• Use SCOPE_IDENTITY to retrieve identity column values.
• Use the single quote delimiter for text and date values.
• Temp tables having names that begin with one pound symbol (#) are
only accessible by the connection that creates them and are destroyed
when the connection is closed.
• Use INSERT with OUTPUT to retrieve identity column values for
multiple rows.
• Use BULK INSERT to load a large number of rows from a text file.
• Modify rows using the UPDATE statement.
• Use the TOP clause to limit the number of rows modified in an
UPDATE statement.
• Use UPDATE .WRITE to perform partial updates on columns defined
as varchar(max), nvarchar(max), or varbinary(max).
• Delete rows using the DELETE statement.
• The TRUNCATE TABLE statement logs only page deallocations, not
individual row deletions, and executes much faster than a DELETE
statement when you want to delete all rows.
• READ COMMITTED is the default isolation level for SQL Server.
• READ UNCOMMITTED is the least restrictive isolation level and
allows dirty reads.
• Blocking occurs when locks are held and a transaction sits and waits.
• A deadlock occurs when each of two processes is waiting for a
resource that is locked by the other process.
• Snapshot isolation can improve performance by avoiding blocking
scenarios so that read operations do not request shared locks and
operations that modify data do not block read operations.
Questions
1. What is the advantage of DML statements over server-side cursors?
4. What clause do you use to limit the number of rows being deleted in a
DELETE statement?
5. What two methods can you use to delete all the data from a table?
Answers
1. What is the advantage of DML statements over server-side cursors?
DML statements execute faster and more efficiently.
4. What clause do you use to limit the number of rows being deleted in a
DELETE statement?
The WHERE clause
5. What two methods can you use to delete all the data from a table?
TRUNCATE TABLE and DELETE without a WHERE clause
Lab 5:
Modifying Data
Lab 5 Overview
In this lab you’ll learn how to add, edit, and delete data from a table using
Insert, Update, and Delete queries.
• Adding a Product
• Editing a Product
• Deleting a Product
Each exercise includes an “Objective” section that describes the purpose of the
exercise. You are encouraged to try to complete the exercise from the
information given in the Objective section. If you require more information to
complete the exercise, the Objective section is followed by detailed step-by-
step instructions.
Adding a Product
Objective
In this exercise, you’ll add a record to the Products table using an Insert query.
A new product named Green Tea Cola needs to be added to the lineup of
products. Table 2 contains the values you’ll be entering for this new product.
Name Value
You will then retrieve the new identity column value for the inserted row so
that it can be used in the subsequent exercises.
Things to Consider
• What are the column names and data types for the Products table?
• Is there an identity column in the table?
• How do you retrieve the new identity value?
• Are there any special characters that must be handled properly?
Step-by-Step Instructions
1. Examine the structure of the Products table in the Object Explorer pane.
The ProductID is an identity column and does not need to be specified in
the INSERT list.
3. Do not execute the query yet. You will need to retrieve the new identity
column value with the SCOPE_IDENTITY function. Add this statement:
Execute the statements. The new identity column value will appear in the
results pane, as shown in Figure 9.
NOTE Your table may return a different ProductID value than the value
in Figure 9. In the following exercises, it is important that you use
the ProductID that your query returned.
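The two statements together might look like the sketch below. Only the
product name is taken from the exercise text; add the remaining columns and
values from Table 2 to the column list and the VALUES list:

```sql
-- ProductID is an identity column, so it is left out of the column list.
INSERT INTO dbo.Products (ProductName)
VALUES (N'Green Tea Cola');

-- Retrieve the identity value generated by the insert above.
SELECT SCOPE_IDENTITY() AS NewProductID;
```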
Editing a Product
Objective
In this exercise, you’ll set the price of the product added in the previous
exercise.
Things to Consider
• What kind of action query is used to modify existing data?
• What is the name and data type of the column being modified?
• How will you identify the record to be modified?
Step-by-Step Instructions
1. Determine the data type for the UnitPrice.
2. Use the ProductID of the new product in the WHERE clause of the
UPDATE statement to ensure that only a single row is affected.
UPDATE dbo.Products
SET UnitPrice = 4.95
WHERE ProductID = 79;
5. To verify the results of the UPDATE statement, type the following query
in the query window:
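A verification query along these lines would do. The column list is an
assumption, and 79 stands in for the ProductID that your own INSERT
returned:

```sql
SELECT ProductID, ProductName, UnitPrice
FROM dbo.Products
WHERE ProductID = 79;   -- replace 79 with your own ProductID
```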
Deleting a Product
Objective
In this exercise, you’ll delete the product that you worked with in the previous
two exercises.
Things to Consider
• Which SQL statement is used to delete records?
• How will you uniquely identify the record to be deleted?
Step-by-Step Instructions
1. To create a DELETE query that does not accidentally delete all of the rows
in a table, start with a SELECT query that has a WHERE clause specifying
the row or rows to be deleted. Use the SELECT query from the previous
exercise, in which you verified that the row was modified:
2. Delete the first line of the query with the SELECT statement.
FROM dbo.Products
WHERE ProductID = 79;
3. Add the DELETE keyword in front of the FROM keyword so that the
query looks like the following:
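After steps 2 and 3, the finished query would resemble this sketch, where 79
again stands in for the ProductID that your INSERT returned:

```sql
-- Same FROM and WHERE clauses as the SELECT query that was tested first.
DELETE
FROM dbo.Products
WHERE ProductID = 79;
```

Running the earlier SELECT query again should now return no rows.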
Figure 11. The SELECT query results after the row is deleted.