Integrating
Perl and Databases
Prepared with the help internet, Google and countless
contributors to this PPT
Thanks to all of them
Thanks to Tim Bunce – Author of DBI
Viswanatha Yarasi
[email protected]
[email protected]
(9849743894)
PERL Programming
• Objectives
– Introduction to PERL for databases.
– Why Perl for Databases?
– Perl ODBC
– What is Perl DBI?
– Why Perl DBI?
– Design, Architecture & Examples
– Safety
Intro. To Perl for Databases
Making simple things easy
and difficult things possible – Tim Bunce
Why Perl for Databases?
• Ideally suited to data processing:
– parsing, cleansing / filtering / merging and
formatting
• Applications
– Data Migration
– Data acquisition and reporting
– Web Page creation (static and dynamic)
• Perl lets you build sharper tools faster
Database Access
• Standardized through the DBI module
• Differing Database systems have their
own DB driver module
– DBD::mysql, DBD::Oracle, DBD::Informix, etc
• By simply changing which DBD::* you use,
your existing code can work with a new
database.
• ODBC Module
ODBC
• Flow Diagram
• Connectivity
• Details
ODBC Connectivity
ODBC: Open DataBase
Connectivity
• Uses SQL
• Requires ODBC to be installed
• Win32::ODBC supports access into the
ODBC API
ODBC Options
• Perl can access ODBC
– Win32::OLE ( OLEDB, ADO )
• Memory & processor overhead
• Not guaranteed to be ODBC or SQL
• Some bugs yet to be ironed out (eg. SQL Server and multiple
text columns)
– ODBCTie
• Is this still maintained?
– DBI
• General DBI rules apply
• Cross platform
– Win32::ODBC
• Requires Win32 machine
What ODBC drivers are installed?
• Enumerate installed ODBC drivers with:
%Drivers = Win32::ODBC::Drivers()
– Returns a hash
– Hash keys are driver names
– Hash value is string with driver specific
attributes delimited by semicolons:
“Attrib1=Value1;Attrib2=Value2;Attribn=Valuen”
DSN: Data Source Name
• All database connections begin with a DSN
• Named database configuration
• Three types:
– User DSN
– System DSN
– File DSN
• Win 95/98 only understand User and File
• When used as a CGI/ASP script with a web
server always use System DSN!
DSN: Data Source Name (2)
• Create manually using the ODBC control panel applet
• Create using Win32::ODBC::ConfigDSN()
ConfigDSN( $Action, $Driver, $Attrib1, $Attribn )
– Actions
ODBC_ADD_DSN Add new DSN
ODBC_MODIFY_DSN Modify existing DSN
ODBC_REMOVE_DSN Remove existing DSN
ODBC_ADD_SYS_DSN Add new system DSN
ODBC_MODIFY_SYS_DSN Modify existing system DSN
ODBC_REMOVE_SYS_DSN Remove existing system DSN
– Driver depends upon installed drivers (keys from
Win32::ODBC::Drivers() ):
Microsoft Access Driver (*.mdb)
DSN: Data Source Name (3)
– Attributes are any valid ODBC driver
keywords
• One required keyword:
– DSN=Foo
• Other keywords differ based on ODBC driver
• Best way to discover keywords is by reading docs
or manually creating a DSN then examining the
Registry
• Do not include a “Driver” keyword
DSN: Data Source Name (4)
Connecting Through ODBC
• Create a new Win32::ODBC object:
$db = new Win32::ODBC( "My DSN" );
• The DSN can either be the name of a DSN or it
can be a full connect string:
– “My DSN”
– “DSN=My DSN;UID=Foo;PWD=Bar”
• If the DSN passed in is really a Win32::ODBC
object then that object is “cloned”
$db2 = new Win32::ODBC( $db );
– $db2 is identical to $db but with different STMT’s
– Some database systems do not like such clones
Executing SQL Statement
• Submit a text based SQL query
$result = $db->sql( “SELECT * FROM Foo” );
• This is the only method call which returns
a non-false value upon failure
– Returns error number (ODBC driver specific;
not really valuable)
– Call $db->Error() for more error details
Fetching Results
• Call FetchRow() until it returns false
• Collect the data with Data() or
DataHash()
while( $db->FetchRow() )
{
my( %Data ) = $db->DataHash();
…process data…
}
Batch Queries
• If you submitted a batch query or a stored
procedure returns multiple result sets
repeat the FetchRow() process until
MoreResults() returns FALSE.
do
{
while( $db->FetchRow() )
{
my( %Data ) = $db->DataHash();
…process data…
}
} while( $db->MoreResults() );
Closing The Connection
• To close the database connection call
Close()
$db->close();
Transactions
• By default an ODBC connection is in
AutoCommit mode
– All transactions are committed to the database
immediately
• Turn off AutoCommit mode with:
$db->SetConnectOption( $db->SQL_AUTOCOMMIT,
$db->SQL_AUTOCOMMIT_OFF );
• Perform queries (select, insert, delete, update,
etc)
• To commit or rollback call Transact():
$db->Transact( $db->SQL_COMMIT );
$db->Transact( $db->SQL_ROLLBACK );
Row Counts
• Report number of rows in the result set with:
$db->rowCount();
• Not all ODBC drivers support it
• Some ODBC drivers only support it for insert
and delete
• Alternative is to issue a SQL count query:
SELECT Count(*) FROM Foo
– The result set is one row and one column containing a
value which is the number of rows
Connection Options
• You can set and query connection options
with:
$db->getConnectOption( $Option );
$db->setConnectOption( $Option, $Value );
• SetConnectOption() returns TRUE if
successfully set and FALSE if it failed to
set the option
• getConnectOption() returns the current
value of the specified option. It does not
return any errors!
Connection Option Examples
• To discover the current qualifier (SQL Server
this is the database, in Access it is the .mdb file):
$Row = $db->GetConnectOption( $db->
SQL_CURRENT_QUALIFIER );
• To change the login timeout value (in seconds):
$db->SetConnectOption( $db->SQL_LOGIN_TIMEOUT,
10 );
• Query the ODBC trace log file:
$db->GetConnectOption( $db->SQL_OPT_TRACEFILE
);
• Set ODBC trace logging to on:
$db->SetConnectOption( $db->SQL_OPT_TRACE,
$db->SQL_OPT_TRACE_ON );
Special Connect Options
• Some connection options must be set
before the connection to the database
is performed
• Pass these into the new() function:
$db = new Win32::ODBC( $Dsn,
ODBC::SQL_LOGIN_TIMEOUT => 10,
ODBC::SQL_OPT_TRACEFILE =>
‘c:\trace.log’,
ODBC::SQL_OPT_TRACE =>
ODBC::SQL_OPT_TRACE_ON );
Stmt Options
• Every time you submit a command to the
database (such as a query) call it a statement or
a stmt (for short)
• You can set and query options for stmt’s with:
$db->GetStmtOption( $Option );
$db->SetStmtOption( $Option, $Value );
• SetStmtOption() returns TRUE if successfully
set and FALSE if it failed to set the option
• GetStmtOption() returns the current value of the
specified option. It does not return any errors!
Stmt Option Examples
• To discover the current row:
$Row = $db->GetStmtOption( $db-
>SQL_ROW_NUMBER );
• To change the query timeout value (in
seconds):
$db->SetStmtOption( $db-
>SQL_QUERY_TIMEOUT, 10 );
• Set the driver to not scan the query for
escape clauses:
$db->SetStmtOption( $db->SQL_NOSCAN,
$db->SQL_NOSCAN_ON );
Other Sources Of Information
• Programming DBI, by Tim Bunce and
Alligator Descartes, O’Reilly & Associates.
• Win32 Perl Programming: The Standard
Extensions, Dave Roth, MacMillan
Publishing.
• Win32 Scripting Journal,
http://www.winntmag.com/newsletter/script
ing/
• The Perl Journal, http://www.tpj.com/
DBI
• Flow Diagram
• Connectivity
• Details
PERL – DBI/DBD
What Is the Perl DBI?
• The standard Database Interface for Perl
• “A perl module and specification that
defines a consistent database interface,
independent of the actual database being
used”
So why use the Perl DBI?
• Because...
– It delivers what it promises
– It’s simple and quick to use
– It’s fast, flexible and well proven
– It’s here, there and everywhere
– It’s free, with source
– Drivers available for many databases
– Commercial support is available
– It has a large user base and a strong future
What drives the design of the DBI?
• Goals of the DBI
– Be simple to use for simple applications
– Have sufficient flexibility to accommodate unusual
functionality and non-SQL databases
– Conform to applicable standards (ODBC etc.)
– Enable the creation of database-independent Perl
scripts without being limited to the lowest functionality
– Be free with open source code
• A ‘higher-level’ interface than
ODBC/JDBC
Where does the DBI fit?
Perl Application
DBI Module
DBD::Oracle DBD::Informix DBD::Other
Oracle Server Informix Server Other Server
A Simple Example - DBI (1)
• ### Prepare and Execute a statement
• if ($db->Sql("SELECT * FROM $table")) {
• print "SQL Error: " . $db->Error() . "\n";
• $db->Close();
• exit;
• }
• ### Fetch row from data source
• while ($db->FetchRow) {
• @row = $db->Data(); ### Get data values from the row
• print "row:@row\n";
• }
• print "\n\n";
• ### Prepare and Execute a statement
• if ($db->Sql("desc $table")) {
• print "SQL Error: " . $db->Error() . "\n";
• $db->Close();
• exit;
• }
• ### Fetch row from data source
• while ($db->FetchRow) {
• @row = $db->Data(); ### Get data values from the row
• print "row:@row\n";
• }
• print "\n\n";
• ### Disconnect
• $db->Close();
A Simple Example - DBI (2)
• #!perl -w
• # works on BSD-LPT0037 where DBI.pm is working
• ### Attributes to pass to DBI->connect( )
• my %attr = (
• PrintError => 1,
• RaiseError => 1
• );
• use DBI; # Load the DBI module
• # Declare and initialize variables
• #my $host = 'HOT-LPT0077';
• my $host = 'BSD-LPT0037';
• my $db = 'perl_demo';
• my $db_user = 'simanta';
• my $db_password = 'simanta';
• my $db_type = "mysql";
• ### Perform the connection using the mysql driver
• my $dbh = DBI->connect("dbi:$db_type:$db:$host", "$db_user", "$db_password", {
• PrintError => 0,
• RaiseError => 0
• } ) or die "Can't connect to the database: $DBI::errstr\n";
• ### Prepare a SQL statement for execution
• my $sth = $dbh->prepare( "SELECT * FROM megaliths" )
• or die "Can't prepare SQL statement: $DBI::errstr\n";
• ### Execute the statement in the database
• $sth->execute
• or die "Can't execute SQL statement: $DBI::errstr\n";
• ### Retrieve the returned rows of data
• my @row;
• while ( @row = $sth->fetchrow_array( ) ) {
• print "Row: @row\n";
• }
• warn "Data fetching terminated early by error: $DBI::errstr\n"
• if $DBI::err;
• print "\n\n";
Example for Insert/Update
1: #!/usr/bin/perl -w
2:
3: use DBI;
4: # assume user name and password are all defined
5: my $dbh = DBI->connect("dbi:$db_type:$db:$host", "$db_user", "$db_password", {
PrintError => 0,
RaiseError => 0
} ) or die "Can't connect to the database: $DBI::errstr\n";
6:
7: $upd = $dbh->prepare("UPDATE prices SET price=? WHERE
product=?");
8: $ins = $dbh->prepare("INSERT INTO prices (product,price)
VALUES(?,?)");
9: $rows = $upd->execute(42, "Widgets");
10: $ins->execute("Widgets", 42) if $rows == 0;
11:
12: $dbh->disconnect;
A Simple Example - ODBC
• #! perl
• # works on BSD-LPT0037 where ODBC driver for MySQL is installed
• use Win32::ODBC;
• my($DSN, $UID, $PWD, $TABLE);
• $DSN = "perl_demo";
• $UID = "simanta";
• $PWD = "simanta";
• $table = "megaliths";
• ### Connect to a data source
• $db = new Win32::ODBC("DSN=$DSN;UID=$UID;PWD=$PWD")
• or die Win32::ODBC::Error();
• ### Prepare and Execute a statement
• if ($db->Sql("SELECT id FROM $table")) {
• print "SQL Error: " . $db->Error() . "\n";
• $db->Close();
• exit;
• }
• ### Fetch row from data source
• while ($db->FetchRow) {
• my ($id) = $db->Data(); ### Get data values from the row
• print "id:$id\n";
• }
• print "\n\n";
Fetching Methods
• fetchrow_hashref
– fetch "next" row, as a hash reference (key = column name, value = field
value)
• fetchrow_arrayref
– fetch next row, as an array reference (in order defined by the table)
• fetchrow_array
– fetch next row as an array
• fetchall_arrayref
– fetch entire results
• no parameters - entire table as arrayref of arrayrefs
• pass array ref - select numbers of which columns of entire table to return
• pass hashref - select names of which columns to return, and return as an
array of hash referneces
• pass empty hashref - return entire table as array of hash references
• fetchall_hashref($key)
– fetch entire results, as a hashref of hashrefs. Key is index of table
DBI errata
• if you don't need to fetch (an UPDATE, INSERT, or
DELETE statement),
– just use do(). No need to prepare or execute
– $dbh->do('DELETE FROM class WHERE drop = 1');
• $sth->rows returns the number of rows inserted,
updated, or deleted.
– DOES NOT return number of rows that are selected!
• SQL NULL ==> Perl undef
• $dbh->{RaiseError} = 1, Perl will die on any SQL
error.
– (Otherwise, must check return value of every db call, and then
check $DBI::err)
• http://dbi.perl.org & perldoc DBI
Under the Hood
• DBI defines and implements an interface
• Driver modules do much of the real work
• DBI provides default methods, functions, tools
etc for drivers
• Not limited to the lowest common denominator -
mechanism provided for driver specific
extensions
• Designed and built for speed
• Valuable detailed call tracing/debugging built-in
Give me SPEED!
• DBI designed for speed from day one.
• DBI dispatch written in hand-crafted XS/C.
• Dispatch to XS driver methods optimized.
• Cached attributes returned by dispatch.
• Placeholders give maximum server
performance.
• Binding columns and/or fetchrow_arrayref
give maximum client-side fetch performance.
• DBI overhead is generally insignificant.
Hot handles
• Avoid using $dbh->do(…) in a speed-critical loop
• It’s usually creating and destroying a statement handle each time
• Use $sth = $dbh->prepare(…)and $sth->execute() instead
• Using prepare() gets a handle on the statement in the SQL cache
• Avoids a round-trip to server for SQL cache check on each use
• For example… convert looped
$dbh->do("insert … ?", undef, $id)
into $sth = $dbh->prepare("insert … ?”) before the loop
and $sth->execute($id) inside the loop
• This often gives a significant performance boost
• even where placeholders are emulated, such as the current
DBD::mysql
• because it avoids statement handle creation overhead
Sling less for speed
• while(@row = $sth->fetchrow_array) { }
– one column: 51,155 fetches per second
– 20 columns: 24,032 fetches per second
• while($row = $sth->fetchrow_arrayref) { }
– one column: 58,653 fetches per second - approximately 12% faster
– 20 columns: 49,390 fetches per second - approximately 51% faster
• while($row = shift(@$rowcache)
|| shift(@{$rowcache=$sth->fetchall_arrayref(undef, $max_rows)})) { }
– one column: 348,140 fetches per second - by far the fastest!
– 20 columns: 42,128 fetches per second - now slower than fetchrow_arrayref!
– Why? Balance time saved making fewer calls with time spent managing more memory
– Do your own benchmarks to find what works best for your situations
• Notes:
– Tests used DBD::mysql on 100,000 rows with fields 9 chars each. $max_rows=1000;
– Time spent inside fetchrow_* method is approximately 0.000011s (~90,000 per second)
Bind those columns!
• Compare
while($row = $sth->fetchrow_arrayref) {
print “$row->[0]: $row->[1]\n”;
}
❍ with
$sth->bind_columns(\$key, \$value);
while($sth->fetchrow_arrayref) {
print “$key: $value\n”;
}
• No row assignment code!
• No column access code!
Do more with less!
• Reduce the number of DBI calls
• The DBI is fast -- but it isn’t free!
• Using RaiseError is faster than checking return values
• and much faster than checking $DBI::err or $h->err
• Using fetchall_arrayref (or selectall_arrayref) is now much
faster
• if using a driver extension compiled with the DBI’s Driver.xst wrapper (most
are)
• because the loop is written in C and doesn’t make a method call per row
• Using fetchall_arrayref is possible for very large result sets
• new $max_rows parameter limits rows returned (and memory consumed)
• just add an outer loop to process the results in ‘batches’, or do it in-line:
$row = shift(@$cache)
|| shift @{$cache=$sth->fetchall_arrayref(undef, 1000)};
Speedy Summary
• Think about the big picture first
• Choice of tools, schema design, partitioning, latency, etc.
• Check and tune the access plans for your statements
• Teach your database about any uneven key distributions
• Use placeholders - where supported
• Especially for any statements that vary and will be executed often
• Replace do() in a loop
• with prepare() and execute()
• Sling less data for faster row fetching
• Or sling none per row by binding columns to perl variables
• Do more with less by using the DBI in the most efficient way
• Make fewer, better, DBI method calls
• Other important things to consider…
• your perl code, plus hardware, operating system, and database configuration
etc.
Give me SAFETY!
• A recipe for safety
– Use transactions (don't use AutoCommit)
– Use RaiseError
– Use eval { … } to catch errors and rollback
Safety by example
use DBI;
use Text::Wrap;
$dbh = DBI->connect(…, …, …, {
AutoCommit => 0, RaiseError => 1
});
eval {
$sth = $dbh->prepare('insert into table values
(?)');
foreach (@array_of_long_strings) {
$text = wrap($_);
$sth->execute($test);
}
};
$@ ? $dbh->rollback : $dbh->commit;
Wheels within wheels
Watching the DBI in action
• DBI has detailed call tracing built-in
• The trace be very helpful in understanding
application behavior and for debugging
• Shows parameters and results
• Trace information can be written to a file
• Not used often enough!
Attribution - For Handles
• Two kinds of attributes: Handle Attributes and Method Attributes
• A DBI handle is a reference to a hash
• Handle Attributes can be read or set by accessing the hash via the
reference
$h->{AutoCommit} = 0;
$autocomitting = $h->{AutoCommit};
• Some attributes are read-only
$sth->{NUM_OF_FIELDS} = 42; # fatal error
• Using an unknown attribute triggers a warning
$sth->{AutoCommat} = 42; # triggers a warning
$autocomitting = $sth->{AutoCommat}; # triggers a warning
• driver-private attributes (which have lowercase names) do not trigger a warning
Attribution - For Methods
• Two kinds of attributes: Handle Attributes and Method Attributes
• Many DBI methods take an ‘attributes’ parameter
• in the form of a reference to a hash of key-value pairs
• The attributes parameter is typically used to provide ‘hints’ to the
driver
• Unrecognised attributes are simply ignored
• So invalid attribute name (like typos) won't be caught
• The method attributes are generally unrelated to handle attributes
• The connect() method is an exception
• In DBI v2 prepare() will also accept handle attributes for the new
handle
Let the DBI cache your handles
• Sometimes it's not easy to hold all your handles
• e.g., library code to lookup values from the database
• The prepare_cached() method
• gives you a client side statement handle cache:
sub lookup_foo {
my ($dbh, $id) = @_;
$sth = $dbh->prepare_cached("select foo from table where
id=?");
return $dbh->selectrow_array($sth, $id);
}
• On later calls returns the previously cached handle
• for the given statement text and any method attributes
• Can avoid the need for global statement handle variables
• which can cause problems in some situations, see later
Some prepare_cached() issues
• A cached statement handle may still be Active
– because some other code is still fetching from it
– or didn't fetch all the rows (and didn't didn't call finish)
– perhaps due to an exception
• Default behavior for prepare_cached()
– if Active then warn and call finish()
• Rarely an issue in practice
• But if it is
– Alternative behaviors are available via the $is_active
parameter
prepare_cached($sql, \%attr, $if_active)
– See the docs for details
Keep a handle on your databases
• Connecting to a database can be slow
• Oracle especially so
• Try to connect once and stay connected where practical
• We'll discuss web server issues later
• The connect_cached() method …
– Acts like prepare_cached() but for database handles
– Like prepare_cached(), it’s handy for library code
– It also checks the connection and automatically reconnects if it's
broken
– Works well combined with prepare_cached(), see following
example
A connect_cached() example
• Compare and contrast...
my $dbh = DBI->connect(…);
sub lookup_foo_1 {
my ($id) = @_;
$sth = $dbh->prepare_cached("select foo from table
where id=?");
return $dbh->selectrow_array($sth, $id);
}
❍ with...
sub lookup_foo_2 {
my ($id) = @_;
my $dbh = DBI->connect_cached(…);
$sth = $dbh->prepare_cached("select foo from table
where id=?");
return $dbh->selectrow_array($sth, $id);s
Some connect_cached() issues
• Because connect_cached() may return a new connection...
– it’s important to specify all significant attributes within the connect() call
– e.g., AutoCommit, RaiseError, PrintError
– So pass the same set of attributes into all connect calls
• Similar, but not quite the same as Apache::DBI
– Doesn’t disable the disconnect() method.
• The caches can be accessed via the CachedKids handle attribute
– $dbh->{CachedKids} - for prepare_cached()
– $dbh->{Driver}->{CachedKids} - for
connect_cached()
– Could also be tied to implement LRU and other size-limiting caching
strategies
tie %{$dbh->{CachedKids}}, SomeCacheModule
First, the simple stuff...
• After calling prepare() on a statement with placeholders:
$sth = $dbh->prepare(“select * from table where k1=? and
k2=?”);
• Values need to be assigned (‘bound’) to each placeholder before
the database can execute the statement
• Either at execute, for simple cases:
$sth->execute($p1, $p2);
• or before execute:
$sth->bind_param(1, $p1);
$sth->bind_param(2, $p2);
$sth->execute;
Then, some more detail...
• If $sth->execute(…) specifies any values, it must
specify them all
• Bound values are sticky across multiple executions:
$sth->bind_param(1, $p1);
foreach my $p2 (@p2) {
$sth->bind_param(2, $p2);
$sth->execute;
}
• The currently bound values are retrievable using:
%bound_values = %{ $sth->{ParamValues} };
• Relatively new DBI feature, added in 1.28, not implemented by all
drivers yet
Your TYPE or mine?
• Sometimes the data type for bind values needs to be specified
use DBI qw(:sql_types);
• to import the type constants
$sth->bind_param(1, $value, { TYPE => SQL_INTEGER });
• to specify the INTEGER type
• which can be abbreviated to:
$sth->bind_param(1, $value, SQL_INTEGER);
• To just distinguish numeric versus string types, try
$sth->bind_param(1, $value+0); # bind as numeric value
$sth->bind_param(1, ”$value”); # bind as string value
• Works because perl values generally know if they are strings or numbers. So...
• Generally the +0 or ”” isn’t needed because $value has the right ‘perl type’ already
Got TIME for a DATE?
• Date and time types are strings in the native database format
– many valid formats, some incompatible or ambiguous 'MM/DD/YYYY' vs
'DD/MM/YYYY'
• Obvious need for a common format
– The SQL standard (ISO 9075) uses 'YYYY-MM-DD' and 'YYYY-MM-DD
HH:MM:SS'
• DBI now says using a date/time TYPE mandates ISO 9075 format
$sth->bind_param(1, "2004-12-31", SQL_DATE);
$sth->bind_param(2, "2004-12-31 23:59:59", SQL_DATETIME);
$sth->bind_col(1, \$foo, SQL_DATETIME); # for selecting data
• Driver is expected to convert to/from native database format
– New feature, as of DBI 1.43, not yet widely supported
Some TYPE gotchas
• Bind TYPE attribute is just a hint
• and like all hints in the DBI, they can be ignored
• the driver is unlikely to warn you that it's ignoring an attribute
• Many drivers only care about the number vs string
distinction
• and ignore other kinds of TYPE value
• For some drivers/databases that do pay attention to the
TYPE…
• using the wrong type can mean an index on the value field isn’t used
• Some drivers let you specify private types
$sth->bind_param(1, $value, { ora_type => 97 });
The importance of error checking
• Errors happen!
– Failure happens when you don't expect errors!
• database crash / network disconnection
• lack of disk space for insert or select (sort space for order by)
• server math error on select (divide by zero after fetching 10,000
rows)
• and maybe, just maybe, errors in your own code [Gasp!]
– Beat failure by expecting errors!
– Detect errors early to limit effects
• Defensive Programming, e.g., check assumptions
• Through Programming, e.g., check for errors after fetch loops
– (and undefined values are your friends: always enable warnings)
Error checking - ways and means
• Error checking the hard way...
$h->method or die "DBI method failed: $DBI::errstr";
$h->method or die "DBI method failed: $DBI::errstr";
$h->method or die "DBI method failed: $DBI::errstr";
• Error checking the smart way...
$h->{RaiseError} = 1;
$h->method;
$h->method;
$h->method;
Handling errors the smart way
• Setting RaiseError make the DBI call die for you
• For simple applications immediate death on error is fine
• The error message is usually accurate and detailed enough
• Better than the error messages some developers use!
• For more advanced applications greater control is needed, perhaps:
• Correct the problem and retry
• or, Fail that chunk of work and move on to another
• or, Log error and clean up before a graceful exit
• or, whatever else to need to do
• Buzzwords:
• Need to catch the error exception being thrown by RaiseError
Catching the Exception
• Life after death
$h->{RaiseError} = 1;
eval {
foo();
$h->method; # if it fails then the
DBI calls die
bar($h); # may also call DBI methods
};
if ($@) { # $@ holds error message
... handle the error here …
}
• Bonus
• Other, non-DBI, code within the eval block may also raise an
exception
• that will also be caught and can be handled cleanly
Picking up the Pieces
• So, what went wrong?
$@
• holds the text of the error message
if ($DBI::err && $@ =~ /^(\S+) (\S+) failed: /)
• then it was probably a DBI error
• and $1 is the driver class (e.g. DBD::foo::db), $2 is the name of the method (e.g.
prepare)
$DBI::lasth
• holds last DBI handle used (not recommended for general use)
$h->{Statement}
• holds the statement text associated with the handle (even if it’s a database handle)
• $h->{ShowErrorStatement} = 1
• appends $h->{Statement} to RaiseError/PrintError messages:
• DBD::foo::execute failed: duplicate key [for ``insert
…’’]
• for statement handles it also includes the $h->{ParamValues} if available.
• Makes error messages much more useful. Better than using $DBI::lasth
• Many drivers should enable it by default. Inherited by child handles.
Custom Error Handling
• Don’t want to just Print or Raise an Error?
– Now you can Handle it as well…
$h->{HandleError} = sub { … };
• The HandleError code
– is called just before PrintError/RaiseError are handled
– it’s passed
• the error message string that RaiseError/PrintError would use
• the DBI handle being used
• the first value being returned by the method that failed (typically undef)
– if it returns false then RaiseError/PrintError are checked and acted upon
as normal
• The handler code can
– alter the error message text by changing $_[0]
– use caller() or Carp::confess() or similar to get a stack trace
– use Exception or a similar module to throw a formal exception object
More Custom Error Handling
• It is also possible for HandleError to hide an error, to a limited degree
• use set_err() to reset $DBI::err and $DBI::errstr
• alter the return value of the failed method
$h->{HandleError} = sub {
my ($errmsg, $h) = @_;
return 0 unless $errmsg =~ /^\S+ fetchrow_arrayref failed:/;
return 0 unless $h->err == 1234; # the error to 'hide'
$h->set_err(0,""); # turn off the error
$_[2] = [ ... ]; # supply alternative return value by altering
parameter
return 1;
};
• Only works for methods which return a single value and is hard to make
reliable (avoiding infinite loops, for example) and so isn't recommended for
general use!
• If you find a good use for it then please let me know.
Information and Warnings
• Drivers can indicate Information and Warning states in addition to Error states
– Uses false-but-defined values of $h->err and $DBI::err
– Zero "0" indicates a "warning"
– Empty "" indicates "success with information" or other messages from database
• Drivers should use $h->set_err(…) method to record info/warn/error states
– implements logic to correctly merge multiple info/warn/error states
– info/warn/error messages are appended to errstr with a newline
– $h->{ErrCount} attribute is incremented whenever an error is recorded
• The $h->{HandleSetErr} attribute can be used to influence $h->set_err()
– A code reference that's called by set_err and can edit its parameters
– So can promote warnings/info to errors or demote/hide errors etc.
– Called at point of error from within driver, unlike $h->{HandleError}
• The $h->{PrintWarn} attribute acts like $h->{PrintError} but for warnings
– Default is on
Transactions - Eh?
• Far more than just locking
• The A.C.I.D. test
• Atomicity - Consistency - Isolation - Durability
• True transactions give true safety
• even from power failures and system crashes!
• Incomplete transactions are automatically rolled-back by the
database server when it's restarted.
• Also removes burden of undoing incomplete changes
• Hard to implement (for the vendor)
• and can have significant performance cost
• A very large topic worthy of an entire tutorial
Transactions - Life Preservers
• Text Book:
• system crash between one bank account being debited and another being credited.
• Dramatic:
• power failure during update on 3 million rows when only part way through.
• Real-world:
• complex series of inter-related updates, deletes and inserts on many separate tables
fails at the last step due to a duplicate unique key on an insert.
• Locking alone won’t help you in any of these situations
• (And locking with DBD::mysql < 2.1027 is unsafe due to auto reconnect)
• Transaction recovery would handle all these situations - automatically
• Makes a system far more robust and trustworthy over the long term.
• Use transactions if your database supports them.
• If it doesn't and you need them, switch to a different database.
Transactions - How the DBI helps
• Tools of the trade:
– Set AutoCommit off
– Set RaiseError on
– Wrap eval { … } around the code
– Use $dbh->commit; and $dbh->rollback;
• Disable AutoCommit via $dbh->{AutoCommit}=0 or $dbh->begin_work;
• to enable use of transactions
• Enable RaiseError via $dbh->{RaiseError} = 1;
• to automatically 'throw an exception' when an error is detected
• Add surrounding eval { … }
• catches the exception, the error text is stored in $@
• Test $@ and then $dbh->rollback() if set
• note that a failed statement doesn’t automatically trigger a transaction rollback
Transactions - Example code
$dbh->{RaiseError} = 1;
$dbh->begin_work; # AutoCommit off
till commit/rollback
eval {
$dbh->method(…); # assorted DBI calls
foo(...); # application code
$dbh->commit; # commit the changes
};
if ($@) {
warn "Transaction aborted because $@";
$dbh->rollback;
...
}
Transactions - Further comments
• The eval { … } catches all exceptions
• not just from DBI calls. Also catches fatal runtime errors from Perl
• Put commit() inside the eval
• ensures commit failure is caught cleanly
• remember that commit itself may fail for many reasons
• Don't forget that rollback() may also fail
• due to database crash or network failure etc.
• so you'll probably want to use eval { $dbh->rollback };
• Other points:
• Always explicitly commit or rollback before disconnect
• Destroying a connected $dbh should always rollback
• END blocks can catch exit-without-disconnect to rollback and disconnect
cleanly
• You can use ($dbh && $dbh->{Active}) to check if still connected
Bulk Operations
• Execute a statement for multiple values
$sth = $dbh->prepare("insert into table (foo,bar) values
(?,?)");
$tuples = $sth->execute_array(\%attr,
\@list_of_param_array_refs);
• returns count of executions, even ones that failed, and not rows-affected
• Explicit array binding
$dbh->bind_param_array(1, \@array_of_foo_values, \%attr);
$dbh->bind_param_array(2, \@array_of_bar_values, \%attr);
$sth->execute_array(\%attr) # uses bind_param_array values
• Attributes for row-wise binding and per-tuple status:
ArrayTupleFetch => $code_ref_or_sth row-wise binding
ArrayTupleStatus => $array_ref return rows-affected and
errors
• DBI provides default methods that work for all drivers
• Some drivers support use of specific database API for very high performance
DBI security tainting
• By default DBI ignores Perl tainting
• doesn't taint database data returned ‘out’ of the DBI
• doesn't check that parameters passed ‘in’ to the DBI are not tainted
• The TaintIn and TaintOut attributes enable those behaviours
• If Perl itself is in taint mode.
• Each handle has it's own inherited tainting attributes
• So can be enabled for particular connections and disabled for particular
statements, for example:
$dbh = DBI->connect(…, { Taint => 1 }); # enable TaintIn and
TaintOut
$sth = $dbh->prepare("select * from safe_table");
$sth->{TaintOut} = 0; # don’t taint data from this statement
handle
• Attribute metadata currently varies in degree of tainting
$sth->{NAME}; — generally not tainted
$dbh->get_info(…); — may be tainted if the item of info is fetched from
database
Handling LONG/BLOB data
• What makes LONG / BLOB data special?
– Not practical to pre-allocate fixed size buffers for worst case
• Fetching LONGs - treat as normal fields after setting:
– $dbh->{LongReadLen} - buffer size to allocate for expected data
– $dbh->{LongTruncOk} - should truncating-to-fit be allowed
• Inserting LONGs
– The limitations of string literals
– The benefits of placeholders
• Chunking / Piecewise processing not yet supported
– So you're limited to available memory
– Some drivers support blob_read()and other private methods
Comparing DBI and Alternatives
(Attempting the Impossible!)
“Power”
Sybase::DBlib Native API
ODBC/
DBI Win32::ODBC JDBC
and and
Drivers Precompilers drivers
Command Line
TCL
(oratcl, sybtcl)
Please note: This chart is correct only for some values of correct! “Effort”