Benohead Sybase ASE Cookbook
After working for over 10 years with ASE, I’ve gathered a lot of information I share on a daily
basis with colleagues and once in a while on my blog at http://benohead.com.
I provide this ebook in the hope that it will be helpful. It is thus available for free. Since I'm not a professional writer, I do not have a crew of people proof-reading it. So the spelling might not be as
good as it should be and I can’t be sure that everything in there is 100% accurate. If you find
any mistake, please contact me at [email protected] and I’ll do my best to update it.
On one of our test servers, we had an abrupt reboot while ASE was rolling back a huge transaction. After the restart, Sybase recovered that database and then recovered dbccdb as well. The ANALYSIS and REDO passes went fine but during the UNDO pass, the transaction log got full:
Can't allocate space for object 'syslogs' in database 'dbccdb' because 'logsegment' segment is
full/has no free extents. If you ran out of space in syslogs, dump the transaction log. Otherwise,
use ALTER DATABASE to increase the size of the segment.
Dumping the transaction log did not work since the database was recovering and couldn't be
accessed. So we had to use a more hardcore way known as log suicide i.e. short-circuiting the
recovery. This is generally not recommended and I wouldn't have done it on our database but on
dbccdb, it was worth a try.
First you need to allow updates on system tables as we need to manipulate the sysdatabases
table:
use master
go
sp_configure "allow updates",1
go
Now you can manually start the dataserver. When the dataserver is up again, execute the
following to truncate the transaction log of dbccdb:
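-- a sketch; with no_log the log is truncated without being dumped (use with care)
use master
go
dump tran dbccdb with no_log
go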
This will show you how to extend the transaction log on a user database by adding an additional
log device. In many cases dumping the transaction log or increasing the size of the existing
device would be better but in some occasions, I've needed this...
First you have to create a new device on the hard disk using “disk init”:
use master
go
disk init name = "log_2_dev", physname = "/db_data/devices/log_2_dev",
size=204800
go
In this example, “log_2_dev” is the name of the new device in ASE. It will be created in the file
"/db_data/devices/log_2_dev". And its size will be 400 MB (204800 pages, 2KB each). Instead
of specifying a size in 2K-blocks you can also write 400M to directly define a size in megabytes.
You can check whether the device has been properly created like this:
sp_helpdevice
go
Then you need to assign this device to the transaction log of your database (assuming the name
of the database is mydb):
use master
go
alter database mydb log on log_2_dev = '400M'
go
It tells Sybase to additionally use the new device for the transaction log (using 400 MB).
You can check with the following whether everything went fine:
sp_helpdb mydb
go
(1 row affected)
device_fragments               size          usage       created              free kbytes
------------------------------ ------------- ----------- -------------------- ----------------
data_1_dev                     991.0 MB      data only   Jan 24 2012 11:33PM  809514
log_1_dev                      109.0 MB      log only    Jan 24 2012 11:33PM  not applicable
index_1_dev                    297.0 MB      data only   Jan 24 2012 11:34PM  243404
log_2_dev                      400.0 MB      log only    Jan 25 2012  3:57AM  not applicable
--------------------------------------------------------------
log only free kbytes = 292174
(return status = 0)
After that you shouldn’t have problems anymore with the transaction log getting full too quickly.
If you need to remove the log device later on, please use the following:
use master
go
sp_dboption mydb, "single user", true
go
use mydb
go
sp_dropsegment 'logsegment','mydb','log_2_dev'
go
use master
go
sp_configure 'allow updates', 1
go
delete from sysusages where dbid=db_id('mydb') and vstart=(select low from
sysdevices where name='log_2_dev')
go
sp_configure 'allow updates', 0
go
sp_dropdevice 'log_2_dev'
go
sp_dboption mydb, "single user", false
go
The single user mode means that no connection other than the one you're working with will be available. This ensures that no other activities are done in parallel.
Allow updates allows you to modify system tables manually.
If you see that your transaction log is growing at an unexpected speed or are just curious which
transactions are currently contained in the log, you can use the dbcc log operation:
dbcc traceon(3604)
go
dbcc log(mydbname, 0)
go
Replace mydbname with the name of the database whose transaction log you want to inspect.
This will return quite some data. If you only want to see which objects (e.g. tables) are affected,
you can use the following shell command:
cat <<EOT | isql -Usa -Pmysapassword | grep "objid=" | awk ' { print $1; }' |
awk -F "=" '{ print $2; }' | sort | uniq -c | sort -n -r
dbcc traceon(3604)
go
dbcc log(mydbname, 0)
go
EOT
If isql is not in your PATH, write the full path instead (probably $SYBASE/OCS/bin/isql on a
Linux box).
This will get the log data, extract the lines containing the object ID, then extract the object IDs, count the number of times each one appears and sort the list by decreasing number of occurrences.
10 2062679415
6 20
1 8
2062679415 being a test table I've filled for the purpose of this test.
8 is syslogs.
20 is sysanchors.
You can check what those returned object IDs are with the following statement run in your
database:
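-- for example, with the object IDs returned above
select id, name from sysobjects where id in (8, 20, 2062679415)
go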
----------- --------
8 syslogs
2062679415 henri
Note that sysanchors (object ID 20) is not shown here because it is not a normal database table.
It's a per-database pseudo-catalog where row anchors used by system catalogs are stored.
Also keep in mind that checking the contents of the transaction log takes longer when it's already quite full. So if you see that the log is filling at a constant rate, it might be a good idea to read from the log not too long after the last database dump or transaction log dump, so that it is faster (reading a full log will keep an engine at 100% CPU for quite some time).
In some cases, you keep getting problems with a transaction log always filling up but do not need
to be able to restore all data in a disaster recovery scenario (e.g. because you're initially filling
the database and if something goes wrong, you can just repeat the process or because it's a test or
development system and you do not care about the data).
Unfortunately, it is not possible to completely disable the transaction log. But you can make sure that it will be truncated on every checkpoint. The transaction log is then still there. It still
costs you resources but it will be cleared at every checkpoint which will prevent it from filling
up.
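This is done with the trunc log on chkpt database option, e.g.:
use master
go
sp_dboption mydb, 'trunc log on chkpt', true
go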
Replace mydb by the name of the database on which you want to perform it.
Most of you probably already know this but just to make sure... There are three kinds of
operations from a transaction log perspective:
1. logged operations: those are operations which are completely logged in the transaction
log.
2. minimally logged or non-logged operations which are logged in the transaction log but
not every change performed by this operation is logged. They do not prevent a dump
and restore of the transaction log.
3. minimally logged or non-logged operations which are not logged in the transaction log at
all.
When an operation of the third category is performed, since the transaction log entries are
missing, a dump and restore of the transaction log only is not possible anymore. This means ASE
is not able to recover the database in a disaster recovery scenario unless you take a full dump of
the database. Since the dumped transaction log does not contain the required information, ASE
prevents you from dumping the transaction log once one of these operations has been performed
because you couldn't use the dumped log to recover the database anyway. Many people tend to
think that truncate table also prevents a transaction log dump, which is not true. Truncate
table does not log every deletion in the table and is thus not a fully logged operation but it does
log all page deallocations in the transaction log so that it's still possible to reconstruct the
database. So if you rely on a transaction log dump to recover the database or if you use it to
migrate data from one system to another, it is important to prevent such operations from happening and to check whether they have been performed.
To prevent them, you can set an option on the database (replace mydb by the name of the database you want to prevent such operations on). With this option set to false, select into, fast bcp and parallel sort operations will not be allowed anymore on this database.
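A sketch of the corresponding sp_dboption call:
use master
go
sp_dboption mydb, 'select into/bulkcopy/pllsort', false
go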
To check whether such operations have been performed, you can use the following query if they are not prevented as shown above:
select tran_dumpable_status('mydb')
If it returns 0, then everything is fine and a dump of the transaction log should work fine.
Otherwise, such an operation was performed and it is not possible to dump the transaction log
until a full dump of the database is performed.
If the returned value is not 0, you can find out exactly what happened by interpreting the returned bit mask.
So if you do not need any of the operations which prevent the transaction log from being
dumped, the best solution is to prevent them on the database level. Otherwise, when you need to
perform a transaction log dump (e.g. because the transaction log is full or because you need to
migrate the changes to a replicated database), you should first check whether a transaction log
dump would be possible and trigger a full dump instead if not.
Database dump
Get information about a dumped database
In ASE 15, it is possible to get information from a dump file directly using the load database
command with the "with headeronly" parameter:
1> load database master from '/db/dump1/master.201209170128' with headeronly
2> go
Backup Server session id is: 69. Use this value when executing the
'sp_volchanged' system stored procedure after fulfilling any volume change
request from the Backup Server.
Backup Server: 6.28.1.1: Dumpfile name 'master12261014D2 ' section number 1
mounted on disk file '/db/dump1/master.201209170128'
This is a database dump of database ID 1, name 'master', from Sep 17 2012
1:28AM. ASE version: Adaptive Server Enterprise/15.5/EBF 18164 SMP
ESD#2/P/x86_64/Enterprise Linux/asear155/2514/64-bit/F. Backup Server
version: Backup Server/15.5/EBF 18164 ESD#2/P/Linux AMD Opteron/Enterprise
Linux/asear155/3197/64-bit/OPT/We. Database page size is 4096.
Database contains 15360 pages; checkpoint RID=(Rid pageid = 0x30e5; row num =
0x20); next object ID=1929054877; sort order ID=50, status=0; charset ID=1.
Database log version=7; database upgrade version=35; database
durability=UNDEFINED.
segmap: 0x00000007 lstart=0 vstart=[vpgdevno=0 vpvpn=4] lsize=6656
unrsvd=3461
segmap: 0x00000007 lstart=6656 vstart=[vpgdevno=0 vpvpn=22532] lsize=8704
unrsvd=8223
This will not really load the dump but just display information about it. In the example above, you can see among other things the database name and ID, the date of the dump, the ASE and Backup Server versions, the page size, the durability and the segment layout (the segmap lines).
We've had the following issue: After a problem on a customer database, we needed to restore a
single table. A compressed dump was available containing the latest version of this one table but
an older version of the other tables. So the plan was to load the dump on a test system with the
same device layout, ASE version and operating system, export the table and import it on the live
system.
Oct 7 10:57:00 2013: Backup Server: 2.23.1.1: Connection from Server SYBASE
on Host myhost with HostProcid 7705.
Oct 7 10:57:00 2013: Backup Server: 4.132.1.1: Attempting to open byte
stream device: 'compress::9::/dump1_1/mydb.201310030015.000::00'
Oct 7 10:57:00 2013: Backup Server: 6.28.1.1: Dumpfile name 'mydb13276003C2
' section number 1 mounted on byte stream
'compress::9::/dump1_1/mydb.201310030015.000::00'
Oct 7 10:57:23 2013: Backup Server: 4.188.1.1: Database mydb: 2310872
kilobytes (1%) LOADED.
Oct 7 10:57:32 2013: Backup Server: 4.124.2.1: Archive API error for
device='compress::9::/dump1_1/mydb.201310030015.000::00': Vendor application
name=Compress API, Library version=1, API routine=syb_read(),
Message=syb_read: gzread() error=-1 msg=1075401822
Oct 7 10:57:32 2013: Backup Server: 6.32.2.3:
compress::9::/dump1_1/mydb.201310030015.000::00: volume not valid or not
requested (server: n byte stream 'cu
@ess::9::/dump1_1/mydb.20¤D, session id: 17.)
Oct 7 10:57:32 2013: Backup Server: 1.14.2.4: Unrecoverable I/O or volume
error. This DUMP or LOAD session must exit.
So it looks like there was a problem uncompressing the dump. I am not too sure where the
strange characters in the second to last line come from but I'm not sure either that it's related to
the problem.
Reading the header from the dump as described in a previous post worked fine. So the dump was
not completely corrupt. It's also the reason why the first percent of the dump could be loaded.
We also tried loading the dump using the "with listonly" option but it failed:
I never found out why it wasn't possible to use listonly on this dump file but I didn't really have
time to look into it in details...
Then I saw that there was a with verify only option in the Sybase documentation.
But it failed saying there was an error near "only"... Then I wondered why the syntax would be
"with headeronly" and "with listonly" but "with verify only" i.e. with an extra space. So we tried
without the space and it worked. Well, kind of... It could still load the header but failed with the
same error message while reading the rest.
Next I thought it might have been a problem while transferring the dump through FTP (I wasn't
sure whether the transfer was done in binary or ASCII mode). One way to check it is to search
for \r\n characters in the dump. It can be done using the od command. od dumps files in octal and
other formats. You can use the -c option to show special characters as escape characters (i.e. \r
and \n). So you need to run od and pipe it to a grep e.g.:
Another thing you need to check is whether uncompressing failed because of memory or disk
space issues. In our case we had plenty of free disk space and RAM available.
Another thing I found while googling for a solution was the following in a newsgroup:
Backup Server will use asynchronous I/O by default and there was a CR 335852 to work around
this behavior. Try starting backupserver using trace flag -D32 .
CR Description :-
6.21 Dumping or loading databases with asynchronous I/O
[CR #335852] On an IA32 running Red Hat, running a dump or load database command can
cause Backup Server to stop responding when using asynchronous I/O. Backup Server uses
asynchronous I/O by default.
[ Workaround : Start Backup Server using trace flag -D32 to force a synchronous I/O.
So we tried adding the flag to the start script of the backup server. But it didn't help. Anyway we
didn't know whether the problem was during loading or whether there had been a problem while
dumping.
The next thing which came up to my mind was to try and uncompress the dump file manually to
see whether it's corrupt. This can be done with gunzip. You just need to rename the file in case it
doesn't have a valid gzip extension e.g.:
mv mydb.201310060030.000 mydb.201310060030.000.gz
gunzip mydb.201310060030.000.gz
In our case it failed. So we repeated it on a dump file we knew was fine and it worked. So we
had the source of the problem. The dump stripe was corrupt.
Repeating it on the dump on site worked. So the stripe was not corrupt after the dump but was
somehow corrupted in the transfer. So all we had to do was to transfer it again.
I'm not too sure why the stripe got corrupted during the transfer but was happy it didn't get
corrupted while dumping as we had feared in the beginning.
Archive databases are used to access data from a backup file directly without having to restore
the database. Let's say you lost some data in a table but had many other changes to other tables
since the last backup. Just loading the last backup is not an option since you'd lose everything
since the last backup. Of course, if you work with transaction log dumps, you can reduce the loss
of data but very often it's still too much. Additionally, in some cases you know the data you want
to reload have not changed since the last backup (i.e. some kind of master data). So the best
solution would be to be able to keep the current database but just reload this one table. Or maybe
you do not want to reload a complete table but just copy a few deleted lines back in a table.
That's exactly what an archive database is for. You cannot dump an archive database. An archive
database is just a normal database dump loaded in a special way so that you can access the data
without having to do a regular load of the dump which would overwrite everything.
So what do you need in order to mount a database as an archive database? Well, you need two
additional databases:
1. A "scratch database"
2. An archive database
The "scratch database" is a small database you need to store a system table called sysaltusages.
This table maps the database dump files you are loading to the archive database.
The archive database is an additional database you need to store "modified pages". Modified
pages are pages which are created additionally to the pages stored in the dump files. These are
e.g. the result of a recovery performed after loading the database dump. So this database is
typically much smaller than the dump files you are loading. But it is difficult to tell upfront how
big it will be.
So once you have loaded an archive database, the data you see come from three sources: the dump files themselves, the modified pages stored in the archive database and the sysaltusages mapping stored in the scratch database.
So let's first create the two database (I assume here you have some devices available to create
these databases).
use master
go
create database scratchdb on scratch_data_dev='100M' log on
scratch_log_dev='100M'
go
This will create the scratch database and bring it online. Then we need to mark this database as a scratch database:
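-- the scratch database option marks the database as a scratch database
use master
go
sp_dboption scratchdb, 'scratch database', true
go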
use master
go
create archive database archivedb on archive_data_dev='100M' with
scratch_database = scratchdb
go
Now we're ready to load the dump. Just do it the way you would load the database to restore it
but only load it to the just created archive database e.g.:
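-- the dump file path is just an example
load database archivedb from '/db/dump1/mydb.201209170128'
go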
Note that while loading the database dump or the transaction log dumps, you might get error messages saying that either the transaction log of the scratch database or the modified pages section of the archive database has run full e.g.:
There is no more space in the modified pages section for the archive database 'pdir_archive_db'.
Use the ALTER DATABASE command to increase the amount of space available to the
database.
Depending on the message you get, you'll have to add more space for the transaction log of the
scratch database or extend the archive database using alter database. Note that ASE usually gives
you a chance to do it before aborting. But at some point in time, it will probably abort, so do not
take your time ;-)
If you do not care about the recovery and have limited storage available for the archive database
you can use:
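-- same load as above, just with the norecovery option
load database archivedb from '/db/dump1/mydb.201209170128' with norecovery
go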
Loading with norecovery also reduces the time required to load. Also the database is
automatically brought online (this also means you cannot load additional transaction logs). The
downside is that the database might be inconsistent (from a physical and transactional point of
view).
If you did not use the norecovery option, you have to bring the archive database online:
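online database archivedb
go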
Caches
Force loading in cache
You can load an entire index into Sybase's cache by executing the following:
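-- mytable and myindex are placeholders; forcing the index makes ASE scan it completely
select count(*) from mytable (index myindex)
go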
This will perform an index scan and load the index data in cache.
If you want to perform a table scan instead i.e. load the data pages in cache, you can write the table name instead of the index name. But if you have a nonclustered index with the same name as the table, the index will be loaded in cache instead. To make sure that the table itself is loaded, you can issue the following statement:
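-- one possibility (a sketch): count a column which is not part of any index,
-- which forces ASE to read the data pages
select count(mycolumn) from mytable
go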
Please note that this will not load the text and image columns of the table.
In order to check whether the whole table is in cache, you can do the following:
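-- a sketch using the MDA table monCachedObject (the MDA tables have to be enabled);
-- CachedKB can then be compared with the table size reported by sp_spaceused
select DBName, ObjectName, IndexID, CachedKB
from master..monCachedObject
where IndexID = 0
go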
This lists all tables in cache and the percentage of the data pages loaded.
In order to peek into the procedure cache, you can use the following dbcc command:
dbcc procbuf
You'll see that the output is pretty extensive. If what you are after is which triggers and procedures are using space in the procedure cache and how much space they use, you are only interested in lines like:
...
Total # of bytes used : 1266320
...
pbname='sp_aux_getsize' pbprocnum=1
...
You can thus execute it and grep for these two lines:
...
sp_jdbc_tables 62880 bytes
sp_getmessage 68668 bytes
sp_aux_getsize 80596 bytes
sp_mda 81433 bytes
sp_mda 81433 bytes
sp_drv_column_default 90144 bytes
sp_dbcc_run_deletehistory 133993 bytes
sp_lock 180467 bytes
sp_helpsegment 181499 bytes
sp_dbcc_run_summaryreport 207470 bytes
sp_modifystats 315854 bytes
sp_autoformat 339825 bytes
sp_spaceused 353572 bytes
sp_jdbc_columns 380403 bytes
sp_do_poolconfig 491584 bytes
sp_configure 823283 bytes
Identity columns
Using identity_insert to insert data in tables with an identity column
I have a stored procedure which gets the name of a table as input and does some
processing on the table. It first copies some rows from this table to a tempdb table using select
into. It then inserts some additional rows based on some logic. The stored procedure is written in
a generic way and doesn't know upfront on which table it will be working. Some of the tables
have an identity column. So after the select into, the tempdb table also has an identity column.
The second statement (an insert into) then fails if there is an identity column as it is by default
not possible to specify the value for an identity column.
In order to be able to do it, you need to set the identity_insert option to on:
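set identity_insert tempdb..mytemptable on
go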
But if you set it for a table without any identity column, you get the following error message:
Cannot use 'SET IDENTITY_INSERT' for table 'tempdb..mytemptable' because the table does
not have the identity property.
So you basically first need to check whether an identity column exists for this table and only set
the option for this table if it does. This can be done with the following statements:
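-- a sketch based on the status bit described below
if exists (select 1
           from tempdb..syscolumns c, tempdb..sysobjects o
           where c.id = o.id and o.name = 'mytemptable'
           and c.status & 128 = 128)
    set identity_insert tempdb..mytemptable on
go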
(Replace tempdb by the database containing your table and mytemptable by the name of your
table)
This first checks whether there is any column for this table which has the status bit 128 set (i.e. is
an identity column) and only then sets identity_insert to on for this table.
Additionally you should note that identity_insert can only be set to on for a single table in a
given database at a time (within one session). If it is already on for a table and you try to set it on
for another one, you'll get the following error message:
Of course if you're doing this all programmatically it doesn't help you much. So you'd first need to find out whether it's already on for another table. Unfortunately there seems to be no way to get this kind of information from ASE. The only solution to this problem seems to be to loop through all
tables in the current database having an identity column and set identity_insert to off for this
table. This can be done using the following stored procedure:
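-- the beginning of the procedure is a sketch: the cursor simply selects all user tables
-- of the current database having a column with the identity status bit (128) set
create procedure sp__clearidentityinsert
as
begin
declare @table_name varchar(255)
declare @sqlstatement varchar(1024)
declare id_ins_cursor cursor
for select o.name from sysobjects o, syscolumns c
where o.id = c.id and o.type = 'U' and c.status & 128 = 128
for read only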
open id_ins_cursor
fetch id_ins_cursor into @table_name
while @@sqlstatus = 0
begin
set @sqlstatement = 'set identity_insert '+@table_name+' off'
exec( @sqlstatement )
fetch id_ins_cursor into @table_name
end
close id_ins_cursor
deallocate cursor id_ins_cursor
end
go
Just execute it without parameters and identity_insert will be set to off for all tables in the current
database. The following should now work without problem:
use tempdb
go
set identity_insert myothertemptable on
go
sp__clearidentityinsert
go
set identity_insert mytemptable on
go
If you want to set it for tables in another database than the current database, you'll need to
hardcode the database name in the stored procedure before sysobjects, syscolumns and in
@sqlstatement e.g.:
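-- again a sketch of the beginning, this time with the database name hardcoded
create procedure sp__clearidentityinsert
as
begin
declare @table_name varchar(255)
declare @sqlstatement varchar(1024)
declare id_ins_cursor cursor
for select o.name from tempdb..sysobjects o, tempdb..syscolumns c
where o.id = c.id and o.type = 'U' and c.status & 128 = 128
for read only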
open id_ins_cursor
fetch id_ins_cursor into @table_name
while @@sqlstatus = 0
begin
set @sqlstatement = 'set identity_insert
tempdb..'+@table_name+' off'
exec( @sqlstatement )
fetch id_ins_cursor into @table_name
end
close id_ins_cursor
deallocate cursor id_ins_cursor
end
go
Parametrizing the database name is not so easy since you cannot use a parameter for the database name in the select of the cursor. We need to use some tricks with a temporary table and
dynamic SQL:
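-- a sketch of the beginning: the table names are first gathered with dynamic SQL into a
-- temporary table (#id_tables is a made-up name) since the cursor itself cannot use a
-- parameter as database name
create procedure sp__clearidentityinsert
@dbname varchar(255)
as
begin
declare @table_name varchar(255)
declare @sqlstatement varchar(1024)
create table #id_tables (name varchar(255))
set @sqlstatement = 'insert into #id_tables select o.name from ' + @dbname
    + '..sysobjects o, ' + @dbname + '..syscolumns c'
    + ' where o.id = c.id and o.type = ''U'' and c.status & 128 = 128'
exec( @sqlstatement )
declare id_ins_cursor cursor
for select name from #id_tables
for read only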
open id_ins_cursor
fetch id_ins_cursor into @table_name
while @@sqlstatus = 0
begin
set @sqlstatement = 'set identity_insert
'+@dbname+'..'+@table_name+' off'
print @sqlstatement
exec( @sqlstatement )
fetch id_ins_cursor into @table_name
end
close id_ins_cursor
deallocate cursor id_ins_cursor
sp__clearidentityinsert tempdb
When you use SELECT INTO to copy data from one table to a new table (e.g. to create a copy of the table) and your source table has an identity column, the corresponding column in the newly created table will also be an identity column. You can avoid this by wrapping the column in a convert, which strips the identity property. Here's an example:
1> select convert(int, id) as id, something into mycopy from mytable_with_id
2> go
1> sp_help mycopy
2> go
...
Column_name Type Length Prec Scale Nulls Default_name Rule_name
Access_Rule_name Computed_Column_object Identity
----------- ---- ------ ---- ----- ----- ------------ --------- ------------
---- ---------------------- ----------
id int 4 NULL NULL 0 NULL NULL NULL
NULL 0
something char 1 NULL NULL 0 NULL NULL NULL
NULL 0
...
Since the identity property is also not propagated when using a UNION, you can also do the
following:
1> select id, something into mycopy from mytable_with_id UNION select id,
something from mytable_with_id where 1=0
2> go
1> sp_help mycopy
2> go
...
Column_name Type Length Prec Scale Nulls Default_name Rule_name
Access_Rule_name Computed_Column_object Identity
----------- ---- ------ ---- ----- ----- ------------ --------- ------------
---- ---------------------- ----------
id int 4 NULL NULL 0 NULL NULL NULL
NULL 0
something char 1 NULL NULL 0 NULL NULL NULL
NULL 0
...
The second SELECT in the UNION doesn't add new data (1 is generally not equal to 0) but
prevents the identity property from being propagated.
When working with an identity column, a counter per table is maintained. This counter is used to
generate new values for the identity column of the table.
When you load a lot of data into a table as a test and then truncate the table, the identity counter is not reset. So if you inserted 1000000 rows, truncated the table and inserted an entry, you'd then get the value 1000001 for the new entry.
If you want to reset the identity counter for a table in order to have the next value of the identity
column be 1 instead, you can change the identity_burn_max attribute for the table e.g.:
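-- mytable being the table in question
sp_chgattribute mytable, 'identity_burn_max', 0, '1'
go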
Please note that this command creates an exclusive table lock on the table. The lock is not kept
for long but this means that the command will be blocked by any read or write lock on the table.
The last parameter is the identity counter you want to set. So if you have kept the entries with the
identity values 1 to 10 and deleted the rest, you'd have to set the identity_burn_max attribute to
10:
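sp_chgattribute mytable, 'identity_burn_max', 0, '10'
go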
If you try to set it to a lower value (i.e. a value lower than the maximum value already in use in
the table), sp_chgattribute will fail, refusing to update the attribute because you then risk having
duplicate values in there.
You can work around it by directly setting the attribute using dbcc:
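-- a sketch; this dbcc is not documented, so use it with care
dbcc set_identity_burn_max('mydb', 'mytable', '5')
go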
Also note that if what you want is to actually create an identity gap, all you have to do to
increase the counter is to allow identity inserts on the table, insert a higher value and delete it:
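-- a sketch; id and val are made-up columns of mytable
set identity_insert mytable on
go
insert into mytable (id, val) values (1000000, 'dummy')
delete from mytable where id = 1000000
go
set identity_insert mytable off
go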
If the query optimizer doesn't choose the query plan you'd expect (e.g. it joins the tables in a different order than you'd have expected), you need to get more details about what the optimizer has considered and why it has chosen a given query plan.
First to see what the optimizer chooses for a given query, you should execute the following
before executing the query:
set showplan on
If your query is causing a huge load, you should consider executing it with the noexec option. This will cause the optimizer to show you the query plan it would use, but not execute the query:
set showplan on
go
set noexec on
go
After executing the query, you can set back both options:
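set noexec off
go
set showplan off
go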
The order is important since you cannot set showplan back to off while noexec is on.
This will also show you whether you are performing any table scans and what the expected total cost for the query is.
Now you know which tables are processed in which order and how the joins are performed. Very
often, you get a different order on two systems depending on the number of rows, distribution of
data, how exact the statistics are... It might also use different indexes on different systems.
If what you're after is why a given index has been chosen for a table, you should execute the
following (before setting the noexec option):
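dbcc traceon(3604, 302)
go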
3604 makes sure that the trace output is sent to the client preventing it from filling up your error
log.
302 prints information regarding which indexes were considered and why a given index was
chosen.
You should check in the output of dbcc trace 302 that all where clauses in the query have been evaluated. If one was not evaluated, the optimizer considered it as not being a valid search argument. This will give you a first possible cause of the problem.
Now that you know which indexes were chosen and why, you will often also want to know why
the tables were joined in a particular order. For this you should use the following:
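dbcc traceon(3604, 310)
go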
It will show you the number of tables in join and the number of tables considered at a time. It
then prints the first plan considered by the optimizer, and then each plan which is considered
cheaper (so iteratively showing you the currently selected plan during the analysis). It writes the
heading "NEW PLAN" for each of these plans (it's important if you combine 310 with 317 which
displays the rejected plans with another heading). It also prints information about the number of
rows expected as input and output of a join and the costs involved (as well as the expected logical and physical IO).
If the plan you'd have expected to be chosen is in the list, you're done here. You see why the optimizer thought the other plan was better and can start looking for a way to convince the optimizer that your plan would have been better. Often, the optimizer is not just wrong but has wrong input data (statistics...).
If the plan you'd have expected is not there, it's one of the plans which was rejected because the optimizer thought it already had a better plan. So you also need to display the rejected/ignored plans:
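dbcc traceon(3604, 310, 317)
go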
The 317 trace basically displays the same info but for not selected plans. It uses the heading
"WORK PLAN" instead of "NEW PLAN". This produces a very large output (can easily be 20
MB of text).
If you cannot find any better way to have the optimizer think the way you want it to think, you
can use forceplan:
There is a way to force the optimizer to choose a certain plan. Specifically, the join order
specified by the query itself, in the "from" clause.
set forceplan on
This forces the optimizer to use the exact join order defined in the list of tables in the FROM
clause of the query. But this is only useful if you have an influence on the order of table in the
FROM clause. I usually do not like using forceplan but sometimes it's the only solution you have
(especially if you don't have endless time to solve a performance issue).
Please note that it only tells Sybase which join order to use, not which join types.
In databases, data skew means that many values occupy a small number of rows each and a few
values occupy many rows each. So basically it is an asymmetry in the distribution of the data
values.
The distribution of data would look like this:
We have an employee table containing all our employees world-wide (and since we're in this
example a quick-ass multinational corporation, we have millions of employees). Now assume we
also want to track in this table employees of external companies we are contracting. Additionally
we want to also track for which company they are actually working. So we add a table called
"company" and reference the ID of this company in our employee table.
Now let's assume we want to get the list of all employees working for company "ext2":
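-- the column names (id, name, company_id) are assumptions
select e.*
from employee e, company c
where e.company_id = c.id
and c.name = 'ext2'
go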
STEP 1
The type of query is SELECT.
FROM TABLE
company
c
Nested iteration.
Table Scan.
Forward Scan.
Positioning at start of table.
Using I/O Size 4 Kbytes for data pages.
With LRU Buffer Replacement Strategy for data pages.
FROM TABLE
employee
e
Nested iteration.
Table Scan.
Forward Scan.
Positioning at start of table.
Using I/O Size 32 Kbytes for data pages.
With LRU Buffer Replacement Strategy for data pages.
So basically ASE decides to perform a table scan on both tables. The density information for the
company_id column leads the optimizer into thinking that it will probably have to go through all
data pages anyway when using the index on company_id and that it's then actually cheaper to
skip the index and directly read the data pages.
The solution to this issue is to modify the statistics in order to change the total density of a
column to be equal to the range density:
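-- this is what the REMOVE_SKEW_FROM_DENSITY option of sp_modifystats does
sp_modifystats employee, 'company_id', REMOVE_SKEW_FROM_DENSITY
go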
STEP 1
The type of query is SELECT.
FROM TABLE
company
c
Nested iteration.
Table Scan.
Forward Scan.
Positioning at start of table.
Using I/O Size 4 Kbytes for data pages.
With LRU Buffer Replacement Strategy for data pages.
FROM TABLE
employee
e
Nested iteration.
Index : employee_company_index
Forward Scan.
Positioning by key.
Keys are:
company_id ASC
Using I/O Size 4 Kbytes for index leaf pages.
With LRU Buffer Replacement Strategy for index leaf pages.
Using I/O Size 4 Kbytes for data pages.
With LRU Buffer Replacement Strategy for data pages.
Which is much faster. Of course if the company name we search for is "benohead corp." (our
company) instead of "ext2", this query plan would actually be slower than going directly for a
table scan. So the optimizer is actually kind of right. You just need to know what kind of queries you perform and if you see that you only check employees of the external companies, then it makes sense to modify the density information.
You will have to repeat this operation (modifying the statistics) after every update statistics since
updating the statistics will undo the REMOVE_SKEW_FROM_DENSITY.
Note that this problem doesn't occur anymore (or less often ?) with ASE 15. So it's mostly a
problem when using ASE 12.5.4 or lower or ASE 15 with the compatibility mode switched ON
(e.g. because you have a legacy application which performs really badly with the new optimizer...).
With the ASE 15 optimizer, you get the following query plan even without modifying the density information:
STEP 1
The type of query is SELECT.
But the ASE 15 optimizer also has other problems of its own...
Updating statistics
In ASE, statistics are used by the optimizer to determine the appropriate query plan. Wrong or
out-of-date statistics can cause the wrong query plan to be selected which can lead to massive
performance issues. It is thus very important to consider statistics in regular database
maintenance activities and make sure they are updated on a regular basis.
There are a few ways in which you can make sure that the statistics are up-to-date and support
the query optimizer in choosing the optimal query plan:
At the end of the article, I'll also briefly discuss the performance considerations to keep in mind when choosing the appropriate command.
Update Statistics
Using the update statistics command, you can update the statistics about the distribution of key
values for a table, an index on the table, specific columns of the table or on partitions.
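If you run it on a table only (mytable being a placeholder):
update statistics mytable
go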
it will update statistics for the leading columns of all indexes on the table. The leading column of
a composite index is the first column in the index definition. This is the most important one since
the index cannot be used if only the second column is used in the WHERE clause but not the first
one.
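If you specify an index (say myindex, defined on col1 and col2):
update statistics mytable myindex
go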
it will update the statistics for the leading column of the specified index i.e. it will update the
statistics on col1.
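If you specify a partition (mypartition being a placeholder):
update statistics mytable partition mypartition
go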
it will update the density information on all columns of all indexes of the partition. It will
additionally also create histograms for all leading columns of indexes of the partition. You can
also provide a list of column names. In this case it will create histograms for the first column and
densities for the composite columns.
Note that only local indexes of the partition are considered. Global indexes are not considered.
Also note that updating the statistics for a partition also updates the global statistics i.e. only
local indexes are considered but global statistics are updated with the gathered data.
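If you provide a list of columns instead:
update statistics mytable (col1, col2)
go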
it will create histograms for the first referenced column and density information for all column
groups with the first column as leading column.
Also note that in many cases statistics are mostly useful for columns referenced by indexes
(especially as leading columns). Updating statistics for other columns create an overhead. But in
some cases it is required and better than using update all statistics.
The command update index statistics updates the statistics of all columns of an index:
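update index statistics mytable myindex
go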
The difference between this command and update statistics mytable myindex is that the latter focuses on the leading column of the index. The former creates histograms not only for the leading column of the index but for all columns specified in the index definition.
You can also update the statistics for all columns of all indexes on the table by omitting the index
name:
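update index statistics mytable
go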
Note that it is not exactly the same as calling update index statistics for each index on the table
since the latter updates the statistics of columns referenced by multiple indexes multiple times.
So if you intend to update the statistics for all indexes, it's more performant to omit the index
name than issuing one command per index.
When update index statistics is executed, the following happens:
1. An index scan will be performed to update the statistics of the leading column of the index
2. For the other columns, a table scan for each of them will be required followed by a
sorting based on a table in tempdb
This is basically the same problem as when using update statistics on columns which are not the
leading column of an index.
The update all statistics command creates statistics for all columns of the specified table:
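update all statistics mytable
go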
1. An index scan will be performed for each column which is the leading column of an index
2. A table scan will be performed for each other column, creating a work table in tempdb which will then be sorted
The update table statistics command is kind of a different beast. It doesn't update the data in systabstats but only table or partition level statistics in sysstatistics, i.e. it does not affect column level statistics.
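For example:
update table statistics mytable
go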
Modifying Statistics
sp_modifystats is a system stored procedure which can be used to update density information for a table or a column group.
You can use the MODIFY_DENSITY parameter to change the cell density information for
columns:
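-- a sketch; the wildcard matches col1 itself as well as column groups starting with col1
sp_modifystats mytable, 'col1%', MODIFY_DENSITY, range, absolute, '0.5'
go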
This will set the range cell density to 0.5 for the column col1 as well as for all column groups with density information having col1 as leading column.
Please refer to my previous post regarding data skew, their effect on queries and how to handle
them: Data skew and query plans.
Performance
Updating statistics is a required maintenance activity if you want to keep your queries fast. On the other hand, updating statistics also has a non-negligible impact on performance and load.
In general the leading columns of indexes are the critical ones. So you need to make sure their statistics are always up-to-date.
But in some cases it does make sense to have up-to-date statistics also for other columns: If you
have a WHERE clause also containing this column not being part of an index and 99% of the
values in this column are the same, this will greatly impact how joins are done. Without statistics
on this column, a default distribution will be assumed and the wrong join might be selected.
So I'd recommend:
1. Using "update statistics" instead of the other commands to optimize the leading columns
of indexes
2. Using "update statistics" on specific columns which are not leading column in an index,
in case you see that ASE chooses the wrong join because it assumes a default
distribution of data and it is not the case. Use it only if really required as it creates a huge
load on the system
3. Avoid using "update all statistics". It generally makes more sense to use dedicated
commands to update what needs to be updated
More information regarding sampling, histogram steps and the degree of parallelism will be
added later...
Indexes
Find all tables in a database with a unique index
The following statement will return a list of all user tables in the current database which have a
unique index:
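-- a sketch; status bit 2 in sysindexes marks a unique index
select distinct o.name
from sysobjects o, sysindexes i
where o.id = i.id and o.type = 'U' and i.status & 2 = 2
go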
If you've already changed the locking scheme of a huge table from allpages locking (which was
the locking scheme available in older versions of ASE) to datapages or datarows locking, you've
noticed that it does take a long time and keeps your CPU pretty busy.
The reason is that switching between allpages locking and data locking basically means
copying the whole table and recreating the indexes.
Note that this is also the reason why some people use this switching back and forth as a way to
defragment a table.
The first two steps (copying the table and recreating the indexes) are the ones taking the most time. It's difficult to estimate the time required for
this. But you can get an idea by checking the size of the data and indexes for this table. This can
be done using sp_spaceused:
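sp_spaceused mytable
go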
(1 row affected)
(return status = 0)
It doesn't tell you how much time is needed but if you do it on different tables, you could assume
the time needed is almost proportional to the size of data+indexes.
Note that switching between datapages and datarows locking schemes only updates system tables
and is thus pretty fast.
Get info regarding indexes and which segment they are located on
The following SQL statement displays all indexes on user tables in the current database, with the
related table, whether it's a clustered index and on which segment it is located:
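-- a sketch; for allpages locked tables the clustered index has indid 1
select o.name as table_name, i.name as index_name,
       case when i.indid = 1 then 'clustered' else 'nonclustered' end as index_type,
       s.name as segment_name
from sysobjects o, sysindexes i, syssegments s
where o.id = i.id and o.type = 'U'
and i.indid between 1 and 254
and i.segment = s.segment
go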
Please note that clustered indexes are usually located on the default segment.
Combined with the output of sp_estspace, you can find out what the required size of the index
segment should be.
In order to find the space used by a table (data and indexes) you can execute the following:
1> sp_spaceused mytable
2> go
name rowtotal reserved data index_size unused
------- -------- --------- ------- ---------- ------
mytable 20612 103320 KB 7624 KB 95216 KB 480 KB
(1 row affected)
(return status = 0)
If you see that the indexes take much more space than expected, you can also use 1 as a second
parameter to the sp_spaceused stored procedure:
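sp_spaceused mytable, 1
go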
(1 row affected)
name rowtotal reserved data index_size unused
------- -------- --------- ------- ---------- ------
mytable 20612 103320 KB 7624 KB 95216 KB 480 KB
(return status = 0)
When a table has a clustered index, ASE makes sure that all rows are physically stored in the
order defined by the columns on which you have the clustered index. There can only be one
clustered index on a given table as ASE cannot store the data with two different orders.
If the table has a data lock scheme, the table will be reorganized when the clustered index is
created but the order of rows will not be further updated. If the table has an allpages lock
scheme, then ASE will make sure that the order is maintained.
Note that although clustered indexes (especially when defined through primary key constraints)
are often created as unique indexes, it doesn't have to be the case.
Reading from a table with allpages lock and a clustered index using the keys of the clustered
index as criteria is almost always faster than without the clustered index. But writing to the table
is slower since ASE needs to maintain the order. This can create huge performance issues when
working with huge tables with many updates on the index keys or many inserts/deletes. In some
cases (I observed a case on a table with 28 million entries), committing or rolling back changes
on such a table can cause many physical IOs. If this is done in a transaction also involving
updates on other tables, this could cause many other connections to be blocked. In the case I had observed, it took up to 30 minutes to finish some rollbacks. My assumption was that it is because ASE needs to reorganize the whole index which involves reading and writing many pages. In this
case dropping the primary key constraints solved the problem. You can just replace the clustered
index by a non-clustered one.
So I'd recommend in general not to use clustered index on huge allpages tables except if you are
100% sure that you need the extra read performance. The penalty while writing can be so huge
that it cannot be compensated by the extra read speed.
We have a table with allpages lock scheme. A nonclustered index is added to this table (which
has millions of entries). During the creation of the index, many blocked delete commands are
observed. Would having another lock scheme (data page lock or data row lock) help ?
Answer:
The locking scheme of the table doesn't matter in this case. When an index is created, a table
lock is always created. If the index is clustered, then the lock is exclusive i.e. SELECT, INSERT,
UPDATE and DELETE operations are blocked. Otherwise it is only a shared lock i.e. INSERT,
UPDATE and DELETE statements are still blocked but SELECT statement are not blocked. In
this case it is a nonclustered index but for a DELETE statement it makes no difference, the
statement is still blocked.
If you need to find out which devices contain fragments of a database and their physical location,
you can use the following SELECT statement:
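-- a sketch; the join between sysusages and sysdevices uses the virtual page ranges
select db_name(u.dbid) as db_name, d.name as device_name, d.phyname
from master..sysusages u, master..sysdevices d
where u.vstart between d.low and d.high
go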
I use this when dropping proxy databases I'm creating for some temporary checks. I want to drop
the database and drop their related devices and delete the files.
I have called all my proxy databases tempdb_old1, tempdb_old2,... So I'd get the information I
require for my cleanup activity like this:
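-- same query, restricted to the databases to clean up
select db_name(u.dbid) as db_name, d.name as device_name, d.phyname
from master..sysusages u, master..sysdevices d
where u.vstart between d.low and d.high
and db_name(u.dbid) like 'tempdb_old%'
go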
(4 rows affected)
First, dsync has no effect on raw devices (i.e., a device on a raw partition) and on devices on
Windows operating system (i.e., it only affects Unix/Linux operating systems).
ASE opens the database device file of a device with the dsync setting on using the operating system dsync flag. With this flag, when ASE writes to the device file, the written data must be physically stored on disk before the system call returns.
This allows for a better recoverability of the written data in case of crash: If the writes are
buffered by the OS and the system crashes, these writes are lost. Of course, this only handles OS
level buffering. The data could still be in the disk write cache and get lost...
The drawback of dsync is that it costs performance (because the writes, even if buffered by the
OS, are guaranteed to go to the disk before the operation finishes).
It should be noted that dsync doesn't mean that there is not asynchronous I/O. It just means that
when you write synchronously or check for whether the asynchronous I/O was performed, you'll
only get the response that the write is completed once the data are effectively on the physical
disk.
dsync is always on for the master device: the performance of writes there is not critical and it's
important that it can be fully recovered.
On the other hand, it is common to turn off dsync on devices of databases which do not need to
be recovered like the tempdb.
directio
directio is basically a way to perform I/O on file system devices in a similar way to raw devices i.e. the OS buffer cache is bypassed and data are written directly to disk.
But directio does not guarantee that the writes will only return after all data have been stored on
disk (just that the data will not go into the OS caches). But since the OS buffer cache is bypassed, it does provide pretty good recoverability.
directio provides better write performance than dsync (especially if the device is stored on a SAN). On the other hand, dsync is faster for read operations. So transaction log
devices are very good candidates for directio (or for raw devices).
Also on newer Linux kernels dsync provides awful performance and you should then rather use
directio than dsync.
For the tempdb devices you should use neither dsync nor directio (as you do not need the
recoverability at all).
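For completeness: the dsync and directio settings of an existing device can be changed with sp_deviceattr (a restart of ASE is usually needed for the change to take effect); log_2_dev is just the device from the earlier example:
sp_deviceattr log_2_dev, 'dsync', 'false'
go
sp_deviceattr log_2_dev, 'directio', 'true'
go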
Size of data and log segments for all databases
Use the following SQL statement in order to get information about the size and usage of the data
and log segments in all databases:
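-- a sketch; sizes are converted from pages to MB as explained below
select db_name(u.dbid) as db_name,
       sum(case when u.segmap != 4 then u.size / 1048576. * @@maxpagesize else 0 end) as data_size_mb,
       sum(case when u.segmap != 4 then curunreservedpgs(u.dbid, u.lstart, u.unreservedpgs) / 1048576. * @@maxpagesize else 0 end) as data_free_mb,
       sum(case when u.segmap = 4 then u.size / 1048576. * @@maxpagesize else 0 end) as log_size_mb,
       lct_admin("logsegment_freepages", u.dbid) / 1048576. * @@maxpagesize as log_free_mb
from master..sysusages u
group by u.dbid
go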
@@maxpagesize returns the server’s logical page size. It's basically the same value you'd get by
using the following:
1048575 / 1048576 * 4096 returns 0 as 1048575 / 1048576 is 0 when doing some pure
integer arithmetic
1048575 / 1048576. * 4096 returns 4095.99606784
When computing the values you have to make sure that you avoid arithmetic overflow which
would happen e.g. if you multiplied by @@maxpagesize before dividing by 1048576.
That's why we exclude segmap = 4 when computing the values for the "data" columns and consider only segmap = 4 for the columns related to the size of the log segment or its usage. But we do consider both segmap=4 and segmap=7 for the usage percentage of the log since when data and log are on the same segment, a full segment would also indicate a full log.
curunreservedpgs returns the number of free pages in the specified piece of disk. The third
parameter (we provide here sysusages.unreservedpgs) is returned instead of the value in memory
when the database is not opened i.e. not in use.
lct_admin with "logsegment_freepages" as the first parameter returns the number of free pages in
a dedicated log segment
Note that all strings in bold+italic should be changed to something which is relevant for your
setup.
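The remote server is added with sp_addserver, e.g. (logical_name and the address are placeholders):
sp_addserver logical_name, ASEnterprise, '192.168.230.225:2055'
go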
You can choose anything as a logical name. This is how the remote server
will be called in master..sysservers and the name used by the other
commands we'll use.
If you use another port than 2055, you of course have to change it in the
statement above.
"ASEnterprise" is the remote server class and means that the remote server is also an ASE server.
You can check whether the remote server has been added properly:
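sp_helpserver logical_name
go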
Now you need to create an entry in the sysremotelogins table for the remote server. This is done
with the stored procedure sp_addremotelogin.
If you have the same users on both servers, you can just execute the following to map remote
names to local names:
sp_addremotelogin logical_name
If you want to map all logins from the remote server to a local user name:
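sp_addremotelogin logical_name, local_user
go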
If you only want to map a single remote login from a remote user on the
remote server to a local user:
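sp_addremotelogin logical_name, local_user, remote_user
go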
If you do not want to create it on default but on a new device, you'll need to
first create the device.
Of course the parent directory of the file which path is set in physname should exist and the
appropriate rights should be set.
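For example (name, path and size are placeholders):
disk init name = "proxy_data_dev", physname = "/db_data/devices/proxy_data_dev",
size = "100M"
go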
(you can also add the parameter directio=true to this command if required)
And then create the database:
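-- a sketch; default_location is "remote_server.remote_database.owner." (owner left empty here)
create database mydb_proxy on proxy_data_dev = '100M'
with default_location = 'logical_name.mydb..'
for proxy_update
go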
Now you can use the proxy database to access data of all tables in the
remote database:
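select * from mydb_proxy..mytable
go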
The data are still residing on the remote server so you do not need to do anything when data
change. But if the structure changes (i.e. if you add or remove tables or update the structure of a
table), you need to update the proxy:
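alter database mydb_proxy for proxy_update
go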
In order to access a remote database, you need to add a remote server and create a proxy
database as shown here.
If your remote server gets a new IP address, you can of course drop the proxy database and
remote server and recreate them. But I didn't want to do that, so just update the sysservers table
in the master database:
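1> sp_configure 'allow updates to system tables', 1
2> go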
(1 row affected)
Configuration option changed. ASE need not be rebooted since the option is
dynamic.
Changing the value of 'allow updates to system tables' does not increase the
amount of memory Adaptive Server uses.
(return status = 0)
1> update master..sysservers set srvnetname='192.168.230.236:2055' where
srvid=4
2> go
(1 row affected)
1> sp_configure 'allow updates to system tables', 0
2> go
Parameter Name Default Memory Used Config Value
Run Value Unit Type
------------------------------ -------------------- ----------- ------------
-------- -------------------- -------------------- --------------------
allow updates to system tables 0 0 0
0 switch dynamic
(1 row affected)
Configuration option changed. ASE need not be rebooted since the option is
dynamic.
Changing the value of 'allow updates to system tables' does not increase the
amount of memory Adaptive Server uses.
(return status = 0)
In order to update a system table like sysservers you first need to allow such changes. After
modifying sysservers, you can disable such updates again as shown above.
So I could connect to the remote ASE with telnet, which means I must have missed something else. In the end, I never figured out why it didn't work but I did find an alternative way to do this:
Ok, it doesn't work because I've already updated sysservers manually, so setting it back to the
way it was before the update:
(1 row affected)
Configuration option changed. ASE need not be rebooted since the option is
dynamic.
Changing the value of 'allow updates to system tables' does not increase the
amount of memory Adaptive Server uses.
(return status = 0)
1> update master..sysservers set srvnetname='192.168.230.225:2055' where
srvid=4
2> go
(1 row affected)
1> sp_configure 'allow updates to system tables', 0
2> go
Parameter Name Default Memory Used Config Value
Run Value Unit Type
------------------------------ -------------------- ----------- ------------
-------- -------------------- -------------------- --------------------
allow updates to system tables 0 0 0
0 switch dynamic
(1 row affected)
Configuration option changed. ASE need not be rebooted since the option is
dynamic.
Changing the value of 'allow updates to system tables' does not increase the
amount of memory Adaptive Server uses.
(return status = 0)
That's right! sp_addserver doesn't only add a server but can also change the physical name of a
server. Now trying to refresh the proxy again:
No error message this time ! That was easy (well, once you figure out that sp_addserver is
actually sp_add_or_update_server...).
In order to get the text of a trigger, you can use (just like for stored procedure) the sp_helptext
stored procedure:
sp_helptext mytriggername
It's not easy to use the output of the procedure in further processing. You need to
work with the loopback adapter and create a proxy table...
The text is stored in chunks of 255 characters so if you just execute the procedure and
redirect to a file, you get unwanted newlines in there
The text of triggers and stored procedure is stored in the syscomments table:
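-- e.g. for a single trigger, ordering by colid to get the chunks in the right order
select text from syscomments where id = object_id('mytriggername') order by colid
go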
The id column is the object id of the trigger or stored procedure. This can be used to make a join
with the sysobjects table to have access to the object name:
select c.text
from syscomments c, sysobjects o
where o.id=c.id and o.name='mytriggername' order by c.colid
With the statement above you still get chunks of 255 characters. Now you need to iterate through
the results and store them in a variable:
declare trigger_text_cursor cursor
for select c.text
from syscomments c, sysobjects o
where o.id=c.id and o.name='mytriggername' order by c.colid
for read only
go
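-- variables used in the loop below
declare @text varchar(255)
declare @triggertext varchar(16384)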
open trigger_text_cursor
fetch trigger_text_cursor into @text
while @@sqlstatus = 0
begin
set @triggertext=coalesce(@triggertext, '')+@text
fetch trigger_text_cursor into @text
end
close trigger_text_cursor
deallocate cursor trigger_text_cursor
select @triggertext
go
You can then use @triggertext to perform any further processing you need.
coalesce is used so that if a value is null, it will use the empty string. You could also do the same
thing using isnull.
Note that it is not possible to declare a TEXT variable. Instead you have to declare a large
VARCHAR variable (as done above). The only drawback is that the maximal length of such a
variable is 16384 characters. If you have triggers or stored procedures with a longer text, you'll
have to implement the loop in a script or program (instead of using a cursor in Transact-SQL).
To work around the limitation above: the method can only retrieve up to 16384 characters, which is the limit for a varchar variable, and defining a text variable is not possible, so the text of a few triggers longer than 16384 characters was being truncated. I ended up using a table and storing one row per line of the trigger text. Identifying the lines
basically means appending the text to a varchar(16384) variable, searching for a line feed,
storing the text before the line feed in the table, doing this recursively until you got all lines
currently in the buffer and resuming with the rest.
-- assumes the table tempdb..triggertext (a single varchar(16384) column) has been created
-- beforehand and that trigger_text_cursor is declared as in the previous example
declare @text varchar(255), @triggertext varchar(16384), @triggertext2 varchar(16384), @index int
open trigger_text_cursor
fetch trigger_text_cursor into @text
while @@sqlstatus = 0
begin
-- Append each chunk of the trigger text in syscomments to the text already gathered
set @triggertext=coalesce(@triggertext, '')+@text
-- Loop through the lines (delimited by a line feed)
select @index=charindex(Char(10),@triggertext)
while (@index > 0)
begin
select @triggertext2=substring(@triggertext,1,@index-1)
-- Add each line to the table
if (@triggertext2 is not null)
insert into tempdb..triggertext (triggertext) values(@triggertext2)
-- Continue with the rest of the string
select @triggertext=substring(@triggertext,@index+1,16384)
select @index=charindex(Char(10),@triggertext)
end
fetch trigger_text_cursor into @text
end
close trigger_text_cursor
deallocate cursor trigger_text_cursor
If you want to execute this with isql and write the output to a file, you can additionally do the
following:
call isql with the -b option which disables the display of the table headers output.
set nocount on to suppress X rows affected messages.
set proc_return_status off to suppress the display of return statuses when stored
procedures are called.
pipe the result to the following sed command to remove all those annoying trailing
spaces (basically removing all spaces before an end of line):
sed 's/[[:space:]]*$//'
In order to find all triggers on a table (select, delete, insert and update triggers), you can use the
stored procedure sp_depends:
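sp_depends tablename
go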
sp_depends relies on the sysdepends table and there have always been problems with this
table not being updated properly.
It also shows you all triggers referencing this table, even if they are triggers on
another table.
There are similar solutions which suffer from the same problem of displaying too many objects.
In order to reliably find only triggers on a given table, you have to check the sysobjects table:
1> select so2.name from sysobjects so1, sysobjects so2 where (so2.id =
so1.deltrig or so2.id = so1.instrig or so2.id=so1.updtrig or
so2.id=so1.seltrig) and so1.name='tablename'
2> go
This will return a list of the names of all triggers on the table.
Alternatively, you can also get the names of the triggers using the following statement (without
the double join and with one column per trigger instead of one row per trigger):
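A sketch of such a statement, reading the trigger names directly from the trigger ID columns of
sysobjects with object_name():
select object_name(instrig) as insert_trigger,
object_name(updtrig) as update_trigger,
object_name(deltrig) as delete_trigger
from sysobjects
where name = 'tablename'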
If you need to also get the text of the triggers, you can use sp_helptext
<trigger_name> or use the method described in Get the text of a trigger or stored procedure.
Note: There is a seltrig column in sysobjects but I'm not aware of a way to actually define such a
trigger so it's probably not used in ASE.
You can use the following stored procedure to drop all triggers on a table:
open delete_trigger_cursor
fetch delete_trigger_cursor into @trigger_name
while @@sqlstatus = 0
begin
set @selectString = 'drop trigger '+@trigger_name
print @selectString
exec (@selectString)
fetch delete_trigger_cursor into @trigger_name
end
close delete_trigger_cursor
deallocate cursor delete_trigger_cursor
end
go
It basically does the following:
Fetch the names of all triggers on the table into a temp table (we have to go through a
temp table since dropping a trigger modifies the sysobjects table and that would
mess with the cursor).
Loop through the trigger names.
Drop each of them using an execute immediate (somehow you cannot directly pass a
variable to drop trigger...).
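For reference, here's a minimal sketch of what the complete procedure could look like (the
procedure name, its parameter and the temp table name are my own choices; the loop is the one
shown above):
create procedure drop_all_triggers
@table_name varchar(255)
as
begin
declare @trigger_name varchar(255)
declare @selectString varchar(512)
-- fetch the names of all triggers on the table into a temp table
select so2.name as trigger_name
into #triggers
from sysobjects so1, sysobjects so2
where (so2.id = so1.deltrig or so2.id = so1.instrig or so2.id = so1.updtrig)
and so1.name = @table_name
declare delete_trigger_cursor cursor
for select trigger_name from #triggers
for read only
open delete_trigger_cursor
fetch delete_trigger_cursor into @trigger_name
while @@sqlstatus = 0
begin
-- drop each trigger using an execute immediate
set @selectString = 'drop trigger '+@trigger_name
print @selectString
exec (@selectString)
fetch delete_trigger_cursor into @trigger_name
end
close delete_trigger_cursor
deallocate cursor delete_trigger_cursor
end
go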
If you want to disable a trigger, you can of course just drop it. But when you need it again, you'll
have to recreate it. If you just want to disable the trigger because you're copying data to the table
and this data has already been preprocessed and the trigger is just getting in the way, you can
disable it with the alter table command:
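On recent ASE versions (15.7 and later), assuming the table is called mytable:
alter table mytable disable trigger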
This will disable all triggers on the table. If you only want to disable a single trigger, you can use
the following:
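Assuming the trigger is called mytriggername:
alter table mytable disable trigger mytriggername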
If you want to check whether the triggers are enabled or disabled, you'll have to check the
sysobjects entry for this table. The column sysstat2 contains the information you're looking for:
select
object_name(instrig) as InsertTriggerName,
case sysstat2 & 1048576 when 0 then 'enabled' else 'disabled' end as
InsertTriggerStatus,
object_name(updtrig) as UpdateTriggerName,
case sysstat2 & 4194304 when 0 then 'enabled' else 'disabled' end as
UpdateTriggerStatus,
object_name(deltrig) as DeleteTriggerName,
case sysstat2 & 2097152 when 0 then 'enabled' else 'disabled' end as
DeleteTriggerStatus
from sysobjects where name='mytable'
Keep in mind that the three flags used above are not documented and could be changed by SAP
in the future.
If you use identifiers longer than 30 bytes (which is only possible in ASE 15 and above), you can
remove the convert varchar(30).
Sometimes you need a stored procedure to be executed in a given database. The reason could be
that you use statements like drop trigger or create trigger which do not allow you to specify a
database. But if you are currently in the context of another database and need to stay there (e.g.
because you need to update proxy databases and need to be in the master database for this), you
cannot simply switch databases with a use statement.
But what you can do is create a second stored procedure in the other database and call it using
the database name:
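For example (a sketch; the database, procedure and trigger names are placeholders):
-- created once, inside otherdb
create procedure drop_my_trigger
as
drop trigger mytriggername
go
You can then, while staying e.g. in master, call it like this:
exec otherdb..drop_my_trigger
go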
Note that this also works for procedures which are in the sybsystemprocs database (and are thus
callable from any database), i.e. you can call any of these procedures with a database name before
the procedure name and it will be executed in the context of that database, e.g.:
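For example, to run sp_help in the context of a database called mydb while staying in the current
database:
exec mydb..sp_help
go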
Other topics
Check whether a temporary table exists
You can check for the existence of non-temporary tables (even in tempdb) like this:
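A sketch of such a check, assuming a table called mytable:
if object_id('tempdb..mytable') is not null
print 'table exists'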
Unfortunately it doesn't work for temporary tables since temporary tables (e.g. #mytemptable)
are created for the current connection and another connection might create another temporary
table with the same name. So Sybase will create an entry in tempdb..sysobjects with a name
containing some other info (like the spid):
Unfortunately, searching for such names with a like pattern and '%' wildcards will not work
reliably, since you might have a connection with spid 30 and another with spid 306, and the
wildcards will mess everything up.
Also note that ASE 12.5 only considers the first 13 bytes of the table name; ASE 15 supports 238
bytes.
Please also note that the way these names are generated is not documented and might thus be
changed without notice.
So finding out whether another session has created a temporary table with a given name is
possible, but not easy and might be broken with any future version or patch.
But finding out whether such a table has been created in the same session is much easier:
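A sketch of the check for the current session:
if object_id('tempdb..#mytemptable') is not null
print 'temp table exists in this session'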
Note that this also works with global temporary tables (##xxxxx).
Similar to the writetext command, you can use the dbwritetext function of the Open Client DB
Library to update a text or image column.
#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sybfront.h>
#include <sybdb.h>
LOGINREC *login;
DBPROCESS *dbproc;
DBCHAR mytextcol[512];
// The end...
dbexit();
return 0;
}
myservername is not the hostname of the server but the name in the Sybase configuration (you
can see it using dsedit).
TRUE: it means that the operation should be logged. If you set it to FALSE, the operation will be
minimally logged which is suboptimal in case of media recovery (but is faster and doesn't fill the
transaction log).
You need to additionally set the following in your Visual C++ project:
Additional Include Directories: "C:\sybase\OCS-15_0\include"
Additional Library Directories: "C:\sybase\OCS-15_0\lib";"C:\sybase\OCS-15_0\dll"
Additional Dependencies: libsybct.lib libsybdb.lib
The advantages of this approach: it's fast and it can handle large volumes of data (up to 2GB).
If the program crashes on dbbind, check whether your SQL-Select is right (I had a typo in the
selected column name and wasted half an hour wondering why it was crashing).
If you have a column name but do not know which table it belongs to, you can use the
syscolumns and sysobjects system tables in the appropriate database.
syscolumns contains the column metadata; the table name itself comes from sysobjects:
its id column matches syscolumns.id, its name column is the name of the table and, for user
tables, its type column contains 'U'.
So basically you can execute the following to get the table name and column name for all
columns named 'status':
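A sketch of that query:
select o.name as table_name, c.name as column_name
from sysobjects o, syscolumns c
where o.id = c.id
and o.type = 'U'
and c.name = 'status'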
After restarting a Sybase ASE server, the dataserver did not come up again and the following
error message was at the end of the error log:
If you google for this error message, you oddly find many things related to locales. So I tried
setting different values for LC_ALL before starting the data servers. But this didn't help.
The next idea was to compare the current SYBASE.cfg with the previous one. And there was one
difference:
number of user connections was increased from 170 to 300. But this happened 5 months ago...
So it shouldn't be the cause of the problem... Since it was running late, we tried setting it back
anyway and there it was, our server was up and running again !
I first wondered whether someone had set it manually in the config file instead of setting it in isql
using sp_configure. I was sure that Sybase would tell me that it didn't have enough memory if
the value was too high. So I typed the following in isql:
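Presumably something along these lines (using the value from the config file):
1> sp_configure 'number of user connections', 300
2> go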
And to my surprise ASE just informed me that it was using some more memory with this setting
but didn't complain... So it looks like the value was too high to be able to start but it was fine to
set it later on. So now we just need to remember that if we restart ASE, we will probably need to
reduce the number of user connections and then increase it again using isql... Strange world...
Find and delete duplicate records
If you need to delete duplicate records from a table (e.g. in order to copy the data to a table with
a unique index or to add a unique index on this table), you can use one of the following methods.
First it's important to know what kind of duplication we're talking about:
There is a unique key on this table (e.g. an identity column). The duplicates are entries
where the other columns of the rows are identical but the unique key is different.
The duplicates are 100% identical rows.
In both cases, you need to delete all rows but one, with the same values for X columns. The
difference is just that in the first case you can reference each column with a key (i.e. some kind
of row ID) and in the second one you can't.
In the first case, identifying the row to keep is quite easy: you need to get the minimum (or
maximum) value of your key (let's call it row_id from now on) and delete all rows with the same
values for the other columns but a higher (or lower) row_id. Here an example keeping the lowest
row_id:
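Here's a sketch of such a delete, assuming the table is called tablename and the unique key
column is row_id:
delete from tablename
where row_id > (select min(t2.row_id)
from tablename t2
where t2.xxx = tablename.xxx
and t2.yyy = tablename.yyy)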
xxx and yyy being the columns you want to use to identify duplicates.
Note that I assume that we're talking about non-null values. If you also want to treat NULL
values as duplicates of each other, you need to replace the plain equality comparisons with
comparisons which also consider two NULL values as equal (e.g. using isnull on both sides or an
explicit 'is null' check).
If you do not have something like a row_id, it's getting a little bit more difficult since you cannot
directly reference a row. There are basically two methods to clean up the duplicates:
If you have few duplicate entries, you can use ROWCOUNT to make sure that for each set of
duplicate rows you delete all of them except one.
First you need to find out which sets of duplicate entries you have and how many entries you
have in each set:
SELECT xxx, yyy, COUNT(*) FROM tablename GROUP BY xxx, yyy HAVING COUNT(*) >
1
This will for example return:
xxx_value_1 yyy_value_1 5
So you know you have 5 rows with these two values for xxx and yyy. So you need to delete 4:
SET ROWCOUNT 4
DELETE FROM tablename WHERE xxx='xxx_value_1' AND yyy='yyy_value_1'
And you can repeat it for all values returned by the SELECT statement. Of course this is kind of
a lot of manual work and you only want to do it for a few combinations. If you have thousands or
millions of duplicates, this is definitely not a good solution.
In this case, you have to create a new table containing only distinct entries and then...
or rename the original table and select the distinct entries directly into a table with the original
name:
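A sketch of this approach (it assumes the 'select into/bulkcopy' option is enabled on the
database):
sp_rename tablename, tablename_old
go
select distinct * into tablename from tablename_old
go
drop table tablename_old
go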
The advantage of this second method is that you save one step. The disadvantage is that you
don't have any index/constraints/triggers/... on tablename anymore. You can also do it the other
way around, first a SELECT INTO then drop the original table and then sp_rename on the new
table (but you still miss a few things like the indexes).
Of course another possibility is to add an identity column so that you can use the first method of
this post. After removing the duplicates, you can then remove the identity column.
First you need to execute the system stored procedure sp_plan_dbccdb (you can start it without
parameters to consider all databases or provide a database name if you only plan to run the
dbcc checks on one database):
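1> sp_plan_dbccdb mydb
2> go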
Recommended size for dbccdb database is 2023MB (data = 2021MB, log = 2MB).
...
Recommended values for workspace size, cache size and process count are:
dbname        scan ws    text ws    cache     comp mem    process count
mydb          1340M      335M       335M      0K          15
This gives you the basic sizes you need to use when creating and configuring the DBCC database
(dbccdb).
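Based on these sizes, the creation of dbccdb could look like this (the device names are the ones
used in the rest of this section):
use master
go
create database dbccdb on dbcc_data_dev = 2021 log on dbcc_log_dev = 2
go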
You of course need to have these 2 devices (dbcc_data_dev and dbcc_log_dev) already available
and with sufficient space for the database.
You then need to configure a named cache for dbccdb with two pools:
a pool for small IOs (2K on a server with 2K pages...): it needs to be at least 512K since
that's the minimum size for a cache pool
a pool for large IOs (the size of an extent; an extent being 8 pages, this means that on a
server with 2K pages you need to create a 16K pool, on a server with 4K pages a 32K
pool...): its size is the one shown by sp_plan_dbccdb (under cache).
The small-IO pool (4K in this case) automatically takes up all of the configured cache, so we
need to reassign some of it for large IOs:
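Assuming the cache bound to dbccdb is called dbccdb_cache (the name is my own choice) and
the server uses 4K pages, so that the large-IO pool is 32K, the reassignment could look like this:
1> sp_poolconfig dbccdb_cache, '300M', '32K'
2> go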
Then you need to add the scan and text segments in the dbccdb for our mydb database:
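For the scan workspace segment (the segment name simply mirrors the text segment below):
1> sp_addsegment mydb_scanseg, dbccdb, dbcc_data_dev
2> go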
and
1> sp_addsegment mydb_textseg, dbccdb, dbcc_data_dev
2> go
You can then assign these two segments to our database using sp_dbcc_createws, once for the
scan workspace and once for the text workspace (this basically creates the appropriate tables in
dbccdb).
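A sketch of these two calls (the workspace names are my own choices; the sizes are the ones
suggested by sp_plan_dbccdb above):
1> sp_dbcc_createws dbccdb, mydb_scanseg, scan_ws, scan, '1340M'
2> go
1> sp_dbcc_createws dbccdb, mydb_textseg, text_ws, text, '335M'
2> go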
Update: You can check the settings using sp_dbcc_evaluatedb (using the database name, e.g.
mydb, as parameter). You might notice that the suggested cache size returned by this stored
procedure is actually much smaller than the one returned by sp_plan_dbccdb. I never found out
why there is such a difference (in my case sp_plan_dbccdb reported 335M, just the same as for
the text workspace, and sp_dbcc_evaluatedb reported less than 10 MB...).
If you have a procedure returning multiple result sets or printing messages and running for quite
some time, you usually want to get the intermediate result sets and message back on the client as
they are produced. ASE has a buffer where they are stored on their way to the client, so you
always have a delay whose duration depends on how fast the buffer gets filled.
To prevent this behavior you can switch on the FLUSHMESSAGE option before executing the
procedure:
SET FLUSHMESSAGE ON
go
You will see that the messages issued with the print command are returned immediately but there
is still a delay for the result sets. FLUSHMESSAGE only has the result set returned to the client
when a print is executed. This means that you have to execute a print after the selects for which
you want the result back immediately e.g.:
print ''
The only problem with print is that it handles strings and if you want to just return a number, you
have to convert it to a string to print it.
Remove rows affected, return status and dashes from isql output
When you execute a procedure from a shell using isql, the output will contain the following
additional stuff:
Since I'm returning SQL statements I want to store in a file and execute later on, I do not want to
see all of this.
The number of rows affected can be suppressed by switching on the nocount option
The dashes can be suppressed by calling isql with the -b argument
The return status can be suppressed by switching off the proc_return_status option
Here's an example:
# isql -Umyusername -Pmypassword -b <<EOT
set nocount on
go
set proc_return_status off
go
exec mystoredprocedure
go
EOT
In Transact-SQL (as well as in T-SQL on Microsoft SQL Server), you can use both ISNULL
and COALESCE to provide a default value in case you have NULL values:
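For example (a minimal sketch; the two statements below produce the two result sets shown
after them):
1> select isnull(convert(int, null), 123)
2> go
1> select coalesce(convert(int, null), 123)
2> go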
-----------
123
-----------
123
1. COALESCE is ANSI standard. So if you think you might have to port the code to
another DBMS, it's a safer bet.
2. ISNULL means something totally different in other DBMS e.g. MySQL. There it
returns a boolean value meaning whether the expression is NULL or not. So this
might confuse colleagues coming from different DBMS.
3. COALESCE is harder to spell... After you've misspelled it 20 times you might
consider using ISNULL instead!
4. COALESCE can do more than ISNULL. You can provide a list of X expressions
and it will return the first one which is not NULL. You can of course write
something like ISNULL(expression1, ISNULL(expression2, expression3)) but it's
then much more complex to read.
5. From a performance point of view COALESCE is converted to a case statement
which seems to be a little bit slower than ISNULL which is a system function. But
I guess the difference in performance is so small that it shouldn't be a criterion to
go for ISNULL instead of COALESCE.
6. The datatype, scale and precision of the return expression is the one of the first
expression for ISNULL. With COALESCE it's more difficult to figure it out since
it is determined by datatype hierarchy. You can get the hierarchy with the
following query:
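A sketch of such a query against systypes:
select name, hierarchy
from systypes
order by hierarchy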
If you need to switch to a given database, you can use the following statement:
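For example, assuming the database is called mydb:
1> use mydb
2> go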
If you want to figure out what the current database is, there is unfortunately no @@db, @@dbid
or @@dbname global variable. But you can get it via the current process ID:
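A sketch of this check using master..sysprocesses:
1> select db_name(dbid) from master..sysprocesses where spid = @@spid
2> go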
(1 row affected)
Update: As Peter correctly noticed, there are built-in functions you can use for this too:
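The first one is db_id(), which returns the ID of the current database:
1> select db_id()
2> go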
------
1
(1 row affected)
1> select db_name()
2> go
------------------------------
master
(1 row affected)
1> select user_id()
2> go
-----------
1
(1 row affected)
1> select user_name()
2> go
------------------------------
dbo
(1 row affected)
1> select suser_id()
2> go
-----------
1
(1 row affected)
1> select suser_name()
2> go
------------------------------
sa
(1 row affected)
If you have numeric and non-numeric values and want to use the isnumeric function to find out
which values you can convert, pay attention: isnumeric is not always reliable. Here's an example:
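For this example, assume benohead is a table with a varchar column called almostnum which
contains the single value 'e':
1> select isnumeric(almostnum) from benohead
2> go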
-----------
1
(1 row affected)
1> select convert(int, almostnum) from benohead where isnumeric(almostnum) =
1
2> go
Msg 249, Level 16, State 1:
Server 'SYBASE', Line 1:
Syntax error during explicit conversion of VARCHAR value 'e' to a INT field.
So in this case isnumeric returns 1, i.e. true, but you still can't convert the value to an int.
Converting to a float, money or real will also fail.
But converting to a decimal value will work (well, it won't throw an error):
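A sketch of the conversion:
1> select convert(decimal, almostnum) from benohead
2> go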
---------------------
0
(1 row affected)
Of course, it depends whether you actually mean 0 when writing e in the column... It will not
throw an error but might deliver wrong results...
If you need to know which trace flags are active, you can use the dbcc traceflags command.
Before using it, you should switch on the trace flag 3604 to send the output to standard output
(i.e. the console):
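A sketch of the two commands:
1> dbcc traceon(3604)
2> go
1> dbcc traceflags
2> go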
DBCC execution completed. If DBCC printed error messages, contact a user with
System Administrator (SA) role.
If you do not switch on the trace flag 3604, you'll see the following:
Instead of trace flag 3604 you can also use the flag 3605. The output will then be written to the
error log:
You'll notice that it also returns 3604 although it is not active in our session (the output wasn't
written to the console). Actually trace flag 3604 is a global trace flag: switching it off in another
session will also disable it in this session. But switching it on in another session will not have the
output displayed on the console for this session. No clue why...
Please also note that you can switch on the trace flags only for a session using the following:
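On recent ASE versions (15.7 and later) this can be done with the set switch command, e.g.:
1> set switch on 3604
2> go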
This will switch on the trace flag for this session but it will not be visible with dbcc traceflags
(even in this session):
DBCC execution completed. If DBCC printed error messages, contact a user with
System Administrator (SA) role.
If you execute a bunch of SQL statements from a shell script (e.g. to create tables/procedures) and
get the following error message:
Msg 102, Level 15, State 1:
Server 'SYBASE', Procedure 'xxxxxx', Line xx:
Incorrect syntax near 'go'.
If it all works fine when executing it in an SQL client (like SqlDbx or ASEIsql), the problem is
probably related to the end-of-line characters. If you create the SQL file executed by the script on
a Windows machine, it will end the lines with Carriage Return and Line Feed (CR+LF), while on
a Linux/Unix system it should only be a Line Feed (LF). After converting the file to Unix format,
you should not see the error anymore.
You can for example convert the file with the dos2unix command (tools like sed or tr can do the
same job):
# dos2unix inputfile
If you want to know which tables are referencing a specific table using a foreign key, you can
use one of the following system tables:
The first way will only work if you have constraints backing the foreign keys. If you have just
foreign keys without any referential constraints, you will not be able to see anything.
Here's how you can use the syskeys system table to get information about foreign keys. The idea
is to look for foreign keys (type = 2) referencing the report table (depid = object_id('report')):
select object_name(k.id) from syskeys k where k.type = 2 and k.depid = object_id('report')
If you also want to see which columns make up each foreign key, you can list the depkey columns:
select object_name(k.id),
col_name(k.depid, depkey1)
+', '+col_name(k.depid, depkey2)
+', '+col_name(k.depid, depkey3)
+', '+col_name(k.depid, depkey4)
+', '+col_name(k.depid, depkey5)
+', '+col_name(k.depid, depkey6)
+', '+col_name(k.depid, depkey7)
+', '+col_name(k.depid, depkey8)
from syskeys k
where k.type = 2
and k.depid = object_id('report')
In order to make it look nicer, we'll need to trim the strings for trailing spaces, limit the length of
the individual strings to 30 characters and remove the extra commas.
For the extra commas, we can use syskeys.keycnt which will tell us how many depkeyX columns
are filled. And we can use the fact that substring('blabla', X, Y) will return NULL if X is less
than 1. This means we need a function which returns 1 if the column index is less than keycnt
and returns 0 or a negative number otherwise. Luckily such a function does exist: sign. It will
return -1 if the argument is negative, 0 if it's 0 and 1 if it's positive. So we can use sign(keycnt -
columnIndex + 1). If there are 3 columns:
sign(3 - 1 +1) = 1
sign(3 - 2 +1) = 0
sign(3 - 3 +1) = -1
sign(3 - 4 +1) = -1
sign(3 - 5 +1) = -1
sign(3 - 6 +1) = -1
sign(3 - 7 +1) = -1
sign(3 - 8 +1) = -1
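Putting this together, a sketch of the statement could look like the following (still using the
report table from above; only depkey1 to depkey3 are written out, the pattern simply continues
up to depkey8):
select convert(varchar(30), object_name(k.id)) as referencing_table,
rtrim(convert(varchar(30), col_name(k.depid, k.depkey1)))
+ coalesce(substring(', ' + rtrim(convert(varchar(30), col_name(k.depid, k.depkey2))), sign(k.keycnt - 2 + 1), 255), '')
+ coalesce(substring(', ' + rtrim(convert(varchar(30), col_name(k.depid, k.depkey3))), sign(k.keycnt - 3 + 1), 255), '')
as key_columns
from syskeys k
where k.type = 2
and k.depid = object_id('report')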
And if we also want to have the column names of the referenced table:
Say you want to provide some filter possibilities in your application, showing the data stored for
the previous, the current or the next month. You basically need to figure out the first and last day
of the corresponding month.
In case you cannot or do not want to do it in the application code itself, you can use simple SQL
statements to get this info.
Today is January 30th, so the first day of the month is January 1st. Unfortunately, you cannot
directly set the day to 1. But you can extract the day from the date (in our case 30) and, since
1 = 30 - (30 - 1), subtract the extracted day minus one from the date:
So basically we have the right date but for comparison purposes, we'd rather have midnight than
1:49pm. In order to do it, you need to convert it to a date string and back to a datetime:
If you're only interested in a string containing the date, just drop the outer convert:
Use another format than 101 if needed. The complete list of date conversion formats can be
found in the ASE documentation of the convert function. For example, for the German date
format, use 104 (dd.mm.yyyy).
Now let's get the last day of the current month. This is basically the day before the first day of
next month.
So first let's get the first day of next month. This is actually just 1 month later than the first day
of the current month:
Now let's just subtract one day and we'll get the last day of the current month:
Since we already have the first day of next month, let's get the last day of next month. This is
basically the same again but instead of adding 1 month, you add 2 months:
Now let's tackle the previous month. The first day of last month is basically the first day of the
current month minus 1 month:
And then the last day of previous month. It is one day before the first day of the current month:
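Here's a sketch of all of these computations in one place, using getdate() as the reference date:
-- first day of the current month (at the current time of day)
select dateadd(day, 1 - datepart(day, getdate()), getdate())
-- the same, but at midnight (convert to a date string and back to a datetime)
select convert(datetime, convert(varchar(10), dateadd(day, 1 - datepart(day, getdate()), getdate()), 101))
-- first day of next month: add one month to the first day of the current month
select dateadd(month, 1, convert(datetime, convert(varchar(10), dateadd(day, 1 - datepart(day, getdate()), getdate()), 101)))
-- last day of the current month: one day before the first day of next month
select dateadd(day, -1, dateadd(month, 1, convert(datetime, convert(varchar(10), dateadd(day, 1 - datepart(day, getdate()), getdate()), 101))))
-- last day of next month: add two months instead of one, then subtract one day
select dateadd(day, -1, dateadd(month, 2, convert(datetime, convert(varchar(10), dateadd(day, 1 - datepart(day, getdate()), getdate()), 101))))
-- first day of the previous month: subtract one month from the first day of the current month
select dateadd(month, -1, convert(datetime, convert(varchar(10), dateadd(day, 1 - datepart(day, getdate()), getdate()), 101)))
-- last day of the previous month: one day before the first day of the current month
select dateadd(day, -1, convert(datetime, convert(varchar(10), dateadd(day, 1 - datepart(day, getdate()), getdate()), 101)))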
I needed to write a very short C# program to access a Sybase ASE database and extract some
information.
First, I had to download the appropriate version of ASE. It contains a directory called
\archives\odbc with a setup.exe. Just run it.
There is no need to add a data source to access ASE from an ODBC connection using C#. I just
went there to check whether the driver was properly installed.
Then just create a program connecting to ASE using the following connection string:
Driver={Adaptive Server
Enterprise};server=THE_HOSTNAME;port=2055;db=THE_DB_NAME;uid=sa;pwd=THE_SA_PA
SSWORD;
If you omit the db=... part, you'll just land in the master database.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data.Odbc;
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            String errorMsg;
            OdbcConnection con = Connect("sa", "sa.pwd", "2055",
                "192.168.190.200", "mydb", out errorMsg);
            Console.WriteLine(errorMsg);
            if (con != null)
            {
                Console.WriteLine("In database {0}", con.Database);
                OdbcCommand command = con.CreateCommand();
                command.CommandText =
                    "SELECT name FROM sysobjects WHERE type='U' ORDER BY name";
                OdbcDataReader reader = command.ExecuteReader();
                int fCount = reader.FieldCount;
                // print the column headers
                for (int i = 0; i < fCount; i++)
                {
                    String fName = reader.GetName(i);
                    Console.Write(fName + ":");
                }
                Console.WriteLine();
                // print every row, one column value after the other
                while (reader.Read())
                {
                    for (int i = 0; i < fCount; i++)
                    {
                        String col = reader.GetValue(i).ToString();
                        Console.Write(col + ":");
                    }
                    Console.WriteLine();
                }
                reader.Close();
                command.Dispose();
                Close(con);
                Console.WriteLine("Press any key to continue...");
                Console.ReadLine();
            }
        }

        // Minimal sketch of the Connect helper: it builds the connection string
        // described earlier and opens the connection.
        static OdbcConnection Connect(String user, String password, String port,
            String host, String db, out String errorMsg)
        {
            OdbcConnection con = null;
            try
            {
                String connectionString = "Driver={Adaptive Server Enterprise};server="
                    + host + ";port=" + port + ";db=" + db + ";uid=" + user
                    + ";pwd=" + password + ";";
                con = new OdbcConnection(connectionString);
                con.Open();
                errorMsg = "Connected to " + host;
            }
            catch (Exception e)
            {
                errorMsg = e.Message;
                con = null;
            }
            return con;
        }

        // Minimal sketch of the Close helper: it just closes the connection.
        static void Close(OdbcConnection con)
        {
            con.Close();
        }
    }
}
That was really easy! Not that it would have been more difficult with Java or PHP, but I'd have
expected to waste hours making mistakes and trying to debug it...
In order to find which connections have been open for a long time, you can use the following
SELECT statement:
SELECT
spid,
sl.name as 'login',
sd.name as 'database',
loggedindatetime,
hostname,
program_name,
ipaddr,
srl.remoteusername as 'remotelogin',
ss.srvname as 'remoteservername',
ss.srvnetname as 'remoteservernetname'
FROM
master..sysprocesses sp,
master..syslogins sl,
master..sysdatabases sd,
master..sysremotelogins srl,
master..sysservers ss
WHERE
sp.suid>0
AND datediff(day,loggedindatetime,getdate())>=1
AND sl.suid = sp.suid
AND sd.dbid = sp.dbid
AND sl.suid *= srl.suid
AND srl.remoteserverid *= ss.srvid
ORDER BY
loggedindatetime
Here are a few explanations:
Let's say you get a number of seconds and want to convert it to a string so that it's human
readable, e.g. saying that something took 15157 seconds is probably not a good idea; it'd
make more sense to say it took 4 hours, 12 minutes and 37 seconds.
Note that you may see a line wrap here, but actually everything after begin and before end
should be on one line.
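The conversion itself could look like the following sketch (the return status in the output below
comes from the original being wrapped in a stored procedure):
declare @seconds int
select @seconds = 15157
select right('0' + convert(varchar(2), @seconds / 3600), 2)
+ ':' + right('0' + convert(varchar(2), (@seconds / 60) % 60), 2)
+ ':' + right('0' + convert(varchar(2), @seconds % 60), 2)
go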
--------
04:12:37
(1 row affected)
(return status = 0)
Sybase ASE supports both the old syntax and the newer SQL-92 syntax for left outer joins:
Old syntax:
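SELECT * FROM table1, table2 WHERE table1.key*=table2.fkey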
New syntax:
SELECT * FROM table1 LEFT JOIN table2 ON table1.key=table2.fkey
As long as you do not have other criteria, the results will be the same. But you might experience
some differing results as soon as you add other criteria, e.g. the two following statements
seem to do the same but deliver different results:
1> select top 10 p.p_key, e.e_uid, e.field1 from table_p p, table_e e where
p.p_key*=e.p_key and e.field1='V1'
2> go
p_key e_uid field1
----------- ---------------- ----------------
2 2005092612595815 V1
2 2005030715593204 V1
2 2005092614251692 V1
4 NULL NULL
8 NULL NULL
9 NULL NULL
10 NULL NULL
11 NULL NULL
14 NULL NULL
15 NULL NULL
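The second statement (a sketch of it, with the criterion moved to the where clause) only returns
the rows where field1 is 'V1', since the where clause filters out the rows with NULL values:
1> select top 10 p.p_key, e.e_uid, e.field1 from table_p p left join table_e
e on p.p_key=e.p_key where e.field1='V1'
2> go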
The reason is that the database engine, when executing the first statement, does not only consider
p.p_key=e.p_key as a join criterion but also e.field1='V1'. So basically the first statement is
equivalent to the following SQL-92 statement:
1> select top 10 p.p_key, e.e_uid, e.field1 from table_p p left join table_e
e on p.p_key=e.p_key and e.field1='V1'
2> go
p_key e_uid field1
----------- ---------------- ----------------
2 2005092612595815 V1
2 2005030715593204 V1
2 2005092614251692 V1
4 NULL NULL
8 NULL NULL
9 NULL NULL
10 NULL NULL
11 NULL NULL
14 NULL NULL
15 NULL NULL
Note that the second criterion is not in the where clause but in the on part.
So, the old left outer join syntax is more compact. But it is ambiguous, as it doesn't properly
separate the join criteria from the where criteria. In case of a left outer join this makes a huge
difference, since the join criteria do not filter the returned rows but the where criteria do.
In most cases, the results you were after are the ones returned by the first and last queries above.
But you should avoid the old left outer join syntax and try to use the SQL-92 syntax everywhere.
It makes it clearer what you mean with the statement and can save some time searching why you
did not get the output you were expecting. But also with the SQL-92 syntax you should carefully
think whether you want to add a criterion to the join criteria or to the where clause (and as stated
above in most cases when using a left outer join, the criteria on the joined tables should probably
go in the join criteria).
In order to get a list of all tables in the current database, you can filter the sysobjects table by
type = ‘U’ e.g.:
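select name from sysobjects where type = 'U'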
In order to get the number of rows of each table, you can use the row_count function. It takes
two arguments:
the database ID – you can get the ID of the current database using the db_id function
the object ID – it’s the id column in sysobjects
e.g.:
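select name, row_count(db_id(), id) as row_count
from sysobjects
where type = 'U'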
And in order to get some size information you can use the data_pages function. It returns the
number of pages, which you can then multiply by the number of kilobytes per page, e.g.:
select convert(varchar(30),o.name) AS table_name,
row_count(db_id(), o.id) AS row_count,
data_pages(db_id(), o.id, 0) AS pages,
data_pages(db_id(), o.id, 0) * (@@maxpagesize/1024) AS kbs
from sysobjects o
where type = 'U'
order by table_name
The columns returned by this statement contain the table name (if you have names longer
than 30 characters, you should replace 30 by something higher), the number of rows, the number
of data pages and the size in kilobytes.
If you have an ASE version older than 15, the statement above will not work but you can use the
statement below instead:
select sysobjects.name,
Pages = sum(data_pgs(sysindexes.id, ioampg)),
Kbs = sum(data_pgs(sysindexes.id, ioampg)) * (@@maxpagesize/1024)
from sysindexes, sysobjects
where sysindexes.id = sysobjects.id
and sysindexes.id > 100
and (indid > 1)
group by sysobjects.name
order by sysobjects.name
This will return the table name, number of pages and size in kilobytes.