Mining the Metadata
by Jeff Garbus
I’m going to talk
about 3 useful types of information you can get out of SQL
Server
metadata. First, there’s the type of
information which makes your life easier, and which
you can use to build dynamic scripts.
Second, I’ll discuss information which you can get
using system stored procedures, but which
my not come out the way you want. Third, I’ll
talk about fun things you can learn from
the system procedures yourself.
There comes a time in every DBA’s life
when life would be simpler if he/she could just
write a script to… what? For many of us,
something immediately comes to mind. At my
current client site, one of the things we
do on a frequent basis is to load a copy of data
from one server onto another. If you care
only about data, this is not a big deal. If you
care about users and permissions, all of a
sudden things get sticky.
Why?
In our case, the logins on each server do
not match (intentionally, and never will—we
have more logins in test than in
production, and therefore have alignment problems with
the SUIDs). What is the problem with that?
Each login has a distinct server user ID
(suid), from syslogins table in the master
database, which maps to the distinct database
user ID (uid) within the database (this
happens in the sysusers table, or the sysalternates
table if you’ve aliased somebody into the
database). If a user, jgarbus for instance, has a
suid of 27 on the source server, and a
suid of 45 on the target server, then when the new
database is loaded, the uid which had
pointed to the suid of 27 on the source server is
still
pointing to the suid of 27 on the target
server; which as we know, is not the correct login.
This means that whomever has the suid of
27 on the target server now has the same
access in that database as suid 27 had on
the other server.
Whether we like that or not, it was not
our intention. How can the problem be solved?
We can grab this information directly from
the system tables. Tools like DBArtisan may
make extracting code easy, but what does
it extract when the
wrong
information is in the
system tables? I haven’t checked, largely
because the tool makes the following so easy. I
write code like this:
________________________________________________________________
Select
‘exec sp_dropuser ‘ + name
from sysusers where uid between 3 and
16382
Select ‘exec sp_adduser ‘ + name + ‘, ‘ +
name + ‘, ‘ + user_name(gid)
from sysusers where uid between 3 and
16382
________________________________________________________________
Writing this type of code requires that
you understand the way the information resides in
the system tables, as well as the way
security works on your system. I know that name in
the sysusers table corresponds with the
correct login ID because of our site definitions. I
know that uid of 1 is the dbo, and uid of
2 is guest, so I start with 3. I know that group
names start with 16383, so I end just
prior to that.
Note: that if you have individual
permissions, as opposed to group permissions, you’ll
have to get those separately. Look in
sysprotects for the permissions, and spt_values to
decode grant / revoke (excellent exercise
for the student!)
Finally, I realize that if I don’t create
both the add- and drop scripts prior to executing
them, that I’ll lose vital information.
If you’re using a tool like DBArtisan, you
can click on the top of the column generated in
the results tab, and then copy & paste
from there into the query window (or a new one).
Lessons here: Know your own site, identify
requirements before executing anything, and
know what’s in the system tables.
Another good time to use this is when you
want to run something against all database
objects. Here’s one I use with some
frequency:
__________________________________________________________
Select “update statistics “
+ name from sysobjects where type = ‘U’
_________________________________________________________________
This gives you the ability to generate all
the update statistics statements for your entire
database.
For our next set, your predecessor has not
kept any records, and you need to find out
what the next available device number is.
Alternatively, you might want to know which
devices have space available. One optoin
is to run sp_helpdevice, and read through the
voluminous information that comes out.
Alternatively, you can figure out which devices
have space available by joining the
sysdevices table (master only) with the sysusages
table (also master only). The following
code will list the devices in order by device
number, along with the amount of space on
the device, and the amount already allocated.
____________________________________________________________________
Select "device
number"=low/power(16,6), --
High order byte of the low column
name,
"total size (meg)"=(high-low+1)/512,
-- Converting from pages to Meg
"space allocated (meg)"= sum(size)/512
-- Sum of allocated fragments
from sysdevices, sysusages
-- Run from master database
where cntrltype = 0
-- Use only database devices
and vstart between low and high
-- Join criteria
group by
low/power(16,6),name,(high-low+1)/512
order by 1
-- Sort by virtual device number
___________________________________________________________________
Finally, you can pull useful information
directly out of some of the procedures. Look in
the sp_spaceused procedure some time. You
can extract usage of some system functions
that have only recently been documented.
One of the things I do for capacity planning
purposes is to store information on how
many rows and pages a table is taking up over
time. This enables me to predict how big
the tables are getting, as well as to try to
estimate the level of fragmentation (as
the proportion of pages to rows changes). The
SQL looks like this:
_____________________________________________________________
select db_name(),
object_name(id),id,name,indid,getdate(),
data_pgs(id,doampg),data_pgs(id,ioampg),rowcnt(doampg),distribution
from sysindexes
______________________________________________________________________
The data_pgs() function tells you how many
data pages for an object, rowcnt() tells you
how many rows.. this is not as accurate as
“select count(*) from…” (which is exact) but it
is amazingly faster, and is very close,
closer depending on how recently you’ve run dbcc.
If you store this information in a table,
(I run a cron job which does this automatically
every Monday morning) you can begin to
understand growth over time.
Summary: It is worth your while to
investigate the contents of the system tables. In this
article, I’ve listed some of the useful
tables to look at. Don’t limit yourself to this,
though; any information you’ve stored in
the database with create statements, you can
extract with standard SQL.