On the subject of I/O
One of the things that constantly surprises me when talking with clients about hardware for a new database server is that I/O is always at the bottom of the list. Typically the list looks something like this (in order of perceived importance):
- CPUs, have we enough. Fast as possible.
- Memory, as much as we can put in the box. Oracle don't charge us for that
- SAN, big as possible.
At this stage the purchase order is usually given the nod and the hardware supplier ships yet another run-of-the-mill box. Don't get me wrong: many experienced DBAs have been through this process many times before and realise that not only is the list in the wrong order but it's missing some critical components:
- HBAs, need to specify these in proportion to the CPUs and attached storage
- NICs, might need a lot of these, i.e. public, cluster interconnect, storage, management, backup. And typically in multiples for resilience or performance.
- Backup, are we using the existing backup infrastructure?
I don't blame anyone for this way of thinking; it's the way it's always been. When discussing a new server the first question that people tend to ask is "So what's this monster packing? 16 CPUs!!!" followed by lots of very macho grunts and hollering. The standard licensing model (not just Oracle's) doesn't help. It starts with the premise that the CPU count describes the power of a server. To a large degree it does, but it misses the point of what a database is all about, and that's information. Typically that information is held as ones and zeros on a bunch of spinning scrap metal. The real power of a database comes from its ability to aggregate, analyze and process those ones and zeros, turn them into information and push the results out to interested parties. Paraphrasing a little: "It's all about the I/O, stupid".
With this in mind I'm constantly surprised by the imbalance of I/O, both disk and network, put into servers. It's not unusual to see a 4 CPU server running the latest generation Intel or AMD CPUs but with a single HBA and a dual ported NIC. Whilst memory is cheap, many of these servers still run 32 bit kernels. This typically means only a small proportion of the database is cached in memory, be it in the SGA or the file cache (don't get me started on file cache). I'd make a rash guess that whilst the amount of memory in a typical database server has increased, the average size of the SGA hasn't increased in line with this trend. To make matters worse, the typical database has got significantly bigger. This leads us to the conclusion that less of the database is cached and, as a result, a bigger proportion of it has to be read from disk. As I said this is just a guess, but it's backed up by real customer engagements. It would be interesting to analyse the last 10 years of production databases to see whether the "db file scattered read" and "db file sequential read" wait events have decreased or increased as a proportion of total wait time.
What I'm driving at is the need to move I/O way up the agenda when sizing a server for databases. The number of CPUs needs to be married to the number of I/O channels available. It makes no sense to buy database licenses for a machine that will simply sit and wait on I/O; it's simply wasting money. Equally it makes no sense to stuff a 4 CPU machine full of HBAs for a database application that will perform index lookups on an index that fits comfortably in the cache. Adding HBAs later to an existing server isn't necessarily a simple option either, especially for a mission critical application or one that has hard coded paths to disk.
The next obvious question is "Well, that's all well and good, but how do I size the ratio of HBAs to CPUs?" and in a typically vague fashion I reply "Well, that depends". The type of application and the type of processor should heavily influence the decision. Certainly the CPU has been winning the race in terms of performance over the last few years, and it needs a lot more I/O to keep it busy. But the equation also needs to be balanced against the amount of memory available on the box. A large SGA will certainly reduce the need to visit disk. The best advice I can give is to speak to your hardware supplier and find out what the current state of play is. Also check what the latest TPC-C and TPC-H figures show. Whilst these generally edge towards the extremes of performance, they do show what a hardware supplier believed was needed to show their hardware in the best light.
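As a rough illustration of the kind of back-of-the-envelope sum involved, here is a minimal shell sketch. The function name hbas_needed and every figure passed to it are hypothetical placeholders, not measurements or vendor numbers; substitute values you have measured for your own workload and hardware.

```shell
# hbas_needed CORES MB_PER_SEC_PER_CORE HBA_MB_PER_SEC
# Estimates how many HBAs are needed so the CPUs are not starved of I/O.
# All inputs are assumptions supplied by the caller.
hbas_needed() {
    cores=$1          # number of CPU cores in the server
    per_core=$2       # MB/s each core can consume during scans (assumed)
    hba=$3            # sustained MB/s a single HBA can deliver (assumed)
    demand=$((cores * per_core))
    # Round up: a fraction of an HBA still means buying a whole one
    echo $(( (demand + hba - 1) / hba ))
}

# Hypothetical example: 16 cores at 100 MB/s each, HBAs sustaining 400 MB/s
hbas_needed 16 100 400
```

This ignores caching in the SGA, multipathing for resilience and the limits of the storage behind the HBAs, so treat the result as a starting point for the conversation with your supplier rather than an answer.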
SQL Developer goes production.
SQL Developer has gone production. Congratulations to the entire development team. I've been using it every day now; it feels solid, performs well, and I continue to find new tricks and features each day. If you've not given it a go, try it...
SQLDeveloper
If you haven't tried this tool out I strongly recommend you pop over
here. It's a massive step forward for database developers and DBAs, who really have felt a little neglected by Oracle over the last few years. I've had countless complaints that SQL*Plus, vi and notepad are still the development tools of choice for many, and that this really isn't good enough. Well, I've known about the tool for a while now and have had to keep quiet, but I'm glad the cat's out of the bag and that it's got such positive reviews... especially because of its price... free.
It features much of the functionality you'd expect in a top end development tool, plus features that many of its competitors charge top dollar for. The best piece of news is that it's an extensible framework, and plugins have started to pop up all over the place... One of my particular favourites is at
fourthelephant. Perhaps a little over the top for a text man like myself but I appreciate the work that must have gone into it.
They've inspired me to think about putting one together myself... The API is pretty simple, so it shouldn't be too taxing... Any ideas? Drop me a line and I'll see what I can do.
rlwrap : Command line editing in sqlplus
I imagine a lot of people in the Linux community already use this. However, if you're new to Linux and struggling with recalling text on the command line from within sqlplus, rlwrap might be the tool for you... It's just a wrapper you put around a command line program, and it provides command line editing and history. I've been using it for the last few months and couldn't live without it.
You can find the tool
here
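A minimal sketch of how you might wire it up in your shell profile. The function name sqlplus_wrapped is my own placeholder, and it assumes both rlwrap and sqlplus are on your PATH; many people simply use an alias such as alias sqlplus='rlwrap sqlplus' instead.

```shell
# Run sqlplus under rlwrap when rlwrap is installed,
# otherwise fall back to plain sqlplus.
sqlplus_wrapped() {
    if command -v rlwrap >/dev/null 2>&1; then
        rlwrap sqlplus "$@"
    else
        sqlplus "$@"
    fi
}
```

You would then start a session with something like sqlplus_wrapped user/password and use the arrow keys to recall and edit previous commands.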
Server side failover
I've had a number of emails recently with requests for help with the new server side failover functionality in 10g release 2. This functionality is described in the Oracle10g release 2 documentation but I've been told it's not really obvious.
Let's start by explaining what it is. Server side failover allows the sysadmin/DBA to configure the profile of connection availability on the server using a service. Users are effectively unaware of what will happen in the event of a node in the cluster failing. Previously (in 9i and 10g) users needed an entry in their local tnsnames file to describe which nodes they could fail over to and which nodes were used to load balance connections. Unless you used a remote naming service to maintain the connection information, every time you added or removed nodes from a cluster it meant an update to potentially hundreds or thousands of tnsnames files.
This was simplified with easy connect in 10g release 1, which allowed users to connect to a service defined on the server. For the first time users only needed to connect to a nominated "listener node" and know the name of the service. For example, imagine we have a nominated server inside our organisation called "oracleservice". This, of course, is the name used for our virtual IP that will float between our cluster of listeners. In 10g release 1 we could create a service called "orderentry" using either dbca or srvctl, and our users could then connect to it using a connect string of the form
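For comparison, a client side entry of the kind described above might look something like the following tnsnames.ora fragment. This is only a sketch: the alias and the host names node1-vip and node2-vip are hypothetical placeholders, and the retry and delay values simply mirror the ones used in the server side example later in this post.

```
ORDERENTRY =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = yes)
      (FAILOVER = on)
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = orderentry)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))
    )
  )
```

Every client needs its own copy of an entry like this, and each ADDRESS line has to be kept in step with the nodes in the cluster, which is exactly the maintenance burden that server side failover removes.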
sqlplus soe/soe@//oracleservice/orderentry
This greatly simplifies Oracle network maintenance. In some cases it could mean the removal of tnsnames files from the client or application server. It has other advantages for the DBA as well. If some business event occurs that requires the provisioning of a new application or a new resource profile for a short period of time the DBA can provision it in seconds and trivially remove it when it is no longer required.
Sadly, in Oracle10g release 1 this functionality didn't support Transparent Application Failover (TAF). This meant that DBAs still needed to maintain tnsnames files containing a description of which nodes a service could fail over to. The good news is that in Oracle10g release 2 this all changed. DBAs can set up a service specifying TAF, and the Oracle OCI layer will use this definition provided by the server to describe the load balancing and failover profile.
Implementing this functionality is pretty trivial but there is a step that might catch you out. So let's go through it step by step.
To set up the service you can use either Oracle DBCA, the DBMS_SERVICE package, Enterprise Manager or srvctl. The choice is entirely dependent on what you have running. DBCA or Enterprise Manager provide the simplest mechanism, but you will still have to run the final step using the DBMS_SERVICE package to tell the database about its failover profile.
I'll use the DBMS_SERVICE package and srvctl for the sake of brevity. In the following example I have a database called db10g2 with two instances, db10g21 and db10g22. I'm going to create a service called "orderentry" that will provide transparent application failover between the two instances.
The first step is to create the service using srvctl
srvctl add service -d db10g2 -s orderentry -r "db10g21,db10g22" -a "db10g21,db10g22" -P BASIC
and check on its status
$ > srvctl status service -d db10g2 -s "orderentry"
Service orderentry is not running.
So we'll have to start the service first
$ > srvctl start service -d db10g2 -s "orderentry"
If we now connect with sqlplus as system or sys we can see the service.
SYSTEM@db10g21 > select SERVICE_ID, NAME, NETWORK_NAME, failover_method from dba_services;
 id Name               Network Name              Failover
--- ------------------ ------------------------- ------------
  1 SYS$BACKGROUND
  2 SYS$USERS
  3 orderentry         orderentry
The thing to note is that the service hasn't got a failover profile associated with it. So we'll have to modify it using the DBMS_SERVICE package
SYS@db10g21 > get t1.sql
begin
  -- Attach a TAF profile to the existing service
  DBMS_SERVICE.MODIFY_SERVICE(
    service_name     => 'orderentry',
    failover_method  => DBMS_SERVICE.FAILOVER_METHOD_BASIC,
    failover_type    => DBMS_SERVICE.FAILOVER_TYPE_SELECT,
    failover_retries => 180,  -- number of reconnect attempts
    failover_delay   => 5);   -- seconds to wait between attempts
end;
SYS@db10g21 > /
If we now select the service information again, the failover method is set:
 id Name               Network Name              Failover
--- ------------------ ------------------------- ------------
  1 SYS$BACKGROUND
  2 SYS$USERS
  3 orderentry         orderentry                BASIC
We can now test the service using sqlplus.
sqlplus soe/soe@//node1/orderentry
SQL*Plus: Release 10.2.0.1.0 - Production on Wed Jan 18 14:11:47 2006
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
SOE@//node1/orderentry >
So all we need to do now is to fire up swingbench and use the service we've created.
[oracle@node1 bin]$ ./charbench -cs //node1/orderentry -dt oci -uc 30 -a
Author : Dominic Giles
Version : 2.2
Results will be written to results.xml.
Users : 30 TPM : 272 Nested TPM : 0
If we log onto the database we can see that the connections have been balanced across the two nodes
SYS@db10g21 > select instance_name, count(1) usercount, nvl(username,'INTERNAL') user_name,
                     failover_type, failover_method
              from gv$session s, gv$instance i
              where s.inst_id = i.inst_id
              group by instance_name, username, failover_type, failover_method
              order by username, instance_name;
Instance   No. of Users Username   Fail Over Type     Fail Over Method
---------- ------------ ---------- ------------------ ------------------
db10g21              15 SOE        SELECT             BASIC
db10g22              15 SOE        SELECT             BASIC
db10g21               6 SYS        NONE               NONE
db10g22               6 SYS        NONE               NONE
db10g21              23 INTERNAL   NONE               NONE
db10g22              25 INTERNAL   NONE               NONE
So let's shut down one of the instances
SQL*Plus: Release 10.2.0.1.0 - Production on Wed Jan 18 15:28:22 2006
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options
SYS@db10g22 > shutdown abort;
ORACLE instance shut down.
SYS@db10g22 >
And re-query the session profile. All 30 SOE sessions are now on the surviving instance:
Instance   No. of Users Username   Fail Over Type     Fail Over Method
---------- ------------ ---------- ------------------ ------------------
db10g21              30 SOE        SELECT             BASIC
                      6 SYS        NONE               NONE
                     25 INTERNAL   NONE               NONE
There's a lot more that's possible using the service approach to database connections, but I'll discuss that in another blog.