Changing java libraries and increasing the heap size used by swingbench
15/01/12 11:57 Filed in: Swingbench
I’ve
been asked a couple of times recently about how to
change the infrastructure libraries i.e. the Oracle
jdbc driver. I try to always ship the latest
jdbc/ons/ucp libraries but its possible that Oracle may
realease a patch set whilst I’m working on
something else. Its also possible people may want to
investigate a bug or performance problem by trying out
a selection of drivers. To change the jar files all
that is needed is to move the old version out of the
lib directory and copy the new one in, for example
Don’t try and keep multiple versions of the same driver in the lib directory. It can lead to unexpected results. When swingbench starts it looks in the lib directory for all the libraries (contained in the jar files) to use.
If you need to increase the amount of memory used by swingbench you need to edit a file called launcher.xml in the launcher directory. You might need to do this if you are planning to run with many thousands of threads. You need to changed the default value (1024m) to a larger one. For example
to something like
$>
cd swingbench
$> mv
lib/ojdbc6.jar ~/backup/ojdbc6.jar
$> cp ~/newjdbc/ojcbc6.jar
lib Don’t try and keep multiple versions of the same driver in the lib directory. It can lead to unexpected results. When swingbench starts it looks in the lib directory for all the libraries (contained in the jar files) to use.
If you need to increase the amount of memory used by swingbench you need to edit a file called launcher.xml in the launcher directory. You might need to do this if you are planning to run with many thousands of threads. You need to changed the default value (1024m) to a larger one. For example
<jvmargset id="base.jvm.args">
<jvmarg
line="-Xmx1024m"/>
</jvmargset>
to something like
<jvmargset id="base.jvm.args">
<jvmarg
line="-Xmx4096m"/>
</jvmargset>
|
New build of swingbench 2.4
08/12/11 19:56 Filed in: Swingbench
I’ve
just uploaded a new build of swingbench with the
following fixes
You can find it here. Let me know if you find any problems.
- Wizard now writes output to debug.log correctly with the -debugFile option
- Fixed bug where partitioning was always installed regardless. This meant the wizards couldn't install against a standard edition database
- Fixed a bug where swingbench wouldn't start unless all users were able to log on. Swingbench will now start when it has logged on as many users as it can.
- Fixed a bug where users were incorrectly counted as being logged on
- Fixed a bug where logging wasn't correctly written to a debug file
- Changed -debugFile to -debugf to provide consistency with other tools
- Changed to way users logged on is reported in verbose mode of charbench
- Updated the secure shell libraries to support Solaris 11
You can find it here. Let me know if you find any problems.
2.4 makes it to stable status
08/11/11 16:02 Filed in: Swingbench
At
last I feel that I’ve fixed enough bugs and had
enough feed back to change 2.4 to stable status. This
means that this should be the default client most
groups use. Based on the feedback I get I’ll
remove 2.3 from the downloads page.
There has been a few changes in this release. These include
The new Overview chart is show below

As per usual you can download it here
There has been a few changes in this release. These include
- Improvements to the Overviewchart to provide dynamic YXAxis and floating legends
- The overview chart now uses a 5 second rolling window for response time metrics as opposed to a meaningless average
- “Minibench”
has been made a little bigger

- Numerous fixes to the wizards
- Numerous fixes to Universal Connection Pooling
- New command line options
The new Overview chart is show below

As per usual you can download it here
New Build of swingbench 2.4
06/05/11 18:43 Filed in: Swingbench
It’s
been 2 or 3 months since the last drop of swingbench
and so it will come as no surprise that I’ve
released another build. This release features
- Fix to enable the jdbc version of the SOE benchmark to work without errors. I’ll be updating it over the coming week to try and make it as close as possible to the PL/SQL version
- A
new parameter (and UI change) to support the
disconnect/connection of sessions after a fixed
number of transactions

- Various fixes
New build of swingbench 2.4
15/02/11 10:22 Filed in: Swingbench
I’ve
just uploaded a new build of swingbench 2.4 it has a
number of bug fixes which sort out some of the
following issues
You can download it here
Let me know if you have any problems via the comment page
- Unexpected termination due to driver issues
- Unclear timeline text on the overview chart
- Numerous UI and java issues
You can download it here
Let me know if you have any problems via the comment page
Notes on pre-parsing data for Oracle data loads
14/02/11 18:53 Filed in: Oracle
Sometimes
data simply isn't in a form that is easy to load into
an Oracle database i.e. column form. It would be great
if everybody exchanged data in a simple CSV form with a
single file to table mapping. Sadly that isn't the case
and sometimes you have to do a little work to get it
into a form thats useable. A recent benchmark
highlighted this issue very well. The customer provided
the data in compressed CSV form (so far so good) but
the data was held in key value pairs (not so good).
They also provided us with a mapping file that
describes how it all fits
together.
Now typically the approach many people would take would be to develop some form of program that parses all of the data and writes it to staging area and then loads all of it in one go to the target database. I make no criticism of this approach since it works well and as long as its not time critical. It's by far the simplest method. However Im a big fan of taking advantage of whats already available and one of the most underused and powerful features of the Oracle database is the preparser. It enables you to pipeline various operations so they all run as quickly as possible. So going back to my benchmark we used this approach to load data into out target database. It consisted of 4 steps
The java program simple reads from stdin and writes to stdout. To handle key value pairs just required the program to read the mapping file in and split and parse the values from stdin. The data was then written to stdout in a well know order.
Java extract from my program...
All that was needed for sqlloader to process the files was a control file that understood the order of the columns and any additional formatting.
One of the additional benefits is that we can load the data via "direct path" and implement other features such as multi table insert. The Java preparser enables you to add all of the additional formatting to make this a trivial process.
The following diagram illustrates the process.

This equates into a Unix/Linux statement such as
NOTE : one thing you may have noticed is that Im using dd to do 1MB I/Os. This just an efficiency operation and works well on structures such as DBFS, you could skip this part of the operation if needed.
Which brings us onto external tables and the preparser
External Tables and pre-parsers
As I mentioned earlier I like to take advantage of functionality that's already available and one of those features in the Oracle database is external tables. I don't intend to go into much detail as to why you should use external tables other than they do much of the heavy lifting for you and they provide a seamless interface between the filesystem and the database. They effectively make files look like tables.
This means it's trivial to implement parallelism for our pre-parser. We don't need to worry about how to handle the files and how to schedule everything, external tables take care of all of that for you. In our benchmark we used them in the following way

Our previous pipeline remains the same except that we don't need sqlloader its all managed by the table definition itself. So we end up with something similar to the following for the table definition (I've abbreviated it quite substantially and highlighted the important bits)
One of the things to note is that I've included the pipelined preprocessor inside of a shell script which looks like this
The important part of this script is the parameter ($1) that is passed to the shell script. This is the file name that the external table wants to process.
A simple select statement from my "staging_ext_mydata_jan01" unzips and parses the data converting it to usable columns. Whats more if I issue the statement in parallel Oracle takes care of creating the processes for me and making sure everything is scheduled in an orderly fashion.
To finish the load we simply used a multi table insert to put the data into the correct tables in an efficient fashion. Using this approach we were able to read zipped files, parse them and insert them into our three target tables at over 1.5 million source records/sec.
Now typically the approach many people would take would be to develop some form of program that parses all of the data and writes it to staging area and then loads all of it in one go to the target database. I make no criticism of this approach since it works well and as long as its not time critical. It's by far the simplest method. However Im a big fan of taking advantage of whats already available and one of the most underused and powerful features of the Oracle database is the preparser. It enables you to pipeline various operations so they all run as quickly as possible. So going back to my benchmark we used this approach to load data into out target database. It consisted of 4 steps
- Read the data of the filesystem as efficiently as possible and write it to stdout
- Read from stdin and Unzip the the data writing it to stdout
- Read from stdin into a java program to do the key value mapping and error detection/correction writing the output to stdout
- Read from stdin into sqlloader
The java program simple reads from stdin and writes to stdout. To handle key value pairs just required the program to read the mapping file in and split and parse the values from stdin. The data was then written to stdout in a well know order.
Java extract from my program...
BufferedReader br
= new BufferedReader(new InputStreamReader(System.in),
OneMB);
String line =
null;
HashMap<String, String> keyValuePairs = null;
MyTokenizer mt =
null;
while ((line =
br.readLine()) !=
null) {
keyValuePairs =
new
HashMap<String, String>(200);
mt = new MyTokenizer(line,
delimitor);
for (String token : mt)
{
int loc = token.indexOf("=");
if (loc !=
-1)
{
String i =
token.substring(0,
loc);
String s =
token.substring(loc +
1, token.length());
keyValuePairs.put(i,
s);
}
StringBuffer outRec = new StringBuffer(1000);
outRec.append(checkForNull(keyValuePairs.get("uniqueID"), "")).append(seperator);
// mapping logic similar
to above repeats
System.out.println(outRec.toString());
}
All that was needed for sqlloader to process the files was a control file that understood the order of the columns and any additional formatting.
One of the additional benefits is that we can load the data via "direct path" and implement other features such as multi table insert. The Java preparser enables you to add all of the additional formatting to make this a trivial process.
The following diagram illustrates the process.

This equates into a Unix/Linux statement such as
/bin/dd if=myverybigfile.txt bs=1024k
status=noxfer 2>/dev/null | /bin/gunzip -c | java -classpath
/home/oracle/loader.jar
com.dom.KeyValueParserStdIn | sqlloader bench/bench control=kv.ctl data=\"-\" direct=TRUE; NOTE : one thing you may have noticed is that Im using dd to do 1MB I/Os. This just an efficiency operation and works well on structures such as DBFS, you could skip this part of the operation if needed.
Which brings us onto external tables and the preparser
External Tables and pre-parsers
As I mentioned earlier I like to take advantage of functionality that's already available and one of those features in the Oracle database is external tables. I don't intend to go into much detail as to why you should use external tables other than they do much of the heavy lifting for you and they provide a seamless interface between the filesystem and the database. They effectively make files look like tables.
This means it's trivial to implement parallelism for our pre-parser. We don't need to worry about how to handle the files and how to schedule everything, external tables take care of all of that for you. In our benchmark we used them in the following way

Our previous pipeline remains the same except that we don't need sqlloader its all managed by the table definition itself. So we end up with something similar to the following for the table definition (I've abbreviated it quite substantially and highlighted the important bits)
create table
staging_ext_mydata_jan01
( uniqueid
NUMBER,
..
-- Lots of columns
..
)
ORGANIZATION EXTERNAL
(
TYPE oracle_loader
DEFAULT DIRECTORY load_dir_jan01
ACCESS PARAMETERS
(
RECORDS DELIMITED BY
NEWLINE
PREPROCESSOR
exec_dir:'external_tab.sh'
BADFILE
log_dir: 'external.bad'
LOGFILE
log_dir: 'external.log'
fields terminated by '|'
OPTIONALLY ENCLOSED BY '"'
AND '"'
( uniqueid
char(100),
..
-- Lots of defintions
..
)
)
LOCATION('data1.txt',data2.txt'...'data100.txt'))
REJECT LIMIT UNLIMITED;
One of the things to note is that I've included the pipelined preprocessor inside of a shell script which looks like this
/bin/dd if=$1
bs=1024k status=noxfer 2>/dev/null | /bin/gunzip -c | java -classpath
/home/oracle/loader.jar
com.dom.KeyValueParserStdIn
The important part of this script is the parameter ($1) that is passed to the shell script. This is the file name that the external table wants to process.
A simple select statement from my "staging_ext_mydata_jan01" unzips and parses the data converting it to usable columns. Whats more if I issue the statement in parallel Oracle takes care of creating the processes for me and making sure everything is scheduled in an orderly fashion.
To finish the load we simply used a multi table insert to put the data into the correct tables in an efficient fashion. Using this approach we were able to read zipped files, parse them and insert them into our three target tables at over 1.5 million source records/sec.
New Build of Datagenerator
31/12/10 11:34 Filed in: Datagenerator
Im releasing a new build of Datagenerator simply
because there hasn’t been one for a while. Thats
not to say it hasn’t undergone significant
changes. Most of them are as a result of enhancements
to support schema creation for swingbench. In
particular is the introduction of Pre and Post
generation scripts. These allow me to run a complete
schema creation from within datagenerator. These
scripts appear as top level items from within the tree
(see below).
In the side panel you can now include scripts and parameters for the scripts.
I’ve also included the script files used for generating the SH and SOE schemas used by swingbench. This should make it easier to understand what is going on and potentially create your own versions of the schemas.
In this release I’ve also improved the threading model and included one or two other performance enhancements....
In the next release I’m going to try and add support for for well know data items such as zip/post codes, NI numbers, Social Security etc.... as well as allowing users to plug their own data generators in.
You can download it from the usual place and as before leave comments below or via the comments page.
In the side panel you can now include scripts and parameters for the scripts.
I’ve also included the script files used for generating the SH and SOE schemas used by swingbench. This should make it easier to understand what is going on and potentially create your own versions of the schemas.
In this release I’ve also improved the threading model and included one or two other performance enhancements....
In the next release I’m going to try and add support for for well know data items such as zip/post codes, NI numbers, Social Security etc.... as well as allowing users to plug their own data generators in.
You can download it from the usual place and as before leave comments below or via the comments page.
Happy Christmas All...
24/12/10 18:54
Database Time Monitor
08/11/10 20:22
Fixes to CPUMonitor
06/11/10 14:24 Filed in: cpumonitor
I’ve just posed a minor update to CPUMonitor to
try and fix a few bugs and improve the debugging. You
can download it from the usual place here