SBUtil in Action (And Some Fixes)

To demonstrate what you can achieve with SBUtil, I’ve created a little test showing that it’s trivial to change the size of the data set and rerun the same workload, illustrating how TPS varies when the SGA/buffer cache is kept constant.

First, the script:

#!/bin/bash

sqlplus sys/welcome1@//ed2xcomp01/DOMS as sysdba << EOF
create bigfile tablespace SOE datafile size 1000G;
exit;
EOF

time ./oewizard -scale 1 -dbap welcome1 -u soe -p soe -cl -cs //ed2xcomp01/DOMS -ts SOE -create

time ./charbench -u soe -p soe -cs //ed2xcomp01/DOMS -uc 128 -min -0 -max 0 -stats full -rt 0:10 -bs 0:01 -a -r resscale01.xml

time ./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -parallel 64 -dup 2  

time ./charbench -u soe -p soe -cs //ed2xcomp01/DOMS -uc 128 -min -0 -max 0 -stats full -rt 0:10 -bs 0:01 -a -r resscale02.xml

time ./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -parallel 64 -dup 2

time ./charbench -u soe -p soe -cs //ed2xcomp01/DOMS -uc 128 -min -0 -max 0 -stats full -rt 0:10 -bs 0:01 -a -r resscale04.xml

time ./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -parallel 64 -dup 2

time ./charbench -u soe -p soe -cs //ed2xcomp01/DOMS -uc 128 -min -0 -max 0 -stats full -rt 0:10 -bs 0:01 -a -r resscale08.xml

time ./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -parallel 64 -dup 2

time ./charbench -u soe -p soe -cs //ed2xcomp01/DOMS -uc 128 -min -0 -max 0 -stats full -rt 0:10 -bs 0:01 -a -r resscale16.xml

time ./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -parallel 64 -dup 2

time ./charbench -u soe -p soe -cs //ed2xcomp01/DOMS -uc 128 -min -0 -max 0 -stats full -rt 0:10 -bs 0:01 -a -r resscale32.xml

time ./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -parallel 64 -dup 2

time ./charbench -u soe -p soe -cs //ed2xcomp01/DOMS -uc 128 -min -0 -max 0 -stats full -rt 0:10 -bs 0:01 -a -r resscale64.xml

time ./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -parallel 64 -dup 2

time ./charbench -u soe -p soe -cs //ed2xcomp01/DOMS -uc 128 -min -0 -max 0 -stats full -rt 0:10 -bs 0:01 -a -r resscale128.xml

time ./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -parallel 64 -dup 2

time ./charbench -u soe -p soe -cs //ed2xcomp01/DOMS -uc 128 -min -0 -max 0 -stats full -rt 0:10 -bs 0:01 -a -r resscale256.xml

I’m hoping it should be fairly obvious what the script does. I create a tablespace big enough to accommodate the final data set (in this instance 1TB), so I’m not constantly auto-extending the tablespace. Then I create a seed schema 1GB in size and run a workload against it; this workload remains constant through all of the tests. Each load test runs for 10 minutes and collects full stats. After each test has completed I use “sbutil” to replicate the schema and simply rerun the workload, writing the results to a new file. By the end of the tests and duplications the schema holds roughly half a terabyte of data and about the same again in indexes. The benefit of this approach is that it’s much quicker to expand the data this way than to use the wizard to do it.
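If you’d rather not write out each duplicate-and-benchmark pair by hand, the same sequence can be driven from a simple loop. This is just a sketch of the idea, reusing the connect string, credentials and settings from the script above:

#!/bin/bash

CS=//ed2xcomp01/DOMS

# benchmark the 1GB seed schema first
time ./charbench -u soe -p soe -cs $CS -uc 128 -min -0 -max 0 -stats full -rt 0:10 -bs 0:01 -a -r resscale01.xml

# double the data set with sbutil and rerun the identical workload at each scale
for scale in 02 04 08 16 32 64 128 256
do
  time ./sbutil -u soe -p soe -cs $CS -soe -parallel 64 -dup 2
  time ./charbench -u soe -p soe -cs $CS -uc 128 -min -0 -max 0 -stats full -rt 0:10 -bs 0:01 -a -r resscale${scale}.xml
done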

The directory will now contain all of the results files, which can be processed trivially using the “parse_results.py” script in the utils directory. In the example below I simply parse all of the files and write the result to a file called res1.csv. Because the python script is so easy to modify you could pull out any of the stats from the file; I’m just using the ones that it collects by default.

$ python ../utils/parse_results.py -r resscale01.xml resscale02.xml resscale04.xml resscale08.xml resscale16.xml resscale32.xml resscale64.xml resscale128.xml resscale256.xml -o res1.csv


On completing this step I can use my favourite spreadsheet tool to load res1.csv and take a look at the data. In this instance I’m using “Numbers” on OS X, but Excel would work equally well.

[Screenshot: res1.csv loaded into a spreadsheet]

The resulting chart shows pretty much what we’d expect: doubling the size of the data set while keeping the workload and SGA constant reduces the TPS. It tails off quickly as the data set outgrows the buffer cache and we are forced to perform physical IOs, then declines more slowly once that initial impact has been felt. It’s something we would expect, but it’s nice to be able to demonstrate it.
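If you want to confirm that the drop really is down to the buffer cache, comparing the instance statistics before and after a run tells the story. A minimal check, run as a privileged user (note the \$ escape so bash doesn’t expand it inside the heredoc):

sqlplus sys/welcome1@//ed2xcomp01/DOMS as sysdba << EOF
select name, value from v\$sysstat
where name in ('physical reads', 'consistent gets', 'db block gets');
exit;
EOF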

[Chart: TPS vs. data set size]

So the next obvious question is: how easy is it to rerun the test with a feature like partitioning enabled? Really easy. Simply ask oewizard to create the schema using hash partitioning, i.e. change

time ./oewizard -scale 1 -dbap welcome1 -u soe -p soe -cl -cs //ed2xcomp01/DOMS -ts SOE -create

to

time ./oewizard -scale 1 -hashpart -dbap welcome1 -u soe -p soe -cl -cs //ed2xcomp01/DOMS -ts SOE -create

and rerun the tests to produce the following results


[Chart: TPS vs. data set size with hash partitioning]

You can even enable compression on top of the partitioning by simply replacing the first sbutil operation with

time ./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -parallel 64 -dup 2 -ac


[Chart: TPS vs. data set size with partitioning and advanced compression]


During the testing of this I fixed a few bugs which I wasn’t aware of, so it’s worth downloading a new build, which you can find here. I also realised I needed a few more options on the utility to round it out. These were (example invocations follow the list below)

  • “seqs” : this recreates the sequences used by the schemas
  • “code” : rebuilds any stored procedures
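Assuming these follow the same dashed convention as the other verb options (I haven’t shown their exact spelling above, so check sbutil’s usage output), invoking them would look something like:

./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -seqs
./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -code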

I’ll most likely add a few more over the coming weeks.

Just a word of caution… I’ve noticed that my stats collection is broken. I don’t cater for all of the possible versions/configurations in my script, so it can break on large stats collections and fall back to running serially. I’ll try and fix this next week.
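Until that’s fixed, if the stats step does go serial on a large schema you can always gather the statistics yourself with DBMS_STATS and an explicit degree of parallelism. Something along these lines (the schema name and degree are just illustrative):

sqlplus soe/soe@//ed2xcomp01/DOMS << EOF
exec dbms_stats.gather_schema_stats(ownname => 'SOE', degree => 64);
exit;
EOF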




Swingbench Util... Going Big...

I’ve added a utility to swingbench that perhaps I should’ve added a long time ago. The idea behind it is that it enables you to validate, fix or duplicate the data created by the wizards. I’m often contacted by people asking me how to fix an install of “Order Entry” or the “Sales History” benchmark after it’s been running for many hours, only to find it has run out of temp space during the index creation. I’m also asked how to create a really big “Sales History” schema with relatively limited compute power, where building a multi-terabyte data set might take days. Well, the good news is that now most of this should be relatively easy. The solution is a little program called “sbutil”, and you can find it in your bin or winbin directory.

Currently sbutil is command line only and requires a number of parameters to get it to do anything useful. The main parameters are
  • “-dup” indicates the number of times you want the data to be duplicated within the schema. Valid values are 1 to n. Data is copied and new primary keys/foreign keys are generated where necessary. It’s recommended that you first increase/extend the tablespace before beginning the duplication (see the resize example after this list). The duplication process will also rebuild the indexes and update the metadata table unless you specifically ask it not to with the “-nic” option, which is useful if you know you’ll be duplicating the data again at a later stage.
  • “-val” validates the tables and indexes in the specified schema. It will list any missing indexes or invalid code.
  • “-stats” will create/recreate statistics for the indicated schema
  • “-delstats” will delete all of the statistics for the indicated schema
  • “-tables” will list all of the tables and row counts (based on database statistics) for the indicated schema
  • “-di” will drop all of the indexes for the indicated schema
  • “-ci” will recreate all of the indexes for the indicated schema
Alongside these verb-based parameters are a number of associated attributes
  • “-u” : required. the username of the schema
  • “-p” : required. the password of the schema
  • “-cs” : required. the connect string of the schema. Some examples might be “localhost:1521:db12c”, “//oracleserver/soe”, “//linuxserver:1526/orcl” etc.
  • “-parallel” the level of parallelism to use to perform operations. Valid values are 1 to n.
  • “-sort” : sort the seed data before duplicating it.
  • “-nic” don’t create indexes or constraints at the end of a duplication
  • “-ac” convert the “main” tables to advanced compression
  • “-hcc” convert the main tables to Hybrid Columnar Compression
  • “-soe” : the target schema will be “Order Entry” (one of “-soe” or “-sh” is required)
  • “-sh” : the target schema will be “Sales History” (one of “-soe” or “-sh” is required)
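As mentioned under “-dup”, it’s worth making sure the tablespace has room before you start duplicating. With a bigfile tablespace like the SOE one created earlier, a simple resize is enough; the target size here is only illustrative:

sqlplus sys/welcome1@//ed2xcomp01/DOMS as sysdba << EOF
alter tablespace SOE resize 2T;
exit;
EOF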
So let’s take a look at a few examples.

sbutil -u soe -p soe -cs //oracleserver/soe -dup 2 -parallel 32 -sort -soe

will duplicate the data in the soe schema but will first sort the seed data. You should see output similar to this

Getting table Info
Got table information. Completed in : 0:00:26.927
Dropping Indexes
Dropped Indexes. Completed in : 0:00:05.198
Creating copies of tables
Created copies of tables. Completed in : 0:00:00.452
Begining data duplication
Completed Iteration 2. Completed in : 0:00:32.138                                                  
Creating  Constraints
Created  Constraints. Completed in : 0:04:39.056
Creating  Indexes
Created  Indexes. Completed in : 0:02:52.198
Updating Metadata and Recompiling Code
Updated Metadata. Completed in : 0:00:02.032
Determining New Row Counts
Got New Row Counts. Completed in : 0:00:05.606
Completed Data Duplication in 0 hour(s) 9 minute(s) 44 second(s) 964 millisecond(s)
----------------------------------------------------------------------------------------------------------
|Table Name          |  Original Row Count|       Original Size|       New Row Count|            New Size|
----------------------------------------------------------------------------------------------------------
|ORDER_ITEMS         |         172,605,912|             11.7 GB|         345,211,824|             23.2 GB|
|CUSTOMERS           |          40,149,958|              5.5 GB|          80,299,916|             10.9 GB|
|CARD_DETAILS        |          60,149,958|              3.4 GB|         120,299,916|              6.8 GB|
|ORDERS              |          57,587,049|              6.7 GB|         115,174,098|             13.3 GB|
|ADDRESSES           |          60,174,782|              5.7 GB|         120,349,564|             11.4 GB|
----------------------------------------------------------------------------------------------------------


The following example validates a schema, ensuring that its tables and indexes are all present and valid

./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -val


The output of the command will look similar to this

The Order Entry Schema appears to be valid.
--------------------------------------------------
|Object Type    |     Valid|   Invalid|   Missing|
--------------------------------------------------
|Table          |        10|         0|         0|
|Index          |        26|         0|         0|
|Sequence       |         5|         0|         0|
|View           |         2|         0|         0|
|Code           |         1|         0|         0|
--------------------------------------------------


The next command lists the tables in a schema

./sbutil -u soe -p soe -cs //ed2xcomp01/DOMS -soe -tables

Order Entry Schemas Tables
----------------------------------------------------------------------------------------------------------------------
|Table Name                  |                Rows|              Blocks|           Size|   Compressed?|  Partitioned?|
----------------------------------------------------------------------------------------------------------------------
|ORDER_ITEMS                 |          17,157,056|             152,488|         11.6GB|              |           Yes|
|ORDERS                      |           5,719,160|              87,691|          6.7GB|              |           Yes|
|ADDRESSES                   |           6,000,000|              75,229|          5.7GB|              |           Yes|
|CUSTOMERS                   |           4,000,000|              72,637|          5.5GB|              |           Yes|
|CARD_DETAILS                |           6,000,000|              44,960|          3.4GB|              |           Yes|
|LOGON                       |                   0|                   0|        101.0MB|              |           Yes|
|INVENTORIES                 |                   0|                   0|         87.0MB|      Disabled|            No|
|PRODUCT_DESCRIPTIONS        |                   0|                   0|         1024KB|      Disabled|            No|
|WAREHOUSES                  |                   0|                   0|         1024KB|      Disabled|            No|
|PRODUCT_INFORMATION         |                   0|                   0|         1024KB|      Disabled|            No|
|ORDERENTRY_METADATA         |                   0|                   0|         1024KB|      Disabled|            No|
----------------------------------------------------------------------------------------------------------------------

To drop the indexes in a schema use the following command

./sbutil -u sh -p sh -cs //oracle12c2/soe -sh -di
Dropping Indexes
Dropped Indexes. Completed in : 0:00:00.925

To recreate the indexes in a schema use the following command

./sbutil -u sh -p sh -cs //oracle12c2/soe -sh -ci
Creating Partitioned Indexes and Constraints
Created  Indexes and Constraints. Completed in : 0:00:03.395

You can download the new version of the software here.

Java Version Performance

Sometimes it’s easy to lose track of the various version numbers for software as they continue their march ever onwards. However, as I continue my plans to migrate onto Java 8 and all of the coding goodness that lies within, I thought it was sensible to check what difference it would make to swingbench in terms of performance.

Now before we go any further it’s worth pointing out this was a trivial test and my results might not be representative of what you or anyone else might find.

My environment was

iMac (Retina 5K, 27-inch, Late 2014), 4 GHz Intel Core i7, 32 GB 1600 MHz DDR3 with a 500GB SSD

Oracle Database 12c (12.1.0.2) with the January Patch Bundle running in a VM with 8GB of memory.

The test is pretty simple but might have an impact on your choice of JVM when generating a lot of data (TB+) with swingbench. I simply created a 1GB SOE schema with the oewizard. This is a pretty CPU-intensive operation for the entire stack: swingbench, the jdbc drivers and the database. The part of the operation that should be most affected by the performance of the JVM is the “Data Generation Step”.
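For reference, the wizard invocation was essentially the same as the one used earlier in these posts; the connect string and passwords will obviously differ for your environment:

time ./oewizard -scale 1 -dbap welcome1 -u soe -p soe -cl -cs //ed2xcomp01/DOMS -ts SOE -create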

So, enough talk: what impact did the JVM version have?

[Table: oewizard data generation time by JVM version]

Now the numbers might not be earth-shattering, but it’s nice to know that a simple upgrade of the JVM can result in nearly a 25% improvement in the performance of a CPU- and database-intensive batch job. I expect these numbers to go up as I optimise some of the logic to take advantage of Java 8-specific functionality.




And now a fix to the fixes...

Just realised I uploaded the last build with the Oracle Database 12c Release 1 (12.1.0.1) jdbc drivers. I’ve fixed that (it now ships with the 12.1.0.2 drivers), so feel free to download again.

Thanks

Dom.

Fixes to swingbench

Happy new year….

I’ve updated swingbench with some fixes. Most of these are to do with the new results-to-PDF functionality. But at this point I’m giving fair warning that future releases will be available on Java 8 only.

I’m also removing the “3D” charts from swingbench and replacing them with a richer selection of charts built using JavaFX.

Also I’ve added some simple code to the jar file to tell you which version it is… You can run this with the command

$ java -jar swingbench.jar
Swingbench Version 2.5.955

As always you can download it here. Please leave comments below if you run into any problems.