Setting up Swingbench for Oracle Autonomous Data Warehousing (ADW)

The following blog details how to install swingbench against ADW. It’s very similar in structure to my previous blog for ATP with the major difference being the benchmark we are using. The example below will use the TPC-DS Like Benchmark but the approach and parameters apply to the SH benchmark as well. To install swingbench on ADW you’ll need to run through the following steps (Step 7 is optional).

Step 1/ Make Sure you have a SSH Public key


You are likely to already have a ssh key but it is possible that you want to create another purely for this exercise. You’ll need this key to create your application server. You can find details on how to do this here

https://git-scm.com/book/en/v2/Git-on-the-Server-Generating-Your-SSH-Public-Key

It’s the .pub file or more precisely its contents that you’ll need. The public key file is typically created in the hidden .ssh directory in your home directory. The public key will look something like this (modified)

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDfO/80wleUCYxY7Ws8c67PmqL2qRUfpdPGOduHmy9xT9HkCzjoZHHIk1Zx1VpFtQQM+RwJzArZQHXrMnvefleH20AvtbT9bo2cIIZ8446DX0hHPGaGYaJNn6mCeLi/wXW0+mJmKc2xIdasnH8Q686zmv72IZ9UzD12o+nns2FgCwfleQfyVIacjfi+dy4DB8znpb4KU5rKJi5Zl004pd1uSrRtlDKR9OGILvakyf87CnAP/T8ITSMy0HWpqc8dPHJq74S5jeQn/TxrZ6TGVA+xGLzLHN4fLCOGY20gH7w3rqNTqFuUIWuIf4OFdyZoFBQyh1GWMOaKjplUonBmeZlV

You’ll need this in step 3.

Step 2/ Create the ADW Instance


You’ll have to have gone through the process of acquiring an Oracle Cloud account but that’s beyond the scope of this walkthrough. Once you have the account and have logged into Oracle Cloud Infrastructure, click on the menu button in the top left of the screen and select “Autonomous Data Warehouse”. Then simply follow these steps.

Napkin 28-08-18, 1.38.46 pm

Step 3/ Create a compute resource for the application server


Whilst the ADW instance is creating we can create our application to run swingbench. For any reasonable load to be run against the application server you’ll need a minimum of two cores for larger workloads you may need a bigger application or potentially a small cluster of them.

In this walkthrough we’ll create a small 2 core Linux Server VM.

Iaas Creation 11-08-18, 9.48.29 am

This should only take a couple of minutes. On completion we’ll need to use the public IP address of the application server we created in the previous step.

Step 4/ Log onto application server and setup the environment


In this step we’ll use ssh to log onto the application server and setup the environment to run swingbench. ssh is available on MacOS and Linux. On platforms like Windows you can use Putty. You’ll need the IP address of the application server you created in the previous step.

First bring up a terminal on Linux/Mac. On Putty launch a new ssh session. The username will be “opc”

ssh opc@< IP Address of Appserver >

You should see something similar to

ssh opc@129.146.65.101
ECDSA key fingerprint is SHA256:kNbpKWL3M1wB6PUFy2GOl+JmaTIxLQiggMzn6vl2qK1tM.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '129.146.65.101' (ECDSA) to the list of known hosts.
Enter passphrase for key '/Users/dgiles/.ssh/id_rsa':


By default java isn’t installed on this VM so we’ll need to install it via yum. We’ll need to update yum first

$> sudo yum makecache fast

Then we can install java and its dependencies

$> sudo yum install java-1.8.0-openjdk-headless.x86_64

We should now make sure that swingbench works correctly

$> java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

We can now pull the swingbench code from the website

$> curl http://www.dominicgiles.com/swingbench/swingbenchlatest.zip -o swingbench.zip

and unzip it

$> unzip swingbench.zip

Step 5/ Download the credentials file


The next step is to get the credentials file for ATP. You can do this by following these steps.

Napkin 2 28-08-18, 2.25.22 pm

You’ll need to upload this to our application server with a command similar to

scp wallet_DOMADW.zip opc@< IP address of App Server >:

This will place our credentials file in the home directory of the application server. You don’t need to unzip it to use it.

Step 6/ Install a workload schema into the ADW instance

We can now install a schema to run our queries against. We do this by first changing to the swingbench bin directory

cd swingbench/bin

And then running the following command replacing your passwords with those that you specified during the creation of the ADW instance.

A quick explanation of the parameters we are using
  • -cf tells tpcdswizard the location of the credentials file
  • -cs is the connecting for the service of the ADW instance. It is based on the name of the instance and is of the form followed by one of the following _low, _medium or _high
  • -ts is the name of the table space to install swingbench into. It is currently always “DATA”
  • -its is the name of the index table space to install swingbench into. It is currently always “DATA”
  • -dba is the admin user, currently this is always admin
  • -dbap is the password you specified at the creation of the ADW instance
  • -u is the name you want to give to the user you are installing swingbench into (I used tpcds)
  • -p is the password for the user. It needs to follow the password complexity rules of ATP
  • -async_off you need to disable the wizards default behaviour of using async commits. This is currently prohibited on ADW
  • -scale indicates the size of the schema you want to create where a scale of 10 will generate 10GB of data. The indexes will take an additional amount of space roughly half the size of the data.
  • -create tells swingbench to create the schema (-drop will delete the schema)
  • -cl tells swingbech to run in character mode
  • -v tells swingbench to output whats going on (verbose mode)

You should see the following output. A scale of 10 should take just over 25 mins to create depending on how many CPUs you allocated to your compute and ADW instances.

SwingBench Wizard
Author : Dominic Giles
Version : 2.6.0.1086

Running in Lights Out Mode using config file : ../wizardconfigs/tpcdswizard.xml
Connecting to : jdbcGaspracle:thin:@domadw_high
Connected
Starting run
Starting script ../sql/tpcds_droptables.sql
Script completed in 0 hour(s) 0 minute(s) 6 second(s) 594 millisecond(s)
Starting script ../sql/tpcds_createtables.sql
Script completed in 0 hour(s) 0 minute(s) 0 second(s) 740 millisecond(s)
Inserting data into table STORE_SALES_21603029
Inserting data into table STORE_SALES_2
Inserting data into table STORE_SALES_14402020
Inserting data into table STORE_SALES_7201011
Inserting data into table CATALOG_SALES_10811609
Inserting data into table CATALOG_SALES_2
Inserting data into table CATALOG_SALES_3603871
Inserting data into table CATALOG_SALES_7207740
Inserting data into table WEB_SALES_5395379
Inserting data into table WEB_SALES_2
Inserting data into table STORE_RETURNS_179999
Inserting data into table WEB_SALES_3596920
Inserting data into table WEB_SALES_1798461
Inserting data into table INVENTORY_2452180
Inserting data into table INVENTORY_2451725
. . .
. . .
Connection cache closed
Starting script ../sql/tpcds_analyzeschema.sql
Script completed in 0 hour(s) 1 minute(s) 25 second(s) 439 millisecond(s)
Starting script ../sql/tpcds_constraints.sql
Script completed in 0 hour(s) 11 minute(s) 25 second(s) 427 millisecond(s)
Starting script ../sql/tpcds_transactions_pkg.sql
Script completed in 0 hour(s) 0 minute(s) 0 second(s) 675 millisecond(s)

============================================
| Datagenerator Run Stats |
============================================
Connection Time 0:00:00.003
Data Generation Time 0:12:58.906
DDL Creation Time 0:12:58.965
Total Run Time 0:25:57.876
Rows Inserted per sec 135,690
Data Generated (MB) per sec 10.6
Actual Rows Generated 105,530,092
Commits Completed 5,366
Batch Updates Completed 528,519


It may report that the schema was created unsuccessfully because we asked it to not create any indexes on the schema. Just ignore this and carry on.

To be sure it installed correctly you can validate the schema using the following command

./sbutil -tpcds -cf ~/wallet_DOMADW.zip -cs domadw_high -u tpcds -p < Your user password > -val

You should see something similar to

Screenshot of RapidWeaver (06-09-2018, 20-13-48)

It should report that you are missing a single index INVENTORY_NUK. This is expected a a fix in the future will take into consideration that we asked for indexes (except those used for constraints) no to be created.

You can check on the number of rows that have been created with the following command

./sbutil -tpcds -cf ~/wallet_DOMADW.zip -cs domadw_high -u tpcds -p -tables

You should see something similar to this

Screenshot of RapidWeaver (06-09-2018, 20-15-27)
The size may be smaller or larger than this depending on a number of factors but in this case HCC has attempted to compress the data even though it was loaded via DML operations. A higher compression ratio can be achieved by reorganising the data.


Step 7/ Run a workload


In the next step we’ll be using swingbench’s load generator to create a load against ADW. Since we are running on a command line interface with no graphical UI we’ll be using the “charbench” command line utility. But before we begin we’ll disable a couple of the TPC-DS queries because they take a long time to complete. To do this just run the following command.


Now we can run a query workload against the TPCDS schema with a command similar to this


I won’t explain the parameters that I detailed earlier when running the wizard but for the new ones do the following
  • -v indicates what info should be shown in the terminal when running the command. In this instance I’ve asked that the users logged on, The Time remaining in the run, Tx/Min, Tx/Sec and average response time of the queries (in milliseconds)
  • -intermin -intermax indicates the time to sleep between each transaction.
  • -rt indicates how long to run the benchmark before automatically stopping it

You should see output similar to the following

Screenshot of RapidWeaver (06-09-2018, 20-19-41)

NOTE : The value used in the example above for the run time might not be enough to complete all of the queries (approximately 100) in which case either increase the run time or up the number of CPUs being used.

One thing to try whilst running the load against the server is to try and scale the number of available CPUs to the ADW instance up and down. This should see a increase in the number of queries being processed.

Screenshot of Google Chrome (30-08-2018, 13-22-38)

Somethings to note. At the end of each run you’ll end up with a results file in xml format in the directory you ran charbench from. i.e.

$ ls
bmcompare clusteroverview debug.log oewizard results00003.xml results00006.xml results00009.xml sbutil swingbench
ccwizard coordinator jsonwizard results00001.xml results00004.xml results00007.xml results2pdf shwizard tpcdswizard
charbench data minibench results00002.xml results00005.xml results00008.xml results.xml sqlbuilder


These xml files contain the detailed results of each run i.e. average transactions per second, completed transactions, percentile response times etc. Whilst these are difficult to read you can install swingbench on a windows or mac and use a utility called results2pdf to convert them into a more human parseable form. You can find some details on how to do that here.

http://www.dominicgiles.com/blog/files/86668db677bc5c3fc1f0a0231d595ebc-139.html

Using the methods above you should be able to create scripts that test the performance of the ADW server. i.e. running loads with different CPU counts, users, think times etc.

But beware that before comparing the results with on premise servers there are a lot of features enabled on the ATP server like db_block_checking and db_check_sum that may not be enabled on another Oracle instance.

Step 7/ Optional functionality


To make the demo more interactive you could show the charts on the service console. The only issue is that the refresh rate is a little slow. You can improve on this by using some utilities I provide. The first of these is Database Time Monitor (http://www.dominicgiles.com/dbtimeviewer.html).

To install it you first need to download it (
http://www.dominicgiles.com/downloads.html) to your PC or mac and make sure that you’ve installed a Java 8 JRE. Once you’ve done that you simply need to unzip it and change into the bin directory. From there all you need to do is to run a command similar to

$> ./dbtimemonitor -u admin -p < your admin password > -cs domadw_low -cf /Users/dgiles/Downloads/wallet_DOMADW.zip

or on Windows

$> dbtimemonitor.bat -u admin -p < your admin password > -cs domadw_low -cf /Users/dgiles/Downloads/wallet_DOMADW.zip

where the -cf parameter references the credential file you downloaded and the -cs parameter references the service for ATP. You should see a screen similar to this.

Screenshot of ChildMain (10-08-2018, 19-07-22)

You can use this to monitor in real time the activity of the ATP instance.

The final tool, cpumonitor (http://www.dominicgiles.com/cpumonitor.html), allows you to monitor the activity of the application server. This can be downloaded from here (http://www.dominicgiles.com/downloads.html) and again should be installed on your PC or Mac. This is done by simply unzipping the download. Then change into the bin directory on mac or linux or the winbin directory on a windows machine.

You’ll need to edit the XML file to reflect the location of the application server.



Save the file and then launch it with the command

$> ./cpumonitor
Screenshot of Terminal (11-08-2018, 19-04-00)

Conclusion

This walk through should show you how to install swingbench and it’s DSS and data warehousing benchmarks against ADW. I’ll be providing
blog comments powered by Disqus