A new build of Datagenerator

I hadn't realised it's been such a long time since I updated Datagenerator… I'm going to put that right today by releasing a new build that will eventually turn into version 0.5 and be at the heart of the upcoming Swingbench 2.6.

I've had to update Datagenerator extensively to support some of the new benchmarks and changes that the next version of swingbench requires. Some of the changes I've made are.

Support for a new generator to create arbitrary precision floating point values. It also includes a preview of the data to be generated. This will become the standard moving forwards.

Datagenerator 0.4.0.1013

Number generators can use the row count of another table to act as the end value. This makes it simple and quick to create foreign keys.

AppMain

Reverse Engineering now supports Index and sequences. The reverse engineering process also uses the statistics from tables to seed the generators.

untitled

Tables can be scaled at different ratios from others…

Datagenerator 0.4.0.1013

Also upcoming in the 0.5 build…

  • Generators based on values from other generators i.e. use the "first_name" created by another generator to act as part of an email address.
  • UID Generator
  • Potentially generators for ZIPCodes, IP Ranges, Points within Polygons.

You can download it as always from here
Comments

ORDS From : Start to Finish

Introduction


This is likely to be a pretty long posting on ORDS to describe the workflow for creating a REST application from start to finish. With that said I'm going to have to make a few compromises to keep it tolerable to follow. I thought long and hard about what would be a meaningful example that would be easy to understand but show most of the aspects of building a REST based application using Oracle ORDS. With this in mind I settled on a simple micro service to enable users to "like" products/rows in a similar fashion to being able to like posts on Facebook/Twitter or other social media networks. Whilst there are plenty of examples of how to do this out there already I couldn't find many that explained the entire process from start to finish and using the Oracle Database as the target. I'll try and keep it up to date as we are expecting to see a few changes in SQLDeveloper and ORDS shortly. On completion we should have a web page that looks a little like the one below.

untitled

The code for this blog is maintained here
urlhttps://github.com/domgiles/ORDS_Example

To create a working environment simply go to the link above, click on the "Clone or download" button and then click on the "Download ZIP" link. When it's finished downloading uncompress the file into your preferred location.

Safari
This post should largely serve as a reference for those wishing to test out this functionality themselves. I'll also be releasing a version of this code where we store JSON in the database as opposed to relational table to offer a comparison with the approach.

ORDS Description


First of all what is ORDS and why do we care. Well this is Oracle's description

"Oracle REST Data Services (ORDS) makes it easy to develop modern REST interfaces for relational data in the Oracle Database and now, with ORDS 3.0, the Oracle Database 12c JSON Document Store and Oracle NoSQL Database. ORDS is available both as an Oracle Database Cloud Service and on premise.

REST has become the dominant interfaces for accessing services on the Internet, including those provided by major vendors such as Google, Facebook, Twitter, and Oracle, and within the enterprise by leading companies throughout the world. REST provides a powerful yet simple alternative to standards such as SOAP with connectivity to virtually every language environment, without having to install client drivers, because it is based on simple HTTP calls which virtually all language environments support.

For relational data, ORDS 3.0 now automatically generates REST endpoints for about a dozen major single table operations including create, query, update, and delete. Easy to write JSON filters allow query predicates and sorts to be specified in a query-by-example format. For more complex operations, REST calls are mapped to SQL and PL/SQL routines you can write which return data in JSON and other formats.

ORDS 3.0 enables a wide range of operations to be performed on JSON documents in the Oracle Database 12c Document Store using the Simple Oracle Document Access (SODA) API over REST. These operations include create, query, update, and delete with support for JSON filters.
"

With that said lets get started.

Prerequisites


If you plan to install this example you'll need to have the following components available. I won't be describing their installation. I'll assume you'll have done this ahead of time
  • Oracle Database (SE or EE) 11.1 or later. Oracle Database 12c is the preferred version. It can be configured either as a PDB or not. You can download it from here
  • Oracle SQL Developer 4.1.3. It needs to be at least this version. You can down load it here
  • Oracle ORDS 3.0.6. You can download it here. (NOTE : I will walkthrough this installation later in the document)
  • A Java Virtual Machine on the server/laptop/VM you are running the example on. I'd recommend Java 8 available here
  • Alternatively you could download a Virtual Box VM such as the "Hands-on Database Application Development" VM from here.Which will provide everything you need in a ready configured package.

I'm also going to assume that you'll be running on a unix type OS (Linux, Mac OS, Solaris, AIX). Although this will also work fine on windows (just need to convert some of the commands).

My setup


For my setup I have the following configuration.
Architecture@2x
In reality everything is running on my workstation but the the diagram above provides a working model

Install of ORDS

Assuming you aren't going down the pre installed Virtual Box VM mentioned in the previous section. The install of ORDS is very simple. After you've downloaded it to your VM/Server/Laptop all you need to do is unzip it and change into the directory . i.e.

In the directory you unzipped the download run ORDS. The first time you run this it will lead you through the install asking you for the connection to the database. I'd recommended skipping configuring APEX at this stage (by selecting 2 when asked) and running ORDS using http. Whilst these options might not be ideal for production environments they will enable you to get started.

After this completes hit control-c to exit the program. At this stage you should probably add a user with SQLDeveloper development support. I've named mine "appdev" but any name would be fine. We'll use this user later in the example. You can do this using the following command and supplying passwords.

And then restart ORDS. This time it won't prompt you for any passwords but simply run in the foreground


Thats all thats needed to install ORDS.

Install Database Tables


The next step is to create a user and tablespace to hold the tables. I've provided a simple script "create_user.sql" that will create a database user "SOE" and tablespace "SOE" to run it simply connect to the database with a user with DBA privilege. (NOTE : This could also be done inside of SQLDeveloper)
The script creates a user "soe" and and tablespace "soe" and then grants the user access to the tablespace. It finishes off by granting the user enough privileges to create tables, views and procedures
At this stage it's worth starting SQLDeveloper where the bulk of the work will be done. You'll need to create a connection to the Database .You'll do this from the connections tab as show below.

Oracle SQL Developer

Then fill in the details and hit connect

New - Select Database Connection

The next step is to create the example tables and test data. I've provided a simple script "create_tables.sql to do this for you. It will create one to hold the "product_information" that would be typically part of and existing application and other to hold the likes. This table is called "user_likes".


The "USER_LIKES" table is an Index Organised Table. Whilst the trivial amount of data we holding in this example won't benefit from his approach it would likely have an impact on production scale workloads. The script will insert a few sample rows as well.

To run this script copy and paste the SQL into the SQLDeveloper
Oracle SQL Developer - oracle12c2 - soe


Install REST Service Application Code



At this point we'll connect to ORDS from SQLDeveloper. You'll be using the username you used during the installation of ORDS. In my case this was "appdev". From the menu you'll need to view "REST Data Services" and select the "Development" option as shown below

Oracle SQL Developer

This will launch the REST development UI (by default on the left hand side of the screen) as shown below. Add a new connection and fill in the details as shown i.e. Server Path = "/ords" and Schema Workspace = "/soe/".

Affinity Designer

The next step is to run a script "utilities_module.sql" which will enable the schema to support ORDS and create the ORDS modules used by the application. This is purely for convenience. I'll describe the process of creating one of the modules in a blog shortly after this one. To install the REST modules simply copy and paste the code below into the SQLDeveloper session you've already created.

As shown below
Oracle SQL Developer - oracle12c2 - soe

After this script completes we can then download the modules into SQLDeveloper from ORDS's repository. The reason we need to do this is that we've simply imported the data into the ORDS repository and not made it visible to SQLDeveloper at this point. To achieve this all we need to do is

slice1@2x

This will display the modules we created by running the script. The two modules are
  • Products : This will simply retrieve the rows from the PRODUCTS table. It's not really necessary to explicitly declare this module as ORDS base functionality already has a REST API that could be used instead.
  • Utilities : This module contains three templates or services we can call to like a row in a table
    • get likes : Gets the total number of likes of a row by all users and the user specified in the parameter
    • like : Enables you to like a row if you haven't already
    • unlike : reverses a like operation by the user specifed if he's liked it before
You can drill into the details of each service by clicking on it and then clicking on the HTTP operation it uses in SQLDeveloper. The details of the service comprises of three sections (here we'll take a look at the "like" service).

The first tab list the SQL or PL/SQL that will implement the business logic. In this particular example we are using PL/SQL to first check if a user has liked a row by checking first the USER_LIKES table. If they haven't already liked the row we insert a new one and commit the transaction.

Oracle SQL Developer

The next tab lists the parameters that are passed to our logic or SQL statement. In our particular example we are passing three parameters that tell the procedure the table that is being liked, the unique identifier for the row (in our simple case it's assumed to be a string but could as equally be a rowid or number) and the user identifier (typically application specific but we'll assume it's something like an email).

Oracle SQL Developer

The final tab summarises some of the settings for the service and provides you with a url to call to test the service.

Oracle SQL Developer

You can test most "GET" REST calls pretty trivially within a browser put it takes just a little bit more for POST, PUT and DELETE. To test the POST call shown in the screen shot above we are going to use a command line utility called curl (installed by default on linux and MacOS). We'll also use a file that contains the json we want to send to the REST service. This file is called "test.json" and has the following contents


To call the REST service all we need to do is issue the following command


Obviously substitute the name of your server in the code. You should get a HTTP return code of 200. Indicating that the code successfully ran.

We can also test the GET REST call that returns the "like" we just made using the following command


You should get a small JSON snippet with the number of likes by everybody and the user "dominic.giles"

Calling the REST Services from a web page


Now that we've got the services up and running we can use a simple web page that displays all of the rows from the PRODUCT_INFORMATION table with a "thumb" to enable us to like the row. Like all of the code used in this blog you can find them all on my github repository.Let me start by saying that the web page we are going to create is a very simple example and would normally require a lot more validation and error handling before it would ever be considered for a production environment.

The web page we'll put together uses JQuery and a javascript library called "DataTables" (link here). However it's important to point out that implementing this functionality doesn't require these tools. It could have been as easily achieved using Oracle Application Express or a framework like Jet. But to simplify things this approach allows us to minimise the steps required and reduce the amount of moving parts.

The final app should look like this

slicescreen

When the form is first loaded the javascript calls the "GetLikes" REST service (once for each row displayed) which asynchronously returns the total count of "likes" a product has received and sets the colour of the them dependent on whether the current user has liked it. It will stay grey if the current user hasn't liked it and turn blue if they have. On clicking on a thumb the javascript checks whether the user has previously like it. If they haven't it calls the "Like" REST Call and call the "Unlike" REST call if they have.

The code is made up broken up into two files. One containing the HTML and the other the containing the javascript.



And the javascript



This really isn't the place to go into all of the details of the code but there are one or two things that are worth pointing out. The first is at the start of liketable.js file. You'll need to change the following entries to reflect your own environment. It's likely you'll only need to change the "hostport" variable to start with. The username is hardcoded in this example but in the real world it would be derived from the system context.

var $userName = 'dominic.giles';
var $tableName = 'product_information';
var $hostport = 'http://oracle12c2:8080';
var $uniqueIDColumn = "product_id";

The other piece of code worth looking at is the calls to the REST service. The code below uses the javascript ajax functionality to make an asynchronous call to ORDS which with then return the JSON response.

function getRowLikes(un, tn, rID) {
    var likeResult = $.ajax({
        url: $hostport + "/ords/soe/utilities/getlikes?table_name=" + tn + "&column_value=" + rID + "&user_id=" + un,
        type: "GET"
    });
    return likeResult;
}


Likewise the following shows the AJAX call to the like function. This call unlike the the previous one is performed synchronously but follows the same principals.

function likeRow(un, tn, rID) {
    $.ajax({
        url: $hostport + "/ords/soe/utilities/like",
        type: "POST",
        data: {"table_name": tn, "column_value": rID, "user_id": un},
        async: false
    });
}


Once you've made the changes to the hostname variable you should be able to open a browser and open the products.html file within it.

Over the coming days I'll knock up a little screen cast to accompany this blog.
Comments

Testing a new build of swingbench

I haven't released any updates to swingbench in a while and the reason was that I encountered a difficult to debug "concurrent modification" issue. Normally these are pretty straight forward to find if you are regularly modifying the code. However that's not how I work. I only modify the swingbench code base when I find the time which is generally when something else is cancelled at short notice… Or there's a massive issue with a release.

The problem I had this time around was I made a "quick fix" a couple of years ago to a section of the code that creates the various snapshots of data during a timed run. I always meant to come back and fix this "quick fix" when I found the time… Anyway, fast forward to October last year and I added some code that created "percentiles" for reporting for simple and full stats collection. This lead to some behaviour on the stats collection that created a near impossible to debug "concurrent modification exception". The only solution was to painstakingly go through all of the threaded code to find the issue. It turns out that the "quick fix" I made created a sublist of a section of the data structure as a "view" instead of a new copy. An amateurish mistake to make but I guess not uncommon if you maintain the code in the way I do….

So what I'm looking for is some volunteers to check that the code works as expected.

You can find the new build of the code here. You'll need Java 8 on the machine you're installing it on.

The steps needed to test the problem code are…

  • Unzip the swingbench zip file (swingbenchtest.zip)
  • Change into the bin directory (on Linux/Unix) or winbin directory (on windows)
  • Create a new swingbench schema using oewizard or shwizard. This will walk you through the install against your database (11.2 or 12.1)
  • Start up swingbench or charbench…
  • Set and test the connection to the database that you ran oewizard against
  • Set the Benchmark run time to a value i.e. 10mins,
  • Set the start record statistics to a value i.e. 2mins
  • Set the stop record statistics to a value i.e. 8mins

TimedStatsCollection

  • Start the benchmark run
  • When it finished check that stats/xml has been created in the output tab.

Alternatively do this in charbench with a command like

./charbench -cs //oracleserver/orcl -u soe -p soe -uc 100 -rt 0:10 -bs 0:02 -be 0:08


This should generate a results file in the directory you started it from.

Appreciate the time and effort for helping me out…

Alongside the fix to this difficult build I've also included the additional functionality in this build

  • Percentiles are now created for both "simple" and "full" stats collection
  • You can now specify "rt", "bs" and "be" in fractions of a second i.e. 0:00.30 (30 seconds). Previously is had a granularity of minutes
  • An initial JSON in the database benchmark framework has been created.
Comments

Update to cpumonitor... first of many

I've updated cpumonitor for the first time in a while. This brings the first of many release as I gradually drag the swingbench code base up to Java 8 conformance. This build now requires Java 8. The few changes I've made are
  • The record button now outputs a CSV as opposed a tab delimited file.
  • Removed unnecessary output in the "micro" version
  • Updated ssh libraries to latest version
You can find it here. Let me know if you have any issues.
Comments

Oracle, Pythons and Pandas... Oh My.

Over the last year I've been using Python more frequently and enjoying the experience. Some things are just simpler to do in Python than my go to language, Java. One area that makes things simpler is the extensive range of libraries and the simplicity of installing and maintaining them… This blog isn't intended to be an introduction to python and it's many libraries just a quick intro to those I seem to use more and more.

There are few libraries that are perfect for working with a database.

  • cx_oracle : This driver enables you to connect to the oracle database via Oracle's OCI Client. You'll need to install Oracle's instant or full client.
  • MatPlotLib : A powerful charting library enabling you to visualise your data in an unimaginalable number of ways
  • Pandas : An easy to use library for data analysis
  • Numpy : A powerful scientific computing package

Also one aspect that I'm not going to cover but that really takes Python from an expressive programming language to a collaborative to a tool for sophisticated collaborative data exploration and development is IPython Notebook. IT provides a means of writing up and executing live code that can be modified by other collaborators. I'll cover this in a future blog.

Setting up Python

Before we go any further we need to make sure we have python installed and the correct libraries. I'm using Python 2.7 but the code we are using should work fine with python 3.0. Most operating systems will have have python installed by default. If yours doesn't you can get it here. Next we need to insure that we have all of the correct libraries. To do this I use the python library manager "pip". Again I won't go into it's installation if you don't have it installed but you can find details here. After you've installed pip all you need to do to ensure you've got the correct libraries is to issue a command like


pip install matplotlib numpy cx_oracle pandas --upgrade

This will sort out all the dependencies for you. I've also used the "—upgrade" option to ensure we refresh any out of date libraries. If you get permission errors it's likey to be because you don't have the privilege to install the libraries into the shared system lib location. Either rerun the command as root or with sudo i.e.


sudo pip install matplotlib numpy cx_oracle pandas --upgrade

Or create a virtual environment

You can check to see that your libraries are installed by using the "pip list" command.

Code

For the impatient amongst you lets start with code and explain some of the details later

__author__ = 'dgiles'

import cx_Oracle
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style

def read_from_db (username, password, connectString, mode=None, save=False):

    if mode is None:
        connection = cx_Oracle.connect(username, password, connectString)
    else:
        connection = cx_Oracle.connect(username, password, connectString, mode)
    with connection:
        try:
            df = pd.read_sql_query("SELECT\
                                      wc.wait_class                    AS waitclass,\
                                      TRUNC(begin_time, 'MI')          AS sample_time,\
                                      round((wh.time_waited) / wh.intsize_csec, 3) AS DB_time\
                                    FROM V$SYSTEM_WAIT_CLASS wc,\
                                      v$waitclassmetric_history wh\
                                    WHERE wc.wait_class != 'Idle'\
                                          AND wc.wait_class_id = wh.wait_class_id\
                                    UNION\
                                    SELECT\
                                      'CPU'                   AS waitclass,\
                                      TRUNC(begin_time, 'MI') AS sample_time,\
                                      round(VALUE/100, 3)         AS DB_time\
                                    FROM v$sysmetric_history\
                                    WHERE GROUP_ID = 2\
                                          AND metric_name = 'CPU Usage Per Sec'\
                                    ORDER by sample_time, waitclass",
                                   connection)
            if save:
                df.to_csv('results.csv')
            return df
        except cx_Oracle.DatabaseError as dberror:
            print dberror

def read_from_file(filename):
    return pd.read_csv(filename, parse_dates=['SAMPLE_TIME'])


if __name__ == '__main__':
    style.use('ggplot')
    df = read_from_db(username='sys', password='welcome1', connectString='oracle12c2/soe', mode=cx_Oracle.SYSDBA, save=True)
    # df = read_from_file('results.csv')

    print df.head()

    pdf = df.pivot(index='SAMPLE_TIME', columns='WAITCLASS', values='DB_TIME')

    print pdf.head()

    pdf.plot(kind='area', stacked=True, title='DB Time over the last hour', color=['red', 'green', 'orange', 'darkred', 'brown', 'brown', 'pink', 'lightgreen', 'cyan', 'blue'])
    plt.show()

On running the code with a command like. Note : I saved my code to a file dbtime60min.py

python dbtime60min.py


You should see a chart like the following.

Pasted Graphic 1

So in a relatively short amount of code we can quickly produce a useful chart. And the beauty of the solution is that we could replace the relatively complex piece of SQL I used with some thing more trivial and Pandas does much of the heavy lifting to convert it into a useful chart that we can share with others.

An explanation of the code

There are a view areas that are worth understanding in the code and that highlight the easy of working with a a framework like Pandas.

I created a function "def read_from_db" that connects to the database (via the cx_oracle library) and then asks the Pandas framework to read the data back from the database.
df = pd.read_sql_query("SELECT\
                      wc.wait_class                    AS waitclass,\
                      TRUNC(begin_time, 'MI')          AS sample_time,\
                      round((wh.time_waited) / wh.intsize_csec, 3) AS DB_time\
                    FROM V$SYSTEM_WAIT_CLASS wc,\
                      v$waitclassmetric_history wh\
                    WHERE wc.wait_class != 'Idle'\
                          AND wc.wait_class_id = wh.wait_class_id\
                    UNION\
                    SELECT\
                      'CPU'                   AS waitclass,\
                      TRUNC(begin_time, 'MI') AS sample_time,\
                      round(VALUE/100, 3)         AS DB_time\
                    FROM v$sysmetric_history\
                    WHERE GROUP_ID = 2\
                          AND metric_name = 'CPU Usage Per Sec'\
                    ORDER by sample_time, waitclass",
                   connection)

The Pandas reads this data into a multi dimensional structure, much like the table we read this information from. And just like a database table Pandas enables us to sort and filter the data. Using a command like

print.info()

Shows us the details of the information we've just read in. i.e.


Int64Index: 549 entries, 0 to 548
Data columns (total 3 columns):
WAITCLASS      549 non-null object
SAMPLE_TIME    549 non-null datetime64[ns]
DB_TIME        549 non-null float64
dtypes: datetime64[ns](1), float64(1), object(1)
memory usage: 17.2+ KB

We can also see a sample of the data with "tail()" or "head()" i.e.

print.head()

results in

       WAITCLASS         SAMPLE_TIME  DB_TIME
0            CPU 2015-08-17 11:47:00        0
1         Commit 2015-08-17 11:47:00        0
2    Concurrency 2015-08-17 11:47:00        0
3  Configuration 2015-08-17 11:47:00        0
4        Network 2015-08-17 11:47:00        0

We can also select individual columns from this data set with a command like


wcdf = df['WAITCLASS'] print wcdf.head()


Which will give us


0 CPU 1 Commit 2 Concurrency 3 Configuration 4 Network


Instead of explicitly iterating through the data to find information we can also filter out just the relevant information i.e.

fdf = df[df['WAITCLASS'] == 'CPU']

Which will enable us to just select data from the dataset where the WAITCLASS column only contains 'CPU. Very similar to a SQL where clause i.e. "where WAITCLASS = 'CPU'. When we print the head of the fdf Data frame we get


print fdf.head()

  WAITCLASS         SAMPLE_TIME  DB_TIME
0        CPU 2015-08-17 12:38:00    0.001
9        CPU 2015-08-17 12:39:00    0.001
18       CPU 2015-08-17 12:40:00    0.000
27       CPU 2015-08-17 12:41:00    0.003
36       CPU 2015-08-17 12:42:00    0.000

Another capability of Pandas (and there are many and we've only touched on a few) is the ability to pivot the data. Now in this example you could make the case it would have been simpler to pivot the data in the database. But I'm doing it here to illustrate the point. All we need to do is to specify which columns will be the index (y axis) and which column(s) will be the column headers (x axis) and which column will be the value.


pdf = df.pivot(index='SAMPLE_TIME', columns='WAITCLASS', values='DB_TIME')

will turn this time series data


WAITCLASS SAMPLE_TIME DB_TIME 0 CPU 2015-08-17 12:47:00 0 1 Commit 2015-08-17 12:47:00 0 2 Concurrency 2015-08-17 12:47:00 0 3 Configuration 2015-08-17 12:47:00 0 4 Network 2015-08-17 12:47:00 0


into this pivoted data


WAITCLASS CPU Commit Concurrency Configuration Network \ SAMPLE_TIME 2015-08-17 12:47:00 0.000 0 0 0 0 2015-08-17 12:48:00 0.002 0 0 0 0 2015-08-17 12:49:00 0.001 0 0 0 0 2015-08-17 12:50:00 0.000 0 0 0 0 2015-08-17 12:51:00 0.000 0 0 0 0 WAITCLASS Other Scheduler System I/O User I/O SAMPLE_TIME 2015-08-17 12:47:00 0.000 0 0 0 2015-08-17 12:48:00 0.001 0 0 0 2015-08-17 12:49:00 0.002 0 0 0 2015-08-17 12:50:00 0.000 0 0 0 2015-08-17 12:51:00 0.000 0 0 0

The last step is to chart the data and this is achieved in a single command in Pandas.


pdf.plot(kind='area', stacked=True, title='DB Time over the last hour', color=['red', 'green', 'orange', 'darkred', 'brown', 'brown', 'pink', 'lightgreen', 'cyan', 'blue']) plt.show()

There's a couple of things to note. I've overwritten the default colour map to use colours that would be familiar to Oracle DBAs via the Enterprise Managers DB Time charts. And I also didn't use a colour map to ensure that CPU is always rendered in "green".

So just a quick example of the power of Pandas when used in conjunction with Oracle. I'll try and post a few more Python articles over the coming weeks.
Comments