Friday, August 22, 2008

IBM Exam 704 - Advanced DB2 DBA Certification



Taking advantage of an IBM certification promotion offered to business partners in Malaysia, I enrolled for Test 000-704 (Advanced DB2 DBA) a little less than a month ago. The test cost me only USD 30. My primary study material was the "Advanced DBA Certification Guide and Reference for DB2(R) Universal Database v8 for Linux, UNIX, and Windows", ISBN 0130463884. Overall the study experience was a pleasant one, because you get to dive into the rich technical details of the IBM DB2 database engine and be amazed by this world-class RDBMS.

Anyway, I think it is too naive to declare someone a certified Advanced DB2 DBA just for scoring 56% on 57 questions. Yes, the questions are broad and tricky, but it would be better if the number of questions in the test increased to, say, 300 :), testing every nuance of DB2.

For me, even though I passed the exam with flying colors as usual, I think I have only conquered maybe 35% to 50% of the total universe of DB2 Version 8 LUW, not to mention my lack of exposure to the DB2 for i5/OS and z/OS and Version 9 universes.

I will work harder. :-)






DB2 Stored Procedure Maintenance


If you ever thought you could create a bunch of DB2 stored procedures and leave them on the server until they are replaced by a newer version, without spending any effort maintaining them, you're wrong, dude/dudette.

A dynamic SQL statement is just as its name implies: dynamic. Dynamic in the sense that the SQL compiler processes the statement when someone executes it and optimizes it using the current statistics.

What about static SQL statements? Haha, does this start to make sense?

If it hasn't rung any bells yet, quickly go grab a DB2 book and start your revision.

A static SQL statement's access plan is generated and stored in the database at the moment you perform the binding, i.e. when the statement is compiled. This means the access plan is based on the statistics at that moment.

In a large organization with tens to hundreds of line-of-business servers, the DBAs usually don't really bother (or can't really know) whether they must perform maintenance on the application database objects.

Remember the DB2 Automated Maintenance tool in the DB2 Control Center? It only helps you back up, reorganize, and collect statistics for your databases. Frankly speaking, there are more maintenance needs than you can possibly imagine.

Just like the human body, which needs exercise, body building, and super-boosted tonics to maintain that drop-dead gorgeous figure to attract the opposite sex, any enterprise database deserves the same treatment.

I'm going too far, but you get the idea, :p

So, the meat of today's lesson is: YOU HAVE TO REBIND YOUR STORED PROCEDURES, especially those lengthy, multi-page SQL routines that involve plenty of data manipulation. One good example is a taxing month-end report generation SP.

To do this, you must either recreate the stored procedure, which I think is a stupid way to do it in the long run, or use the system procedure SYSPROC.REBIND_ROUTINE_PACKAGE, as in the example below:

CALL SYSPROC.REBIND_ROUTINE_PACKAGE('P','MYSCHEMA.MYBATCH1','ANY');

Of course, there are a few calling variants that take different sets of parameters, but the whole point is that you need to rebind the SP.

And... very important, in case you are new to this field: make sure you reorganize your tables (the REORG command) and collect the latest statistics (the RUNSTATS command) before you do the rebind, or it won't help much.
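The whole maintenance sequence can be scripted. Below is a minimal Python sketch, with hypothetical table and package names, that builds the CLP command sequence described above: REORG, then RUNSTATS, then the rebind call.

```python
# Sketch: build the DB2 CLP maintenance sequence (REORG -> RUNSTATS -> rebind)
# for a set of tables and SQL-procedure packages. Names are hypothetical.

def maintenance_commands(tables, packages):
    cmds = []
    for t in tables:
        cmds.append(f"db2 reorg table {t}")
        cmds.append(f"db2 runstats on table {t} with distribution and detailed indexes all")
    for p in packages:
        # 'P' rebinds the package of an SQL procedure; 'ANY' lets the
        # resolver pick conservative/any rebind semantics.
        cmds.append(f"db2 \"CALL SYSPROC.REBIND_ROUTINE_PACKAGE('P','{p}','ANY')\"")
    return cmds

for cmd in maintenance_commands(["MYSCHEMA.ORDERS"], ["MYSCHEMA.MYBATCH1"]):
    print(cmd)
```

You could feed this list to a scheduler or shell out to each command during a maintenance window.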

DB2. Simple, right?






Connecting DB2 LUW to DB2 for OS/390. Simple, right?



Assuming all the necessary configuration is done on the z/OS side, i.e. communication connectivity, authorizations, etc., it is blindingly simple to set up DB2 LUW to talk to DB2 for OS/390.

You just need to do the following in a DB2 Command Window (say, on Windows).



db2 catalog tcpip node NODE1 remote <hostname> server <port> ostype OS390

db2 catalog dcs database <local_name> as <host_dbname>

db2 catalog database <local_name> as <alias> at node NODE1 authentication DCS



OK, and if the utility packages have not been bound yet, you might hit errors when you try to run even simple commands against DB2 for OS/390. To fix this, change directory to <DB2_INSTALL_PATH>\bnd, and enter the following commands after you are connected to the host:


db2 bind @db2ubind.lst blocking all grant public
db2 bind @db2cli.lst blocking all grant public
db2 bind @ddcsmvs.lst blocking all grant public


Again, I will ask "Simple right?" ;-)





Using Microsoft Active Directory to Store DB2 Catalog Information



Another way to catalog many remote databases automatically is to use LDAP as the repository that stores and distributes the catalog information.

In DB2 V8, three LDAP directory servers are supported: IBM SecureWay Directory, Netscape LDAP Server, and Microsoft Active Directory. Of these, I will briefly talk about some crucial steps that eliminate obstacles when you try to use Microsoft Active Directory to publish, store, and distribute node and database catalog information.

Assume your domain is ibm.com, the machine host name is nerdyeddy, the IP address is 10.10.10.1, and LDAP is listening on port 389.

Firstly, fire up a DB2 Command Window (db2cmd).

Execute the following commands:


db2set DB2_ENABLE_LDAP=yes
db2set DB2LDAPHOST=10.10.10.1:389
db2set DB2LDAP_BASEDN=CN=nerdyeddy,DC=ibm,DC=com
db2stop force
db2start


The commands above configure the LDAP connectivity to AD.

Then, to register your DB2 server in AD, do this:


db2 register ldap as MYDB2 protocol tcpip


or if you need to register a remote DB2 server, do this:

db2 register db2 server in ldap as <nodename>
protocol tcpip
hostname <168.168.168.2>
svcename <50000>
remote <hostname>
instance <instance_name>

Note: You will need to replace the command parameters, the ones enclosed in angle brackets, accordingly.

If you were unable to run any or all of the above commands, due perhaps to an SQL3276N error, an SQL3278N error, or other funny problems, most probably you haven't run the secret setup yet :p

Assuming your DB2 is installed at C:\IBM\SQLLIB.

Do this in the command prompt:

regsvr32 C:\IBM\SQLLIB\bin\db2ads.dll
db2schex

Basically, db2ads.dll and db2schex properly configure the schema, metadata, and other settings in AD to accommodate the DB2-related object classes and properties.

OK. If your DB2 server is successfully registered in AD, then from now on the catalog information for any new databases you create will automatically be published to AD for distribution.

If you need to catalog previously created databases, use the CATALOG LDAP DATABASE command.

Other related commands include:
REGISTER LDAP AS
REGISTER DB2 SERVER IN LDAP AS
CATALOG LDAP NODE
CATALOG LDAP DATABASE
UNCATALOG LDAP DATABASE
UNCATALOG LDAP NODE
DEREGISTER DB2 SERVER IN LDAP NODE
REFRESH LDAP NODE DIRECTORY
REFRESH LDAP DB DIRECTORY
ATTACH TO

Good luck.




Validate Your DB2 Backup Image

Everyone knows that a disaster recovery plan is practically useless if you never really try the thing out, at least on a regular basis. Constant environment changes and infrastructure evolution can render your DR strategy flat at the precise moment you need it to restore your business operations.

The same applies to the DB2 backup images (versioned backups) that you take, maybe on a monthly basis, depending on the criticality of the data and business continuity requirements. So, what do you plan to do? Restore the images one by one, during your precious coffee breaks, to one of the available development machines? You must be joking! Amateurs who know little about DB2 might do that.

Stop fooling around and get serious. We need a tool to do the plumbing for us. Fortunately, DB2 does provide us with that tool.

DB2CKBKP

This tool is located at <DB2_INSTALL_PATH>\bin\db2ckbkp.exe (Windows).
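To see how you might script the validation, here is a small Python sketch that wraps db2ckbkp via subprocess. The image file name is hypothetical, and the checker program is passed in as a parameter so it can be substituted.

```python
import subprocess

def ckbkp_command(image_path, db2ckbkp="db2ckbkp"):
    """Build the db2ckbkp invocation for one backup image file.
    db2ckbkp may be a program name or an argv prefix list."""
    prog = [db2ckbkp] if isinstance(db2ckbkp, str) else list(db2ckbkp)
    return prog + [image_path]

def verify_image(image_path, db2ckbkp="db2ckbkp"):
    """Return True when the checker exits 0, i.e. the image passed the check."""
    proc = subprocess.run(ckbkp_command(image_path, db2ckbkp), capture_output=True)
    return proc.returncode == 0
```

A nightly job could walk the backup directory and call verify_image on every image, flagging any that fail before you ever need them for a restore.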

Have fun.

Handling Slowly Changing Dimensions in DB2

Excerpt from WWW about Slowly Changing Dimension:



The "Slowly Changing Dimension" problem is a common one particular to data warehousing. In a nutshell, this applies to cases where the attribute for a record varies over time. We give an example below:

Christina is a customer with ABC Inc. She first lived in Chicago, Illinois. So, the original entry in the customer lookup table has the following record:

Customer Key    Name         State
1001            Christina    Illinois


At a later date, she moved to Los Angeles, California in January 2003. How should ABC Inc. now modify its customer table to reflect this change? This is the "Slowly Changing Dimension" problem.

There are in general three ways to solve this type of problem, and they are categorized as follows:

Type 1: The new record replaces the original record. No trace of the old record exists.

Type 2: A new record is added into the customer dimension table. Therefore, the customer is treated essentially as two people.

Type 3: The original record is modified to reflect the change.
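To make the three types concrete, here is a small Python sketch that applies each strategy to the Christina record. The surrogate key 1005 and the original_* column naming are my own illustrative choices; Type 3 is conventionally implemented with extra "original"/"previous" columns on the same row.

```python
# Illustration of the three SCD handling types on the example record.
# Dimension rows are modeled as dicts: {"key": ..., "name": ..., "state": ...}

def scd_type1(rows, key, **changes):
    """Type 1: overwrite the record in place; no trace of the old value."""
    return [dict(r, **changes) if r["key"] == key else r for r in rows]

def scd_type2(rows, key, new_key, **changes):
    """Type 2: add a new row with a new surrogate key; the old row is kept."""
    current = next(r for r in rows if r["key"] == key)
    return rows + [dict(current, key=new_key, **changes)]

def scd_type3(rows, key, **changes):
    """Type 3: modify the row, preserving old values in 'original_*' columns."""
    out = []
    for r in rows:
        if r["key"] == key:
            r = dict(r, **{f"original_{k}": r[k] for k in changes}, **changes)
        out.append(r)
    return out

dim = [{"key": 1001, "name": "Christina", "state": "Illinois"}]
print(scd_type2(dim, 1001, 1005, state="California"))
```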




Well, you might be naive (talented?) enough to write your own framework to manage slowly changing dimensions in your environment; however, it is good to know there is a handy tool at your disposal to minimize the plumbing work for those data management scenarios.

As of IBM DB2 Version 8.1 FixPak 2, the MERGE statement is available.

The example below illustrates how MERGE can take data from a staging table and use it to update or create dimension members accordingly.


MERGE INTO
DWH.DIM_COMPANY AS TARGET
USING
(SELECT ID, NAME, PHONE FROM DWHSTAG.COMPANY) AS SOURCE
ON SOURCE.ID = TARGET.ID
WHEN MATCHED THEN
UPDATE SET (NAME,PHONE)=(SOURCE.NAME,SOURCE.PHONE)
WHEN NOT MATCHED THEN
INSERT (ID, NAME, PHONE) VALUES (SOURCE.ID, SOURCE.NAME, SOURCE.PHONE)
ELSE IGNORE;


Do you think life is better now?

Complete syntax of the DB2 MERGE statement

DB2: Changing the Statement Terminator Symbol

When you pass the -t command option to the DB2 CLP, the semicolon (;) delimiter is turned on, enabling you to enter a statement spanning multiple lines before submitting it. This is typically acceptable behaviour when you need to submit a single statement that spans multiple lines in the console.

What if you need to submit a large and complex stored procedure creation script to the CLP? You can save the code in a file and then use db2 -t -f myProc.sql.

Then you hit errors that don't make sense at all.

Most probably the query parser is confused between your statement terminator and your sub-statement terminators. Sub-statements, such as those embedded in your stored procedure body, are forced to use ; as their termination character. In this case, when the parser encounters the first embedded semicolon, it thinks the statement is ready for parsing. Ta-da, it screws up.

So, a better command would be "db2 -td# -f myProc.sql", assuming you are using the # symbol as your statement terminator.

Yet you might face another issue: needing multiple different statement terminators in the same CLP session.

So you decide to do this:

db2 -td#
select * from syscat.tables#
quit#

db2 -td$
select * from syscat.columns$
quit$

This example is trivial, but you get my point.

So, is there a better solution? Oh yeah, you can use one of the DB2 control options, in the form of:

--#SET TERMINATOR

For example:

db2 -t
SELECT * FROM SYSCAT.TABLES FETCH FIRST 1 ROW ONLY;
--#SET TERMINATOR #
SELECT * FROM SYSCAT.COLUMNS FETCH FIRST 1 ROW ONLY#
--#SET TERMINATOR $
VALUES (1)$

A similar approach can be adopted in Java DB2 programming by handling the directive as part of the script you send to DB2.
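As an illustration of what the CLP is doing, here is a Python sketch of a script splitter that honors --#SET TERMINATOR directives while leaving embedded semicolons inside procedure bodies untouched.

```python
def split_statements(script, terminator=";"):
    """Split a CLP-style script into statements, honoring --#SET TERMINATOR lines."""
    statements, current = [], []
    for line in script.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("--#SET TERMINATOR"):
            # Switch to the new terminator for all subsequent statements.
            terminator = stripped.split()[-1]
            continue
        if stripped.endswith(terminator):
            current.append(stripped[: -len(terminator)])
            statements.append("\n".join(current).strip())
            current = []
        else:
            # Not terminated yet: keep accumulating (embedded ';' stays intact).
            current.append(line)
    return statements
```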

IBM DB2 HADR Example

Here is an HADR configuration example that I did while preparing for my DBA certification. It has been a while since I performed these steps, so bear with me if there are any mistakes or errors in the description.


DB2(R) high availability disaster recovery (HADR) is a data replication feature that provides a high availability solution for both partial and complete site failures. HADR protects against data loss by replicating data changes from a source database, called the primary, to a target database, called the standby.


This exercise assumes two DB2 instances, named DB2INST1 and DB2INST2, residing on the same physical machine.

For configuring HADR, you will need a few pieces of information:

Update DB2 Registry Profile
DB2 Communication Protocol (DB2COMM)

Update Each Instance DBM Configuration
DB2 TCPIP Listening Port (SVCENAME)

Update Primary DB Archival Logging Configuration
Log Pages During Index Rebuild (LOGINDEXBUILD)
Index Creation Time (INDEXREC)
Retain Log (LOGRETAIN)

Update Each DB HADR Configuration
Local IP Address (HADR_LOCAL_HOST)
Local Service Name/Port (HADR_LOCAL_SVC)
Remote IP Address (HADR_REMOTE_HOST)
Remote Service Name/Port (HADR_REMOTE_SVC)
Remote DB2 Instance Name (HADR_REMOTE_INST)
Synchronization Mode (HADR_SYNCMODE)




For all intents and purposes, I installed a fresh copy of DB2 ESE Version 9 and chose NOT to create the default DB2 instance. A Windows environment is assumed.


Step 1: Create 2 DB2 Instances

Open up an instance of DB2 Command Window by typing "db2cmd" in Run dialog.

Create Instance 1 by typing "db2icrt DB2INST1"

Create Instance 2 by typing "db2icrt DB2INST2"


Step 2: Configure Instance 1

Type the following commands:

SET DB2INSTANCE=DB2INST1
db2set DB2COMM=tcpip
db2 update dbm cfg using SVCENAME 41001
db2start


Note: The TCPIP service port for this instance is 41001

Step 3: Configure Instance 2

Type the following commands:

SET DB2INSTANCE=DB2INST2
db2set DB2COMM=tcpip
db2 update dbm cfg using SVCENAME 42001
db2start

Note: The TCPIP service port for this instance is 42001


Step 4: Create the Primary DB in Instance 1

Type the following commands:

SET DB2INSTANCE=DB2INST1
db2 create db BANKFRS
db2 update db cfg for BANKFRS using LOGINDEXBUILD on
db2 update db cfg for BANKFRS using INDEXREC RESTART
db2 update db cfg for BANKFRS using LOGRETAIN on

Note: This step will turn the DB into archival logging mode


Step 5: Create a backup of Primary DB

Type the following commands:

db2 BACKUP DB BANKFRS TO C:\TEMP

Note: This step will backup the BANKFRS db into C:\TEMP


Step 6: Restore the backup BANKFRS as Standby database

Type the following commands:

SET DB2INSTANCE=DB2INST2
db2 RESTORE DB BANKFRS FROM C:\TEMP

Note: This step will restore the database as BANKFRS in the second DB2 instance and place it into roll forward pending state.

Note: If you roll forward BANKFRS now, you will not be able to initialize it as the standby database.


Step 7: Configure the HADR settings for the Primary DB in DB2INST1

Type the following commands:

SET DB2INSTANCE=DB2INST1
db2 update alternate server for database BANKFRS using hostname localhost port 42001
db2 update db cfg for BANKFRS using hadr_local_host localhost
db2 update db cfg for BANKFRS using hadr_local_svc 41005
db2 update db cfg for BANKFRS using hadr_remote_host localhost
db2 update db cfg for BANKFRS using hadr_remote_svc 42005
db2 update db cfg for BANKFRS using hadr_remote_inst DB2INST2
db2 update db cfg for BANKFRS using hadr_syncmode SYNC
db2 update db cfg for BANKFRS using hadr_timeout 120

Note: Primary BANKFRS HADR service port will be 41005.
Note: Standby BANKFRS HADR service port will be 42005.
Warning: You cannot use the Instance TCPIP port (SVCENAME) or next port number (SVCENAME+1) as the HADR service port. This is by design.


Step 7b: Configure the HADR settings for the Standby DB in DB2INST2

Type the following commands:

SET DB2INSTANCE=DB2INST2
db2 update alternate server for database BANKFRS using hostname localhost port 41001
db2 update db cfg for BANKFRS using hadr_local_host localhost
db2 update db cfg for BANKFRS using hadr_local_svc 42005
db2 update db cfg for BANKFRS using hadr_remote_host localhost
db2 update db cfg for BANKFRS using hadr_remote_svc 41005
db2 update db cfg for BANKFRS using hadr_remote_inst DB2INST1
db2 update db cfg for BANKFRS using hadr_syncmode SYNC
db2 update db cfg for BANKFRS using hadr_timeout 120

Note: Primary BANKFRS HADR service port will be 41005.
Note: Standby BANKFRS HADR service port will be 42005.
Warning: You cannot use the Instance TCPIP port (SVCENAME) or next port number (SVCENAME+1) as the HADR service port. This is by design.





Now, you are ready to test the configuration.


Step 8: Start HADR on Standby DB IN DB2INST2

SET DB2INSTANCE=DB2INST2
db2 START HADR ON DB BANKFRS AS STANDBY


Step 9: Start HADR on Primary DB IN DB2INST1

SET DB2INSTANCE=DB2INST1
db2 START HADR ON DB BANKFRS AS PRIMARY


Step 10: Take over the Primary Role to Standby Instance

SET DB2INSTANCE=DB2INST2
db2 TAKEOVER HADR ON DB BANKFRS


Step 11: Take back the Primary Role from Standby Instance

SET DB2INSTANCE=DB2INST1
db2 TAKEOVER HADR ON DB BANKFRS

Step 12: Stop the HADR service on Primary DB

SET DB2INSTANCE=DB2INST1
db2 STOP HADR ON DB BANKFRS


Step 13: Stop the HADR service on Standby DB

SET DB2INSTANCE=DB2INST2
db2 deactivate db BANKFRS
db2 STOP HADR ON DB BANKFRS



General Notes:

You can invoke the HADR wizard from the DB2 Control Center by right-clicking on the specific database and selecting "High Availability and Disaster Recovery" -> Setup.

You can monitor the HADR status on the database by using the database snapshot monitor. "db2 get snapshot for database on BANKFRS"

Passed IBM DB2 Certification Exam 701

Officially, I have never held a full-time DBA job in any of my previous employments; however, this hasn't stopped me from getting a DBA certification or two, because of my hunger for knowledge and my frustration at those moments when the intricacies of database technical details hindered my architecting and design process.

Besides all the formal knowledge and information I acquired, one of the most important takeaways is a better understanding of the scope, responsibilities, and difficulties that real DBAs face in their daily operational routines, and of what happens when application development needs conflict with the interest of maintaining corporate data policy and security auditing.

In the enterprise application paradigm, the only way a solution architect can architect a better solution is to completely understand every piece of the jigsaw puzzle and blend them together into a coherent whole, in the right way.

That's my motivation. Now let me share my preparation process.

1.) Read up the IBM DB2 701 Preparation Materials from IBM

Nothing much to say here.

2.) Read up DB2® Universal Database™ v8 for Linux®, UNIX®, and Windows® Database Administration Certification Guide, by George Baklarz, Bill Wong

You can opt to skip the last part, regarding application development, though there is no harm in taking a look at it. That part is more relevant if you are taking Exam 703.

3.) Do at least one project dealing with DB2 DBMS

4.) Do your own exploration and experiments with some of the concepts in the materials, like HADR, roll forward, etc.

5.) Get enough coffee and sleep. Trade off between these two important constraints :-)


Lastly, I'd like you to know that the DB2 V9 series of examinations is already available, and you should jump on the V9 bandwagon if your company or projects are using it, or potentially will. I stuck with V8 because that's the version bundled with IBM DB2 DWE V9.1, and it's the version of DB2 I have dealt with the most.

P/S: I got a 100% score in this exam. Cheers. I finally avenged my 98% score in Exam 700. :EVIL:

DB2 Infinite Active Log Space

In DB2, when a database is created, three log files are allocated by default. They are known as primary log files. In a Linux/UNIX environment, each log file is 1000 * 4 KB pages by default, whereas on Windows it is 250 * 4 KB pages.

By default, up to two secondary log files will be created if needed, and their size will equal that of each primary log file used. However, the total number of secondary log files allowed is also configurable (via the logsecond database configuration parameter).

You might think you can avoid running out of log space by configuring a database to use a large number of secondary log files. However, the maximum number of secondary log files allowed is 254, and if the size of your log files is relatively small, you can still run out of log space quickly when transaction workloads become heavy.

You should also avoid allocating a large number of secondary log files if possible, because performance is affected each time a log file has to be allocated. Ideally, you should allocate enough primary log files to handle most situations, then use just enough secondary log files to handle peak times in transaction workloads.

If you are concerned about running out of log space and want to avoid allocating a large number of secondary log files, you can configure the database to use what is known as infinite logging. To enable infinite logging, simply set the userexit and logsecond database configuration parameters to YES and -1, respectively.
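The log-space arithmetic above can be sketched as follows (sizes in 4 KB pages; logsecond = -1 denotes infinite logging as described):

```python
# Maximum active log space = (logprimary + logsecond) * logfilsiz * 4 KB.
# With logsecond = -1 (plus userexit = YES on V8), logging is "infinite":
# secondary logs are allocated on demand and full logs are archived.

PAGE_SIZE = 4096  # DB2 log pages are 4 KB

def max_active_log_bytes(logfilsiz_pages, logprimary, logsecond):
    if logsecond == -1:
        return float("inf")  # infinite logging: no fixed upper bound
    return (logprimary + logsecond) * logfilsiz_pages * PAGE_SIZE

# Windows defaults: 3 primary logs of 250 pages, plus 2 secondary logs
print(max_active_log_bytes(250, 3, 2))  # 5 * 250 * 4096 = 5120000
```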

UPDATE: IBM DB2 Data Warehouse Edition Password Maze

A new finding on the Alphablox wsadmin.password parameter in /server/AlphabloxAnalytics/server.properties: after you replace wsadmin.password.protected with wsadmin.password to change the password, you need to make some (dummy) change on the Administration page in the Alphablox console for the plain-text password to be automatically encrypted by Alphablox. Restarting the applications or the server is not necessary.

DB2 Audit Facility for Dummies

IT security auditors came and approached me, asking about DB2 user query activities, security events, and blah blah. Let me see... oh yeah, we've got application-specific audit trails. The HR system has its own, ERP too, not forgetting that small little ETL program the IS people wrote. Of course, having a domain-specific audit facility is not a sin, and it is standard practice. However, DB2 provides an audit facility that is generic enough to capture almost any event that happens in your databases.

Check out this DB2 program, located at <DB2_INSTALL_PATH>\bin

db2audit


Before you start to use db2audit, you should configure the AUDIT_BUF_SZ parameter in the DBM configuration file. By the way, db2audit is controlled at the instance level. A non-zero AUDIT_BUF_SZ value indicates a multiple of 4 KB: the size of the buffer used before the audit records are written to disk.

Then you can check the current db2audit settings using this command:

db2audit describe


You will see something like below:



DB2 AUDIT SETTINGS:

Audit active: "FALSE "
Log errors: "TRUE "
Log success: "TRUE "
Log audit events: "TRUE "
Log checking events: "TRUE "
Log object maintenance events: "TRUE "
Log security maintenance events: "TRUE "
Log system administrator events: "TRUE "
Log validate events: "TRUE "
Log context events: "TRUE "
Return SQLCA on audit error: "TRUE "

AUD0000I Operation succeeded.




For example, suppose I just want to log audit events for authentication and authorization, both successes and failures, and roll back the application if the system is unable to generate the audit logs, perhaps due to a system failure. Then I use this command:



db2audit configure scope checking,validate status both errortype audit


I will get the results below when I submit "db2audit describe" again:


DB2 AUDIT SETTINGS:

Audit active: "FALSE "
Log errors: "TRUE "
Log success: "TRUE "
Log audit events: "FALSE "
Log checking events: "TRUE "
Log object maintenance events: "FALSE "
Log security maintenance events: "FALSE "
Log system administrator events: "FALSE "
Log validate events: "TRUE "
Log context events: "FALSE "
Return SQLCA on audit error: "TRUE "

AUD0000I Operation succeeded.


You can enumerate the list of possible values for SCOPE, STATUS, and ERRORTYPE by just typing "db2audit" and submitting it.

After configuring the audit facility, start it with:


db2audit start


To stop the facility


db2audit stop



And whenever you set AUDIT_BUF_SZ to a non-zero value, you should also use


db2audit flush


for writing the buffer to disk.

Two output file formats are supported by the native extraction: flat and delimited.

Use the "db2audit extract" command to extract the logs. For example, if I need the audit records for authentication and authorization events, in comma-delimited format, for database PROD, then I use the command below:


db2audit extract delasc delimiter , category checking,validate database PROD


This generates a list of files with the .DEL extension in the <INSTANCE_DIRECTORY>\security folder. You can load the files into a database using the LOAD or IMPORT utility.

The audit file grows over time, and you need to perform housekeeping on it occasionally. To remove all audit records, use


db2audit prune all


Or more likely you will want to remove records prior to certain date.


db2audit prune date YYYYMMDDHH


E.g. db2audit prune date 2007050100 will delete all records dated prior to 1 May 2007, 00 hours.
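The YYYYMMDDHH cutoff logic can be illustrated with a short Python sketch (the pruning itself is of course done by db2audit; this just shows how the argument is interpreted):

```python
from datetime import datetime

def prune_cutoff(date_arg):
    """Parse the YYYYMMDDHH argument of 'db2audit prune date'."""
    return datetime.strptime(date_arg, "%Y%m%d%H")

def records_to_prune(timestamps, date_arg):
    """Return the audit record timestamps strictly before the cutoff."""
    cutoff = prune_cutoff(date_arg)
    return [t for t in timestamps if t < cutoff]

print(prune_cutoff("2007050100"))  # 2007-05-01 00:00:00
```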


Lastly, some additional information you might need to be aware of:


  • Only SYSADM group members can perform auditing actions

  • Audit Configuration File located at <INSTANCE_DIR>\security\db2audit.cfg, and it is in binary format

  • Audit Log File located at <INSTANCE_DIR>\security\db2audit.log, and it is in binary format

  • By setting DB2INSTANCE environment variable, you can configure audit for different DB2 instances

    DB2 Restore DB Command

    Personally, I prefer to use the RESTORE DB command instead of the Restore Database wizard in the Control Center. I don't know why; maybe I'm easily confused by buttons and drop-downs. ;-)

    Here I will run through a simulation of backing up a database and restoring it under a different name in a separate DB2 instance residing on the same physical machine. A similar scenario would be replicating databases across development, testing, and production.

    First, fire up the Command Window (db2cmd).

    Create the Development instance:

    db2icrt DEV


    Start the DEV instance


    SET DB2INSTANCE=DEV
    db2start


    Create the sample database


    db2 CREATE DB DEVDB


    Check out the list of tablespaces in the database; this is important later, when you need to redirect the creation of the tablespace containers.


    db2 connect to DEVDB
    db2 list tablespaces



    Back up the sample DEVDB database. This will create a folder DEVDB.0 under C:


    db2 BACKUP DB DEVDB TO C:


    Create the Production instance:

    db2icrt PROD


    Start the PROD instance


    SET DB2INSTANCE=PROD
    db2start


    Restore the backup DEVDB under C: to PRODDB in PROD instance.



    db2 restore db DEVDB FROM C: INTO PRODDB redirect
    db2 set tablespace containers for 0 using (path 'C:\container\tspace00c1')
    db2 set tablespace containers for 1 using (path 'C:\container\tspace01c1')
    db2 set tablespace containers for 2 using (path 'C:\container\tspace02c1')
    db2 restore db DEVDB continue



    Try connecting to the new PRODDB restored from DEVDB:


    db2 connect to PRODDB



    The trick here is to specify a separate set of container paths in the RESTORE DB command. The number of SET TABLESPACE CONTAINERS statements depends on the number of tablespaces used by your DB. By default there are only three: SYSCATSPACE, TEMPSPACE1, and USERSPACE1.


    To clean up the simulation, do the following:


    set DB2INSTANCE=DEV
    db2stop force
    set DB2INSTANCE=PROD
    db2stop force

    db2idrop DEV
    db2idrop PROD

    Manually remove C:\DEV, C:\PROD, C:\DEVDB.0 and C:\container

    IBM Data Warehouse Edition DWH Password Maze

    Stringent user account security policies in the network domain can cause damaging maintenance headaches in a deployed IBM DWE multi-server environment. User account information is all over the place: in your DB2 services, the WAS server, Alphablox, and so on. The day the user passwords expire or a required account is disabled will be the day the DWE solution faces a total outage. Well, maybe I'm just exaggerating.

    Where do you update the user credentials in a DWE environment when such a need arises?

    Briefly speaking, in at least the following locations:

    1. DB2 Windows services, assuming Windows environment

    The Log On As account for each DB2 service needs to be updated.


    2. Websphere Global Security Setting, assuming using LocalOS repository

    This can be tricky. The easiest way is to update the password before you shut down the WAS server. If the server is already shut down and you didn't manage to update the password, you won't be able to start it again because of authentication errors. In that case, you have to manually disable WAS global security by changing the "enabled" attribute of the security:Security XML element to false in the security.xml file located in /config/cells/Cell. Then start the server, update the password in the LocalOS settings, and turn Global Security back on by checking the option in the WAS Admin Console.

    3. Data Sources defined in DWE Admin Console

    Data Sources used by DWH application processes, which are not attached to WAS data source, must be updated.

    Before you can perform this, you need to update the J2C user password in WAS for Admin Console to be able to connect to its repository. (Item 6)

    4. Data Sources defined in Alphablox Admin Console

    Usually this will be data sources for Alphablox cubes to retrieve IBM Cube Views meta data.

    5. WAS account used by Alphablox for management

    Alphablox uses a WAS user credential to connect to WAS and manage Alphablox applications in WAS. This piece of information is located in the Alphablox repository, /servers/AlphabloxAnalytics/server.properties.

    Replace the line ws.admin.password.protected with ws.admin.password=<your_password_in_plaintext>

    The issue here is that the new password will be in clear text. I have read in some materials that the password is encoded again the next time the server restarts; however, I don't see that happening in my environment.

    6. WAS J2C Authentication entries

    7. WAS JNDI Data Sources, assuming not using J2C authentication

    8. Optionally, WAS Windows Services


    Hope this is helpful to you.

    Alphablox Cube Security

    I get a real headache when a customer wants to impose security constraints on an Alphablox solution. Here is a simple security problem I needed to solve.

    A local bank in Malaysia has 14 main branches throughout the country, one main branch per state. There are at least two groups of users with different levels of data access. One group is country-wide users, who can view the data for all the states. The other group is the state managers, who can only see their own slice of the cake.

    My first attempt was to define a user property called BelongingState and assign it the state code the user belongs to, or "ALL" for country-wide users. Then I used Java code to dynamically construct the MDX for the DataBlox. I thought this would solve my problem, but some user actions cause an automatic rewrite of the MDX (this seems to be the default behavior). For example, the Show Siblings action displays all members on the same level, which totally defeats my aim of restricting the view by Malaysian state. Another is the Drill Up action. I also found that the Member Filter dialog and the Drill Down action (there are five types of Drill Down available) can expose data.

    Due to a lack of time to dig deep into the Alphablox object model, I reverted to a quick resolution: use the removeAction attribute to disable Drill Up, Member Filter, and Show Siblings for state-level users. At the same time, I wrote a filter on the drill-down event to check the drill-down option and prevent drilling down into inappropriate data areas. This cuts off a lot of interactivity for users, but for the sake of security, some trade-offs have to be made.

    Giving some credit to the developers who wrote the Alphablox framework, I believe there should be some way to disable the MDX rewrite on user activity, thus preventing the underlying data set from changing. One thing to note is that if you are using MSAS or Essbase as the cube source, you can rely on the native security control features of those cube engines; in particular, you can use the MemberSecurity tag for this purpose. For Cube Views-based cubes, it might be possible to control the view at the database level.

    Back to square one. My point in bringing this topic up is that it is important to plan security requirements around the out-of-the-box security features provided by Alphablox. For my simple security scenario, I personally think it is more practical to create 14 identical cubes, one per state. By controlling which cube users access in their reports, you can be sure the data view doesn't violate the security expectations, while retaining powerful interactive actions such as Member Filter and Drill Up. Things can get pretty ugly and complicated when the security requirements span more than one dimension; in that case, creating a separate cube for each data view might not be practical.

    Lastly, localization is another aspect of a global business intelligence application that can be as tricky as security to implement in an Alphablox solution. I really have to applaud Microsoft Analysis Services (MSAS) for incorporating security and localization so well into their cubes. Maybe this is what differentiates the market leader from the players.

    Alphablox Features Wish List

    Writing to Santa-IBM,

    Undeniably, the Alphablox architecture and component model present a very flexible and powerful way of incorporating analytic capability into whatever web-based applications your enterprise currently maintains.

    "With great power comes great responsibility" is very true indeed. To be even 80% productive with an Alphablox solution, the architect and developer face a very steep learning curve, especially if their background in design patterns, J2EE web applications, and JSF-style frameworks is not strong.

    To put it simply, I see Alphablox as a conglomeration of a JSF-like component model (there are other JSF competitors, but I won't mention them here since they are not part of the standard J2EE stack) and an AJAX suite of technology.

    What I like about JSF is not just its wonderful architecture and well-defined component framework. Most important is the promised upcoming tool support that allows you (developer or designer) to visually create, debug, test, and deploy JSF components. Without this support, it's like a .NET web developer using a plain code editor to create web pages. WYSIWYG? Forget it.

    The problem with Alphablox (as of 8.4.0.1) is that this tool support is missing.
    I don't get a report designer in DWE Design Studio. I don't get a testing environment for my custom-made UI Blox. I can't even test my report without actually putting it on the server and pressing F5. Of course, some of you might argue, "Please spend some $$ to buy WSAD 6, cheapskate!" Okay, suggestion accepted. But the point is that DWE, as an integrated platform promising end-to-end BI, should at least provide minimal report development support. Look at Microsoft SSRS: although its report templates are quite limited, at least I can drag a text box onto the report or define the styles of report contents visually. ;-)

    Here are some constructive suggestions for the IBM Alphablox team:

    * Provide new project types in Design Studio that cover:
    1. Alphablox Relational Report Creation (with tons of wizards, of course)
    2. Alphablox Analytics Report Creation (with tons of wizards, of course)
    3. Alphablox Cube (with real-time capabilities to see cube data, etc.)
    - Trust me, Query Builder is really not enough
    4. Alphablox Custom Component/UI Component Creation

    I wish that next time, when I want to customize the behavior of the MemberFilterBlox to inform users that their selection is invalid due to constraints such as security, I only need to:

    1. Fire up my design studio
    2. Open the Alphablox project that contains the web page with the specific PresentBlox/MemberFilterBlox,
    3. Double-click on the Blox to bring up a selection list of events to listen to,

    OR

    3. Right-click on the Blox, go to "Customize Event Handler", then "XXX event"

    4. Select the event, and have the handler method skeleton generated for me to put in my code.

    All wiring between the event handler and the framework event pipelines should be handled internally by the generated boilerplate; I don't want to be bothered with that.

    5. Lastly, from the Tool Palette, drag a Message Box dialog onto the editor

    6. Have the skeleton server-side message box code generated automatically, so I just need to fill in the title, message, etc.

    7. Press a button/key to preview with live data.


    So, compare my imagined workflow to the current way of doing the same thing: which one is more developer-friendly and productive, you say?

    Even better, convert all the Blox components to JSF-compliant components, or provide a JSF wrapper for them. This would allow any JSF-supporting IDE to develop Alphablox applications.

    Enable Commenting (Comments)

    An excerpt from Alphablox 8.4 documentation:


    CommentsBlox allows you to provide cell commenting (also known as cell annotations) functionality to your application. In addition, you can use CommentsBlox for general commenting that are not tied to any other Blox. For example, you can allow users to add comments to a site, an application, a report, or a Web page.

    Comments are stored in a JDBC accessible relational database. Supported databases include IBM(R) DB2(R) UDB, Sybase, Microsoft(R) SQL Server, and Oracle. This data source needs to be defined to DB2 Alphablox. DB2 Alphablox provides a Comments Management page under the Server link in the DB2 Alphablox Administration tab that lets you specify the relational data source to use for storing comments. From that page, you can create "collections" (data tables) to store comments. For cell-level comments, you will need to specify the multidimensional data source used in your GridBlox, the cube to use (for Microsoft Analysis Services), and the dimensions to include. For general comments, you only need to specify the name.


    Here is what you need to do to get a feel of the built-in CommentsBlox in Alphablox.

    First, go to the Alphablox Administration site, which by default should be accessible at http://yourserver:9080/AlphabloxAdmin/

    From the Administration main tab, you should land on the General page by default. Among others, there is a Runtime Management section. Click the Comments link under this section. A pop-up window should appear, pointing to http://yourserver:9080/AlphabloxAdmin/comments/commentsAdmin.jsp.

    Here you define the comments collection (sort of like a repository to store a particular set of comments) and perhaps customize the set of fields used for commenting.

    Choose a data source that you already defined elsewhere, enter the security credentials, and click the Connect button. All existing collections should appear in the list provided.


    Click the Create button to create a new collection. Fill in the details on the form provided; in particular, you need to specify the dimensions on which to enable comments and the fields required for a comment entry.



    Click Save once you're done.

    Okay, that's it for the necessary configuration.

    In your Blox programming, you can do the following:

    1. Enabling commenting for GridBlox

    ...
    <blox:data>
        <blox:comments collectionName="YourCommentCollectionName"
                       dataSourceName="TheDataSourceUsedForCollection" />
    </blox:data>
    ...

    ...
    <blox:grid commentsEnabled="true" />
    ...

    Then, when you right-click a grid cell, an option called "Comments" will appear.


    2. Use com.alphablox.blox.CommentsBlox and the related com.alphablox.blox.comments.* classes

    This allows you to implement the commenting feature for your relational reports or any other general usage. It requires programming, though.

    IBM DB2 Exam 000-700

    I did it in reverse order: I took exam 000-705, yet I still haven't taken 000-700. It's time now to bridge the gap and obtain the official title of BI Designer.

    Following are the steps I took in preparation:



    1. Read the DB2 Family Fundamentals guide from IBM

    A six-chapter guide. Clear and concise; enough to help you pass with above 75%.

    2. Read the eLearning Crammer course material from a colleague

    It consists of some sample questions plus notes that add to step 1. If you read this, 80% and above shouldn't be difficult.


    3. Read a 400-page book

    DB2® Universal Database™ V8.1 Certification Exam 700 Study Guide
    By Roger E. Sanders

    Must read if you plan to get 85% and above.

    4. Use DB2 whenever possible

    The SQL section constitutes 31% of the exam. Practice makes perfect. I didn't really need this step since I'm already quite good at it. :p

    Must do if 90% and above is your target.

    Note: no advanced SQL is covered in the exam. No recursive SQL, you bet!



    My result sheet




    Some of the questions are pretty tricky. I shall not post the real questions here, but just to give you some clues, here is a similar one:

    (Not REAL questions)

    1.) If you have to build a DB2 application on XXX OS and it needs to access a DB2 server on YYYY OS when deployed, which of the following DB2 products is the minimum you need?

    a. DB2 UDB Personal Edition
    b. DB2 UDB Personal Edition with DB2 Connect Enterprise Edition
    c. DB2 Personal Developer Edition
    d. DB2 Universal Developer Edition

    The answer is C. Because you need to build a DB2 application, you need the SDK toolkit and at least DB2 UDB PE for testing, so C is the minimum. D could be the answer if the question were not about the MINIMUM, since DB2 UDE includes every DB2 PDE component. A and B are definitely not the answer since they don't include the SDK in the package.

    And another typical kind of question:

    2.) What should you catalog for a client to access a DB2 host database?

    Answer: You need to catalog the remote server in the node directory, the remote database in the system database directory, AND the remote database in the DCS directory, since you need DB2 Connect to access a host database. You need to understand the parameters used in these CATALOG commands.
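    For illustration, here is roughly what that catalog sequence looks like from the DB2 command line processor. The node name, hostname, port, and database names below are made-up placeholders; adapt them to your own gateway setup:

```sql
-- 1. Catalog the remote server (or DB2 Connect gateway) in the node directory
CATALOG TCPIP NODE hostnode REMOTE myhost.example.com SERVER 446;

-- 2. Catalog the host database in the DCS directory
--    (DSNLOCN stands in for the host's location name)
CATALOG DCS DATABASE hostdb AS DSNLOCN;

-- 3. Catalog the database in the system database directory, pointing at the node
CATALOG DATABASE hostdb AT NODE hostnode AUTHENTICATION DCS;
```

    Port 446 is the conventional DRDA port; substitute whatever your host listens on.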


    Good luck to you folks taking this exam.

    IBM DB2 Exam 000-705

    To earn the title of IBM Certified Business Intelligence Designer, I have to pass two exams: 000-700 and 000-705. I already passed the 000-705 BI exam during an IBM Malaysia sponsored workshop (it's free for IBM business partners).

    Here is a summary of my 000-705 exam preparation and experience.


    1. 3 days of IBM DWE V9 bootcamp

    Hmmm, not that relevant to the 705 exam; it felt more like IBM trying to equip its business partners with presales skills for the new DB2 DWE. It covers topics like the new Design Studio, Intelligent Miner, cube building, etc.

    2. 1 day of 705 exam workshop

    It is like a last-minute top-up of the knowledge pertaining to the 705 exam objectives. The instructor covers the test objectives one by one and goes through some sample questions.

    3. Self study

    I got myself a copy of the study guide for version 7 of the BI Designer exam from the IBM web site; most of the concepts and theories are still applicable.

    I also found a few relevant excerpts from the book
    "Business Intelligence for the Enterprise",
    ISBN: 0-13-141303-1

    This book is only 240 pages and tells general "stories" about Business Intelligence. Personally, I didn't find it that useful for the exam, but it is still something interesting to read.





    I scored 87.5%, answering 49 out of 56 questions correctly.
    I hit a major bump in the "Business Intelligence Data Sourcing/Movement" section, which mostly had questions on replication and on types of data warehouse processing (full, incremental, snapshot, etc.).

    705 is not a technical paper. If you have decent experience in data warehousing and business intelligence, then the questions on terminology, requirements, and architecture are a piece of cake.

    Again, it is easy to pass, but hard to score and excel.

    DB2 How to Empty a Table

    In a data warehouse environment, there is usually a need to clear out the contents of staging tables to prepare for a fresh set of extracted source data. You might also want to housekeep some historical aggregations on a predefined schedule, which involves copying data from one table to another and subsequently removing all data from the source table. These are just some of the examples that boil down to the need to efficiently "truncate" a table.

    Microsoft SQL Server and Oracle DBAs definitely enjoy the luxury of built-in table truncation, through a command like "TRUNCATE TABLE YourTableName".
    Fine, but does IBM DB2 UDB, the so-called most scalable and best-performing RDBMS, provide such an option?

    Before that, let me evaluate some options for removing rows from a table.

    Option A:

    DELETE FROM YourTable

    Well, you can delete all rows using this statement. However, when a lot of records are involved, transaction logging causes significant performance degradation. Still, this option is acceptable if your application requires recoverability of the deleted rows.
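    If you need logging for recoverability but your transaction log can't hold the whole delete in one unit of work, a common middle ground is deleting in committed batches. A sketch using the well-known DB2 deletable-fullselect idiom; the batch size of 10000 is an arbitrary choice:

```sql
-- Delete a bounded chunk of rows, then commit to release log space.
-- Re-run (e.g. in a loop from a script) until SQLCODE +100 (no rows found).
DELETE FROM (
    SELECT 1 FROM YourTable FETCH FIRST 10000 ROWS ONLY
);
COMMIT;
```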


    Option B:


    (Assume this is within the same transaction)
    ALTER TABLE YourTable ACTIVATE NOT LOGGED INITIALLY;
    DELETE FROM YourTable;


    You managed to escape the bad luck of doing a lot of transaction logging. But wait a minute: constraint checking is still in force (check for yourself by explaining a DELETE FROM statement).


    Option C:


    (Assume this is within the same transaction)
    SET INTEGRITY FOR YourTable OFF;
    ALTER TABLE YourTable ACTIVATE NOT LOGGED INITIALLY;
    DELETE FROM YourTable;


    Again, you managed to skip the logging, and also check constraint and referential constraint checking, datalink integrity checking, and the generation of values for generated columns. Primary/unique key constraints are still enforced.

    Option D:


    LOAD FROM /dev/null OF DEL REPLACE INTO YourTable


    This is by far the most common workaround I have seen for table truncation. It uses the LOAD utility on /dev/null to simulate loading (with REPLACE) no data into the designated table. The same concept works in a Windows environment.
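    On Windows, where /dev/null does not exist, the same effect can be had by loading from an empty delimited file (a zero-byte file created beforehand; the file name here is just an example):

```sql
-- empty.del is a zero-byte file; REPLACE empties the table, then loads nothing.
LOAD FROM empty.del OF DEL REPLACE INTO YourTable
```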

    You can also use the similar IMPORT FROM /dev/null OF DEL REPLACE INTO YourTable, though there are some differences between IMPORT and LOAD.

    Option E:

    With a little bit of guts, you can drop and recreate the table. This can be tedious if you have to recreate every constraint, view, etc. that depends on the "new" table.

    Option F:


    ALTER TABLE YourTable ACTIVATE NOT LOGGED INITIALLY WITH EMPTY TABLE


    This is by far my favorite way of doing "TRUNCATE Table" in DB2.

    Which ring suits you? You decide.

    Should Divide Alphablox Relational Reporting By Zero !

    If you are doing relational reporting using IBM DB2 Alphablox, you might face this situation.

    In Alphablox, you can define a calculated column using the bloxreport:calculate custom tag. However, the functionality supported by the expression language is pretty limited, one could even say quite useless. According to the official (and scant) documentation, only arithmetic +, -, *, /, sum(), parentheses, runningTotal, rank, runningCount, and percentOfTotal are supported. Missing is the ability to conditionally process the value in the calculated column.

    Now, say I create a calculated column with the expression "NEWCOL = COL1 / COL2". A typical problematic scenario is when the value of COL2 is zero, so the expression evaluates to "divided by zero". Alphablox Relational displays a demonic "?" for such a result, and no customer wants to see their reports flooded with a lot of "?"s.

    After hours of searching and hacking around the classes and constants provided by Alphablox Relational, a feasible workaround emerged: replace the "divided by zero" value using a combination of Alphablox Relational tags and JavaScript.

    Here is one example of such a solution:



    <script>
    function fix(s) {
        if (s == "?") {
            document.write("0");
        } else {
            document.write(s);
        }
    }
    </script>

    <bloxreport:report id="sample1" errors="true" bloxname="sample5">
    <bloxreport:canneddata></bloxreport:canneddata>

    <bloxreport:calculate expression="col1 = 0.0"></bloxreport:calculate>
    <bloxreport:calculate expression="col2 = 0.0"></bloxreport:calculate>
    <bloxreport:calculate expression="col3=col1/col2"></bloxreport:calculate>

    <bloxreport:text>
    <bloxreport:data columnname="col3" text="<script>fix('<value />');</script>" />
    </bloxreport:text>

    </bloxreport:report>


    The idea is to invoke custom logic to manipulate the cell value in the client browser instead of on the server, solely because I can't find any documented API that would let me do it server-side.


    Though it works, more testing is required for the above solution, especially when exporting the relational report to PDF and Excel. Those report delivery formats might not understand JavaScript and hence would display no value for the formatted cells.
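    Since exported PDF/Excel output bypasses client-side JavaScript entirely, a server-side alternative worth sketching, assuming you control the SQL query feeding the report rather than canned data, is to defuse the division before Alphablox ever sees it. NULLIF and COALESCE are standard SQL and work in DB2; the column names follow the NEWCOL = COL1 / COL2 example above, and YourTable is a placeholder:

```sql
-- NULLIF(COL2, 0) yields NULL when COL2 = 0, so the division returns NULL
-- instead of failing; COALESCE then maps that NULL back to 0 for display.
SELECT COL1,
       COL2,
       COALESCE(COL1 / NULLIF(COL2, 0), 0) AS NEWCOL
FROM YourTable;
```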

    Let's cross our fingers and hope for an enhancement to the Alphablox Relational report formatting capability in a future release. I have to mention that even JasperReports is superior to Alphablox Relational in this area.

    Be Smart. Get A Row Number !

    Imagine there is a table with no primary key or unique constraint, and volumes of duplicated rows keep coming in. The only piece of information that differentiates these rows is a timestamp column storing the date and time the record was inserted.








    ID (VARCHAR) | NAME (VARCHAR) | AMOUNT (INT) | LAST_UPDATED (TIMESTAMP)
    1            | Eddy           | 80000.00     | 2006-03-14-00.10.31.999999
    1            | Eddy           | 90000.00     | 2006-03-14-00.09.31.999999
    1            | Eddy           | 90000.00     | 2006-03-14-00.11.31.999999
    2            | Lee Sin Ti     | 100.00       | 2006-03-14-00.10.31.999999
    2            | Lee Sin Ti     | 200.00       | 2006-03-14-00.09.31.999999

    Table 1


    Based on this, write a DB2 query that removes all outdated records while keeping only the latest entry for each ID and NAME.


    An idiotic first attempt:

    DELETE FROM TABLE1
    WHERE
    ID || NAME || CAST(LAST_UPDATED AS VARCHAR(64))
    NOT IN
    (
    SELECT ID || NAME || CAST(MAX(LAST_UPDATED) AS VARCHAR(64)) FROM TABLE1 GROUP BY ID, NAME
    );

    This attempt is a definite no-no. Not only do the string concatenations take a huge number of processing cycles, the query is also ugly, in my opinion. Using generated random data of 100k records, it took an unacceptable amount of time to complete the delete.

    My second attempt then turned to DB2's support for the ROW_NUMBER() function, which I greatly appreciate from IBM.


    Second attempt:

    DELETE FROM
    (
    SELECT ROW_NUMBER() OVER (PARTITION BY ID, NAME ORDER BY LAST_UPDATED DESC)
    FROM TABLE1
    ) AS X (ROWNUM) WHERE ROWNUM > 1;

    Wow, an optimized yet elegant query that achieves my goal. With the same random data set, it took less than a minute to complete.
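    One habit worth adopting before running such a destructive DELETE: preview the victims first with the same ROW_NUMBER() windowing, and only then swap the SELECT for the DELETE. Using the table from the example:

```sql
-- Lists exactly the rows the DELETE would remove: everything ranked
-- after the newest LAST_UPDATED within each (ID, NAME) group.
SELECT ID, NAME, AMOUNT, LAST_UPDATED
FROM (
    SELECT T.*, ROW_NUMBER() OVER (PARTITION BY ID, NAME
                                   ORDER BY LAST_UPDATED DESC) AS ROWNUM
    FROM TABLE1 T
) AS X
WHERE ROWNUM > 1;
```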

    Lack of IBM Alphablox Community

    As of this writing, I performed some searches on Google for IBM Alphablox. Almost 70-80% of the returned results are marketing articles, which leads me to a conclusion: if you are not adventurous, or don't want to spend your precious department funds calling Alphablox customer service, you might want to think twice before deploying it as your corporate analytics solution at this moment. The Alphablox community is still below the size needed for a mature enterprise-level technology. This can cause frustration for Alphablox solution developers who expect development problems to be solved by searching the web. Hey, fellow developer, time to use your brain. :p