Showing posts with label Data warehousing. Show all posts
Showing posts with label Data warehousing. Show all posts

Wednesday, December 16, 2009

Think Out of the Big Box

A bank customer of us was requesting helps to solve a sudden problem dealing with their IBM DWE server. Apparently from certain date onwards, all the processes executed within the DWE console were failed as shown in the following figure:



Interesting because there are few applications running in DWE and some of them already run for more then 2 years without any problems and now all of them failed at once?

Looking at the logs, obviously db2cmd.exe is throwing out a return code of 5 with exception code SQL1042C which means:



And oh shit, what the hell is the meaning of return code 5?

Tried to google, lookup IBM forums, contacted IBM Support and no one can tell me the definition of code 5?

The real fun is almost all other DB2/DWE components are running fine. db2cc, db2hc, db2start, db2stop, db2licm and many others I tried works. It seems like db2cmd.exe being cursed.

I did many other novel investigations. To name a few:

1. Rule out Windows environment variables
- Open up an instance of command prompt, clear off every environment variable.
- Run db2cmd.exe, still failed with the same error message.

2. Rule out Windows registry
- Open up regedit, throw away \HKEY_LOCAL_MACHINE\SOFTWARE\IBM and one under Current User(Of course, you backup it first)
- Run db2cmd.exe, still failed with the same error message.

3. Rule out corrupted files #1
- Copy db2cmd.exe and all the necessary dll files from my machine into the server.
- Open up command prompt, and point every thing to my new db2cmd.exe folder
- Run db2cmd.exe, still failed with the same error message.

4. Rule out corrupted files #2
- Reinstall DB2 Admin Client (Same FP)
- Run db2cmd.exe, still failed with the same error message.

5. Rule out corrupted files #3
- Reinstall DB2 everything (Latest FP)
- Run db2cmd.exe, still failed with the same error message.

6. Trace through AD GPO/Windows stuffs
- Use gpmc.msc, gpresult, gpedit.msc and many windows tools
- Trace into event viewers
- Look for suspicious items. Nothing found

7. Diagnosing DB2 #1
- Set DB2TEMPDIR to non-existing folder/folder where current user don't have security right to access/write
- db2cmd.exe normally will throw DB2CMD.EXE -> 0, Access denied message.
- Nope, still DB2CMD.EXE -> 5

8. Diagnosing DB2 #2
- Put DIAGLEVEL to 4, restart db2, run db2cmd.exe
- Nothing useful in db2diag.log

9. Diagnosing DB2 #3
- Use db2trc, turn it on, run db2cmd.exe, turn it off.
- Nothing useful.

10. Diagnosing DB2 #4
- Execute db2setcp and db2clpsetcp independently with success.

11. Diagnosing DB2 #5
- Fooling around using db2set, removing settings, changing value to bogus one and adding things.
- Still Error code 5 with db2cmd.exe

12. Finding what the heck is error code 5
- Open db2cmd.exe with notepad, look for any potential hints. Found a list of API used. Cross check API name with return code 5 using Google.
- Seems like return code 5 is usually associated with Access Denied error.

13. Finale
Google again "db2cmd.exe -> 5" +SQL1042C. Found one related post in a china forum. Apparently the fella who faces the problem is a student who was trying out DB2 for some homeworks or so. No solution from the post either.

Very interesting... What is the connection or similarity between a student machine in china and a production level server machine in Malaysia? Indeed, I found the answer to this question and yes, it solved the problem!

Any idea? :D



Top Blogs

Friday, August 22, 2008

Handle Slowly Changing Dimension in DB2

Excerpt from WWW about Slowly Changing Dimension:



The "Slowly Changing Dimension" problem is a common one particular to data warehousing. In a nutshell, this applies to cases where the attribute for a record varies over time. We give an example below:

Christina is a customer with ABC Inc. She first lived in Chicago, Illinois. So, the original entry in the customer lookup table has the following record:

Customer Key Name State
1001 Christina Illinois


At a later date, she moved to Los Angeles, California on January, 2003. How should ABC Inc. now modify its customer table to reflect this change? This is the "Slowly Changing Dimension" problem.

There are in general three ways to solve this type of problem, and they are categorized as follows:

Type 1: The new record replaces the original record. No trace of the old record exists.

Type 2: A new record is added into the customer dimension table. Therefore, the customer is treated essentially as two people.

Type 3: The original record is modified to reflect the change.




Well, you might be naive (talented?) enough to write your own framework to manage the slowly changing dimensions in your environment, however it is good to know that there is a handy tool at your disposal to minimize the plumbing works for those data management scenarios.

As of IBM DB2 version 8.1 FixPack 2 or later, MERGE statement is introduced.

The example below illustrates how Merge can take data from a staging table and use it to update or create dimension members accordingly.


MERGE INTO
DWH.DIM_COMPANY AS TARGET
USING
(SELECT ID, NAME, PHONE FROM DWHSTAG.COMPANY) AS SOURCE
ON SOURCE.ID = TARGET.ID
WHEN MATCHED THEN
UPDATE SET (NAME,PHONE)=(SOURCE.NAME,SOURCE.PHONE)
WHEN NOT MATCHED THEN
INSERT (ID, NAME, PHONE) VALUES (SOURCE.ID, SOURCE.NAME, SOURCE.PHONE)
ELSE IGNORE;


Do you think life is better now?

Complete Syntax on DB2 Merge Statement

Passed IBM DB2 Certification Exam 701

Officially I didn't hold a full time job as DBA in any of my previous employments, however this doesn't stop me from getting one or two DBA certifications, because my hunger for knowledge and piss off with some moments where intricacy of DB technical details hinder my architecting and design process.

Besides all the formal knowledge and information I acquainted, one of the most important experience is to further understand the scope, responsibility and difficulties that real DBA faced in their daily operational routines and when application development needs conflicted with the interest of maintaining corporate data policy and security auditing.

In the enterprise application paradigm, the only way a solution architect can architect a better solution is to completely understood and blend every piece of the jigsaw puzzle together into a coherent whole, in the right way.

That's my motivation. Now let me share my preparation process.

1.) Read up the IBM DB2 701 Preparation Materials from IBM

Nothing much to say here.

2.) Read up DB2® Universal Database™ v8 for Linux®, UNIX®, and Windows® Database Administration Certification Guide, by George Baklarz, Bill Wong

You can opt to skip the last part regarding Application Development, even though no harm for you to take a look into it. That part is more relevant when you were taking exam 703.

3.) Do at least one project dealing with DB2 DBMS

4.) Do your own exploration and experiments with some of the concepts in the materials like HADR, Roll Forward and etc.

5.) Get enough coffee and sleeps. Trade off between these 2 important constraints :-)


Lastly, I'd like you to know that DB2 V9 series of examination is already available and you should jump on the bandwagon of V9 if your company or projects are using it or potentially will. I sticked to V8 because that's the one bundled together with IBM DBW DWE V9.1 and I dealt with that version of DB2 the most.

P/S: I got a 100% score in this exam. Cheers. I finally revenged on my 98% score in exam 700. :EVIL:

UPDATE: IBM DB2 Data Warehouse Edition Password Maze

A new finding on the Alphablox wsadmin.password parameter in /server/AlphabloxAnalytics/server.properties is that after you replaced the wsadmin.password.protected with wsadmin.password to change the password, you will need to make some (dummy) changes to the Administration page in Alphablox Console for the plain text password to be automatically encrypted by Alphablox. Restart of applications or server is not necessary.

IBM Data Warehouse Edition DWH Password Maze

Stringent user account security policy in the network domain can cause damaging maintenance headache in deployed IBM DWH multiservers environment. User account information are all around the places, in your DB2 services, WAS server, Alphablox and so on. The day when the user passwords expired or required account disabled, it will be the day DWE solutions face total outrage. Well, may be I'm just exaggerating.

Where do you update the user credentials in DWE environment when such a need arise?

Briefly speaking, at least the following locations:

1. DB2 Windows services, assuming Windows environment

Log On As for each DB services need to be updated.


2. Websphere Global Security Setting, assuming using LocalOS repository

This can be tricky. The easiest is to update the password before you shut down the WAS server. If the server already shut down and you didn't manage to update the password, then you wouldn't be able to start the server again because of authentication error. If this is the case, you got to manually disable the WAS global security by changing the "enabled" attribute of security:Security xml element to false in security.xml file located in /config/cells/Cell. Then start the server, update the password in LocalOS setting and turn on Global Security again by checking on the option in WAS Admin Console.

3. Data Sources defined in DWE Admin Console

Data Sources used by DWH application processes, which are not attached to WAS data source, must be updated.

Before you can perform this, you need to update the J2C user password in WAS for Admin Console to be able to connect to its repository. (Item 6)

4. Data Sources defined in Alphablox Admin Console

Usually this will be data sources for Alphablox cubes to retrieve IBM Cube Views meta data.

5. WAS account used by Alphablox for management

Alphablox uses a WAS user credential for connecting to WAS and managing Alphablx applications in WAS. This piece of information is located in Alphablox repository, /servers/AlphabloxAnalytics/server.properties.

Replace the line ws.admin.password.protected with ws.admin.password=<your_password_in_plaintext>

The issue here is that the new password will be in clear text. I read across some materials that say the password is encoded again the next time the server restarted. However, I don't see that happens in my environment.

6. WAS J2C Authentication entries

7. WAS JNDI Data Sources, assuming not using J2C authentication

8. Optionally, WAS Windows Services


Hope this is helpful to you.

Alphablox Cube Security

I really headache when customer want to impose security constraints on Alphablox solution. Here is a simple security problem that I need to solve.

A local bank in Malaysia has 14 main branches throughout the country where each state got one main branch. There is at least 2 groups of users with different level of data access. One group is country wide users who can view the data for all the states. Another group being state manager which can only sees their own cake.

My first attempt is defining a user property called BelongingState and assign it state code that user belongs to, or "ALL" if they are country wide users. Then I would use Java codes to dynamically construct the MDX for the DataBlox. Thought this will solve my problem, but some user actions would causes rewrite of the MDX automatically (Seems like the default behavior). For example, Show Siblings action will display all members on the same level. This totally defeats my aim of controlling view by Malaysia state. Another one is the Drill Up action. I also found Member Filter dialog and Drill Down (There are 5 types of Drill Down available) action can causes exploitation on data.

Due to lack of time to spend on digging deep into Alphablox object model, I revert to a quick resolution: use removeAction attribute to disable Drill Up, Member Filter and Show Siblings for state level users. At the same time, I wrote a filter on Drilldown event to check the drill down option and prevent the drill down to inappropriate data area. This cut off a lot of interactivity from user, but for the sake of security, some trade offs gotta be made.

Giving some faith to developers who wrote Alphablox framework, I believe there should be some methods to disable MDX rewrite due to user activity, thus preventing the underlying data set from changing. One thing to note if that if you are using MSAS or Essbase as cube source, then you can rely on the native security control features of these cube engines for security. Particularly you can use MemberSecurity tag for this purpose. For Cube View based cubes, may be it could be possible to control the view at database level.

Back to square one, my point of bringing this topic up is that it is important to plan for security requirements to match the out of the box security features provided by Alphablox. For my simple security scenario, I personally think that it is more usable to create 14 identical cubes, each for different state. By controlling which cube that users use in their report, you can be sure that the data view doesn't violate the security expectation, yet retaining powerful interactivity actions such as Member Filter and Drill Up. Things can get pretty ugly and complicated when the security requirements are not just on one dimension. In this case, creating separate cube for different data view might not be practical.

Lastly, localization is another aspect of global business intelligence application that can be as tricky as security factor to implement in Alphablox solution. I really need to appraise Microsoft Analysis Services (MSAS) for incorporating security and localization so well in their cubes. Maybe this is what differentiates market leader and players.

Alphablox Features Wish List

Writing to Santa-BM,

Undeniably, Alphablox architecture and component models present a very flexible and powerful of incorporating analytic capability into whatever web-based applications that your enterprise currently maintained.

With greater power comes greater responsiblities are very true indeed. To be up to 80% productive with Alphablox solution, the architect and developer need to face a very steep learning curve especially if their background on design pattern, J2EE web applications, and some-what JSF-type of knowledge are not strong.

To put it in simple form, I see Alphablox as conglomeration of part-of-JSF (There are other JSF competitors, but I wouldn't mention them here since they are not the part of standard J2EE stack) and AJAX suite of technology.

The thing that I like JSF is not just about their wonderful architecture and well defined component framework. The most important is the promised upcoming tool support that allows you (developers or designer) to visual create, debug, test and deploy JSF component. Without this support, it's like .NET Web developer using code editor to create web pages. WYSIWYG? Forget it.

The problem with Alphablox (As of 8.4.0.1) is the missing of these tool supports.
I don't get report designer in DWE design studio. I don't get testing environment for my custom made UI blox. I can't even test my report without actually put them into the server and press F5 key. Of course, some of you might argue and say to me "Please spend some $$ to buy WSAD 6, cheap skate!". Ok, I accepted that suggestion. But the point is DWE as an integrated and promised to be end-to-end BI platform should at least provide a minimal report development support. Look at Microsoft SSRS, althought their report templates are quite limited, at least I can drag a text box component on the report or define the styles of report contents visually, ;-)

Here some constructive suggestions to IBM Alphablox team

* Provide new project types in design studio that do
1. Alphablox Relational Report Creation (With tons of wizard, of course)
2. Alphablox Analytics Report Creation (With tons wizard, of course)
3. Alphablox Cube (With the real time capabilities to see cube data, etc)
- Trust me, Query Builder is really not enough
4. Alphablox Custom Component/UI Component Creation

I wish that next time when I want to customize the behavior of the MemberFilterBlox to inform users that their selection is not valid due to constraints such as security, I only need to :

1. Fire up my design studio
2. Open the Alphablox project that contains the web page that have the specific PresentBlox/MemberFilterBlox,
3. Double click on the blox, brings me a selection list of events to listen

OR

3. Right click on the blox, go to "Customize Event Handler", then "XXX event"

4. Select the event, and then it generates the handler method skeleton for me to put in my code.

All linking between the event handler and the framework event pipelines should be handled internally by the boilerplate, I don't want to be bothered about that.

5. Lastly, from Tool Palette, drag a Message Box dialog out to the editor

6. Skeleton message box server side codes automatically generated for me, I just need to put in the title and message, etc.

7. Press a button/key to preview with live data.


So, think about my imaginative way and compare it to the current way of doing the same thing, which one is more developer friendly and productive you say?

Even better, convert all the bloxs to JSF compliant component or provide a JSF wrapper for them. This will allows any JSF supported IDE to develop Alphablox applications.

Enable Commenting (Comments)

An excerpt from Alphablox 8.4 documentation:


CommentsBlox allows you to provide cell commenting (also known as cell annotations) functionality to your application. In addition, you can use CommentsBlox for general commenting that are not tied to any other Blox. For example, you can allow users to add comments to a site, an application, a report, or a Web page.

Comments are stored in a JDBC accessible relational database. Supported databases include IBM(R) DB2(R) UDB, Sybase, Microsoft(R) SQL Server, and Oracle. This data source needs to be defined to DB2 Alphablox. DB2 Alphablox provides a Comments Management page under the Server link in the DB2 Alphablox Administration tab that lets you specify the relational data source to use for storing comments. From that page, you can create "collections" (data tables) to store comments. For cell-level comments, you will need to specify the multidimensional data source used in your GridBlox, the cube to use (for Microsoft Analysis Services), and the dimensions to include. For general comments, you only need to specify the name.


Here is what you need to do to get a feel of the built-in CommentsBlox in Alphablox.

First, go to Alphablox Administration Site, by default should be accessible from http://yourserver:9080/AlphabloxAdmin/

From the Administration main tab, you should be in General paage by default. There is a Runtime Management section, amongst others. Click on the Comments link under this section. A pop-up window should appear that refer to http://yourserver:9080/AlphabloxAdmin/comments/commentsAdmin.jsp.

Here you gotta define the comments collection (sort of like a repository to store a particular set of comments) and maybe customize the set of fields used for commenting.

Choose a data source that you already defined elsewhere, enter the security credential and click Connect button. All existing collections should appear in the list provided.


Click Create button, since you want to create a new collection. You need to fill in the details for the form provided, particularly you need to specify the dimension to enable the comments and the fields required for the comment entry.



Click Save once you done.

Ok, you have done with the necessary configuration.

In your Blox programming, you can do the following:

1. Enabling Commenting for GridBlox
...
<blox:data><blox:comments collectionName="YourCommentCollectionName" dataSourceName="TheDataSourceUsedForCollection" /></blox:data>
...

...
<blox:grid commentsEnabled="true" />
...

Then when you right click on your grid cell, an option called "Comments" will appear.


2. Use com.alphablox.blox.CommentsBlox and related com.alphablox.blox.comments.*

This allows you to implement Commenting feature for your relational reports or any other general usage. Requires programming though.

DB2 How to Empty a Table

In data warehouse environment, usually there is a need to clear out the contents of staging tables to prepare for a fresh set of extracted source data. You might also want to housekeep some historical aggregation according to some predefined schedules, which involves copying data from a table to another and subsequently removed all data from the copied table. These are just some of the examples that boiled down to the need of efficiently "truncate" a table.

Microsoft SQL Server and Oracle DBA are definitely enjoying the luxury of built-in table truncation functionality, through command like "TRUNCATE TABLE YourTableName".
Ok fine, but does IBM DB2 UDB, the so-called most scalable and performing RDBMS provide such option?

Before that, let me evaluate some of the options of removing rows from a table.

Option A:

DELETE FROM YourTable

Well, you can delete all rows using this statement. However when involving lot of records, transaction logging causes significant performanc degradation. Still, this option is acceptable if your application requires recovery of deleted rows.


Option B:


(Assume this is within the same transaction)
ALTER TABLE YourTable ACTIVATE NOT LOGGED INITIALLY;
DELETE FROM YourTable;


You managed to escape the bad luck of doing a lot of transaction logging. But wait a minute, constraint checking are still in force (Check yourself by doing explaining a DELETE FROM statement).


Option C:


(Assume this is within the same transaction)
SET INTEGRITY FOR YourTable OFF;
ALTER TABLE YourTable ACTIVATE NOT LOGGED INITIALLY;
DELETE FROM YourTable;


Again, you managed to skip the logging and check constraint and referential constraint checking, datalink integrity checking, and generation of values for generated columns. Primary/Unique Key constraints still enforced.

Option D:


LOAD FROM /dev/null of del REPLACE INTO YourTable


This is by far the most common workaround that I have seen for table truncation. It basically uses the LOAD utility on /dev/null for simulating the loading (replace) of no-data into the designated table. The same concept works for Windows environment.

You can also use similar IMPORT FROM /dev/null of DEL REPLACE INTO YourTable. There are some differences between IMPORT/LOAD.

Option E:

With a little bit of guts, you can drop and recreate the tables. This can be tedious if you got to recreate every constraints/views/etc that dependent on the "new" table.

Option F:


ALTER TABLE YourTable ACTIVATE NOT LOGGED INITIALLY WITH EMPTY TABLE


This is by far my favorite way of doing "TRUNCATE Table" in DB2.

Which ring suits you? You decide.

Should Divide Alphablox Relational Reporting By Zero !

If you are doing relational reporting using IBM DB2 Alphablox, you might face this situation.

In Alphablox, you can define calculated column using bloxreport:calculate custom action. However, the functionality supported by the expression is pretty limited, even to say "Quite Useless". According to the offcial little documentation, only arithmetic +, -, *, /, sum(), (), runningTotal, rank, runningCount, and percentOfTotal are supported. Missing from it is the ability to conditionally process the value in the calculated column.

Now, I created a calculated column that contains the expression "NEWCOL = COL1 / COL2". A typical problematic scenario is when the value of COL2 is zero, and hence the expression is evaluated to "Divided by Zero". Alphablox Relational display a demonic "?" for such a result. No customer want to see their reports flooded with alot of "?".

After hours of finding and hacking around the classes and constants provided by Alphablox Relational. A feasible workaround is to replace the "Divided by Zero" value using a combination of Alphablox Relational tags and Javascript.

Here is one example of such solution:



<script>
function fix(s) {
if (s=="?") document.write ("0");
else { document.write (s) }
}
</script>

<bloxreport:report id="sample1" errors="true" bloxname="sample5">
<bloxreport:canneddata></bloxreport:canneddata>

<bloxreport:calculate expression="col1 = 0.0"></bloxreport:calculate>
<bloxreport:calculate expression="col2 = 0.0"></bloxreport:calculate>
<bloxreport:calculate expression="col3=col1/col2"></bloxreport:calculate>

<bloxreport:text>
<bloxreport:data columnname="col3" text="<script>fix('<value />');</script>" /> </bloxreport:text>

</bloxreport:report>


The idea is to invoke custom logic to manipulate the cell value in the client browser, instead of the server. This is solely because I can't find any documented API that allows me to do so in the server side.


Though it works, more testings are required for the above solution, especially when exporting the relational report to pdf and excel. These report delivery environments might not understood Javascript, hence wouldn't display any value for the formatted cells.

Let cross our finger and hope for similar enhancement to the Alphablox Relational report formatting capability in its' future releases. I have to mention here that even JasperReport is superior than Alphablox Relational in this area.

Lack of IBM Alphablox Community

As of time of this writing, I did perform some searches in Google about IBM Alphablox. Almost 70-80% of the returned search results are marketing articles lead me to a conclusion that if you are not adventurous enough or don't want to spend your precious department funds calling up Alphablox customer service, you might want to think twice before deploying it as your corporate analytics solution at this moment. The size of Alphablox community still below the level needed for a mature enterprise level technology. This might causes frustration if Alphablox solution developer expects development problems to be solved by searching for solutions in the WWW. Hey fellow developer, time to use your brain. :p