Monday, June 29, 2009

Yesterday while perusing the Concepts Guide, I stumbled across the ENABLE NOVALIDATE keywords for the definition of a Foreign Key constraint. I've always known it was there, just never used it, or thought to use it.

I've decided to post a few "classics" from long before anyone paid attention...as opposed to now, when 30 people pay attention. I originally posted this on 12/02/2007 at 10:04 PM which you can find here.

Instrumentation is something that I have come to rely on fairly heavily. I believe I first read about it on AskTom, but the one that really spurred me on was this post on instrumentation on his personal blog.

Initially, I couldn't really wrap my head around instrumentation. I don't know why it was so difficult; I had a similar problem with sessions when I first started my career. I look back now and it just seems so obvious.

Now that I am doing data warehouse work, nothing is fast. Fast to me is now one hour to load 30 or 40 million records. No more split-second queries for me.

We currently use no tools. It's straight PL/SQL. Instrumentation of the code is ideal. Actually, it's more instrumentation to aid monitoring. The tool most easily used is provided by Oracle in the DBMS_APPLICATION_INFO package.

There are three subprograms that I use most: SET_MODULE, SET_ACTION and, most importantly, SET_SESSION_LONGOPS. I hadn't started using the last one until this year; I mainly stuck to the first two. SET_SESSION_LONGOPS is now part of the procedure/function template I've created in JDeveloper.

What it allows you to do is set a row in the v$session_longops view (I know it's not actually putting the row in the view...it's the underlying table, but I digress). You can then monitor how your job is doing.
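As a sketch of how that looks in practice (the module name, operation name and row counts here are all made up for illustration), a load procedure might report its progress like this:

```sql
DECLARE
  l_rindex BINARY_INTEGER := dbms_application_info.set_session_longops_nohint;
  l_slno   BINARY_INTEGER;
  l_total  NUMBER := 40000000;  -- hypothetical total rows to load
BEGIN
  dbms_application_info.set_module( module_name => 'LOAD_SALES'
                                  , action_name => 'INSERT' );

  FOR i IN 1 .. 10 LOOP
    -- ...load one batch here...
    dbms_application_info.set_session_longops
      ( rindex    => l_rindex
      , slno      => l_slno
      , op_name   => 'Loading sales records'
      , sofar     => i * ( l_total / 10 )
      , totalwork => l_total
      , units     => 'rows' );
  END LOOP;
END;
/
```

Another session can then watch the progress with something like SELECT opname, sofar, totalwork, units FROM v$session_longops.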

I'm going to make a small effort to post some of the key concepts here over the next couple of weeks. If you've read through the Concepts Guide before, this can serve as a brief refresher. If not, good, you're exposed to something new.

This section describes the rules that can be applied to table columns to enforce different types of data integrity.

Null rule: A null rule is a rule defined on a single column that allows or disallows inserts or updates of rows containing a null (the absence of a value) in that column.

Unique column values: A unique value rule defined on a column (or set of columns) allows the insert or update of a row only if it contains a unique value in that column (or set of columns).

Primary key values: A primary key value rule defined on a key (a column or set of columns) specifies that each row in the table can be uniquely identified by the values in the key.

Referential integrity rules: A referential integrity rule is a rule defined on a key (a column or set of columns) in one table that guarantees that the values in that key match the values in a key in a related table (the referenced value).

Referential integrity also includes the rules that dictate what types of data manipulation are allowed on referenced values and how these actions affect dependent values. The rules associated with referential integrity are:

* Restrict: Disallows the update or deletion of referenced data.
* Set to null: When referenced data is updated or deleted, all associated dependent data is set to NULL.
* Set to default: When referenced data is updated or deleted, all associated dependent data is set to a default value.
* Cascade: When referenced data is updated, all associated dependent data is correspondingly updated. When a referenced row is deleted, all associated dependent rows are deleted.
* No action: Disallows the update or deletion of referenced data. This differs from RESTRICT in that it is checked at the end of the statement, or at the end of the transaction if the constraint is deferred. (Oracle Database uses No Action as its default action.)
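Of those rules, Oracle's DDL only exposes the delete-time actions directly; a quick sketch with made-up DEPT/EMP tables:

```sql
CREATE TABLE dept ( deptno NUMBER PRIMARY KEY );

-- No action (the default): deleting a referenced dept is rejected
CREATE TABLE emp_noaction
( empno  NUMBER PRIMARY KEY
, deptno NUMBER REFERENCES dept );

-- Cascade: deleting a dept deletes its employees
CREATE TABLE emp_cascade
( empno  NUMBER PRIMARY KEY
, deptno NUMBER REFERENCES dept ON DELETE CASCADE );

-- Set to null: deleting a dept nulls out its employees' deptno
CREATE TABLE emp_setnull
( empno  NUMBER PRIMARY KEY
, deptno NUMBER REFERENCES dept ON DELETE SET NULL );
```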

Complex integrity checking: A user-defined rule for a column (or set of columns) that allows or disallows inserts, updates, or deletes of a row based on the value it contains for the column (or set of columns).

ENABLE NOVALIDATE means that the constraint is checked, but it does not have to be true for all rows. This allows existing rows to violate the constraint, while ensuring that all new or modified rows are valid.

This is a great tool for legacy systems. You have data in the column(s) that you can't really do anything with, but you want to ensure that all future data that goes into those column(s) matches the parent key.

Of course the ideal is to somehow clean the data up, but you don't always have that option. This is a good first step toward the overall cleanup of your legacy system.
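A sketch of the DDL (table, column and constraint names here are hypothetical):

```sql
-- Existing orphan rows are left alone; new or modified rows
-- must reference a valid parent.
ALTER TABLE child_table
  ADD CONSTRAINT child_parent_fk
  FOREIGN KEY ( parent_id )
  REFERENCES parent_table ( parent_id )
  ENABLE NOVALIDATE;
```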

Friday, June 26, 2009

Years and years of reading AskTom, if there's one thing that I really come away with, it's read. RTFM. Specifically though, he's telling everyone to read the Concepts Guide. Paraphrasing:

If you can get a good look at the Concepts Guide by sticking your head...

That's not the one.

If you read the Concepts Guide you'll have 10% of the knowledge that 90% don't have.

Nope.

If 90% of the people would read 10% of the Concepts Guide, they'd be able to program in C...?

Seriously, I need help.

If you do know the saying I'm talking about, please share. I can't seem to find it anywhere.

Ultimately it doesn't matter what was said. I'm reading the Concepts Guide again, browsing through it really, trying to find something new. Then I noticed this:

I know it's a "feature" and all, but does it have to be so prominently displayed? Couldn't it be buried in the Appendix of the PL/SQL Guide or something? This is especially important since Mr. Kyte is telling all the new people to read the Concepts Guide.

Perhaps the new people figure it out, the ones that have read the Concepts Guide anyway. I've run across too many "experienced" database developers that can't create apps without them. My guess is that they have barely hit any of the documentation, let alone the Concepts Guide.

So, there is a difference between defining a NOT NULL constraint using either the NOT NULL or CHECK syntax. In USER_CONSTRAINTS, there are 4 distinct constraint types:

1. P = PRIMARY
2. U = UNIQUE
3. R = REFERENCE or FOREIGN KEY
4. C = CHECK
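A quick illustration (table name made up) of the two ways to declare the same rule:

```sql
CREATE TABLE t
( a VARCHAR2(10) NOT NULL                  -- shows as NOT NULL in DESCRIBE
, b VARCHAR2(10) CHECK ( b IS NOT NULL )   -- does not
);

-- Both appear with constraint_type = 'C' in the dictionary
SELECT constraint_name, constraint_type, search_condition
  FROM user_constraints
 WHERE table_name = 'T';
```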

What's the lesson here? Well, if doing analysis, you can't just depend on using the DESCRIBE command from SQL*Plus to see what is required and what is not. Of course you can't depend on that for everything. Interesting "problem" nonetheless.

No they aren't, but they've opened up the summer program to all children.

Interesting. All I could think about was I might get to drive Kate to school. WIN! My friend later brought over some paperwork and I passed that on to Kris.

The next day, Kris called and was thoroughly impressed. They knew what apraxia was, unlike her local public school.

Kate has been in the ELP program for about a year and a half. ELP stands for Early Learning Program. It's designed for kids who, for one reason or another, have fallen behind. I think it falls between regular classes and Special Education (think Down's Syndrome).

Anyway, they recently provided us with her year-end report. It's not required though because the end of the year is really December. On the report, they crossed out (literally) the December date and hand-wrote the current date (June something). On top of that injustice (pun intended), they stated that Kate had made no progress since the last review.

Seriously, why would you do that? Why would you even entertain the idea of making @ktjustice mad? Mad is an understatement actually.

Needless to say, we signed Kate up for Blossom.

Kate signs, a little bit. Please, More, All Done. I don't think we pursued that hard because of her hands. She's been in school for 2 weeks now and seems to enjoy it. I think she signed Daddy this evening (poor Mommy).

Today they took a field trip to the Clearwater Aquarium and lo and behold, a news crew was there!

(I was having problems getting this to display, so just click-through if it doesn't show up).

Kate shows up at 1:02...with her back to everyone else...standing, while everyone else is sitting.

I ended my investigation there as it seems to be a waste of time. I just found it interesting that there was a difference in how you defined a NOT NULL constraint and whether or not it would show up in the DESCRIBE command. Anyone out there notice this before?

Tuesday, June 23, 2009

After many months and many attempts, I have managed to install a database on a non-windows platform. I think I deserve a little credit. ;)

In January, I began installing Ubuntu Intrepid Ibex on all of my home computers. What this meant for me is that I was often without one because I screwed something up. I even got so far as to install Oracle XE on Ubuntu, which took two blog posts.

Just before COLLABORATE 09, I installed Jaunty Jackalope on my laptop. Then I got the crazy idea to install Oracle on Ubuntu. I ended up having to wipe the hard drive and start fresh. I have no doubt that it (Oracle) can be installed on Ubuntu, it just takes someone much more experienced than I.

Armed with a fresh install, I was going the virtual route. Inspired, I believe, by this comment (Aman Sharma).

1. Install Sun's (Oracle's) VirtualBox
2. Download the latest version of Oracle Enterprise Linux (v5)

I think that's the point I realized my first installation of Jaunty was hosed, and by hosed I mean unrecoverable by me. I couldn't see the 2nd partition, the one with all the room.

3. Create OEL virtual machine
4. Follow this guide to get it ready to install the Oracle database (seriously? Why does OEL not come ready for the database server?)

In that previous post, I was taking a URL string and splitting it based on the delimiter, which is typically the ampersand &. In ApEx it is the colon :. I'd take a wild guess and say there are others, but I'm not going to look them up. An example string looks like this:

param1=value1&param2=value2

Etc. Etc. Etc.

The output of that split looks like this:

param1=value1
param2=value2

Now you need that string parsed. Instead of just a string being returned however, you'd like to know the name of the parameter as well, thus key/value. Desired format looks like this:
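A minimal INSTR/SUBSTR sketch (not the original code; variable names assumed) that pulls the key and the value out of one pair:

```sql
DECLARE
  l_pair  VARCHAR2(100) := 'param1=value1';
  l_key   VARCHAR2(100);
  l_value VARCHAR2(100);
BEGIN
  -- everything before the '=' is the key, everything after is the value
  l_key   := SUBSTR( l_pair, 1, INSTR( l_pair, '=' ) - 1 );
  l_value := SUBSTR( l_pair, INSTR( l_pair, '=' ) + 1 );
  dbms_output.put_line( l_key || ' => ' || l_value );  -- param1 => value1
END;
/
```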

Now I just need to wrap this up into package form and I'm all done. That will be another post with the unit tests provided.

Update: So after rereading this post, I realized that I just confused it with the previous post on splitting URL strings. Probably the best solution would be to combine these 2 functions, or at least pipe out the key/value pairs from the main function (previous post). Did I catch a niner in there?

Tuesday, June 16, 2009

I spent about 2 hours with Mr. Peralta that day and had quite a bit of fun. I learned a couple of new tricks (JRockit for one) and got a slightly better understanding of Coherence.

I emphasized "slightly" because I still don't quite know how to use it. Caching data, I get that part. I guess I do understand it from a high-level perspective...it's the where part that I haven't quite grasped yet.

One thing I asked Mr. Peralta to do for me was to write it up, blog it. Show me (us) how and when to use it. Through Twitter, he's shared with me some links. Specifically this one on how to define a Data Grid.

An obvious (or maybe not so obvious depending on who you ask) first step in scaling a database application is to cache as much as you can. This is fairly easy to do if you have a single app server hitting a database. It becomes more interesting however as you add more app servers to the mix. For instance:

* Is it OK if the caches on your app servers are out of sync? * What happens if one of the app servers wants to update an item in the cache? * How do you minimize the number of database hits to refresh the cache? * What if you don’t have enough memory on the app server to cache everything?

It's a very interesting technology. I would highly encourage you to click through and give it a go. Leave comments here or abroad.

I then cut and paste both "halves" of the INSERT statement into Excel so I can get a row-by-row compare.

Everything looks good, go!

VALUES *
ERROR at line 19:
ORA-00947: not enough values

WTF?

I still haven't figured it out, but that Excel picture gave me an idea...

I tend to put one column on each line, for large tables, this takes up quite a bit of vertical space. I've seen others put columns (and values) on the same line. Just looks ugly to me. Here's what this table looks like:

OK, now I remember what always gets me confused. There seems to be this endless stream of SUBSTR and INSTR involved. It's difficult to follow...for me anyway. So here's what I've come up with for now. I am aware of regular expressions and the DBMS_UTILITY.COMMA_TO_TABLE procedure, but for some reason, I like to reinvent the wheel. No, not really, I just find it challenging. More challenging would be for me to start using regular expressions...Anyway, here goes my attempt at splitting a URL string:
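To show the idea, here is a hedged sketch (not the final package code) of walking the string with INSTR and printing each chunk between the '&' delimiters:

```sql
DECLARE
  l_url   VARCHAR2(200) := 'param1=value1&param2=value2&param3=value3';
  l_start PLS_INTEGER := 1;
  l_pos   PLS_INTEGER;
BEGIN
  LOOP
    -- find the next delimiter at or after l_start
    l_pos := INSTR( l_url, '&', l_start );
    EXIT WHEN l_pos = 0;
    dbms_output.put_line( SUBSTR( l_url, l_start, l_pos - l_start ) );
    l_start := l_pos + 1;
  END LOOP;
  dbms_output.put_line( SUBSTR( l_url, l_start ) );  -- the last chunk
END;
/
```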

Thursday, June 11, 2009

What feels like a thousand years ago I used SQLUnit to create and run unit tests on database stored procedures. Lots of them. To the point where that's almost all I was doing. I both hated it and loved it. Hated it because it was boring, tedious, mind-numbing work. Loved it because it gave me confidence in the code I was writing and, once all the tedium was done, could be done quickly. A small change could be immediately tested to see its effect.

Something else happened too, I began to write code thinking about the tests. Every single line of code was a potential test. If I do this, how will I test it? If I do that, how will I test it? Where do the tests end? You could theoretically test forever...and then I would inevitably fall down that rabbit hole.

Like I was saying, it changed the way I wrote code. Test Driven Development without writing the tests first (which I can hardly stand). That was a good thing because I began to anticipate certain errors. On an INSERT statement with a sequence generated Primary Key, should I trap the exception in code and give it my own error number? Or just let the ORA-00001 propagate up? I just let the Oracle error come up, no need to create my own for that. There's something seriously wrong if you get an error in that case and the whole world should know about it.

Speaking of exceptions: use them, throw them, and (mostly) don't trap them just to quietly do something else. Log it and then RAISE it. I think it's good for every single developer to know when something is messed up. I've spent far too much time trying to debug something because all the errors were being trapped and processing just continued. I've been lucky on some occasions to have error logging, at least...

SQLUnit is a regression and unit testing harness for testing database stored procedures. An SQLUnit test suite would be written as an XML file. The SQLUnit harness, which is written in Java, uses the JUnit unit testing framework to convert the XML test specifications to JDBC calls and compare the results generated from the calls with the specified results.

Unfortunately it has not been worked on in almost 3 years, but I'd say it's in pretty good shape. There's support for Procedures, Functions, Ref Cursors and User Defined Types (though I haven't tried this out yet). The current version is 5.0. The last version I worked on consistently was 1.3 I believe. A colleague (who now works for Oracle) from my early days of SQLUnit testing even contributed some Oracle specific items.

I used it to build the database for CABEZE, my first, unsuccessful, try at my own business. It was nice because I was building it from scratch, so I could build all the test data along with it (no testing with production data...though there wasn't really much of that anyway), set up the database (create test data), run the tests and then tear it down back to its original state (empty). Unfortunately the systems I've worked on since weren't empty and testing with production, or semi-production (cleansed), data was the only viable alternative.

Which brings me to now. I'm trying to reacquaint myself with the tool and writing test cases for various procedures. I ran into an issue creating test cases because all of our credit card numbers were scrambled. Everything failed with an Invalid Card Number. Barnacles.

Why not create a routine that would generate "real" credit card numbers; more specifically, numbers of the appropriate length that had a valid check digit? Credit cards use the Luhn formula, which is designed to catch certain transposition errors.

The Luhn algorithm will detect any single-digit error, as well as almost all transpositions of adjacent digits. It will not, however, detect transposition of the two-digit sequence 09 to 90 (or vice versa). It will detect 7 of the 10 possible twin errors (it will not detect 22 ↔ 55, 33 ↔ 66 or 44 ↔ 77).

With CABEZE, I had written my own PL/SQL card number generator (and validator), but I didn't blog at the time and seem to have lost that code. So I tried writing it again.

Since I'm using SQLUnit, what a great way to demonstrate the power of the tool. It's a somewhat complex (to me anyway) formula, so writing up tests for it should help me to figure it out. Here's the code to create a check digit based on the Luhn formula.
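The listing below is a reconstruction, not the original code: a PL/SQL function that computes a Luhn check digit for a string of digits, doubling every other digit starting from the rightmost of the payload.

```sql
CREATE OR REPLACE FUNCTION luhn_check_digit( p_digits IN VARCHAR2 )
  RETURN NUMBER
IS
  l_sum   PLS_INTEGER := 0;
  l_digit PLS_INTEGER;
BEGIN
  -- walk the payload right-to-left; double every other digit,
  -- starting with the rightmost, and subtract 9 if the double exceeds 9
  FOR i IN 1 .. LENGTH( p_digits ) LOOP
    l_digit := TO_NUMBER( SUBSTR( p_digits, -i, 1 ) );
    IF MOD( i, 2 ) = 1 THEN
      l_digit := l_digit * 2;
      IF l_digit > 9 THEN
        l_digit := l_digit - 9;
      END IF;
    END IF;
    l_sum := l_sum + l_digit;
  END LOOP;
  -- the check digit brings the total to a multiple of 10
  RETURN MOD( 10 - MOD( l_sum, 10 ), 10 );
END luhn_check_digit;
/
```

Using the well-known worked example, luhn_check_digit('7992739871') returns 3, giving the full number 79927398713.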

I could then identify the problem, fix it, and run the tests within seconds. How cool is that?

That particular run helped me realize that if I passed in a string of numbers that was even, it would give me the wrong result. If you look at Visa for instance, it's typically 16 digits long with the final digit being the check digit. You start at the rightmost digit before the check digit and go backwards. That was the impetus behind putting the digits into the PL/SQL TABLE OF INTEGERS up above.

I'll post more on creating validly formatted credit card numbers later, I just wanted to use it as an example for SQLUnit.

Try it. Use it. Let me know how it goes. Maybe we can get some Java guy to get it started up again.

Monday, June 8, 2009

When I ask this question I ask it in relation to INSERTs. There is a difference between creating a record and updating a record. I think it comes down to a design decision.

For instance, I once used a table to track sessions (web sessions, using Application Context in the database to "reconnect"), it was defined something like the following:
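Something along these lines (all table and column names are assumed):

```sql
CREATE TABLE user_sessions
( session_id NUMBER PRIMARY KEY
, user_id    NUMBER NOT NULL
, start_time DATE DEFAULT SYSDATE NOT NULL
, end_time   DATE DEFAULT SYSDATE + 30/1440 NOT NULL  -- now + 30 minutes
);
```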

An Acceptable Time to use UPDATE

A session was created with a default 30-minute window. Each page the user hit would update the END_TIME to SYSDATE + 30 minutes. If they had no activity for 30 minutes, the END_TIME would be less than the current time (SYSDATE) and they would be logged out. This table did exactly what I needed; UPDATEs worked perfectly here.

What if I wanted to track page hits though? Could I do that with the current table? I could possibly add PAGE_ID or something I suppose, but then I would have to write this "complicated" query to find the first START_TIME and then compare that with the last END_TIME. Kind of changes the meaning of the table right?

An Unacceptable Time to use UPDATE

You have a table of scheduled transactions or some sort of recurring billing.
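Roughly this shape (DATE_TO_POST and TIMES_POSTED are from the description; the other names are assumed):

```sql
CREATE TABLE scheduled_transactions
( transaction_id NUMBER PRIMARY KEY
, account_id     NUMBER NOT NULL
, amount         NUMBER NOT NULL
, date_to_post   DATE   NOT NULL
, times_posted   NUMBER DEFAULT 0 NOT NULL
);
```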

You use this table by having a job that looks at the DATE_TO_POST, takes those and attempts to post them. If there is a problem, you update the DATE_TO_POST column to the next date/time (based on rules somewhere) and you increment the TIMES_POSTED counter. That doesn't sound so bad right? If you don't have any processing or billing failures, it's not. But if you do, you lose some valuable data, in my opinion.

First, get rid of the TIMES_POSTED column. You don't need it. Create a record for every single transaction you have posted. This obviously changes the meaning of the table and will force you to change your code.

Here's why it's good though (to remove the UPDATE): You now have a single record for every attempt. Your reporting off of this becomes much easier and is performed with SQL. With the UPDATE, you have to maintain some PL/SQL code (TIMES_POSTED + 1). With the INSERT, you don't perform a calculation at all.

A slightly different example and one that may be more familiar to everyone:

This kind of structure is more suited for a data warehouse. The columns have been pivoted to make reporting easier. In an ideal environment, this wouldn't exist on the OLTP server. However we don't all work in ideal environments and this type of structure is often employed. I pretty much hate it (in OLTP, not DW). The volume and complexity of code needed to maintain this type of table is large. There are all sorts of computations that must be performed and then someone, like me, has to come in and maintain it. I always vote for scrapping it in favor of the following type structure(s):
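For illustration only (every name here is assumed), the pivoted shape versus the insert-only shape:

```sql
-- The pivoted, warehouse-style table: one row per product,
-- one column per month
CREATE TABLE monthly_sales_pivot
( product_id NUMBER PRIMARY KEY
, jan_amount NUMBER
, feb_amount NUMBER
, mar_amount NUMBER  -- ...and so on through December
);

-- The insert-only structure: one row per product per month
CREATE TABLE monthly_sales
( product_id NUMBER NOT NULL
, sale_month DATE   NOT NULL
, amount     NUMBER NOT NULL
, CONSTRAINT monthly_sales_pk PRIMARY KEY ( product_id, sale_month )
);
```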

Now you have INSERT statements as opposed to one or more UPDATE statements. Reporting tools can then handle the pivoting or whatever else you want to do to the data.

UPDATEs in an OLTP are not a flaw, but when about to write one, think of the implications. Is that what you really want to do? Wouldn't you rather just do an INSERT and be done with it?

Wednesday, June 3, 2009

Someone called me a New Media Douchebag today...jokingly of course. (I hope...)

I posted the video to Twitter and Kris Rice didn't think I complained enough to meet the 3rd requirement. I suggested that I'd be the Susy Sunshine of the New Media Douchebags...or something like that.

I've definitely embraced the idea of the New Media or Social Media or whatever you want to call it. It seems kind of natural to me. I've always been the social butterfly at work, the "nice" IT guy and I love to talk. My old manager struggled with whether this was good or bad. He saw the good of building solid relationships with business types but also the amount of time. I'm sure he wasn't the first manager to be concerned about my talking.

I've even discussed doing some sort of presentation on the value of New/Social media.

Am I a New Media Douchebag? Should I relinquish ORACLENERD to someone more deserving? Tell me what you think. Some guy at work said I come off as "Extreme" (I think he meant as a future Exadata team member) in my database ramblings...so I can take it.

I've been tasked with making some fairly drastic changes. When I say drastic I mean that I'm changing the name of a table and updating all of the dependencies. The drastic part comes in because there are about 11 Trillion dependencies.

So as not to affect the other developers while I hammer this out, I created my own instance and one-by-one brought over the necessary objects (and their dependent objects). I know there are other ways to do this, like developing in your own named schema in the development database and repointing the objects there.

Since I am new however, I thought it would be very beneficial to do it on my own. There are a couple of advantages:

1. I won't interfere with any developer's progress
2. It's a great opportunity to learn how everything fits together
3. I have ApEx installed (development doesn't...yet)
4. Look what I can do

Anyway, one procedure had an INSERT into our reporting environment (another database) so it wouldn't compile. It used the INSERT INTO [table]@database_link syntax. How can I replicate that? I know I've seen or read the solution somewhere, but I decided to do without Google this time to see how long it would take me.

I'll create TEST_USER, where the procedure lives and REPORT_DATA where the table lives. Following is performed with DBA privs:
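One way to do it (a sketch; user names, passwords, table names and the connect string are all assumed) is to create the "remote" schema locally, then a loopback database link pointing back at the same instance, so the @database_link syntax resolves:

```sql
-- create the schema that plays the role of the reporting database
CREATE USER report_data IDENTIFIED BY report_data
  QUOTA UNLIMITED ON users;
GRANT CREATE SESSION, CREATE TABLE TO report_data;

CREATE TABLE report_data.report_table ( id NUMBER );
GRANT INSERT ON report_data.report_table TO test_user;

-- a loopback link: "remote" inserts actually land in this database
CREATE PUBLIC DATABASE LINK reporting_db
  CONNECT TO report_data IDENTIFIED BY report_data
  USING 'localhost:1521/XE';  -- assumed connect string
```

With that in place, TEST_USER's INSERT INTO report_table@reporting_db should compile and run against the local copy.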

Monday, June 1, 2009

Most of you know I'm a Linux n00b. I have no apologies for that...at least I'm trying, right?

I unsuccessfully attempted to install Oracle on Jaunty Jackalope (9.04), so I decided to use VirtualBox and Oracle Enterprise Linux. I'd successfully created a Windows Vista Ultimate virtual machine, so off to try my second.

I chose fixed size storage:

Then selected the folder where I wanted it to reside:

Location and size:

And the summary:

And received this:

At first, I began to search for problems with VirtualBox. I tried creating a smaller disk, 2.0 GB, and it worked. So I knew I had a space issue (well, I think I knew).

So, fdisk to view the available partitions (as root):

root@cdj-laptop:/# fdisk -l
root@cdj-laptop:/#

So I did it as myself:

chet@cdj-laptop:/# fdisk -l
chet@cdj-laptop:/#

Using sudo:

chet@cdj-laptop:/# sudo fdisk -l
root@cdj-laptop:/#

So I installed GParted, and tried using that Partition Editor to see what was available:

WTF?

So I've tried searching on "ubuntu fdisk nothing" and get nothing. So I'm asking you fine folks...either help me out or point me to the manual (RTFM!). Links are helpful. Answers are the most helpful. Something. I don't want to reinstall again but will if I have to.

Probably helpful to everyone is the contents of my /etc/fstab file as well: