Wednesday, August 22, 2012

For example, try searching for ROW_NUMBER or ASSERTION on the Technical Documents page: Google won't return nearly as many pages as exist... it won't even return some pages with those words in the title.

Sooooo... after waiting and waiting and waiting for the Google CSE to catch up, a different approach is being tried:

A standalone page has been created, separate from this blog, holding just the document links and nothing else,

a new Google Custom Search Engine has been created using one single URL (the standalone page) rather than 1,127 separate document URLs,

Tales From The Doc Face
Control Panel
Sites
Included sites
Add sites
Include sites individually
URL: http://www.risingroad.com/Tales_From_The_Doc_Face.html
What to include:
Dynamically extract links from this page and add them to my search engine
Include all pages this page links to

That last point is important: if "Include all pages this page links to" implies "and only those pages" then maybe the new approach will work. The last thing anyone wants is for Google to include all the other stuff those pages link to; if you want to see all of sybase.com, or the whole internet, you can use Google Classic.

Monday, August 20, 2012

Only EBFs for the latest fully-supported versions of SQL Anywhere are shown here: 11.0.1 and 12.0.1.

Just because an older version or different platform isn't "fully supported" any more doesn't mean you can't download files (or ask questions, or get help), it just means there won't be any more new EBFs released.

Friday, August 17, 2012

"Why is X so Y?"

If you fill in a value for X, Google will solve for Y:

The official name for this feature is Google AutoComplete but with a little effort you can turn it in to "The Google AutoStereotyper": just type a partial question of the form "why is x so" and Google does the rest.

Don't let the price tag fool you, this is The Real Thing, so please, take the survey...

Event Details

Date:
November 14th-15th, 2012

Location:Waterloo, Ontario

Why attend the SQL Anywhere Technical Summit?

1. Education (Learn from the experts and developers themselves2. Networking (Meet the people behind the product as well as other experienced users)3. The training is FREE!

*The training is free however attendees are responsible for flight, hotel and other travel expenses.

Save the Date

SQL Anywhere Technical Summit

Hello SQL Anywhere Users,

The SQL Anywhere team is excited to announce that SAP will be hosting a ‘SQL Anywhere Technical Training’ event in Waterloo, ONT on November 14th and 15th.

This technical training event is designed for developers who are experienced with SQL Anywhere. Ideally located in Waterloo, the main development and support location for SQL Anywhere, this will be a unique opportunity for you to meet many of the developers and support contacts that have contributed to the success of SQL Anywhere. The training event will be free* of charge, however space is limited, so discuss this opportunity with your management team now.

Mark the date on your calendar and look for our next email (early September) which will provide registration and detailed event information. In the meantime, to help ensure we have a successful event please provide your input on session topics, by completing the following survey.

Wednesday, August 15, 2012

Question: How do I figure out which engine is which, in Windows Task Manager?

When I have a [cough] problem and I need to kill one of several database engines running on the same computer, it's a crap shoot trying to tell them apart... whichever one I pick always turns out to be the wrong one.

Answer: It's pretty hard to do in Windows Ancient, where "pretty hard" is code for "I don't know".

But in Windows Recent, you can tell Task Manager to show the command line that launched the engine, and in a lot of cases that's enough to tell one process from another. Just use

Monday, August 13, 2012

Only EBFs for the latest fully-supported versions of SQL Anywhere are shown here: 11.0.1 and 12.0.1.

Just because an older version or different platform isn't "fully supported" any more doesn't mean you can't download files (or ask questions, or get help), it just means there won't be any more new EBFs released.

The blob_key column is an artificial primary key 1, 2, 3, ... for the new table; this is where the "pointer to the new table" comes from.

The hash_value column is filled by applying SQL Anywhere's HASH() function to the long string in t.blob to create a much shorter but (almost?) perfect alternate version of t.blob. In other words, if two HASH() values are the same, then the two original strings are (probably) the same, and vice versa: if two HASH() values are different then the original strings are (probably) different.

The collision_counter column counts the number of times 0, 1, 2, ... that the same HASH() value was returned for different t.blob values. This value has remained zero during all the testing done for this article, but... who knows?

The reference_count column counts the number of times that separate copies of the same t.blob value were replaced by one row in tblob. This column is nice to have for testing, and it may even be needed in your application, but it does increase the overhead.

The blob column is where the single copy goes, and it's NOT NULL because by the time anything is stored in tblob it's known to be longer than the threshold (e.g., 100 bytes).

The UNIQUE constraint is a performance optimization to help with the process that inserts new rows in tblob.

Here's what the original table and query look like now, with the t.blob value moved to tblob.blob for some rows and not others:

The LEFT OUTER JOIN matches each row in t with the corresponding row in tblob, if one exists. If no such row in tblob exists, that means the original blob value was too short to be moved, or it was NULL in the first place.

The COALESCE function call returns t.blob if it isn't NULL,

otherwise it returns tblob.blob if there was a matching row in tblob, or

if neither is true, it returns NULL.

In other words, only one of t.blob_key and t.blob can be non-NULL. Both columns can be NULL, however, which is what happens when the original blob value was NULL.

Here's a sample of code that shows how to INSERT rows into t and tblob, using as test data the blob column from another table called x:

The IF LENGTH > 100 on line 14 checks to see if the benefits of moving t.blob into tblob is worth the overhead. If not, the INSERT down on line 61 leaves the blob in t and sets the pointer to NULL.

The HASH() function call on line 18 uses the SHA256 algorithm, the best that SQL Anywhere has to offer. HASH() returns VARCHAR ( 64 ) values like b9afc180bd8fa250a006229ba9c8b8eddba0b96dfcb8320740af84a4c485f5ea... in other words "trust me, it's unique".

The FOR statement starting on line 21 looks through all the rows in tblob with matching HASH() values. In the real world, there will (probably) be zero or one rows processed by this FOR loop, and if one row is returned the IF @blob = @existing_blob on line 31 will be TRUE.

The IF THEN on lines 31 through 37 handles the case where not only does the HASH() value match but so does the blob value, and it saves the tblob.blob_key value in @blob_key for later use, before using the LEAVE statement on line 37 to skip out of the inner FOR loop and resume processing at line 43.

The ELSE starting on line 38 handles a collision, where the HASH() value matches but not the blob value, by bumping up the value of @max_existing_collision_counter (which was initialized to -1 back on line 19).

The IF THEN on lines 43 through 55 handles the case where the HASH() value didn't match, so a new row in tblob must be inserted.

The INSERT statement on line 57 fills in the pointer (t.blob_key) while setting t.blob to NULL. This is different from the INSERT on line 61 which sets the pointer to NULL and puts the original blob value into t.blob.

100 bytes might not be a good threshold. There's a lot of overhead here, and it might not be worth it for strings that short.

The whole idea might not be worth it... it depends on your data, and you'll never know until you try it, but if you do go to the trouble of implementing it, don't fall in love with it until you've done some objective performance tests.

Unless you really need tblob.reference_count, the UPDATE on line 34 is pure waste.

The tblob.collision_counter column might also be a waste since it is (almost) always zero. An alternative approach might be to just skip moving the blob into tblob if a collision occurs; i.e., just leave it in t.blob, and don't bother implementing tblob.collision_counter at all.

Yes, "Wow!"

1,127 is a big number, and you can Google just those articles without seeing any extra clutter...

"Patents" was the only category where every entry was included. Quite a few entries in the other categories (blogs, Techwave and "other") were excluded for "not being technical enough"... for example, not one single fRiDaY File made the cut.

If you don't see your name in this list, maybe you are part of the Borg you were published as "anon."...

The number you pick doesn't have to be unique, that's up to you... you can use 99002 all over the place if you don't mind the aggro of not knowing where it came from.

Here's a sample that includes code for capturing and recording exceptions; in this case, the 99xxx numbers are only unique within the stored procedure because other logic takes care of identifying the procedure:

Wednesday, August 1, 2012

Yesterday was Glenn Paulley's last day at the company formerly known as Watcom, then Powersoft, then Sybase, then iAnywhere Solutions, then Sybase again, and now SAP.

For those of you who don't know Glenn, up until yesterday he was boss of the query engine which is that part of SQL Anywhere responsible for giving you an answer when you say SELECT.

Today, he's heading to... no, not Google, but to join the faculty of Conestoga College as Professor, Information Technology, to teach database systems, systems analysis and IS courses.

No, it's not like that at all!

In fact, this is more likely...

Yes, Glenn will be missed!

Here's the reason, in Glenn's own words:

I have been thinking of such a move for a little while. In a nutshell, I am trying to duplicate the joy I get from coaching, and earn a living at it. I have coached curling for ten years now and it is, by far, the most personally rewarding thing I have ever done. My hope is that I can convey the same passion and enthusiasm for Computer Science to Conestoga students as I can for my curling athletes, and bring to the classroom useful knowledge from my 25 years of experience that can help the students at Conestoga achieve their goals.

Those kids at Conestoga have no idea how lucky they are; their great good luck is inversely proportional to our bad luck, and a Glenn-free world is going to take some getting used to.

Maybe not ENTIRELY Glenn-free...

From time to time Glenn may grow tired of the grind, of the never ... ending ... pounding ... of ... thick ... skulls, and regale us with fresh posts on his new blog at The Ubiquitous Database.

His old blog will probably disappear, but the content will surely be preserved in perpetuity, as will all his articles here.