FOR f_fetch AS c_fetch NO SCROLL CURSOR FORSELECT folder.file_name AS @file_name, folder.size AS @file_size, folder.modified_date_time AS @file_date_time, folder.contents AS @file_contents FROM folder WHERE folder.file_name LIKE '%.JPG' ORDER BY folder.file_nameFOR READ ONLYDO

The CREATE EXTERNLOGIN statement is required for CLASS 'DIRECTORY' remote servers... I'm not sure why, but it's the law.

The CREATE EXISTING TABLE statement uses 'folder;;;.' as the "AT" location. The first parameter refers back to the CREATE SERVER name "folder". The second and third parameters are empty; no database or owner names are required. The fourth parameter, the "object name" parameter, is a single dot meaning "this folder", referring back to the earlier ROOT parameter.

The CREATE EXISTING TABLE statement gives you a table that looks like this, as shown by Foxhound:

The time spent to execute this code depends on a lot of factors, not just the state of the database cache. That's my way of saying "I can't get consistent timing figures no matter what I do"... which is another way of saying "I don't know what's going on."

So here goes: 36 JPEG files took anywhere from 1.5 to 3.5 seconds to load into a SQL Anywhere 11 table on the same reasonably fast laptop used yesterday.

Here's the result in dbisql:

SELECT jpeg.file_name, jpeg.file_size AS "size", jpeg.file_date_time AS "date_time", jpeg.file_contents, jpeg.image_height AS "height", jpeg.image_width AS "width" FROM jpeg ORDER BY jpeg.file_name;

And here's a new feature I didn't know about: SQL Anywhere 11's version of dbisql is smart enough to know that jpeg.file_contents is an "(IMAGE)", and when you click on the funky little ellipsis button "..." you get these marvellous displays:

Tuesday, December 23, 2008

I want to fill in the HEIGHT and WIDTH attributes of HTML IMG tags generated by SQL Anywhere web services from images stored in the database.

HTML pages display faster if the browser knows ahead of time how big each image is, and the HEIGHT and WIDTH attributes are one way of doing that:

<IMG SRC="images/blue enamel cup.JPG" WIDTH="400" HEIGHT="435">

CSS STYLE height and width attributes, the preferred technique for this blog, are another way of doing the same thing:

<img style=" ... width: 283px; height: 308px;" src="..." ... />

Answer: The dimensions of a JPEG image are embedded in the image itself.

This article presents a pure SQL solution for finding those dimensions, after first converting the binary image to a character string as described in yesterday's post Hexadecimal Strings.

Reference: There are many explanations of the JPEG file format on the web, mostly shallow, useless, incorrect or hugely obscure. In situations like that I prefer to read actual code, so I went with the logic in in rdjpgcom.c, Copyright (C) 1994-1997 by Thomas G. Lane, from the Independent JPEG Group's software.

This ancient C program delves into the basic structure of JPEG files, just far enough down into the details to find what I was looking for, without going all the way down into the massive complexity of image compression.

Plus, it's got comments!

You can find the rdjpgcom.c file at various places on the web including here.

1: The JPEG Table

For the purposes of this article, the first two columns of the following table have already been filled in, and it's the job of the code in the next few sections to fill in the image_height and image_width columns:

2: CREATE PROCEDURE get_jpeg_dimensions

The get_jpeg_dimensions procedure receives a JPEG image as the first parameter, and returns the image height and width as parameters 2 and 3.

Here are some notes on the format of a JPEG file:

A JPEG file consists of a series of blocks, sometimes called headers.

Each block begins with a marker.

Each marker consists of one hexadecimal FF byte, zero or more FF bytes called padding, and one non-FF byte. The scan_jpeg_marker procedure at the end of this article is dedicated to skipping over the FF padding and returning a 4-character 'FFxx' marker string.

The first block consists only of its marker, FFD8.

For second and later blocks, the marker is followed by a two-byte length field.

The value of the length field includes the length of the length field itself plus the length of the varying-length, varying-format data which follows it.

The only data of interest is the image width and height fields. This article does not go into any detail on the other very complex data stored inside JPEG images.

The image width and height fields are contained in the first block that has one of these markers: 'FFC0', 'FFC1', 'FFC2', 'FFC3', 'FFC5', 'FFC6', 'FFC7', 'FFC9', 'FFCA', 'FFCB', 'FFCD', 'FFCE' or 'FFCF' (yes, there missing values in that sequence :)

The only way to find the image width and height fields is to scan all the markers and length fields up to the 'FFCx' block containing those image width and height fields. For example, a simple search for an FFCx marker won't work because JPEG images sometimes contain embedded thumbnails, and a simple search may return the width and height from an inner thumbnail FFCx block instead of the real values.

Scanning should not proceed past the first 'FFD9' or 'FFDA' marker... Here Be Dragons!

The UNLOAD at line 15 converts an n-byte LONG BINARY string to a 2n-byte LONG VARCHAR string as described in Hexadecimal Strings.

The SET at line 20 initializes @current_pos to point to the first significant character in the hexadecimal string. This variable is incremented throughout the rest of the code, as scanning proceeds.

The scan_jpeg_marker procedure is called from several locations, starting with line 24. This procedure starts with the leading FF of a JPEG marker and scans forward to find the subsequent non-FF substring; the code appears in the next section.

The loop at line 44 continues until a RETURN is executed.

The IF at line 46 tests for the sought-after "Start of Frame" block, containing the image height and width values; the code starting at line 61 returns those values.

The ELSEIF at line 69 tests for a premature end-of-JPEG situation; scanning can't proceed, and the image height and width are returned as zero.

The ELSE at line 75 skips over the current block and scans the subsequent marker.

Note the expression "@block_length * 2" at line 83: forgetting to multiply by 2 is a characteristic error when dealing with hexadecimal strings. You often have to multiply by 2 because a hexadecimal string contains 2 characters for each single byte in the original LONG BINARY string.

3: CREATE PROCEDURE scan_jpeg_marker

The scan_jpeg_marker procedure looks forward for one single FF, followed by one single non-FF, with zero or more FFs in between.

I'm not going to say anything nice about regular expressions (they are abominable!) but the REGEXP_SUBSTR function certainly makes life easy once you figure out how to code the regular expression itself. Here's why:

REGEXP_SUBSTR returns the actual substring that matches the regular expression, instead of a string position,

you can pass it a starting point, and that starting point is numbered from 1, not zero, and

you can pass it an "occurrence count"; e.g., you can tell it to match the third occurrence of the regular expression.

UPDATE jpeg SET jpeg.image_height = @image_height, jpeg.image_width = @image_width WHERE CURRENT OF c_fetch;

END FOR;

COMMIT;

END;

It's not the fastest code in the world, BUT... it's not the slowest: With a cold cache (call sa_flush_cache) it took about 1.5 seconds to process 36 images on a reasonably-fast laptop. Depending on your application, you might do the calculation on the fly as needed, or do it once ahead of time as shown here.

Here's the result in dbisql:

SELECT jpeg.file_name, jpeg.image_width, jpeg.image_height FROM jpeg ORDER BY jpeg.file_name;

Monday, December 22, 2008

A hexadecimal string is an ordinary character string that obeys the following rules:

each single character in the string is one of the following: '0' through '9', or 'A' through 'F', and

the string contains pairs of characters; i.e., the string length is an even number.

In other words, a hexadecimal string is made up of pairs of characters '00' through 'FF'.

Hexadecimal strings are handy in SQL Anywhere because there are a lot of functions for handling strings and not much in the way of features for representing complex data structures; e.g., the contents of a JPEG file.

...and any arbitrary string of data, including that JPEG file, can be converted to a hexadecimal string by converting each successive byte value 0 through 255 to the corresponding hexadecimal string '00' through 'FF'.

Once upon a time, converting a LONG BINARY string to a hexadecimal LONG VARCHAR meant coding a funky (and slow) loop involving RIGHT, INTTOHEX, ASCII and SUBSTR function calls. Now, SQL Anywhere 11 lets you perform the conversion in one single, very fast, UNLOAD statement using the INTO VARIABLE and HEXADECIMAL ON clauses:

Tuesday, December 16, 2008

program trace data written to the SQL Anywhere console log text file during development, and

errors and exceptions written to a permanent table by both development and delivery builds,

but it only presented code for the first kind.

Here's an example of an exception displayed by the Foxhound "Display Diagnostics" button:

Here is the full text of the message, wrapped to fit on screen. To the end user, the interesting part to the end user is the ERRORMSG "...cannot access the file". However, for the purposes of this article the interesting part is the substring "610a5c":

From Foxhound's point of view, this exception is a minor glitch. Foxhound executed the dbping.exe utility to gather some information about the connection to the target database, and it could not display that information; i.e., this section of Foxhound's "Facts & Figures" display does not appear:

From the client's point of view, it may be more interesting; in some cases the culprit is a firewall getting in the way between Foxhound and the target database.

But... enough about the actual exception; we're here to see how it is trapped and how it is handled.

The following excerpt comes from the stored procedure that detected the exception; only the code specific to exception handling is shown, NOT the actual code that raised the exception:

Here's how it works: Lines 18, 22 and 26 (and many others in the actual code) record the "current location" in the code, during execution, so that information may be saved when an exception occurs. The hand-coded @diagnostic_location values are alphanumeric, starting with the module (file) number "610", followed by increments "a1", "a2", ... "a9", "b1", etc. When new program logic is inserted, say between 610a5 and 610a6, more characters are added: 610a5b, 610a5c and so on.

There's nothing special about that encoding technique, except for the fact it's easy to hand-code new values that are globally unique without having to keep track of a "last used value" or anything like it.

A global search for "610a5c" in all of the Foxhound source code will find exactly one location, and will narrow the search quite a bit.

It would be nice if SQL exception diagnostics showed the line numbers inside procedures, triggers and so on, but sometimes life is harsh.

And yes, there all these SET @diagnostic_location statements do have an effect on performance. I've never been able to measure the effect, but I have also avoided placing the SETs inside tight loops.

In this example, the exception was raised somewhere between lines 22 and 26. We know this because the @diagnostic_location value '610a5c' is passed to rroad_exception in line 33, and it ends up being displayed in the message text shown earlier "...610a5c...".

The exception is trapped by the one single EXCEPTION block in the entire module, starting at line 30. Other, longer stored procedures have many nested BEGIN ... EXCEPTION ... END blocks, but this procedure only has one.

The hard-coded literal '(610eh1)' serves to uniquely identify the EXCEPTION handler itself. If this module had more than one, the others would be numbered '(610eh2)' and so on. The most exception handlers in any single Foxhound procedure is 14.

The WHEN OTHERS THEN construction is shorthand for "trap everything". It is immediately followed by the SELECT at line 34 that saves the current values of SQLCODE, SQLSTATE and ERRORMSG(); it is important to do this right away inside the EXCEPTION block before any other code has a chance to change the values.

The three lines starting at line 37 are this module's way of graciously handling all exceptions: all the OUT parameters are set to empty strings to tell the caller "nothing to display".

The CALL to rroad_exception at line 41 passes all the diagnostic information gathered so far to a procedure that will store it in this table:

The SET at line 15 adds some more data (timestamp, Foxhound build number, etc) to the incoming @diagnostic_text, and then right-truncates the result to the length that can be passed as a parameter to an event.

The TRIGGER EVENT at line 30 uses the special syntax necessary for passing parameters to events.

The SET at line 51 uses the special EVENT_PARAMETER function to receive the parameter that was passed to the event.

The INSERT and COMMIT at lines 55 and 58 do the real work... on the separate connection started by the event so the data gets saved even if the caller does a ROLLBACK.

There are many EXCEPTION handlers in Foxhound. Some of them call rroad_exception as shown here, some of them check the SQLCODE first to see if it's on a "do not call" list; i.e., the exception is both expected and uninteresting. Some of the handlers return immediately to the caller, others carry on executing the program logic following the BEGIN block. Still others don't call rroad_exception at all, they just silently handle the exception.

Friday, December 12, 2008

(This being a blog about databases, I have to make one thing clear: diagnostic logging has nothing to do with transaction logging.)

Foxhound creates two kinds of diagnostic logs: large amounts program trace data are written to the SQL Anywhere console log text file during development, and a small number of errors and exceptions are written to a permanent table by both development and delivery builds (those errors and exceptions are also written to the console log during development).

Jeff says "I am not anti-logging. I am anti-abusive-logging." That said, Jeff has five main points to make, five problems he sees with the way programmers often do diagnostic logging... let's see how those points apply to Foxhound (Jeff's points are in "Bold italics", my comments follow):

"Logging means more code."

All the tracing logic in Foxhound is hand crafted, none of it is automatically generated. That reduces the amount of superfluous logging, output that will never be useful... most instances of tracing logic have been added to Foxhound to help debug some specific problem, or added to sections of code that have had a history of problems.

The Monitor section of Foxhound has been particularly hard hit, due to it's multi-threaded and timing dependent nature, so that's where the most interesting tracing logic exists.

Once added, tracing logic tends to remain in the code until it proves to be annoying... more on this later.

"Logging isn't free."

All of the tracing logic uses SQL Anywhere's MESSAGE ... FOR DEBUG feature, and it's enabled only for development builds, not delivery. Over time, however, some large sections of tracing logic have been removed from Foxhound to avoid performance penalties during development... sometimes deleted, sometimes commented-out in case it becomes useful again during periods of heavy maintenance.

That only applies to the development build, however. The execution profiler has never shown any bottlenecks in the delivery build that were caused by disabled MESSAGE ... FOR DEBUG statements.

Foxhound doesn't (intentionally) fail, there's no concept of "fatal" errors. The design calls for all exceptions to be handled one way or another: Some are "normal" such as the failure to connect to a target database (which is shown to the user), or the failure to complete some SQL operation because Foxhound is shutting down (which is not shown to the user).

Other exceptions are "abnormal" (unexpected and/or unheard of) and are worth recording for possible future investigation; those are recorded in a permanent table. They are the reason why the Foxhound Options page has buttons labelled Display Diagnostics, Export Diagnostics and Delete Diagnostics.

The vast majority of diagnostic logging data, however, is tracing data written to the console log, and only during development. NONE of that tracing data is written by the delivery build, and there is no Foxhound user interface to display it... during testing, Wordpad is the tracing data UI of choice.

"The more you log, the less you can find."

That's absolutely true; that's the most frequent reason various pieces of tracing logic are removed from Foxhound: to keep the size of the console log file humanly manageable. Sometimes, the pieces are commented-out ("Who knows when this #$%&!%# procedure will give trouble again!"), sometimes they are permanently expunged ("No, this inner-inner-inner loop works now, I'm never gonna need that ten-thousand-line trace again!")

Sure that's true, if the huge log file is the only place serious errors are recorded. That's not the case with Foxhound, however... that's what the permanent table is for.

I've got some things to say myself, about logging...

If logging helps, do it. If it hurts, stop.

Diagnostic logging is neither good, nor bad, per se. It's a tool which can be used for good or evil, it's your choice.

Logging contributes nothing to the end product; no customer is going to be impressed by the elegance and thoroughness of your logs.

And last but not least: Debuggers are great, especially the one built in to SQL Anywhere, but they are not a silver bullet.

All debuggers come with a considerable amount of human procedural overhead. I'm not anti-debugger... I once bought a hardware circuit board to debug an IBM PC timer interrupt handler: $800 in 1988 dollars, a week to figure out how to make it work, one single "Aha!" moment, one single line of code to fix, and then it was never used again.

Debuggers are the MRI machines of the programming world, invaluable for some problems but unnecessary for most. In most cases nothing beats a simple MESSAGE statement. In many cases even that's not required, not if you know where the error occurred and you can see the SQLCODE and ERRORMSG().

Show Me The Code!

Here's how it diagnostic tracing works, in Foxhound, starting with the Windows batch file that runs the development build. The heart of this batch file is the dbisql command which launches the debug build SQL file called 004_rroad_build_debug.sql:

Note: The ^ character is the line continuation character for Windows command files.

The PARAMETERS command is not part of SQL, it's a command specific to dbisql that gives names to different [values] passed from the command line. The series of READ statements inside 004_rroad_build_debug.sql tell dbisql to process each named *.sql file in turn, passing on whatever parameter [values] they require.

Inside the [square braces], the names enclosed inside {curly braces} are references back to the PARAMETERS statement.

For example, MONITOR_DEBUG_MESSAGES parameter is second on the dbisql command line, and it is passed on as the first parameter to the file 203_rroad_monitor_sample_loop.sql. For the debug build, the value is ON. For the delivery build batch file, and the corresponding delivery SQL file, the value is OFF.

Since every EVENT and web SERVICE runs on its own connection, every event and service block is responsible for its own connection-level SET TEMPORARY OPTION commands... hence the PARAMETERS command in this SQL file. The handling of {curly brace values} is a pre-compile string substitution process performed by dbisql before the code is passed to SQL Anywhere for processing; here's the SET command after the substitution:

SET TEMPORARY OPTION DEBUG_MESSAGES = 'ON';

For many applications, setting DEBUG_MESSAGES is not that complex: The simplest approach is to set it ON and OFF one time, at the database level, and be done with it:

SET OPTION PUBLIC.DEBUG_MESSAGES = 'ON'; -- or 'OFF'

In a client server environment, your application could pass the following command to the engine after starting a new connection:

SET TEMPORARY OPTION DEBUG_MESSAGES = 'ON';

That might not work too well in a connection pooling environment; for that, it's probably best to turn it ON and OFF at the database level as described above.

Foxhound, however, is built entirely around EVENT and web SERVICE blocks; there are no client server database connections, no connection pools, no applications in the traditional sense, just the web browser.

The MESSAGE statement shown above contains these basic components:

A call to STRING to build the message text.

The 'DIAG ' substring acts as a eyecatcher so you can tell diagnostic text apart from checkpoint messages and other data in the log.

The CURRENT TIMESTAMP is often useful when looking for needles in the haystack.

The CONNECTION_PROPERTY ( 'Number' ) value is helpful with applications that have many simultaneous connections; e.g., busy applications with many users, and Foxhound with its myriad events and web services.

The rest of the data passed to STRING is the actual diagnostic data, unique to each MESSAGE statement.

The TO CONSOLE DEBUG ONLY clauses control where and when the data is written; for more information see here.

Thursday, December 11, 2008

Wednesday, December 10, 2008

In Monday's Windows-Style ORDER BY I answered the question "How do I use ORDER BY to sort alphanumeric strings the same way Windows Explorer sorts file names?" as follows:

"By using CREATE FUNCTION to return an alternative string value that will have the desired result when used in ORDER BY, and then (optionally) calling that function in a new COMPUTE column."

I really should have left out "(optionally)"... the COMPUTE clause isn't just some cute technique to push program code down into the database, it's a valuable performance improvement tool. And without it, in this case, queries run about as fast as continental drift.

Here is an example of a table without the COMPUTE column, followed by a query that refers to the user-defined function in the ORDER BY (see Windows-Style ORDER BY for the reformat code):

On the same 1000 rows, same empty cache, same computer, the query now took less than 1 second. Now the plan looked better, an index scan:

But wait, there's more! If you code the ORDER BY to use the function call instead of the computed column, SQL Anywhere STILL finds the index and uses it, and it STILL takes less than one second:

SELECT TOP 100 START AT 200 * FROM fast ORDER BY reformat ( fast.original_name );

I kid you not; here's the plan... fast.alternate_name does not appear anywhere in the SELECT but the index xalternate_name is still being used:

There is a down side to this COMPUTE clause: the user defined function must be called for each row that is inserted or updated. In this test it took almost 7 seconds to run a LOAD TABLE with COMPUTES ON... but that's a one-time cost, whereas queries tend to be repeated over and over again:

Tuesday, December 9, 2008

Does the creation of the web-based Google Group SQL Anywhere Web Development mark the beginning of the end for NNTP-based newsgroup forums like sybase.public.sqlanywhere.general on forums.sybase.com?

This ain't your father's Google Groups... this is not just an HTTP front-end on top of NNTP data from sybase.public.sqlanywhere.web, it's a whole new implementation: different input, different data store, different display.

Web interfaces generally suck when it comes to viewing threaded conversations because they are slow, awkward to navigate, and (did I mention it?) slow. One of the absolute worst is Sybase's own "Community Forums" web interface to the NNTP newsgroups; check it out at sybase.public.sqlanywhere.web.

Even the much-hyped stackoverflow.com website sucks. Like the new Google Groups, it's a whole new implementation... but go ahead, try to find what you're looking for.

With Google Groups, searching has never been a problem. The new interface at SQL Anywhere Web Development lets you easily post messages via email, and you can reduce the suckage with these tips:

open each new thread in a new tab, and

set Options - View as tree.

If NNTP is going to die, if we have to use web browsers to carry on a conversation, I'd rather it be Google Groups than any of the alternatives.

The alternate_name column is defined as an automatic COMPUTE column based on this expression:

reformat ( original_name )

where reformat performs the following transformation:

The original string is parsed from left to right looking for numeric and non-numeric substrings. Each non-numeric substring is returned as-is. Each numeric substring is right justified with leading zero characters to some long length (say 40 digits, although 5 is the number used here).

The string_part table numbers and stores each numeric and non-numeric substring in a separate row so they can be reformatted and gathered back together again via the LIST function.

@justify_to controls the width of the right-justified numeric strings.

@pos steps through each @char in the @original_name string.

@state implements a state-driven parsing loop, starting with @state = 'start' and then flipping back and forth between 'numeric' and 'non-numeric'.

@string_part is used to gather up all the @char values with the same value of @state.

The REPEAT function is used pad each numeric substring with an enormous number of leading '0' characters, and then the RIGHT function is used to strip off the excess '0' characters from the front of the string.

The LIST function is used to gather all the reformatted substrings back together into the single @alternate_name return value.

And last but not least, if the input is the empty string, so is the output.

Note that the reformat function translates substrings like 5.02 into 00005.00002, resulting in a different sort order than if the 5.02 was treated as a single decimal number and translated into 00005.02. If you want the latter functionality, the specs for the reformat function might have to be changed to something like this:

The original string is parsed from left to right looking for numeric and non-numeric substrings. Each non-numeric substring is returned as-is. Each numeric substring that does not follow a dot '.' is right justified with leading zero characters to some long length. Each numeric substring that does follow a dot '.' is returned as is.

My preference still stands today, even though Version 11 of SQL Anywhere introduced "Maintenance Plans" in Sybase Central to make it easier for you to create SQL Anywhere events that perform database backups and validations.

Here's why I prefer batch files over events, for database backups:

It's easier to write a batch file that records its progress in a text file.

It's easier to make an ad-hoc run of a batch file... just doubleclick on the file name.

It's just as easy to create a schedule via Windows Scheduled Tasks as it is via Sybase Central.

Batch files can be administered without database access or privileges.

More people are familiar with operating system commands than database operations.

Some time ago I wrote a template Windows batch file to give to clients who want to set up scheduled database backups... just one batch file, plus one document explaining in detail how to demonstrate the workings of the batch file.

The actual batch file is listed at the end of this blog posting; here are the contents of the document called "Backup a SQL Anywhere Database Keeping Three Generations of Backups":

1. Pick a folder to contain the command file and the backup subfolders.

Put the backup command file in that folder.

For example: C:\dbbackup\run_dbbackup_full.bat

2. Edit the backup command file.

Change the connection string so that dbbackup can connect to the consolidated database:

Note: The ^ character is the line continuation character for Windows command files.

Thursday, December 4, 2008

In order to preserve some semblance of sanity, the huge Foxhound "To Do" list has been broken into separate files by category: bug fixes, necessary enhancements to the Schema Display portion, enhancements to the Monitor, future changes and so on, plus a "Done" list where completed to-do items are moved when they're done.

The current refactoring walk-through has pointed out the need for a new category: the "Always Watch For" to-do items. These are tasks which must be repeated, sometimes frequently, and are never really "Done".

Here's the brand-new "Always Watch For" list... only three items, but the list is only a day old:

When upgrading to a new version of SQL Anywhere, check code for references to SQLANY11.

When upgrading to a new version of SQL Anywhere, check code in 610_rroad_dbping.sql for version-specific environmental changes (folder names, etc.)

The first one was really nasty... the Foxhound build process can create three different versions: the final delivery version, a version to be used for tuning, and a version with lots of extra diagnostics for use during development only.

Multiple versions of the same code exist in some places, and which version is actually compiled depends on the build version; for example, multiple versions of a CREATE TABLE may exist with more constraints in the development build than the final deployment build. When making maintenance changes it's important to check ALL the versions of the same code to see if they ALL need the same change; e.g., a new column is needed in all build versions.

You can see where this is heading: The diagnostic development build works OK, but not the final delivery version, because a change was made in one place when it needed to be made in two places. In some cases, the result is a "soft error" that is not fatal to Foxhound operation, but causes a runtime diagnostic to be stored.

And that's the reason for the to-do item "Always watch for new problems in Foxhound Options - Display Diagnostics"... over 500 of these messages appeared there before I noticed anything was amiss:

The other two "When upgrading..." items refer to code that is specific to the version of SQL Anywhere used to build Foxhound. Regardless of whether version-specific data is hard-coded or placed in a configuration file, it still needs to be revisited when the version changes... hence these reminders in the "Always Watch For" list.

All the items got added to the list the hard way: I made a mistake, and it's likely I'll make the same mistake again.

There will be lots more entries, I'm sure... the reason I'm talking about it is that you might benefit from creating your own "Always Watch For" list, on your own projects.

Foxhound is of interest to this blog, not (just) because it's a tool dedicated to SQL Anywhere, but because it's written in SQL... at least, 61,000 lines of it is. That's 98% of the total. The other 2% is written in C.

The time has come for me to fix some bugs in Foxhound and make some enhancements, and while I'm at it I'm going to do some refactoring: every time I have to open a module to make some changes, I'm also going to look around for other improvements to make.

And while I'm doing that, I'm also going to look for code samples and programming techniques that are interesting enough to publish here... hence this post, the first one in "Refactoring Foxhound: The Series".

Refactoring Foxhound: Calling C

The only reason any part of Foxhound is written in C is because it can't be done in SQL.

Here is the SQL code that shows how to define a SQL function called get_environment_variable that in turn calls a function of the same name inside demodll.dll... and inside that dll resides the C code (shown later) that actually calls GetEnvironmentVariable.

In this case the SQL function receives one input parameter, the name of the environment variable, and returns four output parameters: the value of the environment variable, a return code and two diagnostic values in case there was a problem.

For example, if you call get_environment_variable with name = 'SQLANY11' it will return value = 'C:\Program Files\SQL Anywhere 11' and return_code = a non-zero value.

If you call it with an unknown variable name, such as 'UNKNOWN', the return value will be empty, the return_code will be zero, and the diagnostic_string will contain 'GetEnvironmentVariable failed'.

Here's the actual demodll.cpp code used to compile demodll.dll via Visual C++ 2005 Express Edition, with line numbers and some explanations.

Note: This is not a tutorial in C++, nor is it a complete explanation of how to write external C functions for SQL Anywhere. For more information on the latter topic see the SQL Anywhere External Function API section in the Help, as well as the sample code in C:\Documents and Settings\All Users\Documents\SQL Anywhere 11\Samples\SQLAnywhere\ExternalProcedures.

Instead, the explanations below are my attempt to draw your attention to some tricky and troublesome bits...

Lines 73 and 74 show the standard entry point for your externally-callable C function. You pick the function name, but no matter what arguments you pass in the SQL CALL, the C routine only ever sees these two parameters: an_extfn_api *api and void *arg_handle.

Line 117 is an exhortation to the maintenance programmer: The five arguments passed in the SQL CALL are numbered 1, 2, 3, 4, 5 and those numbers are hard-coded in the code that follows. Getting one of those numbers wrong can wreak havoc...

The four sections of code above, starting at lines 119, 128, 133 and 138 respectively, show the special SQL-Anywhere-specific code that must be written to return data to the four output parameters in the SQL CALL.

The calls to set_current_timestamp use the DATEADD function to do the date arithmetic. Inside the function, the xp_cmdshell function is used to execute the ECHO, DATE and TIME commands (this has only been tested on Windows XP):

Speaking of date arithmetic, here's a snippet of code from last week's posting MobiLink Fall Back that's worth explaining:

-- If the current timestamp and the last download timestamp -- are both within the "fall back hour", set the last download -- timestamp back to 1:00 AM to catch all changes made during -- the two hour period between 1 and 2 AM.