Elegant PowerBuilder Code

The following tips are for PowerBuilder developers who want to be the best of the best :-). If you think that it will take only a few minutes to read all the stuff, then you are... wrong! Why? Because this page is only the tip of the iceberg - it describes only topics, specific to PowerBuilder, but there are also rules and recommendations addressed to developers using any programming language, so please read "the rest of the iceberg" too:

Local Variables Declaration

Declare local vars in the beginning (on the top) of the method, before the first executable line.

This will:

make it easier to detect all the variables, used in the method, and to follow them;

keep as little stuff as possible in executable code fragments, where programmers should concentrate on business logic.

In PowerBuilder (in contrast to Java), the local variable declaration is not an executable command: the memory is allocated on the stack exactly in the moment when the function is called - together with its arguments. So, it doesn't make sense to declare a variable inside an if construction in hope to improve performance - the declaration will occur in any case, even when the program flow will not go into the if.

Use CHOOSE CASE TRUE for a series of stand-alone validations

Use one compact choose case true block to "pack" a series of Boolean expressions and/or Boolean functions.

So, instead of

if not IsValid(ads) then
ls_err_msg = "Passed datastore is not valid"
end if
if ls_err_msg = "" and IsNull(al_row) then
ls_err_msg = "Passed row must not be null"
end if
if ls_err_msg = "" and al_row < 1 then
ls_err_msg = "Passed row must be greater than 0"
end if

Indentation in CHOOSE CASE

Avoid an unneeded indenting inside of choose case construction.

Speaking about choose case-es, let's mention the rule to avoid extra indenting. In the previous code example, I didn't put a tab before case not IsValid(ads) (and the rest of the cases), and that allowed me to use only one tab (instead of two) in the next line (ls_err_msg = ...): logically, this line has the same nesting level as if it would appear in an if construction, so why to use MORE tabs to express a SAME level? This style is looking pretty shocking in the beginning, but try it in your practice, and you will really enjoy more understandable scripts, especially when choose cases are nested (or mixed with ifs and loops).

Change objects' default names

Make sure all the objects have meaningful names instead of the default ones (like dw_1, cb_1).

PB helps us with that automatic naming, but sometimes developers forget to rename new objects to something more self-explanatory (like dw_order_header, cb_add_order_line).

If your specific window is inherited from an ancestor having its own objects under universal, but ugly names like dw_1, you physically cannot change these names in the descendant. But there is another simply solution to keep the scripts nice - for example, the next two steps let you get rid of the name dw_1 in your scripts:

In the descendant window, declare an instance variable of DataWindow type (or a type, used in your framework as the base DW class) and with a meaningful name (for example, idw_students).

In the Open event, make that variable pointing to the DW:

idw_students = dw_1

From this moment on, only idw_students must be used in the scripts - forget about dw_1!

Technical code vs. business code

Don't mix error processing (which is a purely technical code) with business logic.

"...and God divided the light from the darkness."

The First Book of Moses

The problem, described here, can appear if functions return succsess/failure code instead of throwing exceptions. So, if exceptions are utilized in your project then you can skip this paragraph.

Developers, reading the script, should concentrate
only on the business issues. They don't have to investigate what is an error
processing code (which is not interesting and should be automatically
skipped) and what is the "interesting"
"meat" of the program. In the following example, the method uf_get_orders_count()
returns 0 (no orders found) or a positive number of found orders, and
different business logic should be applied in these two cases; but
the method can also return -1 which means failure to obtain the data
(error) - the last is a very boring technical stuff which must be
taken away from the business issues. See the next patterns and don't
forget that the code fragments, appearing here as a couple of words
in square brackets, really may be huge fragments of code!

li_orders_count = this.uf_get_orders_count()
// The next 'if' block is a standard technical code, so script readers will skip it automatically:
if li_orders_count = -1 then // error :-O
[code fragment for banal error processing]
return -1
end if
// Here begins the "interesting" business logic:
if li_orders_count = 0 then // no orders found... :-(
[code fragment for "No Orders Found" business scenario]
else // order(s) found!!!
[code fragment for "Order(s) Found" business scenario]
end if

Return policy

Functions must return values to the calling script only when it makes sense.

A function is allowed to return a value using return statement only if at least one of the following conditions is true:

The main purpose of the function is to obtain the value, and there is only one value to return.

The main purpose of the function is to perform some action, but the returned value is important for the calling script (not for error processing - there are exceptions for that!), and there is only one value to return - for example, the main purpose of uf_retrieve is to retrieve data from the database, but, in addition, it returns the number of retrieved rows so the calling script is more efficient because it doesn't need to call RowCount().

A function must NOT return a value using return statement (i.e. "(none)" must be selected as the returned type in the function's signature) if at least one of the following conditions is true:

The function's purpose is to perform some action, and there is nothing useful to return to the calling script.

The function (of any purpose) must return more than one value - they all (!!!) must be returned using ref (out) arguments. It is considered a very bad programming style if both the mechanisms are used - return statement AND by-reference arguments!

You can ask: what is the problem to have meaningless, but harmless return 1 at the bottom of the script? Nothing catastrophic, but the return value is a part of the function's contract with the outer world. Each detail of that contract is important and must have sense! Looking at the function's interface, developers make conclusions about its functionality, so if a value is returned, that has a reason and the returned value should be somehow processed in the calling script... You know, it's like adding to a function of one extra argument of type int which is not used inside, and always passing 1 through it - that argument will be harmless and not catastrophic too, but needless and foolish in the same way as the discussed return 1.

Use REF keyword

When you pass actual arguments to a function by reference, always add ref.

In fact, this short keyword is playing the role of a comment:

dw_main.GetChild("status", ref ldwc_status)

It really helps to understand scripts, especially when calling functions have multiple arguments in both the logical directions - "in" and "out". It was a bad solution of PB creators to make ref an optional keyword - let's make it required of our own free will!

Constants used all over the application

Constants, used in the application globally, must be declared in not-autoinstantiated NVOs - one NVO for each constants group ("family"). Don't declare variables of the NVO's type; instead, use the automatically created global variable with the same name as the NVO.

For example, you can have an NVO named n_color for color constants, n_order_status for order statuses constants etc. It's a good idea to keep them all in one PBL so developers can easily check if the NVO for the group they need already exists (to avoid creation of a duplication). The constants are declared in the instance variables declaration section in a very straightforward way. Here is a possible set for n_order_status:

If you open any not-autoinstantiated class in the "View Source" mode then you can see that PB automatically creates a global variable of that object's type with the same name as the class (violating the naming convention, ha-ha!). For example, if you create n_order_status class (not-autoinstantiated, remember?) and look at its source, you see the following line:

global n_order_status n_order_status

The first appearance of n_order_status is the data type, and the second - the name of that global variable. So, you don't need do declare a variable each time you need to use a constant - you always have one ready for use! That's how you mention a constant in the code:

if ls_order_status = n_order_status.OPEN then...

The created NVO must contain constants for ALL the codes of the group, even if now you need only some of them - that will help developers to see the whole picture for the given family. If the codes are stored in the database then it's a good idea to add a comment with the SQL SELECT retrieving them.

For standalone constants (not belonging to any family) create one common NVO named, let's say, n_const.

Constants serving local needs

Build names of local and instance (used only in a particular object, NOT all over the application) constants from 2 parts:

Constants group, or "family" (like ORDER_STATUS or INV_STATUS);

Description (like OPEN or CLOSED).

Divide them by two underscores (in contrast to one underscore dividing words inside of each part) by this pattern: FAMILY_PART__DESCRIPTION_PART. That will help code readers to easily distinguish between the parts.

That approach allows many constants with a same description to co-exist in a same scope, for example:

Null constants

Have a ready set of nulls (of different data types) instead of using SetNull() each time you need a variable with null.

Sometimes we use variables, which are set to null - for example, to pass null to a function, or, oppositely, to return null from a function. Usually developers declare a variable and nullify it with SetNull() in the same script where they are used, but these two lines (really, hundreds lines all over the project!) can be avoided if you declare constants of different data types in a globally-accessed object. Technically, PowerBuilder doesn't allow to initialize constants with NULL, so we cannot use the method, described in "Constants used all over the application" section (i.e. to create an object n_null with a set of NULL constants). Instead, we will utilize any class, a global object of which is created in the application (an object, providing different technical services - like n_util - is an ideal candidate) and declare a set of public instance variables using privatewrite access modifier. Then we will nullify them using SetNull() in the Constructor event. Thus, we will have physical variables which logically acts as constants.

So, the first step is to declare the NULL vars in the instance variables declaration section of the selected class:

Change default access level of class members

Don't leave the class members public as they are created by default - change the access level to reflect their practical use.

Remember, that PowerBuilder's instance variables and functions are public by default. Methods access can be changed to private or protected using the dropdown in the declaration section, and instance vars must be "privatized" by adding the private or protected keyword before each var, or the same keyword with a semicolon before a group of vars. Pay attention - if no scope identifier is mentioned explicitly then the vars are public (in contrast to C#, where "no scope identifier" means "private")! IMHO, it was a bad solution of PB authors (everything should be private by default and be changed only if a need arise!), but that's what we have... So, when you are creating a new instance variable or a function, and your brain is busy thinking about the functionality being developed, don't forget to change the access level to what it should be!

Why is this advice important? We can see many PB applications with all functions public, and nothing bad is happening... I would say, it is not too important in the development stage, but when the maintenance time comes... Let's say, I have to add and even change functionality of an existing object. If the stuff I am touching is private then I know that my changes will not cause straightforward damage in other (sometimes unexpected) parts of the application - for example, I can change the number of parameters or even the name of a private function without having a chance to do any harm to other objects. But if the function is declared as public then I need to perform the Global Application Search (which is extremely slow in PowerBuilder, especially in serious apps) to make sure the function is not called outside the object (and if it is called then, may be, to change the technical solution).

Law of Demeter (or Principle of Least Knowledge)

A given object should assume as little as possible about the structure or properties of anything else (including its subcomponents) (Wikipedia).

In simpler words: if you want to get data from an object, referenced by a variable, then use a public function, declared in the referenced object itself, but not variables and functions of its nested object(s).

*** BAD code: ***

li_age = lw_AAA.tab_BBB.tabpage_CCC.dw_DDD.object.age[1]

*** GOOD code: ***

li_age = lw_AAA.uf_get_age()

In both the cases we have the same result - the variable li_age is populated with a same value. But there is a huge difference if we are talking about possible dangers! In the BAD code, the calling script inserts its "dirty hands" into internal mechanics of another object. In the GOOD code, only the referenced object itself is responsible for its internal affairs. If the age's storing method in the window has changed (for example, now it will be calculated based on the birth year using a computed field "c_age"), it's enough to change only uf_get_age() function's implementation (with no change to the interface), and all the calls to that function all over the application will keep working.

The described issue doesn't exist if data is properly encapsulated, but PowerBuilder has serious problems in this area: while instance variables can be declared as private or protected, on-window objects (including DataWindows whith their data) are always public. So it's enough to have a pointer to a window to access its DWs. It's possible technically, but let's avoid that!

Don't DESTROY objects

Don't destroy objects if that requires extra efforts. Don't waste time for memory management if a created object is passed to another function or object - trust on garbage collector.

In our days, there is no need to destroy previously created objects - the garbage collector will clean the memory for you. GC is a great invention! Now we are not afraid of visitation of God if we disobey the rule "destroy in the same script where created" - we simply don't think about destroys at all, concentrating on more important stuff. We can pass the created object to another function (even POSTed!) or object (for example, using OpenWithParm) and forget about memory leaks! But garbage collection also helps to make code more elegant and even more efficient. How? In a few ways:

If you destroy an object in your code - it means that the memory is being freed just when the program flow comes to the destroy command. Are you sure the time must be spent on this operation exactly in this moment? And if your whole script performs a time-consuming operation? And if you destroy a few objects? And if you check with IsValid before destroys? Oppositely to all that, GC will free the memory when the application is inactive, so the user will not experience any decline in performance.

Less lines of code (sometimes we had a number of destroy commands, ornamented with if IsValid...).

Before the garbage collection era, if we wanted to exit a function a few times (i.e. to write return in different parts of the script), we had to destroy our garbage objects before EACH return. If we wanted to write all the destroys and the return only once (in the end of the script), then we had to keep the track with a specially created local Boolean success/failure flag - that meant not only more lines of code but also one more tab of indentation.

No global variables and functions

Never create global variables and functions. A function, accessible from everywhere, must be created as a public functions of an NVO.

They are an atavism survived from the early days of programming. Modern technologies (like Java and .NET) don't have these obsolete features at all. PB has them only for backward compatibility, so don't create new ones (there is only one exception - global functions, used in DW expressions if other solutions are more problematic).

All developers, using the Object-Oriented approach, know about encapsulation, so usually there are no questions about global variables - they are an "obvious evil". But what is so bad with global functions? If you have a small, simple function, then making it a public method of an NVO (instead of a global function) seems to bring no advantage at first glance. BUT...

If you will want to extract a part of the function's code into a new function (according to the principle of creation a subprogram for each logical task or simply because the script has become too long) - it will be impossible without creating another global function. And if you will need 20 such functions in the future? So, you have two bad choices: to create an additional global function or not to create it (in the last case the script will be left as a long, unreadable and hardy managed buggers muddle). But if you have created an NVO as a container for your function (which is declared public), then you can add to that NVO any number of additional "service" methods (private if they are not intended to be called from outside).

If you need to create a number of different functions, related to a same task/flow, putting them in one NVO will not only decrease the quantity of objects in the PBL, but will also signal that they are somehow related to each other. It is definitely better than a PBL overloaded with crazy mix of tens or even hundreds of global functions, belonging to different logical units.

The programmed process may require to store data (for example, between calls to the function, or to cache data, retrieved once, for multi-times use by different consumers over the application). If a global function is used, your bad choices are global vars and using another NVO (in the last case you will have related stuff in different locations). But if you have created the function as an NVO method, then there is no problems - declare instance variables (as well as constants for safer and more elegant code).

Avoid SQL

Don't write SQL in PowerBuilder.

Why - read here about the concept of a SQL-free application. Now I only want to suggest replacement for different kinds of SQLs in PB:

Bad, archaic solution

Suggested replacement

Remarks

Cursor, embedded to PowerScript

DataStore

PB cursors are very inefficient, they are not allowed at all and must be replaced in old code.

As an exception from the rule, SQL-based DWs are allowed for basic database operations (when no business logic is involved) with static (reference) tables (for example, to manage entities' statuses etc.).

If you decide (or forced by company standards) to ignore this advice and embed SQL into your PowerScript code, then at least encapsulate each SQL statement in an apart PB function. The reason? Nothing new.

Don't duplicate conditions in Boolean expressions

Don't add to a Boolean expression conditions which don't change the final result, produced by the expression.

Foolish advice, you say? Why to add them? But this advice has been born during my code inspection practice. Look at the following nice examples:

if (not IsNull(ld_expire_date)) and ld_expire_date < ld_today then

is logically the same as

if ld_expire_date < ld_today then

because when ld_expire_date contains null then the second part of the expression will produce null, and nulls are interpreted by code branching operators as false.

if dw_xxx.RowCount() > 0 and dw_xxx.GetSelectedRow(0) > 0 then

must be re-written as

if dw_xxx.GetSelectedRow(0) > 0 then

because when dw_xxx contains no rows, dw_xxx.GetSelectedRow(0) will never return a positive number.

if (not IsNull(ids_xxx)) and IsValid(ids_xxx) then

is the same as

if IsValid(ids_xxx) then

because if IsValid returns true (i.e. memory is allocated for the pointed object) then, of course, the variable is pointing something (i.e. is populated), so testing it with IsNull will produce true.

if ll_row_count > 0 then
for ll_row = 1 to ll_row_count
[...]
next
end if

will work exactly in the same way as simply

for ll_row = 1 to ll_row_count
[...]
next

Forget the keyword DYNAMIC

Don't call functions and events dynamically. Instead, cast to the needed data type (copy the pointer to a variable of the class the called function/event is defined in) and do a static call.

Usually, that approach makes the code a little bit longer (sometimes requiring a whole choose case block instead of a single line of code), but the advantages, listed below, are more important:

Type-safety. A non-existing script will never be called after possible changes in the application in the future. Instead, a compilation error will force developers to find a solution.

Clarity to code readers. Developers immediately see the true picture. And, oppositely, dynamic calls hide the whole situation and require guessing or an investigation for understanding.

Remember that calling events with TriggerEvent and PostEvent is also dynamic, so use the modern dot-notation syntax instead (like dw_main.event ue_after_open()).

Pass inter-objects parameters by name

Passing parameters between objects, use one of methods where parameters are accessed by a mnemonic name, not by a meaningless index of an array.

...as it unfortunately happens when programmers decide to use an NVO or a structure with arrays having numeric (by index) access to the elements. Here is an example of that very BAD approach:

The index-by approach is extremely inconvenient and terribly bugs-prone!!! Imagine, that a window can be open from different places in the application, with a similar, but unique sets of parameters in each case. And even more: some of these sets (or all) can be changed in the future... Brrrrr!!!!.

There are a few methods to pass parameters where they are accessed by a mnemonic name.

No need to create tens or even hundreds of custom structures or NVOs for each individual case, so the PBLs will be less crowded.

Less maintenance efforts - you don't have to check out, change and check in that class when passed sets of data are changed - you only change the sending and the receiving scripts.

Disadvantage:

Less convenience working with nested data structures (you have to unpack passed structures and classes into variables before using their members).

Avoid row-by-row scanning

Don't process DataWindows and DataStores row by row if the task can be accomplished in another, more efficient way.

For example:

When you are looking for a value (or values combination), use Find function instead of row-by-row comparison. If you want a row (for example, the current row) not to be searched (it can happen when you want to check if there are more rows with the same value(s) as in the current row), call Find twice: from the first row to the before-skipped row, and then from the just-after-skipped to the last row.

If there can be more than one row to be found then use a loop, but iterate using Find(), without scanning ALL rows:

The function Describe() can be useful - for example, the following code obtains the maximum Effective Date of all rows:

ld_eff_date = Date(ids_pp.Describe("Evaluate('Max(eff_date)',1)"))

Use a computed field instead of making the calculation in a code and assigning the results to columns in each row.

If you need to assign a same value to a column in ALL the rows (like a coefficient another column should be multiplied or divided by), make that column a computed field with a very simple expression - "1", and change that expression programmatically. Suppose, a variable ll_coef_to_divide contains the result of a calculation in PB code. Here is the bad (not efficient) solution (assuming that the field is not a computed but exists in the DW's data source):

for ll_row = 1 to ll_row_count
dw_XXX.object.coef_to_divide[ll_row] = ll_coef_to_divide
next

To make the assignment at one stroke, the field should be a computed one. The value, returned by it, is assigned this way:

Don't implement functionality in built-in events

If the functionality is longer than a few lines then don't implement it in system (built-in) events (Clicked, ItemChanged etc). Instead, create well-named functions and call them from those events.

Advantages of this approach:

The functions' names will tell us what the functionality is.

If a built-in event should perform more than one logical task then implementing each task in its own function will conform to the best coding practices. It's ok to call two or three function from a system event, but if the script grows and begins to have logic then... it becomes "implementing functionality" by itself and, therefore, needs to be extracted into a function!

One day that functionality can need to be called from another place. Calling a well-named function is not only more elegant (you shouldn't call GUI events unless you're using call super from a descendant script) but also enables developers to add arguments - today the required functionality is identical, but sooner or later it can branch.

Someone will try to implement something in the ancestor ButtonClicked, thinking it's only going to fire when there's a button clicked. Then you'll end up with some spaghetti solution to keep track of whether this is a non-button-clicked ButtonClick... Ugly!

Different DataWindow objects (buttons, fields) require different processing in such events as Clicked and ItemChanged, so there is no reason to put their processing (not related to each other) in one script. In fact, these events are only traffic hubs for routing the program flow to different functions using choose case construction (except the cases when the processing code is simply a few lines).