... Those pesky null things ...

Null values are permissable when creating tables in certain data structures. I have never had occasion to use them since I personally feel that all entries should be coded with some value which is either:

a real observation,

one that was missed or forgotten,

there is truly no value because it couldn't be obtained

other

Null, None etc don't fit into that scheme, but it is possible to produce them, particularly if people import data from spreadsheets and allow blank entries in cells within columns. Nulls cause no end of problems with people trying to query tabular data or contenate data or perform statistical or other numeric operations on fields that contain these pesky little things. I should note, that setting a record to some arbitrary value is just as problematic as the null. For example, values of 0 or "" in a record for a shapefile should be treated as suspect if you didn't create the data yourself.

NOTE: This post will focus on field calculations using python and not on SQL queries..

List comprehensions, truth testing and string concatenation can be accomplished in one foul swoop...IF you are careful.

This table was created in a geodatabase which allows nulls to be created in a field. Fortunately, <null> stands out from the other entries serving as an indicator that they need to be dealt with. It is a fairly simple table, just containing a few numeric and text columns.

The concat field was created using the python parser and the following field calculator syntax.

I have whipped in a few extra unnecessary spaces in the first expression just to show the separation between the elements. The second one was just for fun and to show that there is no need for one of those murderous one-liners that are difficult to edit.

So what does it consist of?

a join function is used to perform the final concatenation

a list comprehension, LC, is used to determine which fields contain appropriate values which are then converted to a string

each element in a list of field names is cycled through ( for i in [...] section )

each element is check to see if it meets the truth test (that is ... if i ... returns True if the field entry is not null, False otherwise])

if the above conditions are met, the value is converted to a string representation for subsequent joining.

You can create your appropriate string without the join but you need a code block.

Simplifying the example

Lets simplify the above field calculator expression to make it easier to read by using variables as substitutes for the text, number and null elements.

List comprehension

>>> a =12345;

b = None;

c ="some text";

d ="";e ="more"

>>>" ".join([str(i)for i in[a,b,c,d,e]if i])

One complaint that is often voiced is that list comprehensions can be hard to read if they contain conditional operations. This issue can be circumvented by stacking the pieces during their construction. Python allows for this syntactical construction in other objects such as lists, tuples, arrays and text amongst many objects. To demonstrate, the above expression can be written as:

Stacked list comprehension

>>>" ".join([ str(i)# do this...for i in[a,b,c,d,e]# using these...if i ])# if this is True'12345 some text more'>>>

You may have noted that you can include comments on the same line as each constructor. This is useful since you can in essence construct a sentence describing what you are doing.... do this, using these, if this is True... A False condition can also be used but it is usually easier to rearrange you "sentence" to make it easier to say-do.

For those that prefer a more conventional approach you can make a function out of it.

Just for fun, let's assume that the values assigned to a-e in the example below, are field names.

Questions you could ask yourself:

What if you don't know which field or fields may contain a null value?

What if you want to flag the user that is something wrong instead?

You can generate the required number of curly bracket parameters, { }, needed in the mini-language formatting. Let's have a gander using variables in place of the field names in the table example above. I will just snug the variable definitions up to save space.

Function: no_nulls_mini

defno_nulls_mini(fld_list): ok_flds =[ str(i)for i in fld_list if i ]return("{} "*len(ok_flds)).format(*ok_flds)

>>> no_nulls_mini([a,b,c,d,e])'12345 some text more '

Ok, now for the breakdown:

I am too lazy to check which fields may contain null values, so I don't know how many { } to make...

we have a mix of numbers and strings, but we cleverly know that the mini-formatting language makes string representations of inputs by defaults so you don't need to do the string-thing ( aka str( ) )

we want a space between the elements since we are concatenating values together and it is easier to read with spaces

Now for code-speak:

"{} " - curly brackets followed by a space is the container to put out stuff plus the extra space

*len(ok_flds) - this will multiply the "{} " entry by the number of fields that contained values that met the truth test (ie no nulls)

*ok_flds - in the format section will dole out the required number of arguments from the ok_flds list (like *args, **kwargs use in def statements)

Strung together, it means "take all the good values from the different fields and concatenate them together with a space in between"

Head hurt??? Ok, to summarize, we can use simple list comprehensions, stacked list comprehensions and the mini-formatting options

a conventional function, requires the empty list construction first, then acceptable values are added to it...finally the values are concatenated together and returned.

And they all yield.. '12345 some text more'

Closing Tip

If you can say it, you can do it...

list comp = [ do thisif thiselse this using these]

list comp = [ do this # the Truth result

if this # the Truth condition

else this # the False condition

for these# using these

]

list comp = [ [do if False, do if True][condition slice] # pick one

for these # using these

]

A parting example...

# A stacked list comprehensionouter =[1,2]inner =[2,0,4]c =[[a, b, a*b, a*b/1.0]# multiply,avoid division by 0, for (outer/inner)if b # if != 0 (0 is a boolean False)else[a,b,a*b,"N/A"]# if equal to zero, do thisfor a in outer # for each value in the outer listfor b in inner # for each value in the inner list]for val in c:print("a({}), b({}), a*b({}) a/b({})".format(*val ))# val[0],val[1],val[2]))

Impact Metrics

Dan, thank you for all this. I have been wrestling with <null> values in Geodatabase tables for awhile and still have not totally cracked that nut.

I took your field calculator code and modified it to fit my fields.

I selected the Python parser and pasted that code string in the lower box so that ST_Code=

"".join([str(i) for i in [!ST_Orig!,!ST_Type!,!ORIG_Yr!,!HARV_Cd!,!HARV_Yr!] if i])

It works with the exception of one niggling thing: If ST_Orig is <null> or blank, the process completes but puts a leading blank space in front of the calculated string. In the old VB code, I used a trim() function to get rid of spaces in the concatenated string. Is there something similar in Python?

It isn't ST_Orig being NULL or having an empty string ("") that is causing the leading space. A NULL will return None, and the way that str.join() works it won't add a space using your existing code:

>>> l =[None,42,1940,0,1999]>>>"".join([str(i)for i in l if i])'4219401999'>>>

If ST_Orig has a "blank" string, i.e., one that has spaces or tabs so it isn't empty, then you might get leading spaces in front of the calculated string.

In the end, you will need to use strip as Blake points you too, but I think it is important for you to understand where and why the leading space is getting introduced rather than just stripping it at the end.

Just stumbled on this thread today as I've been instructed to (re) populate <null> values with blank ( fieldname = "" ). I'm going to agree with Dan's [partial] statement: I personally feel that all entries should be coded with some value but augment it with including the <null> value

When I see a blank field I never know if field_value = '' or ' ', or ' ', or ' '..... You get the picture. If I see <null> that gives me something I can work with. The query IS NULL is just more pleasant than Like ' *' or any other configuration one come up with to find empty fields.

I was just in a meeting where the major topic was data sharing and collaborative efforts. Very trendy topics these days and the suits were busy patting each other on their backs about it. I've been on the receiving end on more than one of these sorts of projects and can honestly say that data clean up is a major time consumer; <null> values make that part of the project way easier....