kdbfaqhttp://www.kdbfaq.com/kdb-faq/Answers. Fast.Thu, 29 Oct 2015 05:05:39 +0000This work is licensed under a Creative Commons License.en-USSquarespace V5 Site Server v5.13.478-267 (http://www.squarespace.com)kdbfaqAnswers. Fast.kdb book onlinekdb,kdbfaq,q,kdb/q,kdb+,kdb,book,onlinekdbfaqfeedback@kdbfaq.comHow does fby work?SQL HAVINGSyntaxamendfbyflilter-bykdbfaqWed, 21 Nov 2012 20:17:26 +0000http://www.kdbfaq.com/kdb-faq/how-does-fby-work.html817816:9603847:30749085According to code.kx.com, fby is short for filter-by, and it is commonly used as the q equivalent to SQL's HAVING clause (though, like where, fby is a q function, and its use is not limited to where clauses).

What fby does is aggregate values from one list based on groups defined in another, parallel, list. For example, suppose we have one list of cities and another list with a few temperature samples for each. We can use fby to calculate the minimum temperature sample for each city, and then replicate those values at each position for each corresponding city:

We can group the temperatures for each city together by indexing temp with the value of the grouped city dictionary:

q)grouped: value group city
q)temp[grouped]
32 31 12
75 70
69 68
q)

Note that the result of indexing temp with grouped is a nested list with the same shape as grouped. This is a general principle: the result of an indexing operation has the shape of the index.

Now we can apply an aggregation function to each of the temperature groups:

q)min each temp[grouped]
12 70 68
q)

We're almost there. The real trick of fby is placing each aggregation result into a new list so that each element has the correct value per the grouping list. We can use @ (functional amend) to get the job done (see also the functional apply/amend faq):

The real fby is just slightly more complicated to ensure that the first argument to @ has the correct type.

]]>http://www.kdbfaq.com/kdb-faq/rss-comments-entry-30749085.xmlWhat does it mean for a table to be a flipped dictionary?K object structureTablesdictionarydictionary to tabledictionary typeflipk.hkeykeyed tablekeyslist to tabletabletype numberkdbfaqMon, 29 Oct 2012 01:52:39 +0000http://www.kdbfaq.com/kdb-faq/what-does-it-mean-for-a-table-to-be-a-flipped-dictionary.html817816:9603847:12561136Short answer:

A table is a reference to a (column) dictionary.

The internal representation of a table is nearly identical to that of a dictionary. We can use the flip command to create a table from a dictionary:

In fact, when a dictionary is flip'ed, the underlying core data structure remains untouched. The table itself is a simple, small object that refers to the original dictionary. Using .Q.w, we can measure how much more memory a table takes than the corresponding dictionary:

Then all you need to do is to apply the process handle to a list whose first element is the function name (as a symbol) and whose remaining elements are the arguments.

For example, let's start one q process, which will be our server, listening on TCP port 5012. In our server, we'll define a function square (we'll make the background color different for the server to make it easier to distinguish from the client):

To call a function with more parameters, simply add them to the end of the list. We'll demonstrate by defining a 2-argument function on the server that calculates the length of a right triangle's hypotenuse:

q)hypotenuse: {sqrt sum square x,y}
q)hypotenuse[3;4]
5f
q)

q)phandle (`hypotenuse; 5; 12)
13f
q)

What if the function you're calling doesn't take any parameters? For example, we'll define a function in the server called serverTime that returns the local time according to the server:

q)serverTime: {[] .z.T}
q)serverTime[]
11:51:34.762
q)

You can't pass zero parameters over IPC:

q)phandle enlist `serverTime // a list with just
'length // the function name
q) // is not allowed

So far, all of our examples involved a client invoking a predefined function on the server. You can also pass a function defined on the client to be executed on the server. To see this, let's define a global variable on the server:

q)SERVERGLOBAL: 47
q)

Now, on the client, we'll define a function called getSERVERGLOBAL to retrieve the value of SERVERGLOBAL on the server. Instead of passing the name of the function (i.e., `getSERVERGLOBAL), we pass the function's value:

There is one more way to convey code to the server to run: you can pass the code in a string.

q)phandle "SERVERGLOBAL + 4"
51
q)

We prefer passing a function over a string, because - especially as the expression to be passed gets more complex - it's easier to read.

]]>http://www.kdbfaq.com/kdb-faq/rss-comments-entry-17318533.xmlHow do I use the functional forms of apply and amend?.@Functionsamendapplyfunctional form amendfunctional form applyprotected executionkdbfaqMon, 27 Feb 2012 03:27:37 +0000http://www.kdbfaq.com/kdb-faq/how-do-i-use-the-functional-forms-of-apply-and-amend.html817816:9603847:12386134Typical q code operates on all of the elements of a container at once:

q)container: 1 2 3
q)100 * container
100 200 300
q)

Sometimes we are interested in only a subset of elements from a container:

Sometimes, however, you need to update particular elements of a structure while leaving the remaining elements unchanged. That's what functional apply and amend are for; they transform specific elements of a container without touching the others. The variations are distinguished by 3 choices:

Which operator is used, @ or ., determines the interpretation of the indices used to select the elements to transform.

2. container or name

The first argument is either the value of a container or the name of a global variable referring to a container. In the former case, a new object is returned; in the latter, the global variable is modified and its name is returned.

3. monadic or dyadic function

If the transformation requires additional information beyond that contained in each element itself, that is accomplished by using a dyadic function and supplying the additional information in a fourth argument to the operator.

Lastly, note that there is another pair of overloads for @ and . - each with three arguments - called protected execution, which are invoked when the first argument to @ or . is a function or projection; protected execution is discussed in another faq.

]]>http://www.kdbfaq.com/kdb-faq/rss-comments-entry-12386134.xmlHow do I use the functional form of @ (at) apply?@Functionsamendapplyfunctional form applykdbfaqMon, 27 Feb 2012 03:26:40 +0000http://www.kdbfaq.com/kdb-faq/how-do-i-use-the-functional-form-of-at-apply.html817816:9603847:12582965Short answer:

(Note that there is another overload for @ with three arguments, called protected execution, which is invoked when the first argument to @ is a function or projection; protected execution is discussed in another faq.)

In the 3-argument case, q applies function to those elements of container specified by indices, leaving the rest of container alone. In other words, the behavior of 3-argument @ (apply) resembles that of the following function:

Although this model breaks down when the first argument is a global variable name, it is very helpful in understanding what is going on even in that case. Let's look at a couple of simple examples to clarify this:

Next, we'll consider the second case listed at the top of this faq. When function is dyadic and a fourth argument is supplied, @ (apply) behaves like the following function:

The basic idea is still the same, i.e., to transform selected elements while leaving the rest of the container intact. The difference is that, instead of function being modified by each, it is modified by '(each-both) (also see this faq on each-both and multivalent each), so that the selected elements of container are paired up with the corresponding elements from second_args:

Where our models do break down is when, as alluded to earlier, the first argument is the name of global variable referring to a container, rather than the value of a container. In that case, the mechanics of the operation are the same, but original container is modified, and the return value is the name:

q)list: 1 + til 8
q)@[`list; 0; neg]
`list
q)list
-1 2 3 4 5 6 7 8
q)

This behavior is handy for writing functions of your own that work in both scenarios - either creating a new value or modifying one in place - like the following:

We can also use @ to apply a function to selected entries in a dictionary:

Since we can index tables using column names, we can double all of the entries in col2 as follows:

q)@[t; `col2; 2*]
col1 col2
---------
foo 10
bar 20
baz 30
q)

]]>http://www.kdbfaq.com/kdb-faq/rss-comments-entry-12582965.xmlHow do I use the functional form of . (dot) apply?.Functionsapplydotfunctional formkdbfaqMon, 27 Feb 2012 03:20:00 +0000http://www.kdbfaq.com/kdb-faq/how-do-i-use-the-functional-form-of-dot-apply.html817816:9603847:12584888Short answer:

(Note that there is another overload for . with three arguments, called protected execution, which is invoked when the first argument to . is a function or projection; protected execution is discussed in another faq.)

In the 3-argument case, q applies function to those elements of container specified by indices, leaving the rest of container alone.

The behavior of . (apply) is very similar to that of @ (apply) (which is described in another faq). The only difference between @ (apply) and . (apply) is that the indices, in the case of . (apply), are applied at depth along the dimensions of the container. The following code behaves like . (apply) for a two-dimensional container:

To explore the different treatment of the indices argument between @ (apply) and . (apply), we'll consider the simplest, multi-dimensional container: a two-dimensional list, aka a matrix.

The . form of apply has another trick up its sleeve: the empty list index. When the second parameter to . (apply) is (), the entire container is passed to function in a single invocation, as the following example demonstrates:

When using . (apply) in this manner, we can return anything from function; the type and shape do not matter:

q).[matrix; (); {"hello"}]
"hello"
q)

Returning to the second case, when function is dyadic, . (apply) takes a fourth argument (named second_args in our example), and each indexed element of container is paired with the corresponding element of second_args. This means that second_args must conform to indices (or be an atom). The following function expresses this behavior:

]]>http://www.kdbfaq.com/kdb-faq/rss-comments-entry-12584888.xmlHow does each-both (aka multivalent each) work?Operatorsapply function for each table roweach bothmultivalent eachkdbfaqSun, 26 Feb 2012 21:34:00 +0000http://www.kdbfaq.com/kdb-faq/how-does-each-both-aka-multivalent-each-work.html817816:9603847:14777066Short answer: It takes a function of N operands and creates a new function that applies the original to each set of corresponding elements from N same-length vectors.

We can categorize the ways in which q handles vectors into the following three groups:

1) Treat the vector as a single entity.
2) Perform an operation on each element of a vector.
3) Perform an operation on each set of corresponding elements from multiple vectors of the same length.

We use ' (each-both) to ask q to apply a function in the third manner.

Let's review a few examples to distinguish the three forms of vector handling described above. Consider the , (join) operator, which takes two arguments and concatenates them. When used by itself, , (join) treats a vector as a single entity:

Notice that the lengths of the two vectors to be joined doesn't matter; in fact, they don't have to be vectors.

Next, as an example of applying the same operation to each element of a vector, we can use \: (each-left) to append the letter "e" to each element in a vector of strings:

q)("car"; "far"; "mar") ,\: "e"
"care"
"fare"
"mare"
q)

Lastly, we'll use ' (each-both) to join corresponding elements from two vectors of the same length (like zip from Python or Haskell):

q)1 2 3 ,' 10 20 30
1 10
2 20
3 30
q)

As an aside, the use of ' (each-both) is not needed for many built-in functions in q, because those functions (sometimes referred to as atomic functions in kx documentation) automatically assume that this is the desired behavior when presented with same-length vectors as arguments. The + operator is a good example:

q)1 2 3 + 10 20 30
11 22 33
q)

Even a user defined function, if that function's body is exclusively made up of applications of atomic functions, does not require the use of ' (each-both) to display this behavior:

The last example shows that such user defined atomic functions are truly atomic in the eyes of q.

The following code simulates the behavior of ' (each-both) - on functions of two arguments - by creating a new function (that acts on vectors) from the function passed as an argument:

However, ' (each-both) doesn't stop at two lists, which is why it is sometimes called multivalent each. Suppose we need to generate a bunch of html hyperlinks for a web page we are creating on-the-fly (perhaps in a custom http POST handler). Each link needs to have its own styling, so we want to add a distinct class attribute to each anchor tag.

On more wrinkle: sometimes ' (each) is used where \: (each-left) or /: (each-right) would be more explicit. For instance, our earlier example in which we appended the letter "e" to each element of a list of strings could have been written as follows:

When trying out ideas at the console as in the above example, the full details of the error are available on the screen. Getting those error details programmatically is a bit trickier, however. Consider the following function, in which we use protected execution to invoke an error handler when the 'os signal is raised:

Like other built-in functions that take two arguments, : (amend) can be called using either infix notation (as above) or function call notation:

q):[foo; 747]
q)foo
747
q)

Note that : (amend) displays its special semantics (which it shares with assignments in all strict languages) of not evaluating its first argument when that argument is simply a name, regardless of whether : (amend) is invoked infix or functionally.

]]>http://www.kdbfaq.com/kdb-faq/rss-comments-entry-12585799.xmlWhat is q's equivalent to sql's ORDER BY?ORDER BY ASCORDER BY DESCTablesascdescorder bysortxascxdesckdbfaqSun, 07 Aug 2011 18:29:03 +0000http://www.kdbfaq.com/kdb-faq/what-is-qs-equivalent-to-sqls-order-by.html817816:9603847:12425392xasc and xdesc. Consider the following table:

Getting the milliseconds from a time is slightly less obvious. Times (type -19) are represented internally by q as 32-bit integers; typically the value counts the number of milliseconds since midnight, but it can also represent a span of time. We can cast freely back and forth between the two types and the values are preserved:

Although extracting milliseconds from a time while keeping the time type (as in the second example above) is sometimes useful, we normally want to get back these components of a time as integers, so let's cast it:

q)`int $ now mod 1000
233
q)

By the way, there is a shortcut for getting hours, minutes, and seconds from global variables that hold times: dot notation.

]]>http://www.kdbfaq.com/kdb-faq/rss-comments-entry-12252279.xmlHow do I get the length of a list?Operatorslengthwhere clausewhere functionkdbfaqTue, 19 Jul 2011 02:02:41 +0000http://www.kdbfaq.com/kdb-faq/how-do-i-get-the-length-of-a-list.html817816:9603847:12157396Short answer:

count listorcount each column_nameto get the length of a list-valued field in a query.

The length of any list can be found using the count function; the only complication occurs with certain applications of count in queries. We'll illustrate with a table of simulated trade data:

Suppose, however, that we want to know how many trades occurred for a particular symbol:

q)exec count price from grouped where sym = `aif
1
q)

What went wrong? A where function yields a list of row indexes that meet the constraints, and then each projection (i.e., the column names between exec and from — in this case, price) yields a list of corresponding field values. Since the number of rows that met our sole constraint is 1, the result of the projection is an untyped list with a single element:

Projection results are the arguments to aggregation functions in queries. In other words, the untyped list in our example is the same as the one passed to count. Since the projection's single element contains the list of prices we want to count, the way out is to combine count with first:

q)exec count first price from grouped where sum = `aif
6
q)

Applying the same logic when counting the trades for every symbol, we need to use counteach:

Newcomers to q often expect the above query to return the rows from t whose z column has more than 5 characters (i.e., c 3 frobozz). Rather than counting the contents of each z field, however, count is actually counting the list t `z:

In many programming languages, including q, anytime you add integers (unless you somehow know for sure that the sum will fit in - roughly - 31 bits) you risk integer overflow. If you're lucky, the error will be obvious (e.g., you'll get a negative number when you expected a positive one). Casting the arguments to sum to long will give you (almost) 63 bits of breathing room.

]]>http://www.kdbfaq.com/kdb-faq/rss-comments-entry-11162651.xmlDoes the order of field constraints matter in the performance of a select query?Performanceevaluation orderhdboptimal where clause constructionquery optimizationwhere clausewhere functionkdbfaqSat, 04 Jun 2011 06:00:06 +0000http://www.kdbfaq.com/kdb-faq/does-the-order-of-field-constraints-matter-in-the-performanc.html817816:9603847:10645693Short answer: Yes, put the most restrictive constraint first.

Although expressions in q are usually evaluated from right to left, there is one exception: the constraints in a where clause are evaluated from left to right. Consider the following example:

The comma separating x < `c from y > 1 is not the join operator; instead, it delimits the constraints of the where clause. If you need to use the join operator in a where clause, use parentheses:

q)select from t where x in (`a, `b), y > 1
x y
---
b 2
q)

Each constraint is evaluated in the usual order, i.e., right-to-left:

q)select from t where x < first -1 # `p`c, y > 1
x y
---
b 2
q)

Because constraints are evaluated from left to right, putting the most restrictive constraint first will speed up queries on large tables. On a date partitioned database, for example, the placement of a constraint on the virtual date column will make a marked performance difference:

select from large_table where date = 2011.02.25, name = `JOE

is significantly faster than

select from large_table where name = `JOE, date = 2011.02.25

The latter query will attempt to load the entire contents of table for rows matching name `JOE prior to narrowing the result set down to a single date. kdb does not provide behind-the-scenes optimization in cases like these.

]]>http://www.kdbfaq.com/kdb-faq/rss-comments-entry-10645693.xmlHow do I set a breakpoint inside a q function?:::Debuggingbreakpointcommand line argumentsresume executionkdbfaqThu, 02 Jun 2011 05:47:01 +0000http://www.kdbfaq.com/kdb-faq/how-do-i-set-a-breakpoint-inside-a-q-function.html817816:9603847:10634490Consider the following function, testfunc:

testfunc: {[]x: `aaa;y: `bbb;z:: `ccc;-1 "finished!";}

Let's demonstrate the placement of a breakpoint prior to the assignment of global variable z. Although there is no explicit support for breakpoints in q, insertion of non-compliant code, such as breakhere; shown below, does the job (don't forget the trailing semicolon):

]]>http://www.kdbfaq.com/kdb-faq/rss-comments-entry-11136363.xmlI have a very large file to uncompress. zip and gzip do not work!7-zip7zaUtilitiescompresscompressionfile size limit exceededfile too largegunzipuncompresszipkdbfaqThu, 02 Jun 2011 01:46:54 +0000http://www.kdbfaq.com/kdb-faq/i-have-a-very-large-file-to-uncompress-zip-and-gzip-do-not-w.html817816:9603847:11014062Short answer: Use 7-zip. It contains large-file support.