Since Internal`Bag, Internal`StuffBag and Internal`BagPart can be compiled, they are a valuable tool for many applications. There have already been many questions about why AppendTo is so slow and which faster ways exist to build a dynamically growable array. Since many tricks, such as Sow and Reap, simply cannot be used inside Compile, a Bag is a good alternative.

A fast, compiled version of AppendTo: For comparison, I will use AppendTo directly in a simple loop. Ignore the fact that this would not be necessary here, since we know the number of elements in the result list in advance. In a real application, you might not know this.
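The code for this comparison is not shown here, so the following is only a minimal sketch of what such a comparison could look like, not the original listing. The names appendToLoop and stuffBagLoop are made up for illustration, and the empty integer list Most[{0}] serves as the type hint for the compiler.

```mathematica
(* Baseline: grow an integer list with AppendTo inside Compile *)
appendToLoop = Compile[{{n, _Integer, 0}},
   Module[{i, list = Most[{0}]},
    For[i = 1, i <= n, ++i, AppendTo[list, i]];
    list]];

(* Same loop, but collecting the elements in an integer Bag *)
stuffBagLoop = Compile[{{n, _Integer, 0}},
   Module[{i, bag = Internal`Bag[Most[{0}]]},
    For[i = 1, i <= n, ++i, Internal`StuffBag[bag, i]];
    Internal`BagPart[bag, All]]];

(* Both are expected to build the same list; compare timings for growing n *)
appendToLoop[10] === stuffBagLoop[10]
First[AbsoluteTiming[appendToLoop[10^5];]]
First[AbsoluteTiming[stuffBagLoop[10^5];]]
```

The actual timings depend on the Mathematica version and machine, so no numbers are claimed here.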

Usage and features

The following information was collected from different sources. Here is an article from Daniel Lichtblau, who was kind enough to give some insider information. A question on MathGroup led to a conversation with Oleksandr Rasputinov, who knew about the third argument of Internal`BagPart. Various other posts on StackOverflow exist, which I will not mention explicitly. I will restrict the following to the use of Internal`Bag and Compile together. While there are four functions (Internal`Bag, Internal`StuffBag, Internal`BagPart, Internal`BagLength), only the first three can be compiled. Therefore, one has to count the elements inserted into the bag explicitly if the count is needed (or use Length on the list returned by Internal`BagPart[b, All]).

Internal`Bag[] creates an empty bag of type real. When an Integer is inserted it is converted to Real. True is converted to 1.0 and False to 0.0. Other types of bags are possible too. See below.

Internal`StuffBag[b, elm] adds an element elm to the bag b. It is possible to create a bag of bags inside Compile. This makes it easy to create a tensor of arbitrary rank.

Internal`BagPart[b, i] gives the i-th part of the bag b. Internal`BagPart[b, All] returns a list of all elements. The Span operator ;; can be used too. Internal`BagPart accepts a third argument, which is the head used for the returned expression.

Variables holding an Internal`Bag (and local variables inside Compile in general) require a hint so that the compiler can deduce their type. A bag of integers can be declared as list = Internal`Bag[Most[{0}]]
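The points in this list can be put together in one small compiled function. This is a sketch under the assumptions stated above, not code from any of the cited sources; bagBasics and count are illustrative names, and the function expects n >= 1 so that the part lookups are valid.

```mathematica
bagBasics = Compile[{{n, _Integer, 0}},
   Module[{bag = Internal`Bag[Most[{0}]], count = 0},
    Do[
     Internal`StuffBag[bag, k*k];
     count++,  (* Internal`BagLength is not compilable, so count by hand *)
     {k, n}];
    (* first element, last element (via the manual count), and the count *)
    {Internal`BagPart[bag, 1], Internal`BagPart[bag, count], count}]];

bagBasics[5]
```

Keeping a manual counter is the design choice suggested by the restriction that only three of the four Bag functions compile.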

Examples

The important property of the following examples is that they are completely compiled. There is no call to the kernel, and using the Internal`Bag in such a way should most likely speed things up.

The famous sum of Gauss: adding the numbers from 1 to 100. Note that the numbers are not explicitly added. I use the third argument to replace the List head with Plus. The only possible heads inside Compile are Plus, Times and List.
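A minimal sketch of this idea, assuming the third argument of Internal`BagPart behaves as described; gaussSum is a name chosen here for illustration and this is not necessarily the original listing.

```mathematica
gaussSum = Compile[{{n, _Integer, 0}},
   Module[{i, bag = Internal`Bag[Most[{0}]]},
    For[i = 1, i <= n, ++i, Internal`StuffBag[bag, i]];
    (* Plus replaces the List head, so the elements are summed on extraction *)
    Internal`BagPart[bag, All, Plus]]];

(* Compare against the closed form n (n + 1)/2 *)
gaussSum[100] == 100*101/2
```

Replacing Plus with Times in the last line of the function would, by the same mechanism, give a product instead of a sum.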

Open Questions

Are there simpler or other ways to tell the compiler the type of a local variable? What bothers me here is that this is not really explained in the docs. It is only mentioned briefly how to define (not declare) a tensor. When a user wants an empty tensor, it is completely unintuitive that a trick like Most[{1}] is required. Declaring variables would be one of the first things I would need if I were new to Compile. In this tutorial, I didn't find any hint about this.

Are there further features of Bag which may be important to know in combination with Compile?

The timing function of position above leaks memory. After the run {n, 100, 3000, 200}, 20 GB of memory are occupied. I haven't investigated this issue in depth, but when I don't return the list of positions, the memory usage seems OK. Actually, the memory for the returned positions should be collected after the Block finishes. My system here is Ubuntu 10.04 and Mathematica 8.0.4.

2 Answers

I am somewhat reluctant to offer this as an answer since it is inherently difficult to comprehensively address questions on undocumented functionality. Nonetheless, the following observations do constitute partial answers to points raised in the question and are likely to be of value to anyone trying to write practical compiled code using Bags. However, caution is always highly advisable when using undocumented functions in a new way, and this is no less true for Bags.

The type of Bags

As far as the Mathematica virtual machine is concerned, Bags are a numeric type, occupying a scalar Integer, Real, or Complex register, and can contain only scalars or other Bags. They can be created empty, using the trick described in the question, or pre-stuffed:

with a scalar, using Internal`Bag[val] (where val is a scalar of the desired type)

with several scalars, using Internal`Bag[tens, lvl], where tens is a full-rank tensor of the desired numeric type and lvl is a level specification analogous to the second argument of Flatten. For compiled code, lvl $\ge$ ArrayDepth[tens], as Bags cannot directly contain tensors.
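As a hedged illustration of the two pre-stuffing forms, the sketch below creates one Bag from a scalar and one from a rank-1 tensor with lvl = 1. It assumes the behavior described above; preStuffed, x and v are names invented for this example.

```mathematica
preStuffed = Compile[{{x, _Real}, {v, _Real, 1}},
   Module[{one = Internal`Bag[x],        (* Bag pre-stuffed with a scalar *)
     several = Internal`Bag[v, 1]},      (* Bag pre-stuffed from a rank-1 tensor *)
    (* move the single value of the first Bag into the second one *)
    Internal`StuffBag[several, Internal`BagPart[one, 1]];
    Internal`BagPart[several, All]]];

preStuffed[4.5, {1.5, 2.5, 3.5}]
```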

Internal`StuffBag can only be used to insert values of the same type as the register the Bag occupies, a type castable to that type without loss of information (e.g. Integer to Real, or Real to Complex), or another Bag. Tensors can be inserted after being flattened appropriately using the third argument of StuffBag, which behaves in the same way as the second argument of Bag as described above. Attempts to stuff other items (e.g. un-flattened tensors or values of non-castable types) into a Bag will compile into MainEvaluate calls; however, sharing Bags between the Mathematica interpreter and virtual machine has not been fully implemented as of Mathematica 8, so these calls will not work as expected. As this is relatively easy to do by mistake and there will not necessarily be any indication that it has happened, it is important to check that the compiled bytecode is free of such calls.
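One way to perform this check is to print the bytecode with CompilePrint from the CompiledFunctionTools` package and search the output for MainEvaluate. The sketch below does this for a function that stuffs a rank-2 tensor using the third argument of Internal`StuffBag; stuffTensor is an illustrative name, and the flattening behavior is assumed to be as described above.

```mathematica
Needs["CompiledFunctionTools`"]

stuffTensor = Compile[{{m, _Real, 2}},
   Module[{bag = Internal`Bag[Most[{0.}]]},
    (* flatten the rank-2 tensor completely while stuffing it *)
    Internal`StuffBag[bag, m, 2];
    Internal`BagPart[bag, All]]];

(* True means the printed bytecode contains no MainEvaluate call *)
StringFreeQ[CompilePrint[stuffTensor], "MainEvaluate"]
```

If the check returns False, inspecting the full CompilePrint output shows which instruction falls back to the interpreter.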

Nested Bags

These are created simply by stuffing one Bag into another, and do not have any special type associated with them except the types of the registers containing the pieces. In particular, there is no "nested Bag type". Per the casting rules given above, it is theoretically possible to stuff IntegerBags into a RealBag and later extract them into Integer registers (for example). However, this technique is not to be recommended as the result depends on the virtual machine version; for instance, the following code is compiled into identical bytecode in versions 5.2, 7, and 8, but gives different results:

Stuffing Bags of mixed Real and Integer types into a RealBag produces even less useful results, since pointer casts are performed by Internal`BagPart without regard to the original type of each constituent Bag, resulting in corrupted numerical values. However, nesting bags works correctly in all versions provided that the inner and outer bags are of identical types. It is also possible to stuff a bag into itself to create a circular reference, although the practical value of this is probably quite limited.
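For the case that is described as working, inner and outer Bags of identical type, a minimal sketch could look as follows. It builds an n by n integer table from a Bag of Bags; nestedBags is a hypothetical name, and the code has not been checked against every virtual machine version mentioned above.

```mathematica
nestedBags = Compile[{{n, _Integer, 0}},
   Module[{outer = Internal`Bag[Most[{0}]], inner = Internal`Bag[Most[{0}]]},
    Do[
     inner = Internal`Bag[Most[{0}]];  (* fresh inner Bag for each row *)
     Do[Internal`StuffBag[inner, i*j], {j, n}];
     Internal`StuffBag[outer, inner],
     {i, n}];
    (* extract each inner Bag and convert it to a list *)
    Table[Internal`BagPart[Internal`BagPart[outer, i], All], {i, n}]]];

nestedBags[3]
```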

Miscellaneous

Calling Internal`BagPart with a part specification other than All will crash Mathematica kernels prior to version 8.

Internal`Bag accepts a third argument, which should be a positive machine integer. The purpose of this argument is not clear, but in any case it cannot be used in compiled code.

So Internal`Bag is MMA's version of a smart array? It strikes me as odd that List needs to be copied just to append an element (also, how do you create in-line code with a back-tick?)
– VF1, Nov 10 '12 at 1:10

@VF1 I am not exactly sure what a smart array is, but Internal`Bag is a (singly) linked list. List is in general a hybrid between a linked list and an array, but in the VM it is implemented as just a standard array. In top-level Mathematica you can build linked lists using Lists, and this does not incur any copies, but by appending you run into issues due to the immutability of expressions, so that the copies are necessary.
– Oleksandr R., Nov 10 '12 at 1:18

Regarding your question about defining the type of local variables in Compile: Compile has an optional third argument that allows you to do this in the same manner as you specify arguments. It sometimes helps the compiler resolve type ambiguities, since by default a local variable is considered a Real number.

This can be the case if a local variable is the result of another external function and the compiler cannot properly infer the type of the result of this external function. For example
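A small sketch of such a type hint is given below. The function extern is a hypothetical stand-in for any external function the compiler knows nothing about, and withHint is likewise an illustrative name; the third argument tells Compile to assume that expressions matching extern[_] are integers.

```mathematica
(* hypothetical external function, unknown to the compiler *)
extern[x_] := 2 x

withHint = Compile[{{x, _Integer}},
   Module[{y = extern[x]}, y + 1],
   {{extern[_], _Integer}}];  (* assume extern[...] yields an Integer *)

withHint[3]
```

Without the third argument, the result of extern[x] would be assumed to be Real by default.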

+1 Although I knew this, I never got it running with the declaration of local variables. I didn't know that you have to put y={} together with the type specification. How would I use this to define a Bag of integers, or a Bag of Bags of integers? I would have to know the true internal structure of the pattern, wouldn't I?
– halirutan, Jan 28 '12 at 14:37

The y={} is just for this example; you don't always need to initialize your local variable when you declare it if you specify the type. I would be surprised if you could have a type related to bags; the only official ones are Integer, Real, Complex, and True | False. In such Mathematica experiments I don't know anything better than try and fail ... but never fail to try!
– faysou, Jan 28 '12 at 19:20


But the simple example Compile[{}, Module[{y}, y], {{y, _Real, 1}}] fails. Indeed, it seems this type specification at the end is completely ignored and your example only works because of the y={} and because you chose type Real. Try Compile[{}, Module[{y = {}}, AppendTo[y, 1]], {{y, _Integer, 1}}] and you will see that the type is {Real}, although I specified Integer and appended an integer.
– halirutan, Jan 28 '12 at 21:02

Faysal, do you know any examples where adding a third argument to Compile has any effect? Would you consider updating your answer? I feel there should probably be a separate Q&A about this, but I was curious if you knew more.
– Jacob Akkerboom, Feb 28 '14 at 13:13


I've updated the answer, feel free to add any other example you may find relevant for the use of this third argument in Compile.
– faysou, Mar 4 '14 at 10:05
