Advogato blog for michihttp://www.advogato.org/person/michi/
Tue, 3 Mar 2015 20:22:36 GMT
Sun, 9 Jun 2013 21:17:40 GMTAll-Rules Mail Bundle gets a new homehttp://www.advogato.org/person/michi/diary.html?start=10
http://www.antforge.org/blog/2013/06/09/all-rules-mail-bundle-gets-new-home<p>When I first published the All-Rules Mail Bundle more than two years ago and also provided a precompiled binary, I didn’t give much thought to where to host the binary. Just hosting it on GitHub together with the source seemed an obvious choice. But then GitHub said <a href="https://github.com/blog/1302-goodbye-uploads" >goodbye to uploads</a> and discontinued their feature for uploading binary files.</p>
<p>At this point I have to say that I wholeheartedly agree with their decision. GitHub is a great place to host and share source code and I love what they are doing. But hosting (potentially big) binary files was never the idea behind GitHub; it’s just not what they do. Better stick to your trade, do one thing and do it well. Hence the search for a new home began. It’s important to remember that <a href="http://www.w3.org/Provider/Style/URI.html" >cool <span>URI</span>s don’t change</a>, so the new home for the All-Rules Mail Bundle binary had better be permanent, which is why I decided to host the binary on my own server. Also the staggering number of 51 downloads over the past two years reassured me that my available bandwidth could handle the traffic.</p>
<h3>Where to get the bundle</h3>
<p>The source code repository will of course remain on GitHub and its location is unchanged. Only the location of the binary package has changed and moved off GitHub. The usual amount of <span>URL</span> craftsmanship should allow you to reach previous versions of the binary package.</p>
<ul><li>Source: <a href="https://github.com/mstarzinger/all-rules" >https://github.com/mstarzinger/all-rules</a></li>
<li>Binary: <a href="http://www.antforge.org/all-rules/download/AllRules-0.2.zip" >http://www.antforge.org/all-rules/download/AllRules-0.2.zip</a></li>
</ul><p>Note that I also took this opportunity to compile a new version 0.2 binary package. This version contains all the compatibility updates I made over the past two years and is compatible with several environments up to the following.</p>
<ul><li>Mac OS X Mountain Lion 10.8.4</li>
<li>Mail Application 6.5</li>
<li>Message Framework 6.5</li>
</ul><p>As always, your feedback is very much appreciated and I am looking forward to the next fifty or so downloads.</p>Sun, 8 May 2011 21:12:44 GMTDaneel: Type inference for Dalvik bytecodehttp://www.advogato.org/person/michi/diary.html?start=9
http://www.antforge.org/blog/2011/05/08/daneel-type-inference-dalvik-bytecode <p>In the last blog post about Daneel I mentioned one particular caveat of Dalvik bytecode, namely the existence of untyped instructions, which has a huge impact on how we transform bytecode. I want to take a similar approach as last time and look at one specific example to illustrate those implications. So let us take a look at the following Java method.</p>
<div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;">public float untyped(float[] array, boolean flag) {
    if (flag) {
        float delta = 0.5f;
        return array[7] + delta;
    } else {
        return 0.2f;
    }
}</pre></div>
<p>The above is a straightforward snippet and most of you probably know what the generated Java bytecode will look like. So let&#8217;s jump right to the Dalvik bytecode and discuss that in detail.</p>
<div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;">UntypedSample.untyped:([FZ)F:
[regs=5, ins=3, outs=0]
0000: if-eqz v4, 0009
0002: const/high16 v0, #0x3f000000
0004: const/4 v1, #0x7
0005: aget v1, v3, v1
0007: add-float/2addr v0, v1
0008: return v0
0009: const v0, #0x3e4ccccd
000c: goto 0008</pre></div>
<p>Keep in mind that Daneel doesn&#8217;t like to remember things, so he wants to look through the code just once from top to bottom and emit Java bytecode while doing so. He gets really puzzled at certain points in the code.</p>
<ul>
<li>Label 2: What is the type of register <code>v0</code>?</li>
<li>Label 4: What is the type of register <code>v1</code>?</li>
<li>Label 9: Register <code>v0</code> again? What&#8217;s the type at this point?</li>
</ul>
<p>You, as a reader, do have the answer because you know and understand the semantics of the underlying Java code, but Daneel doesn&#8217;t, so he tries to infer the types. Let&#8217;s look through the code in the same way Daneel does.</p>
<p>At method entry he knows about the types of method parameters. Dalvik passes parameters in the last registers (in this case in <code>v3</code> and <code>v4</code>). Also we have a register (in this case <code>v2</code>) holding a <code>this</code> reference. So we start out with the following register types at method entry.</p>
<div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;">UntypedSample.untyped:([FZ)F:
[regs=5, ins=3, outs=0] uninit uninit object [float bool</pre></div>
<p>The array to the right represents the inferred register types at each point in the instruction stream as determined by the <em>abstract interpreter</em>. Note that we also have to keep track of the <em>dimension count</em> and the <em>element type</em> for array references. Now let&#8217;s look at the first block of instructions.</p>
<div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;"> 0002: const/high16 v0, #0x3f000000 u32 uninit object [float bool
0004: const/4 v1, #0x7 u32 u32 object [float bool
0005: aget v1, v3, v1 u32 float object [float bool
0007: add-float/2addr v0, v1 float float object [float bool</pre></div>
<p>Each line shows the register type after the instruction has been processed. At each line Daneel learns something new about the register types.</p>
<ul>
<li>Label 2: I don&#8217;t know the type of <code>v0</code>, only that it holds an <em>untyped 32-bit value</em>.</li>
<li>Label 4: Same applies for <code>v1</code> here, it&#8217;s an <em>untyped 32-bit value</em> as well.</li>
<li>Label 5: Now I know <code>v1</code> is used as an array index, so it must have been an <em>integer value</em>. Also the array reference in register <code>v3</code> is accessed, so I know the result is a <em>float value</em>. The result is stored in <code>v1</code>, overwriting its previous content.</li>
<li>Label 7: Now I know <code>v0</code> is used in a floating-point addition, it must have been a <em>float value</em>.</li>
</ul>
<p>Keep in mind that at each line, Daneel emits appropriate Java bytecode. So whenever he learns the concrete type of a register, he might need to retroactively <em>patch</em> previously emitted instructions, because some of his assumptions about the type were broken.</p>
<p>Finally we look at the second block of instructions reached through the conditional branch as part of the <code>if</code>-statement.</p>
<div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;"> 0009: const v0, #0x3e4ccccd u32 uninit object [float bool
000c: goto 0008 float uninit object [float bool</pre></div>
<p>When reaching this block we basically have the same information as at method entry. Again Daneel learns in the process.</p>
<ul>
<li>Label 9: I don&#8217;t know the type of <code>v0</code>, only that it holds an <em>untyped 32-bit value</em>.</li>
<li>Label 12: Now I know that <code>v0</code> has to be a <em>float value</em> because the unconditional branch targets the <em>join-point</em> at label 8. And I already looked at that code and know that we expect a <em>float value</em> in that register at that point.</li>
</ul>
<p>This illustrates why our <em>abstract interpreter</em> also has to remember and merge register type information at each join-point. It&#8217;s important to keep in mind that Daneel follows the instruction stream from top to bottom, as opposed to the control-flow of the code.</p>
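<p>The merge at the join-point can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the <code>RegType</code> enum, its constants and the <code>refine</code>/<code>merge</code> helpers are made-up names modelled on the listings above, not taken from Daneel&#8217;s source, and the two blocks of the sample method are replayed by hand instead of being decoded from bytecode.</p>

```java
import java.util.Arrays;

/** Register types as they appear in the listings above; names are illustrative only. */
enum RegType {
    UNINIT, U32, INT, FLOAT, BOOL, OBJECT, FLOAT_ARRAY;

    /** A typed use pins down an untyped 32-bit value; any other mismatch is an error. */
    static RegType refine(RegType known, RegType use) {
        if (known == use) return known;
        if (known == U32 && (use == INT || use == FLOAT)) return use;
        throw new IllegalStateException(known + " used as " + use);
    }

    /** Merges what two incoming paths know about one register at a join-point. */
    static RegType merge(RegType a, RegType b) {
        if (a == b) return a;
        if (a == U32 && (b == INT || b == FLOAT)) return b;
        if (b == U32 && (a == INT || a == FLOAT)) return a;
        return UNINIT; // the paths disagree, so the register is unusable afterwards
    }

    /** Merges two complete register states element by element. */
    static RegType[] mergeAll(RegType[] a, RegType[] b) {
        RegType[] out = new RegType[a.length];
        Arrays.setAll(out, i -> merge(a[i], b[i]));
        return out;
    }

    /** State at method entry: v0, v1 unset, v2 = this, v3 = float[], v4 = boolean. */
    static RegType[] entry() {
        return new RegType[] { UNINIT, UNINIT, OBJECT, FLOAT_ARRAY, BOOL };
    }

    /** Hand-written replay of labels 0002 to 0007. */
    static RegType[] firstBlock() {
        RegType[] r = entry();
        r[0] = U32;                  // 0002: const/high16 v0
        r[1] = U32;                  // 0004: const/4 v1
        refine(r[1], INT);           // 0005: aget reads v1 as an array index ...
        r[1] = FLOAT;                //       ... and overwrites it with a float element
        r[0] = refine(r[0], FLOAT);  // 0007: add-float/2addr v0, v1
        return r;
    }

    /** Hand-written replay of label 0009. */
    static RegType[] secondBlock() {
        RegType[] r = entry();
        r[0] = U32;                  // 0009: const v0
        return r;
    }

    public static void main(String[] args) {
        // State at the join-point 0008, reached from 0007 and from the goto at 000c.
        System.out.println(Arrays.toString(mergeAll(firstBlock(), secondBlock())));
    }
}
```

<p>Running it prints <code>[FLOAT, UNINIT, OBJECT, FLOAT_ARRAY, BOOL]</code>: the untyped <code>v0</code> from the second block is pinned to float by the first block, while <code>v1</code> is only defined on one path and therefore stays unusable after the join.</p>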
<p>Now imagine scrambling up the code so that instruction stream and control-flow are vastly different from each other, together with a few exception handlers and an optimal register re-usage as produced by some <span class="caps">SSA</span> representation. That&#8217;s where Daneel still keeps choking at the moment. But we can handle most of the code produced by the <code>dx</code> tool already and will hunt down all those nasty bugs triggered by obfuscated code as well.</p>
<p><strong>Disclaimer</strong>: The abstract interpreter and the method rewriter were mostly written by Rémi Forax. With this post I take no credit for the implementation whatsoever; I just want to explain how it works.</p>Wed, 27 Apr 2011 22:16:20 GMTDaneel: The difference between Java and Dalvikhttp://www.advogato.org/person/michi/diary.html?start=8
http://www.antforge.org/blog/2011/04/27/daneel-difference-between-java-and-dalvik <p>Those of you who kept following <a href="http://www.icedrobot.org/" >IcedRobot</a> might have seen that quite some work went into Daneel over the past months. He<sup class="footnote"><a href="#fn13366409954db889400f35d" >1</a></sup> is in charge of parsing Android applications containing code intended to run on a Dalvik VM and transforming this code into something which can run on any underlying Java VM. So he is <em>a VM compatible with Dalvik on top of a Java VM</em>, or at least that&#8217;s what he wants to become.</p>
<p>So Daneel is multilingual in a strange way, he can read and understand Dalvik bytecode, but he only speaks and writes Java bytecode. To understand how he can do that we have to look at the differences between those two dialects.</p>
<p><strong>Registers vs. Stack</strong>: We know Dalvik bytecode uses a register-machine, and Java bytecode uses a stack-machine. But each method frame on that stack-machine not only has an operand stack, it also has an array of local variables. Unfortunately this distinction is lost in our register-machine. To understand what this means, let us look at a full Java-Dalvik-Daneel round-trip for a simple method like the following.</p>
<div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;">public static int addConst(int val) {
    return val + 123456;
}</pre></div>
<p>The first stop on our round-trip is the Java bytecode. So after we push this snippet through <code>javac</code> we get the following code, which makes use of both an operand stack and local variables.</p>
<div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;">public static int addConst(int);
[max_stack=2, max_locals=1, args_size=1]
0: iload_0
1: ldc #int 123456
3: iadd
4: ireturn</pre></div>
<p>The second stop takes us to the Dalvik bytecode. We push the above code through the <code>dx</code> tool and are left with the following code. Note that the distinction between the operand stack and local variables is lost completely, everything is stored in registers.</p>
<div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;">public static int addConst(int);
[regs=2, ins=1, outs=0]
0: const v0, #0x1E240
1: add-int/2addr v0, v1
2: return v0</pre></div>
<p>The third and last step is Daneel reading the Dalvik bytecode and trying to reproduce sane Java bytecode again. The following is what he spits out after chewing on the input for a bit.</p>
<div class="geshifilter"><pre class="text geshifilter-text" style="font-family:monospace;">public static int addConst(int);
[max_stack=2, max_locals=2, args_size=1]
0: ldc #int 123456
1: istore_1
2: iload_1
3: iload_0
4: iadd
5: istore_1
6: iload_1
7: ireturn</pre></div>
<p>The observant reader will notice the vast difference between what we had at the beginning of our round-trip and what we ended up with. Daneel maps each Dalvik register to a Java local variable. Fortunately any decent Java VM will optimize away the unnecessary load and store instructions and we can achieve acceptable performance with this naive approach already.</p>
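<p>The naive mapping can be sketched as follows. Note the assumptions: the slot numbering in <code>localFor</code> is only read off the listings above (parameter registers come last in Dalvik and end up as the first Java locals), the class and method names are made up, and wide values such as <code>long</code> and <code>double</code> are ignored. It is not Daneel&#8217;s actual code.</p>

```java
final class RegisterMapping {
    /**
     * Maps a Dalvik register to a Java local variable slot. Assumption: the
     * last `ins` registers hold the parameters and become the first locals,
     * all remaining registers are placed after them.
     */
    static int localFor(int reg, int regs, int ins) {
        int firstParam = regs - ins;
        return reg >= firstParam ? reg - firstParam : reg + ins;
    }

    /** Naive translation of `add-int/2addr vA, vB`: load both, add, store into vA. */
    static String[] addInt2Addr(int a, int b, int regs, int ins) {
        return new String[] {
            "iload " + localFor(a, regs, ins),
            "iload " + localFor(b, regs, ins),
            "iadd",
            "istore " + localFor(a, regs, ins),
        };
    }

    public static void main(String[] args) {
        // addConst has regs=2, ins=1 and contains `add-int/2addr v0, v1`.
        System.out.println(String.join("\n", addInt2Addr(0, 1, 2, 1)));
    }
}
```

<p>For <code>addConst</code> this yields <code>iload 1</code>, <code>iload 0</code>, <code>iadd</code>, <code>istore 1</code>, which corresponds to labels 2 to 5 of the listing above (there written in the short forms <code>iload_1</code>, <code>iload_0</code> and <code>istore_1</code>).</p>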
<p><strong>Untyped Instructions</strong>: Another big difference might not be that obvious at first glance. Notice how the instruction at label <code>0</code> in the above Dalvik bytecode (the second stop on our round-trip) accesses register <code>v0</code> without specifying the exact type of that register? The only thing Daneel can determine at that point in the code is that it&#8217;s a 32-bit value we are dealing with, it could be an <code>int</code> or a <code>float</code> value. For zero-constants it could even be a <code>null</code> reference we are dealing with. The exact type of that register is not revealed before the instruction at label <code>1</code>, where <code>v0</code> is read again by a typed instruction. It&#8217;s at that point that we learn the exact type of that register.</p>
<p>So Daneel has to keep track of all register types while iterating through the instruction stream to determine the exact types and decide which Java bytecode instructions to emit. I intend to write a separate article about how this is done by Daneel in the following days, so stay tuned.</p>
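<p>To see why the constant alone does not reveal its type, the following minimal sketch reinterprets the same 32 bits both ways. The class name is made up; <code>Float.intBitsToFloat</code> is the standard library call for this kind of reinterpretation.</p>

```java
final class UntypedConstant {
    /** Interprets a raw 32-bit constant as a Java float without changing any bits. */
    static float asFloat(int bits) {
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        int raw = 0x1E240; // the constant loaded by `const v0, #0x1E240`
        System.out.println("as int:   " + raw);
        System.out.println("as float: " + asFloat(raw));

        // A zero constant is even more ambiguous: 0, 0.0f or a null reference.
        System.out.println("zero as float: " + asFloat(0));
    }
}
```

<p>As an <code>int</code> the constant is the expected 123456, as a <code>float</code> the very same bits denote a tiny subnormal number. Both readings are perfectly valid, so only a later typed instruction can settle which one was meant.</p>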
<p><strong>Disclaimer</strong>: This is a technical description of just two major differences between Dalvik bytecode and Java bytecode. All political discussions about differences or similarities between Dalvik and Java in general are outside the scope of this article and I won&#8217;t comment on them.</p>
<p id="fn13366409954db889400f35d" class="footnote"><sup>1</sup> Yes, Daneel is male. His girlfriend is called Ika. Together they love to drink iced tea because they try to get off caffeine. They even have a butler working for them who is called Jenkins, a very lazy guy who regularly was seen to crash during work.</p>Tue, 11 Jan 2011 02:15:08 GMTAll-Rules Mail Bundle: The shortcut to your Mail.app ruleshttp://www.advogato.org/person/michi/diary.html?start=7
http://www.antforge.org/blog/2011/01/11/all-rules-mail-bundle-shortcut-your-mailapp-rules <p>Have you ever wanted to automate some message sorting tasks in Apple&#8217;s Mail application after you have read a message? I, for example, use one archive folder per account and move all messages into that folder after I&#8217;ve read them. The application&#8217;s rule system is perfectly suited for that task, unfortunately there is no way to activate certain rules by pressing a keyboard shortcut. That&#8217;s where this bundle comes into play.</p>
<p>The <em>All-Rules Mail Bundle</em> acts as a plugin for Apple&#8217;s Mail application and serves just one specific purpose. It provides an additional menu item located under &#8220;Message -&gt; Apply All Rules&#8221; which applies all active rules to the currently selected messages while ignoring any present &#8220;Stop evaluating rules&#8221; action.</p>
<h3>Where to get the bundle</h3>
<p>The source of the bundle is available at GitHub as a standard Xcode project. Feel free to adapt it to your needs if necessary. I will also provide a precompiled binary for those of you who just want to use it out of the box.</p>
<ul>
<li>Source: <a href="https://github.com/mstarzinger/all-rules" title="https://github.com/mstarzinger/all-rules" >https://github.com/mstarzinger/all-rules</a></li>
<li>Binary: <a href="https://github.com/downloads/mstarzinger/all-rules/AllRules-0.1.zip" title="https://github.com/downloads/mstarzinger/all-rules/AllRules-0.1.zip" >https://github.com/downloads/mstarzinger/all-rules/AllRules-0.1.zip</a></li>
</ul>
<p>Note that I&#8217;ve developed and tested the thing on my only Mac machine, which clearly is an inadequate test coverage. As always I would be happy about any response. So far the bundle is known to run in the following environment, which is the most recent one at the time of writing.</p>
<ul>
<li>Mac OS X Snow Leopard 10.6.6</li>
<li>Mail Application 4.4</li>
<li>Message Framework 4.4</li>
</ul>
<h3>How it is implemented</h3>
<p>First of all, let me emphasize that this is the first time I actually did some Objective-C coding. But I really liked the feel of it. I was really surprised by the power of the Objective-C runtime. You can do lots of nasty stuff at runtime like changing class hierarchies, adding methods to classes, changing method implementations and so on.</p>
<p>I used one technique known as <a href="http://www.cocoadev.com/index.pl?MethodSwizzling" >method swizzling</a> in the bundle, which lets you switch the existing implementation of a method with your own replacement at runtime. This enabled me to override the original <code>shouldStopEvaluatingRules</code> implementation of the <code>MessageRule</code> class inside the Message framework.</p>
<p>Unfortunately most of the <span class="caps">API</span>s of the Mail application and the Message framework are private, so I expect my bundle to break sometime in the future. But the <span class="caps">API</span> can be easily reverse engineered with the <a href="http://www.codethecode.com/projects/class-dump/" >class-dump utility</a> which generates header files out of Objective-C binaries.</p>
<p>To prevent bundles from silently breaking, each bundle includes a list of the exact versions of Message frameworks and Mail applications it is compatible with. I found an article that explains <a href="http://stib.posterous.com/how-to-fix-unsupported-plugins-after-upgradin" >how to fix unsupported plugins after upgrading Mail.app</a> without recompiling them. So if you have different versions running on your machine that are compatible as well, let me know about them.</p>
<p>And last but not least I want to mention one article which helped me a lot in figuring out all those tiny details and really did its job in <a href="http://eaganj.free.fr/weblog/?post/2009/07/14/Demystifying-Mail.app-Plugins-on-Leopard" >demystifying Mail.app plugins on Leopard</a> for me.</p>
<h3>Related bundles</h3>
<p>The same (and more) could be done with Indev&#8217;s Mail Act-On bundle, unfortunately that bundle is sold under a commercial license. With my bundle I cloned the essential feature which was indispensable for my personal use.</p>
Thu, 11 Nov 2010 23:17:17 GMTWAQL-PP 0.1 releasedhttp://www.advogato.org/person/michi/diary.html?start=6
http://www.antforge.org/blog/2010/11/11/waql-pp-01-released <p>With this post I am proud to announce the first release of <span class="caps">WAQL</span>-PP, a <acronym title="Web-service Aggregation Query Language"><span class="caps">WAQL</span></acronym> Preprocessor for Java I was working on for the last two weeks. In one of the <a href="http://www.antforge.org/blog/2010/10/25/waql-pp-preprocessor-data-aggregation-query-language" >former posts</a> I described the motivation behind this little project and how I planned to implement it. I&#8217;m rather satisfied with the result, so without further ado comes a copy of the release notes for this version. If you are interested just visit the <a href="http://www.antforge.org/waqlpp" >project page</a> to check it out.</p>
<div class="geshifilter"><pre class="geshifilter-text">WAQL-PP 0.1 released.
&nbsp;
This is the first release of the WAQL Preprocessor for Java. Here is a short
list of the most important features:
&nbsp;
* Resolves Data Dependencies between separate queries by converting
replacement objects into a textual representation.
* Handles nested Data Dependencies from innermost to outermost.
* Transforms Template List constructs into valid XQuery for-clauses and
handles correlations between different Template Lists.
* Parser tested against the XML Query Test Suite (XQTS).
&nbsp;
This release was developed against and tested with Java SE 1.6.0_22. It uses
Apache Ant as a build tool, JUnit 4.8.2 for testing purposes, JavaCC 5.0 as a
parser generator and has no additional runtime dependencies. It is currently
being used as a component in the WS-Aggregation framework.
&nbsp;
Information about the project and general documentation can be found on
http://www.antforge.org/waqlpp
&nbsp;
The WAQL-PP 0.1 release packages can be downloaded from
http://www.antforge.org/waqlpp/download/waqlpp-0.1/
&nbsp;
File : waqlpp-0.1-src.zip
md5sum : 57d06bfedaf1abd6eeed793838d96fc7
sha1sum: 1a5fd2196a0916fd74479c4e7aaa57811b673e3b
&nbsp;
File : waqlpp-0.1.jar
md5sum : bf97850f878014090eb9b9849e18ab37
sha1sum: 74c4e0e7e78bc16fea4bfd1b0954439e74636118
&nbsp;
Enjoy!
Michael Starzinger</pre></div>Tue, 26 Oct 2010 00:09:08 GMTWAQL-PP: Preprocessor for a Data Aggregation Query Languagehttp://www.advogato.org/person/michi/diary.html?start=5
http://www.antforge.org/blog/2010/10/25/waql-pp-preprocessor-data-aggregation-query-language <p>This week I started to design and implement a preprocessor for the <em>Web-service Aggregation Query Language</em> (<span class="caps">WAQL</span>) which is an extension of <a href="http://www.w3.org/TR/xquery/" >XQuery</a>. This language is used as part of the <a href="http://www.infosys.tuwien.ac.at/prototype/WS-Aggregation/" >WS-Aggregation framework</a> developed at the Distributed Systems Group of the Vienna University of Technology. With this text I want to explain the motivation behind <span class="caps">WAQL</span> and how the preprocessor will be designed. The motivation is nicely stated as part of my task description.</p>
<blockquote>
<p>The key idea of <span class="caps">WAQL</span> is that it provides a convenience syntax for XQuery, which otherwise tends to become complex and hardly comprehensible in bigger scenarios. <span class="caps">WAQL</span> queries are transformed into valid XQuery expressions, which are finally executed by a (third-party) XQuery engine.</p>
</blockquote>
<p>First of all we need to get a grasp of what the <span class="caps">WAQL</span> extensions to the XQuery language are. Since <span class="caps">WAQL</span> is still in its experimental stages, there is no exact specification of the language and it may change or grow over time. At the moment <span class="caps">WAQL</span> consists of two language constructs:</p>
<ul>
<li><strong>Template Lists</strong>: This extension tries to simplify the specification of generated inputs. It basically is syntactic sugar representing an XQuery for-loop construct and as such can be transformed easily.</li>
<li><strong>Data Dependencies</strong>: This second extension is the interesting one, it can express dependencies between several different queries. The framework has to identify these dependencies and execute the queries in a valid order, so that all dependencies can be resolved.</li>
</ul>
<p>The above two constructs should explain why the actual transformation has to be split into several phases which can be triggered by the framework at different points in time. The separate steps the preprocessor has to perform are as follows:</p>
<ol>
<li><strong>Parsing</strong>: The textual <span class="caps">WAQL</span> query is parsed and an intermediate representation is constructed. Since <span class="caps">WAQL</span> is an extension which enhances the set of expressions for the XQuery language, the actual parser has to understand the full XQuery grammar. This may sound like a lot of work, but the XQuery specification provides a detailed <a href="http://www.w3.org/TR/2007/REC-xquery-20070123/#id-grammar" >description of the grammar</a> in about 140 <acronym title="Extended Backus–Naur Form"><span class="caps">EBNF</span></acronym> rules. So defining a valid parser is a doable job.</li>
<li><strong>Resolving</strong> of data dependencies: At this point the preprocessor has generated a list of all unresolved data dependencies. However the preprocessor has no idea which other queries are linked to the one currently being processed. So the actual resolving has to be done by the framework, the preprocessor just adapts the intermediate representation to the data provided by the framework.</li>
<li><strong>Transformation</strong>: Once all dependencies have been resolved the intermediate representation can be transformed back into a textual XQuery (without any <span class="caps">WAQL</span> extensions), which can then be passed on to a third-party XQuery engine.</li>
</ol>
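<p>The split into phases can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the <code>${name}</code> marker is a made-up placeholder and not real WAQL syntax, the class and method names are hypothetical, and the &#8220;intermediate representation&#8221; is just the query text, whereas the real preprocessor parses the full XQuery grammar.</p>

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PhasedQuery {
    // Made-up placeholder syntax for this sketch only; real WAQL looks different.
    private static final Pattern DEPENDENCY = Pattern.compile("\\$\\{(\\w+)\\}");

    private String text; // stands in for the intermediate representation

    private PhasedQuery(String text) {
        this.text = text;
    }

    /** Phase 1: parse the query (here: simply keep the text). */
    static PhasedQuery parse(String query) {
        return new PhasedQuery(query);
    }

    /** Returns the name of the next unresolved dependency, or null if none is left. */
    String nextUnresolved() {
        Matcher m = DEPENDENCY.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    /** Phase 2: the framework supplies the data for one dependency. */
    void resolve(String name, String value) {
        text = text.replace("${" + name + "}", value);
    }

    /** Phase 3: emit plain XQuery; fails as long as dependencies remain. */
    String toXQuery() {
        String open = nextUnresolved();
        if (open != null) throw new IllegalStateException("unresolved: " + open);
        return text;
    }

    public static void main(String[] args) {
        PhasedQuery query = parse("sum(${numbers})");
        System.out.println("needs: " + query.nextUnresolved());
        query.resolve("numbers", "(1, 2, 3)"); // data provided by the framework
        System.out.println(query.toXQuery());
    }
}
```

<p>The point of the design is visible even in the toy: parsing and transformation belong to the preprocessor, but between them control returns to the framework, which alone knows where the data for each dependency comes from.</p>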
<p>Now that the basic operations are defined, we are able to give a rough description of the <span class="caps">WAQL</span> preprocessor and how it can be embedded into the existing framework. The two basic modules are a generated parser (obviously performing the parsing step) and a driving engine (performing the resolving and transformation steps). The parser will most certainly be generated using the <a href="https://javacc.dev.java.net/" >JavaCC</a> parser generator. The below graphic should explain the architecture.</p>
<p style="text-align:center;"><img src="/files/waqlpp-architecture.png" title="Architecture of the WAQL preprocessor" alt="Architecture of the WAQL preprocessor" width="400" height="124" /></p>
<p>Note that the above explanation is written from the compiler-constructor point of view, it just covers the preprocessor as part of the framework. All the other nasty details of WS-Aggregation are beyond the scope of this text. If you are interested you should read <a href="http://www.infosys.tuwien.ac.at/staff/hummer/docs/2011_sac_ws-aggregation.pdf" >the paper</a> or contact <a href="http://www.infosys.tuwien.ac.at/staff/hummer/" >Waldemar Hummer</a> who was kind enough to explain it to me. Also I will continue to write about the ongoing development of the preprocessor, so stay tuned.</p>
<p><strong>Update</strong>: This text was <a href="http://www.infosys.tuwien.ac.at/staff/treiber/blog/2010/10/25/waql-pp-preprocessor-for-a-data-aggregation-query-language/" >crossposted</a> to the <a href="http://www.infosys.tuwien.ac.at/staff/treiber/blog/" ><span class="caps">DSG</span> Praktika Blog</a> as well.</p>Fri, 30 Jul 2010 21:11:06 GMTPuzzling Java statement of the dayhttp://www.advogato.org/person/michi/diary.html?start=4
http://www.antforge.org/blog/2010/07/30/puzzling-java-statement <p>Some days ago I stumbled across a Java statement which I thought was trivial at first, only to discover that I had no idea what it does. I have a reasonable understanding of what a <span class="caps">JVM</span> does and how Java bytecode is executed. But as this example shows once again, that doesn&#8217;t necessarily carry over to the Java programming language. The snippet below should explain my point.</p>
<pre><code>int lorem = 1, ipsum = 2, dolor = 3;
if (lorem == (lorem = ipsum))
    f();
if ((ipsum = dolor) == ipsum)
    g();
</code></pre>
<p>Which of the above two methods <code>f()</code> and <code>g()</code> is actually invoked? Can you tell without compiling the code? Possible answers are:
<ul>
<li><strong>Neither</strong> of the two methods is invoked, both call-sites are dead code.</li>
<li><strong>Just <code>f()</code></strong> is invoked and the call-site of <code>g()</code> is dead code.</li>
<li><strong>Just <code>g()</code></strong> is invoked and the call-site of <code>f()</code> is dead code.</li>
<li><strong>Both</strong> methods are invoked, the conditions are pointless.</li>
</ul></p>
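<p>If you would rather let a <span class="caps">JVM</span> settle it, the snippet can be wrapped in a small harness like the one below. The class name and the way invocations are recorded are made up for this sketch, and the output is deliberately not shown here.</p>

```java
final class Puzzle {
    // Records which of the two methods were invoked, in order.
    private static final StringBuilder invoked = new StringBuilder();

    static void f() {
        invoked.append('f');
    }

    static void g() {
        invoked.append('g');
    }

    /** Runs the snippet from the post and returns the names of the invoked methods. */
    static String run() {
        invoked.setLength(0);
        int lorem = 1, ipsum = 2, dolor = 3;
        if (lorem == (lorem = ipsum))
            f();
        if ((ipsum = dolor) == ipsum)
            g();
        return invoked.toString();
    }

    public static void main(String[] args) {
        System.out.println("invoked: " + run());
    }
}
```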
<p>Ironically, I finally understood what was going on after looking at the generated bytecode (good old <code>javap</code> is your friend). I am not posting the disassembled code because that would spoil the fun. But once you look at it, the answer appears to be quite obvious.</p>Fri, 14 May 2010 15:13:36 GMTGoodbye Cacao ...http://www.advogato.org/person/michi/diary.html?start=3
http://www.antforge.org/blog/2010/05/14/goodbye-cacao <p>As some of you might have heard (or deduced from my lack of activity), I have left <a href="http://www.cacaovm.org" >Cacao</a>. This is not a decision I took lightly, hence it took me some time to make it official by the means of this post.</p>
<p>I started my work on Cacao in 2005 with my first project being the <span class="caps">ARM</span> port of the code generator, which turned out to be my <a href="http://stud4.tuwien.ac.at/~e0306126/cacao-arm/cacao-arm.pdf" >bachelor thesis</a> and hooked me up with Cacao. I continued to actively contribute to the development of Cacao and tried to help push it towards being a real Java Virtual Machine. Since then a lot has changed in Cacao. I have learned a lot from all the contributors, <a href="http://blogs.sun.com/twisti/" >former maintainers</a> and other people I worked with and for that I am very grateful.</p>
<p>One of my most recent endeavors (also happening to be my diploma thesis) was to prepare Cacao to cope with exact garbage collection, a topic which was neglected for too long in Cacao. A project that big requires a lot of infrastructure. Once you want to do the <em>cool stuff</em> you realize that more and more of those tiny bits and pieces are missing. Nevertheless you still want to do all the <em>really cool stuff</em>, to be able to compete with others out there. The essence is that it&#8217;s just too much work for a single person to effectively push forward the development of a mature <acronym title="Java Virtual Machine"><span class="caps">JVM</span></acronym> like Cacao in time.</p>
<p>The future has a natural tendency to resist prediction, so I don&#8217;t even try to make any for Cacao. But what I can say is that there are two people taking over maintenance of the code, namely David Flamme and Stefan Ring, both being capable and motivated to do that.</p>
<p>However, I don&#8217;t want to leave Cacao without a vision. During my time at <a href="http://www.theobroma-systems.com/" >Theobroma Systems</a> I learned a lot about <a href="http://en.wikipedia.org/wiki/Microkernel" >microkernels</a> on embedded systems and picked up some of the enthusiasm about them from my former colleagues there. In my opinion, portability is what microkernel-based operating systems are really missing and a <acronym title="Java Virtual Machine"><span class="caps">JVM</span></acronym> might provide. To be sufficiently efficient, that VM needs to run directly on top of the hypervisor instead of being throttled by several compatibility layers. At first it would only be some kind of Micro Edition (if at all), but with some effort Cacao might be able to pull this off.</p>
<p>As for myself, I am looking forward to whatever the future may hold for me, and will keep you posted &#8230;</p>Tue, 6 Apr 2010 22:08:41 GMTWhy I don't like Java RMI and how I use it anywayshttp://www.advogato.org/person/michi/diary.html?start=2
http://www.antforge.org/blog/2010/04/06/why-i-dont-like-rmi <p>The Java <a href="http://java.sun.com/j2se/1.5.0/docs/guide/rmi/" >Remote Method Invocation</a> <span class="caps">API</span> is a great thing to have because it is available in almost every <acronym title="Java 2 Standard Edition">J2SE</acronym> runtime environment without adding further dependencies. However there are some implications when using <acronym title="Remote Method Invocation"><span class="caps">RMI</span></acronym> and I just cannot get my head around them:</p>
<ol>
<li>Interfaces used as <em>remote interfaces</em> need to extend <code>java.rmi.Remote</code>. Interfaces should be clean and not contain any <em>clutter</em> introduced by a certain technology. This is even more true with modern frameworks and things like dependency injection.</li>
<li>Remote methods need to declare <code>java.rmi.RemoteException</code> in their <code>throws</code> clause. This is basically a continuation of the first point. This point holds, even if you ignore the <a href="http://www.artima.com/intv/handcuffs.html" >rant about checked exceptions</a>, which I don&#8217;t want to comment on right now.</li>
<li>Remote objects need to be exported explicitly. Even though one explicitly declared which methods should be accessible from a remote site with the above two points, one still needs to explicitly export every single instance of an object implementing those methods.</li>
</ol>
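<p>For reference, the three points look roughly like this in plain <span class="caps">RMI</span>. The names <code>Greeter</code>, <code>GreeterImpl</code> and <code>PlainRmiDemo</code> are made up for this sketch; the calls to <code>UnicastRemoteObject.exportObject</code> and <code>unexportObject</code> are the standard <span class="caps">API</span>. The demo talks to the exported object through its stub inside the same VM, so it assumes that loopback networking is available.</p>

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

/** Point 1: the interface itself has to extend java.rmi.Remote. */
interface Greeter extends Remote {
    /** Point 2: every remote method has to declare RemoteException. */
    String greet(String name) throws RemoteException;
}

/** The implementation is free to drop the throws clause. */
final class GreeterImpl implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

final class PlainRmiDemo {
    public static void main(String[] args) throws RemoteException {
        GreeterImpl impl = new GreeterImpl();
        // Point 3: every single instance has to be exported explicitly
        // (port 0 lets the runtime pick an anonymous port).
        Greeter stub = (Greeter) UnicastRemoteObject.exportObject(impl, 0);
        try {
            System.out.println(stub.greet("RMI"));
        } finally {
            // Without this the exported object keeps the VM alive.
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }
}
```

<p>All three requirements leak into code that otherwise has nothing to do with the transport: the interface, every method signature and the place where instances are created.</p>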
<p>Don&#8217;t get me wrong, all those implications have their right to exist because the decisions leading up to them were made for a reason. But in some circumstances those reasons don&#8217;t apply. It is just not the <a href="http://en.wikipedia.org/wiki/One_Ring" >one-to-rule-them-all</a> solution for remote method invocation in Java.</p>
<p>There are ways around those problems. One could for instance duplicate the existing interfaces to fit the needs of <span class="caps">RMI</span>. But frankly speaking, I just don&#8217;t want to do that myself.</p>
<p><hr /></p>
<p>That being said, let&#8217;s see if the task of separating the transportation layer based on <span class="caps">RMI</span> from your precious interfaces can be automated in some way, so it doesn&#8217;t have to be done by hand. The following are the key points of the approach:</p>
<ul>
<li>Interfaces can explicitly be marked as <em>remote interfaces</em> at runtime without the need for recompiling them. All methods exposed by such an interface can be invoked from a remote site. All parameters whose types implement such an interface are passed by reference and are not serialized. This is the same behavior as if the interface extended <code>java.rmi.Remote</code> in the <span class="caps">RMI</span> world. The actual remote interfaces are generated on demand at runtime.</li>
<li>Provide a <em>proxy factory</em> which supports the rapid development of a transportation layer based on <span class="caps">RMI</span> for given clean interfaces. The interface classes do not need to be <em>cluttered</em> with specifics of the transportation implementation.</li>
<li>A <em>proxy</em> in this context is a transparent object implementing both the local and the generated remote interface. Both interfaces are usable:
<ul>
<li>Cast the <em>proxy</em> to <code>java.rmi.Remote</code> and use it with any naming or registry service available to the <span class="caps">RMI</span> world. Every proxy implicitly is a remote object without the need for explicitly exporting it.</li>
<li>Cast the <em>proxy</em> to your local interface and don&#8217;t bother whether it actually targets a local or a remote site.</li>
</ul></li>
<li>The decision on how an invocation is actually dispatched can be based solely on whether the target object of a proxy is a remote or a local one. This decision is hidden inside the transportation layer.</li>
</ul>
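<p>The idea of a proxy that can be cast to either interface can be sketched with nothing but <code>java.lang.reflect.Proxy</code>. This is <em>not</em> the attached <code>ProxyFactory</code>: it generates no remote interface, uses no Javassist, and the resulting object cannot actually be exported through <span class="caps">RMI</span>. The names <code>LocalGreeter</code> and <code>DualInterfaceProxySketch</code> are assumptions made for this illustration, which only shows a clean local interface and the <code>java.rmi.Remote</code> marker living on the same object, with the dispatch hidden in one handler.</p>

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.rmi.Remote;

// A clean local interface: no Remote, no RemoteException.
interface LocalGreeter {
    String greet(String name);
}

final class DualInterfaceProxySketch {
    /**
     * Wraps a target in a dynamic proxy that implements the local interface
     * and the java.rmi.Remote marker interface at the same time.
     */
    static <T> T wrap(Class<T> localInterface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            // A real transportation layer would decide here whether the target
            // is local or remote. This sketch always dispatches locally.
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the target actually threw
            }
        };
        Object proxy = Proxy.newProxyInstance(
                localInterface.getClassLoader(),
                new Class<?>[] { localInterface, Remote.class },
                handler);
        return localInterface.cast(proxy);
    }

    public static void main(String[] args) {
        LocalGreeter greeter = wrap(LocalGreeter.class, name -> "Hello, " + name);
        System.out.println(greeter.greet("world"));
        System.out.println("also a Remote: " + (greeter instanceof Remote));
    }
}
```

<p>Callers holding a <code>LocalGreeter</code> never see the second interface, which is the point of keeping the transport concern out of the interface itself.</p>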
<p>Available as a <a href="http://www.antforge.org/blog/2010/04/06/attachment/ProxyFactory.java" >download attached to this post</a> you&#8217;ll find a first reference implementation of such a <em>proxy factory</em> as described above. Note that it is just a sketch to illustrate my point and will probably contain major flaws. It also brings a dependency on <a href="http://www.csg.is.titech.ac.jp/~chiba/javassist/" >Javassist</a>, which kind of contradicts the very first sentence of this post. However, it is capable of distributing this tiny example, which is also my only test case, across several sites without modifying the given interfaces:</p>
<pre><code>public interface Client {
    public void callback(String message);
}
</code>
<code>public interface Server {
    public void subscribe(Client subscriber);
    public void notify(String message);
}
</code>
<code>public class ClientImpl implements Client {
    public void callback(String message) {
        // ... do some important job ...
    }
}
</code>
<code>public class ServerImpl implements Server {
    public void subscribe(Client subscriber) {
        // ... remember &quot;subscriber&quot; in some fancy data structure ...
    }
    public void notify(String message) {
        // ... invoke all &quot;subscribers&quot; like they were local ...
    }
}
</code></pre>
<p>This is my attempt to show how I personally think that <span class="caps">RMI</span> should have been designed in the first place. Please feel free to comment, improve, ignore or flame.</p>Fri, 30 Oct 2009 23:10:15 GMTCacao supports JMX Remote Monitoring and Managementhttp://www.advogato.org/person/michi/diary.html?start=1
http://www.antforge.org/blog/2009/10/28/cacao-supports-jmx-remote-monitoring-and-management<p>For a few days now, <a href="http://www.cacaovm.org" >Cacao</a> has been able to start OpenJDK's <cite>JMX Monitoring and Management Agent</cite> when requested to do so. This allows you to remotely connect to Cacao with any JMX-compliant monitoring tool. One of the main responsibilities of this agent is to act as a server for MBeans (managed beans). The JRE provides some basic MBeans which allow out-of-the-box monitoring and management of VM internals. But applications can easily extend the functionality by providing custom MBeans. If you want to learn more about this topic, you should visit OpenJDK's <a href="http://openjdk.java.net/groups/jmx/index.html" >JMX group</a>.</p>
<p>One such JMX-compliant monitoring tool is <a href="http://openjdk.java.net/tools/svc/jconsole/" >JConsole</a>, which comes bundled with most J2SDK installations. Below you see a JConsole from Apple's Java running on my Mac OS X workstation, connected to Cacao running on a remote Linux machine.</p>
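<p>As a rough illustration of what "providing custom MBeans" means, the sketch below registers a standard MBean with the platform MBean server and reads one attribute back. The names <code>CustomMBeanSketch</code>, <code>RequestCounter</code> and the object name <code>org.example.sketch:type=RequestCounter</code> are invented for this example; whether every VM exposes such a bean identically is not something this post has verified.</p>

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class CustomMBeanSketch {
    // Standard MBean convention: a public interface named <ImplementationClass>MBean.
    public interface RequestCounterMBean {
        int getCount();   // exposed as the read-only attribute "Count"
        void reset();     // exposed as an operation
    }

    // Hypothetical application object that should become visible to JMX tools.
    public static final class RequestCounter implements RequestCounterMBean {
        private int count;

        synchronized void increment() { count++; }

        @Override public synchronized int getCount() { return count; }

        @Override public synchronized void reset() { count = 0; }
    }

    /** Registers a counter, reads its "Count" attribute through JMX, and unregisters it. */
    static Object registerAndRead(int increments) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("org.example.sketch:type=RequestCounter");
            RequestCounter counter = new RequestCounter();
            for (int i = 0; i < increments; i++) {
                counter.increment();
            }
            server.registerMBean(counter, name);
            try {
                return server.getAttribute(name, "Count");
            } finally {
                server.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException("MBean handling failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Count seen through JMX: " + registerAndRead(3));
    }
}
```

<p>While the bean is registered, a connected monitoring tool would list it next to the built-in ones.</p>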
<p><center><a href="http://www.advogato.org/files/images/cacao-jconsole.png" ><img src="/files/imagecache/preview_400/images/cacao-jconsole.png" alt="JConsole connected to Cacao" /></a></center></p>
<p>Note that there (still) are some restrictions on the current support:</p>
<ol>
<li>Some of the VM internal management functions are not yet fully implemented. Those functions are defined by HotSpot's JMM interface (the thing called jmm.h). It will take some time and patience until all of them are implemented.</li>
<li>Only OpenJDK provides a reference implementation of the JMX agent, so at the moment there is no support for GNU Classpath.</li>
<li>The thing that baffled me most was the documentation stating that applications running on the same machine inside another HotSpot VM process can be monitored without starting the JMX agent. I found out that HotSpot creates a shared memory region to which another VM process can attach. I don't like the idea of sharing memory across VM processes at all, so Cacao does not (and probably never will) support this feature. But I implemented the necessary stubs to avoid <code>UnsatisfiedLinkError</code>s and make everything run smoothly. So don't be surprised if you can't see a list of locally running Cacao processes in JConsole. If you are interested, all the functionality to access this shared memory is hidden in <code>sun.misc.Perf</code>.
</li></ol>
<p>And finally, how do you make Cacao start the JMX agent? Try the snippet below. If you want to know more about those magic properties, try one of the thousand other articles out there dealing with this topic.</p>
<pre>
$ java -Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=9999 \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false
</pre><p>
Have fun monitoring and managing!</p>
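<p>If you would rather connect from code than from JConsole, a client could look roughly like the sketch below. It assumes an agent started with the properties shown above (port 9999, no authentication, no SSL) and uses the conventional <code>jmxrmi</code> service URL of the out-of-the-box agent. The class name <code>JmxClientSketch</code> is made up, and reading <code>VmName</code> from the <code>java.lang:type=Runtime</code> bean is just one example of a query; how completely Cacao answers it depends on the restrictions listed above.</p>

```java
import java.net.MalformedURLException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

final class JmxClientSketch {
    /** Builds the conventional service URL for an agent listening on host:port. */
    static JMXServiceURL agentUrl(String host, int port) {
        try {
            return new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("bad host or port", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the remote VM was started with jmxremote.port=9999.
        String host = args.length > 0 ? args[0] : "localhost";
        try (JMXConnector connector = JMXConnectorFactory.connect(agentUrl(host, 9999))) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            Object vmName = connection.getAttribute(
                    new ObjectName("java.lang:type=Runtime"), "VmName");
            System.out.println("Connected to VM: " + vmName);
        }
    }
}
```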