Simply put, machine-readable output of (non-)standard utilities. Just imagine that every command-line utility on your UNIX that displays output some programmer out there might find useful could also print it in a form easily crunchable by a program. Possible examples are:

ifconfig

ctfdump

sysctl

find & ls

netstat, procstat

ping, traceroute

insert your tool here

If this sounds even a little bit interesting and important, continue reading.

Problem

You see, I am not the only one to think of this; take this repository as an example. Don’t get me wrong, it is a nice piece of software. The problem with such an approach is scalability: if I were to write a JSON exporter for every one of the utilities mentioned above, I might as well quit university and work on that full-time. Now let’s imagine there is another company that loves XML, or maybe C data type notation, to exchange data. For every format we add, the code of these utilities gets bulkier, and the original purpose of the program gets buried under hundreds of lines of code emitting your favourite format. If we denote the number of formats F, we might say we have a 1:F approach: one utility has to speak F formats.

The same goes for adding a new utility to your arsenal of machine-readable-output-creating programs. You will need to teach it to speak every format you fancy. Writing the tenth YAML converter must be fun. Not. If we denote the number of utilities U, we might say we have a U:1 approach: U utilities each carry their own converter for one format.

So, what do we want exactly? U:F.

P.S.: Do not get me started on the situation when your favourite XML library is no longer supported, or has changed its license to an incompatible one, and you have to rewrite every one of your tools.

Solution

We want to write a YAML/JSON/XML/… converter only once. And, maybe even more importantly, we want to infect the source of each utility with only one piece of data-emitting code. With U utilities and F formats, that means U + F pieces of code instead of U × F. Where does that leave us? We need an intermediate data format. An Intermediate Data Format Library, to be precise; let’s call it libidf.
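To make the idea a bit more concrete, here is a minimal sketch in C, under heavy assumptions: none of these names (idf_pair, idf_record, idf_emit_json, idf_emit_xml) is the real libidf API, and the flat key/value record is just the simplest intermediate form that illustrates the point. A utility fills the record once, and each output format gets exactly one emitter.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical intermediate form: a flat list of string pairs. */
struct idf_pair {
	const char *key;
	const char *value;
};

struct idf_record {
	const struct idf_pair *pairs;
	size_t count;
};

/* Append formatted text at *off; return -1 when buf is too small. */
static int
idf_appendf(char *buf, size_t len, size_t *off, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf + *off, len - *off, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= len - *off)
		return -1;
	*off += (size_t)n;
	return 0;
}

/* The one and only JSON emitter (no string escaping in this sketch). */
int
idf_emit_json(const struct idf_record *rec, char *buf, size_t len)
{
	size_t off = 0;

	if (idf_appendf(buf, len, &off, "{") != 0)
		return -1;
	for (size_t i = 0; i < rec->count; i++) {
		if (idf_appendf(buf, len, &off, "%s\"%s\": \"%s\"",
		    i == 0 ? "" : ", ", rec->pairs[i].key,
		    rec->pairs[i].value) != 0)
			return -1;
	}
	return idf_appendf(buf, len, &off, "}");
}

/* The one and only XML emitter (no escaping here either). */
int
idf_emit_xml(const struct idf_record *rec, char *buf, size_t len)
{
	size_t off = 0;

	if (idf_appendf(buf, len, &off, "<record>") != 0)
		return -1;
	for (size_t i = 0; i < rec->count; i++) {
		if (idf_appendf(buf, len, &off, "<%s>%s</%s>",
		    rec->pairs[i].key, rec->pairs[i].value,
		    rec->pairs[i].key) != 0)
			return -1;
	}
	return idf_appendf(buf, len, &off, "</record>");
}
```

A utility such as ifconfig would build one idf_record and hand it to whichever emitter the user asked for. Adding YAML later would mean one new emitter and no change to any utility.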

More about the design and implementation of this library in the next posts. Stay tuned!

Idea: ctfcompress
Thu, 29 May 2014

Maybe an equivalent tool already exists, but I was not able to google it. I had no time to dig up the exact moment in the CTF toolchain process where the data gets compressed. But it might be a good idea to make this a standalone stage and thus create ctfcompress, a tool to (de)compress the CTF data. Usage would be very simple: it would take a parameter, either -d or -c, and a set of object files to perform the magic on.

Uniquification
Thu, 29 May 2014

Fancy word, don’t you think? Well, it is a fancy feature too!

Now let’s imagine this scenario: we have a kernel object full of ubiquitous types like uid_t, ushort, struct proc and so on. Alongside this kernel, we have a dozen kernel modules built, mostly sharing numerous types with the kernel base. Needless to say, the same can apply to some userland application setups. Without any post-processing, the kernel and each of the modules would carry their own generated CTF data. You can probably see now that this causes unnecessary duplication. And that is exactly where uniquification comes into play.

It is done using the ctfmerge tool. Normally, we use it to take a set of objects and merge them into one while making sure there is only one occurrence of each type. But this would not be enough: we also want to specify the parent file that we uniquify against. This saves a large portion of the space (currently I have no estimate; there should be a post in the future with charts). Now, we need to realise this is not without problems: just imagine that we want to update the parent object without even touching the child objects. Worry not, there is a special switch in ctfmerge that provides an additive merge. One of the next blog posts will be exactly about this.
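The core idea of uniquifying a child against a parent can be sketched in a few lines of C. This is only an illustration under stated assumptions, not how ctfmerge is implemented: types are identified here by their name alone (a real tool would have to compare whole type definitions), and the function name uniquify is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 when name occurs among the first count entries of set. */
static int
type_known(const char *name, const char *const *set, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (strcmp(name, set[i]) == 0)
			return 1;
	}
	return 0;
}

/*
 * Copy into out every child type that the parent does not already
 * provide, keeping a single occurrence of each. The out array must have
 * room for n_child entries. Returns the number of types kept.
 */
size_t
uniquify(const char *const *child, size_t n_child,
    const char *const *parent, size_t n_parent, const char **out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n_child; i++) {
		if (type_known(child[i], parent, n_parent))
			continue;	/* shared with the parent: drop it */
		if (type_known(child[i], out, kept))
			continue;	/* duplicate inside the child: drop it */
		out[kept++] = child[i];
	}
	return kept;
}
```

With a parent holding uid_t, ushort and struct proc, a module that also uses those three types plus two of its own would end up storing only its own two.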

It seems that none of my kernel objects that were made during the CTF kernel build contain any references to the merging: the output of this command is either “cth_parname = (anon)” or a self-reference to the CTF header struct. Further evidence comes from simply looking at the kernel objects with ctfdump: each one of them contains type definitions for the common int or void.

After searching for usages of ctfmerge in the system makefiles, I found that it always appears only in the body of an if that checks the MK_CTF variable for the “no” value. It seems that the file /usr/share/mk/bsd.own.mk contains the definition of this variable and sets it to “no”. I will have to play some more with this.

CTF header I.
Tue, 27 May 2014

Now that we have the CTF data at our disposal, we need to parse it and get meaningful information out of it. As usual, the format starts with a header that holds data-wide settings such as the version and a leading magic number.

Indeed, the CTF data starts with the 16-bit magic number 0xCFF1. I suspect this is some kind of pun, trying to recreate the name of the format. 0xCC1F would serve as a much better number, since “CTF” actually stands for Compact C Type Format.

The next byte represents the version of the format. Currently, versions 1 and 2 exist. Every one of my kernel objects (FreeBSD 11) that I checked has version 2; therefore, for libctf I would prefer to start with version 2 and maybe discard version 1, as it is old and unused in the project.

The fourth byte is treated as a place for 8 flags, even though I was able to find only one: the compress flag. It means that the actual CTF data is compressed to save some space. One of the remaining bits could be used to signal the endianness of the data; this feature was requested more than once by the community as lacking in the current library.

One thing I noticed looking at the current implementation is that it uses types like short, char and so on. Maybe I am overcautious or not knowledgeable enough, but these types do not have a fixed size. OK, a char is by definition always one byte, but a short is not guaranteed to be 2 bytes wide. I understand the reasons behind these loosely sized C types, but sometimes we really need to be able to rely on exact widths. To address this problem, the C99 standard (and POSIX along with it) introduced the headers stdint.h and inttypes.h, which define types like uint8_t or int32_t (and many others). As far as I can tell, they are ordinary header files shipped with the compiler or the C library, containing the correct typedefs for the platform. The realisation of the types aside, I would like to use them in my libctf implementation to ensure the proper sizes in all situations.
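Here is a minimal sketch of how the first four bytes described above could be decoded with the fixed-width types. It rests on a few assumptions that are mine and not confirmed by the text: the struct and function names are made up, the compress flag is taken to be the lowest bit of the flags byte, and the magic number is assumed to be stored in little-endian byte order.

```c
#include <stddef.h>
#include <stdint.h>

#define PREAMBLE_MAGIC		0xCFF1u	/* 16-bit magic number from the text */
#define PREAMBLE_F_COMPRESS	0x01u	/* assumed position of the compress flag */

/* Hypothetical name; the layout follows the description above. */
struct ctf_preamble_sketch {
	uint16_t magic;		/* bytes 0-1 */
	uint8_t version;	/* byte 2 */
	uint8_t flags;		/* byte 3 */
};

/*
 * Decode the first four bytes of buf into out.
 * Returns 0 on success, -1 when the buffer is too short or the magic
 * number does not match. Little-endian storage is assumed.
 */
int
preamble_parse(const uint8_t *buf, size_t len, struct ctf_preamble_sketch *out)
{
	if (buf == NULL || out == NULL || len < 4)
		return -1;
	out->magic = (uint16_t)(buf[0] | ((uint16_t)buf[1] << 8));
	out->version = buf[2];
	out->flags = buf[3];
	return (out->magic == PREAMBLE_MAGIC) ? 0 : -1;
}

/* Return 1 when the (assumed) compress bit is set, 0 otherwise. */
int
preamble_is_compressed(const struct ctf_preamble_sketch *p)
{
	return (p->flags & PREAMBLE_F_COMPRESS) != 0;
}
```

Reading the bytes one by one, instead of casting the buffer to a struct pointer, keeps the sketch independent of struct padding and of the byte order of the machine running it.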

New relevant documentation and ideas
Mon, 26 May 2014

Pedro Giffuni sent me some great reading material regarding the CTF. The links can be seen on the official wiki of the project.

Many thanks!

Hunting for the .SUNW_ctf
Mon, 26 May 2014

In order to actually do something with the CTF data, we must first obtain it. The location, if everything was compiled and converted appropriately, is the ELF section named “.SUNW_ctf”. Hence I started today’s research by listing all the sections one by one and comparing each name stored in the ELF string table to the section name we are searching for. After a successful match, we proceed to get hold of the actual data. Luckily, the libelf API is well designed and all this was really straightforward. The important part of the code:

The whole code can be found here: elfctf.c (disclaimer: the code is nowhere near an ideal state; there are no comments and no error checking, it is just a proof of concept).

Once the missing comments and error checking are added, this code may be used in the rewrite of the CTF toolset, starting with ctfdump. I would like to keep the libelf code out of libctf for a simple enough reason: keeping the library as light as possible when it comes to dependencies.

A small thought near the end: since there is no real connection anymore with Sun Microsystems or Solaris, it might be suitable to rename this section to plain “.ctf”. There are not that many consumers of the CTF data that would need to change, most notably DTrace and the CTF toolset (ctfdump, ctfconvert and ctfmerge).

My FreeBSD setup
Fri, 23 May 2014

For some time now, I have happily been using OS X on my MacBook as my web-browsing, film-watching, BZFlag-playing operating system. But for my developer needs, I use FreeBSD, with occasional experimenting (or just portability checking) on OpenBSD.

I would like to share my FreeBSD setup with you.

First of all, it runs in VirtualBox, in headless mode. What does that mean? Simply put, there is no window and therefore VirtualBox eats fewer resources. I have set up a port-forwarding rule on localhost, so that TCP connections to port 3022 on the host OS are forwarded to port 22 on the guest OS (in this case FreeBSD). By default, the FreeBSD installation runs sshd. After everything is loaded, I just run ssh -p 3022 root@localhost and get a nice terminal inside my iTerm2!

dtrace.org mailing list
Fri, 23 May 2014

My initial goal is to document the CTF format and forge this knowledge into a useful and sober API. Therefore, I contacted the people at dtrace.org, one of the very few groups that might be interested in my work. I briefly introduced myself and my project, stated my objectives and asked for commentary and opinions.

Only a few moments after my mail, I got a reply from Robert Mustacchi. To summarise his input:

he recommends taking a look at the DTrace print() action and mdb’s ::print dcmd

he suggests taking a top-down view of the problem: try to prototype the CTF tools like ctfmerge, ctfdump or ctfconvert and see what API they might need

he finds the reading/lookup API in the illumos libctf to be good enough

The first point seems like a reasonable thing to do (along with the LLDB default settings), but it has to wait until we have a working libctf. I like the top-down approach idea and will suggest it to George (my mentor).

Hello world!
Fri, 23 May 2014

Just created my blog account. And everything seems to be working just fine!