DNS Resolver Libraries

We had an application that required parsing received DNS response packets for queries made by other applications. The DNS packet structure is a 12-byte header, followed by entries in the question format for the question section, followed by entries in the resource record (RR) format for the answer, authoritative, and additional sections. The header specifies the number of entries in each of the four sections. Encoding and decoding are made a bit more elaborate by the introduction of pointers: to save space in packets, domain names are compressed, encoded such that a domain name can reference, as its suffix, some suffix of a prior domain name in the packet.
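For orientation, here is a minimal sketch of that 12-byte header as a C struct (field names follow RFC 1035, section 4.1.1; this is illustrative and not taken from any of the libraries below):

#include <stdint.h>

/* The 12-byte DNS header (RFC 1035, section 4.1.1). All fields are carried
 * in network byte order; "flags" packs QR, Opcode, AA, TC, RD, RA, Z and
 * RCODE. */
struct dns_header {
    uint16_t id;       /* query identifier, echoed in the response */
    uint16_t flags;    /* QR, Opcode, AA, TC, RD, RA, Z, RCODE */
    uint16_t qdcount;  /* entries in the question section */
    uint16_t ancount;  /* RRs in the answer section */
    uint16_t nscount;  /* RRs in the authoritative section */
    uint16_t arcount;  /* RRs in the additional section */
};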

The project required a parser that was permissively licensed, high-performance, mature, easy to use, and secure. This sounds like a tall order, but there are some strong candidates! Wikipedia has a page that compares DNS server implementations, complemented by a Google search for DNS client libraries in C. We then filtered by permissive license.

MaraDNS. Has two server implementations: MaraDNS, an authoritative, non-recursive daemon whose goal is to serve DNS records to recursive resolvers, and Deadwood, a non-authoritative, recursive DNS daemon that helps clients resolve domain names. MaraDNS development started in 2001, Deadwood in 2007. The two components share no code, and the README claims the Deadwood code is cleaner. Deadwood looks like solid code with just a handful of sources. dwx_dissect_packet() in DwRecurse.c holds the logic for handling incoming packets, although the code malloc()s memory for strings and structs.

NSD. The packet parsing code, packet_read_rr() in packet.c, seems to allocate memory using a custom memory allocator (region_alloc()). It also interacts with a domain_table_type object (defined in namedb.h), i.e., the code does not seem to cleanly separate parsing from processing. dns.h has macros to help parse packets.

Unbound. The client library, libunbound, implements an asynchronous resolver, and process_answer() in libunbound.c handles an incoming packet when it arrives. Handling is performed by process_answer() -> process_answer_detail() -> libworker.c: libworker_enter_result() -> parse_reply() -> util/data/msgparse.c: parse_packet(), which parses the header and calls parse_section() on the sections, so parse_packet() seems to be the most relevant entry point when considering Unbound. At the end, process_answer() calls the user-supplied callback function. The code uses its own buffer implementation, sldns_buffer.

YADIFA. A relatively recent implementation (started 2012) by the .eu TLD, aiming to be “clean, small, light and RFC compliant”. YADIFA seems to be developed internally by the TLD: the download page offers a tar.gz and de-emphasizes the GitHub repo link, and the repo contains just a handful of commits, one for each release. YADIFA has four libraries: dnscore, dnsdb, dnszone, and dnslg. dnscore seems to include the packet parsing infrastructure in message.h; the MESSAGE_ macros access aspects of the header, and message_process() unpacks into a struct message_data. However, message_process() appears to mix packet access using the macros with server-specific logic. Specifically, it drops all answer packets (for the parsing use case we would want to parse these too).

c-ares. The client library used by libcurl, Wireshark, node.js and more. Developed on GitHub. Macros for parsing headers, questions and RRs are in ares_dns.h. ares_expand_name() in ares.h and ares_expand_name.c decodes compressed DNS names into a buffer it allocates. ares_parse_a_reply() shows very nicely how to process A replies, with support for following CNAME records. It populates a list of all the IP addresses supplied, plus the resolved hostname and its aliases via the CNAME chain. This seems easy to use; the only potential problem is multiple memory allocations.
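As a quick illustration, a sketch of parsing a raw A response with c-ares (the surrounding buffer handling is assumed; abuf/alen would hold a complete DNS response received by the application's own socket code):

#include <stdio.h>
#include <netdb.h>
#include <ares.h>

/* Sketch: abuf/alen are assumed to hold a complete DNS response packet. */
static void handle_a_reply(const unsigned char *abuf, int alen)
{
    struct hostent *host = NULL;
    struct ares_addrttl ttls[8];
    int nttls = 8;  /* in: capacity of ttls[]; out: entries filled in */

    if (ares_parse_a_reply(abuf, alen, &host, ttls, &nttls) == ARES_SUCCESS) {
        printf("resolved %s to %d address(es)\n", host->h_name, nttls);
        ares_free_hostent(host);  /* the parser allocates the hostent */
    }
}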

dns.c. This is a full recursive, asynchronous DNS client in one C file, supposedly making it easy to embed. Has a GitHub mirror. The header file seems to have everything needed to decode packets, but the code carries no documentation. It is admirably clear in naming and style, and low-level functions stand out: dns_rr_parse(), dns_a_parse(), etc., but the sparse comments make it hard to find the right high-level functionality to parse packets. It might be that the resolver interface contains much of the incoming DNS packet processing logic (e.g., what to do when a packet contains multiple RRs).

udns. Has quite detailed documentation. Packet parsing is separate from the communication logic: the user supplies dns_parse_fn(), a callback to parse DNS packets; this callback is called when the reply arrives, after qualifying its DN, query class and query type as matching the original request. The library contains parsers for responses of standard query types, including A, AAAA, MX, and more. For A queries see the dns_*_a4() functions, specifically dns_parse_a4(). The “low-level interface” has parsing functions, including a domain decoder that doesn’t allocate memory, dns_getdn(). The low-level parsing code has everything required to parse a DNS packet quickly: after dns_initparse(), the user calls dns_nextrr() repeatedly to get the content of the RRs; if given the desired domain name (i.e., the DN in the query), the library performs automatic CNAME expansion.

s6-dns. The initial concern is the library’s maturity. It came up in a Google search, like SPCDNS, and was not cross-mentioned in other libraries’ documentation (unlike c-ares, dns.c, and udns). The documentation looks good, with separate libraries and s6dns_message_parse_answer_a() in the s6dns_message component. However, the library has a dependency on a set of infrastructure libraries, skalibs, and uses skalibs’ dynamically allocated buffer, stralloc.

SPCDNS. The project page is dead, there are few recent commits as of writing, and there are only two contributors, but the package focuses on parsing rather than performing the entire query, and does not allocate memory. The library’s dns.h has an impressive collection of structs for RR types. However, a comment asking the user to choose the buffer size used when parsing packets, and to choose it large enough to avoid segfaults, is worrying, especially together with the fact that the sizes to be allocated are computed as multiples of a uintptr_t (i.e., 1 and 2) and the library wants a uintptr_t * as input.

Linux TCP congestion control internals

Linux has a pluggable TCP congestion control architecture: the IPv4 and IPv6 implementations both call a set of functions that implement congestion control. The congestion control algorithm can be changed system-wide for new connections, or set for individual sockets using setsockopt. Here, we look at how the TCP implementation interacts with the congestion control algorithms and …
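For example, switching a single socket to a different algorithm (assuming the named algorithm is available on the system) uses the TCP_CONGESTION socket option:

#include <string.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Select "cubic" for one socket; setsockopt() fails if the named
 * algorithm is not available (or not permitted) on this system. */
static int set_congestion_control(int fd)
{
    const char *algo = "cubic";
    return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, algo, strlen(algo));
}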

Research is based on kernel v4.4.3

The congestion control interface

Linux’s congestion control algorithm interface is defined in struct tcp_congestion_ops.
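An abridged sketch of the struct (from include/net/tcp.h in v4.4; a few optional members and housekeeping fields are trimmed — see the kernel source for the full definition):

struct tcp_congestion_ops {
	struct list_head list;
	u32 key;
	u32 flags;

	void (*init)(struct sock *sk);      /* initialize private data (optional) */
	void (*release)(struct sock *sk);   /* cleanup private data (optional) */

	u32 (*ssthresh)(struct sock *sk);   /* return slow start threshold (required) */
	void (*cong_avoid)(struct sock *sk, u32 ack, u32 acked); /* new cwnd (required) */
	void (*set_state)(struct sock *sk, u8 new_state);        /* ca_state changed */
	void (*cwnd_event)(struct sock *sk, enum tcp_ca_event ev);
	void (*in_ack_event)(struct sock *sk, u32 flags);        /* on incoming ACKs */
	u32 (*undo_cwnd)(struct sock *sk);  /* new cwnd after a loss is undone */
	void (*pkts_acked)(struct sock *sk, u32 num_acked, s32 rtt_us);
	size_t (*get_info)(struct sock *sk, u32 ext, int *attr,
			   union tcp_cc_info *info);             /* inet_diag info */

	char name[TCP_CA_NAME_MAX];
	struct module *owner;
};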

Below are some important state variables. tp-> designates a field of struct tcp_sock (in include/linux/tcp.h). Variables from the RFC are named in lowercase:

tp->snd_nxt: the next sequence number to send, i.e., one past the highest seq sent on the wire

tp->snd_wnd: the peer’s window

tp->snd_wl1: the seq number when peer last updated the window

tp->rcv_nxt: the next sequence number to be received. This is used in the ACK field of outgoing packets.

tp->rcv_wnd: the window size advertised to peer

Other important variables:

tp->write_seq: the highest seq written from the user process

tp->copied_seq: sequence number that will next be copied to the user.

tp->rcv_wup: the sequence number when window was advertised to peer (see more below)

tp->tlp_high_seq: zero when no TLP probe has been sent, is set to tp->snd_nxt when sending a TLP probe.

tp->prior_ssthresh: saves the previous ssthresh when going into window reduction, for cwnd undoing, if undoing is allowed.

tp->window_clamp: the maximum rcv window that will be advertised.

The flow-control window – what’s tp->rcv_wup (a.k.a RCV.WUP)?

On outgoing packets, a TCP sender includes an ACK and th->window, the number of bytes past the ACK that the peer is allowed to send. Special care is taken not to shrink the window when changing it, so that if the other end has already sent some bytes, they will fit in the new window.

The kernel keeps the information on the last window update in two variables. tp->rcv_wnd keeps the advertised window, and tp->rcv_wup keeps the last ACK that carried the update. This means that the other end may send until sequence number tp->rcv_wup + tp->rcv_wnd. So, in subsequent window advertisements, the code makes sure that the last permitted seq is not less than that sequence number.

tcp_receive_window() returns the number of bytes after tp->rcv_nxt that the previous window advertisement allowed.
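The helper is short (paraphrased from include/net/tcp.h in v4.4):

/* Paraphrased from include/net/tcp.h (v4.4): bytes we may still accept
 * under the last advertised window, relative to rcv_nxt. */
static inline u32 tcp_receive_window(const struct tcp_sock *tp)
{
	s32 win = tp->rcv_wup + tp->rcv_wnd - tp->rcv_nxt;

	if (win < 0)
		win = 0;
	return (u32) win;
}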

Incoming packet handling — tcp_rcv_established

Packets usually flow from the NIC to tcp_v{4,6}_do_rcv(), and when the TCP connection is in the ESTABLISHED state, to tcp_rcv_established(). Here, the code follows a Van Jacobson-inspired scheme to quickly process “fast-path” packets, i.e., TCP packets that have no special circumstances: they have the next expected SEQ, carry an ACK in the correct range, and have the anticipated flags. The fast-path code optimizes two cases: ACK-only packets and data packets.

We now survey the functions that are called in each processing path.

ACK-only packet processing will call tcp_ack(), free the skb, then call tcp_data_snd_check(). For data packets, if the user process is waiting on the socket, tcp_copy_to_iovec() tries to copy the data directly to user buffers; otherwise tcp_queue_rcv() enqueues to socket buffers. The data fast-path ends with the following processing:

If tcp_write_xmit() decided not to send, the code calls tcp_check_probe_timer()

Acknowledgement handling in tcp_ack()

tcp_ack() first performs a few checks of the ack against socket state. Is it acking previously-acked data (“old ack”)? Is it acking unsent data (“invalid ack”)? If it is valid, it proceeds to re-arm the RTO if it is set for EARLY_RETRANS or LOSS_PROBE.

The flags at the start of tcp_ack() can contain:

FLAG_DATA: the packet contains data (not a pure ack) in the fast path

FLAG_SLOWPATH: header prediction didn’t work on the packet

FLAG_UPDATE_TS_RECENT: after verifying validity, tcp_ack() should call tcp_replace_ts_recent() to update the timestamp (tp->rx_opt.ts_recent and tp->rx_opt.ts_recent_stamp).

FLAG_DATA: added in tcp_ack() if in the slow path and the packet contains data.

FLAG_WIN_UPDATE: if the right boundary of the peer’s advertised window might have moved (note there could be false positives in the slow path).

FLAG_ECE: ECN Echo bit was marked on the packet.

tcp_clean_rtx_queue() sets the following flags:

FLAG_RETRANS_DATA_ACKED: “This ACK acknowledged new data some of which was retransmitted”

FLAG_ORIG_SACK_ACKED: “Never retransmitted data are (s)acked”

FLAG_DATA_ACKED: previously unacknowledged data is acknowledged by the packet

FLAG_SYN_ACKED: the ACK acknowledges a SYN.

FLAG_SACK_RENEGING: the ACK acknowledges up to some packet, but the subsequent SEQ has been SACKed; this means the peer must have dropped a packet it has SACKed (otherwise the ACK would have included that packet too).

If header prediction succeeds, the packet enters the fast path, and the prediction guarantees the peer’s advertised window has not changed. So, in the fast path only tp->snd_wl1 and tp->snd_una need to be updated. In the slow path, tcp_ack_update_window() checks that the packet contains fresher information, and if so updates these variables and also tp->snd_wnd, and recomputes the fast-path prediction for the next packet. Slow-path packets might have SACKs, so these are processed next, followed by a check for the ECN Echo bit. tcp_clean_rtx_queue() frees packets from the write queue that are acked, and so have arrived at the destination.

A dubious ack (see tcp_ack_is_dubious()) is an ack whose packet:

carries the FLAG_CA_ALERT flag, or

does not carry any of the flags specified by FLAG_NOT_DUP, i.e., contains none of FLAG_DATA, FLAG_WIN_UPDATE, FLAG_DATA_ACKED, or FLAG_SYN_ACKED, or

belongs to a connection not in the TCP_CA_Open state

When acks are considered “dubious”, tcp_ack() calls tcp_fastretrans_alert() which we cover in a later section.
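The check itself is a one-liner (paraphrased from net/ipv4/tcp_input.c in v4.4):

/* Paraphrased from net/ipv4/tcp_input.c (v4.4). */
static inline bool tcp_ack_is_dubious(const struct sock *sk, const int flag)
{
	return !(flag & FLAG_NOT_DUP) || (flag & FLAG_CA_ALERT) ||
		inet_csk(sk)->icsk_ca_state != TCP_CA_Open;
}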

cong_avoid

Called from tcp_ack() if tcp_may_raise_cwnd() returns true. This requires:

The socket must not be in the CWR or Recovery states

Some progress has been made. Usually this means that some packets were acknowledged so FLAG_DATA_ACKED is set: tcp_clean_rtx_queue() counts how many packets were fully acknowledged by the ACK and removes them from the retransmit queue; if packets are removed, it sets FLAG_DATA_ACKED. When reordering in the network exceeds a threshold, a wider definition of progress is used (see FLAG_FORWARD_PROGRESS; basically this also counts newly SACKed packets).
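For a concrete cong_avoid implementation, here is the Reno version (paraphrased from net/ipv4/tcp_cong.c in v4.4):

/* Paraphrased from net/ipv4/tcp_cong.c (v4.4): the classic Reno
 * cong_avoid -- slow start below ssthresh, then additive increase. */
void tcp_reno_cong_avoid(struct sock *sk, u32 ack, u32 acked)
{
	struct tcp_sock *tp = tcp_sk(sk);

	if (!tcp_is_cwnd_limited(sk))
		return;

	/* In "safe" area, increase. */
	if (tcp_in_slow_start(tp)) {
		acked = tcp_slow_start(tp, acked);
		if (!acked)
			return;
	}
	/* In dangerous area, increase slowly. */
	tcp_cong_avoid_ai(tp, tp->snd_cwnd, acked);
}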

tcp_fastretrans_alert()

On incoming packets with ECE marking, disable cwnd undoing.

If SACK reneging was detected (FLAG_SACK_RENEGING was set in tcp_clean_rtx_queue()), sets a short retransmission timeout to allow for ACKs to arrive for these packets, or otherwise the timeout will reset SACK state. If reneging is suspected, no further processing is done in tcp_fastretrans_alert().

If in the CWR state and the ACK is above tp->high_seq, tcp_fastretrans_alert() will call tcp_end_cwnd_reduction(), which resets tp->snd_cwnd to tp->snd_ssthresh, then moves to the TCP_CA_Open state.

If in the Recovery state, and the ACK is at or above tp->high_seq, it will call tcp_try_undo_recovery()

cwnd_event and TCP events

The TCP stack reports some events with tcp_ca_event(), which are then propagated to the congestion control’s cwnd_event:

In v4.4.3, calls to tcp_ca_event() appear in 8 source lines, one for each of the event types.

CA_EVENT_TX_START from tcp_event_data_sent()

CA_EVENT_CWND_RESTART from tcp_cwnd_restart()

CA_EVENT_COMPLETE_CWR from tcp_end_cwnd_reduction()

CA_EVENT_LOSS from tcp_enter_loss()

CA_EVENT_ECN_NO_CE, CA_EVENT_ECN_IS_CE from __tcp_ecn_check_ce()

CA_EVENT_DELAYED_ACK from tcp_send_delayed_ack()

CA_EVENT_NON_DELAYED_ACK from tcp_send_ack()

ssthresh

ssthresh is called from two functions: tcp_init_cwnd_reduction and tcp_enter_loss. The return value sets tp->snd_ssthresh, where tp points to struct tcp_sock.

in_ack_event

Called from tcp_in_ack_event(). In the v4.4.3 kernel, only DCTCP and Westwood implement this callback. This is where DCTCP maintains counters for all acknowledged bytes and all ECN-marked bytes, and updates its alpha estimate each RTT (see dctcp_update_alpha()).

Whereas cong_avoid() is called only when tcp_may_raise_cwnd() is true, in_ack_event() is called for every ACK, and the call comes before cong_avoid() in ACK processing, so congestion control algorithms that implement both can expect a call to in_ack_event() before every cong_avoid().

pkts_acked

Called from tcp_ack() -> tcp_clean_rtx_queue(). This seems to be called for every ACK-only packet, and for every valid slow-path packet with a valid ack value, even when no new packets have been acked. For fast-path packets with data, tcp_ack() is only called if ACK_SEQ != SND_UNA, i.e., when it is either an old ack or acks more bytes.

get_info

This is used when dumping diagnostics, i.e., dump() and dump_one() in tcp_diag_handler.

Preparing a private git repository for public release

A git project that is open-sourced could require some manipulation before it is ready for release, such as changing the directory structure and cleaning out irrelevant files. When the repository contains a single sub-directory with the content to release (and only such content), there are methods to split a git repository. This post covers techniques for finer-grained manipulation.

Removing history of deleted files

A git project frequently arrives at a point where it is beneficial to split it into two repositories. You may have developed a library and an application and want to release them separately, or perhaps some of the documentation grows to become a book, to be maintained by a different set of individuals. How does one create a repository with a subset of files from a big repository, optionally changing the directory structure, while keeping history for the split part and discarding history for files not carried over?

I’ve had a good experience with extracting a patch from the original repository and applying it to a new repository, when separating libwireless from a bigger repository with a lot of experimental code. The rest of the post surveys the command-line parameters from the StackExchange post, and lists parameters for modifying the directory structure of files (for example, moving files from directoryA in the original repo to directoryB in the new repo).

Generating the patch

--pretty=email: format as an email message, to be read by git am. Using git log with the email format rather than git format-patch allows easy creation of a single patch file with multiple commits.

--patch-with-stat: generate a patch and include statistics on changed files. It is not clear why stats are needed for this purpose, but they don’t appear to hurt.

--reverse: oldest commits first, so that git am applies them in order.

--full-index: show full object names, not only the first few characters

--binary: binary diff, used to support binary files

To change the directory structure:

--relative=<path>: patch filenames should be relative to the given directory. Will also exclude changes outside the directory.

The combination --full-history and --simplify-merges would tell git to keep more history than the default, but the default setting seems to keep a good amount of history structure.

Applying the patch

--committer-date-is-author-date: usually, the time of the git am run is used as the commit date. This flag preserves the original date.

--directory=<path>: prepends the given directory to the files in the patch, allowing the files to be placed under that subdirectory.

-p<n>: removes n leading path components from the filenames. This can be used as an alternative to --relative when constructing the path. The files always have one leading component, so to remove one level of directory nesting, use n=2.
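Putting the flags together, the whole transfer might look like the following sketch (directoryA, directoryB, and the patch path are placeholders):

# In the original repository: export directoryA's history as a single
# mailbox-format patch file, oldest commits first.
git log --pretty=email --patch-with-stat --reverse --full-index --binary \
    --relative=directoryA -- directoryA > /tmp/directoryA.patch

# In the new repository: apply the series, keeping the original dates and
# re-rooting the files under directoryB.
git am --committer-date-is-author-date --directory=directoryB \
    < /tmp/directoryA.patch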

Recovering Eclipse workspaces from CDT errors

I use unison to synchronize work environments between different workstations (more convenient than rsync IMO), and include the Eclipse workspace in the synchronized directory. The problem is that sometimes, if the remote Eclipse is running, a sync will corrupt some workspace state and prevent Eclipse from running. This happens to me every 2-3 months.

Eclipse displays an error message that points at eclipse-workspace/.metadata/.log, which contains something like:

!ENTRY org.eclipse.osgi 4 0 2015-10-16 10:10:41.245
!MESSAGE Application error
!STACK 1
java.lang.NoClassDefFoundError: org/eclipse/core/resources/IContainer
at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:140)
at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:134)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:104)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:380)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:235)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:669)
at org.eclipse.equinox.launcher.Main.basicRun(Main.java:608)
at org.eclipse.equinox.launcher.Main.run(Main.java:1515)
at org.eclipse.equinox.launcher.Main.main(Main.java:1488)
Caused by: java.lang.ClassNotFoundException: An error occurred while automatically activating bundle org.eclipse.core.resources (107).
at org.eclipse.osgi.internal.hooks.EclipseLazyStarter.postFindLocalClass(EclipseLazyStarter.java:116)
at org.eclipse.osgi.internal.loader.classpath.ClasspathManager.findLocalClass(ClasspathManager.java:531)
at org.eclipse.osgi.internal.loader.ModuleClassLoader.findLocalClass(ModuleClassLoader.java:324)
at org.eclipse.osgi.internal.loader.BundleLoader.findLocalClass(BundleLoader.java:327)
at org.eclipse.osgi.internal.loader.sources.SingleSourcePackage.loadClass(SingleSourcePackage.java:36)
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:398)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:352)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:344)
at org.eclipse.osgi.internal.loader.ModuleClassLoader.loadClass(ModuleClassLoader.java:160)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 14 more
Caused by: org.osgi.framework.BundleException: Exception in org.eclipse.core.resources.ResourcesPlugin.start() of bundle org.eclipse.core.resources.
at org.eclipse.osgi.internal.framework.BundleContextImpl.startActivator(BundleContextImpl.java:792)
at org.eclipse.osgi.internal.framework.BundleContextImpl.start(BundleContextImpl.java:721)
at org.eclipse.osgi.internal.framework.EquinoxBundle.startWorker0(EquinoxBundle.java:941)
at org.eclipse.osgi.internal.framework.EquinoxBundle$EquinoxModule.startWorker(EquinoxBundle.java:318)
at org.eclipse.osgi.container.Module.doStart(Module.java:571)
at org.eclipse.osgi.container.Module.start(Module.java:439)
at org.eclipse.osgi.framework.util.SecureAction.start(SecureAction.java:454)
at org.eclipse.osgi.internal.hooks.EclipseLazyStarter.postFindLocalClass(EclipseLazyStarter.java:107)
... 23 more

A solution that has worked for me previously (found online) is removing the *.snap files in eclipse-workspace/.metadata/.plugins/org.eclipse.core.resources:
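For example (reconstructed from the description above; back up the workspace before deleting anything):

rm eclipse-workspace/.metadata/.plugins/org.eclipse.core.resources/*.snap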

Sometimes I had to re-import the projects into the workspace, but otherwise this worked fine.

If you’re here, you’ve probably had similar problems… Good luck, and I hope this helps!

Creating an LTTng tracepoint file

LTTng is a framework to collect kernel tracepoint logs with low overhead (see section 5.5 in Desnoyers’ thesis for an evaluation). Instrumenting the kernel is done in two parts: adding a kernel tracepoint, and writing the LTTng adaptation layer. The process is documented in a section of the LTTng docs; however, there are a few points that required fixing to make the adaptation layer work, detailed here.

The docs (at the time of writing of this post) show an adaptation header that fails to compile.
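The original snippet is missing here; a minimal adaptation header has roughly the following shape (the hello_world event, its field, and the include paths are illustrative assumptions based on the lttng-modules tree layout):

/* Hypothetical adaptation header; the event name, prototype and field
 * are made up for illustration. Note: no semicolons after the
 * LTTNG_TRACEPOINT_* macros. */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM hello

#if !defined(LTTNG_TRACE_HELLO_H) || defined(TRACE_HEADER_MULTI_READ)
#define LTTNG_TRACE_HELLO_H

#include "../../../probes/lttng-tracepoint-event.h"

LTTNG_TRACEPOINT_EVENT(hello_world,
	TP_PROTO(int foo),
	TP_ARGS(foo),
	TP_FIELDS(
		ctf_integer(int, foo, foo)
	)
)

#endif /* LTTNG_TRACE_HELLO_H */

/* This part must be outside the include guard */
#include "../../../probes/define_trace.h"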

Removing all semicolons after the LTTNG_TRACEPOINT_* macros fixes the compile errors.

Multiple adaptation headers in one lttng-probe-*.c file – doesn’t work

Not everyone might encounter this, but if you’re trying to #include multiple LTTng adaptation headers in one file, this does not seem to be supported (at least not in the git repo of mid-January 2015). Both cases failed: (1) multiple headers included for the same TRACE_SYSTEM; (2) multiple headers with different TRACE_SYSTEMs. The macros redefine some functions, causing a compile error. FYI!

Happy LTTng’ing!

Linux kernel 3.x call-graphs with ncc-2.8

ncc is a compiler built specifically to extract call-graph information from source code. One strong feature is its analysis of function pointers. Call graphs can help explore and learn large code bases, which makes them especially useful for the Linux kernel. ncc comes with documentation on extracting call-graphs from 2.6 kernels, apparently tested circa 2008. We set out to test ncc on a newer kernel (3.10.25).

Summary

ncc needs work to support the Linux kernel; while I solved some problems, more work needs to be done to complete the kernel compile.

__asm__ volatile and goto

ncc 2.8 does not support “__asm__ volatile” and “__asm__ goto”, which are used in one of the first files compiled, kernel/bounds.c. To fix, add these to the function __asm___statement() in ncc’s parser.C.

nccar argument parsing

ncc emulates ar to link together collections of collected metadata. In the process, ncc makes assumptions about archive names (ending with .a or .o); however, the kernel makefiles use filenames like “.28948.tmp”. This causes a segfault in ncc’s ar code. The fix is a small change in preproc.C.

Turning off compiler optimizations

For some big files, ncc sputters and segfaults; I saw this with init/init_task.c. The core dump didn’t help much in finding the root cause. When I compiled ncc with -O0 using icc, the segfault never came and ncc carried on; once past that file, I recompiled ncc with the default optimization and the kernel compilation continued.

Running ncc on the kernel

After copying a .config file to the desired directory, I ran the kernel compilation along these lines:
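A sketch based on ncc's documented usage for kernel builds (the exact flags may differ for your setup):

make CC="nccgen -ncgcc -ncld -ncfabs" AR=nccar LD=nccld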

Making codeviz and gcc

The makefile calls the script compilers/install_gcc-4.6.2.sh to compile the patched gcc. A few changes to that script might be needed to get everything working. Thanks to Stephan Friedl’s post, which gave the final fix for the gcc compilation.

Get a VM to compile gcc

This technique for setting up the system involves making many changes to /usr/lib and /usr/include, which could break the system in the future. I installed Ubuntu 14.04 desktop amd64 with 16GB of hard disk on VirtualBox (I also tested QEMU, but the VM was slow).

March 2013 – Boston’s St. Patrick’s Day Parade

(Photo gallery.)
Creating presentations programmatically

One can convey quite complex ideas in presentations using the right diagrams; however, composing diagrams using software like PowerPoint or OpenOffice Impress can be time consuming, since elements are created and manipulated individually. This becomes even more problematic when showing the evolution of an idea through several slides: changing some aspect (e.g., color) in the first diagram requires manual modification of all subsequent diagrams.

In some cases, the diagram content can be represented using a few lines of code, for example when demonstrating a computer algorithm. Programmatically authoring presentations could be a good solution in such cases.

HTML/JavaScript

Several JavaScript libraries allow graphics rendering. JavaScript presentations enjoy being platform independent, so running your presentation on a host computer could be easy. Raphael.js provides drawable objects with attributes that can be modified in the course of an animation. Paper.js manipulates objects through methods. Processing.js has a flatter model for drawing, where the canvas is painted on (similar to Cairo), so it is a lower-level engine. A comparison article highlights the differences between the three libraries. Some lists of JavaScript libraries are focused on data visualization, but some of these libraries can be used to construct presentations.

D3.js is a framework for manipulating DOM objects based on data, and has good facilities for reasoning about animations (“transitions” in D3 language). This is a strong contender for an animation engine.

Slide deck libraries display information in sequence, and some provide slide sorter menus and presenter mode. Some operate on the standard concept of slides, while others use scrolling or zoom-and-translate (like Prezi). Two promising libraries are deck.js and reveal.js. Both work with my USB clicker.

Python-based libraries

Pyglet provides low-level access to OpenGL, which enables 3D, while PyGame focuses on 2D animations and interaction. PyGTK supports Cairo python bindings to draw on its DrawingArea. These all seem like very low-level mechanisms.

SVG-based

InkScape is a program to create beautiful SVGs, and has a plugin model that allows extensions. SVG files can be animated using the SMIL standard; the InkScape wiki is a good introduction to SMIL animations. SMIL gives higher level control over animations, which could be well suited for presentations, reasoning about object paths and object attributes (for example, size), that change linearly or are bezier-interpolated over time.

Plugins like Sozi and JessyInk make presentations from InkScape; maybe this type of method can be used to program presentations.

Standalone programs

These are not what we were looking for, but might be adapted for use. Bruce and Pinpoint both render text files into presentations, and support the presenter by processing key-presses and showing presenter mode.

Other options

Prefuse is a Java library for creating visualizations. LaTeX-based presentations (e.g., Beamer) could be an option for some.