Posts tagged with 'libreoffice'

So, you have still heard that unfounded myth that it is hard to get involved with and to start contributing to LibreOffice? Still? Even though there are our Easy Hacks, and the LibreOffice developers are a friendly bunch who will help you get started on the mailing lists and on IRC? If those alone do not convince you, it might be because it is admittedly much easier to get started if you meet people face to face, like at one of our upcoming events! Our Hackfests especially are a good way to get started. The next one will be at the University of Las Palmas de Gran Canaria, where we were already guests last year. We presented some introductory talks to the students of the university and then went on to hack on LibreOffice, from fixing bugs to implementing new stuff. Here is what that looked like last year:

If you are a student at ULPGC or live in Las Palmas or on the Canary Islands, we invite you to join us to learn how to get started. For students, this is also a very good opportunity to get involved and prepare for a Google Summer of Code on LibreOffice. Furthermore, if you are already even a casual contributor to LibreOffice code and want to help out sharing and deepening knowledge on how to work on LibreOffice code, you should get in contact with the Document Foundation: while the event is very soon now, there still might be travel reimbursement available. You will find all the details on the wiki page for the Hackfest in Las Palmas de Gran Canaria 2015.

LibreOffice Evening Hacking in Las Palmas 2014

On the other hand, if two weeks is too short a notice for you, but the rest of this sounds really tempting, the next Hackfest is already planned: it will take place in Cambridge in the United Kingdom in May. We will be there with a Hackfest for the first time and invite you to join us from anywhere in Europe, whether you are a LibreOffice code contributor or are interested in learning more about how to become one. Again, there is a wiki page with the details on the LibreOffice Hackfest in Cambridge 2015, and travel reimbursements are available. Contact us!

How I imagine Cambridge in May (photo by Andrew Dunn, CC-BY-SA 2.0, via Wikimedia)

Note that the object here is small and trivial to copy, as one would expect from objects passed around as values (expensive-to-copy objects can mostly be passed around with a std::shared_ptr). So what did this measure? Here are the results:

Time for 1280000000 iterations on an Intel i5-4200U@1.6GHz (-march=core-avx2) compiled with gcc 4.8.3 without inline constructors:

| implementation / CFLAGS | -Os | -O2 | -O3 | -O3 -march=… |
|---|---|---|---|---|
| A1 | 89.1 s | 79.0 s | 78.9 s | 78.9 s |
| A2 | 89.1 s | 78.1 s | 78.0 s | 80.5 s |
| A3 | 90.0 s | 78.9 s | 78.8 s | 79.3 s |
| B1 | 103.6 s | 97.8 s | 79.0 s | 78.0 s |
| B2 | 99.4 s | 95.6 s | 78.5 s | 78.0 s |
| B3 | 107.4 s | 90.9 s | 79.7 s | 79.9 s |
| C1 | 99.4 s | 94.4 s | 78.0 s | 77.9 s |
| C3 | 98.9 s | 100.7 s | 78.1 s | 81.7 s |

And, for comparison, here are the results if one allows the constructors to be inlined.
Time for 1280000000 iterations on an Intel i5-4200U@1.6GHz (-march=core-avx2) compiled with gcc 4.8.3 with inline constructors:

| implementation / CFLAGS | -Os | -O2 | -O3 | -O3 -march=… |
|---|---|---|---|---|
| A1 | 85.6 s | 74.7 s | 74.6 s | 74.6 s |
| A2 | 85.3 s | 74.6 s | 73.7 s | 74.5 s |
| A3 | 91.6 s | 73.8 s | 74.4 s | 74.5 s |
| B1 | 93.4 s | 90.2 s | 72.8 s | 72.0 s |
| B2 | 93.7 s | 88.3 s | 72.0 s | 73.7 s |
| B3 | 97.6 s | 88.3 s | 72.8 s | 72.0 s |
| C1 | 93.4 s | 88.3 s | 72.0 s | 73.7 s |
| C3 | 96.2 s | 88.3 s | 71.9 s | 73.7 s |

Some observations on these measurements:

-march=... is at best neutral: the measured times do not change much in general. It only slightly improves performance in five out of 16 cases, and the two cases with the most significant change in performance (over 3%) actually hurt performance. So for the rest of this post, -march=... will be ignored. Sorry, Gentooers. ;)

There is no silver bullet with regard to the different implementations: A1, A2 and A3 are the faster implementations when not inlining constructors and using -Os or -O2 (the quickest A* is at least ~10% faster than the quickest B*/C*). However, when inlining constructors and using -O3, the same implementations are the slowest (by 2.4%).

Most common release builds are still done with -O2 these days. For those, using initializer lists (A1/A2/A3) seems to have a significant edge over the alternatives, whether constructors are inlined or not. This is in contrast to the conclusions drawn from “constructor counting”, which assumed these to be slow because of the additional calls needed.

The numbers printed in bold are either the quickest implementation in a build scenario or one that is within 1.5% of the quickest implementation. A1 and A2 are sharing the title here by being in that group five times each.
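The “within 1.5% of the quickest” rule can be recomputed from the two tables. The following is a minimal Python sketch, not the script used for the post; it assumes that the -march column is left out (as decided above) and that “within 1.5%” means at most 1.015 times the quickest time of a build scenario. The names `TIMES` and `count_top_group` are made up for this illustration.

```python
# Wall-clock times in seconds, copied from the two tables.
# Columns per implementation: -Os, -O2, -O3 (the -march column is left out).
TIMES = {
    "without inline constructors": {
        "A1": (89.1, 79.0, 78.9),
        "A2": (89.1, 78.1, 78.0),
        "A3": (90.0, 78.9, 78.8),
        "B1": (103.6, 97.8, 79.0),
        "B2": (99.4, 95.6, 78.5),
        "B3": (107.4, 90.9, 79.7),
        "C1": (99.4, 94.4, 78.0),
        "C3": (98.9, 100.7, 78.1),
    },
    "with inline constructors": {
        "A1": (85.6, 74.7, 74.6),
        "A2": (85.3, 74.6, 73.7),
        "A3": (91.6, 73.8, 74.4),
        "B1": (93.4, 90.2, 72.8),
        "B2": (93.7, 88.3, 72.0),
        "B3": (97.6, 88.3, 72.8),
        "C1": (93.4, 88.3, 72.0),
        "C3": (96.2, 88.3, 71.9),
    },
}


def count_top_group(times, tolerance=0.015):
    """Count, per implementation, in how many build scenarios it is the
    quickest or within `tolerance` of the quickest."""
    counts = {}
    for table in times.values():
        n_columns = len(next(iter(table.values())))
        for column in range(n_columns):
            best = min(row[column] for row in table.values())
            for impl, row in table.items():
                if row[column] <= best * (1 + tolerance):
                    counts[impl] = counts.get(impl, 0) + 1
    return counts


counts = count_top_group(TIMES)
for impl in sorted(counts, key=counts.get, reverse=True):
    print(impl, counts[impl])
```

Under these assumptions, A1 and A2 each land in the top group in five of the six remaining build scenarios, which matches the statement above.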

With constructors inlined, everything in the loop except DoSomething() could be inline. It seems to me that the compiler could, at least in theory, figure out that it is asked to do the same thing in all cases: reserve space for three ints on the heap, fill each of them with 4711, make the ::std::vector<int> data structure on the stack reflect that, and then hand that to the DoSomething() function that it knows nothing about. If the compiler figured that out, all implementations would take the same time. This happens neither with -O2 (quickest and slowest differ by ~18%) nor with -O3 (they differ by ~3.6%).
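As a worked check of those two percentages, here is a small Python sketch over the -O2 and -O3 columns of the table with inline constructors. It assumes the difference is taken relative to the slowest time, which is the reading that reproduces the quoted figures; `spread` is a name invented for this example.

```python
# -O2 and -O3 times (seconds) with inline constructors, in table order
# A1, A2, A3, B1, B2, B3, C1, C3.
INLINE_TIMES = {
    "-O2": [74.7, 74.6, 73.8, 90.2, 88.3, 88.3, 88.3, 88.3],
    "-O3": [74.6, 73.7, 74.4, 72.8, 72.0, 72.8, 72.0, 71.9],
}


def spread(times):
    """Difference between slowest and quickest run, relative to the slowest."""
    quickest, slowest = min(times), max(times)
    return (slowest - quickest) / slowest


for flags, times in INLINE_TIMES.items():
    print(f"{flags}: {spread(times) * 100:.1f}%")
```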

One common mantra in applications development is “trust the compiler to optimize”. The above observations show a few cracks in the foundations of that, esp. if you take into account that this is all on the same version of the same compiler running on the same platform and hardware with the same STL implementation. For huge objects with expensive constructors, the constructor counting approach might still be valid. Then again, those are rarely statically initialized as a bigger bunch into a vector. For the more common scenario of smaller objects with cheap constructors, my tentative conclusion so far would be to go with A1/A2/A3 — not so much because they are quickest in the most common build scenarios on my platform, but rather because the readability of them is a value on its own while the performance picture is muddy at best.

And hey, if you want to run the tests above on other platforms or compilers, I would be interested in results!

Note: I did the runs for each scenario only once, thus no standard deviation is given. In general, they seemed to be rather stable, but these being wall-clock measurements, one or the other might be an outlier. Caveat emptor.

So, I recently brought up the topic of Writer's nodes in the LibreOffice ESC call. More specifically: the SwNodeIndex class, which is, broadly simplified, an iterator over the container holding all the paragraphs of a text document. Before my modifications, the SwNodes container class kept all these SwNodeIndices in a homegrown intrusive doubly linked list, to ensure that they stay valid, e.g. if a SwNode gets deleted or removed. Still, as usual with performance topics, wild guesses aren't helpful, and measurements should trump intuition. I used valgrind for that and measured the number of instructions needed for loading the ODF spec. Since I did the same years and years ago in the old OpenOffice.org performance project, I first checked whether we had regressed against that. It's comforting that we did not at all. We were much faster, but that measurement has to be taken with a few pounds of salt, as a lot of other things differ between these two measurements (e.g. we now have a completely new build system, different compiler versions etc.). But it's good that we are moving in the right direction.

With that comforting knowledge, I started to play around with the code. The first thing I did was to replace the handcrafted intrusive list with a std::list pointing to the SwNodeIndex instances as a member of the SwNodes class. This is expected to slow things down, as now two allocations are needed: one for the SwNodeIndex class and one for the node entry in the std::list. To be honest though, I didn't expect this to slow down the code handling the nodes by a factor of ~57 for the loading of the example document. The whole document loading time (not just the node handling) slowed down by a factor of ~2.4. So OK, this establishes for certain that this part of the code is highly performance sensitive.

The next thing I tried, to get a feel for how the performance reacts, was using a std::vector in the SwNodes class. When reserving some memory early, this should severely reduce the number of allocations needed. And indeed this was quicker than the std::list, even with a naive approach just doing a push_back() for insertion and a std::find() followed by erase() for removal. However, the node indices are often created temporarily and quickly destroyed again, so adding new indices at the end and searching from the start certainly is not ideal. Thus this is also slower than the intrusive list that was on master, by a factor of ~25 for the code doing the node handling.

Searching for a SwNodeIndex from the end of the vector, where we likely just inserted it, and then swapping it with the last entry makes the std::vector almost competitive with the original implementation, but it is still 30% slower. (The total loading time would only have increased by 0.7% using the vector like this.)
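The “search from the end, swap with the last entry, then drop the last entry” idea can be sketched in a few lines. This is a Python illustration of the algorithm only, not the C++ code that was measured; `IndexRegistry` and its methods are hypothetical names.

```python
class IndexRegistry:
    """Keeps track of live index objects in a flat list (vector-like)."""

    def __init__(self):
        self._entries = []

    def register(self, index):
        # New indices always go to the end, like push_back().
        self._entries.append(index)

    def unregister(self, index):
        entries = self._entries
        # Walk backwards: short-lived indices were registered recently,
        # so they are most likely found close to the end.
        for pos in range(len(entries) - 1, -1, -1):
            if entries[pos] is index:
                # Overwrite the found slot with the last entry and drop the
                # last slot. No elements need to be shifted, but the order
                # of the remaining entries is not preserved.
                entries[pos] = entries[-1]
                entries.pop()
                return
        raise ValueError("index was not registered")

    def __len__(self):
        return len(self._entries)


registry = IndexRegistry()
first, second, third = object(), object(), object()
for index in (first, second, third):
    registry.register(index)
registry.unregister(third)   # found immediately at the end
registry.unregister(first)   # found last; `second` moves into its slot
print(len(registry))
```

The trade-off is that the container no longer keeps insertion order, which is fine as long as nothing relies on it.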

For completeness, I also had a look at a std::unordered_map. It did a bit better than I expected, but still would have slowed down loading by 15% for the example experiment.

Having ruled out that standard containers would do much good here without lots of tweaking, I tried the sw::Ring<> class that I recently rewrote based on Boost.Intrusive as an inline header class. This was 11% quicker than the old implementation, resulting in 2.6% quicker loading for the whole document. Not exactly a heroic achievement, but also not too bad for just some 200 lines touched. So this is now on master.

Why does this linked list outperform the old linked list? Inlining. The old one had non-inlined constructors, and its destructor called a trivial non-inlined member function. On top of that, the constructors and the function called by the destructor called two non-inlined friend functions from a different compilation unit, making it extra hard for a compiler to optimize that. Now, link time optimization (LTO) could maybe do something about that someday. However, with LTO being in different states on different platforms, and with developers possibly building without LTO for build time performance for some time, requiring the compiler/linker to be extra clever might be a mixed blessing: the developers might run into “the map is not the territory” problems.
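For readers unfamiliar with intrusive lists: the links live inside the objects themselves, so adding or removing an object needs neither a separate allocation nor a search. The following Python sketch only illustrates that idea; it is not the sw::Ring<> or Boost.Intrusive API, and `RingNode` is a made-up name.

```python
class RingNode:
    """An object that carries its own prev/next links (intrusive ring)."""

    def __init__(self):
        # A fresh node forms a ring consisting only of itself.
        self.prev = self
        self.next = self

    def insert_after(self, other):
        """Link this node into the ring of `other`, right after it."""
        self.unlink()
        self.prev = other
        self.next = other.next
        other.next.prev = self
        other.next = self

    def unlink(self):
        """Remove this node from its ring in constant time."""
        self.prev.next = self.next
        self.next.prev = self.prev
        self.prev = self
        self.next = self

    def ring(self):
        """Yield every node of the ring once, starting with this node."""
        node = self
        while True:
            yield node
            node = node.next
            if node is self:
                break


head, a, b = RingNode(), RingNode(), RingNode()
a.insert_after(head)
b.insert_after(a)
a.unlink()  # no lookup needed: `a` knows its own neighbours
print(len(list(head.ring())))
```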

My personal takeaways:

The SwNodeIndex has quite a relevant impact on performance. If you touch it, handle with care (and with valgrind).

Intrusive linked lists might be cumbersome, but for some scenarios, they are really fast.

Inlining can really help (doh).

LTO might help someday (or not).

friend declarations for non-inline functions across compilation units can be a code smell for possible performance optimization.

Please excuse the extensive writing for a meager 2.6% performance improvement. The intention is to spare somebody (including me) from redoing some or all of the work above just to come to the same conclusion.

So, like many others, I have been to the LibreOffice Hackfest in Toulouse which, unlike many of our other Hackfests, was part of a bigger event: Capitole du Libre. As we had our own area and were not 30+ hackers, this also had the advantage that we got to work more quickly. And while I still had some boring administrative work to do, this is a Hackfest where I actually got to do some coding. I looked for some bookmark-related bugs in Writer, but the first bugs I looked at were just too well suited to be Easy Hacks: fdo#51741 (“Deleting bookmark is not seen as modification of document”) and fdo#56116 (“Names of bookmarks should allow all characters which are valid in HTML anchor names (missing: ‘:’ and ‘.’)”). Both were made Easy Hacks and both are fixed on master now. I then fixed fdo#85542 (“DOCX import of overlapping bookmarks”), which proved slightly more work than expected, and provided a unit test so that it never comes back. I later learned that the second part was entirely non-optional, as Markus promised he would not have let me leave Toulouse without writing a unit test for committed code. I have to admit that that is a supportable position.

Toulouse Hackfest Room

Scenes like the above were actually rather rare, as we were mostly working at our notebooks. One thing I came up with at the Hackfest, but didn't finish there, was some clang plugins for finding cascading conditional ops and conditional ops that have assignments as a side effect in their midst. While I found nothing as mind-boggling in sw (Writer) as the tweet that inspired these plugins, I found some impressive expressions that certainly wouldn't be a joy to step through in gdb (or even better: set a breakpoint in) when debugging, and fixed those. We probably could make a few Easy Hacks out of what these (or similar) plugins find outside of sw/ (I only looked there for now). Those are reasonably easy to refactor, but you don't want to do that in the middle of a debugging session. While at it, I also looked at clang's “value assigned, but never read” hints. Most were harmless, but also trivial to get rid of. On the other hand, some of those pointed to real logic errors that are otherwise hard to see. Like this one, which has been hiding in plain sight, if git is to be believed, ever since OpenOffice.org was originally open sourced in 2000. All in all, this experience is encouraging. Now that our Coverity defect density is just a rounding error above zero, getting more fancy clang plugins might be promising.

Being involved in a project that is heavily driven by donations, I keep reminding myself of the importance of putting my money where my mouth is.

Some of these donations were triggered by recent events and initiatives in these projects. GNOME’s outreach program for women, for example. Or OpenBSD's bold initiative in starting LibreSSL, which is doing what needed to be done and vitalizing an overlooked area of open source development. Watching them explain the status quo and how they are attacking it reminds me of LibreOffice, beyond the name. Plus, I don't want to be compared with a My Little Pony character again.

goals of LibreSSL — they remind me of something

Others are already working examples of the long tail, crowdfunding and the meshed society (Wikipedia), or are trailblazing to be one (Krautreporter), beyond the world of software. The latter might also have been influenced by one of the last wishes of a man who unexpectedly died way too early. May he rest in peace.

And the sons of pullman porters and the sons of engineers
Ride their father’s magic carpets made of steel
Mothers with their babes asleep are rockin’ to the gentle beat
And the rhythm of the rails is all they feel

So, LibreOffice does its releases on a train release schedule, and since we recently modified the schedule a bit (by putting out the alpha1 release earlier), I took the opportunity to take a closer look and explain a bit of what we are doing. With this, every six months of LibreOffice development currently look roughly like this:

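| week after x.y.0 | development | release candidates (fresh) | release candidates (stable) | finalized releases (fresh) | finalized releases (stable) |
|---|---|---|---|---|---|
| 0 | | | | x.y.0 | |
| 1 | | x.y.1~rc1 | x.(y-1).4~rc1 | | |
| 2 | | | | | |
| 3 | | x.y.1~rc2 | x.(y-1).4~rc2 | | |
| 4 | | | | x.y.1 | x.(y-1).4 |
| 5 | | | | | |
| 6 | | x.y.2~rc1 | | | |
| 7 | | | | | |
| 8 | | x.y.2~rc2 | x.(y-1).5~rc1 | | |
| 9 | | | | x.y.2 | |
| 10 | | | x.(y-1).5~rc2 | | |
| 11 | | x.y.3~rc1 | | | x.(y-1).5 |
| 12 | | | | | |
| 13 | x.(y+1)~alpha1 | x.y.3~rc2 | | | |
| 14 | | | | x.y.3 | |
| 15 | | | | | |
| 16 | | | | | |
| 17 | | | | | |
| 18 | x.(y+1)~beta1 | | | | |
| 19 | | | | | |
| 20 | x.(y+1)~beta2 | | x.(y-1).6~rc1 | | |
| 21 | | | | | |
| 22 | x.(y+1)~rc1 | | x.(y-1).6~rc2 | | |
| 23 | | | | | x.(y-1).6 |
| 24 | x.(y+1)~rc2 | | | | |
| 25 | x.(y+1)~rc3 | | | | |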

The last two columns are the most visible to most visitors of the LibreOffice website. Those are the versions found on the LibreOffice Fresh and LibreOffice Stable download pages. We are roughly at week 18 after the 4.2.0 release now, and the versions available are 4.2.4 fresh and 4.1.6 stable. A careful reader will note that according to that schedule we should be at 4.2.3 and 4.1.5. That is true, but the 4.2 series had an extra 4.2.1 intermediate release to adjust the schedule of 4.2 in the direction of the current plan. This is not expected for future releases (also note that there is always some flexibility in the plan to allow for holidays etc.).

If you count all the prereleases, release candidates and releases, you will find that we do 25 of those in 26 weeks. Besides the fact that this is a lot of work for release engineers, one might wonder if anyone can keep up with that, and if so, how? The answer to that depends on how you are using LibreOffice.
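The count of 25 can be checked by writing the schedule down as data. This is a minimal Python sketch; the tuple layout and the track names ("development", "rc fresh" and so on) are invented for this illustration and simply mirror the table columns.

```python
from collections import Counter

# (week after x.y.0, table column, build) for every entry in the schedule.
SCHEDULE = [
    (0, "final fresh", "x.y.0"),
    (1, "rc fresh", "x.y.1~rc1"),
    (1, "rc stable", "x.(y-1).4~rc1"),
    (3, "rc fresh", "x.y.1~rc2"),
    (3, "rc stable", "x.(y-1).4~rc2"),
    (4, "final fresh", "x.y.1"),
    (4, "final stable", "x.(y-1).4"),
    (6, "rc fresh", "x.y.2~rc1"),
    (8, "rc fresh", "x.y.2~rc2"),
    (8, "rc stable", "x.(y-1).5~rc1"),
    (9, "final fresh", "x.y.2"),
    (10, "rc stable", "x.(y-1).5~rc2"),
    (11, "rc fresh", "x.y.3~rc1"),
    (11, "final stable", "x.(y-1).5"),
    (13, "development", "x.(y+1)~alpha1"),
    (13, "rc fresh", "x.y.3~rc2"),
    (14, "final fresh", "x.y.3"),
    (18, "development", "x.(y+1)~beta1"),
    (20, "development", "x.(y+1)~beta2"),
    (20, "rc stable", "x.(y-1).6~rc1"),
    (22, "development", "x.(y+1)~rc1"),
    (22, "rc stable", "x.(y-1).6~rc2"),
    (23, "final stable", "x.(y-1).6"),
    (24, "development", "x.(y+1)~rc2"),
    (25, "development", "x.(y+1)~rc3"),
]

per_track = Counter(track for _week, track, _build in SCHEDULE)
print(len(SCHEDULE), "builds in total")
for track, number in sorted(per_track.items()):
    print(f"{track}: {number}")
```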

self deployment on LibreOffice fresh

If you are a user or a small business installing LibreOffice yourself, you will probably run LibreOffice fresh, and the table above simplifies for you as follows:

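| week after x.y.0 | development | release candidates | finalized releases |
|---|---|---|---|
| 0 | | | x.y.0 |
| 1 | | x.y.1~rc1 | |
| 4 | | | x.y.1 |
| 6 | | x.y.2~rc1 | |
| 9 | | | x.y.2 |
| 11 | | x.y.3~rc1 | |
| 14 | | | x.y.3 |
| 18 | x.(y+1)~beta1 | | |
| 20 | x.(y+1)~beta2 | | |
| 22 | x.(y+1)~rc1 | | |
| 24 | x.(y+1)~rc2 | | |
| 25 | x.(y+1)~rc3 | | |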

The last column shows the releases you are running. If you are a member of the LibreOffice community, it would be very helpful if you also spent some time in this six-month period on three actions:

running at least one of the release candidates in the table (available for download here) before the final is released.

running at least one of the beta releases in the table. Note that there will be a bug hunting session on the 4.3.0 beta release this week, which will help you get started.

running a nightly build once anywhere in the weeks 1-18. Note that if you are getting excited about seeing the latest and greatest builds while they are still steaming, there are tools that can help you with this on Linux and Windows.

If you do each of these three things once in the timeframe of six months and report any issues you find, you are already helping LibreOffice a lot, and you are making sure that the finalized releases of the fresh series not only contain all the latest features, but are also free of severe regressions.

bigger deployments on LibreOffice stable

If you are not installing LibreOffice yourself, but instead have a major deployment administered centrally, things are a bit different. You might be more conservative and interested in the releases from LibreOffice stable. And you probably have professional support from a certified developer or a company employing certified developers.

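| week after x.y.0 | development | release candidates | finalized releases |
|---|---|---|---|
| 1 | | x.(y-1).4~rc1 | |
| 4 | | | x.(y-1).4 |
| 8 | | x.(y-1).5~rc1 | |
| 11 | | | x.(y-1).5 |
| 13 | x.(y+1)~alpha1 | | |
| 18 | x.(y+1)~beta1 | | |
| 20 | | x.(y-1).6~rc1 | |
| 23 | | | x.(y-1).6 |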

If you intend to deploy one series of LibreOffice (e.g. 4.3), two things are highly recommended:

make the alpha or beta releases available early to interested volunteers in your deployment. They might find bugs or regressions that are specific to your use of the software.

make the release candidates of versions that you intend to deploy available early to your users.

Of these two actions, the first is by far the more important: it identifies issues early on in the life cycle and gives both your support provider and the LibreOffice developer community at large time to resolve the issue. In fact, I would argue that if you have a major deployment, the only excuse for not making prereleases available is that you made nightly builds available.

Ubuntu

So, Ubuntu qualifies as a “bigger deployment” and I have to take care of LibreOffice on it. Also people want to be able to run the latest and greatest LibreOffice releases from the LibreOffice fresh series. Do I follow my recommendations here? Yes, mostly I do:

prereleases are made available as bibisect repositories rather quickly (built on Ubuntu 12.04 LTS). In addition, fully packaged versions of LibreOffice are built in the prereleases PPA, starting as early as beta1.

So, you are invited to run or test builds from these PPAs — or download the bibisect repositories — to keep LibreOffice releases coming in the steady and stable fashion they do. Finally, there is a bug hunting session for LibreOffice this week and as said above, no matter if you are running a huge deployment or installing on your own, you are helping LibreOffice — and yourself, as a user of LibreOffice — a lot by testing the prereleases:

This needs some background first: compared to LibreOffice 4.1, LibreOffice 4.2 slightly modified the UNO API for popping up a message box. This was properly announced in our LibreOffice 4.2 release notes many moons ago:

The following UNO interfaces and services were changed […] com.sun.star.awt.XMessageBox, com.sun.star.awt.XMessageBoxFactory

Luckily, LibreOffice extensions can specify a minimal version, so extensions using the new MessageBox-API can explicitly request a version of LibreOffice 4.2 or newer. This change in our sdk-examples shows how an extension can be updated to use the new API and explicitly require a version of LibreOffice 4.2 and higher. All this happened already with LibreOffice 4.2.0 being released and has nothing yet to do with the change in LibreOffice 4.2.4.

So what was changed in LibreOffice 4.2.4? Well, in addition to the LibreOffice version, old extensions sometimes just ask for an “OpenOffice.org version”. Most LibreOffice versions answered that this version was “3.4”, so this old backwards-compatible check was not very helpful anyway. So in LibreOffice 4.2.4 this value was changed to “4.1”, which might make some old extensions aware of the incompatible API change. That’s all.

Note that:

Most extensions using the MessageBox API have already been changed at 4.2.0 (or have been fixed by Linux distros)
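To illustrate why the reported value matters, here is a minimal Python sketch of a dotted-version range check. The helper names, and the idea that an extension declares a minimum and optionally a maximum version, are assumptions made for this example; this is not LibreOffice's actual dependency-checking code.

```python
def parse_version(text):
    """Turn a dotted version such as "4.1" into a comparable tuple (4, 1)."""
    return tuple(int(part) for part in text.split("."))


def within_range(reported, minimum=None, maximum=None):
    """Check a reported version against optional lower and upper bounds."""
    version = parse_version(reported)
    if minimum is not None and version < parse_version(minimum):
        return False
    if maximum is not None and version > parse_version(maximum):
        return False
    return True


# A hypothetical old extension that only accepts versions up to 4.0:
print(within_range("3.4", maximum="4.0"))  # the previously reported value
print(within_range("4.1", maximum="4.0"))  # the value reported since 4.2.4
```

Comparing tuples of integers rather than strings avoids the classic mistake of sorting "4.10" before "4.2".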

So, the LibreOffice Las Palmas Hackfest 2014 is over and it was awesome. I have to thank Alberto Ruiz and the University of Las Palmas de Gran Canaria for their excellent hosting and support. We had the opportunity to present the LibreOffice project to the students of the university, and we did so with a set of short talks to cover a lot of ground without too many boring details. Here is the handout of my slides:

You can find a video of all the talks in the session on YouTube. My talk starts around minute 35 and is followed by Kendy's nice intro on improving the LibreOffice UI. In addition to the video, I also took a few pictures at the event; you can find them in this album.

Hacking

The achievements section of this Hackfest is still being populated, but despite this being a smaller Hackfest, quite some productive work seems to have been done in total. It was also very encouraging to see curious students from the university drop by. We tried to give them a gentle introduction to ways to contribute and learn more.

So in a few minutes, I will be leaving for the Code for Germany meeting at the Open Knowledge Lab in Hamburg, but I don't want to show up empty-handed. Earlier I learned about BundesGit, which is a project to put all German federal laws into a git repository in easily parsable Markdown. This project was featured prominently, e.g. on Wired and Heise, and got me thinking that having all those laws available at your fingertips would be quite useful for lawyers. So I went ahead and quickly wrote an extension to do just that. When you install the extension:

it downloads all the German federal laws from GitHub and indexes them on the next restart of LibreOffice (completely in the background without annoying the user)

that takes about 5 minutes (and on later starts it only checks for updates, so there is no redownload)

once indexed you can insert a part of a law easily in any text in Writer using the common abbreviations that lawyers use for these:

Type the abbreviation of the paragraph on an otherwise empty line, e.g. “gg 1” for the first Artikel of the Grundgesetz

press Ctrl-Shift-G (G for Git, Gesetz or whatever you intend it to mean)

LibreOffice will replace the abbreviation with the part of that law
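How such an abbreviation might be split into a law and a section can be sketched as follows. This is not the code of the extension itself: the regular expression, the accepted character set and the function name are assumptions for illustration only.

```python
import re

# Assumed shape of an abbreviation: a law name made of letters, whitespace,
# then a section number with an optional letter suffix (e.g. "gg 1").
ABBREVIATION = re.compile(r"^\s*([A-Za-zÄÖÜäöüß]+)\s+(\d+[a-z]?)\s*$")


def parse_abbreviation(line):
    """Return (law, section) for a line like "gg 1", or None if the line
    does not look like an abbreviation."""
    match = ABBREVIATION.match(line)
    if match is None:
        return None
    law, section = match.groups()
    return law.lower(), section


print(parse_abbreviation("gg 1"))
print(parse_abbreviation("just some ordinary text"))
```

Returning None for anything that does not match keeps the shortcut harmless on ordinary lines of text.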

BundesGit for LibreOffice

Now this is still a proof-of-concept:

It requires a recent version (1.9 or higher) of git in the path. While that is true, for example, in the upcoming Ubuntu 14.04 LTS, other distributions might still have older versions of git, or, on Windows, none at all. Packing a git binary into the extension is left as an exercise for the reader.

I have not checked it to parse all the different laws and find all the paragraphs. It also ignores some non-text content in the repository for now. Patches welcome!

While it stays in the background most of the time intentionally to not get into the way of the user, it could use some error reporting or logging, so users are not left in the dark if it fails to work.

On the other hand, the extension is a good example of what you can do with less than 300 lines of Python3 (including tests) in LibreOffice extensions. So the code has been commented, hopefully verbosely enough, and uploaded to the sdk-examples repository, where it lives alongside the “LibreOffice does print on Tuesdays” extension that also serves as an example. Of course, if there are other useful repositories of texts online, it can quickly be adapted to provide those too.

So, somewhere between the LibreOffice 4.2.0 and the 4.1.5 release, bugs.freedesktop.org broke through 25,000 reported bugs. A time to throw up our hands in despair? Not at all, as the following chart shows:

LibreOffice bug states on freedesktop

7% of reports are still unconfirmed or need more information

22% are confirmed and unresolved issues that are not enhancement requests

6.5% are unresolved enhancement requests.

On the other hand:

33% of all reports have been fixed in some way

and 30% are invalid or duplicates.

It's interesting to see that about a quarter of the confirmed unresolved reports are now asking for new features and enhancements. It gets even more encouraging if you take into account that the number of bug reports is at a long-term constant 20-25 reports per day, while over 40% of the bugs intentionally or collaterally fixed changed their state in the last 12 months. So we are picking up speed in triaging and fixing bugs, while the influx of new reports stays constant.
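The “quarter” can be recomputed from the percentages listed above. A small Python sketch, assuming that the unresolved enhancement requests are counted among the confirmed unresolved reports; the dictionary keys are made up for this example.

```python
# Shares of all reports, in percent, as listed above.
SHARES = {
    "unconfirmed or needing information": 7.0,
    "confirmed unresolved, not enhancements": 22.0,
    "unresolved enhancement requests": 6.5,
    "fixed in some way": 33.0,
    "invalid or duplicate": 30.0,
}

confirmed_unresolved = (
    SHARES["confirmed unresolved, not enhancements"]
    + SHARES["unresolved enhancement requests"]
)
enhancement_ratio = SHARES["unresolved enhancement requests"] / confirmed_unresolved

print(f"enhancement requests among confirmed unresolved reports: {enhancement_ratio:.1%}")
print(f"sum of the listed shares: {sum(SHARES.values())}%")
```

The listed shares are rounded, so they do not add up to exactly 100%.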

If you are interested, please help QA in all this by writing good bug reports, identifying duplicates, confirming new reports, bibisecting regressions, running and testing daily builds and prereleases, or otherwise helping with the QA Easy Hacks!

and some errata for it: On slide 13 it says “the same file is also hardlinked from workdir/”. That has not been true for quite a while already. LibreOffice now keeps around exactly one copy of a library, unlike the confusing three copies that we had in LibreOffice 3.3. This should be a lot less confusing to the curious first-time contributor.

If there is just one number to take away from all these slides, it's that a no-op rebuild of LibreOffice on a three-year-old developer notebook with the distro-provided GNU make 3.81 takes just 17 seconds(*). And slide 7 still shows some possibilities to speed things up beyond that. While at current speeds it might not be worth it on Linux, it might be worthwhile for e.g. Windows, which is traditionally rather slow when it comes to file I/O.

On a related note, over time we have improved the way new contributors can submit their changes on our instance of gerrit in many ways. Thanks a lot to David, Norbert and Robert for the work on this. One only has to look at one of the daily digests generated from activity on gerrit and imagine that we would still get one mail for each change, update and merge on the mailing list for manual patch tracking, as we did in the early days. Thanks a lot also to Mathias Michel for his work on the script!

So if you haven’t done that yet, consider grabbing an Easy Hack and get started!

So LibreOffice 4.2.0 release candidate 3 was tagged yesterday evening. A good time to look back at the cycle and look at some numbers. The number of issues fixed in the 4.2 series is in line with our historic trends:

There is no page for the third release candidate yet, but I assume it to be no exception. Fixing issues is mainly done by development, although QA does the preparation for that by triaging a bug well. But QA also does quite a bit of work before a bug is triaged, and this is not directly locked to changes in code. So I had a look at the numbers simply in the timeframe between the tagging of 4.1.0 rc3 (2013-07-17) and 4.2.0 rc3 (yesterday). In this timeframe, QA did:

confirm 3114 bugs (change of ever_confirmed).

resolve 3393 bugs (change of resolution and not unresolved now, this includes the bugs fixed by development).

Naturally, these cannot simply be added up: for example, a bug can be confirmed and then be resolved by fixing it. If all of that happens in the timeframe (as it likely will for a relevant bug), it will appear in all the above counts. Meanwhile, in this timeframe 4092 bugs have been filed by end users. Of those new bugs filed, 9.3% were enhancement requests. Since not all resolved bugs need to be confirmed (e.g. invalid bugs), these numbers add up nicely.

Speaking of quality, another thing to look at is regressions. How many of those will be fixed in 4.2 as of now? Here is the rundown:

1 regression introduced in 3.4 or before

2 regressions introduced in 3.5 or before

3 regressions introduced in 3.6 or before

2 regressions introduced in 4.0 or before

8 regressions introduced in 4.1 or before

51 regressions introduced on master or found in betas and release candidates

As you can see, most of the regressions fixed with this have actually never been released. This should be encouraging news to those testing daily builds: if you do that, you will be rewarded with quick bug fixes. Still, only fixing 16 regressions that were visible in previous releases seems a rather low count for a release. Well, this is because this count does not include fixed regressions that were also backported to the updates on the 4.1 stable series. As regressions are usually worth that effort, this is usually done unless it is too risky a change for that. If you look for regressions that were fixed in 4.2 and also backported to 4.1, you as of now get a count of:

230 regressions fixed in 4.2 that were also backported to the 4.1 series

in addition. See this earlier post for more details on how the backporting works and some numbers on it.
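For those who like to check the arithmetic, here is a small Python sketch that tallies the counts quoted above. The numbers are copied from this post; the variable names are only illustrative.

```python
# Regression fixes per "introduced in" bucket, copied from the rundown above.
fixed_in_4_2 = {
    "3.4 or before": 1,
    "3.5 or before": 2,
    "3.6 or before": 3,
    "4.0 or before": 2,
    "4.1 or before": 8,
    "master, betas and release candidates": 51,
}
# Fixed in 4.2 and also backported to the 4.1 series (counted separately).
backported_to_4_1 = 230

unreleased_key = "master, betas and release candidates"
released = sum(n for key, n in fixed_in_4_2.items() if key != unreleased_key)
unreleased = fixed_in_4_2[unreleased_key]
total_4_2_only = released + unreleased
share_unreleased = unreleased / total_4_2_only

print(released)                            # 16
print(total_4_2_only)                      # 67
print(f"{share_unreleased:.0%}")           # 76%
print(total_4_2_only + backported_to_4_1)  # 297
```

So roughly three out of four regressions fixed only in 4.2 never reached a release.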

Speaking of regressions, we have a pretty unique tool to corner them: bibisect. How well does this work? I have been tracking these in Bugzilla for the last few months. Currently 176 bugs have been bibisected, with the number of unresolved bibisected bugs staying constant in the 60-70 range. That is encouraging, as it means that for each regression bibisected, a developer fixes a bibisected regression. This currently happens at a rate of ~2 bugs per week, which is not too bad, as such regressions might be quite hard corner cases that would be tricky to pin down without bibisect. However, only ~14% of our unresolved regressions are bibisected as of now. Clearly, we can improve that ratio with more bibisecting and get more regressions fixed even quicker.
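The idea behind bisecting is plain binary search over an ordered series of builds. The following Python sketch only illustrates that idea and is not how the bibisect repositories are actually driven; the build names and the `is_bad` check are hypothetical stand-ins for starting a build and trying to reproduce the bug.

```python
from typing import Callable, Sequence


def first_bad_build(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first build that shows the bug.

    Assumes the builds are ordered oldest to newest, the oldest build is
    good, the newest build is bad, and the bug stays once it has appeared.
    """
    lo, hi = 0, len(builds) - 1  # builds[lo] is known good, builds[hi] known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # the bug is already present here, look earlier
        else:
            lo = mid  # still fine here, look later
    return builds[hi]


# Hypothetical example: 64 builds, the regression first appears in build 41.
builds = [f"build-{i:02d}" for i in range(64)]
tested = []


def is_bad(build: str) -> bool:
    tested.append(build)  # record every build that had to be tried
    return int(build.split("-")[1]) >= 41


culprit = first_bad_build(builds, is_bad)
print(culprit, len(tested))  # build-41 6
```

Six tries instead of up to 62 is what makes cornering a regression cheap enough to do for many bugs.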

Ok, admittedly, this was a boring and dry post on bug numbers. What can I do to lighten things up? Here is catcontent, presented in LibreOffice Draw 4.2 running on Ubuntu trusty with the awesome new libreoffice-style-sifr icon theme:

So these days, most people prefer to use an IDE to navigate their source code. This has often been greeted with some defensive elitism of the “real programmers” kind since the early days of the open sourcing of StarOffice. One does not simply load a code base the size of LibreOffice in your wimpy IDE: while it is possible somehow in the end, it's a lot more trouble than it's worth to manually set up e.g. all the include paths to get the fancy stuff like autocompletion. Add to that that e.g. UNO headers are generated during the build, and that headers were distributed over multiple IDE-unfriendly locations, with many headers even available as copies from multiple locations, before we fixed that.

All these things are fixed now. And while LibreOffice still is a huge beast, with our new build system we can get a holistic view of what needs to get built where, how and when. This makes it easy, almost trivial, to generate an IDE project file from the build system. And to prove this point, I did just that for the KDevelop IDE. This isn't limited in principle to this one IDE: in fact, the KDevelop-specific part of this is some 150 lines of Python. So no matter what IDE you use (Eclipse, NetBeans, Anjuta, Visual Studio, Code::Blocks or Xcode), you should be able to adapt this. In fact, while writing this, I found that there is already work going on for Xcode. Feel invited to join the party and make LibreOffice trivially buildable in your favourite IDE!
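To give a rough feeling for why this is little code once the build system knows everything, here is a minimal Python sketch. It is not the actual generator: the `modules` mapping stands in for whatever the build system exports, and the output file name `include_paths.txt` is a placeholder for whatever format your IDE reads.

```python
import os
import tempfile


def write_include_files(modules, out_root, filename="include_paths.txt"):
    """Write one include-path file per module, one path per line."""
    written = []
    for module, include_dirs in sorted(modules.items()):
        module_dir = os.path.join(out_root, module)
        os.makedirs(module_dir, exist_ok=True)
        target = os.path.join(module_dir, filename)
        # Deduplicate while keeping the order the build system reported.
        unique_dirs = list(dict.fromkeys(include_dirs))
        with open(target, "w", encoding="utf-8") as handle:
            handle.write("\n".join(unique_dirs) + "\n")
        written.append(target)
    return written


# Hypothetical build-system export: module name -> include directories.
modules = {
    "sw": ["/src/include", "/src/sw/inc", "/src/include"],
    "sc": ["/src/include", "/src/sc/inc"],
}
out_root = tempfile.mkdtemp()
paths = write_include_files(modules, out_root)
```

Adapting this to another IDE would mostly mean swapping the part that writes the file, while the information coming from the build system stays the same.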

Don't believe it? Here is a video featuring a stuttering German guy (me) on the audio track showing this:

If you want to show this around on social media, there is also a shorter version featuring the essentials (make sure to link to the HD versions).

A closing note: for a long time, common IDEs embraced and extended into the build systems, so once you used an IDE, you could only use this one IDE and no other. In retrospect, this is obviously doing it wrong. With the current approach, we can make LibreOffice easily buildable in any IDE on any platform, a very important fact for a product available on so many platforms.

I have to admit that I arrived at this event with some travel fatigue and some upcoming Ubuflu, so I was not too productive myself, but it's good to see fixes happening (or at least being prepared) at the event, for example in the KDE integration (Jan-Marek), in Calc (Eilidh), for enabling bitcoin donations (Florian), to mail merge (again Jan-Marek), to Math (Marcos) and for the build system (Michael and David). A big “Thank You” to all the angels of the Chaos Computer Club Freiburg that organized the event. When I learned that I would need to travel to the US right before this, I had some doubts whether it would result in “remote-organization-troubles”, given this was a first time in Freiburg. These doubts were completely unfounded: the support of our hosts was amazing and they seemed to have made a deal with Eris to take revenge for the original snub somewhere else on this weekend. ;)

So, given that I did not do much coding (just some preparation for the KDevelop integration for LibreOffice, more on that later), what can I offer you? Catcontent was not available (no cats at this Hackfest), so I give you the second best thing: the deputy chairman of the board of the Document Foundation patrolling the premises on a skateboard:

So, what's next? FOSDEM! We will of course be there again, and back-to-back with the event we will have a user experience Hackfest in Brussels. So come and join us:

So, LibreOffice 4.2.0 alpha1 has been tagged upstream a week ago. It is an alpha release, essentially only a tagged snapshot of the LibreOffice master branch and as such might eat your kitten and kill unsuspecting relatives. On the other hand, if you absolutely are of the type that Metallica roars about in the above quote and therefore you are running the development release of Ubuntu (trusty tahr, which will become Ubuntu 14.04 LTS), you can add the LibreOffice prereleases PPA and try it out and report bugs. Of course, you should not use this in a production environment of any kind!

I'm happy to see that this build is available a week earlier than last year, as early testing allows more bugs to be triaged and fixed in time. The more important difference though is that last year, the alpha version was built on the stable and released version of Ubuntu, while this year the version is already built against the early and moving development version of Ubuntu.

LibreOffice 4.2.0 alpha1 and a hint of the new startcenter on Ubuntu Trusty Tahr

Almost, as e.g. while I was able and eager to send the slides for the lightning talks I moderated, I somehow forgot to do so for my own slides for my talk on tb3. They will hopefully end up on the conference site at some point, but for now I uploaded them to Speakerdeck (with the odp originals on the wiki and here):

I didn't bring my own camera and thus missed taking pictures during e.g. the lively QA roundtable, but Rob made sure that we got at least some photo on the last day (when many were already on the way home):

So in the next days, I will be hopping over the Atlantic for a visit to the west coast, just to return and turn “eastbound and down, loaded up and trucking” to be at Freiburg for the Hackfest again. A big Thank You in advance to Tauon and Florian Effenberger, who took over a lot of my organizer duties on this one due to this tight scheduling. Oh, and of course, I hope to see many of you there!

So, Ubuntu 13.10 (Saucy Salamander) was released into the wild and comes with a fresh LibreOffice version: 4.1.2. Since the last major version of LibreOffice (4.0) was branched off, 11,034 commits by more than 200 different committers were done upstream up to the release that is now in Ubuntu 13.10. (*) The LibreOffice 4.1 features and fixes page gives an overview of what is new with this release: rotating images, embedded fonts and improved interoperability, to name a few.
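If you want to reproduce numbers like these on a checkout of your own, a minimal Python sketch could look like the one below. It is not the script used for this post; the repository path and the two refs in the usage comment are placeholders you would need to fill in yourself. It counts author email addresses (`%ae`), which is only an approximation of "committers".

```python
import subprocess


def git_lines(repo, *args):
    """Run a git command in the given repository and return its output lines."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


def count_commits(repo, old, new):
    """Number of commits reachable from `new` but not from `old`."""
    return int(git_lines(repo, "rev-list", "--count", f"{old}..{new}")[0])


def distinct_authors(emails):
    """Count distinct addresses, ignoring case and surrounding whitespace.

    One person using several addresses is still counted several times.
    """
    return len({email.strip().lower() for email in emails if email.strip()})


def count_authors(repo, old, new):
    """Number of distinct author email addresses between the two refs."""
    return distinct_authors(git_lines(repo, "log", "--format=%ae", f"{old}..{new}"))


# Usage with placeholder values:
#   count_commits("/path/to/core", "OLD_REF", "NEW_REF")
#   count_authors("/path/to/core", "OLD_REF", "NEW_REF")
```

Lowercasing the addresses only catches the most trivial duplicates, which is why the contributor count here is given as "more than 200" rather than as an exact figure.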

In the Ubuntu/Debian packaging repository, some 513 commits by 5 authors have been done between the version Ubuntu 13.04 was released with and the just released version. The majority of those commits have been done by Rene Engelhard of Debian. A big “Thank you” for all that work! Now leaving this release behind with a “Girl, I'm leaving you tomorrow” on my mind, I am looking forward to what the name for the Ubuntu t-series will be, as there does not seem to be an announcement yet (although there have been eager suggestions), starting to brace myself for the early cycle madness again and preparing to make sure that the Ubuntu t-series will get the best LibreOffice 4.2.

So much for looking backwards. A lot of people are shy and assume they could never be one of the contributors making a dent in LibreOffice, or even get started. Let me show you how wrong that assumption is:

resolved Easy Hacks over time

This little chart shows Easy Hacks resolved by newcomers to the project. Easy Hacks are tasks that need to be done on LibreOffice and can be done without understanding all of the million lines of code and more than 20 years of history; quite a few do not even require C++ skills. They are specifically selected for that, and if you run into any trouble solving those, you can jump in at #libreoffice-dev to get help. So get yourself a LibreOffice build (here’s a video on how easy that is on Ubuntu, with dubstep soundtrack), find yourself an Easy Hack and get going!

(*) I didn't bother to check for the exact number, because checking for duplicates in email addresses is tiresome.

Note: An earlier version of this post talked about 22,000 commits. That was an error on my part, fiddling with the scripting late at night.