Project description

Improvement of the visual equation editor for OpenOffice, by separating the visual cursor from the text cursor and handling keyboard input in the visual editor. This gives the user an equation editor similar to MathType, LyX, KFormula, MathCad, etc.

Mentors: Eric Bachard and Fridrich Strba.

Project plan

A brief overview of the steps I intend to take in order to complete this project; more tasks will likely appear as the project progresses.

Bonding period

Analyze nodes and how they are composed (write notes)

Determine new representation of cursor position

Figure out how to draw the cursor

Before mid-term

Make SmGraphicWindow grab focus and accept input

Disable synchronization with SmEditWindow

Start implementing methods or visitors for handling movement

After mid-term

Enable the cursor to move within the characters of a node

Handle tree modifying input, delete, insert character etc...

Enable selections, copy, cut, paste, delete etc.

Improve user experience... and finish details.

About me

I'm Jonas Finnemann Jensen, a student at AAU in Denmark. I'm working on Go OpenOffice for GSoC 2010. For more info, ask me or see http://jopsen.dk/blog/.

Specification

A draft of a specification for specs.openoffice.org can be found here:

Visual Formula Editor patch for OpenOffice Math

This patch turns OpenOffice Math into a visual formula editor as known from MathType, LyX, MathCad, kFormula, Microsoft Office, etc. Users can freely use this visual formula editor in combination with the old command text interface for entering formulas. The goal is to make OpenOffice Math usable for non-power users who don't wish to learn the command language of OpenOffice Math. If you're by any chance in doubt about the awesomeness of this patch, take a look at the demonstration video.

Demo videos

As I've implemented a lot of features, explaining all of them in detail would take a lot of time and be really boring to read... So I've done some screencasts of the patch in action, demonstrating some of the most interesting features. For a list of all shortcuts, see the usage instructions section. And if you want to know even more, feel free to contact me or read my log below.

Status

The patch is currently fairly stable and testable; however, it is not recommended for production release in its current state. The patch will need rather extensive testing prior to a production release, and there are a few corner cases that haven't been worked out yet. For instance, the visual editor will ignore and discard alignment and fonts in edited formulas. There's a copy of my todo list below; note that not every item on this list needs to be addressed prior to a production release.

Todo List

This is a list of improvements that could be added to this patch. Some of them are probably trivial, and some of them should be implemented before this patch is merged into production code, whilst others would just be nice improvements to do in some distant future.

Global clipboard integration

Support undo/redo with UndoManager integration

More documentation

Put #ifdef DEBUG around DumpAsDot related code.

Replace j_assert with DBG_ASSERT

Code style (missing spaces, linebreaks and a few renames)

parse.cxx merges multiple blanks into one SmBlankNode, the visual editor doesn't, and treats one SmBlankNode as one character.

Font of SmTextNode needs to be determined on SmTextNode::ChangeText and in SmCursor::InsertText, currently FNT_VARIABLE is always used.

SmAlignNode and SmFontNode are ignored (and deleted upon modification) by visual editor, figure out how these should work.

SmGraphicWindow::KeyInput relies on comparison of sal_Char, a more generic way must be available for CTRL+c, etc.

When modifying with visual editor the body of SmSubSupNode shouldn't be more than one node.

When OpenOffice Math runs in standalone mode it centers the current formula, this is not nice for visual editing.
(Don't worry, I'll likely continue working on this patch).

Usage Instructions

Start Writer and insert a formula, or open OpenOffice Math directly; either way, you just click where you would expect the formula to appear and start typing. Entering text will write text in the formula. If you type +, -, *, < or > the corresponding operator will be inserted. If you type / a fraction will be inserted; if you have a selection, it will be used as the numerator. If you type (, [ or { scalable round, square or curly brackets will be inserted; if you have a selection, it will be placed between the brackets. If you type ^ you'll create a superscript; if you have a current selection, it will be used as the superscript. If you type _ you will create a subscript the same way. Typing ! will insert a factorial. If you press the enter-key, a new line will be inserted if you're in the toplevel line or in a stack/binom; notice that elements placed after the caret will be moved to the new line. If you press the enter-key in a matrix, you'll insert a new row. If you hit shift-enter, the current formula tree will be dumped to a dot graph written to /tmp/smath-dump.gv (this is a debug feature that should be disabled later).

Notice that you can use the "Formula Elements"-dialog and "Catalog"-dialog to insert stuff in the visual formula editor, without losing focus and caret position. Also, as you might have expected, the caret is moved using arrow keys (up and down movement may be improved) or by clicking with the mouse. If you hold shift down while moving the caret, either using arrow keys or mouse, you will create a selection. If you hit backspace or delete, you will delete the current selection; if there's no selection, you'll delete the next or previous element. However, if it is a complex element, you will only select it (thus you won't accidentally delete a fraction).

You can use CTRL+c, CTRL+x and CTRL+v to copy, cut and paste respectively; notice that there is currently no global clipboard integration.

Download

The patch is available against Go-OO (ooo320-m17) and OOo4Kids (r790).

Notes

I'll post notes on anything useful I discover as a part of this project here; some of it may be useful to others hacking on StarMath later.

Implementation Strategy

As of writing (Thursday the 24th of June, 2010).

If the tasks below are completed, many of the fundamental editing features should be fairly trivial to support... For instance implementation of backspace could be done by creating a selection and deleting this selection.

Next iteration

Selection:

How to represent a selection

Create and change selections

Draw selections

Deleting a selection

Insertion:

Insertion of symbols:

From "catalog"

From "formula elements"

Inserting new letters in text

Insertion of basic math operators and symbols: +, -, etc...

These are not all the editing features I've planned; more will come in the following iteration.

Following iteration

May be subject to change.

Support for fancy shortcuts, e.g. ^ for superscript, _ for subscript, etc...

Better UI for selecting "formula elements"

Copy, cut and paste

Support for creating formula elements by entering their commands prefixed with "\", possibly with a dropdown box for autocompletion...

More iterations to come as development progresses.

Notes on Nodes

Having made a patch to dump the equation tree to graphs using graphviz, I figured I'd post a few examples here... I'll also use these equations as test vectors whenever something is needed. More test vectors, to cover all cases, may be needed later; for now these will do.

I've included screenshots of the equations here, but only linked to the dumps, as these are rather large. The number on the edges in the graph denotes the subnode number, i.e. the number given to GetSubNode in the parent.

The patch for making these dumps can be downloaded here. You enter a formula in StarMath, render it, click on the rendered equation, and hit enter. The dump is saved to /tmp/smath-dump.gv, enjoy...
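To give an idea of what such a dump looks like, here is a minimal, self-contained sketch. It is not the code from the patch: `Node`, `dumpNode` and `dumpAsDot` are hypothetical stand-ins for SmNode and the real dumper, and labels are written without any escaping. Each edge is labelled with the subnode number of the child, as described above.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for SmNode: a label and an ordered list of subnodes.
struct Node {
    std::string label;
    std::vector<Node> children;
};

// Writes one node and its subtree; returns the id assigned to the node.
int dumpNode(const Node& node, int& counter, std::ostream& out) {
    int id = counter++;
    out << "  n" << id << " [label=\"" << node.label << "\"];\n";
    for (std::size_t i = 0; i < node.children.size(); ++i) {
        int childId = dumpNode(node.children[i], counter, out);
        // The edge label is the subnode number of the child in its parent.
        out << "  n" << id << " -> n" << childId
            << " [label=\"" << i << "\"];\n";
    }
    return id;
}

// Returns the whole tree as a graphviz "dot" document.
std::string dumpAsDot(const Node& root) {
    std::ostringstream out;
    int counter = 0;
    out << "digraph formula {\n";
    dumpNode(root, counter, out);
    out << "}\n";
    return out.str();
}
```

The returned string can be written to a file and rendered with graphviz, for example with `dot -Tpng`.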

Equation 5

Caret Movement

The problem with caret movement is to compute the next caret position in a given direction using the current caret position.

A caret position can be defined as a node and an index within this node:

struct SmCaretPos{SmNode* pSelectedNode; int index;};

Note: this index does not denote a subnode of the selected node. However, it can be used by nodes such as text nodes, in order to let the cursor exist within such a node.

Using this definition of a caret position, we can discuss methods for computation of caret movement.

Approach 1: Intelligent Tree Searching

The idea behind this approach is to do intelligent tree traversal in order to find the next caret position. To implement this approach we'll need to define a complex visitor for each direction of movement: left, right, up and down. The methods on this visitor will take a SmCaretPos as parameter, and will return a SmCaretPos.

Consider this pseudo implementation of a method in the visitor for computing a movement to the right:

SmCaretPos RightMovementVisitor::visit(SmXXXNode* curNode, SmCaretPos param){
    //Where SmXXXNode is a derivation of SmNode
    if(param.pSelectedNode == NULL){
        //We will move within this node using index
        //if not possible we will move out of this node by calling the parent:
        return curNode->GetParent()->accept(this, SmCaretPos(curNode, -1));
    }else if(param.pSelectedNode == curNode->GetParent()){
        //We will move into this node from the given direction,
        //i.e. return the left-most child, or if curNode doesn't have children,
        //return curNode and the left-most index (for the RightMovementVisitor)
    }else{
        //We will move out of a child of curNode, and select the next child after param.pSelectedNode
        //By next I mean the child to the right of param.pSelectedNode for the RightMovementVisitor.
        //If no such child is available we will move out of this node by calling the parent:
        return curNode->GetParent()->accept(this, SmCaretPos(curNode, -1));
    }
}

The above can hardly be called pseudo-code; however, the comments should give an idea of the invariants. The basic idea is that if pSelectedNode is NULL, then we will choose the next index if possible, or ask the parent to move out of the current node. If pSelectedNode is the parent node, then it is because the parent is trying to move into the current node; thus we return the current node if it is visible, and otherwise ask a child node. If pSelectedNode is not NULL and not the parent node, then it is a child node of the current node, and we should try to move into the next child or ask the parent to handle it.
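The three cases can be tried out on a toy tree. The following is only a sketch under simplifying assumptions, not the visitor from the patch: `Node`, `CaretPos`, `enterFromLeft`, `leaveToRight` and `moveRight` are hypothetical names, every node type is treated the same, and a leaf simply exposes the caret slots 0..length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SmNode.
struct Node {
    Node* parent = nullptr;
    std::vector<Node*> children;
    int length = 0;  // number of characters in a leaf; unused for inner nodes
};

// Hypothetical stand-in for SmCaretPos.
struct CaretPos {
    Node* node;
    int index;
};

// "Move into this node": descend to the left-most leaf, left-most index.
CaretPos enterFromLeft(Node* node) {
    while (!node->children.empty()) node = node->children.front();
    return CaretPos{node, 0};
}

// "Move out of a child": select the next sibling to the right, or keep
// asking the parent. Returns the fallback if the root is reached.
CaretPos leaveToRight(Node* child, CaretPos fallback) {
    for (Node* parent = child->parent; parent != nullptr;
         child = parent, parent = parent->parent) {
        const std::vector<Node*>& kids = parent->children;
        for (std::size_t i = 0; i + 1 < kids.size(); ++i)
            if (kids[i] == child) return enterFromLeft(kids[i + 1]);
    }
    return fallback;  // nothing further to the right: the caret stays put
}

// "Move within this node" if possible, otherwise move out of it.
CaretPos moveRight(CaretPos pos) {
    if (pos.node->children.empty() && pos.index < pos.node->length)
        return CaretPos{pos.node, pos.index + 1};
    return leaveToRight(pos.node, pos);
}
```

Even in this toy version the recursion towards the parent is where things can go wrong, which is one reason bugs in this approach tend to be serious.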

With a visitor like this, the next caret position to the right can be determined using:

Approach 2: Closest Caret Position Search

The idea in this approach is to suggest all possible caret positions, and select the closest caret position in the desired direction. This can be implemented using an abstract visitor that suggests caret positions to template methods. Concrete implementations of this visitor then save the suggested caret positions they are interested in.

With this approach, implementing a visitor for moving up, down, and left is almost trivial. These position search visitors can also be used for handling mouse clicks. Using visitors such as above the next right position can be easily computed as:

Note that this approach searches the entire tree; however, the square roots in the distance computations can be omitted, because we only need to compare. It can also be optimized so that entire subtrees are not searched if their bounding box (the structure node that holds them) is placed so that they cannot offer a better caret position than what has already been suggested. Also, equation trees are usually quite small, e.g. they don't span multiple pages. And even if so, axis-aligned bounding boxes perform very well in practical implementations.

It is obvious that the condition:

(pos - from_pos).distance() < (best_pos - from_pos).distance()

can be improved. For instance, when moving right, the distance difference on the Y axis could be multiplied by two, in order to favor caret positions on the same line as the current position. A more complex algorithm for doing this might help solve awkward cases. And note that such an algorithm would not be something that needs to be updated when new nodes are added.
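As a rough illustration of the comparison with the Y axis weighted, consider the sketch below. It is not taken from the patch: `Point`, `weightedDistSq` and `closestToRight` are hypothetical names, the candidate positions are assumed to have been collected from the tree already, and squared distances are compared so that no square root is needed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D caret coordinate.
struct Point {
    long x;
    long y;
};

// Squared distance with the Y difference scaled by yWeight; squared values
// are enough because they are only compared against each other.
long weightedDistSq(Point from, Point to, long yWeight) {
    long dx = to.x - from.x;
    long dy = (to.y - from.y) * yWeight;
    return dx * dx + dy * dy;
}

// Returns the index of the best candidate strictly to the right of `from`,
// or -1 if there is none.
int closestToRight(const std::vector<Point>& candidates, Point from,
                   long yWeight) {
    int best = -1;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (candidates[i].x <= from.x) continue;  // wrong direction
        if (best < 0 ||
            weightedDistSq(from, candidates[i], yWeight) <
                weightedDistSq(from, candidates[best], yWeight))
            best = static_cast<int>(i);
    }
    return best;
}
```

With the caret at (10, 0), a raised candidate at (12, -6) and a candidate on the same line at (20, 0), a weight of 1 picks the raised position (40 against 100), while a weight of 2 picks the position on the same line (148 against 100).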

Pros and cons

A list of pros and cons for the two approaches.

Pros for approach 1

It is efficient, i.e. not a wide, expensive search

It's predictable for the user. Consider the equation:

 2
a  + b      code for equation: a^2 + b

If we decide that the next position from "a" when moving right is "+". Then it will not suddenly select "2" in an equation like:

 2 + c
a      + b      code for equation: a^{2+c}+b

where the exponent is closer to "a" than "b" is... So it can be argued that this offers more predictable behavior for the user. However, if the exponent is sufficiently large, it can be argued that the user will expect the cursor to move into the exponent instead. I don't know if users imagine the tree behind the editor and expect movement to operate accordingly. Nevertheless, this illustrates how approach 1 differs from approach 2. And arguments can probably be made for both approaches from a usability point of view.

Cons for approach 1

It's hard to maintain! It will require four complex visitors in order to do movement, and each of these will need a special method for each node. Sometimes the parsing context in which the node appears needs to be taken into consideration too! Nevertheless, adding new nodes to the editor will become harder with this approach, because they will need to be worked into four complex visitors.

If "b" is selected it is unlikely that "2" will be selected when moving up. Using approach 2, this would happen easily. A complex hack that combines stuff from approach 2, to solve this might be possible, but it would also be harder to maintain.

Bugs are likely to be serious, a bug in this approach could very easily result in infinite recursion, given a rare equation subtree.

Pros for approach 2

It's maintainable: Instead of four complex visitors, we will only need one: the MovementVisitor. And even this is in most cases rather simple. Adding a new node will only require modifications to this visitor.

It's reusable for handling mouse clicks, i.e. finding the caret position closest to a click.

It selects the caret position that is visually closest, not logically.

Cons for approach 2

It's more expensive; however, it requires less code and thus a smaller binary, which might have just as big a positive performance implication as the more efficient search.

It's not logically predictable...

Summary

We need to figure out which approach is best in terms of maintainability and usability. It's obvious to me that approach 2 offers the easiest maintainability. And I believe that it is possible to define the condition, i.e. the distance computation algorithm, so that the result becomes reasonable. However, if we do want strict behaviour in terms of how nodes are composed, approach 1 is the way to go. But again, this is visual movement, so strict behaviour according to the composition of nodes is not necessarily always desired; maybe the solution is to enhance approach 2 with some of the ideas from approach 1.

Current status is that I have approach 2 working for all directions; however, it does have a few abnormalities, likely because I'm not using an I-beam, but a selection around the entire selected node. I've also got a somewhat working implementation of approach 1 for left movement, but it still lacks many specific nodes and is not working very well at the moment. The patch can be found below... Anyway, a solution could be to go forward with a better implementation of approach 2, and if that doesn't work out nicely in the end, well, then it can always be replaced later.

Patch for approach 2 and some of 1

A patch with my initial visual cursor movement is available here. Click on an equation to give SmGraphicWindow focus, and use arrow keys to navigate using approach 2. A partial implementation of right movement using approach 1 is implemented and available using the "a" key. The patch can be found here:

http://jopsen.dk/downloads/GSoC2010/initial-visual-movement.diff
Note that this implementation is not perfect, finished or anything close. It doesn't handle corner cases very well, and the patch includes a lot of dead/unused code, which somehow managed to get left behind. Also, coding conventions are not followed strictly in these hacks, nor is it documented completely...

External links

Log/progress report

On request of my mentor, I'll be posting progress reports here as often as possible.

Friday the 13th of August

Today I've ported my patch to OOo4Kids, written some usage instructions, and made a todo-list of stuff that would be nice to have... Some of the entries on the todo-list should be solved prior to production release, whilst other entries are dreams about features that could be really neat.

As this is the end of GSoC, this is probably also the last entry in this log/page. But I'd like to see this patch released, so I'm likely to keep working on it :) But I won't be writing logs here, maybe I'll do a blog post (jopsen.dk/blog) once in a while instead...

Anyway, thanks for having me as a GSoC student; it has been interesting to work on a project of this size, both in terms of people, lack of documentation :) and lines of code... I'd also like to give special thanks to my mentors Eric and Fridrich for code review and guidance, the random people who've helped me, and Google for the Google Summer of Code program.

Thursday the 12th of August

Today, I worked on the coding style of my patch; well, I got through one of the files and its header file... Having fixed the coding style of 3k lines, I think it's time to go to bed... The main issue is lack of space inside parentheses, and something like line breaks before and after curly brackets... There are also a few places where I neglected to use the proper prefix... I know I should have done this when I wrote the code, but sometimes it just goes a little too fast... Also, I'm really having a hard time writing identifiers such as aAnchor; I think my English teacher would turn over in her grave, if she knew...

Anyway, I may have fixed some of it, but there's still a lot left for a rainy day...

Wednesday the 11th of August

Today, I updated the shortcut for "/" to put the current selection as numerator when inserting a fraction. I also implemented a shortcut for inserting factorial signs; the shortcut is "!". Then I implemented a method for inserting limits into instances of SmOperNode, and I made a few exceptions to how commands from the "Formula elements"-dialog are inserted in the visual editor, so now the context-dependent buttons, such as insert newline and add limits to an existing SUM or INT, also work in the visual editor.

Along the way, I also fixed a few bugs and stuff... While I still have lots of things that would be nice to have done for the visual formula editor. It's probably about time, as my deadline is Saturday, that I start fixing my coding style and cleaning up the patch. So that's what I'll be doing tomorrow.

Tuesday the 10th of August

Today, I decided to spend some time implementing additional features. Specifically support for linebreaks using the enter-key. So if you place the caret in the middle of a toplevel line and hit enter, it will split the current line in two... I also extended the shortcut to work in sublines of STACK, BINOM and MATRIX commands, in these cases it will create a new row (and convert from BINOM to STACK, if needed). It's currently not possible to delete lines or rows, in the visual editor, but it's not really a critical feature.

Monday the 9th of August

Today, I downloaded OOo4Kids, and with a little help from Spiso I managed to get my svn checkout working. At first I tried to create a patch and apply it to OOo4Kids. But with over 5k new lines and modifications to a lot of files, I quickly realized that this would take forever... So instead I tried kdiff3 to merge my code with OOo4Kids; this went pretty easily, especially after I discovered shortcuts and found the menu item "Choose A for all whitespace conflicts". Believe it or not, a file like node.hxx has over 200 whitespace related conflicts.

Anyway, after merging my code using kdiff3, I tested and used svn diff --diff-cmd diff -x "-u" to make the patch... So here it is: a patch against OOo4Kids r790, and a patch against go-oo (ooo320-m17).

Visual Formula Editor for Go-OO (ooo320-m17).
This is a testable version; it's still not completely done... But it should be fairly stable, and it should be possible to get a pretty good idea of how the visual formula editor works. I'll work some more on additional features/improvements of the editor and will port them again at the end of this week, when I'll also be writing instructions, and probably another youtube screencast.

Sunday the 8th of August

Yesterday I noticed some unexpected behaviour in SmNode::CreateTextFromNode, I've used this method for synchronizing between visual formula editor and command text formula editor. As far as I can see SmNode::CreateTextFromNode is not implemented for all nodes, and not even implemented correctly everywhere. However, I haven't found any documentation for the method, so I can't tell if usage of MathType::LookupChar in SmMathSymbolNode::CreateTextFromNode is a bug, inconsistency or simply what it's supposed to do.

Anyway, I've always thought CreateTextFromNode used too many curly brackets that weren't needed. So I decided to implement SmNodeToTextVisitor for creating text from nodes. I finished this visitor today, and as it turns out I needed to use even more curly brackets than SmNode::CreateTextFromNode did... :) Nevertheless, this visitor can generate the command text for a formula, given its root node, and it should be able to handle all commands.

Today I also fixed a few small annoying issues... And implemented support for inserting brackets using "[", "{" and "(". That is if you select something in the visual editor and type "[", scalable square brackets will occur around the selection. This is really nice and easy to use...

I still have a lot of minor issues I'd like to address: stuff like the sub-/superscript shortcut not working for SUM and INT commands, fonts of SmTextNodes, shortcuts for line breaks, new rows/cols in matrices and so on. But I've promised my mentor a patch against OOo4Kids next Saturday. And there are probably a lot of issues with a patch of this size, so until this is done there will be no more new features... I hope it doesn't take all week, but just downloading, configuring and building OOo4Kids is probably going to take the better part of a day.

Friday the 6th of August

Yesterday I refactored a little more, mainly moving generic code into auxiliary methods. Today I finished my work on a shortcut for sub-/superscripts in the visual editor. I still need to give it a little love, so that it'll work with SUM and INT commands, but for now it creates normal sub- and superscripts nicely.

With this shortcut, typing "A", "_" and "b" in the visual editor will give you the formula "A rsub b". This is really nice compared to using the "formula elements" dialog, but these kinds of shortcuts are rather complicated to make, because they are context dependent, and all sorts of weird corner cases quickly emerge. For instance, having a selection while typing "_" will bring the selection down into the subscript of the node in front of the selection. Also, if there's no node to give the subscript to, a new place node will be created as body.

Anyway, I'm planning to work a bit more on refining this shortcut, and then start implementing support for "newline" commands using the enter-key. That is also a very nice, but context dependent, shortcut.

Wednesday the 4th of August

Today I started fixing corner cases... That is, small annoying issues and inconsistencies. For instance, it is no longer possible to delete the sigma symbol of the sum command. I also took some time to refactor some of the code in SmCursor, mainly to avoid code duplication and make it easier to implement neat shortcuts such as superscript when you type ^. This is one of the small, neat, but somehow very complicated features I'll be working on tomorrow.

Tuesday the 3rd of August

I've spent some time getting the "symbols catalog" and "formula elements" dialogs to work correctly when working in the visual editor. For the "symbols catalog" dialog I programmatically created an instance of SmSpecialNode and inserted it appropriately.

For the "formula elements" dialog I took another approach: instead of making a long switch with hundreds of lines to create nodes for stuff like "<?> subset <?>", I extended SmParser with a method for parsing an expression from a string. This approach was quite simple, but it won't support commands such as "newline", "from {<?>}" or other context dependent commands.

So some of the available elements in the "formula elements" dialog need to be supported explicitly. And some of the available elements from the "formula elements" dialog may be very difficult to support. However, in the long run it might be a good idea to implement a better "formula elements" dialog, perhaps as a toolbar; at the very least, users should select the number of limits ("from", "to" or both) when they are inserting a "sum" command, and not at a random time afterwards. That would be very difficult to support properly.

Sunday the 1st of August

The past three days I've been working out how SmDocShell, SmGraphicWindow and SmEditWindow work together, in order to implement synchronization better. And today this paid off; that is, I managed to make synchronization work better. I also enabled size updates, so that the visual formula editor can also be used on embedded formulas, and not only in standalone mode.

I must admit that even though it works now and I have an idea about what I'm doing, it's very frustrating that sfx2, svx, etc. are more or less completely undocumented. If I try to read the source I might be lucky and get something that resembles an idea of what a method does, or I might end up with 10 open tabs and no idea how I'm going to determine the side effects of setting some particular field. So I can't say I am totally certain what all the code snippets I've reused do... I feel like most of what I'm doing is best guesses, and this is not nice.

In particular the dynamic messages feel like magic, especially when they're handled by a custom class that uses its base class from sfx2 to catch the dynamic message (sound confusing? - don't worry, I lost myself too).

Also someday I should consider to figure out how to do undo/redo integration, EditEngine::SetText clears the undo history and can't be undone. Integration with global clipboard would also be nice, but for the moment, my current hack works.

Thursday the 29th of July

Today I finished support for copy, cut and paste in the visual formula editor. This is done by cloning a list of nodes, and inserting the cloned list... For creating clones I introduced a new visitor, which might also be reusable later... This still needs to be integrated with the global clipboard... And before I can do that I'll have to find the global clipboard.
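A cloning visitor of this kind can be sketched on a toy node hierarchy. This is only an illustration, not the visitor from the patch: `Node`, `TextNode`, `GroupNode`, `CloneVisitor` and `cloneNode` are hypothetical names, and the real SmNode classes carry much more state (tokens, fonts, layout) that a real clone would have to copy.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct TextNode;
struct GroupNode;

// Hypothetical visitor interface with one method per node type.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const TextNode& node) = 0;
    virtual void visit(const GroupNode& node) = 0;
};

struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor& visitor) const = 0;
};

struct TextNode : Node {
    std::string text;
    explicit TextNode(std::string t) : text(std::move(t)) {}
    void accept(Visitor& visitor) const override { visitor.visit(*this); }
};

struct GroupNode : Node {
    std::vector<std::unique_ptr<Node>> children;
    void accept(Visitor& visitor) const override { visitor.visit(*this); }
};

// Builds a deep copy; the copy of the most recently visited node is
// left in `result`.
struct CloneVisitor : Visitor {
    std::unique_ptr<Node> result;

    void visit(const TextNode& node) override {
        result = std::make_unique<TextNode>(node.text);
    }

    void visit(const GroupNode& node) override {
        auto copy = std::make_unique<GroupNode>();
        for (const auto& child : node.children) {
            child->accept(*this);  // leaves the cloned child in `result`
            copy->children.push_back(std::move(result));
        }
        result = std::move(copy);
    }
};

std::unique_ptr<Node> cloneNode(const Node& node) {
    CloneVisitor visitor;
    node.accept(visitor);
    return std::move(visitor.result);
}
```

Copy then amounts to cloning the selected nodes into a private list, and paste to cloning that list again before inserting it, so the clipboard content never shares nodes with the formula tree.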

I also spent some time reading code, trying to figure out how everything works together, and I discovered UndoManager, so I've got an idea of how undo/redo functionality could be implemented and integrated.

Wednesday the 28th of July

Today, I wrote a small php script for converting my log entries to an RSS for inclusion on planet.go-oo.org, so here I am. But don't worry the script is bound to crash someday, so my noise will fade :)

I spent most of the day writing documentation for the visual equation editor. I did a lot of LaTeX examples and graphs (using graphviz); everything is written in a header file and can be built with doxygen. Even though I'm not done, I had to see how this would look when doxygen was done with it, so I tried the neat make docs, and as of writing I'm realizing that this is going to take 3-4 times as much time as it took to build OpenOffice.

Update: The page I wrote is now available here along with all of my documentation of OpenOffice Math.

Anyway, while my computer crunched away with doxygen, I started working on a cloning visitor for copy, cut and paste functionality... Oh, yes, if anybody has experienced problems watching the demonstration screencasts I've posted on youtube (Selection, Editing), the videos are now also available in various formats here.

Monday the 26th of July

I've just done another screencast, showing off editing capability: http://www.youtube.com/watch?v=tELPgJIC1sg If I'm moving slowly in the screencast, it's only to ensure that xvidcap gets enough frames, as redrawing is still flickering a little... As mentioned yesterday, the patch posted is a bit old, so I'm posting a new patch that is up to date with the latest bug fixes. The patch is against ooo320-m17 with go-oo patches, and it's pretty large...

Sunday the 25th of July

Today I've fixed a lot of minor issues, stuff like working out a hacked synchronization between visual editor and traditional (text based) formula editor. I also finally managed to track down and fix the few pixels I'm always off when drawing caret. It turned out I had ignored an offset, and this offset was constant and very small, so that's why I didn't really notice it before. I've also enabled insertion of more nodes. And fixed a lot of small bugs, such as crash when starting on a clean formula.

To fix all these bugs, and track down issues when altering the formula tree, I've used the dump to dot method, for dumping the formula tree to graphviz. My current patch does this dump when you hit the enter-key, however, running graphviz and opening the image in an image viewer quickly became rather boring. So I wrote a script that monitors the file system and detects changes to /tmp/smath-dump.gv, once a change is detected it runs graphviz and opens the image for viewing, while closing any previous opened images.

Saturday the 24th of July

I worked a bit on insertion yesterday; however, it wasn't until today that it actually reached a state where it was remotely useful. Insertion sort of works, that is, I can insert something, but it doesn't create the right tree... And some of the corner cases don't work very well. And there are a lot of corner cases with text nodes. The approach is basically to make a list of the nodes that constitute the current line, insert the new node, parse the list into a subtree and put it back into the formula tree.

While this approach works fairly well, it's rather nasty when the caret is inside a text node, because then I'll have to split the text node in two. And if I'm inserting a text node inside or next to an existing text node, these nodes should merge. It's also rather unpleasant to create nodes, because their constructor takes an SmToken. Before, the constructor was only used in the parser, so this made sense, but now it's just really annoying and complicated. Especially with text nodes, which might change parsing information when I change the text from a number to a char and a number, making it an identifier.

Anyway, status: insertion is close, I guess another day or two... And to conclude my complaints about node constructors, maybe I should introduce new constructors or simplify existing ones :) .
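The split and merge handling can be illustrated on a flat list of items. This is a sketch under simplifying assumptions, not the code in SmCursor: `Item` and `insertInText` are hypothetical names, a line is modelled as a plain vector, and the item at `nodeIndex` is assumed to be a text item.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical line element: either a run of text or some other node.
struct Item {
    bool isText;
    std::string text;
};

// Inserts `inserted` at character offset `caret` inside the text item at
// `nodeIndex`: the text item is split in two, then neighbouring text items
// are merged again and empty text items are dropped.
std::vector<Item> insertInText(const std::vector<Item>& line,
                               std::size_t nodeIndex, std::size_t caret,
                               const Item& inserted) {
    std::vector<Item> split;
    for (std::size_t i = 0; i < line.size(); ++i) {
        if (i != nodeIndex) {
            split.push_back(line[i]);
            continue;
        }
        const std::string& text = line[i].text;
        std::size_t cut = std::min(caret, text.size());
        split.push_back(Item{true, text.substr(0, cut)});
        split.push_back(inserted);
        split.push_back(Item{true, text.substr(cut)});
    }

    std::vector<Item> merged;
    for (const Item& item : split) {
        if (item.isText && item.text.empty()) continue;
        if (item.isText && !merged.empty() && merged.back().isText)
            merged.back().text += item.text;  // merge adjacent text runs
        else
            merged.push_back(item);
    }
    return merged;
}
```

Inserting an operator in the middle of "ab" yields three items, while inserting the text "x" at the same place yields the single text item "axb".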

Thursday the 22nd of July

Today, I fixed deletion of selections. That means that it is now possible to move the caret around, select stuff and delete it. I continued with the approach I discussed yesterday, breaking down the subtree that constitutes a line and parsing it again. Tomorrow I'll start working on inserting new nodes in an existing tree, using the same approach. Most of the hard work, e.g. the parser, should be done.

I also found a little time to remove old code, refactor and just clean up in general. So I figured this would be a good time to dump a diff. I recall somebody asking for some code to look at... So here it is:

Wednesday the 21st of July

I almost managed to delete selections today. Well, it does delete something, just not the right things :) But it doesn't crash, so that's pretty good. I've decided to take a pretty hardcore approach to modifying an existing formula tree. Instead of removing nodes and then trying to repair the tree, I remove all the SmBinHorNodes, SmExpressionNodes and similar nodes of the line that contains the selection, convert the line into a list, remove what I want to delete, then parse the list again, which recreates the SmBinHorNodes and SmExpressionNodes, and put the result back into the tree where the line was taken from.
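As a rough sketch of this flatten, delete and reparse idea, under heavy simplifying assumptions: `Node` is a hypothetical binary tree node standing in for the real node classes, the "parser" only understands `operand (operator operand)*`, and error handling for dangling operators is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node: operators have two children, operands have none.
struct Node {
    std::string label;
    bool selected = false;
    std::unique_ptr<Node> left, right;
};
using NodePtr = std::unique_ptr<Node>;

NodePtr leaf(std::string label, bool selected = false) {
    NodePtr n = std::make_unique<Node>();
    n->label = std::move(label);
    n->selected = selected;
    return n;
}

NodePtr binary(NodePtr l, NodePtr op, NodePtr r) {
    op->left = std::move(l);
    op->right = std::move(r);
    return op;
}

// Tear the structure apart, leaving the nodes of the line in reading order.
void flatten(NodePtr node, std::vector<NodePtr>& line) {
    if (!node) return;
    NodePtr l = std::move(node->left);
    NodePtr r = std::move(node->right);
    flatten(std::move(l), line);
    line.push_back(std::move(node));
    flatten(std::move(r), line);
}

// Toy parser: rebuild a left-nested tree from "operand (operator operand)*".
NodePtr reparse(std::vector<NodePtr>& line) {
    if (line.empty()) return nullptr;
    NodePtr tree = std::move(line[0]);
    for (std::size_t i = 1; i + 1 < line.size(); i += 2)
        tree = binary(std::move(tree), std::move(line[i]), std::move(line[i + 1]));
    return tree;
}

// Flatten the line, drop the selected nodes, and parse what is left.
NodePtr deleteSelection(NodePtr root) {
    std::vector<NodePtr> line;
    flatten(std::move(root), line);
    line.erase(std::remove_if(line.begin(), line.end(),
                              [](const NodePtr& n) { return n->selected; }),
               line.end());
    return reparse(line);
}
```

For a + b + c with "b" and the second "+" selected, `deleteSelection` yields the tree for a + c. The point of the design is that the tree never has to be repaired in place; the structural nodes are simply rebuilt from the flat line.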

I've also updated my dot dumper for graphviz to draw dashed lines for nodes that are selected, this turned out to be a good tool for debugging selections. It's really late now and I promised to be on IRC tomorrow, so I'll go to bed now. But once I get bored I'll post some graphs showing how selections work.

And by the way, I've prepared a little video demonstrating caret movement and selection, and posted it to YouTube. So if you want a quick view of what I've got working well now, take a look at: http://www.youtube.com/watch?v=W8yXyDiIQPc

Note: I expect deletion of selections to be coming really soon :)

Tuesday the 20th of July

Today I started working on editing. I've decided to place all the movement and editing logic in a class called SmCursor. The cursor class is slightly inspired by QTextCursor from Qt. The idea is that the cursor manages caret position, movement and selection in the formula, and allows for programmatic modifications of the formula. That way this cursor class can be used by SmGraphicWindow to move the visual caret, while still being useful when we need to insert new formula elements using buttons in the UI.

Tomorrow I'll associate and integrate the SmCursor with the SmDocShell class, I think the SmDocShell is the owner of the entire formula, so this will be the right place to put it. Then I'll start working on how to delete a selection.

Monday the 19th of July

I didn't code much today, but having read the code for almost all nodes in order to build a graph over the caret positions, I've learned quite a lot about the nodes. So I decided to spend some time writing doxygen documentation comments in nodes.hxx. I figured I'd better do it now, before I forget what the nodes do. I haven't compiled the doxygen documentation yet, but someday when I'm bored I'll build the documentation and post it somewhere.

Sunday the 18th of July

Today I worked on up and down movement, which has improved significantly. There are still some minor issues... But at least the caret no longer gets stuck because up jumps between two positions. The caret position graph building visitor, SmCaretPosMapBuildingVisitor, is also finished now. There are still some minor corner cases that need to be taken care of, but it builds a map over the caret positions for every node.

As mentioned yesterday, selection is really working out well and this is still the case today :) , so it's not completely unlikely that I'll start to worry about editing tomorrow. Also, I ported the DumpToDot method to my new branch and made it work without information about the parent on every child node. This was necessary as I desperately need information about how nodes are composed; the code does help, but the parser can be really complicated! Anyway, I'm adding doxygen comments to nodes.hxx with inline LaTeX to show what the nodes look like (that's going to look nice when I start building documentation :) ).

Saturday the 17th of July

Today I fixed a lot of small annoying bugs, stuff I'd done wrong or not written correctly. With these fixes selection is finally working great, even when the selection crosses an operator in an SmBinHorNode. I also cleaned up and removed the old caret movement code, which required porting the up and down movement to the new data structure. This worked fine, but up and down movement needs some tweaking. It's just minor stuff; the approach taken should be final now :).

Before I start working on editing, I'd like to finish the caret position graph building visitor. And if I get a good idea, improve up and down movement. I expect this to get done tomorrow; then I'll start working towards something that can actually accept input and modify a formula. If I get bored with coding tomorrow, I'll also document the caret movement strategy here...

Friday the 16th of July

Hard work and lots of thinking finally started to pay off. As I wrote a few days ago, I decided to build a graph over all caret positions. Creating a visitor for building such a graph turned out to be quite complex, but now it's finally happening. I haven't handled all nodes yet, and there are a few context-dependent corner cases for some of the nodes (e.g. SmSubSupNode when used in an SmOperNode in an equation like "SUM FROM a TO b A_i").

The good news is that left and right movement is now consistent, meaning that it doesn't depend on coordinates, which may change depending on font size. Up and down movement will continue to use the old approach, but that is also the best solution for these kinds of movement. There's still a little work to do before I can start working on editing. But once movement and selection are finally playing nicely together, editing is simply a matter of adding nodes to and removing nodes from the tree.

Tuesday the 13th of July

As mentioned yesterday, I've become unhappy with the somewhat random nature of caret movement. So I've decided to address this, so that my invariant about caret positions is clear before I start working on editing... The solution I've come up with is to make a graph over the caret positions and use this to navigate left and right, whilst I'd still be able to search the graph for handling mouse clicks.

This compromise between the approaches discussed above would limit the number of complex visitors to one, and still give consistent left and right movement. However, today it turned out that making a visitor for building a graph over the caret positions is not exactly easy. I'll work more on this tomorrow, hoping for a breakthrough so I can get on with editing, which should, now that selections are supported, be fairly straightforward (well, at least in my head :)).
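A caret position graph of the kind described in this entry might look roughly like the sketch below. It is a simplified illustration: `CaretEntry` and `CaretGraph` are hypothetical names, entries are linked by index rather than by pointers to nodes, and the visitor that would actually build the graph from the formula tree is not shown.

```cpp
#include <cstddef>
#include <vector>

// One possible caret position: where it is drawn, and its neighbours.
struct CaretEntry {
    double x;
    double y;
    int left;   // index of the position to the left, -1 if none
    int right;  // index of the position to the right, -1 if none
};

class CaretGraph {
public:
    // Register a caret position and return its index.
    int add(double x, double y) {
        entries_.push_back(CaretEntry{x, y, -1, -1});
        return static_cast<int>(entries_.size()) - 1;
    }

    // Declare that rightId is the position directly right of leftId.
    void link(int leftId, int rightId) {
        entries_[leftId].right = rightId;
        entries_[rightId].left = leftId;
    }

    // Left/right movement follows the links and ignores coordinates,
    // so it does not change with font size or layout.
    int moveLeft(int id) const {
        return entries_[id].left < 0 ? id : entries_[id].left;
    }
    int moveRight(int id) const {
        return entries_[id].right < 0 ? id : entries_[id].right;
    }

    // Mouse clicks search the graph for the closest position.
    int nearest(double x, double y) const {
        int best = -1;
        double bestDist = 0.0;
        for (std::size_t i = 0; i < entries_.size(); ++i) {
            double dx = entries_[i].x - x;
            double dy = entries_[i].y - y;
            double dist = dx * dx + dy * dy;
            if (best < 0 || dist < bestDist) {
                best = static_cast<int>(i);
                bestDist = dist;
            }
        }
        return best;
    }

private:
    std::vector<CaretEntry> entries_;
};
```

Only `nearest` looks at coordinates; `moveLeft` and `moveRight` depend purely on how the positions were linked, which is what makes horizontal movement predictable.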

Monday the 12th of July

On Friday I started becoming increasingly unhappy with how caret movement and positioning is currently done. Mainly due to some serious pixel issues, but also due to the fact that movement and additional caret positions become increasingly important when working with selections. So I've been working out a new caret movement approach, and I'm currently in the process of testing it. If it works reasonably well, I'll write an outline in the above discussion on caret movement. Anyway, that's what I've been up to this weekend.

Thursday the 8th of July

Having spent a lot of time trying to figure out how to handle selections in a formula, it finally came together today. I went forward with the idea I presented in the log yesterday (see below), and this turned out to be doable without any nasty hacks. This means that I've implemented the SetSelectionVisitor, which sets a boolean property IsSelected on all nodes of the formula tree. As previously mentioned, a selection is two SmCaretPos, start and end. The visitor works by maintaining a state IsSelecting (initially false), which is set to true when it encounters start and to false again when it encounters end. When visiting a node, the IsSelected property of that node is set to IsSelecting.
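The basic state machine can be sketched in a few lines. This is a simplified illustration rather than the code from the patch: `Node` is a hypothetical stand-in for SmNode, a caret position is reduced to "directly after this node", and the handling of partially selected composite nodes is not included.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a formula tree node.
struct Node {
    std::string label;
    std::vector<Node> children;
    bool isSelected;
};

// Marks every node that lies between the two caret positions.
// Assumption: each caret position sits directly after the given node.
class SetSelectionVisitor {
public:
    SetSelectionVisitor(const Node* start, const Node* end)
        : start_(start), end_(end), isSelecting_(false) {}

    void visit(Node& node) {
        node.isSelected = isSelecting_;        // copy the current state
        for (Node& child : node.children)
            visit(child);
        if (&node == start_) isSelecting_ = true;   // passed the start caret
        if (&node == end_)   isSelecting_ = false;  // passed the end caret
    }

private:
    const Node* start_;
    const Node* end_;
    bool isSelecting_;
};
```

For a line a + b + C with start after "a" and end after "b", the visitor marks the first "+" and "b" as selected and leaves everything else unselected.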

This might seem like an easy thing to do, and indeed it is. However, there are some nasty corner cases to work around. For example, consider the formula sqrt{A + B} + C, and imagine that [ denotes the start caret position of the selection and ] denotes the end caret position of the selection. If start and end are placed as sqrt{A + [ B} + ] C, the selected part of the formula should be sqrt{A + B} +. I.e. if I put the caret under the square root sign, press shift and click in front of C (outside the square root), the entire square root should be selected.

Nevertheless, I managed to find a solution to the corner case described above. I also managed to draw the selection in the DrawingVisitor I implemented yesterday. Currently selections don't work perfectly; I still need to place some caret positions, and selections can only be created using mouse click and shift (not movement using the arrow keys). So there's still a lot of work to be done before this works nicely, but the nasty corner case is now out of the way... And when you click the right places and hold the shift key, this is what it looks like:

[[!img http://jopsen.dk/downloads/GSoC2010/initial-selections.png]]

Wednesday the 7th of July

Today I've moved all the drawing code into a visitor, which makes it a lot easier to locate the code and modify a section of it. This will be needed when I'm going to draw selections. Anyway, the drawing visitor works, but it's still ugly because I didn't clean up the drawing code... It would be nice to do this... And to move all the arrange methods etc. into visitors as well; however, for the time being I don't need to modify these. So I'll take them when/if I get around to it.

With regard to selections, I'm making a little progress. I'm still pretty much working in my head, trying to figure out how selections can be handled easily. I've decided to try adding a SetSelectionVisitor that will set a boolean IsSelected property to true on nodes that are selected. This visitor should then be able to mark a reasonable selection based on two caret positions. Using the IsSelected property I believe it should be possible to draw the selection, and hopefully also delete, copy and paste it.

I'm not certain that this approach will work; there are still some issues I need to work out. Nevertheless, I'm going to go forward with this approach and see where it ends... :) If it doesn't work out well, I'll write some notes on the problems with selections, and work on something else until one of my infinitely wise mentors throws an idea at me.

Tuesday the 6th of July

Today I've played with OutputDevice, and using GetTextWidth I've managed to allow the caret to be placed inside an SmTextNode. The solution is pretty good and stable, but it might be a bit slow; I don't know how fast Push, Pop and GetTextWidth on OutputDevice are. But then again, performance is probably not an issue, and if it is we can optimize by implementing CanExclude and/or caching expensive computations.
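The idea of using text widths to place the caret inside a text node can be illustrated like this. It is a sketch under assumptions: `caretIndexAt` and `MeasureFn` are hypothetical names, the measuring callback stands in for a real text width query such as GetTextWidth on an output device, and the text is treated as single-byte characters.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <string>

// Callback that returns the rendered width of a string (assumed interface).
using MeasureFn = std::function<double(const std::string&)>;

// Return the character index whose caret line is closest to `x`, where `x`
// is measured from the left edge of the text node.
std::size_t caretIndexAt(const std::string& text, double x,
                         const MeasureFn& width) {
    std::size_t best = 0;
    double bestDist = std::fabs(x);  // caret in front of the first character
    for (std::size_t i = 1; i <= text.size(); ++i) {
        // The caret after i characters sits at the width of that prefix.
        double dist = std::fabs(x - width(text.substr(0, i)));
        if (dist < bestDist) {
            best = i;
            bestDist = dist;
        }
    }
    return best;
}
```

This measures every prefix once per lookup, which is the kind of cost that caching could remove if it ever shows up as a performance problem.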

Monday the 5th of July

These past two days I've been working on refining the caret movement. Instead of using the distance between the centers of two nodes to determine where to place the caret on movement, I'm now using the distance between the caret lines. I've also added more caret positions and made the SmCaretPosSearchVisitor less general. As a result the caret movement is now more predictable and usable. There are still some parameters to tweak, and probably still corner cases that don't act as expected.

The next step is to start working on selections and/or editing. I'm considering postponing selections, as this might be very complicated. If I can figure out how to do it, I might give it a shot. But I'd rather spend some time implementing editing without selections than waste a lot of time trying to implement selections unsuccessfully. IMO editing without selections would still be very useful (as opposed to fruitless efforts to implement selections).

Saturday the 3rd of July

I've tested Michal Spisiak's (Spiso) patch for the baseline alignment issue. It took a few adjustments to make it apply, but the patch seemed good. Michal still needs some UI, so that the user can choose this and it is not something that just magically happens. However, the baselines do align as they are supposed to, which is really nice, congratulations on work well done... :)

By the way, if you've noticed that there are no log entries for the past week, it's because I've been off to summer camp (Danish). But I'm back and almost awake again :)

Sunday the 27th of June

This weekend I've been working on the |--> symbol (0x27fc). I defined the symbol in Math.xcu, so that it appears in the symbol catalog. Then I drew the symbol in inkscape and imported it to fontforge. Once I've decided if this was the right approach to take I'll write a neat little article about adding symbols to OpenOffice Math.

[[!img http://jopsen.dk/downloads/GSoC2010/mapsto.png]]

Above you can see the result as rendered in OpenOffice Math. The hardest part was drawing the arrow and making it look good, e.g. symmetric and stuff. The patch is included below. It's possible that it still needs a little cleaning to look good; for instance, I'm not sure the changes in the .src are needed. The patch also makes union and intersection work as operators with boundaries, as described in the entry for the 23rd of June.

Wednesday the 23rd of June

Today I worked on allowing the union and intersection operators to work with to and from boundaries, as is known from the sum operator. This turned out to be achievable with a few very small changes in parser.cxx. However, it would be nice to have a larger symbol to use for union and intersection when used in this context. But no such symbol is in OpenSymbol, so we'd have to draw one. While this is working pretty well, there might be a few parsing abnormalities, because the command union is now used both for union as a binary operator and for the construction that can be seen below (which is what I just created):

[[!img http://jopsen.dk/downloads/GSoC2010/union.png]]

So it's necessary to check that this change in the parser doesn't have consequences for backwards compatibility. If it does, there might be a few parameters that we can tweak in the parser. If that doesn't work, we can always choose another command for union with from and to boundaries. Anyway, I haven't done extensive testing yet, so I'm not going to worry about potential backwards compatibility issues before I discover them.

Tuesday the 22nd of June

These past two days I've been rewriting and cleaning up the code I've written so far. Basically I made a diff, reset my branch to master and started copying and writing in the changes that I wanted to keep... Thus, I've removed all the testing code that I'm not using for caret movement anymore. I've also taken the time to clean up edit.cxx, where the synchronization timer won't be needed anymore. I've also removed all the old cursor/caret code from SmGraphicWindow; previously most of it was just commented out enough to disable it... :)

I also took the liberty of reusing my abstract caret position search visitor as the base class for a visitor that finds the caret position closest to a mouse click. So now caret movement using the mouse is also done using a visitor. Next up is refining caret movement, both how the next caret position is calculated and where carets can be placed (e.g. the rectangle of a fraction shouldn't be selectable). It is also necessary to allow caret movement inside a text node. This might prove rather complicated, but I hope to find some text width computation methods to do the heavy lifting...

Sunday the 20th of June

I've started to clean up my implementation of caret movement... I'll refine it later, but first off is getting the names to follow the OO.o naming convention, adding more documentation comments and removing old code that I added but don't use anymore. There's also some cleaning up to do after removing synchronization in edit.cxx. Basically I'm writing it all from scratch again, so that code conventions and other minor issues are fixed as I write. All of this is not done yet, but it's something that I'll get around to, hopefully over the next few days...

Saturday the 19th of June

Today I've been working on adding new symbols to OO.o Math... First I tried to add them by modifying parse.cxx, which turned out to be a fairly easy thing to do... But as I read into how most other symbols were implemented, it turned out that adding new standalone symbols is simply a matter of modifying officecfg/registry/data/org/openoffice/Office/Math.xcu (an XML configuration file). The symbols I wanted to add weren't in the OpenSymbol font (extras/source/truetype/symbol/opens___.ttf). I suppose I could just add them; however, that would certainly put my artistic skills to the test... Anyway, until we've decided specifically which symbols to implement, I'm putting this on hold... Note that while working on this I've been taking some good notes, which I plan to write up as an article once I'm done...

In the meantime I'll be working on cleaning up the code I've written so far, getting the caret movement refined and start working on editing.

Friday the 18th of June

I've been playing some more with VirtualDevice for double-buffered drawing; however, this didn't solve the flickering issue. Quite the contrary, it made drawing slower, which in turn made the flickering more visible. Somehow VirtualDevice also managed to disable anti-aliasing, so the result wasn't pretty.

Anyway, I tried to do the drawing to the VirtualDevice in SmGraphicWindow::PrePaint instead of SmGraphicWindow::Paint, but this didn't solve the issue, even though it should reduce the time from the Window::Erase() call in Window::ImplCallPaint() until the painting of the content is done. The patch where I tried this hack is available below:

Thread on dev@gsl...
I think that a solution to the flickering would be to draw the background in SmGraphicWindow::Paint instead of having Window::Erase() called in Window::ImplCallPaint(). I tried this by calling Window::SetBackground() in the constructor of SmGraphicWindow. This did indeed solve the flickering issue, however, I couldn't figure out how to draw the background manually in SmGraphicWindow::Paint, with more work this might be a solution...

At Eric's request I also started playing a little with implementing new symbols. I'll be implementing new symbols later, and will document the process so that others can implement new symbols if desired...

Thursday the 17th of June

I've been trying to use VirtualDevice for drawing double buffered... However, so far this is with limited success, I think I've got some logical and pixel sizes messed up. And maybe some antialiasing that doesn't work... Flickering is still an issue, but I'm not done getting the VirtualDevice working...

Monday the 14th of June

I've written a message to dev at gsl... for help on the flickering issue. And attached the patch for reproducing it, if that should be necessary... Somehow I get the feeling that vcl doesn't double buffer by default, something we're used to from modern GUI-toolkits today (gtk and qt off the top of my head)...

Friday the 11th of June

I've isolated and tested a patch that makes it easy to reproduce the flickering issue... It's now only two lines of new code, so that's pretty good. I'll be posting it to dev at gsl... as soon as possible... For anybody interested, the patch can be found below. It causes SmGraphicWindow to grab focus on click and repaints/invalidates at every keystroke sent to SmGraphicWindow.

Monday the 7th of June

Today, I've been working on drawing a proper caret, that is a straight line. Not very hard to do... However, as it turns out drawing isn't very fast. Actually invalidating the entire SmGraphicWindow at each caret movement causes the widget to flicker a little at times. This is something we can live with as it only flickers when the caret is moving. Note that this may be specific to my computer (though I doubt it).

It should be possible to avoid invalidating the entire widget when moving the caret, this might solve the flickering issue. However, we'll need to redraw the entire equation whenever it is edited, i.e. the SmNode tree is changed. Thus full redraw needs to be fast enough to avoid flickering. A quick guess is that it flickers because the underlying drawing stuff, first blits the entire widget blank/white, before blitting the content.

Anyway, I've made a patch where any keystroke to the plaintext editor for entering equations causes SmGraphicWindow to parse and redraw the entire formula (the patch is available below). The result is that the widget flickers when entering the formula. I suppose that parsing and arranging the formula takes some time, but I still guess that it's the drawing that causes the flickering.

http://jopsen.dk/downloads/GSoC2010/updatehack-7-june.diff
This may be a deeply rooted issue in OpenOffice, it may be specific to my system (doubt that), it may also be specific to Linux (more likely), or it could be a general issue in OpenOffice. I suppose that drawing performance is a bit out of my scope and probably quite difficult, if not impossible, to improve; so I don't know if I should look more into this or just accept the current situation.

Sunday the 6th of June

This past week I've been preparing for an exam and celebrating my parents' 25th wedding anniversary. Anyway, as of today I'm past that... I've got a new exam to prepare for, but I'm done with the wedding anniversary stuff... So today I've been reading about specifications for OpenOffice and have written a quick draft; a link is available on this page.

I'm slightly torn as to how detailed it should be; more details are probably needed. But I'm fairly sure we won't get them all before the implementation is complete.

Monday the 31st of May

I've been writing notes on the two approaches to computing the next position for visual movement... I've also gotten the first approach, which didn't work yesterday, working, but the implementation is still so partial that it's almost not worth testing out. Anyway, I'll publish my patch, i.e. the combined patch of all my nasty hacks, and send a link to the notes to my mentor(s).

Sunday the 30th of May

I've been testing out the visual movement I implemented on Friday, and trying to improve it. Then I've also worked on the first approach, i.e. my initial idea. It's a bit more complicated, but yields more predictable results... Nevertheless, at the current stage my quick-n-dirty implementation didn't work. I'll try to bugfix it later, and I'll also make some pretty pictures to explain the two approaches, with pros and cons.

Friday the 28th of May

I had some trouble sleeping last night... It might be slightly related to the fact that I had an epiphany: instead of making complex rules about how to find the next caret position, based on SmNode type and information about context from the parser, I define a set of CaretPosition within each node and let a visitor search for a position to the left of the current caret position.
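The search itself can be pictured with a small sketch. It rests on assumptions: the candidate positions have already been collected from the nodes into a flat list (the visitor traversal is not shown), `CaretPosition` here is just a pair of coordinates, `findLeft` is a hypothetical name, and plain squared distance is used as the metric.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical caret position: only the coordinates of the caret.
struct CaretPosition {
    double x;
    double y;
};

// Among all candidates strictly to the left of `current`, return the index
// of the closest one, or -1 if there is none.
int findLeft(const std::vector<CaretPosition>& candidates,
             const CaretPosition& current) {
    int best = -1;
    double bestDist = 0.0;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (candidates[i].x >= current.x)
            continue;  // not to the left of the caret
        double dx = candidates[i].x - current.x;
        double dy = candidates[i].y - current.y;
        double dist = dx * dx + dy * dy;
        if (best < 0 || dist < bestDist) {
            best = static_cast<int>(i);
            bestDist = dist;
        }
    }
    return best;
}
```

Right, up and down movement would use the same loop with a different filter, which is what makes this simpler than per-node movement rules.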

I tried it out today and it works pretty well. I didn't do a beam caret yet... I'd expect better results with that. I'll try it this weekend, if I can find time. It's a different approach; I'll do an outline with pros/cons (and explanatory pictures) later... And pass the question on to my mentor(s). Anyway, this approach could be easier to implement and maintain while offering better navigation, though the latter might be subjective.

Thursday the 27th of May

I got the visitor working today. And have been analyzing the parser... to get a more specific idea of how nodes are composed... I'm also starting to consider using the parsing information, specifically SmToken on SmNode, to assist the visitor. As one movement pattern based on the visitor alone won't cut it.

Wednesday the 26th of May

Today I've been writing a visitor... This was slightly complicated by the fact that a copy constructor of SmStructureNode creates an instance of SmNode, which I've tried to make abstract... Furthermore, I was unable to find where this copy constructor was defined... There was nothing helpful in the definition of SmNode, and though SmRect has one, it's declared inline and doesn't have a body in the definition. I tried debugging it, but the copy constructor of SmStructureNode wasn't invoked; it's probably only used in some special corner cases...

Anyway, I also wrote a small test visitor filled with assertions, and trying to accept it generates a segmentation fault that the debugger wasn't too happy to help me with. Maybe I should try raw gdb, i.e. without a frontend, and see if that helps. Alternatively I'll probably just dump some strings to cerr and locate the crash. But I'll be fixing the visitor tomorrow... When, by the way, I'll be working on GSoC all day, since I just handed my semester project in for printing today :)

Tuesday the 25th of May

Today I learned that printing stuff from functions helps detect when it executes the wrong branch. Also, KeyInput behaved inappropriately when GetCode() was called multiple times... That is, handling key input with a switch worked fine, but doing it with if-statements didn't work... I'm still having problems writing integers to streams, e.g. cerr<<45 doesn't work... It works fine for strings, but integers won't work and just block the stream. This is really annoying, because dumping information to stderr is easier and often faster than setting up breakpoints and reading variables in the debugger. Also, information dumps can be persisted between sessions even though line numbers change.

Status: I've got cursor movement... By using the keyboard I can select any node in the equation tree. The movement is still naive, i.e. it just moves between subnodes in the order they have. The next step is to implement visitors and make the movement more intelligent. I've also got to decide where the code for input handling should be placed, e.g. in SmGraphicWindow where KeyInput is, or where the equation tree is managed. Perhaps it would be best to place tree-related stuff in the SmDocShell and handle input in SmGraphicWindow. I'll have to look into this...

Monday the 24th of May

Had trouble with my build, so I decided that it was easier to just start over... Deleted my go-oo checkout, created a new one from a stable branch, typed $ make ; totem ring.mp3 and went back to working on the semester project I'm handing in on Thursday :) Then suddenly it rang, and now I have a working checkout again...

Sunday the 23rd of May

The past week I've been working to get the cursor moving... I'm having minor problems with KeyInput and caret position calculation... And decided that I wanted to enable assertions to help find the bugs. So I spent this evening trying to enable assertions with:

This would build only partway: after fixing a bad cast in testtools that module built, however, the connection module wouldn't build either, something about: _STLD::__stl_debug_engine<bool>() being undefined... Anyway, my guess is that it doesn't link against libstlport_debug, that there's a problem with the library I have installed, or that something else is bad. Nevertheless, I decided that it was too far out of my scope and likely to be fixed if I updated either go-oo or my distro (Ubuntu). So I configured without --enable-dbgutil and it doesn't work... So now I'm going to bed, leaving that headache for another day.

Sunday the 9th of May

Since the meeting on Friday (log here) I've built StarMath with debug symbols... I also tried to activate SM_RECT_DEBUG for visual debug information, which is not really useful for my project, and by the way SM_RECT_DEBUG needs to be defined in rect.hxx, not in node.cxx where it is mentioned. I have played a bit with the debugger; I used beaverdbg to get a nice GUI (sorry, I'm too lazy to learn raw gdb)...

Furthermore I've tried to make SmGraphicWindow grab focus and handle KeyInput in view.cxx. Well, now that I had this KeyInput I had to use it for something :) So I wrote a simple method on SmNode for dumping the entire equation tree to a dot graph (for graphviz), and used KeyInput to do the dump. The graphs give a pretty good overview of how the nodes are composed... I've published a few graphs under the notes section. I've also given every SmNode a parent node, which is NULL for the root. This made it easier to make the graph and is probably going to be necessary later anyway...
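A dot dump of this kind needs very little code. The following is a standalone sketch, not the method from the patch: `Node`, `dumpNode` and `dumpToDot` are hypothetical names, the tree is a plain struct instead of SmNode, and labels are written without escaping, so a label containing a double quote would break the output.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a node of the equation tree.
struct Node {
    std::string label;
    std::vector<Node> children;
};

// Write one node and its subtree; returns the id given to the node.
int dumpNode(const Node& node, std::ostream& out, int& counter) {
    int id = counter++;
    out << "  n" << id << " [label=\"" << node.label << "\"];\n";
    for (const Node& child : node.children) {
        int childId = dumpNode(child, out, counter);
        out << "  n" << id << " -> n" << childId << ";\n";
    }
    return id;
}

// Produce a complete graphviz digraph for the tree rooted at `root`.
std::string dumpToDot(const Node& root) {
    std::ostringstream out;
    out << "digraph formula {\n";
    int counter = 0;
    dumpNode(root, out, counter);
    out << "}\n";
    return out.str();
}
```

The returned string can be written to a .gv file and rendered with graphviz, e.g. with the dot command line tool.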

And by the way, writing an int to std::cout or an ordinary file stream doesn't work in StarMath. If you write my_fstream<<"A number: "<<42<<" will not be written"; neither 42 nor anything written to the fstream afterwards will be written. I've heard that std::cerr should work, but that's not useful for dumping information to files. I did a nasty workaround using sprintf...
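A workaround along those lines could look like the sketch below. `intToString` is a hypothetical helper name, and the sketch uses snprintf rather than plain sprintf so that the buffer size is checked; the idea is simply to hand the stream a string instead of an int.

```cpp
#include <cstdio>
#include <string>

// Format an int as text without using the stream's integer operator<<.
std::string intToString(int value) {
    char buffer[32];  // large enough for any 32-bit or 64-bit int
    std::snprintf(buffer, sizeof buffer, "%d", value);
    return std::string(buffer);
}
```

With this helper the earlier example would become my_fstream<<"A number: "<<intToString(42)<<" will be written"; so only strings ever reach the stream.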