WebVTT 0.3 Release : Final

Today our 0.3 release is due for the WebVTT parser. I’ve completed the cue text parsing portion of the parser and it’s sitting on my GitHub repo. The main places you can look for the code I have added are:

Much of what I discussed in my last blog post has stayed the same in my final version of the 0.3 release, including the overall structure of the algorithm. However, I have changed some of the syntax to get rid of minor bugs. I won’t re-post all of that slightly changed code, as it would make this blog post too long. You can either look at the GitHub links for it or check out my earlier blog post.

I’ve removed four parameters from the previous function signature: the webvtt_parser pointer, the line number, the length of the cue text, and the length of node_ptr.

I removed the webvtt_parser pointer and the line number because their purpose was to report an error through the webvtt_parser pointer's error callback and reference the line it happened on, but currently the parser does not support this.

I got rid of the length of the cue text because the cue text should always be a null-terminated string, so we can tell that we are at the end of the line by checking for the terminator. There is no need for the line length.
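To make the point about the terminator concrete, here is a minimal sketch of walking cue text without a length argument. The function name count_tag_openers is made up for this post and is not part of the parser; it only shows the loop shape.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the real parser: counts the '<'
 * characters in a cue text line. The loop stops at the terminating
 * '\0', so the caller never has to pass a length. */
size_t count_tag_openers(const char *cue_text)
{
    size_t count = 0;
    const char *pos;

    assert(cue_text != NULL);

    for (pos = cue_text; *pos != '\0'; pos++) {
        if (*pos == '<') {
            count++;
        }
    }
    return count;
}
```

This only works if every caller really does pass a terminated string, which is the assumption the new signature relies on.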

I got rid of the length of node_ptr because the parser no longer returns an array of node_ptr. It now returns a single node_ptr of type WEBVTT_HEAD_NODE, which contains an array of node_ptr underneath it.

I know we will be changing this in the future, but I got rid of it now to make things clearer.
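As a rough illustration of the slimmed-down signature, here is a sketch in which the parse function takes only the null-terminated cue text and hands back a single head node that owns its children. Every name prefixed with sketch_ or SKETCH_ is a stand-in I made up for this post, and the "parsing" is a placeholder that wraps the whole text in one child node. The real code on GitHub differs.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in node kinds; the real parser has more (and different names). */
typedef enum { SKETCH_HEAD_NODE, SKETCH_TEXT_NODE } sketch_node_kind;

typedef struct sketch_node {
    sketch_node_kind kind;
    const char *text;               /* borrowed pointer, text nodes only */
    struct sketch_node **children;  /* array owned by this node */
    size_t child_count;
} sketch_node;

sketch_node *sketch_create_node(sketch_node_kind kind, const char *text)
{
    sketch_node *node = (sketch_node *)calloc(1, sizeof *node);
    if (node != NULL) {
        node->kind = kind;
        node->text = text;
    }
    return node;
}

/* Appends child to parent's array. Returns 0 on success, -1 on failure. */
int sketch_attach_child(sketch_node *parent, sketch_node *child)
{
    sketch_node **grown = (sketch_node **)realloc(
        parent->children, (parent->child_count + 1) * sizeof *grown);
    if (grown == NULL) {
        return -1;
    }
    grown[parent->child_count++] = child;
    parent->children = grown;
    return 0;
}

/* Frees a node and everything underneath it. */
void sketch_delete_node(sketch_node *node)
{
    size_t i;
    if (node == NULL) {
        return;
    }
    for (i = 0; i < node->child_count; i++) {
        sketch_delete_node(node->children[i]);
    }
    free(node->children);
    free(node);
}

/* No parser pointer, line number, or lengths: just the text in and one
 * head node out. Placeholder logic: non-empty text becomes one child. */
int sketch_parse_cuetext(const char *cue_text, sketch_node **head_out)
{
    sketch_node *head, *child;

    assert(cue_text != NULL && head_out != NULL);

    head = sketch_create_node(SKETCH_HEAD_NODE, NULL);
    if (head == NULL) {
        return -1;
    }
    if (*cue_text != '\0') {
        child = sketch_create_node(SKETCH_TEXT_NODE, cue_text);
        if (child == NULL || sketch_attach_child(head, child) != 0) {
            sketch_delete_node(child);
            sketch_delete_node(head);
            return -1;
        }
    }
    *head_out = head;
    return 0;
}
```

Because the head node carries its own child count, the caller does not need a separate length parameter for the result.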

The other major thing that caitp and I were talking about on IRC last night was the data structure of the nodes. Previously, caitp had it set up so that an internal node and a leaf node would each contain a node, making them effectively subclasses of node. You could then return an array of nodes and cast each one to a particular type of node based on its node kind.

The way I have it set up right now is similar but slightly different. In my version, the node contains a pointer to a leaf node or an internal node, and based on its node kind you can cast that pointer to the matching type.
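The two layouts are easier to compare side by side. The sketch below is my own simplified reconstruction, with made-up type names and a single text field standing in for the real node contents. Layout A is the "subclass" style, where each concrete node embeds the base node as its first member. Layout B is the pointer style, where the node points at separately stored leaf or internal data.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { KIND_INTERNAL, KIND_LEAF } node_kind;

/* Layout A: concrete nodes embed the base node as their first member,
 * so a base_node pointer can be cast to the concrete type. */
typedef struct { node_kind kind; } base_node;

typedef struct {
    base_node base;            /* must stay the first member */
    const char *text;
} leaf_node_a;

typedef struct {
    base_node base;            /* must stay the first member */
    base_node **children;
    size_t child_count;
} internal_node_a;

/* Layout B: the node holds a pointer to separately stored data, and the
 * kind says what that pointer should be cast to. */
typedef struct {
    node_kind kind;
    void *concrete;            /* leaf_data_b or internal_data_b */
} node_b;

typedef struct { const char *text; } leaf_data_b;

typedef struct {
    node_b **children;
    size_t child_count;
} internal_data_b;

/* Returns the leaf text under layout A, or NULL for a non-leaf node. */
const char *leaf_text_a(const base_node *node)
{
    assert(node != NULL);
    if (node->kind != KIND_LEAF) {
        return NULL;
    }
    return ((const leaf_node_a *)node)->text;
}

/* Returns the leaf text under layout B, or NULL for a non-leaf node. */
const char *leaf_text_b(const node_b *node)
{
    assert(node != NULL);
    if (node->kind != KIND_LEAF) {
        return NULL;
    }
    return ((const leaf_data_b *)node->concrete)->text;
}
```

In this simplified form, layout A needs one allocation per node, while layout B needs the node plus its concrete data and carries an extra pointer. Whether that difference matters for the real structs is something we would have to measure.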

caitp made the case for converting it back to the old format as it might be more readable and possibly take up less space in memory. This is something that we should probably discuss in the future.

Some other things to note that we will need to take care of in the future:

I have not had the chance to test the parsing of escape characters, but the code for it is there.

It does not parse the new “lang” tag that was recently added to the W3C specification.

The memory operations in the node, token, and string list structs do not make use of the allocator functions that we have built into the framework.
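For that last item, the fix would look roughly like the sketch below: instead of calling malloc and free directly, the structs go through a pair of allocator hooks. I have not matched this to the framework's actual allocator API. All of the sketch_ names and the hook signatures are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook signatures; the framework's real ones may differ. */
typedef void *(*sketch_alloc_fn)(void *userdata, size_t nbytes);
typedef void (*sketch_free_fn)(void *userdata, void *ptr);

static void *default_alloc(void *userdata, size_t nbytes)
{
    (void)userdata;
    return malloc(nbytes);
}

static void default_free(void *userdata, void *ptr)
{
    (void)userdata;
    free(ptr);
}

static sketch_alloc_fn g_alloc = default_alloc;
static sketch_free_fn g_free = default_free;
static void *g_userdata = NULL;

/* Installs custom hooks; passing NULL falls back to malloc/free. */
void sketch_set_allocator(sketch_alloc_fn alloc, sketch_free_fn release,
                          void *userdata)
{
    g_alloc = alloc ? alloc : default_alloc;
    g_free = release ? release : default_free;
    g_userdata = userdata;
}

void *sketch_alloc(size_t nbytes)
{
    return g_alloc(g_userdata, nbytes);
}

void sketch_free(void *ptr)
{
    if (ptr != NULL) {
        g_free(g_userdata, ptr);
    }
}

/* A string copy that goes through the hooks instead of calling malloc. */
char *sketch_string_copy(const char *text)
{
    size_t len;
    char *copy;

    assert(text != NULL);

    len = strlen(text);
    copy = (char *)sketch_alloc(len + 1);
    if (copy != NULL) {
        memcpy(copy, text, len + 1);  /* includes the terminator */
    }
    return copy;
}

/* Example custom hooks: count allocations via a size_t in userdata. */
void *sketch_counting_alloc(void *userdata, size_t nbytes)
{
    size_t *count = (size_t *)userdata;
    (*count)++;
    return malloc(nbytes);
}

void sketch_counting_free(void *userdata, void *ptr)
{
    (void)userdata;
    free(ptr);
}
```

The node, token, and string list code would then call sketch_alloc and sketch_free (or whatever the framework's equivalents are) everywhere it currently allocates or frees memory directly.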