
I've finally decided to migrate my Expression Templates Library (ETL) project to
C++17. I've been talking about doing that for a long time and I've shipped several
releases without making the change, but the next version will be a C++17 library.
The reason I didn't want to rush the change is that it means the library needs
a very recent compiler that may not be available to everybody. Indeed, after this
change, the ETL library now needs at least GCC 7.1 or Clang 4.0.

This post will only cover the C++17 features I'm using in the updated ETL
library; I won't cover all of the new C++17 features.

if constexpr

The most exciting new thing in C++17 for me is the if constexpr statement. In
essence, it's a normal if statement, but with one very important difference: the
branch that is not taken (the else branch if the condition is true, or the if
constexpr branch if the condition is false) is discarded. And what is
interesting is what happens to discarded statements:

1. The body of a discarded statement does not participate in return type
deduction.

2. The discarded statement is not instantiated.

3. The discarded statement can odr-use a variable that is not defined.

Personally, I'm especially interested in points 1 and 2. Let's start with an
example where point 1 is useful. In ETL, I have a make_temporary function. This
function either forwards an ETL container or creates a temporary container from
an ETL expression, based on a compile-time traits. The return type of the
function is not the same in both cases. Before C++17, what you did in this
situation was use SFINAE and write two functions:
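A minimal sketch of the pre-C++17 approach. The types and traits here (container, expression, temporary, is_container) are simplified hypothetical stand-ins for illustration, not ETL's real API:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical simplified stand-ins for ETL's types and traits
struct container  { int value = 42; };
struct expression { int value = 42; };
struct temporary  {
    int value;
    explicit temporary(const expression& e) : value(e.value) {}
};

template <typename T> struct is_container : std::false_type {};
template <> struct is_container<container> : std::true_type {};

// Pre-C++17: two overloads selected with SFINAE, each with its own return type
template <typename E, std::enable_if_t<is_container<std::decay_t<E>>::value, int> = 0>
decltype(auto) make_temporary(E&& value) {
    return std::forward<E>(value); // forward the container as-is
}

template <typename E, std::enable_if_t<!is_container<std::decay_t<E>>::value, int> = 0>
temporary make_temporary(E&& value) {
    return temporary(value); // force the evaluation into a temporary
}
```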

One version of the function will forward and the other version will force
a temporary, and the return types can differ since these are two different
functions. This is not bad, but it still requires two functions where you only
want to write one. However, in C++17, we can do much better using if constexpr:
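A sketch of the same function written with if constexpr, using the same kind of simplified hypothetical stand-ins (not ETL's real types). Since the discarded branch does not participate in return type deduction, a single function with two different "return types" compiles fine:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical simplified stand-ins for ETL's types and traits
struct container  { int value = 42; };
struct expression { int value = 42; };
struct temporary  {
    int value;
    explicit temporary(const expression& e) : value(e.value) {}
};

template <typename T> struct is_container : std::false_type {};
template <> struct is_container<container> : std::true_type {};

// C++17: a single function; the discarded branch does not
// participate in return type deduction (point 1)
template <typename E>
decltype(auto) make_temporary(E&& value) {
    if constexpr (is_container<std::decay_t<E>>::value) {
        return std::forward<E>(value); // forward the container as-is
    } else {
        return temporary(value);       // force the evaluation into a temporary
    }
}
```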

I think this version is really superior to the previous one. We only have one
function and the logic is much clearer!

Let's now see an advantage of point 2. In ETL, there are two kinds of matrices:
matrices with compile-time dimensions (fast matrices) and matrices with runtime
dimensions (dynamic matrices). When they are used, for instance in
a matrix-matrix multiplication, I use static assertions for fast matrices and
runtime assertions for dynamic matrices. Here is an example for the validation
of the matrix-matrix multiplication:
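A simplified sketch of what the two SFINAE versions look like. The matrix types and traits here (fast_matrix, dyn_matrix, is_fast, check_mmul) are hypothetical stand-ins, not ETL's real ones:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

#define cpp_unused(x) ((void)(x)) // same role as ETL's cpp_unused

// Hypothetical simplified stand-ins for ETL's matrix types
template <std::size_t R, std::size_t C>
struct fast_matrix {
    static constexpr std::size_t rows    = R;
    static constexpr std::size_t columns = C;
};

struct dyn_matrix {
    std::size_t rows;
    std::size_t columns;
};

template <typename T> struct is_fast : std::false_type {};
template <std::size_t R, std::size_t C> struct is_fast<fast_matrix<R, C>> : std::true_type {};

// Static version: dimensions are checked at compile time
template <typename A, typename B, typename C, std::enable_if_t<is_fast<A>::value, int> = 0>
void check_mmul(const A& a, const B& b, const C& c) {
    static_assert(A::columns == B::rows, "Invalid sizes for matrix multiplication");
    static_assert(A::rows == C::rows && B::columns == C::columns, "Invalid sizes for matrix multiplication");
    cpp_unused(a); cpp_unused(b); cpp_unused(c);
}

// Dynamic version: dimensions are checked at runtime
template <typename A, typename B, typename C, std::enable_if_t<!is_fast<A>::value, int> = 0>
void check_mmul(const A& a, const B& b, const C& c) {
    assert(a.columns == b.rows);
    assert(a.rows == c.rows && b.columns == c.columns);
    cpp_unused(a); cpp_unused(b); cpp_unused(c);
}
```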

Again, we use SFINAE to distinguish the two cases. Here, we cannot use a normal
if since the values of the dimensions cannot be obtained at compile-time for
dynamic matrices; more precisely, some templates cannot be instantiated for
dynamic matrices. As for the cpp_unused calls, we need them in the static
version because the parameters are not used there, and in the dynamic version
because they won't be used if the assertions are not enabled. Let's use if
constexpr to avoid having two functions:
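A sketch with if constexpr, again using hypothetical simplified stand-ins for ETL's matrix types and traits:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

#define cpp_unused(x) ((void)(x)) // same role as ETL's cpp_unused

// Hypothetical simplified stand-ins for ETL's matrix types
template <std::size_t R, std::size_t C>
struct fast_matrix {
    static constexpr std::size_t rows    = R;
    static constexpr std::size_t columns = C;
};

struct dyn_matrix {
    std::size_t rows;
    std::size_t columns;
};

template <typename T> struct is_fast : std::false_type {};
template <std::size_t R, std::size_t C> struct is_fast<fast_matrix<R, C>> : std::true_type {};

// A single function: the discarded branch is not instantiated (point 2),
// so A::columns is never accessed for dynamic matrices
template <typename A, typename B, typename C>
void check_mmul(const A& a, const B& b, const C& c) {
    if constexpr (is_fast<A>::value) {
        static_assert(A::columns == B::rows, "Invalid sizes for matrix multiplication");
        static_assert(A::rows == C::rows && B::columns == C::columns, "Invalid sizes for matrix multiplication");
        cpp_unused(a); cpp_unused(b); cpp_unused(c);
    } else {
        assert(a.columns == b.rows);
        assert(a.rows == c.rows && b.columns == c.columns);
        cpp_unused(a); cpp_unused(b); cpp_unused(c);
    }
}
```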

Since the discarded statement won't be instantiated, we can now use a single
function! We also avoid duplicating the first static assertion and the unused
statements. Pretty great, right? But we can do even better with C++17. Indeed,
it added a nice new attribute, [[maybe_unused]]. Let's see what this gives
us:
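The same sketch with [[maybe_unused]] instead of the cpp_unused calls (still hypothetical simplified stand-ins, not ETL's real code):

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical simplified stand-ins for ETL's matrix types
template <std::size_t R, std::size_t C>
struct fast_matrix {
    static constexpr std::size_t rows    = R;
    static constexpr std::size_t columns = C;
};

struct dyn_matrix {
    std::size_t rows;
    std::size_t columns;
};

template <typename T> struct is_fast : std::false_type {};
template <std::size_t R, std::size_t C> struct is_fast<fast_matrix<R, C>> : std::true_type {};

// [[maybe_unused]] silences warnings when a branch ignores the parameters
template <typename A, typename B, typename C>
void check_mmul([[maybe_unused]] const A& a,
                [[maybe_unused]] const B& b,
                [[maybe_unused]] const C& c) {
    if constexpr (is_fast<A>::value) {
        static_assert(A::columns == B::rows, "Invalid sizes for matrix multiplication");
        static_assert(A::rows == C::rows && B::columns == C::columns, "Invalid sizes for matrix multiplication");
    } else {
        assert(a.columns == b.rows);
        assert(a.rows == c.rows && b.columns == c.columns);
    }
}
```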

No more need for the cpp_unused trick :) This attribute tells the compiler that
a variable or parameter may sometimes be unused and therefore should not lead to
a warning. The only thing that is not great with this attribute is that it's
long: 16 characters. It almost doubles the width of my check function signature.
Imagine if you have more parameters; you'll soon have to use several lines.
I wish there was a way to set an attribute for all parameters together, or
a shortcut. I'm considering using a short macro in place of it, but haven't
decided yet.

Just a note: if you have else if statements, you need to mark them as
constexpr as well! This was a bit weird for me at first, but you can think of it
as: if the condition is constexpr, then the if (or else if) is constexpr as
well.
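For instance, a small illustrative sketch (describe is a hypothetical function, not from ETL):

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Every chained branch needs its own constexpr: "else if constexpr"
template <typename T>
std::string describe(const T&) {
    if constexpr (std::is_integral_v<T>) {
        return "integral";
    } else if constexpr (std::is_floating_point_v<T>) {
        return "floating point";
    } else {
        return "something else";
    }
}
```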

Overall, I'm really satisfied with the new if constexpr! This really makes the
code much nicer in many cases, especially if you abuse metaprogramming like
I do.

You may remember that I coded a version of static if with C++14 in the past. It was able to solve point 2, but not point 1, and was much uglier. Now we have a good solution. I've replaced two of these in the current code with the new if constexpr.

Fold expressions

Another great C++17 feature is fold expressions. In ETL, a fast matrix must
compute its size, the product of its compile-time dimensions Dims, at
compilation time. Before C++17, the only way to compute this result at
compilation time was to use template recursion, either with types or with
constexpr functions. I think this is pretty heavy just for computing a product.
Now, with fold expressions, we can manipulate the parameter pack directly and
rewrite our size function:

static constexpr size_t size() { return (Dims * ...); }

This is much better! It clearly states that the values of the parameter pack
should be multiplied together. For instance, 1,2,3 will become (1*2)*3.
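For comparison, the pre-C++17 recursive version looked roughly like this (a simplified sketch with hypothetical names, not ETL's actual code):

```cpp
#include <cassert>
#include <cstddef>

// Recursive computation of the product of a parameter pack
template <std::size_t... Dims>
struct mul_all;

template <std::size_t D>
struct mul_all<D> {
    static constexpr std::size_t value = D;
};

template <std::size_t D, std::size_t... Dims>
struct mul_all<D, Dims...> {
    static constexpr std::size_t value = D * mul_all<Dims...>::value;
};

// Hypothetical simplified fast matrix using the recursion for its size
template <std::size_t... Dims>
struct fast_matrix {
    static constexpr std::size_t size() {
        return mul_all<Dims...>::value;
    }
};
```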

Another place where I was using template recursion was a traits that tests
whether a set of booleans are all true at compilation time:
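A sketch of such a traits with a fold expression (all_true is a simplified hypothetical name):

```cpp
#include <cassert>
#include <type_traits>

// All values of the pack are combined with && in a single fold expression
template <bool... B>
struct all_true : std::bool_constant<(B && ...)> {};
```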

I think using fold expressions results in much clearer syntax and better code
and it's a pretty nice feature overall :)

As a note, you can also use this syntax to call a function on each argument that
you have, which makes for much nicer syntax as well; I'll be using that in DLL
once I migrate it to C++17.
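For instance, a fold over the comma operator can apply a function to every argument in order (a small sketch, for_each_arg is a hypothetical helper, not DLL code):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Calls f on each argument, left to right, using a fold over the comma operator
template <typename F, typename... Args>
void for_each_arg(F&& f, Args&&... args) {
    (f(std::forward<Args>(args)), ...);
}
```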

Miscellaneous

There are also a few more C++17 features that I've used to improve ETL, but that
have a bit less impact.

A very nice feature of C++17 is the support for structured bindings. Often you
end up with a function that returns several pieces of information in the form of
a pair, a tuple, or even a fixed-size array. You can use a dedicated object for
this, but if you don't, you end up with code that is not terribly nice:
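For instance, with a hypothetical my_function returning a tuple, the pre-C++17 unpacking looks like this (types and values are just illustrative):

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Hypothetical function returning several pieces of information at once
std::tuple<std::size_t, bool, double> my_function() {
    return {3, true, 0.25};
}

// Pre-C++17: unpack the tuple by hand
void use_result() {
    auto results = my_function();

    auto index  = std::get<0>(results);
    auto result = std::get<1>(results);
    auto alpha  = std::get<2>(results);

    assert(index == 3 && result && alpha == 0.25);
}
```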

It's not terribly bad, but in these cases, you would hope for something better.
With C++17, you can do better:

auto [index, result, alpha] = my_function();

Now you can directly use auto to deduce the types of the three variables at once
and get all the results into the variables at once as well :) I think this is
really nice and can really benefit some projects. In ETL, I have almost no use
for this, but I'm going to use it a bit more in DLL.

Something really nice for cleaning up code in C++17 is the ability to declare
nested namespaces in one line. Before, if you had a nested namespace such as
etl::impl::standard, you would write:

namespace etl {
namespace impl {
namespace standard {

// Something inside etl::impl::standard

} // end of namespace standard
} // end of namespace impl
} // end of namespace etl
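With C++17, the same namespace can be opened in a single declaration (the answer function here is just a hypothetical placeholder to have something inside):

```cpp
// C++17 nested namespace declaration in one line
namespace etl::impl::standard {

// Something inside etl::impl::standard
constexpr int answer() { return 42; }

} // end of namespace etl::impl::standard
```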

Another very small change is the ability to use the typename keyword in place of
the class keyword when declaring template template parameters. Before, you had
to write:

template <template <typename> class X>

now you can also use:

template <template <typename> typename X>

It's just some syntactic sugar, but I think it's quite nice.

The last improvement I want to talk about is one that probably very few people
know about, but it's pretty neat. Since C++11, you can use the alignas(X)
specifier on types and objects to specify on how many bytes you want them
aligned. This works well for objects on the stack. However, it won't always work
for dynamic memory allocation. Imagine this struct:

struct alignas(128) test_struct { char data; };

If you declare an object of this type on the stack, you have the guarantee that
it will be aligned on 128 bytes. However, if you use new to allocate it on the
heap, you have no such guarantee. The problem is that 128 is greater than the
maximum default alignment; such a type is called an over-aligned type. In that
case, the result will only be aligned on the maximum default alignment of your
system. Since C++17, new supports aligned dynamic memory allocation of
over-aligned types. Therefore, you can use a simple alignas to allocate dynamic
over-aligned types :) I need this in ETL for matrices that must be aligned for
vectorized code. Before, I was allocating a larger array with some padding in
order to find an aligned element inside, which was not very nice; now the code
is much better.
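A small check of this behavior, assuming the compiler is in C++17 mode (heap_allocation_is_aligned is just an illustrative helper):

```cpp
#include <cassert>
#include <cstdint>

// An over-aligned type: 128 is larger than the default maximum alignment
struct alignas(128) test_struct {
    char data;
};

bool heap_allocation_is_aligned() {
    auto* p = new test_struct(); // C++17: calls the alignment-aware operator new
    bool aligned = reinterpret_cast<std::uintptr_t>(p) % 128 == 0;
    delete p;
    return aligned;
}
```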

Compilation Time

I've done a few tests to see how much impact these new features have on
compilation time. Here, I'm benchmarking the compilation of the entire test
suite in different compilation modes, with most compilation options enabled (all
GPU and BLAS options, in order to make sure almost all of the library is
compiled).

Since I'm a bit short on time before going on vacation, I've only gathered
results with g++. Here are the results with G++ 7.2.0:

             debug    release    release_debug
C++14        862s     1961s      1718s
C++17        892s     2018s      1745s
Difference   +3.4%    +2.9%      +1.5%

Overall, I'm a bit disappointed by these results: it's around 3% slower to
compile the C++17 version than the C++14 version. I was expecting it to be at
least as fast to compile as before. It seems that, currently with G++ 7.2, if
constexpr is slower to compile than the equivalent SFINAE functions. I didn't
benchmark each migrated feature individually, so it may not be coming from if
constexpr, but since it's by far the biggest change, it's the most likely
candidate. Once I have a little more time, after my vacation, I'll check whether
this is also the case with clang.

Keep in mind that we are compiling the test suite here. The ETL test suite uses
the manual selection mode of the library in order to be able to test all the
possible implementations of each operation. This makes a considerable difference
in compilation time. I expect better compilation times when the library is used
in automatic selection mode (the default mode), where a lot more code can be
disabled with if constexpr. I will test this next with the DLL library, which
I will also migrate to C++17.

Conclusion

This concludes this report on the migration of my ETL library from C++14 to
C++17. Overall, I'm really satisfied with the improvement of the code; it's much
better. I'm a bit disappointed by the slight increase (around 3%) in compilation
time, but it's not dramatic either. I'm still hoping that once it's used in DLL,
I will see a decrease in compilation time, but we'll see that when I'm done with
the migration of DLL to C++17, which may take some time since I'll have two
weeks of vacation in China starting Friday.

The new version is available through the master branch only. It will probably be
released as version 1.3 once I integrate some new features, but the migration in
itself will not be released as a new version. You can take a look at the
Github etl repository if you are interested.

I'm happy to announce the release of budgetwarrior 1.0. This is a major change
over the previous version.

Web Interface

Until now, budgetwarrior could only be used from the command line. This is fine
for me, but not for everybody. Since I wanted to share my budget with my
girlfriend, I needed something less nerdy ;)

Therefore, I added support for a web interface to budgetwarrior. Every feature
of the console application is now available in the web version. Moreover, since
the web version offers slightly better graphical capabilities, I added a few
more graphs and somewhat more information in some places. I'm not nearly an
expert in web interfaces, but I think I managed to put something not too bad
together. There are still some things to improve, which I'll go through in the
future, but so far the web interface is pretty satisfying, and it is mobile
friendly!

The web server is coded in C++ (who would have guessed...) and is embedded in
the application; you need to use the server command to start it:

budget server

and the server will be launched (by default at localhost:8080). You can
configure the port with server_port=X in the configuration file and the
listen address with server_listen=X. You can access your server at
http://localhost:8080.
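For instance, the relevant part of a configuration file could look like this (the values here are just illustrative):

```
server_port=8080
server_listen=0.0.0.0
```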

Here is what this will display:

Note: All the data is randomized

The main page shows your assets, the current net worth, your monthly cash-flow
and the state of your objectives.

The menu will give you access to all the features of the application. You can
add expenses and earnings, see reports, manage your assets and your objectives
and so on. Basically, you can do everything you did in the application, but you
have access to more visualization tools than you would on the console. For
instance, you can access your fortune over time:

or see how your portfolio does in terms of currency:

Normally, unless I forgot something (in which case, I'll fix it), everything
should be doable from the web interface. This is simply easier for people who
are not as comfortable with the console as I am ;)

Data management is still the same: the server writes to the same files the base
application uses. Therefore, you cannot use the server and the command line
application on the same machine at the same time. Nevertheless, if the server is
not running, you can still use the command line application. This can be useful
if you want the web visualization while still managing the budget from the
command line.

The default user and password is admin:1234, but you can of course change them
using web_user and web_password in the configuration. You can also disable the
authentication entirely, if you are sure of yourself, with the server_secure
option in the configuration.

Currently, the server does not protect against concurrent modifications of the
same data. This is very unlikely to be a problem with only a few people using
the application, but I plan to improve that in the future.

Server mode

Although it's not possible to use both the server and the command line
application at the same time, it's possible to use the command line application
in server mode. In this case, instead of reading and writing the data from the
hard disk, the application will send requests to the server to read and write
the data it needs. With this, you can use both the server and the command line
application at the same time!

While running, the server exposes a simple API that can be used to get
all the information about the budget data and that can also be used to add new
expenses, earnings and so on directly to the server data. The API is also
protected by authentication.

Currently, the server does not support HTTPS. However, you can run it behind
a proxy such as nginx running with HTTPS; this is what I'm doing. The server
mode supports SSL from the client to the server: you just have to set
server_ssl=true in the configuration.

This is the mode I'm currently using and will continue using. With this, I can
quickly do some modifications using the command line and if I want to see
advanced visualization, I just have to open my browser and everything is
updated. Moreover, in the future, other people involved with my budget will be
able to access the web interface. This also solves the synchronization problem
in a much better way than before.

Just as it was the case with the server, this is not made to be used in parallel
by different users. This should be perfectly fine for a small household.

Assets Tracking

A few months ago, I added the ability to track assets in budgetwarrior. You can define the list of the assets you possess, and the tool will then help you track the value of your assets. You can set your desired distribution of bonds, cash and stocks, and the tool will help you see if you need to rebalance your assets. It also lets you compute your net worth, with budget asset value:

Moreover, you can also mark a few of your assets as your portfolio assets. These
assets have a desired distribution and are handled differently: they are the
assets you directly manage yourself, your investment portfolio. You can then
track their value and see if they need rebalancing. For instance, here is
a randomized rebalancing of a portfolio, with budget asset rebalance:

All these features are now also available on the web version as well.

Better console usability

A few months ago, I added some quality-of-life improvements to the console application. You can now cycle through the list of possible values, for accounts for instance, with the UP and DOWN keys. Now, I have also added auto-completion with the TAB key. You can type Ins<TAB> and it will complete to Insurances if you have an Insurances account in your budget. This makes it much faster to enter new expenses or to update asset values.

Installation

If you are on Gentoo, you can install it using layman:

layman -a wichtounet
emerge -a budgetwarrior

If you are on Arch Linux, you can use this AUR repository:
https://github.com/StreakyCobra/aur (wait a few days for the new version to be
updated).

Conclusion

Overall, even though I'm not a fan of web development, it was quite fun to add
all these features to budgetwarrior, and I think they made it much better. This
is a very significant change to the project, since it almost doubled the number
of source lines of code, but I think it was a change that was needed.

I think these changes really make budgetwarrior more useful to a wider group of
people, and I'm pretty happy to have finally come around and implemented them.
I still have a few things I plan to improve in the near future. First, I want to
make the website a bit faster: many scripts and stylesheets are being loaded,
which makes the site a bit bloated. I'll also enable gzip compression of the
website to speed things up. I will also ensure that the server can handle
concurrent requests without corrupting the data (this should be simple since we
don't need high performance). I may also add a new module to budgetwarrior to
track your progress towards retirement, if that is something you are interested
in, but I haven't decided in what form exactly. Finally, I will also try to
optimize the requests made between the server and the client when run in server
mode. Indeed, it currently downloads almost all the data from the server, which
is far from optimal.

If you are interested by the sources, you can download them on Github:
budgetwarrior.

If you have a suggestion or you found a bug, please post an issue on Github.

If you have any comment, don't hesitate to contact me, either by leaving
a comment on this post or by email.

I should have done this earlier, but it slipped my mind, so here it is!

My thesis (Deep Learning Feature Extraction for Image Processing) is now
available to download. Here is the abstract of the thesis:

In this thesis, we propose to use methodologies that automatically learn how to
extract relevant features from images. We are especially interested in
evaluating how these features compare against handcrafted features. More
precisely, we are interested in the unsupervised training that is used for the
Restricted Boltzmann Machine (RBM) and Convolutional RBM (CRBM) models. These
models relaunched the Deep Learning interest of the last decade. During the time
of this thesis, the auto-encoders approach, especially Convolutional
Auto-Encoders (CAE) have been used more and more. Therefore, one objective of
this thesis is also to compare the CRBM approach with the CAE approach.

The scope of this work is defined by several machine learning tasks. The first
one, handwritten digit recognition, is analysed to see how much the unsupervised
pretraining technique introduced with the Deep Belief Network (DBN) model
improves the training of neural networks. The second, detection and recognition
of Sudoku in images, is evaluating the efficiency of DBN and Convolutional DBN
(CDBN) models for classification of images of poor quality. Finally, features
are learned fully unsupervised from images for a keyword spotting task and are
compared against well-known handcrafted features. Moreover, the thesis was also
oriented around a software engineering axis. Indeed, a complete machine learning
framework was developed during this thesis to explore possible optimizations and
possible algorithms in order to train the tested models as fast as possible.

It has been a while since I've posted on this blog. I've had to serve three
weeks in the army and then I had two weeks vacation. I've been actively working
on budgetwarrior with a brand new web interface! More on that later ;)

Today, I'm happy to release the version 1.2.1 of my Expression Templates Library
(ETL) project. This is a minor version but with significantly better GPU support
and a few new features and bug fixes so I decided to release it now.

Faster GPU support

The biggest improvement in this release is that ETL now detects certain patterns
of expressions and computes each of them with a single CUDA kernel. This
significantly reduces the number of CUDA kernel calls being launched. For
instance, each of the following expressions will be evaluated using a single GPU
kernel:

yy = 1.1 * x + y
yy = x + 1.1 * y
yy = 1.1 * y + 1.2 * y
yy = 1.1 * x * y
yy = x / (1.1 * y)

This makes some operations significantly faster.

Moreover, I've greatly reduced the number of device synchronizations in the
library. In particular, I've removed almost all synchronization from the
etl-gpu-blas sub-library. This means that synchronization is mostly only done
when data needs to go back to the CPU; for machine learning, that means at the
end of the epoch, to compute the final error. This makes a HUGE difference in
time; I didn't realize before that I was doing way too much synchronization.

With these two changes, I've been able to attain state of the art training performance on GPU with my Deep Learning Library (DLL) project!

Moreover, I've now added support for random number generation on the GPU and for
shuffle operations as well.

New Features

I've also added a few new features recently. They were especially added to
support new features in DLL.

Matrices and vectors can now be normalized in order to have a zero-mean and
unit-variance distribution. You can also merge matrices together. For now, there
is no GPU support for these operations, so they fall back to the CPU. I plan to
fix that later.

In addition to the bias_batch_mean I added before, I have now also added
bias_batch_var, with the variance in place of the mean. This is mainly used for
Batch Normalization in machine learning, but it may have other usages. GPU
support has been added directly as well.

The last feature is support for embeddings and the gradients of embeddings.
Again, this is totally related to machine learning, but it can be very useful as
well. I haven't had the time to develop the GPU version so far, but this will
come as well.

Performance

Nothing fancy on the CPU performance side; I only added vectorization for the
hyperbolic functions. This makes tanh much faster on CPU.

Bug Fixes

I fixed quite a few bugs in this version, which is one of the main reasons
I released it:

1. When using a large fast_matrix and aliasing was detected, there was a big chance of a stack overflow occurring. This is now fixed by using a dynamic temporary.
2. Some assignables such as sub_view did not perform any detection of aliasing. This is now fixed and aliasing is detected everywhere.
3. fast_dyn_matrix can now be correctly used with bool.
4. The use of iterators was not always ensuring correct CPU/GPU consistency. This is now correctly handled.
5. The 4D convolutions on GPU were not using the correct flipping.
6. Fixed a small compilation bug with sub_matrix and GPU.

What's next?

I don't really know yet what will be in the next release, which should be
release 1.3. One possible idea would be to improve and review the support for
sparse matrices, which is more than poor as of now. But I'm not really motivated
to work on that :P Moreover, I'm now actively working on the next release of
budgetwarrior, which will probably still come this month.

I'm also still hesitating about switching to C++17 for the library, to make it
faster to compile and to clean up some parts of the code. I would be able to
remove quite a lot of SFINAE with the new if constexpr, but I'm afraid this
would make the library too difficult to use, since it would require at least GCC
7 or clang 3.9.

Download ETL

You can download ETL on Github. If you are only interested in the 1.2.1 version,
you can look at the Releases pages or clone the tag 1.2.1. There are several
branches:

master is the eternal development branch, and may not always be stable

stable is a branch always pointing to the last tag; no development happens there

For future releases, there will always be tags pointing to the corresponding
commits. You can also access previous releases on Github or via the release
tags.

The documentation is still a bit sparse. There are a few examples and the Wiki,
but there is still work to be done. If you have questions on how to use or
configure the library, please don't hesitate to ask.

Don't hesitate to comment on this post if you have any comment or question about
this library. You can also open an Issue on Github if you have a problem using
the library, or propose a Pull Request if you have a contribution you'd like to
make.

The GPU performance of my Expression Templates Library (ETL) is pretty good when
most of the time is spent inside expensive operations such as matrix-matrix
multiplication or convolutions. However, when most of the time is spent in
linear kernels, performance is not as good, because this invokes a lot of CUDA
kernels. Indeed, the way it works is that each sub-expression computes its
result into a temporary GPU vector (or matrix) and these temporaries are passed
up through the expression tree. For instance, this expression:

yy = 1.1 * x + 1.2 * y

will be executed on the GPU as something like this:

t1 = 1.1 * x
t2 = 1.2 * y
yy = t1 + t2

which results in three GPU kernels being invoked. In the CPU case, the complete
expression is executed as one CPU kernel, constructed with Expression Templates.
Unfortunately, a CUDA kernel cannot be constructed in the same way, since the
CUDA compiler does not support general template metaprogramming. That's why I've
implemented this using small kernels for each sub-expression.

Fortunately, we can do better with a bit more metaprogramming. Indeed, there are
some patterns that are repeated a lot and that can easily be implemented as CUDA
kernels. I've started detecting a few of these patterns, and for each of them
a single CUDA kernel is executed. For instance, each of the following
expressions can be executed with a single kernel:

yy = 1.1 * x + y
yy = x + 1.1 * y
yy = 1.1 * y + 1.2 * y
yy = 1.1 * x * y
yy = x / (1.1 * y)

This results in a significant performance improvement for these expressions!

I have tested these new improvements in my Deep Learning Library (DLL) project
(not yet merged) and it resulted in 25% faster momentum computation and
17% faster Nesterov Adam (NADAM).

I'm going to continue investigating which kernels need to be made faster for DLL
and try to improve the overall performance. Currently, the GPU performance of
DLL is very good for large convolutional networks, but could be improved for
small fully-connected networks. Indeed, in that case, quite some time is spent
outside the matrix-matrix multiplication, inside serial expressions for which
the GPU usage could be improved. Once I'm done with my optimizations, I'll
probably post again on the blog with the latest results.

All these new optimizations are now in the master branch of the ETL
project if you want to check it out. You can access the project
on Github.

It's nothing fancy yet, but forward propagation of LSTM and basic
Backpropagation Through Time (BPTT) are now supported. It was not really
complicated to implement the forward pass, but the backward pass is much more
complicated for an LSTM than for an RNN. It took me quite a long time to figure
out all the gradient formulas, and the documentation on that is quite scarce.

For now, only the existing classification loss is supported for RNN and LSTM.
As I said last time, I still plan to add support for a sequence-to-sequence loss
in order to be able to train models that generate characters. However, I don't
know when I'll be able to work on that. Now that I've got the code for LSTM,
I believe I should be able to implement a GRU cell and a NAS cell quite easily.

For example, here is a simple LSTM used on MNIST for classification:

#include "dll/neural/dense_layer.hpp"
#include "dll/neural/lstm_layer.hpp"
#include "dll/neural/recurrent_last_layer.hpp"
#include "dll/network.hpp"
#include "dll/datasets.hpp"

int main(int /*argc*/, char* /*argv*/ []) {
    // Load the dataset
    auto dataset = dll::make_mnist_dataset_nc(dll::batch_size<100>{}, dll::scale_pre<255>{});

    constexpr size_t time_steps      = 28;
    constexpr size_t sequence_length = 28;
    constexpr size_t hidden_units    = 100;

    // Build the network
    using network_t = dll::dyn_network_desc<
        dll::network_layers<
            dll::lstm_layer<time_steps, sequence_length, hidden_units, dll::last_only>,
            dll::recurrent_last_layer<time_steps, hidden_units>,
            dll::dense_layer<hidden_units, 10, dll::softmax>
        >
        , dll::updater<dll::updater_type::ADAM> // Adam
        , dll::batch_size<100>                  // The mini-batch size
    >::network_t;

    auto net = std::make_unique<network_t>();

    // Display the network and dataset
    net->display();
    dataset.display();

    // Train the network for performance sake
    net->fine_tune(dataset.train(), 50);

    // Test the network on test set
    net->evaluate(dataset.test());

    return 0;
}

The network is quite similar to the one used previously with an RNN: just
replace rnn with lstm and that's it. It starts with an LSTM layer, followed by
a layer extracting the last time step, and finally a dense layer with a softmax
activation. The network is trained with Adam for 50 epochs. You can change the
activation function, the initializer for the weights and the biases, and the
number of steps for BPTT truncation.

Currently, I'm working on GPU performance again. The performance of some
operations is still not as good as I want it to be, especially for complex
operations like those used in Adam and Nadam. There are currently many calls to
GPU BLAS libraries, and I want to try to extract some more optimized patterns.
Once that's done, I'll post more about it on the blog.

I've improved a lot the display of my Deep Learning Library (DLL). I know this
is generally not the most important point in a machine learning framework, but
first impressions are important. Therefore, I decided it was time to get a nicer
output in the console when training networks.

A network or a dataset can be displayed using the display() function.
I've added a display_pretty() function to them to display it more
nicely. I've also added the dll::dump_timers_nice() function to do the
same for dll::dump_timers().

I've also improved the display of the results of the batches during training.
Now, the display is updated every 100ms, and it also shows the current estimated
time until the end of the epoch. With that, the user should have a much better
idea of what's going on during training, especially when training networks whose
epochs take a long time to complete.

Here is a full output of the training of a fully-connected network on MNIST
(mnist_mlp.cpp: https://github.com/wichtounet/dll/blob/master/examples/src/mnist_mlp.cpp):

During the first years of my thesis, I worked on a CTI research project with the
American company Verisign, which also has an office near my school. A CTI
research project is a project partially funded by the Commission for Technology
and Innovation (CTI), in which a school and a company work together. I was quite
lucky to work on this project with the awesome people at Verisign Fribourg.
After the success of the project, Verisign filed several patents regarding
various parts of the project.

I'm quite happy that these four patents are now approved and published. They
have been approved by both the United States Patent and Trademark Office
(USPTO) and the European Patent Office (EPO). The patents have been claimed by
Verisign; I'm only one of the inventors and have no claim on the patents. But
it's still a great thing.

Here are the names of the four patents:

Systems and methods for automatic phonetization of domain names

Construction of phonetic representation of a string of characters

Method for writing a foreign language in a pseudo language phonetically resembling native language of the speaker

Construction of a phonetic representation of a generated string of characters

You can take a look at them on the USPTO or EPO websites, or on Google
Patents, but the way a patent is written makes it relatively hard to follow;
it's more at a lawyer's level, or maybe I'm simply not used to patents
anymore.

All these patents come from the research done during the CTI project with
Verisign. In this project, name suggestions were generated from the phonetic
sound of the name. The idea is to generate names that sound the same as
a given input (airmix could become rmix or rmics). We used various
technologies to make this work: IG-Tree, Viterbi and HMM. And since we used
a model with an encoder and a decoder, we can also mix languages, for instance
writing an English word the way it would be written in French (school could
become scoule).

These patents conclude a very interesting and successful project. I'm now
working on yet another CTI research project with Verisign and it will surely
be as successful as the first one.

I'm happy to announce that I just merged support for Recurrent Neural Networks
(RNNs) into my Deep Learning Library (DLL) machine learning framework.

It's nothing fancy yet, but forward propagation of RNNs and basic
Backpropagation Through Time (BPTT) are now supported. For now, only the
existing classification loss is supported for RNNs. I plan to add support for
a sequence-to-sequence loss in order to be able to train models that can
generate characters, but I don't know when I'll be able to work on that.
I also plan to add support for other types of cells, such as LSTM and GRU
(maybe NAS), in the future.

For example, here is a simple RNN used on MNIST:

#include "dll/neural/dense_layer.hpp"
#include "dll/neural/recurrent_layer.hpp"
#include "dll/neural/recurrent_last_layer.hpp"
#include "dll/network.hpp"
#include "dll/datasets.hpp"

int main(int /*argc*/, char* /*argv*/ []) {
    // Load the dataset
    auto dataset = dll::make_mnist_dataset_nc(dll::batch_size<100>{}, dll::scale_pre<255>{});

    constexpr size_t time_steps      = 28;
    constexpr size_t sequence_length = 28;
    constexpr size_t hidden_units    = 100;

    // Build the network
    using network_t = dll::dyn_network_desc<
        dll::network_layers<
            dll::recurrent_layer<time_steps, sequence_length, hidden_units, dll::last_only>,
            dll::recurrent_last_layer<time_steps, hidden_units>,
            dll::dense_layer<hidden_units, 10, dll::softmax>
        >
        , dll::updater<dll::updater_type::ADAM> // Adam
        , dll::batch_size<100>                  // The mini-batch size
    >::network_t;

    auto net = std::make_unique<network_t>();

    // Display the network and dataset
    net->display();

    // Train the network for performance sake
    net->fine_tune(dataset.train(), 50);

    // Test the network on test set
    net->evaluate(dataset.test());

    return 0;
}

The network starts with a recurrent layer, followed by a layer that extracts
only the last time step, and finally a dense layer with a softmax activation
function. The recurrent layer supports changing the activation function, the
initializer for the two weight matrices of the RNN and the number of steps for
BPTT truncation.

I've just finished integrating new features into DLL, my deep learning library.
I've added support for an embeddings layer, a group layer and a merge layer.
This is not yet released, but available in the master branch.

Embeddings are used more and more these days to learn dense representations of
characters or words. An embedding layer in a neural network transforms labels
into vectors. It's generally used as the first layer of the network. The
embeddings are learned as part of the network.

The merge layer allows creating branches in the network. The input is passed
to each sub-layer and the outputs of all sub-layers are then concatenated to
form the output of the merge layer. This can be very useful, for instance to
use different convolutional filter sizes.

The group layer is a simple utility to group layers together. It's mostly
meant to be used with merge layers to form the different branches.

I've put together a new example using these features for text classification.
The dataset is totally synthetic for now, but this can easily be reproduced
with a real text classification dataset. This kind of model is called
a Character Convolutional Neural Network.

Here is the code for the example:

constexpr size_t embedding = 16; // The length of the embedding vector
constexpr size_t length    = 15; // The word (or sequence) length

using embedding_network_t = dll::dyn_network_desc<
    dll::network_layers<
        // The embedding layer
        dll::embedding_layer<26, length, embedding>

        // The convolutional layers
        , dll::merge_layer<
            0
            , dll::group_layer<
                dll::conv_layer<1, length, embedding, 16, 3, embedding>,
                dll::mp_2d_layer<16, length - 3 + 1, 1, length - 3 + 1, 1>
            >
            , dll::group_layer<
                dll::conv_layer<1, length, embedding, 16, 4, embedding>,
                dll::mp_2d_layer<16, length - 4 + 1, 1, length - 4 + 1, 1>
            >
            , dll::group_layer<
                dll::conv_layer<1, length, embedding, 16, 5, embedding>,
                dll::mp_2d_layer<16, length - 5 + 1, 1, length - 5 + 1, 1>
            >
        >

        // The final softmax layer
        , dll::dense_layer<48, 10, dll::softmax>
    >
    , dll::updater<dll::updater_type::NADAM> // Nesterov Adam (NADAM)
    , dll::batch_size<50>                    // The mini-batch size
    , dll::shuffle                           // Shuffle before each epoch
>::network_t;

auto net = std::make_unique<embedding_network_t>();

// Display the network and dataset
net->display();

// Train the network for performance sake
net->fine_tune(samples, labels, 50);

// Test the network on train set
net->evaluate(samples, labels);

The network starts with an embedding layer. The embedding is then passed to
three convolutional layers with different filter sizes, each followed by
a pooling layer. The outputs of the three branches are then concatenated by
the merge layer. Finally, a softmax layer is used for classification.

This kind of model can be very powerful and is used regularly. These new
features allow a much larger variety of models to be built with the DLL
library.

The full code with the dataset generation can be found online:
char_cnn.cpp

The next feature I want to focus on is recurrent neural networks. I'll
probably start with a single RNN layer and then move on to multiple layers,
LSTM and maybe GRU.