Listen, Attend and Spell

During my Master's thesis project I re-implemented Listen, Attend and Spell, an attention-based speech recognition system. A key problem in speech recognition is that it is often unknown what is said when. In other words, the speech signal and its transcription are unaligned. Attention-based systems such as the one I wrote solve this problem by computing attention weights for each input vector. A visualization of the system is given below:

The LAS architecture. BLSTM blocks are shown in red, LSTM blocks in blue and attention nets in green.

The overall system consists of the two blocks shown in the image above. A listener network computes a compressed fixed-length encoding of an input signal, which is then transcribed to an output sequence by the speller. To come up with a transcription, the speller has to compute attention weights such as those shown below:

Plot of the alignment vectors computed by the network for all 45 labels assigned to TIMIT utterance fmld0_sx295 (left), and alignments assigned by a human listener (right).

The plot above reveals that the attention weights found by the network are quite similar to those assigned by a human listener. Some artifacts remain, but it must be kept in mind that TIMIT is quite a small data set. Better results have been observed when larger data sets and more than just a single 8GB graphics card are used for training.
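
To make the idea of attention weights more concrete, here is a minimal NumPy sketch. It is not the thesis code: it scores each encoded frame with a plain dot product against the current speller state, whereas the attention nets shown in green in the figure are omitted. The names `attention_weights`, `context_vector`, `decoder_state` and `encoder_outputs`, as well as the array sizes, are assumptions made for illustration.

```python
import numpy as np

def attention_weights(decoder_state, encoder_outputs):
    """Return one weight per encoder time step (non-negative, summing to 1)."""
    # Assumed scoring rule: dot product between the speller state and each frame.
    scores = encoder_outputs @ decoder_state
    # Numerically stable softmax over the time axis.
    scores = scores - scores.max()
    weights = np.exp(scores)
    return weights / weights.sum()

def context_vector(decoder_state, encoder_outputs):
    """Weighted sum of the encoder outputs under the attention weights."""
    weights = attention_weights(decoder_state, encoder_outputs)
    return weights @ encoder_outputs

# Random stand-in data: 50 encoded time steps with 8 features each.
rng = np.random.default_rng(0)
encoder_outputs = rng.normal(size=(50, 8))
decoder_state = rng.normal(size=8)

alpha = attention_weights(decoder_state, encoder_outputs)
context = context_vector(decoder_state, encoder_outputs)
```

Stacking one such `alpha` vector per output label gives a matrix of the kind plotted above, with labels on one axis and time on the other.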

The source code is available on github.

For an in depth discussion please take a look at my thesis text.

The links below lead to more samples of my work on support vector machines and data-mining:
More on svms
More on data-mining

Medical Image Analysis

As coursework for my computer vision class in Leuven I looked into support vector machines on medical data. The task was to locate incisor teeth.

For every window the SVM generates the probabilities for a miss and a hit. In this example the frame with the largest hit probability was chosen, as all images were known to contain incisors somewhere.
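
The window search described above can be sketched as follows. This is an illustrative sketch, not the coursework code: `toy_hit_probability` is a hypothetical stand-in for the trained SVM, and the window size, stride and test image are made up.

```python
import numpy as np

def sliding_windows(image, window, stride):
    """Yield (row, col, patch) for every window position inside the image."""
    height, width = image.shape
    win_h, win_w = window
    for row in range(0, height - win_h + 1, stride):
        for col in range(0, width - win_w + 1, stride):
            yield row, col, image[row:row + win_h, col:col + win_w]

def best_window(image, hit_probability, window=(4, 4), stride=2):
    """Return the top-left corner and score of the most likely window."""
    best = None
    for row, col, patch in sliding_windows(image, window, stride):
        p_hit = hit_probability(patch)
        if best is None or p_hit > best[2]:
            best = (row, col, p_hit)
    return best

def toy_hit_probability(patch):
    # Hypothetical stand-in for the SVM: mean brightness squashed to (0, 1).
    return 1.0 / (1.0 + np.exp(-(patch.mean() - 0.5) * 10.0))

# Synthetic image with one bright region playing the role of the incisors.
image = np.zeros((12, 12))
image[6:10, 2:6] = 1.0
row, col, p_hit = best_window(image, toy_hit_probability)
```

Taking the maximum only makes sense because every image is known to contain the target. Without that guarantee a threshold on the hit probability would be needed as well.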

The links below lead to more samples of my work on support vector machines and data-mining:
More on svms
More on data-mining

Linear Algebra

Among the most interesting things I have encountered in linear algebra are pseudospectra and their relation to Toeplitz symbol functions, as well as the associated circulant matrix eigenvalues.
Below I have included plots which illustrate this beautiful relation:

Shown on the left are the symbol functions (yellow), Toeplitz eigenvalues (blue) and circulant matrix
eigenvalues (green). On the right, epsilon-pseudospectra of the same matrices are shown.
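
For readers who want to reproduce the relation numerically, here is a small NumPy sketch. The tridiagonal example with coefficients 1 and 0.25 and the size n = 16 are assumptions chosen for illustration, not the matrices from the plots. The sketch relies on two standard facts: the eigenvalues of a circulant matrix are the symbol evaluated at the n-th roots of unity, and a point z lies in the epsilon-pseudospectrum when the smallest singular value of zI - A is below epsilon.

```python
import numpy as np

def toeplitz_and_circulant(coeffs, n):
    """Build n x n Toeplitz and circulant matrices from {offset: value}."""
    toeplitz = np.zeros((n, n), dtype=complex)
    circulant = np.zeros((n, n), dtype=complex)
    for i in range(n):
        for offset, value in coeffs.items():
            j = i - offset
            if 0 <= j < n:
                toeplitz[i, j] = value      # entries outside the matrix are cut off
            circulant[i, j % n] += value    # entries outside the matrix wrap around
    return toeplitz, circulant

def symbol(coeffs, z):
    """Evaluate the symbol f(z) = sum_k a_k z^k."""
    return sum(value * z**offset for offset, value in coeffs.items())

def smallest_singular_value(matrix, z):
    """z is in the eps-pseudospectrum when this value is below eps."""
    n = matrix.shape[0]
    return np.linalg.svd(z * np.eye(n) - matrix, compute_uv=False)[-1]

n = 16
coeffs = {1: 1.0, -1: 0.25}                 # assumed example symbol z + 0.25 / z
toeplitz, circulant = toeplitz_and_circulant(coeffs, n)

roots_of_unity = np.exp(2j * np.pi * np.arange(n) / n)
symbol_values = symbol(coeffs, roots_of_unity)   # points on the symbol curve
circulant_eigs = np.linalg.eigvals(circulant)    # lie on the symbol curve
toeplitz_eigs = np.linalg.eigvals(toeplitz)      # lie on a real segment instead

# A point inside the symbol curve but away from the Toeplitz eigenvalues.
z = 0.5j
s_toeplitz = smallest_singular_value(toeplitz, z)
s_circulant = smallest_singular_value(circulant, z)
```

In this example the symbol curve is an ellipse, while the Toeplitz eigenvalues sit on the real axis. The smallest singular value at `z` is much smaller for the Toeplitz matrix than for the circulant one, which is why the Toeplitz pseudospectra fill the region enclosed by the symbol curve.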

Interested? More on:
Pseudospectra
Regularization

PDEs and iGem

One of the first equations I discretized and simulated was the wave equation, using a finite difference scheme. I ended up with a simulation which is quite pretty:
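
A finite difference scheme of this kind fits in a few lines. The sketch below is a one-dimensional illustration with fixed ends and a sine-shaped initial displacement, not the original simulation. The function name `simulate_wave`, the grid size and the Courant number of 0.9 are assumptions.

```python
import numpy as np

def simulate_wave(n_points=101, n_steps=200, speed=1.0, courant=0.9):
    """Leapfrog time stepping for u_tt = speed^2 * u_xx with fixed ends."""
    x = np.linspace(0.0, 1.0, n_points)
    dx = x[1] - x[0]
    dt = courant * dx / speed          # stable for courant <= 1
    c2 = courant ** 2

    # Initial displacement with zero initial velocity.
    u_prev = np.sin(np.pi * x)
    u_prev[0] = u_prev[-1] = 0.0

    # First step from a Taylor expansion, using the zero initial velocity.
    u = u_prev.copy()
    u[1:-1] = u_prev[1:-1] + 0.5 * c2 * (
        u_prev[2:] - 2.0 * u_prev[1:-1] + u_prev[:-2])

    # Central differences in both time and space for all later steps.
    for _ in range(n_steps - 1):
        u_next = np.zeros_like(u)
        u_next[1:-1] = (2.0 * u[1:-1] - u_prev[1:-1]
                        + c2 * (u[2:] - 2.0 * u[1:-1] + u[:-2]))
        u_prev, u = u, u_next
    return x, u, n_steps * dt

x, u, t_end = simulate_wave()
# For this initial condition the exact solution is a standing wave.
exact = np.sin(np.pi * x) * np.cos(np.pi * t_end)
max_error = np.max(np.abs(u - exact))
```

The scheme is second-order accurate in space and time, and it becomes unstable when the Courant number exceeds 1.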

The biggest PDE project I have done so far was with the Leuven iGem team. We simulated the behavior of pattern-forming bacteria using a pure PDE model and a PDE-agent hybrid model. The pure PDE simulations were generated by discretizing a modified Keller-Segel system of equations using a finite volume method in Matlab:
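
As a rough illustration of the finite volume idea, here is a one-dimensional Python sketch of the classic minimal Keller-Segel system. It is not the modified system from the project and not the Matlab code. All parameter values, the upwind treatment of the chemotactic drift and the function name `keller_segel_step` are assumptions.

```python
import numpy as np

def keller_segel_step(u, c, dx, dt, diff_u=1.0, diff_c=1.0, chi=2.0,
                      production=1.0, decay=1.0):
    """One explicit finite volume step on a 1D grid with no-flux walls.

    u is the cell density, c the chemoattractant concentration:
        u_t = (diff_u * u_x - chi * u * c_x)_x
        c_t = diff_c * c_xx + production * u - decay * c
    """
    # Chemoattractant gradient at the interior cell faces.
    grad_c = (c[1:] - c[:-1]) / dx
    # Cells drift towards higher c, so take the density from the upwind side.
    velocity = chi * grad_c
    u_face = np.where(velocity > 0.0, u[:-1], u[1:])
    # Face fluxes: diffusion plus chemotactic drift for u, diffusion for c.
    flux_u = -diff_u * (u[1:] - u[:-1]) / dx + velocity * u_face
    flux_c = -diff_c * grad_c
    # Zero flux through the two walls closes the domain.
    flux_u = np.concatenate(([0.0], flux_u, [0.0]))
    flux_c = np.concatenate(([0.0], flux_c, [0.0]))
    # Each cell changes by the difference of the fluxes through its faces.
    u_new = u - dt / dx * (flux_u[1:] - flux_u[:-1])
    c_new = (c - dt / dx * (flux_c[1:] - flux_c[:-1])
             + dt * (production * u - decay * c))
    return u_new, c_new

n_cells, length = 50, 10.0
dx = length / n_cells
x = (np.arange(n_cells) + 0.5) * dx          # cell centres
u0 = 1.0 + 0.5 * np.exp(-(x - 5.0) ** 2)     # small bump of bacteria
u, c = u0.copy(), np.zeros(n_cells)
for _ in range(200):
    u, c = keller_segel_step(u, c, dx, dt=0.005)
```

Because every interior flux leaves one cell and enters its neighbour, the total number of bacteria `u.sum() * dx` is conserved up to rounding, which is the main appeal of the finite volume formulation.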


In order to take cell adhesion into account we created this hybrid model:

The international jury in Boston nominated us for the best model award, which means they placed us among the top 5 overgrad teams in the modeling category.
Interested? Read more here:
Leuven iGem 2015 wiki

Ray Tracing

Ray tracing is a fascinating algorithm used for image synthesis. I have written a rudimentary ray tracer consisting of more than 5000 lines of code.
The video below shows it rendering a high-resolution triangle mesh representation of the Stanford dragon:

Fast implementations depend on binary trees to reduce the number of required ray-triangle intersection computations. These trees are quite beautiful to look at. In a nutshell, such trees are generated by recursively splitting a bounding box which contains the object in two. The video below visualizes such a tree:

Please note how the outermost box shows up for high sensitivity values and the split sub-boxes become visible as the sensitivity is reduced. When it is reduced further, the edges of the dragon become clearly visible.
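
The recursive splitting can be sketched in Python as follows. This is a simplified illustration and not the ray tracer's own code: points stand in for triangles (for example their centroids), the box is always halved along its longest axis, and `max_depth` and `leaf_size` are assumed stopping criteria. A real tree also has to deal with triangles that straddle a split plane.

```python
import numpy as np

def build_tree(points, lower, upper, depth=0, max_depth=8, leaf_size=4):
    """Recursively halve the box [lower, upper] along its longest axis."""
    node = {"lower": lower, "upper": upper, "points": points, "children": []}
    if depth >= max_depth or len(points) <= leaf_size:
        return node                          # leaf: the points stay here

    axis = int(np.argmax(upper - lower))
    middle = 0.5 * (lower[axis] + upper[axis])
    in_left = points[:, axis] <= middle

    # The two child boxes share the split plane at `middle`.
    left_upper = upper.copy()
    left_upper[axis] = middle
    right_lower = lower.copy()
    right_lower[axis] = middle

    node["points"] = None                    # inner node: points live in leaves
    node["children"] = [
        build_tree(points[in_left], lower, left_upper,
                   depth + 1, max_depth, leaf_size),
        build_tree(points[~in_left], right_lower, upper,
                   depth + 1, max_depth, leaf_size),
    ]
    return node

def count_leaf_points(node):
    """Total number of points stored in the leaves below this node."""
    if not node["children"]:
        return len(node["points"])
    return sum(count_leaf_points(child) for child in node["children"])

def tree_depth(node):
    """Number of splits on the longest path from this node to a leaf."""
    if not node["children"]:
        return 0
    return 1 + max(tree_depth(child) for child in node["children"])

# Random stand-in geometry: 200 points in a cube.
rng = np.random.default_rng(1)
points = rng.uniform(-1.0, 1.0, size=(200, 3))
tree = build_tree(points, points.min(axis=0), points.max(axis=0))
```

A ray only needs to be tested against the contents of the boxes it actually passes through, so most triangles are never touched.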

The same algorithm can also be used for mathematical plotting purposes. Below you can see renderings of a Julia fractal in two and three dimensions, generated using the same ray tracing code:

Interested? Read more:
pdf

The source code is freely available at:
Github