LJSpeech is a dataset of actual human voices. Let's start our own subjective quality evaluation by considering the example below. Modern artificial neural networks can generate credible-sounding human speech. A MelGAN reproduction of the first LJSpeech sentence is hard to identify as such. The same is true for the HiFi-GAN version below. The audio is …
Author Archives: moritz
New interview with the “digitale-perspektiven” podcast
I appeared as a guest on the “digitale-perspektiven” podcast. To learn more, follow this link.
Jax on Juwels Booster
This post illustrates a possible way to set up multinode Jax computations on the Juwels Booster partition at the Jülich Supercomputing Centre. The following text adapts instructions from official documentation to run in Jülich. Let’s start with the Python code. The code snippet below determines how many GPUs we have and tells Jax to run …
Vscode-Python module debugging
Since I spent too much time looking for this on the internet, I am posting an example launch.json for future reference. Replace the module path after “module” and the arguments in the “args” block with more suitable values.
Course release: “Introduction to deep learning with Jax”
Our course “Introduction to deep learning with Jax” is now available online at https://github.com/Deep-Learning-with-Jax. The material currently consists of lecture videos, slides and exercises. Most exercises come with unit tests, allowing you to verify your solutions independently.
On the similarities of diffused- and gan-generated image detection
Guided diffusion has become the new go-to method for image generation. To avoid misuse of this inspiring new technology, we must ensure fake detection networks remain up to speed with recent developments. Using the approach described in “Diffusion Models Beat GANs on Image Synthesis”. Wavelet packets decompose an input into blocks according to frequency. The …
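To make the idea of splitting an input into frequency blocks concrete, here is a minimal sketch of a one-dimensional wavelet packet decomposition. It assumes the Haar wavelet and a signal whose length is divisible by `2**levels`. The function names `haar_step` and `haar_packets` are made up for illustration and are not taken from the post or its code.

```python
import math


def haar_step(signal):
    """Split an even-length signal into a low-pass and a high-pass half."""
    scale = 1.0 / math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return low, high


def haar_packets(signal, levels):
    """Return the 2**levels packet blocks after `levels` rounds of splitting.

    Unlike the plain wavelet transform, every block is split again at each
    level, not only the low-pass one. Blocks come out in natural (filter)
    order, which is not the same as sorting them by frequency.
    """
    blocks = [list(signal)]
    for _ in range(levels):
        next_blocks = []
        for block in blocks:
            low, high = haar_step(block)
            next_blocks.extend([low, high])
        blocks = next_blocks
    return blocks


if __name__ == "__main__":
    # A constant signal puts all of its energy into the first block.
    for index, block in enumerate(haar_packets([1.0] * 8, levels=2)):
        print(index, block)
```

Images would need the two-dimensional version of the same idea, which filters along rows and columns; that is left out here to keep the sketch short.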
Wavelet-Packet Powered Deepfake Image Detection
Modern neural networks generate realistic artificial images and audio. This development will allow us to create movies, music and audio effects never seen before. Yet at the same time, the new technology may enable new digital ways to lie. In response, the need for a diverse and reliable toolbox arises to identify artificial images and …
Wavelet optimization for Network compression
Wavelets are uncommon in machine learning, and systems with learnable wavelets are particularly rare. Yet promising applications of wavelets in neural networks exist. Adaptive wavelets for network compression are explored in the new paper ‘Neural network compression via learnable wavelet transforms’. By defining new wavelet loss terms based on the product filter approach to wavelet design, …
Jaxlets – Fast Wavelet Transformations in JAX
The fast wavelet transform is an important signal processing algorithm. Yet a differentiable implementation in JAX has been missing so far, so I have opened up my implementation. It supports the one- and two-dimensional analysis and synthesis transforms, as well as the forward wavelet packet transform. The plot below shows an …
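As a rough illustration of what the analysis and synthesis transforms do, the sketch below implements a multi-level one-dimensional fast wavelet transform and its inverse. It is plain Python restricted to the Haar wavelet, not the JAX implementation the post refers to. The names `haar_analysis` and `haar_synthesis` are hypothetical, and the signal length is assumed to be divisible by `2**levels`.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)


def haar_analysis(signal, levels):
    """Multi-level Haar analysis.

    Returns [approximation, coarsest detail, ..., finest detail]. Only the
    approximation is split again at each level.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.insert(0, [(a - b) * SCALE for a, b in pairs])
        approx = [(a + b) * SCALE for a, b in pairs]
    return [approx] + details


def haar_synthesis(coefficients):
    """Invert haar_analysis by merging one detail level at a time."""
    approx = list(coefficients[0])
    for detail in coefficients[1:]:
        rebuilt = []
        for a, d in zip(approx, detail):
            rebuilt.append((a + d) * SCALE)
            rebuilt.append((a - d) * SCALE)
        approx = rebuilt
    return approx


if __name__ == "__main__":
    signal = [4.0, 2.0, 5.0, 7.0, 1.0, 3.0, 8.0, 6.0]
    coefficients = haar_analysis(signal, levels=3)
    print([len(c) for c in coefficients])
    print(haar_synthesis(coefficients))
```

Because the Haar filters are orthogonal, synthesis recovers the input up to floating-point error.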
Video Prediction à la Fourier
Video frame prediction is a very challenging problem. Many recent neural-network-based approaches trained with a mean squared error loss produce blurry predictions. My most recent paper, currently under review, proposes to use phase correlation and the Fourier shift theorem to estimate changes and transform current images into predictions. A demo is shown below. The video …
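The two building blocks named above can be sketched with NumPy. This is textbook phase correlation for a single global, integer, circular translation, not the method from the paper. The function names `estimate_shift` and `predict_next` are hypothetical.

```python
import numpy as np


def estimate_shift(previous, current):
    """Estimate the circular (row, col) shift that maps `previous` onto `current`."""
    cross_power = np.fft.fft2(current) * np.conj(np.fft.fft2(previous))
    cross_power /= np.abs(cross_power) + 1e-12  # discard magnitude, keep phase
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    return tuple(int(p) for p in peak)


def predict_next(current, shift):
    """Translate `current` by `shift` using the Fourier shift theorem.

    A translation in the image domain corresponds to multiplying the
    spectrum by a linear phase ramp.
    """
    rows, cols = current.shape
    freq_rows = np.fft.fftfreq(rows)[:, None]
    freq_cols = np.fft.fftfreq(cols)[None, :]
    ramp = np.exp(-2j * np.pi * (freq_rows * shift[0] + freq_cols * shift[1]))
    return np.fft.ifft2(np.fft.fft2(current) * ramp).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame0 = rng.random((16, 16))
    frame1 = np.roll(frame0, (3, 5), axis=(0, 1))  # synthetic motion
    shift = estimate_shift(frame0, frame1)
    frame2_prediction = predict_next(frame1, shift)
    print(shift)
```

The estimated shift is reported modulo the image size, so a movement up or to the left shows up as a large positive offset. Real video would also need handling for non-periodic borders and for objects that move independently, which this sketch ignores.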