Mon 11 July 2016

Recurrent Neural Networks in Tensorflow I
This is the first in a series of posts about recurrent neural networks in Tensorflow. In this post, we will build a vanilla recurrent neural network (RNN) from the ground up in Tensorflow, and then translate the model into Tensorflow's RNN API.
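
As a quick taste of what the post covers, here is a minimal NumPy sketch of the vanilla RNN recurrence (the post itself builds this in Tensorflow; the dimensions and variable names here are illustrative assumptions, not taken from the post):

```python
import numpy as np

# Illustrative dimensions (assumptions, not the post's values)
input_size, state_size = 8, 16

# Parameters of a vanilla RNN cell
W_x = np.random.randn(input_size, state_size) * 0.01   # input-to-state weights
W_h = np.random.randn(state_size, state_size) * 0.01   # state-to-state weights
b = np.zeros(state_size)                               # bias

def rnn_step(x_t, h_prev):
    """One step of the vanilla RNN recurrence: h_t = tanh(x_t W_x + h_{t-1} W_h + b)."""
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

# Unroll over a sequence of T inputs, starting from a zero state
T = 5
h = np.zeros(state_size)
for x_t in np.random.randn(T, input_size):
    h = rnn_step(x_t, h)
```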

Mon 11 April 2016

First Convergence Bias
In this post, I offer the results of an experiment providing support for "first convergence bias", which includes the proposition that training a randomly initialized network via backpropagation may never converge to a global minimum, regardless of the initialization and number of trials.
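
To illustrate the kind of multi-restart experiment this refersls to, here is a hedged toy sketch: gradient descent on a non-convex one-dimensional loss from many random initializations, counting how many runs settle in the worse basin. The loss surface and hyperparameters are placeholders standing in for a network's loss, not the post's actual setup:

```python
import numpy as np

# Toy non-convex loss with a local minimum near x = 0.93 and the
# global minimum near x = -1.06; a stand-in for a network's loss surface.
loss = lambda x: (x**2 - 1)**2 + 0.5 * x
grad = lambda x: 4 * x * (x**2 - 1) + 0.5

final_losses = []
for trial in range(100):              # many random initializations
    x = np.random.uniform(-3, 3)      # random init
    for _ in range(1000):             # plain gradient descent
        x -= 0.01 * grad(x)
    final_losses.append(loss(x))

# Runs ending with positive loss got stuck in the worse basin; if the
# bias holds, some fraction always does, no matter how many trials we add.
stuck = sum(l > 0 for l in final_losses)
print(f"{stuck}/100 runs converged to the worse basin")
```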

Tue 05 April 2016

Inverting a Neural Net
In this experiment, I "invert" a simple two-layer MNIST model to visualize what the final hidden layer representations look like when projected back into the original sample space.
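
The post's exact inversion procedure aside, one common way to "invert" a representation is to optimize an input so that its hidden representation matches a target. A NumPy sketch for a single tanh layer follows; the weights here are random placeholders where the post uses a trained MNIST model, and all names and shapes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained hidden layer: h(x) = tanh(x W + b)
W = rng.standard_normal((784, 64)) * 0.05
b = np.zeros(64)
hidden = lambda x: np.tanh(x @ W + b)

# Target representation to project back into input (pixel) space
h_target = hidden(rng.standard_normal(784))

# Gradient descent on the input x to minimize ||h(x) - h_target||^2
x = np.zeros(784)
for _ in range(500):
    h = hidden(x)
    dh = 2 * (h - h_target)          # d loss / d h
    dx = (dh * (1 - h**2)) @ W.T     # backprop through tanh and the linear map
    x -= 0.1 * dx

print(np.linalg.norm(hidden(x) - h_target))  # should be small
```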

Wed 30 March 2016

Representational Power of Deeper Layers
The hidden layers in a neural network can be seen as different representations of the input. Do deeper layers learn "better" representations? In a network trained to solve a classification problem, this would mean that deeper layers provide better features than earlier layers. The natural hypothesis is that this is indeed the case. In this post, I test this hypothesis on a network with three hidden layers trained to classify the MNIST dataset, and show that deeper layers do in fact produce better representations of the input.
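
One standard way to test this is a linear probe: train a simple linear classifier on each layer's activations and compare accuracies. A minimal sketch, assuming the activations have already been extracted from the trained model (the placeholder arrays and the use of scikit-learn are my assumptions, not necessarily the post's method):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder activations: one array per layer, shape (n_samples, n_features).
# In practice these would be the recorded hidden activations of the MNIST model.
n = 1000
layers = {
    "layer1": np.random.randn(n, 100),
    "layer2": np.random.randn(n, 100),
    "layer3": np.random.randn(n, 100),
}
labels = np.random.randint(0, 10, size=n)   # placeholder MNIST labels

# A layer yields a "better" representation if a linear classifier
# trained on its activations achieves higher held-out accuracy.
for name, acts in layers.items():
    probe = LogisticRegression(max_iter=1000).fit(acts[:800], labels[:800])
    print(name, probe.score(acts[800:], labels[800:]))
```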

Tue 29 March 2016

Implementing Batch Normalization in Tensorflow
Batch normalization is a deep learning technique introduced in 2015 that enables the use of higher learning rates, acts as a regularizer, and can speed up training by a factor of 14. In this post, I show how to implement batch normalization in Tensorflow.
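
The core transform is simple to state. Here is a minimal NumPy sketch of batch normalization at training time; the post shows how to do this properly in Tensorflow, including the population statistics needed at inference time, which this sketch omits:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize a batch of activations feature-wise, then scale and shift.

    x: array of shape (batch_size, n_features)
    gamma, beta: learned scale and shift, shape (n_features,)
    """
    mean = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * x_hat + beta              # restore representational power

# Example: normalize a batch of pre-activations
x = np.random.randn(32, 10) * 5 + 3
out = batch_norm(x, gamma=np.ones(10), beta=np.zeros(10))
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # ~0 and ~1
```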

Sat 23 January 2016

Skill vs Strategy
In this post, I consider the distinction between skill and strategy and what it means for machine learning. Backpropagation is limited in that it develops skill at a specific strategy, but cannot, by itself, change strategies. I look at how strategy switches are achieved in real examples and ask what algorithm might allow machines to effectively switch strategies.