Thursday, October 15, 2015

Training neural nets using Amazon EC2

In my opinion, one of the most attractive aspects of artificial neural networks is that they provide incredible power and flexibility for very little implementation overhead. For most applications, very good software libraries exist that make implementing and applying these techniques simple.

Lately I've been playing around with Keras, an elegant high-level Python library that provides a broad set of tools for building and using neural nets. Keras sits on top of Theano, a powerful math library for n-dimensional arrays with extensive optimization features and support for parallel CUDA architectures, pretty much engineered for deep learning applications. Taking advantage of the highly parallel potential of CUDA-enabled graphics hardware can speed up neural net training tremendously. One way to do this without having to drop a few grand on your very own GPU workstation is to use Amazon EC2, which provides access to modest-but-capable virtual GPU compute nodes at a much lower cost than rolling your own.

You'll need an Amazon Web Services account, so if you don't have one yet, go ahead and get that set up. Once you're signed in to your AWS account, head to the EC2 spot instances panel.

I'll point out that at the time I'm writing this, Amazon offers two GPU instance types: g2.2xlarge and g2.8xlarge. Chances are good you want the smaller and considerably cheaper g2.2xlarge. The bigger g2.8xlarge uses the same GPU model (four of them rather than one), so a job that runs on a single GPU won't be any faster, although it does have more memory, which might be helpful depending on the size of your inputs. You can also save a considerable amount of money by using spot instances (which you bid on) instead of on-demand ones (for which you pay a flat hourly rate), although spot instances may be terminated without warning if the spot price rises above your bid. In fact, the average cost per hour for the g2.2xlarge instances is typically very affordable (around $0.10 or less), but I have been setting my maximum bid around $3 to make sure my instances don't get interrupted during unpredictable price spikes. You can read more about how spot instances work and check the current prices here.
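If you'd rather check recent spot prices from the command line before choosing a bid, something like the following should work. This is only a sketch: it assumes the AWS CLI is installed and configured with your credentials, and spot_prices is a helper name I made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list recent spot prices for one instance type.
# Assumes the AWS CLI is installed and configured (e.g. via `aws configure`).
# spot_prices is a hypothetical helper name, not part of any existing tool.

# spot_prices [TYPE] [HOURS]: print Linux spot prices for TYPE over the last HOURS hours.
spot_prices() {
    local instance_type="${1:-g2.2xlarge}"
    local hours="${2:-24}"
    local start
    # GNU date syntax; BSD/macOS date needs a different flag for relative times.
    start="$(date -u -d "${hours} hours ago" +%Y-%m-%dT%H:%M:%S)"
    aws ec2 describe-spot-price-history \
        --instance-types "$instance_type" \
        --product-descriptions "Linux/UNIX" \
        --start-time "$start" \
        --query 'SpotPriceHistory[*].[Timestamp,AvailabilityZone,SpotPrice]' \
        --output table
}

# Example (needs AWS credentials and network access):
#   spot_prices g2.2xlarge 24
```

Prices are reported per availability zone, so it can be worth comparing zones before you place a request.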

Setting up Keras on EC2

Running Keras on EC2 first requires setting up Theano and the NVIDIA CUDA drivers, which isn't difficult, but is a bit tedious. Markus Beissinger has some great instructions on how to do this, which I took the liberty of adapting into a really basic set of shell scripts that automate the process of installing Theano/Keras/CUDA on a remote machine. Note that if you just want to start with Theano/CUDA and nothing else, and you don't need the latest versions, you can just initialize your EC2 instance with Markus's AMI ami-b141a2f5. If you want a bit more control, e.g. to specify newer versions and/or install additional components, my shell scripts are a good place to start. You can tweak them until you've got your installation process nailed down, and then create your own AMI so you can easily spawn new instances with your custom setup pre-loaded. For instance, I added a few custom steps to my own copies of the scripts, which set up additional security credentials, install a few more Python libraries I use, and pull down a couple of my own personal git repos onto the EC2 instance.

To get started, you can use git to grab the setup scripts:

git clone

The scripts assume the remote instance is running Ubuntu Linux, so when you're requesting the initial AMI for your new EC2 instance, make sure you specify the base Ubuntu installation (currently Ubuntu 14.04). Once your instance is up and running, open the script file in the setuptheano directory you just cloned. At the top of the script, set CUDA_IP to the public IP of your EC2 instance and CUDA_PEM to the local path of the EC2 .pem file you chose to use for authentication:

### set configuration variables here

CUDA_IP= # replace this with the ip of your ec2 instance
CUDA_USER=ubuntu # don't modify
CUDA_PEM=~/cuda.pem # local path to your ec2 ssh identity file
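Before running anything against the remote machine, it can save a failed attempt to sanity-check these three values. The following is a minimal sketch, not something the setup scripts do themselves; check_config is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the configuration values before running the setup script.
# check_config is a hypothetical helper, not part of the original scripts.

# check_config IP USER PEM: return 0 if the values look usable, 1 otherwise.
check_config() {
    local ip="$1" user="$2" pem="$3"
    if [ -z "$ip" ]; then
        echo "CUDA_IP is empty: set it to the instance's public IP" >&2
        return 1
    fi
    if [ -z "$user" ]; then
        echo "CUDA_USER is empty" >&2
        return 1
    fi
    if [ ! -r "$pem" ]; then
        echo "CUDA_PEM is not a readable file: $pem" >&2
        return 1
    fi
    return 0
}

# Example, using the variables declared above:
#   check_config "$CUDA_IP" "$CUDA_USER" "$CUDA_PEM" && echo "config looks OK"
```

Note that ssh refuses identity files that other users can read, so you may also need to run chmod 400 on the .pem file.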

Then save and close the setup script, go to the command line, and run the script by entering the command:


When it asks you to add the new IP to the list of known hosts, answer yes. The script will run, installing Theano, CUDA support, and Keras onto your remote machine.
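Once the script finishes, one way to confirm that everything landed is to run a couple of checks over ssh. This is a sketch under assumptions: verify_remote is a hypothetical helper of mine, and it assumes python on the remote machine is the interpreter that Theano and Keras were installed for.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the GPU driver, Theano, and Keras are usable on the instance.
# verify_remote is a hypothetical helper, not part of the original scripts.

# verify_remote IP USER PEM: run the checks on the remote machine over ssh.
verify_remote() {
    local ip="$1" user="$2" pem="$3"
    # nvidia-smi lists the GPU if the driver loaded correctly; the Python
    # one-liner fails if either library is missing, and otherwise prints
    # the device Theano is configured to use.
    ssh -i "$pem" "${user}@${ip}" \
        'nvidia-smi && python -c "import theano, keras; print(theano.config.device)"'
}

# Example, using the same values as in the setup script:
#   verify_remote "$CUDA_IP" "$CUDA_USER" "$CUDA_PEM"
```

If the last line of output is cpu, the libraries are installed but Theano is not configured to use the GPU; the device is normally selected through THEANO_FLAGS or a .theanorc file.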