婉兮清扬

Running cxxnet on Amazon EC2 (Ubuntu 14.04)

Posted: 2015-08-09 08:47:06

1. Launch an EC2 instance with the g2.8xlarge instance type, using an Ubuntu 14.04 HVM AMI. I gave the instance a 300 GB root EBS volume (General Purpose SSD) to get decent disk I/O capacity. A General Purpose SSD volume provides 3 IOPS per GB of storage, so 300 GB gives a baseline of 900 IOPS, with the ability to burst up to 3000 IOPS for an extended period of time.
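The baseline figure is simple arithmetic, and a small helper makes it easy to try other volume sizes. This is only a sketch of the 3 IOPS per GB ratio quoted above; the function name gp2_baseline_iops is made up for illustration, and any minimums or caps that AWS applies are not modeled.

```shell
#!/bin/sh
# Baseline IOPS for a General Purpose SSD volume at 3 IOPS per GB,
# the ratio quoted in step 1. Minimums and caps applied by AWS are
# deliberately not modeled. The function name is hypothetical.
gp2_baseline_iops() {
    echo $(( $1 * 3 ))
}

gp2_baseline_iops 300   # the 300 GB root volume used here; prints 900
```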

2. SSH into the EC2 instance and install the CUDA driver.

A detailed tutorial on this topic is available on GitHub:

https://github.com/BVLC/caffe/wiki/Install-Caffe-on-EC2-from-scratch-(Ubuntu,-CUDA-7,-cuDNN)
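After following that tutorial, it may be worth confirming that the driver and toolkit commands can be found before moving on. The sketch below only checks whether the commands are on PATH. It assumes nvcc has been added to PATH (the CUDA toolkit usually keeps it under its own bin directory), and report_tool is a hypothetical helper name.

```shell
#!/bin/sh
# Report whether a command is on PATH. Always returns 0 so that every
# tool gets listed instead of stopping at the first missing one.
# report_tool is a hypothetical helper, not part of CUDA or cxxnet.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# nvidia-smi ships with the NVIDIA driver, nvcc with the CUDA toolkit.
report_tool nvidia-smi
report_tool nvcc
```

If nvidia-smi is found, running it by hand should list the GPU on the instance.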

3. Install OpenBLAS, as below

$ sudo apt-get install make gfortran

$ wget http://github.com/xianyi/OpenBLAS/archive/v0.2.14.tar.gz

$ tar zxvf v0.2.14.tar.gz

$ cd OpenBLAS-0.2.14

$ make FC=gfortran

$ sudo make PREFIX=/usr/local/ install

$ cd /usr/local/lib

$ sudo ln -s libopenblas.so libblas.so
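The symlink matters because the cxxnet build links against -lblas. A quick way to confirm it points where intended is sketched below; check_blas_link is a hypothetical helper, and the check assumes the link was created with the relative target shown in the command above.

```shell
#!/bin/sh
# Check that libblas.so in the given directory is a symlink whose
# target is libopenblas.so, as created by the ln -s command in step 3.
# check_blas_link is a hypothetical helper name.
check_blas_link() {
    dir=$1
    if [ -L "$dir/libblas.so" ] &&
       [ "$(readlink "$dir/libblas.so")" = "libopenblas.so" ]; then
        echo "ok"
    else
        echo "not linked"
    fi
}

check_blas_link /usr/local/lib
```

If the link is fine but programs still cannot find the library at run time, refreshing the linker cache with sudo ldconfig may help.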

4. Install OpenCV

Detailed documentation is available from the Ubuntu community:

https://help.ubuntu.com/community/OpenCV

You will also need to install the header files for OpenCV

$ sudo apt-get install libopencv-dev
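The cxxnet build picks up OpenCV through pkg-config (the link commands in the build output use `pkg-config --libs opencv`), so it can help to check that the module is visible first. This is a minimal sketch; pkg_version is a hypothetical helper, and the module name opencv is taken from that build output.

```shell
#!/bin/sh
# Print the installed version of a pkg-config module, or "missing"
# when pkg-config itself or the module cannot be found.
# pkg_version is a hypothetical helper name.
pkg_version() {
    if command -v pkg-config >/dev/null 2>&1 &&
       pkg-config --exists "$1" 2>/dev/null; then
        pkg-config --modversion "$1"
    else
        echo "missing"
    fi
}

pkg_version opencv
```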

5. Install cxxnet, as below

$ cd ~

$ git clone https://github.com/dmlc/cxxnet.git

$ cd cxxnet

$ ./build.sh

In most cases the build will fail at first, and you need to adjust the Makefile to match your build environment (for example, the CUDA and OpenBLAS install paths). Below is an example from my environment:

CFLAGS += -g -O3 -I./mshadow/ -I./dmlc-core/include -I/usr/local/cuda/include -I/usr/include -fPIC $(MSHADOW_CFLAGS) $(DMLC_CFLAGS)

LDFLAGS = -pthread $(MSHADOW_LDFLAGS) $(DMLC_LDFLAGS) -L/usr/local/cuda/lib64 -L/usr/local/lib
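Since these flags hard-code several directories, a quick existence check can save a failed build. The sketch below is meant to be run from the cxxnet source directory and simply lists any path that is absent; missing_dirs is a hypothetical helper, and the paths are the ones from my Makefile above, so substitute your own.

```shell
#!/bin/sh
# Print every directory from the argument list that does not exist.
# Prints nothing when all of them are present.
# missing_dirs is a hypothetical helper name.
missing_dirs() {
    for d in "$@"; do
        [ -d "$d" ] || echo "$d"
    done
}

# Paths taken from the CFLAGS and LDFLAGS lines above.
missing_dirs ./mshadow ./dmlc-core/include \
    /usr/local/cuda/include /usr/local/cuda/lib64 /usr/local/lib
```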

Then run make again:

$ make

g++ -DMSHADOW_FORCE_STREAM -Wall -g -O3 -I./mshadow/ -I./dmlc-core/include -I/usr/local/cuda/include -I/usr/include -fPIC -msse3 -funroll-loops -Wno-unused-parameter -Wno-unknown-pragmas -DMSHADOW_USE_CBLAS=1 -DMSHADOW_USE_MKL=0 -DMSHADOW_RABIT_PS=0 -DMSHADOW_DIST_PS=0 -fPIC -DDMLC_USE_HDFS=0 -DDMLC_USE_S3=0 -DDMLC_USE_AZURE=0 -DCXXNET_USE_OPENCV=1 -DCXXNET_USE_OPENCV_DECODER=1 -fopenmp   -o bin/cxxnet src/local_main.cpp layer_cpu.o updater_cpu.o nnet_cpu.o main.o nnet_ps_server.o data.o dmlc-core/libdmlc.a layer_gpu.o updater_gpu.o nnet_gpu.o -pthread -lm -lcudart -lcublas -lcurand -lblas -lrt -L/usr/local/cuda/lib64 -L/usr/local/lib `pkg-config --libs opencv` -ljpeg

g++ -DMSHADOW_FORCE_STREAM -Wall -g -O3 -I./mshadow/ -I./dmlc-core/include -I/usr/local/cuda/include -I/usr/include -fPIC -msse3 -funroll-loops -Wno-unused-parameter -Wno-unknown-pragmas -DMSHADOW_USE_CBLAS=1 -DMSHADOW_USE_MKL=0 -DMSHADOW_RABIT_PS=0 -DMSHADOW_DIST_PS=0 -fPIC -DDMLC_USE_HDFS=0 -DDMLC_USE_S3=0 -DDMLC_USE_AZURE=0 -DCXXNET_USE_OPENCV=1 -DCXXNET_USE_OPENCV_DECODER=1 -fopenmp   -o bin/im2rec tools/im2rec.cc dmlc-core/libdmlc.a -pthread -lm -lcudart -lcublas -lcurand -lblas -lrt -L/usr/local/cuda/lib64 -L/usr/local/lib `pkg-config --libs opencv` -ljpeg

g++ -DMSHADOW_FORCE_STREAM -Wall -g -O3 -I./mshadow/ -I./dmlc-core/include -I/usr/local/cuda/include -I/usr/include -fPIC -msse3 -funroll-loops -Wno-unused-parameter -Wno-unknown-pragmas -DMSHADOW_USE_CBLAS=1 -DMSHADOW_USE_MKL=0 -DMSHADOW_RABIT_PS=0 -DMSHADOW_DIST_PS=0 -fPIC -DDMLC_USE_HDFS=0 -DDMLC_USE_S3=0 -DDMLC_USE_AZURE=0 -DCXXNET_USE_OPENCV=1 -DCXXNET_USE_OPENCV_DECODER=1 -fopenmp   -o bin/bin2rec tools/bin2rec.cc dmlc-core/libdmlc.a -pthread -lm -lcudart -lcublas -lcurand -lblas -lrt -L/usr/local/cuda/lib64 -L/usr/local/lib `pkg-config --libs opencv` -ljpeg

g++ -DMSHADOW_FORCE_STREAM -Wall -g -O3 -I./mshadow/ -I./dmlc-core/include -I/usr/local/cuda/include -I/usr/include -fPIC -msse3 -funroll-loops -Wno-unused-parameter -Wno-unknown-pragmas -DMSHADOW_USE_CBLAS=1 -DMSHADOW_USE_MKL=0 -DMSHADOW_RABIT_PS=0 -DMSHADOW_DIST_PS=0 -fPIC -DDMLC_USE_HDFS=0 -DDMLC_USE_S3=0 -DDMLC_USE_AZURE=0 -DCXXNET_USE_OPENCV=1 -DCXXNET_USE_OPENCV_DECODER=1 -fopenmp  -shared -o wrapper/libcxxnetwrapper.so wrapper/cxxnet_wrapper.cpp layer_cpu.o updater_cpu.o nnet_cpu.o main.o nnet_ps_server.o data.o dmlc-core/libdmlc.a layer_gpu.o updater_gpu.o nnet_gpu.o -pthread -lm -lcudart -lcublas -lcurand -lblas -lrt -L/usr/local/cuda/lib64 -L/usr/local/lib `pkg-config --libs opencv` -ljpeg

Now we can run an example:


$ cd example/MNIST

$ ./run.sh MNIST_CONV.conf 

libdc1394 error: Failed to initialize libdc1394

Use CUDA Device 0: GRID K520

finish initialization with 1 devices

Initializing layer: cv1

Initializing layer: 1

Initializing layer: 2

Initializing layer: 3

Initializing layer: fc1

Initializing layer: se1

Initializing layer: fc2

Initializing layer: 7

SGDUpdater: eta=0.100000, mom=0.900000

SGDUpdater: eta=0.100000, mom=0.900000

SGDUpdater: eta=0.100000, mom=0.900000

SGDUpdater: eta=0.100000, mom=0.900000

SGDUpdater: eta=0.100000, mom=0.900000

SGDUpdater: eta=0.100000, mom=0.900000

node[in].shape: 100,1,28,28

node[1].shape: 100,32,14,14

node[2].shape: 100,32,7,7

node[3].shape: 100,1,1,1568

node[4].shape: 100,1,1,100

node[5].shape: 100,1,1,100

node[6].shape: 100,1,1,10

MNISTIterator: load 60000 images, shuffle=1, shape=100,1,28,28

MNISTIterator: load 10000 images, shuffle=0, shape=100,1,28,28

initializing end, start working

round        0:[     600] 2 sec elapsed[1]      train-error:0.211783	test-error:0.0435

round        1:[     600] 3 sec elapsed[2]      train-error:0.0522667	test-error:0.0263

round        2:[     600] 5 sec elapsed[3]      train-error:0.0370833	test-error:0.0214

round        3:[     600] 7 sec elapsed[4]      train-error:0.0316167	test-error:0.023

round        4:[     600] 9 sec elapsed[5]      train-error:0.02905	test-error:0.0152

round        5:[     600] 11 sec elapsed[6]     train-error:0.0265167	test-error:0.0166

round        6:[     600] 13 sec elapsed[7]     train-error:0.0248333	test-error:0.0164

round        7:[     600] 15 sec elapsed[8]     train-error:0.0226667	test-error:0.0144

round        8:[     600] 17 sec elapsed[9]     train-error:0.0234167	test-error:0.0139

round        9:[     600] 19 sec elapsed[10]    train-error:0.0221	test-error:0.0152

round       10:[     600] 21 sec elapsed[11]    train-error:0.0218667	test-error:0.0121

round       11:[     600] 23 sec elapsed[12]    train-error:0.02025	test-error:0.0128

round       12:[     600] 24 sec elapsed[13]    train-error:0.01925	test-error:0.0142

round       13:[     600] 26 sec elapsed[14]    train-error:0.0194333	test-error:0.0129

round       14:[     600] 28 sec elapsed[15]    train-error:0.0190167	test-error:0.0114



updating end, 28 sec in all
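With longer runs it gets tedious to scan the log by eye for the best round. The sketch below picks the round with the lowest test-error from lines in the format shown above. It assumes the log format stays exactly as printed here, and best_round is a hypothetical helper, not something that ships with cxxnet.

```shell
#!/bin/sh
# Read cxxnet training output on stdin and print the round with the
# lowest test-error. Relies on the line format shown above, where the
# second field looks like "14:[" and the last like "test-error:0.0114".
# best_round is a hypothetical helper name.
best_round() {
    awk '
        /^round/ && /test-error:/ {
            split($NF, kv, ":")          # kv[2] holds the error value
            if (!seen || kv[2] + 0 < best) {
                seen = 1
                best = kv[2] + 0         # numeric copy for comparison
                best_text = kv[2]        # original text for printing
                round = $2 + 0           # "14:[" converts to 14
            }
        }
        END { if (seen) printf "round %d test-error %s\n", round, best_text }
    '
}

# The last three rounds from the run above.
best_round <<'EOF'
round       12:[     600] 24 sec elapsed[13]    train-error:0.01925 test-error:0.0142
round       13:[     600] 26 sec elapsed[14]    train-error:0.0194333 test-error:0.0129
round       14:[     600] 28 sec elapsed[15]    train-error:0.0190167 test-error:0.0114
EOF
# prints: round 14 test-error 0.0114
```

To use it on a real run, one option is to save the output first, for example with `./run.sh MNIST_CONV.conf 2>&1 | tee train.log` (train.log is just an example name), and then feed that file to best_round.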

At this point you can proceed to work with the examples provided by the cxxnet authors:

https://github.com/dmlc/cxxnet/tree/master/example

Posted: 2015-09-08 19:09:09 Commenter: Silcowitz

Great stuff. Could you make the image available as a community image?

Posted: 2015-09-09 23:00:51 Commenter: Bing Xu

@Silcowitz Thanks for your interest and suggestion. We are currently building a new-generation toolkit that fully supports Python/R. It is called MXNet (https://github.com/dmlc/mxnet). We will make an image of MXNet available when we finish it!

