Acknowledgments

July 27, 2014 7:09 pm

This donation drive received 84 donations, totaling 72,127.29 yuan. After coordination among several parties, the funds were distributed to 14 households of students in financial difficulty and 8 impoverished families. The distribution was completed under the supervision of the Mingyue Village Committee of Wengtian Town, Wenchang City.

The donation drive has now ended; please do not send us any more donations.

Thank you to everyone who shared our appeal and donated. The typhoon mercilessly destroyed our homes, but your kindness has brought hope to our fellow villagers.

[photos: IMG_0473, IMG_0474]

An Appeal to All Friends for Help

July 23, 2014 7:32 pm

After much deliberation, I have decided to appeal for help to all of you, friends I know and friends I have never met.

Just a few days ago, Typhoon Rammasun (威马逊), the strongest typhoon to hit South China in 41 years, made landfall in my hometown of Wenchang, Hainan. In Wengtian Town, which the center of the typhoon passed over, nearly all the rural houses collapsed, all the poultry and livestock died, all the crops were destroyed, and restoring electricity and returning to normal life and production are nowhere in sight... A high school classmate of mine comes from Longtou Village in Wengtian Town, one of the hardest-hit villages. Her father is a local teacher. The village has quite a few students from poor families who have been admitted to secondary school or university; after this disaster, their families can no longer afford their tuition and living expenses for the new semester. There are also elderly people living in hardship who lost their houses and all their crops in this disaster and have fallen into despair...

To help the affected villagers through this difficult time, my classmates and I have decided to raise some relief funds for them. Part of the money will go toward the poor students' tuition and living expenses for the next semester, and the rest will be used to care for the elderly victims. My classmates and I will make sure that the funds are delivered to the children and the elderly promptly and in full.

If you are willing to offer my fellow villagers a little help, please send your donation to one of the accounts below. No matter the amount, it will mean a great deal to the children and the elderly. We will hand the funds directly to them, and we will publish the detailed amounts and distribution records in a timely manner. To help us keep the books straight, please notify us by text message after you transfer the money.

Account name: 陈庆晓
Account number: 6222022002005303585
Bank: 工商银行珠海迎宾支行 (ICBC, Zhuhai Yingbin Sub-branch)
Mobile: 13928003269

Account name: 陈美文
Account number: 6226621105036956
Bank: 中国光大银行海口海甸支行 (China Everbright Bank, Haikou Haidian Sub-branch)
Mobile: 15120798161

Alipay account: 7559902@qq.com
Account name: 黄小花
Mobile: 13876360625

On behalf of the children and elderly of my hometown, thank you, kind people!

[photos: 13.pic_hd, 12.pic, 10.pic]

In Wengtian Town, which the center of the typhoon passed over, nearly all the rural houses collapsed. To say that these children and elderly people are left without a tile over their heads or a patch of ground to stand on is no exaggeration at all.

A Few Books

July 19, 2014 3:31 pm

[book cover: s1556748]

I read this in one sitting when I had just arrived in Sydney.

[book cover: s3628415]

I read this in one sitting one night about a month ago, and then couldn't fall asleep for a long while.

[book cover: AK0082]

I read this one on and off over the past two weeks, three to five chapters before bed each night. I finished it last night.

Multi Node Condor and Pegasus on Ubuntu 12.04 on AWS EC2

June 24, 2014 8:24 pm

Recently I needed to run some experiments with the Montage astronomical image mosaic engine, using Pegasus as the workflow management system. This involves setting up a Condor cluster and Pegasus on the submit host, plus several other steps to run Montage in such an environment. After an extensive search on the Internet, I found that there is no good documentation on how to accomplish this complicated task with my favorite Linux distribution, Ubuntu 12.04. So I decided to write a tutorial on this topic, in the hope that it might save someone else's time in the future.

This tutorial includes the following three parts:

Single Node Condor and Pegasus on Ubuntu 12.04 on AWS EC2
Multi Node Condor and Pegasus on Ubuntu 12.04 on AWS EC2
Running Montage with Pegasus on AWS EC2

In the previous tutorial we set up a single node Condor cluster with Pegasus. Now we will expand the Condor cluster to include multiple worker nodes. Keep the previous EC2 instance running; we will call it the Master Node. The other Condor nodes will receive tasks from the Master Node, so we will call them Worker Nodes.

In this tutorial, we will show how to add one Worker Node to the cluster.

[STEP 1: Update Security Group Settings]

The Master Node and the Worker Node must be able to communicate with each other. The easiest way to achieve this is to run both nodes in the same VPC and assign them the same security group. Edit the inbound rules of the security group and add a rule that allows all traffic from within the security group.
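
If you prefer the AWS CLI to the console, a rule like the following should work (sg-xxxxxxxx is a placeholder for your own security group ID):

$ aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx \
    --ip-permissions '[{"IpProtocol":"-1","UserIdGroupPairs":[{"GroupId":"sg-xxxxxxxx"}]}]'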

[STEP 2: Install Condor]

Similar to the previous tutorial, download the latest version of HTCondor (native package) for Ubuntu 12.04 from the following URL. What I downloaded is condor-8.1.6-247684-ubuntu_12.04_amd64.deb; the actual filename might change over time.

http://research.cs.wisc.edu/htcondor/downloads/

Install Condor using the following commands:

$ sudo dpkg -i condor-8.1.6-247684-ubuntu_12.04_amd64.deb
$ sudo apt-get update
$ sudo apt-get install -f
$ sudo apt-get install chkconfig
$ sudo chkconfig condor on
$ sudo service condor start

Now Condor should be up and running, and it should start automatically when the system boots. Check the status of Condor using the following commands:

$ condor_status
Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@ip-10-0-5-11 LINUX      X86_64 Unclaimed Benchmar  0.060 1862  0+00:00:04
slot2@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:00:05
slot3@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:00:06
slot4@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:00:07
                     Total Owner Claimed Unclaimed Matched Preempting Backfill

        X86_64/LINUX     4     0       0         4       0          0        0

               Total     4     0       0         4       0          0        0

$ condor_q


-- Submitter: ip-10-0-5-114.ec2.internal :  : ip-10-0-5-114.ec2.internal
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               

0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended

[STEP 3: Configure the Condor Master Node]

Use a text editor to open /etc/condor/condor_config, and add the following line to the end of the file:

ALLOW_WRITE = *
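
Equivalently, you can append the setting from the shell (assuming the default config file location):

$ echo 'ALLOW_WRITE = *' | sudo tee -a /etc/condor/condor_config

Note that ALLOW_WRITE = * is wide open; in this setup it is the security group that actually restricts who can reach the node.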

Then restart Condor with the following command:

$ sudo service condor restart

Also, find the private IP address of the Master Node; you will need it when configuring the Worker Node.
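
One way to do this on EC2 (an assumption on my part; any method that shows the private IP will do) is to query the instance metadata service from the Master Node:

$ curl http://169.254.169.254/latest/meta-data/local-ipv4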

[STEP 4: Configure the Condor Worker Node]

Now we go ahead and configure the Worker Node. Use a text editor to open /etc/condor/condor_config.local and find the following line:

CONDOR_HOST = $(FULL_HOSTNAME)

and replace $(FULL_HOSTNAME) with the IP address of the Master Node. Assuming the IP address of the Master Node is 192.168.1.1, the line should look like the following:

CONDOR_HOST = 192.168.1.1
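
If you are scripting the setup, a sed one-liner (a sketch, assuming the line appears exactly as shown above) makes the same edit:

$ sudo sed -i 's/^CONDOR_HOST = .*/CONDOR_HOST = 192.168.1.1/' /etc/condor/condor_config.local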

Then restart Condor using the following command:

$ sudo service condor restart

Now, running condor_status on either the Master Node or the Worker Node will show both nodes. In the following example, both the Master Node and the Worker Node are c3.xlarge instances. Each c3.xlarge instance has 4 vCPUs, so we see 8 slots in the cluster.

$ condor_status
Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:04:36
slot2@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:05:05
slot3@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:05:06
slot4@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:05:07
slot1@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.040 1862  0+00:04:36
slot2@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:05:05
slot3@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:05:06
slot4@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:05:07
                     Total Owner Claimed Unclaimed Matched Preempting Backfill

        X86_64/LINUX     8     0       0         8       0          0        0

               Total     8     0       0         8       0          0        0

[STEP 5: Add More Worker Nodes]

To add more Worker Nodes, you can create an AMI from the first Worker Node, then launch as many Worker Nodes as needed. Since the AMI carries the above-mentioned configuration, the new instances should join the cluster automatically once they reach the running state.
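
With the AWS CLI, the same steps look roughly like this (all resource IDs below are placeholders):

$ aws ec2 create-image --instance-id i-xxxxxxxx --name condor-worker
$ aws ec2 run-instances --image-id ami-xxxxxxxx --count 4 \
    --instance-type c3.xlarge --key-name yourkey \
    --security-group-ids sg-xxxxxxxx --subnet-id subnet-xxxxxxxx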

Single Node Condor and Pegasus on Ubuntu 12.04 on AWS EC2

June 24, 2014 6:46 pm

Recently I needed to run some experiments with the Montage astronomical image mosaic engine, using Pegasus as the workflow management system. This involves setting up a Condor cluster and Pegasus on the submit host, plus several other steps to run Montage in such an environment. After an extensive search on the Internet, I found that there is no good documentation on how to accomplish this complicated task with my favorite Linux distribution, Ubuntu 12.04. So I decided to write a tutorial on this topic, in the hope that it might save someone else's time in the future.

This tutorial includes the following three parts:

Single Node Condor and Pegasus on Ubuntu 12.04 on AWS EC2
Multi Node Condor and Pegasus on Ubuntu 12.04 on AWS EC2
Running Montage with Pegasus on AWS EC2

[STEP 1: Create an EC2 Instance for the Master Host]

Create an EC2 instance from an Ubuntu 12.04 AMI. For testing purposes you might wish to take advantage of spot instances to save money. Use one of the compute optimized (C3) instance types so that you will see multiple Condor slots on a single EC2 instance. In this document I use c3.xlarge, which has 4 vCPUs.

For a single node setup, all you need to do with the security group is open port 22 to your IP address so that you can SSH into the instance when it is up.
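
With the AWS CLI, such a rule can be added as follows (the group ID and the source IP are placeholders):

$ aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx \
    --protocol tcp --port 22 --cidr 203.0.113.10/32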

When the instance is up and running, SSH to the instance.

$ ssh -i yourkey.pem ubuntu@ip_of_the_instance

[STEP 2: Install Condor]

Download the latest version of HTCondor (native package) for Ubuntu 12.04 from the following URL. What I downloaded is condor-8.1.6-247684-ubuntu_12.04_amd64.deb; the actual filename might change over time.

http://research.cs.wisc.edu/htcondor/downloads/

Install Condor using the following commands:

$ sudo dpkg -i condor-8.1.6-247684-ubuntu_12.04_amd64.deb
$ sudo apt-get update
$ sudo apt-get install -f
$ sudo apt-get install chkconfig
$ sudo chkconfig condor on
$ sudo service condor start

Now Condor should be up and running, and it should start automatically when the system boots. Check the status of Condor using the following commands:

$ condor_status
Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@ip-10-0-5-11 LINUX      X86_64 Unclaimed Benchmar  0.060 1862  0+00:00:04
slot2@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:00:05
slot3@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:00:06
slot4@ip-10-0-5-11 LINUX      X86_64 Unclaimed Idle      0.000 1862  0+00:00:07
                     Total Owner Claimed Unclaimed Matched Preempting Backfill

        X86_64/LINUX     4     0       0         4       0          0        0

               Total     4     0       0         4       0          0        0
$ condor_q


-- Submitter: ip-10-0-5-114.ec2.internal :  : ip-10-0-5-114.ec2.internal
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               

0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended

[STEP 3: Install Pegasus]

Pegasus needs Java (1.6 or higher) and Python (2.4 or higher). Ubuntu 12.04 comes with Python 2.7 but not Java, so we need to install Java first. Optionally, Pegasus also uses Globus for grid support; we will take care of Globus later.

$ sudo apt-get install openjdk-7-jdk
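
You can verify the Java installation with:

$ java -version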

Then we configure the Pegasus repository and install Pegasus.

$ gpg --keyserver pgp.mit.edu --recv-keys 81C2A4AC
$ gpg -a --export 81C2A4AC | sudo apt-key add -  

Add the following line to /etc/apt/sources.list:

deb http://download.pegasus.isi.edu/wms/download/debian wheezy main
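
For example, appending it from the shell:

$ echo 'deb http://download.pegasus.isi.edu/wms/download/debian wheezy main' | sudo tee -a /etc/apt/sources.list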

Update the repository and install Pegasus:

$ sudo apt-get update
$ sudo apt-get install pegasus

Now we should have Pegasus installed on the system. Check the installation with the following command. If you see similar output, congratulations!

$ pegasus-status
(no matching jobs found in Condor Q)

Pegasus comes with some examples; we will use these examples to test the installation further.

$ cd ~
$ cp -r /usr/share/pegasus/examples .
$ cd examples/hello-world
$ ls
dax-generator.py  hello.sh  pegasusrc  submit  world.sh

Run the hello-world example:

$ ./submit 
2014.06.24 10:34:00.455 UTC:   Submitting job(s). 
2014.06.24 10:34:00.460 UTC:   1 job(s) submitted to cluster 1. 
2014.06.24 10:34:00.465 UTC:    
2014.06.24 10:34:00.471 UTC:   ----------------------------------------------------------------------- 
2014.06.24 10:34:00.476 UTC:   File for submitting this DAG to Condor           : hello_world-0.dag.condor.sub 
2014.06.24 10:34:00.481 UTC:   Log of DAGMan debugging messages                 : hello_world-0.dag.dagman.out 
2014.06.24 10:34:00.487 UTC:   Log of Condor library output                     : hello_world-0.dag.lib.out 
2014.06.24 10:34:00.492 UTC:   Log of Condor library error messages             : hello_world-0.dag.lib.err 
2014.06.24 10:34:00.497 UTC:   Log of the life of condor_dagman itself          : hello_world-0.dag.dagman.log 
2014.06.24 10:34:00.503 UTC:    
2014.06.24 10:34:00.508 UTC:   ----------------------------------------------------------------------- 
2014.06.24 10:34:00.513 UTC:    
2014.06.24 10:34:00.519 UTC:   Your workflow has been started and is running in the base directory: 
2014.06.24 10:34:00.524 UTC:    
2014.06.24 10:34:00.530 UTC:     /home/ubuntu/examples/hello-world/work/ubuntu/pegasus/hello_world/20140624T103359+0000 
2014.06.24 10:34:00.535 UTC:    
2014.06.24 10:34:00.540 UTC:   *** To monitor the workflow you can run *** 
2014.06.24 10:34:00.546 UTC:    
2014.06.24 10:34:00.551 UTC:     pegasus-status -l /home/ubuntu/examples/hello-world/work/ubuntu/pegasus/hello_world/20140624T103359+0000 
2014.06.24 10:34:00.556 UTC:    
2014.06.24 10:34:00.562 UTC:   *** To remove your workflow run *** 
2014.06.24 10:34:00.567 UTC:    
2014.06.24 10:34:00.572 UTC:     pegasus-remove /home/ubuntu/examples/hello-world/work/ubuntu/pegasus/hello_world/20140624T103359+0000 
2014.06.24 10:34:00.578 UTC:    
2014.06.24 10:34:01.024 UTC:   Time taken to execute is 1.109 seconds

Check the status of the Pegasus jobs and the Condor queue using the pegasus-status and condor_q commands:

$ pegasus-status
STAT  IN_STATE  JOB                                               
Run      01:05  hello_world-0                                     
Summary: 1 Condor job total (R:1)

$ condor_q


-- Submitter: ip-10-0-5-114.ec2.internal :  : ip-10-0-5-114.ec2.internal
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
   1.0   ubuntu          6/24 10:34   0+00:01:28 R  0   0.0  pegasus-dagman -f 

1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended

After the hello-world workflow has been executed, a trace file (jobstate.log) can be found in the work directory. Workflow-related information is hidden several sub-directories deep; in my case the directory is ~/examples/hello-world/work/ubuntu/pegasus/hello_world/20140624T103359+0000. Note that the last sub-directory is a timestamp that depends on when you submitted the workflow.
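
To take a quick look at the trace (the run directory below is from my submission; yours will have a different timestamp):

$ cd ~/examples/hello-world/work/ubuntu/pegasus/hello_world/20140624T103359+0000
$ tail jobstate.log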

As a bonus to this tutorial, I have prepared an AMI with the above-mentioned setup and made it publicly available to the community. If all you need is a single node Pegasus + Condor configuration, you don't need to repeat any of the above steps: just launch an EC2 instance with AMI ami-5ee01b36 in the US-EAST-1 (N. Virginia) region. If you need to run this in another region, copy the AMI to the desired region and launch an instance from the copied AMI there.
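
With the AWS CLI, copying the AMI to another region looks something like this (the destination region and image name are examples):

$ aws ec2 copy-image --source-region us-east-1 --source-image-id ami-5ee01b36 \
    --region us-west-2 --name pegasus-condor-ubuntu1204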
