Nov 14 : EBS disk loses information after it is disconnected and reconnected. I used the following commands:
mkdir /storage
mount /dev/sdf1 /storage
cd /storage
ls
The file structure seems to be fine, but when I try to read the files I get this error for some of them:
[root@ip-10-251-198-4 wL-pages-iter2]# cat * >/dev/null
cat: wL1part-00000: Input/output error
cat: wL1part-00004: Input/output error
cat: wL1part-00005: Input/output error
cat: wL1part-00009: Input/output error
cat: wL2part-00000: Input/output error
cat: wL2part-00004: Input/output error
[root@ip-10-251-198-4 wL-pages-iter2]# pwd
/storage/iter/bad/wL-pages-iter2
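The `cat * >/dev/null` check above covers only one directory. A minimal sketch of the same test applied to a whole tree, assuming a POSIX shell with `find` and `cat` available; `find_unreadable` is a hypothetical helper name, not part of any tool:

```shell
#!/bin/sh
# find_unreadable DIR
# Print every regular file under DIR that cannot be read end to end.
# (Hypothetical helper; it only reads files and never modifies them.)
find_unreadable() {
    dir="$1"
    find "$dir" -type f | while IFS= read -r f; do
        # Read the whole file and discard the data. A failed read,
        # such as an Input/output error, makes cat exit non-zero.
        if ! cat "$f" > /dev/null 2>&1; then
            printf '%s\n' "$f"
        fi
    done
}

# Example call (path taken from the session above):
#   find_unreadable /storage/iter/bad
```

An empty output means every file could be read; any path printed is a file that failed at some point during the read.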
Note: the EBS disk was partitioned and formatted on exactly the same operating system in the previous session:
fdisk /dev/sdf
mkfs.ext3 /dev/sdf1
Nov 13 :
*scp RCF --> Amazon: 3 MB/sec, ~GB files
*scp Amazon --> Amazon: 5-8 MB/sec, ~GB files
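To put these rates in perspective, here is a rough sketch of the transfer time per file. It assumes a file of exactly 1 GB taken as 1024 MB (the notes only say "~GB") and a constant rate; `transfer_seconds` is a hypothetical helper name:

```shell
#!/bin/sh
# transfer_seconds SIZE_MB RATE_MB_PER_SEC
# Print the approximate transfer time in whole seconds
# (integer division, so the result is rounded down).
transfer_seconds() {
    echo $(( $1 / $2 ))
}

transfer_seconds 1024 3   # RCF --> Amazon at 3 MB/sec: prints 341
transfer_seconds 1024 5   # Amazon --> Amazon, slow end: prints 204
transfer_seconds 1024 8   # Amazon --> Amazon, fast end: prints 128
```

So under these assumptions one such file takes roughly 6 minutes from RCF and 2 to 3.5 minutes between Amazon machines.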
Nov 13 :
*Matt's customized Ubuntu w/o STAR software - 4-6 minutes, the smallest machine $0.10
*default public Fedora from EC2 : ~2 minutes
*launching Cloudera cluster 1+4 or 1+10 seems to take similar time of ~5 minutes
Nov 14 :
*there is a limit of 20 on the number of EC2 machines I could launch at once with the command:
hadoop-ec2 launch-cluster my-hadoop-cluster 19
Asking for '20' would not work. This is my configuration:
> cat */.hadoop-ec2/ec2-clusters.cfg
ami=ami-6159bf08
instance_type=m1.small
key_name=janAmazonKey2
availability_zone=us-east-1a
private_key=/home/training/.ec2/id_rsa-janAmazonKey2
ssh_options=-i %(private_key)s -o StrictHostKeyChecking=no
Make sure to assign the proper availability zone if you use an EBS disk: a volume can only be attached to an instance running in the same zone.
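A small sketch of how the zone could be checked before launching, assuming the config is plain `key=value` lines as listed above and that the zone of the EBS volume is known and typed in by hand. `cfg_zone` and `check_zone` are hypothetical helper names, and no EC2 tool is called:

```shell
#!/bin/sh
# cfg_zone FILE
# Print the availability_zone value from a key=value config file.
cfg_zone() {
    sed -n 's/^availability_zone=//p' "$1" | head -n 1
}

# check_zone FILE VOLUME_ZONE
# Compare the zone in the cluster config with the zone of the EBS
# volume (supplied by the user). Returns non-zero on a mismatch.
check_zone() {
    zone=$(cfg_zone "$1")
    if [ "$zone" = "$2" ]; then
        echo "ok: cluster and volume are both in $zone"
    else
        echo "mismatch: cluster zone '$zone', volume zone '$2'" >&2
        return 1
    fi
}

# Example call (replace CFG with the path to ec2-clusters.cfg):
#   check_zone CFG us-east-1a
```

The comparison is a plain string match, so it only catches a config that names a different zone than the one entered for the volume.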