There was a power outage at the office.
All of the Hadoop development machines went down.
After rebooting and bringing the Hadoop NameNode back up, the error below appeared (the JobTracker cannot clean its system directory because the NameNode is still in safe mode):
2013-01-03 08:30:29,803 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: hdfs://namenode:9000/data1/hadoop/filesystem/mapreduce/system
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /data1/hadoop/filesystem/mapreduce/system. Name node is in safe mode.
The ratio of reported blocks 0.0000 has not reached the threshold 0.9990. Safe mode will be turned off automatically.
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1992)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1972)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:792)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
	at org.apache.hadoop.ipc.Client.call(Client.java:1066)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
	at $Proxy5.delete(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
	at $Proxy5.delete(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:828)
	at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:234)
	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2410)
	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2192)
	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2186)
	at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:300)
	at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:291)
	at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4978)
This error appears when Hadoop was not shut down cleanly.
After an unclean shutdown, the NameNode comes back up in safe mode and stays there until enough DataNodes have reported their blocks (the 0.9990 threshold in the log above). To avoid problems on restart, force the NameNode out of safe mode with the command below:
]$ ./bin/hadoop dfsadmin -safemode leave
Safe mode is OFF
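For illustration, the condition behind the "ratio of reported blocks 0.0000 has not reached the threshold 0.9990" message can be sketched as a small shell calculation. This is not a real NameNode call; the `reported` and `total` block counts here are made-up example values:

```shell
# Sketch of the safe-mode exit condition (example values, not real counters):
# the NameNode leaves safe mode once reported/total >= the configured threshold.
reported=0
total=1000
threshold=0.9990
ratio=$(awk -v r="$reported" -v t="$total" 'BEGIN { printf "%.4f", (t > 0) ? r / t : 0 }')
state=$(awk -v x="$ratio" -v th="$threshold" 'BEGIN { print (x >= th) ? "leave safe mode" : "stay in safe mode" }')
echo "ratio=$ratio -> $state"   # ratio=0.0000 -> stay in safe mode
```

Right after a cluster-wide power loss no DataNode has reported yet, which is why the ratio in the log is exactly 0.0000. Forcing `-safemode leave` skips this wait, which is acceptable here because the DataNodes are expected to come back with their blocks intact.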
Next, bringing the DataNode up produced a different error:
2013-01-04 02:26:11,968 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for /data1/hadoop, expected: rwxr-xr-x, while actual: rwxrwxrwx
2013-01-04 02:26:12,354 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid value for volsFailed : 1 , Volumes tolerated : 0
	at org.apache.hadoop.hdfs.server.datanode.FSDataset.<init>(FSDataset.java:951)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:380)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:290)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1553)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1492)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1510)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1636)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1653)
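The "Volumes tolerated : 0" part of the error comes from the `dfs.datanode.failed.volumes.tolerated` setting: with the default of 0, the DataNode aborts as soon as any `dfs.data.dir` volume fails its disk check, which is exactly what happened here. A sketch of the relevant hdfs-site.xml fragment (for reference only; the real fix in this case was the permission change below, not raising this value):

```xml
<!-- hdfs-site.xml: number of dfs.data.dir volumes allowed to fail
     before the DataNode refuses to start (default 0). -->
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>0</value>
</property>
```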
Checking the directory…
]$ cd /data1/
]$ ll
total 36
drwxr-xr-x  6 root      root       4096 2012-04-18 09:46 ./
drwxr-xr-x 25 root      root       4096 2012-02-27 21:31 ../
drwxrwxrwx  7 hadoop    hadoop     4096 2012-12-20 16:05 hadoop/
drwxr-xr-x  2 hbase     hbase      4096 2012-04-18 09:46 hbase/
drwx------  2 root      root      16384 2012-02-27 21:28 lost+found/
drwxr-xr-x  3 zookeeper zookeeper  4096 2013-01-03 08:32 zookeeper/
Huh? Why is it 777?
Changed it back to 755:
]$ chmod 755 hadoop/
]$ ll
total 36
drwxr-xr-x  6 root      root       4096 2012-04-18 09:46 ./
drwxr-xr-x 25 root      root       4096 2012-02-27 21:31 ../
drwxr-xr-x  7 hadoop    hadoop     4096 2012-12-20 16:05 hadoop/
drwxr-xr-x  2 hbase     hbase      4096 2012-04-18 09:46 hbase/
drwx------  2 root      root      16384 2012-02-27 21:28 lost+found/
drwxr-xr-x  3 zookeeper zookeeper  4096 2013-01-03 08:32 zookeeper/
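The check the DataNode's DiskChecker performs can be reproduced locally on a throwaway directory. This sketch uses a hypothetical temp directory and assumes GNU coreutils `stat` (`-c '%a'`):

```shell
# Reproduce the dfs.data.dir permission check on a temp directory.
dir=$(mktemp -d)
chmod 777 "$dir"
echo "before: $(stat -c '%a' "$dir")"   # 777 (rwxrwxrwx) -> DataNode rejects the volume
chmod 755 "$dir"
echo "after:  $(stat -c '%a' "$dir")"   # 755 (rwxr-xr-x) -> matches the expected mode
rmdir "$dir"
```

Running a quick check like this against every directory listed in `dfs.data.dir` before starting the DataNode would have caught the stray 777 immediately.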
Started everything again, and this time it came up cleanly!!