Inflearn Community Q&A

celestial_

15-Day Big Data Pilot Project

Teacher, I have a question!

Posted 21.05.06 20:45

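After loading data into HBase yesterday, I was following along and running a few queries without trouble.

The first few simple queries worked, but the last query from the lecture video suddenly failed: a long message appeared saying there was a problem with the server, and the query was aborted.
At the same time, the page on port 16010 also reports that the server is down.

When I reconnect to the hbase shell and check again, the same problem occurs.

I plan to retry today and fix it myself, but just in case, I would really appreciate it if you could look at the message below and help.

Thank you :)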

 04002130106102-S0001             column=cf1:area_number, timestamp=1620222187986, value=C01

 04002130106102-S0001             column=cf1:car_number, timestamp=1620222187986, value=S0001

 04006030106102-F0004             column=cf1:area_number, timestamp=1620221075233, value=C01

 04006030106102-F0004             column=cf1:car_number, timestamp=1620221075233, value=F0004

 04006030106102-R0010             column=cf1:area_number, timestamp=1620221077774, value=C01

 04006030106102-R0010             column=cf1:car_number, timestamp=1620221077774, value=R0010

 04007030106102-R0010             column=cf1:area_number, timestamp=1620221258521, value=C01

 04007030106102-R0010             column=cf1:car_number, timestamp=1620221258521, value=R0010

 04008030106102-R0010             column=cf1:area_number, timestamp=1620221447624, value=C01

 04008030106102-R0010             column=cf1:car_number, timestamp=1620221447624, value=R0010

 04018030106102-W0003             column=cf1:area_number, timestamp=1620221477673, value=C01

 04018030106102-W0003             column=cf1:car_number, timestamp=1620221477673, value=W0003

 04020030106102-S0001             column=cf1:area_number, timestamp=1620220033655, value=C01

 04020030106102-S0001             column=cf1:car_number, timestamp=1620220033655, value=S0001

 04021030106102-F0004             column=cf1:area_number, timestamp=1620220218763, value=C01

 04021030106102-F0004             column=cf1:car_number, timestamp=1620220218763, value=F0004

 04022030106102-R0010             column=cf1:area_number, timestamp=1620220402111, value=C01

 04022030106102-R0010             column=cf1:car_number, timestamp=1620220402111, value=R0010

 04023030106102-S0001             column=cf1:area_number, timestamp=1620220585394, value=C01

 04023030106102-S0001             column=cf1:car_number, timestamp=1620220585394, value=S0001

 04025030106102-W0003             column=cf1:area_number, timestamp=1620220955061, value=C01

 04025030106102-W0003             column=cf1:car_number, timestamp=1620220955061, value=W0003

 04028030106102-S0001             column=cf1:area_number, timestamp=1620221510655, value=C01

 04028030106102-S0001             column=cf1:car_number, timestamp=1620221510655, value=S0001

 04032030106102-F0004             column=cf1:area_number, timestamp=1620220432140, value=C01

 04032030106102-F0004             column=cf1:car_number, timestamp=1620220432140, value=F0004

 04032030106102-R0010             column=cf1:area_number, timestamp=1620220432244, value=C01

 04032030106102-R0010             column=cf1:car_number, timestamp=1620220432244, value=R0010

 04032130106102-R0010             column=cf1:area_number, timestamp=1620222283736, value=C01

 04032130106102-R0010             column=cf1:car_number, timestamp=1620222283736, value=R0010

 04032130106102-W0003             column=cf1:area_number, timestamp=1620222281189, value=C01

 04032130106102-W0003             column=cf1:car_number, timestamp=1620222281189, value=W0003

 04033030106102-W0003             column=cf1:area_number, timestamp=1620220615511, value=C01

 04033030106102-W0003             column=cf1:car_number, timestamp=1620220615511, value=W0003

 04034030106102-R0010             column=cf1:area_number, timestamp=1620220798841, value=C01

 04034030106102-R0010             column=cf1:car_number, timestamp=1620220798841, value=R0010

 04035030106102-F0004             column=cf1:area_number, timestamp=1620220985272, value=C01

 04035030106102-F0004             column=cf1:car_number, timestamp=1620220985272, value=F0004

java.net.SocketTimeoutException: callTimeout=60000, callDuration=62676: Failed after attempts=8, exceptions:

Wed May 05 22:45:51 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, java.net.ConnectException: Call to server02.hadoop.com/192.168.56.102:16020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:51 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, org.apache.hadoop.hbase.ipc.FailedServerException: Call to server02.hadoop.com/192.168.56.102:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:51 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, org.apache.hadoop.hbase.ipc.FailedServerException: Call to server02.hadoop.com/192.168.56.102:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:51 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, org.apache.hadoop.hbase.ipc.FailedServerException: Call to server02.hadoop.com/192.168.56.102:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:52 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, org.apache.hadoop.hbase.ipc.FailedServerException: Call to server02.hadoop.com/192.168.56.102:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:53 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, java.net.ConnectException: Call to server02.hadoop.com/192.168.56.102:16020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:55 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, java.net.ConnectException: Call to server02.hadoop.com/192.168.56.102:16020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:59 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, java.net.ConnectException: Call to server02.hadoop.com/192.168.56.102:16020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: server02.hadoop.com/192.168.56.102:16020

 row '' on table 'DriverCarInfo' at region=DriverCarInfo,,1620216526637.d4206de09025ad8786e7fe1c35fea90e., hostname=server02.hadoop.com,16020,1620216046538, seqNum=2

        at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:159)

        at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748)

Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=8, exceptions:

Wed May 05 22:45:51 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, java.net.ConnectException: Call to server02.hadoop.com/192.168.56.102:16020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:51 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, org.apache.hadoop.hbase.ipc.FailedServerException: Call to server02.hadoop.com/192.168.56.102:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:51 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, org.apache.hadoop.hbase.ipc.FailedServerException: Call to server02.hadoop.com/192.168.56.102:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:51 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, org.apache.hadoop.hbase.ipc.FailedServerException: Call to server02.hadoop.com/192.168.56.102:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:52 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, org.apache.hadoop.hbase.ipc.FailedServerException: Call to server02.hadoop.com/192.168.56.102:16020 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:53 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, java.net.ConnectException: Call to server02.hadoop.com/192.168.56.102:16020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:55 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, java.net.ConnectException: Call to server02.hadoop.com/192.168.56.102:16020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: server02.hadoop.com/192.168.56.102:16020

Wed May 05 22:45:59 KST 2021, RpcRetryingCaller{globalStartTime=1620222351334, pause=100, maxAttempts=8}, java.net.ConnectException: Call to server02.hadoop.com/192.168.56.102:16020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: server02.hadoop.com/192.168.56.102:16020

        at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:145)

        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:386)

        at org.apache.hadoop.hbase.client.HTable.get(HTable.java:360)

        at org.apache.hadoop.hbase.MetaTableAccessor.getTableState(MetaTableAccessor.java:1078)

        at org.apache.hadoop.hbase.client.ConnectionImplementation.getTableState(ConnectionImplementation.java:1970)

        at org.apache.hadoop.hbase.client.ConnectionImplementation.isTableDisabled(ConnectionImplementation.java:605)

        at org.apache.hadoop.hbase.client.ConnectionImplementation.relocateRegion(ConnectionImplementation.java:731)

        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:328)

        at org.apache.hadoop.hbase.client.ScannerCallable.prepare(ScannerCallable.java:139)

        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.prepare(ScannerCallableWithReplicas.java:399)

        at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)

        ... 4 more

Caused by: java.net.ConnectException: Call to server02.hadoop.com/192.168.56.102:16020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: server02.hadoop.com/192.168.56.102:16020

        at org.apache.hadoop.hbase.ipc.IPCUtil.wrapException(IPCUtil.java:178)

        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.onCallFinished(AbstractRpcClient.java:390)

        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$100(AbstractRpcClient.java:95)

        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:410)

        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$3.run(AbstractRpcClient.java:406)

        at org.apache.hadoop.hbase.ipc.Call.callComplete(Call.java:103)

        at org.apache.hadoop.hbase.ipc.Call.setException(Call.java:118)

        at org.apache.hadoop.hbase.ipc.BufferCallBeforeInitHandler.userEventTriggered(BufferCallBeforeInitHandler.java:92)

        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:326)

        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:312)

        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:304)

        at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.userEventTriggered(DefaultChannelPipeline.java:1426)

        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:326)

        at org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:312)

        at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireUserEventTriggered(DefaultChannelPipeline.java:924)

        at org.apache.hadoop.hbase.ipc.NettyRpcConnection.failInit(NettyRpcConnection.java:179)

        at org.apache.hadoop.hbase.ipc.NettyRpcConnection.access$500(NettyRpcConnection.java:71)

        at org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:267)

        at org.apache.hadoop.hbase.ipc.NettyRpcConnection$3.operationComplete(NettyRpcConnection.java:261)

        at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:502)

        at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:495)

        at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:474)

        at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:415)

        at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:540)

        at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:533)

        at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:114)

        at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:327)

        at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:343)

        at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:665)

        at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:612)

        at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:529)

        at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:491)

        at org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)

        at org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

        ... 1 more

Caused by: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: server02.hadoop.com/192.168.56.102:16020

        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)

        at org.apache.hbase.thirdparty.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327)

        at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340)

        ... 7 more

Caused by: java.net.ConnectException: Connection refused

        ... 11 more

ERROR: Connection refused

For usage try 'help "scan"'

 

2 Answers

min102

2021. 05. 30. 23:39

Big.D,

I get a similar error log when I run count 'DriverCarInfo'.

In CM, the HBase Master health is shown as bad.

Even when I bring it back up, it dies again almost immediately. What should I do about this?

Big.D
Instructor

2021. 05. 31. 00:37

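Hello!

First, check the server logs (/var/logs/hbase..) or CM's HBase > Logs to find the cause of the bad health status.

Also, if the HBase Master cannot even be restarted, the problem may lie in the surrounding environment rather than in HBase itself.

Please also check the following four items.   - Big.D

1. Check that Hadoop (HDFS) is healthy: see whether it has entered Safe Mode because of corrupted files or similar

2. Check that YARN is healthy

3. Check that ZooKeeper is healthy

4. Check for resource shortages (free CPU/memory/disk)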
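
A minimal sketch of how the four checks above might be scripted, assuming the `hdfs` and `yarn` command-line tools are on the PATH of the node where it runs and that ZooKeeper answers the `ruok` four-letter command (newer ZooKeeper versions only do so if it is whitelisted). The function names and the ZooKeeper host you pass in are illustrative and do not come from the course material. Only the disk part of item 4 is covered here.

```python
import shutil
import socket
import subprocess


def safe_mode_is_on(dfsadmin_output: str) -> bool:
    """Interpret the text printed by `hdfs dfsadmin -safemode get`."""
    return "safe mode is on" in dfsadmin_output.lower()


def hdfs_in_safe_mode() -> bool:
    # Item 1: ask the NameNode whether HDFS is currently in Safe Mode.
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-safemode", "get"],
        capture_output=True, text=True, check=True,
    )
    return safe_mode_is_on(result.stdout)


def yarn_node_report() -> str:
    # Item 2: return the raw node listing so it can be read by eye.
    result = subprocess.run(
        ["yarn", "node", "-list"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def zookeeper_is_ok(host: str, port: int = 2181, timeout: float = 3.0) -> bool:
    # Item 3: send 'ruok'; a healthy server that allows the command replies 'imok'.
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            return sock.recv(16) == b"imok"
    except OSError:
        return False


def disk_free_ratio(path: str = "/") -> float:
    # Item 4 (disk only): fraction of the filesystem at `path` that is still free.
    usage = shutil.disk_usage(path)
    return usage.free / usage.total
```

For example, `hdfs_in_safe_mode()` returning True, or `disk_free_ratio("/")` returning a value close to 0, would point to the environment rather than to HBase itself.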

Big.D
Instructor

2021. 05. 06. 22:21

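Hello! This is Big.D.

HBase is designed as high-availability software and should be deployed across distributed nodes so that it can recover from failures.

In the pilot environment, however, resource limits mean it is configured as a single node only, so high availability is not possible.

HBase is the component that reacts most sensitively to overhead in the pilot environment. Most likely a heavy shell command caused the HBase HMaster or RegionServer to shut down. To recover:

In CM Home > HBase, check whether any server has shut down, and then restart HBase. - Big.D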
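
After the restart, one rough way to confirm that the HBase processes are listening again is to probe the two ports that appear in this thread: 16010 (the page the question reports as down) and 16020 (the RegionServer address in the error log). The sketch below assumes both processes run on the same host, as in the single-node pilot setup. The helper names are illustrative, and an open port only shows that something is listening, not that HBase is healthy.

```python
import socket

# Ports taken from the thread: 16010 from the question, 16020 from the error log.
HBASE_PORTS = {"master_web_ui": 16010, "regionserver_rpc": 16020}


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts, and name resolution failures.
        return False


def check_hbase_ports(host: str, ports: dict, timeout: float = 3.0) -> dict:
    """Probe each named port on `host` and report which ones accept connections."""
    return {name: port_is_open(host, port, timeout) for name, port in ports.items()}
```

Calling `check_hbase_ports("server02.hadoop.com", HBASE_PORTS)` from a machine that can resolve that hostname returns a dictionary of True/False values per port. A False for `regionserver_rpc` corresponds to the "Connection refused" lines in the error above.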