Cisco Nexus

N9K Watchdog Timeout

최찐찐멍 2023. 6. 16. 11:21
반응형

장비를 운영하는 도중 예상치 못한 장비 리부팅이 발생할 경우가 있다.

Watchdog Time Out 증상이 이러한 상황에 속한다.

간단하게 이런 증상에서 RMA가 필요한 H/W이슈인지 확인해보자

`show system reset-reason`

----- reset reason for module 1 (from Supervisor in slot 1) ---

1) At 693743 usecs after Sat Dec 17 15:33:26 2022

Reason: Watchdog Timeout

Service: HW check by card-client

Version:

 

`show logging onboard internal reset-reason`

----------------------------

Module: 1

----------------------------

 

Switch OBFL Log: Enabled

 

Reset Reason for this card:

Image Version : 9.3(9)

Reset Reason (LCM): Unknown (0) at time Sat Dec 17 15:33:26 2022

Reset Reason (SW): Reset Requested by CLI command reload (9) at time Wed Jul 13 22:16:02 2022

Reset Reason (HW): Watchdog Timeout (32) at time Sat Dec 17 15:33:26 2022

Last log in OBFL was written at time Sat Dec 17 15:33:26 2022

'show logging onboard kernel-trace'

<6>[2065308.318713] writing reset reason succeeded with retval=0 on cpu=0

<0>[2065308.318717] NMI due WATCHDOG HIT

<7>[2065308.321702] cctrl DBG: cctrl_ow_write dev_type 5 data 4f

<6>[2065308.321703] CCTRL PANIC DUMP

<6>[2065308.321703] =========================

<6>[2065308.321705] WDT last punched at 2065306460351876

<6>[2065308.321707] REG(0x60) = 3c

<6>[2065308.321710] REG(0x64) = 0

<6>[2065308.321713] REG(0x300) = baadbeef

<6>[2065308.321716] REG(0x304) = baadbeef

<6>[2065308.321716] =========================

<0>[2065308.321717] nxos_panic: Kernel panic - not syncing: WATCHDOG HIT

위의 몇가지 명령어를 통해 장비가 H/W문제인지를 판단할 수 있습니다.

 

 

반응형

'Cisco Nexus' 카테고리의 다른 글

N9K-C92348GC's PSU went down and up  (0) 2023.06.16
lcnd: failed to get hp from offset 1728 - kernel  (0) 2023.06.16
N9K MCE Error  (0) 2023.06.16
N7K 와 N9K 비교  (0) 2023.06.16
protocol identification string lack carriage return - dcos_sshd  (0) 2023.06.16