# Debug common system issues

Source: [https://docs.qualcomm.com/doc/80-70015-12/topic/general_system_debugging.html](https://docs.qualcomm.com/doc/80-70015-12/topic/general_system_debugging.html)

Common system issues include watchdog timeouts, bus hangs, timeout errors, and hardware
resets. This section describes how to identify and debug them.

## Watchdog issues

A watchdog (WD) is a fixed-length counter that lets a system recover from an
unexpected hardware or software failure. Unless the system periodically pets (resets) the
watchdog timer, the timer assumes that a catastrophe has occurred and resets the subsystem or
the entire system, depending on which watchdog fired.

There are several kinds of watchdog implementation, including hardware watchdogs, software
watchdogs, barks, and bites. The following table summarizes them.

Table: Watchdog implementations

| Watchdog type | Timeout (seconds) | Owner | Action on expiry | Result |
| --- | --- | --- | --- | --- |
| Nonsecure WD bark | 11 | HLOS | IRQ to Qualcomm TEE | HLOS panics |
| Nonsecure WD bite | 12 | HLOS | Fast interrupt request (FIQ) to Qualcomm TEE | Qualcomm TEE asserts PS\_HOLD |
| Secure WD bark | 6 | Qualcomm TEE | FIQ to Qualcomm TEE | Qualcomm TEE pets the secure WD |
| Secure WD bite | 22 | Qualcomm TEE | PS\_HOLD asserted | PMIC resets the system |
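
To make the table easier to reason about, the following minimal sketch encodes its timeout values and reports which stages have expired for a given time since the last pet. The dictionary layout, the function name `expired_stages`, and the use of `>=` at the exact timeout boundary are illustrative assumptions, not part of any Qualcomm API.

```python
# Timeout values copied from the table above (seconds since the last pet).
# The data layout and function name are hypothetical, for illustration only.
WATCHDOG_STAGES = {
    "nonsecure": [
        (11, "bark", "HLOS panics"),
        (12, "bite", "Qualcomm TEE asserts PS_HOLD"),
    ],
    "secure": [
        (6, "bark", "Qualcomm TEE pets the secure WD"),
        (22, "bite", "PMIC resets the system"),
    ],
}


def expired_stages(domain, seconds_since_pet):
    """Return the (stage, result) pairs whose timeout has elapsed."""
    return [
        (stage, result)
        for timeout, stage, result in WATCHDOG_STAGES[domain]
        if seconds_since_pet >= timeout  # boundary behavior is an assumption
    ]


print(expired_stages("nonsecure", 10))    # nothing has expired yet
print(expired_stages("nonsecure", 11.5))  # only the bark has expired
print(expired_stages("secure", 22))       # both bark and bite have expired
```

A lookup like this can help when reading a log timeline: the gap between the last recorded pet and the reset suggests which stage was reached.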

Both software and hardware watchdogs are used in the system. For example, the modem DSP
(mDSP) implements both software and hardware watchdogs. The hardware watchdog module
ensures that the processor is active. It consists of a timer that counts down from a
predetermined value. If the corresponding CPU core does not reset the timer, it eventually
reaches 0 (zero) and triggers a watchdog timeout.
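
The countdown behavior described above can be sketched as a toy model. This is not driver code: the class `CountdownWatchdog`, its methods, and the tick granularity are hypothetical and exist only to show how a pet reloads the counter and how a missed pet leads to a timeout.

```python
class CountdownWatchdog:
    """Toy model of a countdown hardware watchdog (illustrative only)."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks
        self.timed_out = False

    def pet(self):
        """Reload the counter, as the owning CPU core would."""
        if not self.timed_out:
            self.remaining = self.timeout_ticks

    def tick(self):
        """Advance one tick; flag a timeout when the counter reaches 0."""
        if self.timed_out:
            return
        self.remaining -= 1
        if self.remaining == 0:
            self.timed_out = True


wd = CountdownWatchdog(timeout_ticks=3)
wd.tick()
wd.tick()
wd.pet()             # the core is alive, so the counter reloads to 3
wd.tick()
wd.tick()
print(wd.timed_out)  # False: only 2 ticks since the last pet
wd.tick()
print(wd.timed_out)  # True: 3 ticks elapsed without a pet
```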

### Watchdog for application processor CPU

**Nonsecure hardware watchdog**

- Every 10 seconds, a timer event on the HLOS pets the nonsecure hardware watchdog. If
          the HLOS does not pet the nonsecure watchdog for 11 seconds, the nonsecure watchdog
          bark fires and the HLOS must handle it. If the HLOS cannot handle the bark, it
          panics.
- If the HLOS is unable to handle the nonsecure watchdog bark, a nonsecure watchdog bite
          is triggered and sent to Qualcomm TEE, and Qualcomm TEE enters a fatal error.
- The watchdog pet and bark times can be customized using kernel configuration options.
          For example, the following configuration sets the bark time to 13 seconds and the pet
          time to 11 seconds (the values are given in milliseconds):

        CONFIG_QCOM_WATCHDOG_BARK_TIME=13000
        CONFIG_QCOM_WATCHDOG_PET_TIME=11000
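
When customizing these options, a quick sanity check can confirm that the pet interval is shorter than the bark timeout. The sketch below parses a configuration fragment and reports the margin between the two. The helper names are hypothetical, the script is not part of the kernel build, and the rule that the pet time must be shorter than the bark time is inferred from the behavior described above rather than stated by the source.

```python
def parse_watchdog_config(text):
    """Extract CONFIG_QCOM_WATCHDOG_* integer options from a config fragment."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.startswith("CONFIG_QCOM_WATCHDOG_"):
            values[key] = int(value)
    return values


def check_pet_before_bark(values):
    """Return the bark-minus-pet margin in ms; raise if pet is not earlier."""
    pet = values["CONFIG_QCOM_WATCHDOG_PET_TIME"]
    bark = values["CONFIG_QCOM_WATCHDOG_BARK_TIME"]
    if pet >= bark:  # assumed rule: the pet must happen before the bark
        raise ValueError(
            f"pet time {pet} ms must be shorter than bark time {bark} ms"
        )
    return bark - pet


fragment = """
CONFIG_QCOM_WATCHDOG_BARK_TIME=13000
CONFIG_QCOM_WATCHDOG_PET_TIME=11000
"""
margin_ms = check_pet_before_bark(parse_watchdog_config(fragment))
print(margin_ms)  # 2000
```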

**Secure hardware watchdog**

- Every 6 seconds, a secure watchdog bark is delivered to Qualcomm TEE as a fast
          interrupt request (FIQ). The FIQ handler in Qualcomm TEE pets the secure hardware
          watchdog. This is expected behavior, not an error or a fatal issue.
- If Qualcomm TEE is unable to handle the secure watchdog bark for 22 seconds, the
          secure watchdog bite fires. The PS\_HOLD pin is then asserted on the PMIC, and the
          entire system is eventually reset.

The complete functionality of this feature is available to licensed developers with
authorized access. For more information on debugging watchdog issues, see [Qualcomm Linux Debug Guide - Addendum](https://docs.qualcomm.com/bundle/resource/topics/80-70015-12A/general_system_debugging.html#watchdog_timeout).

## Bus hang and timeout error

The SNoC, CNoC, xPU, TBU, and AHB are system infrastructure components on the
device. They are responsible for operations such as the following:

- Bus transaction
- Address translation
- Memory protection

Failures or timeouts on these components can cause system errors, which are
reported to Qualcomm TEE.

This feature is available to licensed developers with authorized access. For more
information on debugging bus hang and timeout errors, see [Qualcomm Linux Debug Guide - Addendum](https://docs.qualcomm.com/bundle/resource/topics/80-70015-12A/general_system_debugging.html#erroneous_transaction_on_bus_error_and_timeout).

## Hardware reset

Issues with the secure watchdog, the temperature sensor (TSENS), or the PMIC can cause a
hardware reset. Because the debugging approach depends on the cause of the reset, identifying
that cause is the crucial first step.

This feature is available to licensed developers with authorized access. For more information
      on debugging hardware reset issues, see [Qualcomm Linux Debug Guide - Addendum](https://docs.qualcomm.com/bundle/resource/topics/80-70015-12A/general_system_debugging.html#reset_hardware).

Last Published: Oct 14, 2024
