# Features impacting performance

The Qualcomm^®^ Linux^®^ kernel includes features that affect performance,
such as the CPU scheduler, the CPU frequency governor, dynamic voltage and
frequency scaling (DVFS), and memory management. This guide provides an
overview of each feature and links to related references. Qualcomm has also
added a service called PerfHAL to enhance the performance of Qualcomm Linux.

## CPU scheduler

The CPU scheduler manages how the CPU time is distributed among the
processes running on Linux systems.

Qualcomm Linux uses the earliest eligible virtual deadline first (EEVDF)
scheduler, a feature provided by the Linux kernel. The EEVDF CPU scheduler
uses per-entity load tracking (PELT) to monitor the task load.
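As a rough illustration of how PELT weighs history, the following sketch sums per-period load samples with a geometric decay chosen so that a sample from 32 periods ago counts half as much as the current one. The function name and the sample values are illustrative assumptions; this isn't the kernel implementation.

```python
# Illustrative PELT-style decay; not the kernel implementation.
DECAY = 0.5 ** (1 / 32)  # y, chosen so that y**32 == 0.5


def decayed_load(samples):
    """Return the decayed sum of per-period load samples.

    samples[0] is the most recent period, samples[1] the one before it,
    and so on. Older samples contribute geometrically less.
    """
    return sum(value * DECAY ** age for age, value in enumerate(samples))


if __name__ == "__main__":
    # A task that was fully busy (1.0) for the last 32 periods.
    print(round(decayed_load([1.0] * 32), 2))
```

Because old samples fade instead of dropping out abruptly, a task that was recently busy still looks loaded for a short time after it goes idle.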

For more information, see:

- [An EEVDF CPU scheduler for
Linux](https://lwn.net/Articles/925371/)
- [Per-entity load tracking
\[LWN.net\]](https://lwn.net/Articles/531853/)

Utilization clamping (UCLAMP or util clamp) is a scheduler feature that
helps manage performance requirements for tasks.
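The idea behind utilization clamping can be sketched as a simple bounded value. The following example assumes the kernel's 0 to 1024 utilization scale; the function name is hypothetical and the code only mirrors the concept, not the scheduler's internal logic.

```python
# Illustrative only: mirrors the idea of utilization clamping, not kernel code.
SCHED_CAPACITY_SCALE = 1024  # utilization is expressed on a 0..1024 scale


def clamp_util(util, uclamp_min=0, uclamp_max=SCHED_CAPACITY_SCALE):
    """Clamp a task's utilization into the range [uclamp_min, uclamp_max]."""
    if uclamp_min > uclamp_max:
        raise ValueError("uclamp_min must not exceed uclamp_max")
    return max(uclamp_min, min(util, uclamp_max))
```

A raised minimum makes a light task look busier to the frequency governor, and a lowered maximum caps how much performance a heavy task can request.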

For more information, see:

- [https://docs.kernel.org/scheduler/sched-util-clamp.html](https://docs.kernel.org/scheduler/sched-util-clamp.html)
- [Customize CPU scheduler](https://docs.qualcomm.com/doc/80-70018-10/topic/18-customize.html#customize-scheduler)

## CPU frequency governor

A CPU frequency governor adjusts the CPU frequency based on the task
load. The CPU scheduler provides the necessary inputs for this process.

Qualcomm Linux uses the `schedutil` governor, a feature provided by
the Linux kernel.

This governor increases the frequency when the system is heavily loaded
and reduces it when the load is low, ensuring an optimal balance between
power consumption and performance.
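The following minimal sketch shows the kind of calculation involved. It assumes the simplified frequency-invariant formula from the kernel's schedutil documentation (a 25% headroom over the measured utilization) and a hypothetical list of available frequencies; the function name isn't part of any kernel or Qualcomm API.

```python
def schedutil_target_khz(util, max_capacity, max_freq_khz, available_khz):
    """Pick a frequency for the given utilization (simplified schedutil model).

    raw = 1.25 * max_freq * util / max_capacity, then the lowest available
    frequency at or above raw is chosen, capped at the highest one.
    """
    raw = 1.25 * max_freq_khz * util / max_capacity
    for freq in sorted(available_khz):
        if freq >= raw:
            return freq
    return max(available_khz)


if __name__ == "__main__":
    table = [300_000, 1_000_000, 1_958_400]  # hypothetical frequency table, in kHz
    print(schedutil_target_khz(100, 1024, 1_958_400, table))
```

The actual governor also applies rate limits and utilization clamps, which this sketch leaves out.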

For more information, see:

- [https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt](https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt)
- [Configure CPU](https://docs.qualcomm.com/doc/80-70018-10/topic/14-configure.html#cpu)
- [Customize the CPU frequency governor](https://docs.qualcomm.com/doc/80-70018-10/topic/18-customize.html#cpu-frequency-governer)

## DVFS governors

DVFS governors control the frequencies of CPU caches (L3), the last
level cache controller (LLCC), and the DDR based on the system workload.

These governors increase the frequency when the workload is high and
decrease it when the workload is low, ensuring an optimal balance
between power consumption and performance.

Qualcomm Linux supports the following two types of DVFS governors for the L3 cache, LLCC, and DDR:

- Static map DVFS governor
- BWMON governor

### Static map DVFS governor

This governor aligns the frequencies of the CPU L3 cache and the DDR with
the current CPU frequency to balance the power and the performance requirements.

For example, if the CPU frequency is at its maximum, the L3 cache and
DDR frequencies must also be at their maximum levels.
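A static map can be pictured as a lookup table keyed by CPU frequency. The following sketch uses made-up thresholds and frequencies purely for illustration; the real mapping comes from the device tree file for the target.

```python
from bisect import bisect_left

# Hypothetical rows: (CPU kHz threshold, L3 kHz, DDR kHz).
# The real values are defined in the target's device tree source.
STATIC_MAP = [
    (600_000, 300_000, 200_000),
    (1_200_000, 800_000, 1_000_000),
    (1_958_400, 1_500_000, 2_000_000),
]


def mapped_freqs(cpu_khz):
    """Return (L3 kHz, DDR kHz) for the first row whose threshold covers cpu_khz."""
    thresholds = [row[0] for row in STATIC_MAP]
    # Clamp to the last row so that any higher CPU frequency maps to the maximum.
    index = min(bisect_left(thresholds, cpu_khz), len(STATIC_MAP) - 1)
    _, l3_khz, ddr_khz = STATIC_MAP[index]
    return l3_khz, ddr_khz
```

With this table, a CPU running at its maximum frequency maps to the maximum L3 and DDR levels, which matches the behavior described above.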

The static mapping is available in the source code at
`arch/arm64/boot/dts/qcom/<target>.dtsi`.

For customization options, see [Customize static map DVFS governor](https://docs.qualcomm.com/doc/80-70018-10/topic/18-customize.html#section-u1x-jps-51c-caharris-03-20-24-2005-37-832).

### BWMON governor

The bandwidth monitoring (BWMON) governor dynamically adjusts the
frequencies of the LLCC and DDR based on the measured traffic flow from
the CPU to the LLCC and then to the DDR.

The BWMON hardware block measures this traffic. It monitors the data
throughput between memory and the other subsystems within a specified
sampling window and uses this information to scale the LLCC and DDR
frequencies to meet the required bandwidth.
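The scaling decision can be sketched in two steps: convert the byte count seen in one sampling window into a bandwidth, then pick the lowest level that satisfies it. The function names, units, and level values below are assumptions for illustration and don't reflect the driver's actual thresholds or data structures.

```python
def measured_bandwidth_mbps(bytes_transferred, window_ms):
    """Average bandwidth over one sampling window, in MB/s."""
    return bytes_transferred / 1e6 / (window_ms / 1e3)


def pick_level(required_mbps, levels_mbps):
    """Return the lowest level that meets the demand, or the highest one available."""
    for level in sorted(levels_mbps):
        if level >= required_mbps:
            return level
    return max(levels_mbps)


if __name__ == "__main__":
    demand = measured_bandwidth_mbps(4_000_000, window_ms=4)
    print(pick_level(demand, [800, 1600, 3200]))  # hypothetical levels, in MB/s
```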

The BWMON governor driver is available in the source code at
`drivers/soc/qcom/icc-bwmon.c`.

For more information, see:

- [\[PATCH v3 0/4\] soc/arm64: qcom: Add initial version of
bwmon](https://lwn.net/ml/linux-kernel/20220531105137.110050-1-krzysztof.kozlowski@linaro.org/)
- [Customize BWMON governor](https://docs.qualcomm.com/doc/80-70018-10/topic/18-customize.html#section-qxs-4ps-51c-caharris-03-20-24-2007-2-926)

## PerfHAL

PerfHAL is a Qualcomm proprietary service that offers added
functionality by making perflock APIs accessible. It's beneficial when
you need short-term performance enhancements or power savings.

Perflocks help in modifying system behavior to manage intermittent
workloads. For example, if a specific code segment must run at a higher
CPU frequency for a certain duration, use perflocks within that
code to boost the CPU frequency.

PerfHAL efficiently handles concurrent perflock requests from multiple
clients. When several requests are aimed at the same resource, PerfHAL
aggregates them to achieve the optimal performance level needed by the
device.

When a client that holds perflocks is no longer active, PerfHAL releases
all the perflocks associated with that client.
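The following toy model illustrates one way concurrent requests for a single resource could be aggregated. It assumes that, for a "minimum" type resource, the highest requested value wins; the source doesn't specify PerfHAL's actual aggregation policy, and the class and method names are hypothetical.

```python
class RequestAggregator:
    """Toy model of aggregating per-client requests for one 'minimum' resource."""

    def __init__(self):
        self._requests = {}  # client id -> requested minimum value

    def acquire(self, client, value):
        """Record a client's request and return the new effective value."""
        self._requests[client] = value
        return self.effective()

    def release_client(self, client):
        """Drop everything held by a client, for example when it exits."""
        self._requests.pop(client, None)
        return self.effective()

    def effective(self):
        """Highest requested minimum, or None when no request is active."""
        return max(self._requests.values(), default=None)
```

When the most demanding client goes away, the effective value falls back to the next highest request, and to the default behavior when no requests remain.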

### Perflock APIs

Perflock APIs allow applications to adjust system
parameters for specific use cases, helping them meet their performance
and power objectives.

User space applications use the `perf_lock_acq()` and `perf_lock_rel()` APIs
to request specific values of system tunable parameters for either a set
time period or an indefinite time period.

### Acquire perflock

Use the `perf_lock_acq()` function to acquire a perflock with the
necessary optimizations.

The syntax for this function is as follows:

`int perf_lock_acq(int handle, int duration, int list[], int numArgs)`

Table : perf_lock_acq API parameters

| Parameter | Description |
| --- | --- |
| `handle` | Identifies the client request. |
| `duration` | Maximum time, in milliseconds, that the perflock is held. Set <code>duration</code> to a positive integer for a definite time or to <code>0</code> for an indefinite time.<ul><li>Definite perflocks require a positive integer value to specify the maximum timeout period. A timer is created and the perflock is released when the timer expires.</li><li>Indefinite perflocks are held until the client calls the release function. To manually release a perflock that has been set for an indefinite duration, use the <code>perf_lock_rel()</code> function.</li></ul> |
| `list` | An array of resource opcode and value pairs. Each opcode identifies a system parameter (resource), and each value is the level to set for that resource. |
| `numArgs` | Number of elements in the list array. |

Table : perf_lock_acq API returns and result

| Returns | Result |
| --- | --- |
| A non-zero integer | Success |
| -1 | Failure |

### Perflock release

The `perf_lock_rel()` function releases a perflock that was acquired with
the `perf_lock_acq()` API. Use this function only for the perflocks
that are set for an indefinite time period. The syntax for this function is
as follows:

`int perf_lock_rel(int handle)`

Table : perf_lock_rel API parameters

| Parameter | Description |
| --- | --- |
| `handle` | Identifies the request to release. Pass the same handle that <code>perf_lock_acq()</code> returned. |

Table : perf_lock_rel API returns and result

| Returns | Result |
| --- | --- |
| A non-zero integer | Success |
| -1 | Failure |
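The acquire and release life cycle can be modeled with a small in-memory stand-in. The following sketch only mimics the documented return values (a non-zero integer on success, -1 on failure) and the definite and indefinite duration behavior. It doesn't call the real perflock library, it omits the `handle` and `numArgs` input parameters for brevity, and the class, its validation rules, and the explicit clock are assumptions.

```python
import itertools

_MISSING = object()


class FakePerflock:
    """In-memory stand-in that mimics the documented return values only."""

    def __init__(self):
        self._next_handle = itertools.count(1)
        self._active = {}  # handle -> expiry time in ms, or None for indefinite

    def acquire(self, duration_ms, pairs, now_ms=0):
        """Return a non-zero handle on success, or -1 on failure."""
        if duration_ms < 0 or not pairs or len(pairs) % 2:
            return -1  # opcodes and values must come in pairs
        handle = next(self._next_handle)
        self._active[handle] = (now_ms + duration_ms) if duration_ms else None
        return handle

    def release(self, handle):
        """Return the handle on success, or -1 if it isn't held."""
        if self._active.pop(handle, _MISSING) is _MISSING:
            return -1
        return handle

    def expire(self, now_ms):
        """Drop definite perflocks whose timer has run out."""
        for handle, expiry in list(self._active.items()):
            if expiry is not None and expiry <= now_ms:
                del self._active[handle]
```

In this model, a lock acquired with a duration of `0` stays active until `release()` is called, while a lock with a positive duration disappears once `expire()` is called with a later timestamp.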

### Resource opcodes

Perflock uses a combination of opcodes and their corresponding values to
perform specific operations on a perflock resource.

For the opcodes supported on QCS9075 and QCS8275, see the corresponding addendum. The following guides are available to licensed users with authorized access:

- [Qualcomm Linux Performance Guide - Addendum for QCS9075](https://docs.qualcomm.com/bundle/resource/topics/80-70018-10A/overview.html)
- [Qualcomm Linux Performance Guide - Addendum for QCS8275](https://docs.qualcomm.com/bundle/resource/topics/80-70018-10B/overview.html)

The following table lists the supported opcodes:

Table : Supported opcodes

| Opcode | Purpose | Sysnode on device |
| --- | --- | --- |
| 0x44000000 | Sets the minimum acceptable performance level for individual tasks and task groups. | `/proc/sys/kernel/sched_util_clamp_min` |
| 0x44004000 | Sets the maximum acceptable performance level for individual tasks and task groups. | `/proc/sys/kernel/sched_util_clamp_max` |
| 0x44008100 | Sets the minimum frequency of the Silver cluster. | `/sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq` |
| 0x44008000 | Sets the minimum frequency of the Gold cluster. | `/sys/devices/system/cpu/cpufreq/policy4/scaling_min_freq` |
| 0x44008200 | Sets the minimum frequency of the Prime cluster. | `/sys/devices/system/cpu/cpufreq/policy7/scaling_min_freq` |
| 0x4400C100 | Sets the maximum frequency of the Silver cluster. | `/sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq` |
| 0x4400C000 | Sets the maximum frequency of the Gold cluster. | `/sys/devices/system/cpu/cpufreq/policy4/scaling_max_freq` |
| 0x4400C200 | Sets the maximum frequency of the Prime cluster. | `/sys/devices/system/cpu/cpufreq/policy7/scaling_max_freq` |

The following are some examples of the resource opcodes:

- 0x44008100, 1958400: This opcode and value pair indicates that the
minimum frequency of the Silver cluster must be set to 1958400 kHz.
- 0x44008100, 1958400, 0x4400C100, 2100000: These two opcode and value
pairs indicate that the minimum frequency of the Silver cluster must
be set to 1958400 kHz and the maximum frequency of the Silver cluster
must be set to 2100000 kHz.
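The following sketch shows how such pairs flatten into the `list` and `numArgs` arguments. The opcodes and values come from the table and examples in this section; the constant names and the helper function are hypothetical, and the code doesn't call the perflock library.

```python
# Opcodes taken from the supported opcodes table; the names are illustrative.
MIN_FREQ_SILVER = 0x44008100
MAX_FREQ_SILVER = 0x4400C100


def build_request(settings):
    """Flatten {opcode: value} into (list, numArgs) for a perflock request."""
    flat = []
    for opcode, value in settings.items():
        flat.extend((opcode, value))
    return flat, len(flat)


if __name__ == "__main__":
    args, num_args = build_request(
        {MIN_FREQ_SILVER: 1958400, MAX_FREQ_SILVER: 2100000}
    )
    print([hex(a) if a > 0x40000000 else a for a in args], num_args)
```

Note that `numArgs` counts every element of the array, so two opcode and value pairs give a count of four.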

For more information about how to use and debug perflock, see [Customize perflock](https://docs.qualcomm.com/doc/80-70018-10/topic/18-customize.html#customize-perlocks).

## Memory

RAM is used for all memory allocations made by Qualcomm Linux. RAM must be effectively managed to meet performance requirements and ensure the smooth functioning of applications. The following figure shows memory partitioning:

![Memory partitioning](../../_images/partitioning-memory.png)

**Figure : Memory partitioning**

Certain sections of RAM are managed independently of the Linux system. For
example, firmware such as modem, video, and audio runs from these
specific RAM partitions. The Linux kernel manages all other RAM
partitions.

The Linux kernel features its own memory management subsystem, which
includes:

- Implementation of virtual memory and demand paging
- Allocation of memory to both kernel internal structures and user space programs
- Mapping of files into the address space of the processes
- Other memory management operations

### RAM memory partitioning

The following table describes various types of memory allocations.

Note

The commands specified in the following table should be run on the device.

| RAM classification | Memory segment | Allocation types | Description |
| --- | --- | --- | --- |
| Non-Linux | – | – | <ul><li>Memory is reserved in the form of carveouts by various subsystems other than Linux.</li><li>These carveouts are specified in the respective DTSI files.</li></ul> |
| Linux (system RAM) | Kernel static | Vmlinux + kernel page structures | <ul><li>The kernel reserves this memory at boot time for its own usage.</li><li>Vmlinux is the memory used to store the vmlinux image.</li><li>The size and breakdown of the vmlinux image can be obtained from the <code>dmesg</code> logs at boot:<blockquote><p><code>Memory: 3061872K/4134912K available (28800K kernel code, 2090K rwdata, 10688K rodata, 3072K init, 969K bss, 679824K reserved, 393216K cma-reserved)</code></p><p>The vmlinux kernel image size is the sum of kernel code, rwdata, rodata, init, and bss (28800K + 2090K + 10688K + 3072K + 969K = 45619K).</p></blockquote></li><li>The kernel page structure is the memory used by the kernel to maintain page structures for every page of RAM. This is calculated as 16&nbsp;MB per GB of RAM size.</li></ul> |
| Linux (system RAM) | Kernel dynamic | Slab | <ul><li>The kernel uses the slab for faster and more efficient memory usage of frequently used data structures.</li><li>To understand the memory usage of the slab, run the following command:<br><code>cat /proc/meminfo \| grep -i slab</code></li><li>To understand the breakdown of various slabs and their usage, enable <code>CONFIG_SLUB_DEBUG</code> in the kernel configuration, and then run the following command:<br><code>cat /proc/slabinfo</code></li></ul> |
| Linux (system RAM) | Kernel dynamic | Kernel stack | <ul><li>The kernel stack stores the call stack of every process.</li><li>To understand the memory usage of the kernel stack, run the following command:<br><code>cat /proc/meminfo \| grep -i kernelstack</code></li></ul> |
| Linux (system RAM) | Kernel dynamic | PageTables | <ul><li>The kernel uses memory to store PageTables that map virtual addresses to physical addresses.</li><li>To understand the memory usage of PageTables, run the following command:<br><code>cat /proc/meminfo \| grep -i PageTables</code></li></ul> |
| Linux (system RAM) | Kernel dynamic | Modules | <ul><li>Represents the kernel entities that are dynamically loaded into the kernel in the form of kernel modules.</li><li>To display the list of loaded kernel modules and their memory usage, run the following command:<br><code>cat /proc/modules</code></li></ul> |
| Linux (system RAM) | Kernel dynamic | Vmalloc | <ul><li>Used to allocate virtually contiguous memory.</li><li>To understand the Vmalloc memory breakdown, run the following command:<br><code>cat /proc/vmallocinfo</code></li></ul> |
| Linux (system RAM) | Kernel dynamic | Cached (kernel + user space) | <ul><li>The amount of file-backed memory that resides in RAM.</li><li>To understand the cached memory usage, run the following command:<br><code>cat /proc/meminfo \| grep -i cached</code></li></ul> |
| Linux (system RAM) | Kernel dynamic | Buffers | <ul><li>Buffers are of fixed size and contain blocks of information either read from disk or written to disk.</li><li>To understand the buffer memory usage, run the following command:<br><code>cat /proc/meminfo \| grep -i Buffers</code></li></ul> |
| Linux (system RAM) | Kernel dynamic | Shmem | <ul><li>Shared memory is a common block of memory that's mapped into the address spaces of two or more processes.</li><li>To understand the shared memory usage, run the following command:<br><code>cat /proc/meminfo \| grep -i shmem</code></li></ul> |
| Linux (system RAM) | User space | ZUSED (ZRAM) | Anonymous memory after compression by ZRAM. |
| Linux (system RAM) | User space | CMA | <ul><li>Physically contiguous memory that's typically mapped to other IPs, such as video and display, but is allocated at runtime.</li><li>More CMA reservations reduce the free memory that the system can use. Only movable allocations, such as user space process allocations, can use the CMA reserved free memory. Kernel allocations can't use it.</li></ul> |
| Linux (system RAM) | User space | ANON | <ul><li>Memory that user space applications allocate using <code>malloc()</code> or <code>new()</code> function calls.</li><li>To understand the ANON memory breakdown for a process, run the following command:<br><code>cat /proc/&lt;pid&gt;/smaps</code></li></ul> |
| Linux (system RAM) | User space | ION | <ul><li>ION memory allows sharing buffers between hardware IPs such as video, camera, and Qualcomm Linux.</li><li>ION manages one or more memory pools, which can be set aside at boot time to combat fragmentation.</li><li>To understand the ION memory usage, run the following commands:<br><code>mount -t debugfs none /sys/kernel/debug</code><br><code>cat /sys/kernel/debug/dma_buf/bufinfo \| grep bytes</code></li></ul> |
| Linux (system RAM) | User space | KGSL | <ul><li>Memory allocated by the graphics driver.</li><li>To understand the overall kernel graphics support layer (KGSL) memory usage, run the following command:<br><code>cat /sys/class/kgsl/kgsl/page_alloc</code></li><li>To understand the process level breakdown, run the following command:<br><code>cat /sys/class/kgsl/kgsl/proc/&lt;pid&gt;/kernel</code></li></ul> |
| Linux (system RAM) | Free memory | – | <ul><li>Free memory is the memory that's not yet used and is available for any allocation.</li><li>To understand the free memory, run the following command:<br><code>cat /proc/meminfo \| grep -i MemFree</code></li></ul> |
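The arithmetic and the `/proc/meminfo` lookups described in the table can be scripted. The following sketch parses the boot log line quoted in the table to compute the vmlinux image size, applies the 16 MB per GB rule for kernel page structures, and reads a single field from `/proc/meminfo` style text. The function names are illustrative, and the sample `meminfo` text in the usage section is made up.

```python
import re

DMESG_LINE = (
    "Memory: 3061872K/4134912K available (28800K kernel code, 2090K rwdata, "
    "10688K rodata, 3072K init, 969K bss, 679824K reserved, 393216K cma-reserved)"
)


def vmlinux_size_kb(line):
    """Sum kernel code, rwdata, rodata, init, and bss from the boot log line."""
    total = 0
    for field in ("kernel code", "rwdata", "rodata", "init", "bss"):
        match = re.search(rf"(\d+)K {field}\b", line)
        if match is None:
            raise ValueError(f"missing field: {field}")
        total += int(match.group(1))
    return total


def page_struct_mb(ram_gb):
    """Kernel page structure overhead, using the 16 MB per GB rule."""
    return 16 * ram_gb


def meminfo_kb(text, key):
    """Return the value in kB for a /proc/meminfo key such as 'Slab'."""
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip().lower() == key.lower():
            return int(rest.split()[0])
    raise KeyError(key)


if __name__ == "__main__":
    print(vmlinux_size_kb(DMESG_LINE), "kB vmlinux image")
    print(page_struct_mb(4), "MB of page structures for 4 GB of RAM")
    sample = "MemFree:  123456 kB\nSlab:  2048 kB\n"  # made-up sample text
    print(meminfo_kb(sample, "Slab"), "kB slab")
```

On a device, the text passed to `meminfo_kb()` would typically come from reading `/proc/meminfo`.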

## Real-time kernel

Real-time (RT) Linux is an optional feature that's not enabled by
default on Qualcomm Linux. It can be enabled based on the product
requirements.

RT Linux is designed to offer deterministic and predictable behavior for
applications that are time‑sensitive.

### Set up workspace

In Qualcomm Linux, the RT Linux kernel recipes are referred to as follows:

- Base BSP: `linux-qcom-base-rt`
- Custom BSP: `linux-qcom-custom-rt`

The Qualcomm Linux kernel supports the long-term support (LTS) RT kernel
version 6.6, which is maintained through the Yocto recipe in the
`meta-qcom-realtime` layer at the following paths in the source code:

- Base BSP: `recipes-kernel/linux/linux-kernel-base-rt_6.6.bb`
- Custom BSP: `recipes-kernel/linux/linux-kernel-custom-rt_6.6.bb`

For more information about how to clone the workspace and acquire all the
meta layers to use Qualcomm RT Linux kernel, see [Sync and build with
real-time Linux](https://docs.qualcomm.com/bundle/publicresource/topics/80-70018-254/how_to.html#sync-and-build-with-real-time-linux).

### Enable RT kernel

Use the RT Linux kernel recipe to enable the RT kernel. This recipe
fetches the kernel, downloads the PREEMPT_RT patches, and applies them to
the kernel. It also enables a fully preemptible kernel with:

`CONFIG_PREEMPT_RT=y`

For more information, see [Qualcomm Linux Kernel
Guide](https://docs.qualcomm.com/bundle/publicresource/topics/80-70018-3/overview.html).

### Verify kernel type

After booting, verify the kernel type by running the following command
on the device:

`uname -v`

The output of the command includes the following string:

`SMP PREEMPT_RT`
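The same check can be scripted, for example as part of a test harness. The following sketch looks for the `PREEMPT_RT` token in the kernel version string. The function name and the sample strings in the usage section are illustrative.

```python
import platform


def is_rt_kernel(version=None):
    """Return True when the kernel version string reports PREEMPT_RT."""
    if version is None:
        # On Linux, this is the same string that `uname -v` prints.
        version = platform.version()
    return "PREEMPT_RT" in version.split()


if __name__ == "__main__":
    print(is_rt_kernel("#1 SMP PREEMPT_RT Thu Jan 1 00:00:00 UTC 1970"))
    print(is_rt_kernel("#1 SMP PREEMPT Thu Jan 1 00:00:00 UTC 1970"))
```

Matching on whole tokens avoids treating a plain `PREEMPT` kernel as a real-time kernel.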

### Test RT Linux kernel

The RT Linux kernel test helps to obtain the following information:

- Real-time performance of the RT Linux kernel
- RT Linux kernel latencies and key performance indicators (KPIs)

Note

This section is only applicable for QCS6490.

Caution

Ensure that the system isn't rebooted during the RT Linux kernel test, because the test runs for more than 24 hours.
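The run time follows from the cyclictest options used later in this guide: `-l 100000000` sets the number of loops and `-i 1000` sets the interval in microseconds. As a rough estimate that ignores startup and scheduling overhead, the duration is the product of the two values:

```shell
#!/bin/sh
# Rough duration estimate for a cyclictest run: loops x interval.
loops=100000000      # value of -l in the cyclictest command
interval_us=1000     # value of -i in the cyclictest command, in microseconds

total_s=$((loops * interval_us / 1000000))
hours=$((total_s / 3600))
minutes=$((total_s % 3600 / 60))

echo "Estimated run time: ${total_s} s (about ${hours} h ${minutes} min)"
```

With these values, the estimate is 100000 s, or about 27 h 46 min, which is consistent with the caution above.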

### Cyclictest

The cyclictest tool benchmarks RT Linux kernel systems by measuring scheduling latency, which makes it useful for comparing the relative performance of real-time systems. The Qualcomm Linux build includes the cyclictest tool.
For more information, see [https://wiki.linuxfoundation.org/realtime/documentation/howto/tools/cyclictest/start/](https://wiki.linuxfoundation.org/realtime/documentation/howto/tools/cyclictest/start/).

This guide describes the following cyclictests:

- Cyclictest with no-load: No system load is added during this test.
- Cyclictest with stress-ng (next-generation): A specific percentage of CPU load is added during this test to measure the worst-case system latencies. For more information about stress-ng, see [Kernel/Reference/stress-ng - Ubuntu Wiki](https://wiki.ubuntu.com/Kernel/Reference/stress-ng/).

#### Presetting

Complete the following presetting before you run a cyclictest:

1. Configure and isolate CPU core 1 to core 3 for the RT tasks. You can isolate different CPU cores depending on your requirements.

    For example, you can configure the RT CPUs in the source code at the following path:

    `layers/meta-qcom-realtime/blob/scarthgap/conf/layer.conf`

    For example, configure the CPUs as follows:

        KERNEL_CMDLINE_EXTRA:qcm6490 = "pcie_pme=nomsi net.ifnames=0 pci=noaer kpti=off kasan=off kasan.stacktrace=off swiotlb=128 mitigations=auto kernel.sched_pelt_multiplier=4 rcupdate.rcu_expedited=1 rcu_nocbs=1-3 isolcpus=1-3 irqaffinity=4-7 nohz_full=1-3 no-steal-acc vfio_iommu_type1.allow_unsafe_interrupts=1"

2. Run the following commands on the RT Linux kernel:

        # Turn off kernel tracing.
        echo 0 > /sys/kernel/tracing/tracing_on

        # Restrict these workqueues to CPUs 5 to 7 (hexadecimal mask E0).
        echo E0 > /sys/devices/virtual/workqueue/kgsl-workqueue/cpumask
        echo E0 > /sys/devices/virtual/workqueue/scsi_tmf_0/cpumask
        echo E0 > /sys/devices/virtual/workqueue/writeback/cpumask

        # Disable cpuidle states 0 to 2 on CPUs 0 to 7.
        echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state0/disable
        echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state1/disable
        echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state2/disable
        echo 1 > /sys/devices/system/cpu/cpu1/cpuidle/state0/disable
        echo 1 > /sys/devices/system/cpu/cpu1/cpuidle/state1/disable
        echo 1 > /sys/devices/system/cpu/cpu1/cpuidle/state2/disable
        echo 1 > /sys/devices/system/cpu/cpu2/cpuidle/state0/disable
        echo 1 > /sys/devices/system/cpu/cpu2/cpuidle/state1/disable
        echo 1 > /sys/devices/system/cpu/cpu2/cpuidle/state2/disable
        echo 1 > /sys/devices/system/cpu/cpu3/cpuidle/state0/disable
        echo 1 > /sys/devices/system/cpu/cpu3/cpuidle/state1/disable
        echo 1 > /sys/devices/system/cpu/cpu3/cpuidle/state2/disable
        echo 1 > /sys/devices/system/cpu/cpu4/cpuidle/state0/disable
        echo 1 > /sys/devices/system/cpu/cpu4/cpuidle/state1/disable
        echo 1 > /sys/devices/system/cpu/cpu4/cpuidle/state2/disable
        echo 1 > /sys/devices/system/cpu/cpu5/cpuidle/state0/disable
        echo 1 > /sys/devices/system/cpu/cpu5/cpuidle/state1/disable
        echo 1 > /sys/devices/system/cpu/cpu5/cpuidle/state2/disable
        echo 1 > /sys/devices/system/cpu/cpu6/cpuidle/state0/disable
        echo 1 > /sys/devices/system/cpu/cpu6/cpuidle/state1/disable
        echo 1 > /sys/devices/system/cpu/cpu6/cpuidle/state2/disable
        echo 1 > /sys/devices/system/cpu/cpu7/cpuidle/state0/disable
        echo 1 > /sys/devices/system/cpu/cpu7/cpuidle/state1/disable
        echo 1 > /sys/devices/system/cpu/cpu7/cpuidle/state2/disable

        # Select the performance governor for each cpufreq policy.
        echo performance > /sys/devices/system/cpu/cpufreq/policy7/scaling_governor
        echo performance > /sys/devices/system/cpu/cpufreq/policy4/scaling_governor
        echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor

        # Enable the cpuset controller and create a cgroup named cpuset.
        echo +cpuset > /sys/fs/cgroup/cgroup.subtree_control
        mkdir /sys/fs/cgroup/cpuset
        echo +cpuset > /sys/fs/cgroup/cpuset/cgroup.subtree_control
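The presetting commands follow a regular pattern, so they can be generated rather than typed one by one. The following sketch prints the cpuidle commands for CPUs 0 to 7 and states 0 to 2, and shows how the hexadecimal cpumask `E0` maps to CPU numbers (bits 5, 6, and 7 are set, which selects CPUs 5 to 7). The function names are hypothetical, and the script only prints commands; it doesn't write to sysfs.

```shell
#!/bin/sh
# Hypothetical helper: convert a list of CPU numbers to a hex cpumask.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$((mask | (1 << cpu)))
    done
    printf '%X\n' "$mask"
}

# Hypothetical helper: print (not run) the commands that disable cpuidle
# states 0 to 2 on CPUs 0 to 7, matching the presetting list.
print_cpuidle_cmds() {
    for cpu in 0 1 2 3 4 5 6 7; do
        for state in 0 1 2; do
            echo "echo 1 > /sys/devices/system/cpu/cpu${cpu}/cpuidle/state${state}/disable"
        done
    done
}

cpus_to_mask 5 6 7      # prints E0
print_cpuidle_cmds      # prints 24 echo commands
```

After reviewing the printed commands, you can pipe the output of `print_cpuidle_cmds` to `sh` on the device to apply them.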

- Cyclictest with no-load

    To run a cyclictest with no-load, follow these steps:

    1. Complete the [presetting](https://docs.qualcomm.com/doc/80-70018-10/topic/2-performance-features.html#presetting).
    2. Run the following commands to start cyclictest:

            mkdir /sys/fs/cgroup/cpuset/core1-3/
            echo +cpuset > /sys/fs/cgroup/cpuset/core1-3/cgroup.subtree_control
            echo 1-3 > /sys/fs/cgroup/cpuset/core1-3/cpuset.cpus
            echo $$ > /sys/fs/cgroup/cpuset/core1-3/cgroup.procs
            cyclictest -a 1-3 -t 3 -m -l 100000000 -i 1000 -p 90 -h 800 --mainaffinity 4 --spike 100

    3. Note the latencies.
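Because the command uses `-h 800`, cyclictest prints a histogram followed by summary lines when it exits. Assuming that the summary uses the `# Min Latencies:` and `# Max Latencies:` line format (an assumption about the cyclictest version in your build), the following sketch extracts the per-thread values from a saved log. The function name and the sample log are hypothetical; the sample values mirror the no-load KPI table in this guide and weren't captured from a device.

```shell
#!/bin/sh
# Hypothetical helper: print per-thread latencies from a cyclictest log.
# $1 is "Min" or "Max"; the log is read from standard input.
# Assumes summary lines such as "# Max Latencies: 00086 00029 00029".
summary_latencies() {
    awk -v kind="$1" '
        $1 == "#" && $2 == kind && $3 == "Latencies:" {
            for (i = 4; i <= NF; i++)
                printf "%s%d", (i > 4 ? " " : ""), $i + 0
            printf "\n"
        }'
}

# Illustrative log excerpt, not captured from a device.
sample_log='# Min Latencies: 00008 00009 00009
# Avg Latencies: 00010 00011 00011
# Max Latencies: 00086 00029 00029'

printf '%s\n' "$sample_log" | summary_latencies Max   # prints 86 29 29
```

To use it on a real run, redirect the cyclictest output to a file and pass that file on standard input, for example `summary_latencies Max < cyclictest.log`.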
- Cyclictest with stress-ng

    To run a cyclictest with stress-ng, follow these steps:

    1. Complete the [presetting](https://docs.qualcomm.com/doc/80-70018-10/topic/2-performance-features.html#presetting).
    2. Open a shell and run the following commands to start stress-ng. The second command loads the non-RT CPUs at 60%:

            mkdir /tmp/temp-path
            stress-ng --cpu 5 --cpu-load 60 --temp-path /tmp/temp-path --sched fifo --sched-prio 1 -t 2d

        With the `-t 2d` timeout, stress-ng runs for approximately 48 hours. In this example, CPU 1, 4, and 7 are loaded.
    3. In another terminal, run the following commands to start cyclictest so that cyclictest and stress-ng run simultaneously:

            mkdir /sys/fs/cgroup/cpuset/core1-3/
            echo +cpuset > /sys/fs/cgroup/cpuset/core1-3/cgroup.subtree_control
            echo 1-3 > /sys/fs/cgroup/cpuset/core1-3/cpuset.cpus
            echo $$ > /sys/fs/cgroup/cpuset/core1-3/cgroup.procs
            cyclictest -a 1-3 -t 3 -m -l 100000000 -i 1000 -p 90 -h 800 --mainaffinity 4 --spike 100

    4. Press Ctrl + C to stop stress-ng.
    5. Note the worst-case latencies.
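The steps above don't explain the `--cpu 5` value. One plausible reading, stated here as an assumption, is that it matches the number of non-isolated CPUs: the presetting isolates CPUs 1 to 3 (`isolcpus=1-3`) on a system that exposes `cpu0` to `cpu7`, which leaves five CPUs. The following sketch derives that count from a kernel-style CPU list; the function name `count_cpus` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the CPUs in a kernel-style CPU list such as
# "1-3" or "0,4-7". The body runs in a subshell so the IFS change stays local.
count_cpus() (
    IFS=,
    total=0
    for part in $1; do
        case $part in
            *-*)
                first=${part%-*}
                last=${part#*-}
                total=$((total + last - first + 1))
                ;;
            *)
                total=$((total + 1))
                ;;
        esac
    done
    echo "$total"
)

total_cpus=8        # cpu0 to cpu7, as used in the presetting commands
isolated="1-3"      # value of isolcpus= in the kernel command line

non_rt_cpus=$((total_cpus - $(count_cpus "$isolated")))
echo "Non-isolated CPUs: ${non_rt_cpus}"   # prints Non-isolated CPUs: 5
```

On a running system, the isolated CPU list can be read from `/sys/devices/system/cpu/isolated` instead of being hard-coded.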

### RT Linux kernel KPIs

The following tables list the cyclictest KPIs:

KPIs for cyclictest with no-load

| RT cores | Core 1 | Core 2 | Core 3 |
| --- | --- | --- | --- |
| Minimum latency (µs) | 8 | 9 | 9 |
| Maximum latency (µs) | 86 | 29 | 29 |

KPIs for cyclictest with stress-ng

| RT cores | Core 1 | Core 2 | Core 3 |
| --- | --- | --- | --- |
| Minimum latency (µs) | 8 | 9 | 9 |
| Maximum latency (µs) | 53 | 37 | 28 |
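The cyclictest commands in this guide pass `--spike 100`, which sets a spike trigger of 100. As a quick sanity check, the following sketch compares the maximum latencies from both tables against that value. Treating 100 microseconds as a pass or fail budget is an assumption made for illustration, not a requirement stated in this guide, and the function name `check_budget` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the largest latency and succeed only if it is
# below the budget. $1 is the budget in microseconds; the remaining
# arguments are the per-core maximum latencies.
check_budget() {
    budget=$1
    shift
    worst=0
    for value in "$@"; do
        if [ "$value" -gt "$worst" ]; then
            worst=$value
        fi
    done
    echo "$worst"
    [ "$worst" -lt "$budget" ]
}

# Maximum latencies (microseconds) from the two KPI tables.
check_budget 100 86 29 29   # no-load: prints 86
check_budget 100 53 37 28   # stress-ng: prints 53
```

Both calls succeed with these values, because the worst-case latencies of 86 and 53 microseconds are below 100.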

Last Published: Apr 04, 2025


Source: [https://docs.qualcomm.com/doc/80-70018-10/topic/2-performance-features.html](https://docs.qualcomm.com/doc/80-70018-10/topic/2-performance-features.html)