Cortex-A series processors contain event-counting hardware that can be used to profile and benchmark code. These counters are programmed using:
Which events should be counted with the Performance Monitoring Unit (PMU) to measure the data cache efficiency of an application?
In the ARM instruction set, what is the maximum branch distance for a Branch (B) or Branch with Link (BL) instruction?
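As background for the question above, the reach of B and BL in the ARM (A32) encoding follows from the instruction format: the low 24 bits hold a signed word offset that is multiplied by 4 and added to the PC value (the instruction address plus 8). The sketch below decodes that field; the function name `arm_branch_offset` is a hypothetical helper chosen for illustration, and it covers only the A32 B/BL encoding, not Thumb or BLX.

```c
#include <stdint.h>

/* Hypothetical helper: returns the byte offset encoded in an A32 B/BL
 * instruction word. The offset is relative to the PC value, which reads
 * as the instruction's own address plus 8. */
int32_t arm_branch_offset(uint32_t insn)
{
    /* imm24 occupies bits [23:0]. */
    int32_t imm = (int32_t)(insn & 0x00FFFFFFu);

    /* Sign-extend from 24 bits without relying on signed right shifts. */
    if (imm & 0x00800000) {
        imm -= 0x01000000;
    }

    /* Offsets are counted in 4-byte words. */
    return imm * 4;
}
```

The extreme values of the 24-bit field give offsets of -33554432 and +33554428 bytes, which is the source of the usual "about 32 MB in either direction" figure.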
In an ARMv7-A system, the following C function calculates a simple checksum for an input data packet of variable length. The checksum is defined as the sum of all the 16-bit data items in the packet, modulo 65536. The parameter data_items holds the number of 2-byte data items in the packet and, by design, is never zero.
When using an ARM compiler, which TWO of the following optimizations could improve the performance of this code? (Choose two)
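The function listing and the answer options are not reproduced here. As an illustration only, the sketch below assumes one plausible shape for such a checksum routine and shows a variant with two transformations that are commonly suggested for ARM targets: using 32-bit local variables instead of 16-bit ones, and counting the loop down to zero. The names `checksum_baseline` and `checksum_variant` are hypothetical, and neither function is claimed to match the original question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical baseline: 16-bit locals and an up-counting loop.
 * The compiler may need extra instructions to keep 'sum' and 'i'
 * narrowed to 16 bits on every iteration. */
uint16_t checksum_baseline(const uint16_t *data, uint16_t data_items)
{
    uint16_t sum = 0;
    uint16_t i;

    for (i = 0; i < data_items; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}

/* Hypothetical variant: register-width locals and a count-down loop.
 * The do/while form relies on the stated precondition that
 * data_items is never zero. */
uint16_t checksum_variant(const uint16_t *data, unsigned int data_items)
{
    unsigned int sum = 0;

    assert(data_items != 0);
    do {
        sum += *data++;
    } while (--data_items != 0);

    /* Truncating to 16 bits applies the modulo-65536 rule once. */
    return (uint16_t)sum;
}
```

Both versions return the same value for any non-empty packet, because unsigned wraparound at 32 bits preserves the result modulo 65536. Whether the variant is actually faster depends on the compiler and optimization level, so it is worth checking the generated assembly.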