Synopsis of the QK Port on ARM Cortex-M
- Preemption Scenarios in QK on ARM Cortex-M
The qf_port.h Header File
QK Port Implementation for ARM Cortex-M
Writing ISRs for QK
Using the FPU in the QK Port (Cortex-M4F/M7)
- FPU used in ONE thread only and not in any ISR
- FPU used in more than one thread or in any ISR
QK Idle Processing Customization in QK_onIdle()
Testing QK Preemption Scenarios

The preemptive, non-blocking QK kernel is specifically designed to execute non-blocking active objects. QK runs active objects in the same way as a prioritized interrupt controller (such as the NVIC in ARM Cortex-M) runs interrupts, using a single stack (MSP on Cortex-M). This section explains how the preemptive non-blocking QK kernel works on ARM Cortex-M.

Remarks: In a QK port, the only components requiring platform-specific porting are QF and QK itself. The other two components, QEP and QS, require merely recompilation and will not be discussed here. With the QK port you're not using the QV or QXK kernels. The QK port to ARM Cortex-M is located in the folder /ports/arm-cm/qk/.

Synopsis of the QK Port on ARM Cortex-M

The ARM Cortex-M architecture is designed primarily for traditional real-time kernels that use multiple per-thread stacks. Therefore, the implementation of a non-blocking, single-stack kernel like QK is a bit more involved on Cortex-M than on other CPUs and works as follows:

The ARM Cortex-M processor executes the QK application code (active objects) in the Privileged Thread mode, which is exactly the mode entered out of reset. The exceptions (including all interrupts) are always processed in the Privileged Handler mode.
QK uses only the Main Stack Pointer (QK is a single stack kernel). The Process Stack Pointer is not used and is not initialized.
ARM Cortex-M enters interrupt context without disabling interrupts (without setting the PRIMASK bit or the BASEPRI register). Generally, you should not disable interrupts inside your ISRs. In particular, the QP services QF_PUBLISH(), QF_TICK_X(), and QACTIVE_POST() should be called with interrupts enabled, to avoid nesting of critical sections. (NOTE: If you don’t wish an interrupt to be preempted by another interrupt, you can always prioritize that interrupt in the NVIC to a higher level – use a lower numerical value of priority).
The QK port uses the PendSV exception (number 14) and the NMI exception (number 2) to perform asynchronous preemption and return to the preempted thread, respectively (see Chapter 10 in [PSiCC2]). The startup code must initialize the Interrupt Vector Table with the addresses of PendSV_Handler() and NMI_Handler() exception handlers.

NOTE: QK uses only the CMSIS-compliant exception and interrupt names, such as PendSV_Handler, NMI_Handler, etc.
NOTE: The QK port specifically does not use the SVC exception (Supervisor Call). This makes the QK ports compatible with various "hypervisors" (such as mbed uVisor or Nordic SoftDevice), which use the SVC exception.
The QF_init() function calls the function QK_init() to set the priority of the PendSV exception to the lowest level in the whole system (0xFF). The function QK_init() additionally sets the interrupt priority of all IRQs available in the MCU to the safe value of QF_BASEPRI (for ARM-v7 architecture).
It is strongly recommended that you do not assign the lowest priority (0xFF) to any interrupt in your application. With 3 MSB-bits of priority, this leaves the following 7 priority levels for you (listed from the lowest to the highest urgency): 0xC0, 0xA0, 0x80, 0x60, 0x40, 0x20, and 0x00 (the highest priority).
Before returning, every "kernel aware" ISR must check whether an active object has been activated that has a higher priority than the currently running active object. If this is the case, the ISR must set the PendSV pending flag in the NVIC. All this is accomplished in the macro QK_ISR_EXIT(), which must be called just before exiting every ISR.
In ARM Cortex-M the whole prioritization of interrupts, including the PendSV exception, is performed entirely by the NVIC. Because the PendSV has the lowest priority in the system, the NVIC tail-chains to the PendSV exception only after exiting the last nested interrupt.
The pushing of the 8 registers comprising the ARM Cortex-M interrupt stack frame upon entry to NMI exception is wasteful in a single-stack kernel, but is necessary to perform full interrupt return to the preempted context through the NMI's return.
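Returning to the priority levels listed in the synopsis above, the arithmetic behind that list can be sketched in plain C. This is only an illustration under stated assumptions: the helper names prio_value() and app_levels() are hypothetical (they are not part of QP), and nbits stands for the number of priority bits implemented by a given MCU (3 in the example from the text).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of QP): returns the 8-bit priority value for
 * urgency "level" (0 = most urgent) on a device that implements "nbits"
 * most-significant priority bits.
 */
uint8_t prio_value(unsigned nbits, unsigned level) {
    assert((nbits >= 1U) && (nbits <= 8U));
    assert(level < (1U << nbits));
    return (uint8_t)(level << (8U - nbits));
}

/* Hypothetical helper: number of levels left for application interrupts
 * when the least urgent level is reserved for PendSV.
 */
unsigned app_levels(unsigned nbits) {
    assert((nbits >= 1U) && (nbits <= 8U));
    return (1U << nbits) - 1U;
}
```

With nbits == 3, levels 0 through 6 map to 0x00 through 0xC0, which matches the seven values listed above. Level 7 maps to 0xE0, which is the level that a priority of 0xFF selects when only the 3 most-significant bits are implemented.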

Preemption Scenarios in QK on ARM Cortex-M

The "synchronous preemption" occurs when one (low-priority) thread is preempted by another (high-priority) thread. QK handles this case as a regular function call. This function call happens inside the QActive post function:

Synchronous preemption scenario in QK
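The idea of synchronous preemption as a nested function call can be illustrated with a small host-side model. This is a simplified sketch, not the QK source code: all names (post(), activate(), run_demo(), and so on) are hypothetical, events are reduced to a one-bit "ready" flag per priority, and critical sections are omitted.

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_PRIO = 8, TRACE_LEN = 16 };

typedef void (*Handler)(void);

static Handler  handlers[MAX_PRIO + 1]; /* indexed by priority 1..MAX_PRIO */
static uint32_t ready_set;   /* bit (p-1) set: priority p has work to do */
static unsigned curr_prio;   /* priority of the running thread, 0 == idle */
static char     trace[TRACE_LEN]; /* records the order of execution */
static unsigned trace_n;

static void log_char(char c) {
    if (trace_n < (unsigned)(TRACE_LEN - 1)) {
        trace[trace_n++] = c;
        trace[trace_n] = '\0';
    }
}

static unsigned highest_ready(void) {
    unsigned p = MAX_PRIO;
    while ((p > 0U) && ((ready_set & (1U << (p - 1U))) == 0U)) {
        --p;
    }
    return p; /* 0 means nothing is ready */
}

/* run every ready thread that is more urgent than the preempted one */
static void activate(void) {
    unsigned const saved = curr_prio;
    unsigned p;
    while ((p = highest_ready()) > saved) {
        ready_set &= ~(1U << (p - 1U));
        curr_prio = p;
        handlers[p](); /* run-to-completion step: a plain C call */
    }
    curr_prio = saved; /* resume the preempted thread by returning */
}

void post(unsigned prio) {
    assert((prio >= 1U) && (prio <= (unsigned)MAX_PRIO));
    ready_set |= 1U << (prio - 1U);
    if (prio > curr_prio) {
        activate(); /* synchronous preemption inside the post */
    }
}

static void high_thread(void) { log_char('H'); }

static void low_thread(void) {
    log_char('l');  /* low-priority thread starts */
    post(2U);       /* posting to a higher priority preempts right here */
    log_char('L');  /* low-priority thread continues afterwards */
}

const char *run_demo(void) {
    handlers[1] = low_thread;
    handlers[2] = high_thread;
    ready_set = 0U; curr_prio = 0U; trace_n = 0U; trace[0] = '\0';
    post(1U);
    return trace;
}
```

In this model the high-priority handler runs in the middle of the low-priority one, on the same stack, and the low-priority handler resumes simply because the nested call returns.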

The "asynchronous preemption" occurs when an interrupt posts an event to a thread of higher priority than the one currently executing. In ARM Cortex-M, this preemption is handled in the PendSV exception handler:

Asynchronous preemption scenarios in QK

0 The timeline begins with the QK executing the idle loop.
1 At some point an interrupt occurs and the CPU immediately suspends the idle loop, pushes the interrupt stack frame to the Main Stack and starts executing the ISR.
2 The ISR performs its work and, as always in QK, must call the QK_ISR_EXIT() macro, which calls the QK scheduler (QK_sched_()) to determine if there is a higher-priority AO to run. If so, the macro sets the pending flag for the PendSV exception in the NVIC. The priority of the PendSV exception is configured to be the lowest of all exceptions (0xFF), so the ISR continues executing while the PendSV exception remains pending. At the ISR return, the ARM Cortex-M CPU performs tail-chaining to the pending PendSV exception.
3 The PendSV exception is entered via tail-chaining.
4 The job of the PendSV exception is to run the QK "activator" (QK_activate_()), which in turn runs all threads.

NOTE: The QK activator must run in the thread context, while PendSV executes in the exception context. The change of the context is accomplished by returning from the PendSV exception directly to the QK "activator". To return directly to the QK activator, PendSV synthesizes an exception stack frame, which contains the exception return address set to QK_activate_().

The QK "activator" enables interrupts and calls the Low-priority thread (a regular C function call). The Low-priority thread (active object) starts running.
5 Some time later a low-priority interrupt occurs. The Low-priority thread is suspended and the CPU pushes the interrupt stack frame to the Main Stack and starts executing the ISR.
6 Before the Low-priority ISR completes, it too gets preempted by a High-priority ISR. The CPU pushes another interrupt stack frame and starts executing the High-priority ISR.
7 The High-priority ISR sets the pending flag for the PendSV exception by means of the QK_ISR_EXIT() macro. When the High-priority ISR returns, the NVIC does not tail-chain to the PendSV exception, because a higher-priority ISR than PendSV is still active.
8 The NVIC performs an exception return to the preempted Low-priority interrupt, which finally completes.
9 Upon the exit from the Low-priority ISR, it too sets the pending flag for the PendSV exception by means of the QK_ISR_EXIT() macro. The PendSV is already pended from the High-priority interrupt, so pending it again is redundant, but it is not an error.
10 The NVIC performs tail-chaining to the PendSV exception.
11 The PendSV exception returns to the QK activator as previously described. The QK activator detects that the High-priority thread is ready to run and launches the High-priority thread (normal C-function call). The High-priority thread runs to completion and returns to the activator.
12 The QK activator does not find any more higher-priority threads to execute and needs to return to the preempted thread. The only way to restore the interrupted context in ARM Cortex-M is through the interrupt return, but the thread is executing outside of the interrupt context (in fact, threads are executing in the Privileged Thread mode). The thread enters the Handler mode by pending the NMI exception.

NOTE: The NMI exception is pended while interrupts are still disabled. This is not a problem, because NMI cannot be masked by disabling interrupts, so it runs without any problems.
13 The only job of the NMI exception is to discard its own interrupt stack frame, re-enable interrupts, and return using the interrupt stack frame that has been on the stack from the moment of thread preemption.
14 The Low-priority thread, which has been preempted all that time, resumes and finally runs to completion and returns to the QK activator. The QK activator does not find any more threads to launch and causes the NMI exception to return to the preempted thread.
15 The NMI exception discards its own interrupt stack frame and returns using the interrupt stack frame from the preempted thread context.
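The nested-interrupt part of the timeline (steps 5 through 10) can be replayed with a small host-side model. This is only a sketch under stated assumptions: the names isr_entry_model(), isr_exit_model(), pendsv_model(), and replay_nested_isrs() are hypothetical, and the NVIC is reduced to a nesting counter plus one pending flag.

```c
#include <assert.h>
#include <stdbool.h>

static bool     pendsv_pending; /* models the PendSV pending state in the NVIC */
static unsigned isr_nesting;    /* number of ISRs currently active */
unsigned pendsv_runs;           /* how many times the PendSV model was entered */
unsigned pend_requests;         /* how many times an ISR pended PendSV */

/* stands in for PendSV_Handler(): un-pend, then (in the real port)
 * return to the activator
 */
static void pendsv_model(void) {
    pendsv_pending = false;
    ++pendsv_runs;
}

void isr_entry_model(void) {
    ++isr_nesting;
}

/* models QK_ISR_EXIT() followed by the exception return */
void isr_exit_model(bool higher_prio_ready) {
    assert(isr_nesting > 0U);
    if (higher_prio_ready) {
        pendsv_pending = true; /* pending twice is redundant but harmless */
        ++pend_requests;
    }
    --isr_nesting;
    /* PendSV has the lowest priority, so it is entered only after
     * the last nested ISR has exited
     */
    if ((isr_nesting == 0U) && pendsv_pending) {
        pendsv_model();
    }
}

/* replays steps 5-10; returns the number of PendSV entries observed
 * while the Low-priority ISR was still active
 */
unsigned replay_nested_isrs(void) {
    unsigned runs_while_nested;
    pendsv_pending = false;
    isr_nesting = 0U;
    pendsv_runs = 0U;
    pend_requests = 0U;

    isr_entry_model();      /* step 5: Low-priority ISR starts */
    isr_entry_model();      /* step 6: High-priority ISR preempts it */
    isr_exit_model(true);   /* steps 7-8: PendSV pended, no tail-chaining yet */
    runs_while_nested = pendsv_runs;
    isr_exit_model(true);   /* steps 9-10: pended again, then tail-chaining */
    return runs_while_nested;
}
```

In this model PendSV is requested twice but entered only once, and only after the outermost ISR exits, which is the behavior described in steps 7 through 10.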

The qf_port.h Header File

The QF header file for the ARM Cortex-M port is located in /ports/arm-cm/qk/gnu/qf_port.h. This file is almost identical to the one in the QV port, except that the header file in the QK port includes the qk_port.h header file instead of qv_port.h. The most important function of qk_port.h is specifying interrupt entry and exit.

Note: Like any preemptive kernel, QK needs to be notified about entering and exiting the interrupt context in order to perform a context switch, if necessary.

Listing: qk_port.h header file for ARM Cortex-M

    /* determination if the code executes in the ISR context */
[1] #define QK_ISR_CONTEXT_() (QK_get_IPSR() != (uint32_t)0)
 
    __attribute__((always_inline))
[2] static inline uint32_t QK_get_IPSR(void) {
        uint32_t regIPSR;
        __asm volatile ("mrs %0,ipsr" : "=r" (regIPSR));
        return regIPSR;
    }
 
    /* QK interrupt entry and exit */
[3] #define QK_ISR_ENTRY() ((void)0)
 
[4] #define QK_ISR_EXIT()  do { \
[5]     QF_INT_DISABLE(); \
[6]     if (QK_sched_() != (uint_fast8_t)0) { \
[7]        (*Q_UINT2PTR_CAST(uint32_t, 0xE000ED04U) = (uint32_t)(1U << 28)); \
        } \
[8]     QF_INT_ENABLE(); \
    } while (0)
 
    /* initialization of the QK kernel */
[9] #define QK_INIT() QK_init()
    void QK_init(void);
 
    #include "qk.h" /* QK platform-independent public interface */

1 The macro QK_ISR_CONTEXT_() returns true when the code executes in the ISR context and false otherwise. The macro takes advantage of the ARM Cortex-M register IPSR, which is non-zero when the CPU executes an exception (or interrupt) and is zero when the CPU is executing thread code.

NOTE: QK needs to distinguish between ISR and thread contexts, because threads need to perform synchronous context switch (when a higher-priority thread becomes ready to run), while ISRs should not do that.
2 The inline function QK_get_IPSR() obtains the IPSR register and returns it to the caller. This function is defined explicitly for the GNU-ARM toolchain, but many other toolchains provide this function as an intrinsic, built-in facility.
3 The QK_ISR_ENTRY() macro notifies QK about entering an ISR. The macro is empty, because the determination of the ISR vs. thread context is performed independently in the QK_ISR_CONTEXT_() macro (see above).
4 The QK_ISR_EXIT() macro notifies QK about exiting an ISR.
5 Interrupts are disabled before calling QK scheduler.
6 The QK scheduler is called to find out whether an active object of a higher priority than the current one needs activation. The QK_sched_() function returns a non-zero value if this is the case.
7 If asynchronous preemption becomes necessary, the code sets the PENDSVSET bit (bit 28) in the ICSR register (Interrupt Control and State Register). The register is mapped at address 0xE000ED04 in all ARM Cortex-M cores.
8 The interrupts are re-enabled after they have been disabled in step [5].

NOTE: Because the priority of the PendSV exception is the lowest of all interrupts, it is actually triggered only after all nested interrupts exit. The PendSV exception is then entered through the efficient tail-chaining process, which eliminates the need to restore and re-enter the interrupt context.
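The register write in step [7] can be tried on a host by passing the register address as a pointer. The following is a minimal sketch, not the QK code: the function name isr_exit_pend() and the macro names are hypothetical, while the bit positions 28, 27, and 31 and the address 0xE000ED04 are the ones used in the listings of this section.

```c
#include <assert.h>
#include <stdint.h>

#define ICSR_ADDR       0xE000ED04U           /* Interrupt Control and State Register */
#define ICSR_NMIPENDSET ((uint32_t)1U << 31)  /* write 1 to pend NMI */
#define ICSR_PENDSVSET  ((uint32_t)1U << 28)  /* write 1 to pend PendSV */
#define ICSR_PENDSVCLR  ((uint32_t)1U << 27)  /* write 1 to un-pend PendSV */

/* Hypothetical helper: the register is passed as a pointer, so the logic
 * can be exercised on a host with an ordinary variable. On the target,
 * the argument would be (volatile uint32_t *)ICSR_ADDR.
 */
void isr_exit_pend(volatile uint32_t *icsr, unsigned need_switch) {
    assert(icsr != (volatile uint32_t *)0);
    if (need_switch != 0U) {
        /* plain store, as in the listing: the set/clear bits of ICSR act
         * only when written as 1, so no read-modify-write is needed
         */
        *icsr = ICSR_PENDSVSET;
    }
}
```

On the target the call would sit between disabling and re-enabling interrupts, as in steps [5] and [8], with the result of the scheduler query as the second argument.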

QK Port Implementation for ARM Cortex-M

The QK port to ARM Cortex-M requires coding the PendSV and NMI exceptions in assembly. This ARM Cortex-M-specific code, as well as the QK initialization (QK_init()), is located in the file ports/arm-cm/qk/gnu/qk_port.c.

Note: The single module qk_port.c contains common code for all Cortex-M variants (Architecture v6M and v7M) as well as options with and without the VFP. The CPU variants are distinguished by conditional compilation, when necessary.

QK_init() Implementation

Listing: QK_init() function in qk_port.c file

[1] void QK_init(void) {
 
[2] #if (__ARM_ARCH != 6) /* NOT Cortex-M0/M0+/M1 (v6-M, v6S-M)? */
 
        uint32_t n;
 
        /* set exception priorities to QF_BASEPRI...
        * SCB_SYSPRI1: Usage-fault, Bus-fault, Memory-fault
        */
[3]     SCB_SYSPRI[1] |= (QF_BASEPRI << 16) | (QF_BASEPRI << 8) | QF_BASEPRI;
 
        /* SCB_SYSPRI2: SVCall */
[4]     SCB_SYSPRI[2] |= (QF_BASEPRI << 24);
 
        /* SCB_SYSPRI3:  SysTick, PendSV, Debug */
[5]     SCB_SYSPRI[3] |= (QF_BASEPRI << 24) | (QF_BASEPRI << 16) | QF_BASEPRI;
 
        /* set all implemented IRQ priorities to QF_BASEPRI... */
[6]     n = 8U + ((*SCnSCB_ICTR & 0x7U) << 3); /* (# NVIC_PRIO registers)/4 */
        do {
            --n;
[7]         NVIC_IP[n] = (QF_BASEPRI << 24) | (QF_BASEPRI << 16)
                         | (QF_BASEPRI << 8) | QF_BASEPRI;
        } while (n != 0);
 
    #endif /* NOT Cortex-M0/M0+/M1(v6-M, v6S-M) */
 
        /* SCB_SYSPRI3: PendSV set to the lowest priority 0xFF */
[8]     SCB_SYSPRI[3] |= (0xFFU << 16);
    }

1 The QK_init() function is called from QF_init() to perform initialization specific to the QK kernel.
2 If the ARM Architecture is NOT v6 (Cortex-M0/M0+), that is for Cortex-M3/M4/M7, the function initializes the priorities of the configurable system exceptions (including PendSV) as well as the interrupt priorities of all IRQs available in a given MCU. (NOTE: For Cortex-M0/M0+, this initialization is not needed, as the CPU does not support the BASEPRI register and the only way to disable interrupts is via the PRIMASK register. In this case, all interrupts are "kernel-aware" and there is no need to initialize interrupt priorities to a safe value.)
3 Exception priorities of Usage-fault, Bus-fault, and Memory-fault are set to QF_BASEPRI.
4 The exception priority of SVCall is set to QF_BASEPRI.
5 Exception priorities of SysTick, PendSV and Debug are set to QF_BASEPRI.

NOTE: The exception priority of PendSV is later changed to 0xFF in step [8].
6 The number of implemented interrupts is extracted from the SCnSCB_ICTR register.
7 Interrupt priorities of all implemented interrupts are set to QF_BASEPRI.
8 Exception priority of PendSV is set to 0xFF, which is the lowest interrupt priority in the system.
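The arithmetic in steps [6] and [7] can be checked on a host. The sketch below mirrors the expressions from the listing, but the function names nvic_prio_words(), prio_word(), and init_prios() are hypothetical, the NVIC priority registers are modeled as an ordinary array of 32-bit words, and the priority value used in the usage note (0x3F) is just an example value, not necessarily the QF_BASEPRI of a given port.

```c
#include <assert.h>
#include <stdint.h>

/* number of 32-bit priority words to initialize, computed from the
 * ICTR value with the same expression as step [6]
 */
uint32_t nvic_prio_words(uint32_t ictr) {
    return 8U + ((ictr & 0x7U) << 3);
}

/* replicate one 8-bit priority into all four byte lanes of a word,
 * as in step [7]
 */
uint32_t prio_word(uint32_t prio) {
    assert(prio <= 0xFFU);
    return (prio << 24) | (prio << 16) | (prio << 8) | prio;
}

/* host model of the do-while loop from the listing */
void init_prios(uint32_t *nvic_ip, uint32_t n, uint32_t prio) {
    assert(nvic_ip != (uint32_t *)0);
    assert(n > 0U);
    do {
        --n;
        nvic_ip[n] = prio_word(prio);
    } while (n != 0U);
}
```

For example, an ICTR value of 0 yields 8 words (32 interrupt priorities), and a priority of 0x3F yields the word 0x3F3F3F3F for each of them.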

PendSV_Handler() Implementation

Listing: PendSV_Handler() and Thread_ret() functions in qk_port.c file

 [1] __attribute__ ((naked))
 [2] void PendSV_Handler(void) {
 [3] __asm volatile (
 
         /* Prepare constants in registers before entering critical section */
 [4]     "  LDR     r3,=" STRINGIFY(NVIC_ICSR) "\n" /* Interrupt Control and State */
 [5]     "  MOV     r1,#1            \n"
 [6]     "  LSL     r1,r1,#27        \n" /* r1 := (1 << 27) (PENDSVCLR bit) */
 
         /*<<<<<<<<<<<<<<<<<<<<<<< CRITICAL SECTION BEGIN <<<<<<<<<<<<<<<<<<<<<<<<*/
     #if (__ARM_ARCH == 6)               /* Cortex-M0/M0+/M1 (v6-M, v6S-M)? */
 [7]     "  CPSID   i                \n" /* disable interrupts (set PRIMASK) */
     #else                               /* M3/M4/M7 */
     #if (__ARM_FP != 0)                 /* if VFP available... */
 [8]     "  PUSH    {r0,lr}          \n" /* ... push lr plus stack-aligner */
     #endif                              /* VFP available */
 [9]     "  MOV     r0,#" STRINGIFY(QF_BASEPRI) "\n"
[10]     "  CPSID   i                \n" /* disable interrupts with BASEPRI */
[11]     "  MSR     BASEPRI,r0       \n" /* apply the Cortex-M7 erratum */
[12]     "  CPSIE   i                \n" /* 837070, see ARM-AT610-611. */
     #endif                              /* M3/M4/M7 */
 
         /* The PendSV exception handler can be preempted by an interrupt,
         * which might pend PendSV exception again. The following write to
         * ICSR[27] un-pends any such spurious instance of PendSV.
         */
[13]     "  STR     r1,[r3]          \n" /* ICSR[27] := 1 (unpend PendSV) */
 
         /* The QK activator must be called in a Thread mode, while this code
         * executes in the Handler mode of the PendSV exception. The switch
         * to the Thread mode is accomplished by returning from PendSV using
         * a fabricated exception stack frame, where the return address is
         * QK_activate_().
         *
         * NOTE: the QK activator is called with interrupts DISABLED and also
         * returns with interrupts DISABLED.
         */
[14]     "  LSR     r3,r1,#3         \n" /* r3 := (r1 >> 3), set the T bit (new xpsr) */
[15]     "  LDR     r2,=QK_activate_ \n" /* address of QK_activate_ */
[16]     "  SUB     r2,r2,#1         \n" /* align Thumb-address at halfword (new pc) */
 
[17]     "  LDR     r1,=Thread_ret   \n" /* return address after the call   (new lr) */
[18]     "  SUB     sp,sp,#8*4       \n" /* reserve space for exception stack frame */
[19]     "  ADD     r0,sp,#5*4       \n" /* r0 := sp + 5 registers (slot of stacked lr) */
[20]     "  STM     r0!,{r1-r3}      \n" /* save xpsr,pc,lr */
 
[21]     "  MOV     r0,#6            \n"
[22]     "  MVN     r0,r0            \n" /* r0 := ~6 == 0xFFFFFFF9 */
[23]     "  BX      r0               \n" /* exception-return to the QK activator */
         );
     }
 
     /****************************************************************************/
     __attribute__ ((naked))
[24] void Thread_ret(void) {
     __asm volatile (
 
         /* After the QK activator returns, we need to resume the preempted
         * thread. However, this must be accomplished by a return-from-exception,
         * while we are still in the thread context. The switch to the exception
         * context is accomplished by triggering the NMI exception.
         * NOTE: The NMI exception is triggered with interrupts DISABLED,
         * because QK activator disables interrupts before return.
         */
 
         /* before triggering the NMI exception, make sure that the
         * VFP stack frame will NOT be used...
         */
     #if (__ARM_FP != 0)                 /* if VFP available... */
[25]     "  MRS     r0,CONTROL       \n" /* r0 := CONTROL */
[26]     "  BICS    r0,r0,#4         \n" /* r0 := r0 & ~4 (FPCA bit) */
[27]     "  MSR     CONTROL,r0       \n" /* CONTROL := r0 (clear CONTROL[2] FPCA bit) */
[28]     "  ISB                      \n" /* ISB after MSR CONTROL (ARM AN321,Sect.4.16) */
     #endif                              /* VFP available */
 
         /* trigger NMI to return to preempted task...
         * NOTE: The NMI exception is triggered with interrupts DISABLED
         */
[29]     "  LDR     r0,=0xE000ED04   \n" /* Interrupt Control and State Register */
[30]     "  MOV     r1,#1            \n"
[31]     "  LSL     r1,r1,#31        \n" /* r1 := (1 << 31) (NMI bit) */
[32]     "  STR     r1,[r0]          \n" /* ICSR[31] := 1 (pend NMI) */
[33]     "  B       .                \n" /* wait for preemption by NMI */
         );
     }

1 Attribute naked means that the GNU-ARM compiler won't generate any entry/exit code for this function.
2 PendSV_Handler is the CMSIS-compliant name of the PendSV exception handler. The PendSV_Handler exception is always entered via tail-chaining from the last nested interrupt.
3 The entire body of this function is defined in this one inline-assembly statement.
4,5,6 Before interrupts are disabled, the following constants are loaded into registers: address of ICSR into r3 and (1<<27) into r1.

For the ARMv6-M architecture (Cortex-M0/M0+)...

7 Interrupts are globally disabled by setting PRIMASK (see Section 3)

Otherwise, for the ARMv7-M architecture (Cortex-M3/4/7) and when the __ARM_FP macro is defined...

NOTE: The symbol __ARM_FP is defined by the GNU-ARM compiler when the compile options indicate that the ARM FPU is used.

8 The lr register (EXC_RETURN) is pushed to the stack along with r0, to keep the stack aligned at 8-byte boundary.

NOTE: In the presence of the FPU (Cortex-M4F/M7), the EXC_RETURN[4] bit carries the information about the stack frame format used, where EXC_RETURN[4] == 0 means that the stack contains room for the S0-S15 and FPSCR registers in addition to the usual R0-R3,R12,LR,PC,xPSR registers. This information must be preserved, in order to properly return from the exception at the end.
9 For the ARMv7-M architecture (Cortex-M3/M4), interrupts are selectively disabled by setting the BASEPRI register.

NOTE: The value moved to BASEPRI must be identical to the QF_BASEPRI macro defined in qf_port.h.
10 Before setting the BASEPRI register, interrupts are disabled with the PRIMASK register, which is the recommended workaround for the Cortex-M7 r0p1 hardware bug, as described in the ARM Ltd. [ARM-AT610-611], Erratum 837070.
11 The BASEPRI register is set to the QF_BASEPRI value.
12 After setting the BASEPRI register, interrupts are re-enabled with the PRIMASK register, which is the recommended workaround for the Cortex-M7 r0p1 hardware bug, as described in the ARM Ltd. [ARM-AT610-611], Erratum 837070.
13 The PendSV exception is explicitly un-pended.

NOTE: The PendSV exception handler can be preempted by an interrupt, which might pend PendSV exception again. This would trigger PendSV incorrectly again immediately after calling QK activator.

The following code [14-23] fabricates an exception stack frame, to perform an exception-return to the QK activator without destroying the original exception stack frame of the PendSV exception. This is necessary to preserve the context of the preempted code.

14 The value (1 << 24) is synthesized in r3 from the value (1 << 27) already available in r1. This value is going to be stacked and later restored to xPSR register (only the T bit set).
15 The address of the QK activator function QK_activate_() is loaded into r2. This will be pushed to the stack as the PC register value.
16 The address of the QK activator function QK_activate_() in r2 is adjusted to be half-word aligned instead of being an odd THUMB address.

NOTE: This is necessary, because the value will be loaded directly to the PC, which cannot accept odd values.
17 The address of the Thread_ret() function is loaded into r1. This will be pushed to the stack as the lr register value.

NOTE: The address of the Thread_ret label must be a THUMB address, that is, the least-significant bit of this address must be set (this address must be an odd number). This is essential for the correct return of the QK activator with the THUMB bit set in the PSR. Without the LS-bit set, the ARM Cortex-M CPU will clear the T bit in the PSR and cause a Hard Fault. The GNU-ARM assembler/linker will synthesize the correct THUMB address of the Thread_ret label only if this label is declared with the .type Thread_ret , function attribute (see step [24]).
18 The stack pointer is adjusted to leave room for 8 registers.
19 The top of stack, adjusted by 5 registers, (r0, r1, r2, r3, and r12) is stored to r0.

20 The values of xpsr, pc, and lr prepared in r3, r2, and r1, respectively, are pushed on the top of stack (now in r0). This operation completes the synthesis of the exception stack frame. After this step the stack looks as follows:

Hi memory
           (optionally S0-S15, FPSCR), if EXC_RETURN[4]==0
           xPSR
           pc (interrupt return address)
           lr
           r12
           r3
           r2
           r1
           r0
           EXC_RETURN (pushed in step [8] if FPU is present)
old SP --> "aligner"  (pushed in step [8] if FPU is present)
           xPSR == 0x01000000
           PC == QK_activate_
           lr == Thread_ret
           r12  don't care
           r3   don't care
           r2   don't care
           r1   don't care
    SP --> r0   don't care
Low memory

21-22 The special exception-return value 0xFFFFFFF9 is synthesized in r0 (two instructions are used to make the code compatible with Cortex-M0, which has no barrel shifter).

NOTE: The r0 register is used instead of lr because the Cortex-M0 instruction set cannot manipulate the high registers (r8-r15).
NOTE: The exception-return value is consistent with the synthesized stack frame: the lr[4] bit is set to 1, which means that the FPU registers are not included in this stack frame.
23 The PendSV exception returns using the special value 0xFFFFFFF9 in the r0 register (return to Privileged Thread mode using the Main Stack Pointer). The synthesized stack frame actually causes a function call to the QK_activate_() function in C.

NOTE: The return from the PendSV exception just executed switches the ARM Cortex-M core to the Privileged Thread mode. The QK_activate_() function internally re-enables interrupts before launching any thread, so the threads always run in the Thread mode with interrupts enabled and can be preempted by interrupts of any priority.
NOTE: In the presence of the FPU, the exception-return to the QK activator does not change any of the FPU status bits, such as CONTROL.FPCA or LSPACT.
24 The Thread_ret function is the place where the QK activator QK_activate_() returns to, because this return address is placed in the synthesized stack frame in steps [17] and [20]. Please note that the address of the Thread_ret label must be a THUMB address.
25-28 If the FPU is present, the read-modify-write code clears bit 2 of the CONTROL register. This bit, called CONTROL.FPCA (Floating-Point Context Active), would cause generating the FPU-type stack frame, which you want to avoid in this case (because the NMI exception will certainly not use the FPU).

NOTE: Clearing the CONTROL.FPCA bit occurs with interrupts disabled, so it is protected from a context switch.
29-32 The asynchronous NMI exception is triggered by setting ICSR[31]. The job of this exception is to put the CPU into the exception mode and correctly return to the thread level.
33 This endless loop should not be reached, because the NMI exception should preempt the code immediately after step [32].
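The frame synthesis of steps [14-23] can be mirrored in C on a host, which makes the layout and the constants easy to check. This is only an illustrative sketch: the type ExcFrame and the functions fabricate_frame() and exc_return_thread_msp() are hypothetical names, the frame is the basic 8-register format (no FPU registers), and addresses are passed in as plain 32-bit numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* basic (non-FPU) exception stack frame, lowest address first */
typedef struct {
    uint32_t r0, r1, r2, r3, r12; /* "don't care" in the synthesized frame */
    uint32_t lr;                  /* return address of the activator */
    uint32_t pc;                  /* where the exception return lands */
    uint32_t xpsr;                /* only the T bit set */
} ExcFrame;

/* fills the three meaningful slots, as steps [14-20] do */
void fabricate_frame(ExcFrame *frame,
                     uint32_t activator_addr,   /* odd THUMB address */
                     uint32_t thread_ret_addr)  /* odd THUMB address */
{
    assert(frame != NULL);
    assert((activator_addr & 1U) != 0U);
    assert((thread_ret_addr & 1U) != 0U);
    frame->lr   = thread_ret_addr;           /* step [17]: stays odd */
    frame->pc   = activator_addr - 1U;       /* step [16]: halfword aligned */
    frame->xpsr = ((uint32_t)1U << 27) >> 3; /* step [14]: (1 << 24), T bit */
}

/* EXC_RETURN value built in steps [21-22]: Thread mode, Main Stack,
 * basic frame
 */
uint32_t exc_return_thread_msp(void) {
    return ~(uint32_t)6U; /* 0xFFFFFFF9 */
}
```

The lr slot sits 5 words above the start of the frame, which is the offset used in step [19], and the whole frame occupies the 8 words reserved in step [18].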

NMI_Handler() Implementation

Listing: NMI_Handler() function in qk_port.c file

    __attribute__ ((naked))
[1] void NMI_Handler(void) {
    __asm volatile (
 
[2]     "  ADD     sp,sp,#(8*4)     \n" /* remove one 8-register exception frame */
 
    #if (__ARM_ARCH == 6)               /* Cortex-M0/M0+/M1 (v6-M, v6S-M)? */
[3]     "  CPSIE   i                \n" /* enable interrupts (clear PRIMASK) */
[4]     "  BX      lr               \n" /* return to the preempted task */
    #else                               /* M3/M4/M7 */
[5]     "  MOV     r0,#0            \n"
[6]     "  MSR     BASEPRI,r0       \n" /* enable interrupts (clear BASEPRI) */
    #if (__ARM_FP != 0)                 /* if VFP available... */
[7]     "  POP     {r0,pc}          \n" /* pop stack aligner and EXC_RETURN to PC */
    #else                               /* no VFP */
[8]     "  BX      lr               \n" /* return to the preempted task */
    #endif                              /* no VFP */
    #endif                              /* M3/M4/M7 */
        );
    }

1 The NMI_Handler is the CMSIS-compliant name of the NMI exception handler. This exception is triggered after returning from the QK activator in step [32] of the previous listing. The job of NMI is to discard its own stack frame and cause the exception-return to the original preempted thread context. The stack contents just after entering NMI are shown below:

Hi memory
           (optionally S0-S15, FPSCR), if EXC_RETURN[4]==0
           xPSR
           pc (interrupt return address)
           lr
           r12
           r3
           r2
           r1
           r0
           EXC_RETURN (pushed in PendSV [8] if FPU is present)
old SP --> "aligner"  (pushed in PendSV [8] if FPU is present)
           xPSR don't care
           PC   don't care
           lr   don't care
           r12  don't care
           r3   don't care
           r2   don't care
           r1   don't care
    SP --> r0   don't care
Low memory

2 The stack pointer is adjusted to un-stack the 8 registers of the interrupt stack frame corresponding to the NMI exception itself. This moves the stack pointer from "SP" to the "old SP" in the picture above, which "uncovers" the original exception stack frame left by the PendSV exception.
3 For ARMv6-M, interrupts are enabled by clearing the PRIMASK.
4 For ARMv6-M, The NMI exception returns to the preempted thread using the standard EXC_RETURN, which is in lr.
5-6 For the ARMv7-M, interrupts are enabled by writing 0 into the BASEPRI register.
7 If the FPU is used, the "stack aligner" and the EXC_RETURN value saved in PendSV step [8] are popped from the stack into r0 and pc, respectively. Loading EXC_RETURN into the pc performs the exception return to the preempted thread.
8 Otherwise, NMI returns to the preempted thread using the standard EXC_RETURN, which is in lr.

Detailed stack allocation in QK for ARM Cortex-M

Writing ISRs for QK

The ARM Cortex-M CPU is designed to use regular C functions as exception and interrupt service routines (ISRs).

Note: The ARM EABI (Embedded Application Binary Interface) requires the stack be 8-byte aligned, whereas some compilers guarantee only 4-byte alignment. For that reason, some compilers (e.g., GNU-ARM) provide a way to designate ISR functions as interrupts. For example, the GNU-ARM compiler provides the __attribute__((__interrupt__)) designation that will guarantee the 8-byte stack alignment.

Typically, ISRs are application-specific (with the main purpose to produce events for active objects). Therefore, ISRs are not part of the generic QP port, but rather part of the BSP (Board Support Package).

The following listing shows an example of the SysTick_Handler() ISR (from the DPP example application). This ISR calls the QF_TICK_X() macro to perform QF time-event management.

Listing: An ISR header for QK

    void SysTick_Handler(void) __attribute__((__interrupt__));
    void SysTick_Handler(void) {
         ~ ~ ~
[1]      QK_ISR_ENTRY();   /* inform QK about entering an ISR */
         ~ ~ ~
         QF_TICK_X(0U, &l_SysTick_Handler); /* process all armed time events */
         ~ ~ ~
[2]      QK_ISR_EXIT();    /* inform QK about exiting an ISR */
    }

1 Every ISR for QK must call QK_ISR_ENTRY() before calling any QP API.
2 Every ISR for QK must call QK_ISR_EXIT() right before exiting to let the QK kernel schedule an asynchronous preemption, if necessary.

Note: The QK port to ARM Cortex-M complies with the requirement of the ARM-EABI to preserve stack pointer alignment at 8-byte boundary. Also, all QP examples for ARM Cortex-M comply with the CMSIS naming convention for all exception handlers and IRQ handlers.

Using the FPU in the QK Port (Cortex-M4F/M7)

If you have the Cortex-M4F CPU and your application uses the hardware FPU, it should be enabled because it is turned off out of reset. The CMSIS-compliant way of turning the FPU on looks as follows:

    SCB->CPACR |= (0xFU << 20);

Note: The FPU must be enabled before executing any floating point instruction. An attempt to execute a floating point instruction will fault if the FPU is not enabled.

Depending on whether or not you use the FPU in your ISRs, the QK port allows you to configure the FPU in various ways, as described in the following sub-sections.

FPU used in ONE thread only and not in any ISR

If you use the FPU only in a single thread (active object) and none of your ISRs use the FPU, you can set up the FPU not to use the automatic state preservation and not to use the lazy stacking feature as follows:

    FPU->FPCCR &= ~((1U << FPU_FPCCR_ASPEN_Pos) | (1U << FPU_FPCCR_LSPEN_Pos));

With this setting, the Cortex-M4F processor handles the ISRs in exactly the same way as Cortex-M0 through M3, that is, only the standard interrupt frame with R0-R3, R12, LR, PC, and xPSR is used. This scheme is the fastest and incurs no additional CPU cycles to save and restore the FPU registers.

Note: This FPU setting will lead to FPU errors if more than one thread or any of the ISRs start to use the FPU.
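To make the effect of the two control bits concrete, here is a minimal host-side sketch that operates on a plain integer standing in for FPU->FPCCR. The helper names are hypothetical. The bit positions follow the CMSIS definitions (ASPEN is bit 31, LSPEN is bit 30). The second helper shows the opposite setting, with both bits set, for comparison.

```c
#include <stdint.h>

/* Bit positions matching the CMSIS FPU_FPCCR_*_Pos definitions. */
#define FPCCR_ASPEN_POS  31U  /* automatic state preservation enable */
#define FPCCR_LSPEN_POS  30U  /* lazy state preservation enable */
#define FPCCR_FP_MASK \
    ((uint32_t)((1UL << FPCCR_ASPEN_POS) | (1UL << FPCCR_LSPEN_POS)))

/* Hypothetical helper: value for "FPU in one thread, no ISRs".
 * Clears ASPEN and LSPEN and leaves all other bits untouched. */
uint32_t fpccr_no_auto_save(uint32_t fpccr) {
    return fpccr & ~FPCCR_FP_MASK;
}

/* Hypothetical helper: value with automatic and lazy stacking enabled. */
uint32_t fpccr_auto_save(uint32_t fpccr) {
    return fpccr | FPCCR_FP_MASK;
}
```

Starting from 0xC0000000 (both bits set), fpccr_no_auto_save() yields 0, which corresponds to the basic-frame-only behavior described above.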

FPU used in more than one thread or in any ISR

If you use the FPU in more than one of the threads (active objects) or in any of your ISRs, you should set up the FPU to use the automatic state preservation and the lazy stacking feature as follows:

    FPU->FPCCR |= (1U << FPU_FPCCR_ASPEN_Pos) | (1U << FPU_FPCCR_LSPEN_Pos);

This is actually the default setting of the hardware FPU and is recommended for the QK port, because it is safer in view of code evolution. Future changes to the application can easily introduce FPU use in multiple active objects, which would be unsafe if the FPU context was not preserved automatically.

Note: As described in the ARM Application Note "Cortex-M4(F) Lazy Stacking and Context Switching" [ARM-AN298], the FPU automatic state saving requires more stack plus additional CPU time to save the FPU registers, but only when the FPU is actually used.
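The extra stack cost can be estimated with simple arithmetic. The sketch below is a rough model and not part of the QK port: it assumes the basic exception frame of 8 words and the extended frame that adds 18 words (S0-S15, FPSCR, and one reserved word), and it ignores the optional padding word used for 8-byte alignment. The function names are hypothetical.

```c
#include <stdint.h>

enum {
    BASIC_FRAME_WORDS = 8,   /* R0-R3, R12, LR, PC, xPSR */
    FPU_EXTRA_WORDS   = 18,  /* S0-S15, FPSCR, one reserved word */
    WORD_BYTES        = 4
};

/* Bytes reserved by hardware on one exception entry.
 * fpu_context_active != 0 models an entry with an active FPU context. */
uint32_t exc_frame_bytes(int fpu_context_active) {
    uint32_t words = BASIC_FRAME_WORDS;
    if (fpu_context_active) {
        words += FPU_EXTRA_WORDS;
    }
    return words * WORD_BYTES;
}

/* Rough bound on hardware-stacked frames for a given nesting depth
 * (alignment padding and the handlers' own stack use are not counted). */
uint32_t worst_case_frames_bytes(uint32_t nesting_levels,
                                 int fpu_context_active)
{
    return nesting_levels * exc_frame_bytes(fpu_context_active);
}
```

Under these assumptions a basic frame takes 32 bytes and an extended frame 104 bytes, which is worth keeping in mind when sizing the single main stack that QK uses.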

QK Idle Processing Customization in QK_onIdle()

QK can very easily detect the situation when no events are available, in which case QK calls the QK_onIdle() callback. You can use QK_onIdle() to suspend the CPU to save power, if your CPU supports such a power-saving mode. Please note that QK_onIdle() is called repeatedly from an endless loop, which is the QK idle-thread. The QK_onIdle() callback is called with interrupts enabled (which is in contrast to the QV_onIdle() callback used in the non-preemptive configuration).

The THUMB-2 instruction set used exclusively in ARM Cortex-M provides a special instruction WFI (Wait-for-Interrupt) for stopping the CPU clock, as described in the "ARMv7-M Reference Manual" [ARM 06a]. The following listing shows the QK_onIdle() callback that puts ARM Cortex-M into a low-power mode.

Listing: QK_onIdle() for ARM Cortex-M

[1] void QK_onIdle(void) {
         ~ ~ ~
[2] #if defined NDEBUG
        /* Put the CPU and peripherals to the low-power mode.
        * you might need to customize the clock management for your application,
        * see the datasheet for your particular Cortex-M3 MCU.
        */
[3]     __WFI(); /* Wait-For-Interrupt */
    #endif
    }

1 The preemptive QK kernel calls the QK_onIdle() callback with interrupts enabled.
2 The sleep mode is used only in the non-debug configuration, because sleep mode stops CPU clock, which can interfere with debugging.
3 The WFI instruction is generated using inline assembly.

Testing QK Preemption Scenarios

The bsp.c file included in the examples/arm-cm/dpp_ek-tm4c123gxl/qk directory contains special instrumentation (an ISR designed for testing) for convenient testing of various preemption scenarios in QK.

The technique described in this section will allow you to trigger an interrupt at any machine instruction and observe the preemption it causes. The interrupt used for the testing purposes is the GPIOA interrupt (INTID == 0). The ISR for this interrupt is shown below:

    void GPIOPortA_IRQHandler(void) {
        QK_ISR_ENTRY(); /* inform QK about entering an ISR */
        QACTIVE_POST(AO_Table, Q_NEW(QEvt, MAX_PUB_SIG), /* for testing... */
                     &l_GPIOPortA_IRQHandler);
        QK_ISR_EXIT();  /* inform QK about exiting an ISR */
    }

GPIOPortA_IRQHandler(), like all interrupts in the system, invokes the macros QK_ISR_ENTRY() and QK_ISR_EXIT(), and also posts an event to the Table active object, which has higher priority than any of the Philo active objects.

The figure below shows how to trigger the GPIOA interrupt from the CCS debugger. From the debugger you need to first open the register window and select NVIC registers from the drop-down list (see bottom-right corner of Figure 6). You scroll to the NVIC_SW_TRIG register, which denotes the Software Trigger Interrupt Register in the NVIC. This write-only register is useful for software-triggering an interrupt by writing its interrupt number (INTID) to it. To trigger the GPIOA interrupt you need to write 0x00 to the NVIC_SW_TRIG by clicking on this field, entering the value, and pressing the Enter key.

Triggering the GPIOA interrupt from Eclipse debugger
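The value written in the debugger is simply the interrupt number. The following host-side sketch shows how that value is formed and, for comparison, which word and bit of the NVIC Set-Pending registers would pend the same interrupt. It assumes the ARMv7-M register layout (INTID in bits [8:0] of the Software Trigger Interrupt Register, 32 interrupts per Set-Pending word). The function names are hypothetical.

```c
#include <stdint.h>

/* The Software Trigger Interrupt Register takes the interrupt number
 * in its INTID field, bits [8:0]. */
#define STIR_INTID_MASK  0x1FFU

/* Value to write to the software trigger register for a given IRQ. */
uint32_t stir_value(uint32_t irq_num) {
    return irq_num & STIR_INTID_MASK;
}

/* Alternative route via the Set-Pending registers:
 * index of the 32-bit word that holds the IRQ's pending bit. */
uint32_t ispr_index(uint32_t irq_num) {
    return irq_num >> 5;
}

/* Bit mask of the IRQ within that Set-Pending word. */
uint32_t ispr_mask(uint32_t irq_num) {
    return (uint32_t)(1UL << (irq_num & 0x1FU));
}
```

For GPIOA (interrupt number 0) stir_value() returns 0, which matches the 0x00 entered in the debugger. In target code, the same effect could be obtained through CMSIS with NVIC->STIR or NVIC_SetPendingIRQ().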

The general testing strategy is to break into the application at an interesting place for preemption, set breakpoints to verify which path through the code is taken, and trigger the GPIO interrupt. Next, you need to free-run the code (don’t use single stepping) so that the NVIC can perform prioritization. You observe the order in which the breakpoints are hit. This procedure will become clearer after a few examples.

Interrupt Nesting Test

The first interesting test is verifying the correct tail-chaining to the PendSV exception after the interrupt nesting occurs, as shown in Synchronous Preemption in QK. To test this scenario, you place a breakpoint inside the GPIOPortA_IRQHandler() and also inside the SysTick_Handler() ISR. You run the application until the breakpoint in SysTick_Handler() is hit. You then remove that breakpoint and place another breakpoint at the very next machine instruction (use the Disassembly window) and also another breakpoint on the first instruction of the PendSV_Handler() exception handler. Next you trigger the GPIOA interrupt per the instructions given in the previous section. You hit the Run button.

The pass criteria of this test are as follows:

The first breakpoint hit is the one inside the GPIOPortA_IRQHandler() function, which means that GPIO ISR preempted the SysTick ISR.
The second breakpoint hit is the one in the SysTick_Handler(), which means that the SysTick ISR continues after the GPIOA ISR completes.
The last breakpoint hit is the one in PendSV_Handler() exception handler, which means that the PendSV exception is tail-chained only after all interrupts are processed. You need to remove all breakpoints before proceeding to the next test.

Thread Preemption Test

The next interesting test is verifying that threads can preempt each other. You set a breakpoint anywhere in the Philosopher state machine code. You run the application until the breakpoint is hit. After this happens, you remove the original breakpoint and place another breakpoint at the very next machine instruction (use the Disassembly window). You also place a breakpoint inside the GPIOPortA_IRQHandler() interrupt handler and on the first instruction of the PendSV_Handler() handler. Next you trigger the GPIOA interrupt per the instructions given in the previous section. You hit the Run button.

The pass criteria of this test are as follows:

The first breakpoint hit is the one inside the GPIOPortA_IRQHandler() function, which means that GPIO ISR preempted the Philo thread.
The second breakpoint hit is the one in PendSV_Handler() exception handler, which means that the PendSV exception is activated before the control returns to the preempted Philosopher thread.
After hitting the breakpoint in PendSV_Handler(), you single step into QK_activate_(). You verify that the activator invokes a state handler from the Table state machine. This proves that the Table thread preempts the Philo thread.
After this you free-run the application and verify that the next breakpoint hit is the one inside the Philosopher state machine. This validates that the preempted thread continues executing only after the preempting thread (the Table state machine) completes.
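The outcome of this test follows from the preemption rule: a thread is activated only when its priority is higher than that of the currently running thread. The sketch below is a simplified host-side model of that rule and is not the actual QK source code. The ready-set encoding (bit p - 1 set when the active object with QP priority p has a queued event) and the function names are assumptions made for illustration.

```c
#include <stdint.h>

/* Simplified model, NOT the actual QK implementation.
 * Bit (p - 1) of ready_set is set when the active object with
 * QP priority p has at least one event in its queue. */
uint8_t highest_ready_prio(uint32_t ready_set) {
    uint8_t p = 0U;
    while (ready_set != 0U) {  /* find position of the top set bit */
        ++p;
        ready_set >>= 1;
    }
    return p;  /* 0 means that no active object is ready */
}

/* Return the priority to activate next, or 0 when the currently
 * running thread (active_prio) must not be preempted. */
uint8_t next_prio_to_activate(uint32_t ready_set, uint8_t active_prio) {
    uint8_t const p = highest_ready_prio(ready_set);
    return (p > active_prio) ? p : 0U;
}
```

With Table at priority 6 and a Philo thread running at priority 3, an event posted to Table makes next_prio_to_activate() return 6, so Table preempts. An event posted to the running Philo itself returns 0, so the event simply waits in the queue.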

Testing the FPU (Cortex-M4F/M7)

In order to test the FPU, the Board Support Package (BSP) for the Cortex-M4F EK-TM4C123GXL board uses the FPU in the following contexts:

In the idle loop via the QK_onIdle() callback (QP priority 0)
In the thread level via the BSP_random() function called from all five Philo active objects (QP priorities 1-5).
In the thread level via the BSP_displayPhiloStat() function called from the Table active object (QP priority 6).
In the ISR level via the SysTick_Handler() ISR (priority above all threads)

To test the FPU, you could step through the code in the debugger and verify that the expected FPU-type exception stack frame is used and that the FPU registers are saved and restored by the "lazy stacking feature" when the FPU is actually used.

Next, you can selectively comment out the FPU code at various levels of priority and verify that the QK context switching works as expected with both types of exception stack frames (with and without the FPU).

Other Tests

Other interesting tests that you can perform include changing the priority of the GPIOA interrupt to be lower than the priority of SysTick to verify that the PendSV is still activated only after all interrupts complete.
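When swapping interrupt priorities for such a test, it helps to remember that a numerically lower value means a more urgent interrupt and that only the most significant bits of each 8-bit priority field are implemented. The host-side sketch below models this encoding. The function names are hypothetical, the number of implemented bits is passed as a parameter (the value 3 used in the usage note is an assumption about the target MCU), and priority grouping into sub-priorities is ignored.

```c
#include <stdint.h>

/* Encode a priority level into the 8-bit NVIC priority field.
 * Implemented bits occupy the most significant end of the byte.
 * prio_bits is the number of implemented bits (1..8). */
uint8_t nvic_prio_byte(uint32_t level, uint32_t prio_bits) {
    return (uint8_t)((level << (8U - prio_bits)) & 0xFFU);
}

/* Return nonzero when an interrupt with encoded priority prio_a can
 * preempt one with encoded priority prio_b (lower value wins).
 * Sub-priority grouping is not modeled. */
int can_preempt(uint8_t prio_a, uint8_t prio_b) {
    return prio_a < prio_b;
}
```

For example, with 3 implemented bits, level 1 encodes as 0x20 and level 2 as 0x40, so an interrupt at level 1 can preempt one at level 2 but not the other way around. On the target, the CMSIS function NVIC_SetPriority() performs this shifting for you.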

In yet another test you could post an event to a Philosopher active object rather than the Table active object from the GPIOPortA_IRQHandler() function to verify that the QK activator will not let the Philosopher thread preempt itself. Instead, the event is queued and the Philosopher thread processes it only after completing the current event processing.
