Introduction to Cortex Serial Wire Debugging
Serial Wire Debug (SWD) provides a debug port for severely pin limited packages,
often the case for small package microcontrollers but also complex ASICs
where limiting pin-count is critical and can be the controlling factor in device costs.
SWD replaces the 5-pin JTAG port with a clock + single bi-directional data pin,
providing all the normal JTAG debug and test functionality plus real-time access to system memory
without halting the processor or requiring any target resident code.
SWD uses an ARM standard bi-directional wire protocol, defined in the ARM Debug Interface v5,
to pass data to and from the debugger and the target system in a highly efficient and standard way.
As a standard interface for ARM processor-based devices,
the software developer can count on a wide choice of interoperable tools
from ARM and third party tool vendors.
Host: writes on the falling edge of SWCLK and reads on the falling edge.
Target: reads on the rising edge of SWCLK and writes on the rising edge.
SWD provides an easy and risk free migration from JTAG as the two signals,
SWDIO and SWCLK, are overlaid on the TMS and TCK pins,
allowing for bi-modal devices that provide the other JTAG signals.
These extra JTAG pins are available for other uses when in SWD mode.
SWD is compatible with all ARM processors and any processor
using JTAG for debug and provides access to debug registers in Cortex™ processors (A,R,M)
and the CoreSight debug infrastructure.
JTAG to SWD switching
The SW-DP connection sequence consists of at least 50 clock cycles with data = 1.
This sequence also serves as the line reset sequence, which requires at least 50 consecutive 1s on the data input.
SWJ-DP enables either an SWD or JTAG protocol to be used on the debug port.
To do this, it implements a watcher circuit that detects a specific 16-bit selection sequence on the SWDIOTMS pin:
The 16-bit JTAG-to-SWD select sequence is defined to be 0b0111100111100111, MSB first.
This can be represented as 16'h79E7 if transmitted MSB first or 16'hE79E if transmitted LSB first.
The host must read the IDCODE register after the line reset sequence.
A successful read confirms that correct packet frame alignment has been achieved.
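As a rough illustration of the switching sequence described above, the following Python sketch builds the bit stream a host might clock out on SWDIO: a line reset, the 16-bit select value, then another line reset. The function names, the leading line reset, and the choice of exactly 50 ones are assumptions made for illustration; a real probe may send more ones and extra idle cycles.

```python
JTAG_TO_SWD_MSB_FIRST = 0b0111100111100111  # 0x79E7 when written MSB first


def reverse_bits(value, width):
    """Return value with its lowest `width` bits in reversed order."""
    result = 0
    for i in range(width):
        if value & (1 << i):
            result |= 1 << (width - 1 - i)
    return result


def bits_lsb_first(value, width):
    """List the bits of value in wire order, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def jtag_to_swd_bitstream(reset_ones=50):
    """Line reset, select sequence, line reset (reset_ones is an assumed count)."""
    select = reverse_bits(JTAG_TO_SWD_MSB_FIRST, 16)  # 0xE79E for LSB-first sending
    return [1] * reset_ones + bits_lsb_first(select, 16) + [1] * reset_ones
```

Reversing 0x79E7 over 16 bits gives 0xE79E, which is why both values describe the same pattern on the wire.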
SWD Protocol
Each successful SWD transfer consists of 3 parts:
• A header (always from the external debugger)
• An acknowledgement from the target (provided it recognises the header)
• A data payload, the direction of which is determined by the header.
A write transfer is shown in Figure 3 below.
A start bit is used to enable the line to idle, a period when the clock can be stopped or free running
(note that the clock also does not need to have a set frequency).
One bit defines the transfer as an AP access or a DP access,
the direction of data on the SWD interface is provided,
and 2 address bits are given.
These address bits allow a sequence of AP accesses to use the 4 registers
in a bank of a specific AP without having to change the AP select register in the DP.
A parity bit and a stop bit are added to provide some tolerance to data corruption and hot plugging.
The header ends by driving the line high, where it should be held by a pull-up.
After the header, the target will respond (after a single cycle)
giving an indication of the status of the interface, and if the acknowledgement matches the OK pattern,
write data is sent with a parity bit.
A successful read transfer is similar, as shown in Figure 4.
The turn-round cycle (TRN in the diagrams) is placed after the data phase for a read,
as there is no change of direction between ACK and RDATA.
For both reads and writes, the packet is 46 clock cycles, with a payload of 32 bits.
In situations where the debugger hardware does not permit analysis and reaction to the ACK bits
(for example an ASIC vector replay tester, or a simple device on a high latency interface),
the packet timing can be fixed with these 46 cycle frames.
Improved bandwidth efficiency can be obtained in the normal mode of operation,
where the data phase is only present after an ‘OK’ acknowledge phase.
This mode uses a shorter packet, as shown in Figure 5 below,
when the debug port is not yet ready to receive a new transaction and returns a WAIT response.
This response indicates to the debugger that the debug port is still active,
and that the communication link is operating, but there is an outstanding transfer which has not completed.
This packet is 13 cycles long, and reduces the bandwidth penalty of issuing debug accesses
faster than the target is able to accept them.
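The 46-cycle and 13-cycle figures can be checked with a little arithmetic. The sketch below assumes an 8-bit request, a 3-bit acknowledge, a 32-bit payload with one parity bit, and a single-cycle turnaround at each change of direction; the constant and function names are illustrative only.

```python
REQUEST_BITS = 8    # start, APnDP, RnW, two address bits, parity, stop, park
ACK_BITS = 3        # OK, WAIT or FAULT
DATA_BITS = 32 + 1  # payload plus its parity bit
TRN = 1             # assumed one-cycle turnaround per change of direction


def write_packet_cycles():
    # request, Trn, ACK, Trn, then data driven by the host
    return REQUEST_BITS + TRN + ACK_BITS + TRN + DATA_BITS


def read_packet_cycles():
    # request, Trn, ACK, data driven by the target, then Trn
    return REQUEST_BITS + TRN + ACK_BITS + DATA_BITS + TRN


def wait_packet_cycles():
    # request, Trn, ACK, Trn; no data phase follows a WAIT response
    return REQUEST_BITS + TRN + ACK_BITS + TRN


print(write_packet_cycles(), read_packet_cycles(), wait_packet_cycles())
```

Both full packets come to 46 cycles because a read and a write each contain exactly two turnarounds; only their position differs.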
It is important to note that the protocol is optimised for performing blocks of transfers,
and both read and write data are buffered.
When a read transfer is issued on the SWD interface, the response will be the result from the previous read.
Thus to read an ASIC memory location, typically 3 transfers are necessary:
• Write to the AP’s Transfer Address Register with the target address.
• Read the AP’s Data Transfer Register to initiate the transaction.
• Read a benign register (DP status for example) to return the required target data.
Similarly, if it is necessary to determine that a write access to the system has completed,
that write has to be followed by a DP access, which can return a WAIT response if the write is still in progress.
SWD has a similar packet format which is used when a sticky bit is set.
This uses the FAULT response as shown in Figure 6 below.
The FAULT response indicates that the link is still active,
but the debug port will only respond to a read of its ID or Status registers,
or a write to its ABORT register which is used to clear the state of any sticky bits once they have been read.
Using these responses, the status of the debug interface and the status of the debug infrastructure are separated,
making it possible for a debug session to remain connected when system clocks are stopped.
If there is a fault and access via a system port becomes deadlocked,
the active AP can be instructed to terminate its transaction on the DAP bus.
In addition to allowing the AP to be interrogated and its state determined,
this frees up any remaining debug infrastructure in the target device giving the possibility of probing the deadlock scenario,
possibly giving valuable insight into the cause of the lockup.
Conclusion
The development of the Serial Wire Debug interface standard and protocol
has provided a reduced pin count alternative to JTAG,
which has the additional benefits of higher performance
and error detection that a packet-based communication protocol can bring.
Serial Wire Debug is an effective mechanism for accessing a modern bus-based debug and trace design,
the packet nature of the communication being well matched with this bus-based architecture.
3.2.1. Packet request phase
The request phase consists of 8 bits. The meaning of each bit in the request is illustrated below.
Start
-- A single start bit, with value 1.
APnDP
-- A single bit, indicating whether the Debug Port or the Access Port Access Register is to be accessed. 1 for accessing the AP.
RnW
-- A single bit: 0 for a write access, 1 for a read access.
A[2:3]
-- Two bits, giving the A[3:2] address field for the DP or AP register address, sent A[2] first.
Parity
-- A single parity bit computed over the APnDP, RnW and A[2:3] bits.
If the number of those bits set to 1 is odd, the parity bit is set to 1, so the total including the parity bit is always even.
Stop
-- A single stop bit. In the synchronous SWD protocol this is always 0.
Park
-- A single bit, with value 1.
SWD Line : Start : APnDP : RnW : A[2] : A[3] : Parity : Stop : Park
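To make the request layout concrete, here is a minimal Python sketch that packs the eight request bits into a byte whose bit 0 is the start bit, so that sending the byte LSB first reproduces the line order above. The function name `build_request` and its parameters are illustrative assumptions, not part of any particular tool.

```python
def build_request(apndp, rnw, addr):
    """Return the 8-bit request; bit 0 (start) is transmitted first.

    apndp: 1 for an AP access, 0 for a DP access
    rnw:   1 for a read, 0 for a write
    addr:  register address; only bits 3 and 2 are used
    """
    a2 = (addr >> 2) & 1
    a3 = (addr >> 3) & 1
    # Parity is 1 when an odd number of the four covered bits are set
    parity = (apndp + rnw + a2 + a3) & 1
    return (1               # start bit
            | apndp << 1
            | rnw << 2
            | a2 << 3
            | a3 << 4
            | parity << 5
            | 0 << 6        # stop bit
            | 1 << 7)       # park bit


print(hex(build_request(0, 1, 0x0)))  # DP read of address 0
```

With this packing, a DP read of address 0 comes out as 0xA5 and a DP write of address 8 as 0xB1.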
3.2.2. Acknowledge response phase
The ACK phase is a three-bit (LSB-first) target-to-host response. There are three types of ACK response.
OK response
-- It indicates successful operation, value is b001. : SWD Line : 1 0 0
WAIT response
-- The host must retry the operation later, value is b010. : SWD Line : 0 1 0
FAULT response
-- If the target responds with FAULT, an error has occurred and one of the sticky bits in CTRL/STAT is set.
The host can check the sticky error bits to see what kind of error has occurred.
It must clear the sticky bits through the ABORT register before using any AP commands,
because the target will always respond with FAULT as long as any of the sticky error bits is set.
Value is b100. : SWD Line : 0 0 1
3.2.3. Data transfer phase
It contains 32 data bits and 1 parity bit.
The parity check is made over the 32 data bits.
If the number of data bits set to 1 is odd, the parity bit is set to 1.
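The parity rule above is easy to express in code. The following Python sketch computes the parity bit for a 32-bit word and lists the 33 bits of a data phase in wire order; the function names are assumptions for illustration.

```python
def data_parity(word):
    """Return 1 if the 32-bit word has an odd number of 1 bits, else 0."""
    return bin(word & 0xFFFFFFFF).count("1") & 1


def data_phase_bits(word):
    """Return the 32 data bits, LSB first, followed by the parity bit."""
    bits = [(word >> i) & 1 for i in range(32)]
    return bits + [data_parity(word)]
```

A receiver can run the same calculation on the 32 bits it sampled and compare the result with the 33rd bit to detect a single-bit error.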
3.2.4. Turnaround period
In the figure above, Trn marks the turnaround period between phases.
Every time the SWDIO changes data direction, a one-cycle turnaround period is inserted which both sides should ignore.
This means there is always a turnaround period between the request and acknowledge.
On a write request, there is a turnaround period between acknowledge and the data phase.
On a read request, there is a turnaround period after the data phase.
LPC ARM Cortex-based microcontrollers from NXP can be controlled via SWD,
ARM’s Serial Wire Debug protocol, and the CoreSight register set, which together allow non-intrusive debugging.
This blog post series summarizes use of the SWD protocol to do basic debugging functions on the Cortex-M0
and presents demonstration code and a simple hardware design to implement a USB-to-SWD bridge.
With additional software the SWD-to-USB bridge could allow debugging
or flash programming from a computer or other USB host device.
Serial Wire Debug (SWD) Fundamentals.
The SWD interface has two signal wires, SWCLK and SWDIO.
SWCLK or Serial Wire Clock is driven by the master and synchronizes the SWD data transfer.
SWDIO or Serial Wire Data Input/Output is driven by either the master or the target.
Every SWD transaction starts with the master driving SWDIO
but switches to having the target drive SWDIO at some point.
I/O levels and thresholds on SWCLK and SWDIO should be set to match the I/O voltage of the target.
In the simplistic hardware design given in this document,
we assume that the target voltage is 3.3V and no provisions are made to adjust to lower target voltages.
SWD Levels and Edges.
Since SWD is a bi-directional communications standard with only one data line,
data needs to be written to the bus by both the master and the slave.
Every SWD transaction begins with the master controlling SWDIO.
After a turnaround period, the slave controls SWDIO,
then control switches back to the master.
Whenever the bus is idle, the clock should be high and the master controls the bus.
When the master is in control of SWDIO, it clocks data out on the falling edge of SWCLK
and the slave latches the data on the rising edge of SWCLK.
When the slave takes over SWDIO, it presents data to SWDIO on the rising edge of SWCLK
and the master latches the incoming data on the falling edge of SWCLK.
In both cases, SWCLK is driven by the master.
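The edge rules above can be sketched as a pair of bit-banging helpers. This is only an illustration under stated assumptions: `set_swclk`, `set_swdio` and `get_swdio` are hypothetical callbacks standing in for whatever GPIO access the host hardware provides, and no timing delays are modelled.

```python
def host_write_bits(bits, set_swclk, set_swdio):
    """Host drives SWDIO: change data on the falling edge of SWCLK."""
    for bit in bits:
        set_swclk(0)    # falling edge: host updates SWDIO
        set_swdio(bit)
        set_swclk(1)    # rising edge: target latches SWDIO


def host_read_bits(count, set_swclk, get_swdio):
    """Target drives SWDIO: host samples on the falling edge of SWCLK."""
    bits = []
    for _ in range(count):
        set_swclk(0)    # falling edge: host latches what the target presented
        bits.append(get_swdio())
        set_swclk(1)    # rising edge: target presents the next bit
    return bits
```

Passing the pin accessors in as callbacks keeps the edge ordering testable without real hardware.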
The 1/2 and 1 1/2 period 1 values on SWDIO are called “turnaround periods” in the SWD specification.
They need to be driven by the host to prevent the bus from being dragged low by coupling of transitions on SWCLK.
If the SWD target detects that SWDIO is not 1 during the turnaround period,
it will go into an error state and not respond to any further transactions.
SWD Protocol.
The SWD protocol consists of read and write transactions.
Both types of transactions have three phases.
These phases are the request phase, the acknowledge phase, and the data phase.
All of the data sent over SWD is sent in little-endian order and is sent least-significant-bit first.
During the request phase the host requests a read or write operation on the target
and indicates whether the Debug Port or Access Port is being accessed.
The Debug Port is a set of four important registers within the Cortex debug port that perform very basic operations.
The Access Port is a larger address space of 64 registers which are useful for accessing the microcontroller’s main bus.
The request phase consists of eight bits of data, some of which are constant.
The acknowledge phase consists of three bits where the SWD target indicates status to the host.
The data phase is where data is sent from the host to the target during a write operation.
Data is sent from the target to the master during a read operation.
The data phase consists of 33 bits: 32 bits of data (LSB first) followed by 1 bit of parity.
Request phase data
bit number | bit value | bit description
0 (lsb) | always 1 | start bit
1 | 1 for Access Port | SWD Port select: selects Access Port or Debug Port
2 | 1 for Read | read select: selects whether this is a read or a write transaction
3 | address bit 2 | bit 2 of read or write address (bits 0 and 1 are 0)
4 | address bit 3 | bit 3 of read or write address (bits 0 and 1 are 0)
5 | parity | 1 if sum of request phase bits 1-4 is odd
6 | always 0 | stop bit
7 (msb) | always 1 | park bit
Acknowledge phase data
bit number | bit value | bit description
0-2 | 1 | acknowledge, no error
0-2 | 2 | wait, target not ready
0-2 | 4 | fault or error
0-2 | 7 | typically means the target is disconnected
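A host typically turns the three sampled acknowledge bits back into one of the values in the table above. The sketch below assumes the bits are collected in the order they arrive (LSB first); the names `NO_TARGET` and `PROTOCOL_ERROR` are labels invented here for the "all ones" case and for any other unexpected value.

```python
ACK_NAMES = {1: "OK", 2: "WAIT", 4: "FAULT", 7: "NO_TARGET"}


def decode_ack(wire_bits):
    """Decode three acknowledge bits given in arrival order (LSB first)."""
    value = wire_bits[0] | wire_bits[1] << 1 | wire_bits[2] << 2
    return ACK_NAMES.get(value, "PROTOCOL_ERROR")
```

Reading 7 is consistent with an undriven line held high by the pull-up, which is why it usually points to a missing or unpowered target.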
SWD Read and Write Details.
The SWD port supports reading and writing to the Debug Port and the Access Port.
The Cortex Debug Port consists of 4 32-bit registers.
When executing an SWD transaction, a Debug Port transaction is specified by leaving bit 1 (port select)
set to zero during the request phase of the SWD transaction.
The address is specified by bits 3 and 4.
Read is specified by setting bit 2 as 1, writes by clearing bit 2 to 0.
Writes do not actually complete until after the following SWD command
or until receiving a “flush” which is simply eight 0s clocked out onto SWDIO.
The Cortex Access Port address space has 64 32-bit registers.
Because only two bits of address are specified in the SWD request phase,
four more bits of address need to be set by using the AP Select register at address 8 on the Debug Port.
To execute a transaction on the Access Port, typically the Debug Port AP Select register
must be written first before the Access Port is written.
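The split between the bank bits held in the AP Select register and the two address bits carried in the request can be sketched as follows. The helper name `split_ap_address` is an assumption; the field positions (bank in bits [7:4], AP number in bits [31:24]) follow the SELECT register layout listed later in this document.

```python
def split_ap_address(reg_addr, apsel=0):
    """Split an AP register address into (SELECT value, request address).

    reg_addr: AP register address, for example 0xFC
    apsel:    which AP to talk to (0 for a single-AP device)
    """
    bank = (reg_addr >> 4) & 0xF          # upper four address bits
    select = (apsel << 24) | (bank << 4)  # value for the DP AP Select register
    return select, reg_addr & 0xC         # address bits 3 and 2 go in the request
```

For AP address 0xFC this yields a SELECT value of 0xF0 and a request address of 0xC.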
When executing a read on the Access Port, the result returned will be that of the previous read.
Therefore when executing a read operation, a 2nd dummy read must be done to get the results.
If executing a sequence of reads, only one dummy read is needed to get the first result.
There is no need for a dummy read between every read in a sequence.
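The one-transfer delay on Access Port reads can be illustrated with a toy model. `PostedReadPort` below is not a real debug API; it is a hypothetical stand-in in which every AP read returns the result of the previous one, and a final read of the DP RDBUFF register collects the last value.

```python
class PostedReadPort:
    """Toy model: each AP read returns the result of the previous AP read."""

    def __init__(self, memory):
        self.memory = memory
        self.pending = 0  # stale value left over before the first read

    def ap_read(self, address):
        previous, self.pending = self.pending, self.memory[address]
        return previous

    def dp_read_rdbuff(self):
        return self.pending


def read_sequence(port, addresses):
    """Read several locations using one extra transfer in total."""
    results = []
    port.ap_read(addresses[0])                 # result discarded: stale data
    for address in addresses[1:]:
        results.append(port.ap_read(address))  # returns the previous location's data
    results.append(port.dp_read_rdbuff())      # final value, no new AP access
    return results


port = PostedReadPort({0x0: 11, 0x4: 22, 0x8: 33})
print(read_sequence(port, [0x0, 0x4, 0x8]))
```

Reading N locations therefore costs N + 1 transfers rather than 2N.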
Please note that both the Debug Port and the Access Port are NOT part of the Cortex CPU core’s general address space.
These ports and their registers only exist inside the SWD debug port.
For details on these registers and their uses please see the ARM Debug Interface v5 Architecture Specification documentation and errata.
Initializing SWD.
The SWD protocol allows full control of an LPC microcontroller.
Because of this, it is critical that the port be insensitive to noise under a wide range of design conditions.
To make the SWD port insensitive to noise, an unlock or connection sequence must be executed before the port can be used.
The unlock sequence consists of several different steps.
SWD Unlock Sequence Steps
step number | description
1 | The host needs to switch the target from JTAG to SWD mode by clocking 0xE79E onto SWCLK/SWDIO
2 | SWD connection sequence: clock out more than 50 binary 1s
3 | Must read the Debug Port IDCODE register (address 0)
4 | Turn on the Debug Port by setting bits 28 and 30 at DP address 4
5 | Write AP select (Debug Port address 8) to 0xF0 (to prep for AP read of 0xFC)
6 | Unlock the Access Port by reading the AP ID register (AP address 0xFC)
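The unlock steps can be written down as data so that the constants are easy to check. In this sketch the operation names (`send_bits_lsb_first`, `send_ones`, `dp_read`, `dp_write`, `ap_read`) are hypothetical labels for whatever primitives a given probe offers, and 51 is simply the smallest count satisfying "more than 50".

```python
DP_IDCODE, DP_CTRL_STAT, DP_SELECT = 0x0, 0x4, 0x8
AP_IDR = 0xFC


def unlock_steps():
    """Return the unlock sequence as (operation, arguments...) tuples."""
    power_up = (1 << 28) | (1 << 30)  # bits 28 and 30 of DP address 4
    return [
        ("send_bits_lsb_first", 0xE79E, 16),  # JTAG-to-SWD switch
        ("send_ones", 51),                    # connection sequence
        ("dp_read", DP_IDCODE),
        ("dp_write", DP_CTRL_STAT, power_up),
        ("dp_write", DP_SELECT, 0xF0),        # select the bank holding 0xFC
        ("ap_read", AP_IDR & 0xC),            # request carries address bits 3:2
    ]


for step in unlock_steps():
    print(step)
```

Setting bits 28 and 30 together corresponds to writing 0x50000000 to DP address 4.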
Programming internal SRAM over SWD
Debug Port (DP) registers:
Address | Read | Write
0x00 | IDCODE | ABORT
0x04 | CTRL/STAT | CTRL/STAT
0x08 | RESEND | SELECT
0x0C | RDBUFF | N/A
DP ABORT register bits:
Bits | Function | Description
[31:5] | - | Reserved
[4] | ORUNERRCLR | Write 1 to this bit to clear the STICKYORUN overrun error flag to 0.
[3] | WDERRCLR | Write 1 to this bit to clear the WDATAERR write data error flag to 0.
[2] | STKERRCLR | Write 1 to this bit to clear the STICKYERR sticky error flag to 0.
[1] | STKCMPCLR | Write 1 to this bit to clear the STICKYCMP sticky compare flag to 0.
[0] | DAPABORT | Write 1 to this bit to generate a DAP abort. This aborts the current AP transaction. Do this only if the debugger has received WAIT responses over an extended period.
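The ABORT bits above combine into a small number of useful write values. The following Python sketch names each bit after the table and builds the value for clearing every sticky flag; `abort_value` is an illustrative helper, not an existing API.

```python
ORUNERRCLR = 1 << 4
WDERRCLR = 1 << 3
STKERRCLR = 1 << 2
STKCMPCLR = 1 << 1
DAPABORT = 1 << 0

CLEAR_ALL_STICKY = ORUNERRCLR | WDERRCLR | STKERRCLR | STKCMPCLR


def abort_value(clear_sticky=True, dap_abort=False):
    """Build a value for the DP ABORT register (DP write to address 0x00)."""
    value = CLEAR_ALL_STICKY if clear_sticky else 0
    if dap_abort:
        value |= DAPABORT
    return value


print(hex(abort_value()))
```

Writing 0x1E clears all four sticky flags without aborting the current AP transaction.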
DP SELECT register bits:
Bits | Function | Description
[31:24] | APSEL | Selects the current AP.
[7:4] | APBANKSEL | Selects the active four-word register bank on the current AP.
Memory Access Port (MEM-AP) registers:
Address | Bank | Function | Description
0x00 | 0x00 | CSW | Control/Status Word Register
0x04 | 0x00 | TAR | Transfer Address Register
0x0C | 0x00 | DRW | Data Read/Write Register
0xFC | 0x0F | IDR | Identification Register
MEM-AP CSW register bits:
Bits | Function | Description
[30:24] | Prot | Bus access protection control. This field enables the debugger to specify protection flags for a debug access.
[5:4] | AddrInc | Address auto-increment and packing mode.
[2:0] | Size | b000: 8 bits; b001: 16 bits; b010: 32 bits
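The CSW fields above can be unpacked from a register value with a few shifts and masks. The sketch below is illustrative only; `decode_csw` is an assumed helper name, and the two example values are the CSW settings that appear in the memory and core-register access code in this document.

```python
SIZE_BITS = {0b000: 8, 0b001: 16, 0b010: 32}


def decode_csw(csw):
    """Split a CSW value into the fields listed in the table above."""
    return {
        "prot": (csw >> 24) & 0x7F,      # bits [30:24]
        "addr_inc": (csw >> 4) & 0x3,    # bits [5:4]
        "size_bits": SIZE_BITS.get(csw & 0x7),  # bits [2:0]
    }


print(decode_csw(0x23000012))
print(decode_csw(0x23000002))
```

Both values select 32-bit transfers; they differ only in the AddrInc field, which is non-zero for the block transfer case and zero for single accesses.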
Core debug and system control registers:
Address | Name | Description
0xE000EDF0 | DHCSR | Debug Halting Control and Status Register
0xE000EDF4 | DCRSR | Debug Core Register Selector Register
0xE000EDF8 | DCRDR | Debug Core Register Data Register
0xE000EDFC | DEMCR | Debug Exception and Monitor Control Register
0xE000ED0C | AIRCR | Application Interrupt and Reset Control Register
DHCSR bits:
Bits | Name | Function
[31:16] | DBGKEY | Debug key: A debugger must write 0xA05F to this field to enable write accesses to bits [15:0], otherwise the processor ignores the write access.
[16] | S_REGRDY | A handshake flag for transfers through the DCRDR: 0 = There has been a write to the DCRDR, but the transfer is not complete; 1 = The transfer to or from the DCRDR is complete.
[0] | C_DEBUGEN | Halting debug enable bit: 1 = Enabled
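Because DBGKEY occupies the top half of the word on writes while S_REGRDY is a status bit seen on reads, it helps to keep the two directions separate in code. This is a small sketch with assumed helper names; only the bits listed in the table above are covered.

```python
DBGKEY = 0xA05F
C_DEBUGEN = 1 << 0
S_REGRDY = 1 << 16


def dhcsr_write_value(control_bits):
    """Combine the debug key with the control bits in [15:0]."""
    return (DBGKEY << 16) | (control_bits & 0xFFFF)


def register_transfer_done(dhcsr_read_value):
    """Check the S_REGRDY handshake flag in a value read from DHCSR."""
    return bool(dhcsr_read_value & S_REGRDY)


print(hex(dhcsr_write_value(C_DEBUGEN)))
```

Writing the key with all control bits clear gives 0xA05F0000, which turns halting debug off again.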
DCRSR bits:
Bits | Name | Function
[16] | REGWnR | 0 = read; 1 = write
[6:0] | REGSEL | Specifies the ARM core register, special-purpose register, or Floating-point extension register, to transfer: R0-R12, SP, LR, DebugReturnAddr, xPSR, MSP, PSP, etc.
DEMCR bits:
Bits | Name | Function
[10] | VC_HARDERR | Enable halting debug trap on a HardFault exception.
[9] | VC_INTERR | Enable halting debug trap on a fault occurring during exception entry or exception return.
[8] | VC_BUSERR | Enable halting debug trap on a BusFault exception.
[7] | VC_STATERR | Enable halting debug trap on a UsageFault exception caused by a state information error, for example an Undefined Instruction exception.
[6] | VC_CHKERR | Enable halting debug trap on a UsageFault exception caused by a checking error, for example an alignment check error.
[5] | VC_NOCPERR | Enable halting debug trap on a UsageFault caused by an access to a Coprocessor.
[4] | VC_MMERR | Enable halting debug trap on a MemManage exception.
[0] | VC_CORERESET | Enable Reset Vector Catch. This causes a Local reset to halt a running system.
AIRCR bits:
Bits | Name | Function
[31:16] | VECTKEY | Vector Key. Register writes must write 0x05FA to this field, otherwise the write is ignored. On reads, returns 0xFA05.
[15] | ENDIANNESS | 0 = Little endian; 1 = Big endian
[10:8] | PRIGROUP | Priority grouping
[2] | SYSRESETREQ | Writing 1 to this bit asserts a signal to the external system to request a Local reset.
[1] | VECTCLRACTIVE | Writing 1 to this bit clears all active state information for fixed and configurable exceptions.
[0] | VECTRESET | Writing 1 to this bit causes a local system reset.
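As with DHCSR, a write to AIRCR only takes effect when the key is present in the upper half-word. The sketch below builds the write values for the two reset requests in the table; `aircr_write_value` is an assumed helper name.

```python
VECTKEY = 0x05FA
SYSRESETREQ = 1 << 2
VECTRESET = 1 << 0


def aircr_write_value(request_bits):
    """Combine the vector key with the request bits in [15:0]."""
    return (VECTKEY << 16) | (request_bits & 0xFFFF)


print(hex(aircr_write_value(SYSRESETREQ)))
```

A debugger would write the resulting word to address 0xE000ED0C through the memory access port.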
def swd_write_mem(uda, address, data_ws, length):
    # Auto increment addresses
    uda.QueueWrite(DP_SELECT, MEMAP_BANK_0)
    uda.QueueWrite(MEMAP_CSW, 0x23000012)
    uda.QueueWrite(MEMAP_TAR, address)
    for x in range(0, length):
        uda.QueueWrite(MEMAP_DRW, data_ws[x])
    uda.StartTransfers()

def swd_read_mem(uda, address, length):
    data_ws = []
    # Auto increment addresses
    uda.QueueWrite(DP_SELECT, MEMAP_BANK_0)
    uda.QueueWrite(MEMAP_CSW, 0x23000012)
    uda.QueueWrite(MEMAP_TAR, address)
    for x in range(0, length):
        uda.QueueRead(MEMAP_DRW)
        data_ws.append(uda.StartTransfers()[0])
    return data_ws
7.3.2 Access core registers
def swd_write_core_register(uda, n, val):
    uda.QueueWrite(DP_SELECT, MEMAP_BANK_0)
    uda.QueueWrite(MEMAP_CSW, 0x23000002)
    # Place the value in DCRDR first
    uda.QueueWrite(MEMAP_TAR, DCRDR)
    uda.QueueWrite(MEMAP_DRW, val)
    uda.StartTransfers()
    # Then select register n with REGWnR (bit 16) set to request a write
    uda.QueueWrite(MEMAP_TAR, DCRSR)
    uda.QueueWrite(MEMAP_DRW, n | (1 << 16))
    uda.StartTransfers()

def swd_read_core_register(uda, n):
    uda.QueueWrite(DP_SELECT, MEMAP_BANK_0)
    uda.QueueWrite(MEMAP_CSW, 0x23000002)
    # Select register n with REGWnR clear to request a read
    uda.QueueWrite(MEMAP_TAR, DCRSR)
    uda.QueueWrite(MEMAP_DRW, n)
    uda.StartTransfers()
    # The value is then read back from DCRDR
    uda.QueueWrite(MEMAP_TAR, DCRDR)
    uda.QueueRead(MEMAP_DRW)
    val = uda.StartTransfers()[0]
    return val
7.3.3 Programming firmware into internal SRAM
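import struct

with open("sim3u1xx_USBHID_ram.bin", mode='rb') as f:
    data = f.read()
# Pad to a whole number of words, then unpack as little-endian 32-bit words
data += b'\x00' * (-len(data) % 4)
words = struct.unpack('<%dI' % (len(data) // 4), data)
swd_write_mem(uda, 0x20000000, words, len(words))
. . .
# Point the vector table offset register at the image in SRAM
swd_write_mem(uda, 0xe000ed08, [0x20000000], 1)
# PC (R15) takes the reset vector with bit 0 cleared; SP (R13) takes the initial stack pointer
swd_write_core_register(uda, 15, words[1] & 0xFFFFFFFE)
swd_write_core_register(uda, 13, words[0])
# write_AHB is assumed to be defined elsewhere in the original code
write_AHB(uda, DHCSR, 0xA05F0000)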