4
Introduction
It is important to get an idea of the events occurring in an embedded wireless node
– when it is deployed in the field
– away from the convenience of an interactive debugger
Such visibility can be useful for
– post-deployment testing
– replay-based debugging
– performance and energy profiling of various software components

7
AVEKSHA
In this paper, AVEKSHA is proposed
– a hardware-software approach for achieving the above goals
– in a non-intrusive manner
Based on the key insight that most embedded processors have an on-chip debug module (OCDM)
– which has traditionally been used for interactive debugging
– which provides significant visibility into the internal state of the processor

8
Debug Board
A debug board is designed
– interfaces with the on-chip debug module of an embedded node’s processor
– through the JTAG port
Provides three modes of event logging and tracing
– Breakpoint
– Watchpoint
– Program counter polling

9
Triggers
Using expressive triggers that the on-chip debug module supports, AVEKSHA can watch for, and record, a variety of programmable events of interest

10
Key Features of AVEKSHA
The target processor does not have to be stopped during event logging
– in the last two of the three modes
– subject to a limit on the rate at which logged events occur
AVEKSHA also performs power monitoring of the embedded wireless node
– enables power consumption data to be correlated to events of interest
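To illustrate what correlating power data to events can look like, here is a minimal sketch that integrates current samples between two event timestamps. The sample format, the function name `energy_between_events`, and the supply voltage are assumptions made for illustration; they are not taken from AVEKSHA itself.

```python
def energy_between_events(samples, t_start, t_end, supply_voltage):
    """Estimate energy (joules) drawn between two logged events.

    samples: list of (time_s, current_a) tuples sorted by time (hypothetical format).
    Uses trapezoidal integration over the samples inside [t_start, t_end].
    """
    window = [(t, i) for t, i in samples if t_start <= t <= t_end]
    charge = 0.0  # coulombs
    for (t0, i0), (t1, i1) in zip(window, window[1:]):
        charge += 0.5 * (i0 + i1) * (t1 - t0)
    return supply_voltage * charge


# Illustrative values only: a constant 1 mA draw sampled once per second.
samples = [(0.0, 0.001), (1.0, 0.001), (2.0, 0.001), (3.0, 0.001)]
energy = energy_between_events(samples, t_start=1.0, t_end=3.0, supply_voltage=3.0)
```

The two timestamps would come from the event trace, for example the entry and exit of a task, so that the energy figure is attributed to that execution region.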

11
Key Features of AVEKSHA
AVEKSHA is an operating system-agnostic solution
Functionality and performance have been demonstrated using three applications running on Telos motes
– Two in TinyOS
– One in Contiki

12
Key Features of AVEKSHA
Can trace tasks and other generic events at function-level and task-level granularity
AVEKSHA can be used to find a subtle bug in the TinyOS low power listening protocol

25
Breakpoint Mode
Any of the 8 concurrent triggers available can be set as a breakpoint
When a breakpoint is reached, the application processor halts execution
AVEKSHA performs a continuous poll of the CPU state of the application processor

26
Breakpoint Mode
When AVEKSHA sees that the CPU is halted
– it retrieves the state of the application processor (e.g., the value of the PC)
– it then sends the JTAG command to resume execution
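The poll, retrieve, resume cycle described above can be sketched against a simulated target. `SimulatedTarget` and its methods are hypothetical stand-ins for the JTAG operations, not AVEKSHA's actual interface; the sketch only shows the control flow.

```python
class SimulatedTarget:
    """Hypothetical stand-in for an application processor reached over JTAG."""

    def __init__(self, breakpoints, program):
        self.breakpoints = set(breakpoints)
        self.program = list(program)  # PC values the target will execute, in order
        self.index = 0
        self.halted = False

    def step(self):
        """Advance one instruction; halt instead if the next PC is a breakpoint."""
        if self.halted or self.finished():
            return
        if self.program[self.index] in self.breakpoints:
            self.halted = True
        else:
            self.index += 1

    def finished(self):
        return self.index >= len(self.program)

    def is_halted(self):
        return self.halted

    def read_pc(self):
        return self.program[self.index]

    def resume(self):
        self.halted = False
        self.index += 1


def trace_breakpoints(target):
    """Poll the CPU state; on each halt, log the PC and resume execution."""
    log = []
    while not target.finished():
        target.step()               # the target runs on its own in real hardware
        if target.is_halted():      # continuous poll of the CPU state
            log.append(target.read_pc())
            target.resume()         # corresponds to the JTAG resume command
    return log


target = SimulatedTarget(breakpoints=[0x4002],
                         program=[0x4000, 0x4002, 0x4004, 0x4002])
hits = trace_breakpoints(target)
```

Because the target stays halted until the debugger notices and resumes it, this mode perturbs timing, which is why the other two modes avoid stopping the processor.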

28
Watchpoint Mode
Any of the 8 triggers can be set as watchpoints rather than breakpoints
The JTAG interface has an 8-entry circular buffer
– where the memory address bus (MAB) and memory data bus (MDB) values are stored
– when a watchpoint is hit
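As a rough model of the 8-entry circular buffer, the sketch below stores (MAB, MDB) pairs and overwrites the oldest entry when full. The class name, the overwrite-oldest policy, and the `drain` operation are assumptions for illustration; the real on-chip buffer may behave differently in detail.

```python
class WatchpointBuffer:
    """Software model of a fixed-size circular buffer of (MAB, MDB) pairs."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.entries = [None] * capacity
        self.head = 0    # index of the next slot to write
        self.count = 0   # number of valid entries

    def record(self, mab, mdb):
        """Store one watchpoint hit, overwriting the oldest entry when full."""
        self.entries[self.head] = (mab, mdb)
        self.head = (self.head + 1) % self.capacity
        self.count = min(self.count + 1, self.capacity)

    def drain(self):
        """Return valid entries oldest-first and mark the buffer empty."""
        start = (self.head - self.count) % self.capacity
        out = [self.entries[(start + k) % self.capacity] for k in range(self.count)]
        self.count = 0
        return out


buf = WatchpointBuffer()
for n in range(10):            # ten hits into an 8-entry buffer
    buf.record(mab=n, mdb=2 * n)
recent = buf.drain()           # only the eight most recent hits survive
```

In this model the debug board has to drain the buffer before more than eight new hits arrive, or the oldest ones are lost. A constraint of this kind is one way to picture why the watchpoint mode has an upper limit on the sustained event rate.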

29
PC Polling Mode
The final approach to trace generation
– is to forgo using triggers entirely
– instead, poll the program counter of the application processor
With this approach, each polled PC value is used to determine
– what section of the code the application processor is executing in
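Mapping a polled PC value to a code section can be done with a sorted table of function start addresses. The addresses and function names below are made up for illustration; in practice they would come from the application's symbol table.

```python
import bisect

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS = [
    (0x4000, "main"),
    (0x4100, "send_packet"),
    (0x4200, "radio_isr"),
]
START_ADDRS = [addr for addr, _ in SYMBOLS]


def function_for_pc(pc):
    """Return the function whose start address is the closest one at or below pc."""
    idx = bisect.bisect_right(START_ADDRS, pc) - 1
    return SYMBOLS[idx][1] if idx >= 0 else None


section = function_for_pc(0x4104)
```

This sketch has no end address for the last function, so any PC above the final start address is attributed to it. A real table would also carry function sizes.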

30
Advantage
The advantage of PC polling is that
– it is about 19 times faster than the watchpoint mode
Hence it can keep pace with a higher frequency of events
– such as every function transition

31
Challenge
Implementing PC polling presents a practical challenge
– synchronizing the TDB and the application mote
This is because the TDB is reading the PC values while the mote is executing
To achieve this synchronization
– we replace the mote’s external crystal oscillator with a wire from the FPGA

35
Microbenchmarks
Objective
– to evaluate the performance of the building blocks of AVEKSHA
How many clock cycles it takes to perform event monitoring for each of the three modes
– breakpoint, watchpoint, and PC polling

36
Microbenchmarks
The accuracy of the energy monitoring
– by comparing it with measurements obtained using a Fluke multimeter and a dedicated power monitor from Monsoon Inc.
The energy consumption of the TDB itself

39
Power Consumption of TDB
It is important that the TDB itself consume a small amount of power
Because the typical usage scenario is
– the TDB coupled to the application processor board when deployed in the field

40
Energy Saving
A large source of energy saving on the application processor comes from entering a low-power sleep state when not in use
With a small modification of the application processor’s code
– to signal when it is in sleep mode
– it is also possible for the TDB to enter a low-power sleep state

41
Application Setup
Our experiments use both TinyOS and Contiki applications
– without needing any extra programming effort
– since AVEKSHA is OS-agnostic by design
Two TinyOS applications
– TestNetworkLpl and TestFtsp
An object tracking application in Contiki

42
Watchpoints
One application for the TDB is to monitor various state variables
Monitoring state variables and transitions is useful because
– they can be correlated to power consumption
– they can aid in understanding the behavior of applications

43
Watchpoints
In TestNetworkLpl, we have instrumented the following to monitor state changes
– application layer
– low-power-listening layer
– radio layer
The instrumentation simply places a nop instruction, which can be used as a trigger in the watchpoint mode

44
Watchpoint trace of states when sending a message in TestNetworkLpl

47
Watchpoint trace of application-level functions and threads of a sender node in the Contiki tracking application

48
PC Polling
Sampling the program counter is a quick and non-intrusive operation
It does not have the flexibility of setting watchpoint triggers for specific conditions
However, it has the advantage of being able to measure events
– with greater timing accuracy than watchpoint polling

49
PC Polling
This is because
– it is faster to take a sample of the PC
– it does not have to do buffer management
A useful application of PC polling is for statistical profiling of an application
– say to determine what parts of the code are most active
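Statistical profiling from PC samples amounts to counting how often each function appears among the samples. The sketch below assumes a hypothetical symbol table and sample list; the function name `profile` is also illustrative.

```python
import bisect
from collections import Counter


def profile(pc_samples, symbols):
    """Return the fraction of PC samples that fall in each function.

    symbols: list of (start_address, name) sorted by address (hypothetical input).
    """
    starts = [addr for addr, _ in symbols]
    counts = Counter()
    for pc in pc_samples:
        idx = bisect.bisect_right(starts, pc) - 1
        counts[symbols[idx][1] if idx >= 0 else "<unknown>"] += 1
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


symbols = [(0x4000, "main"), (0x4100, "send_packet")]
shares = profile([0x4000, 0x4002, 0x4100, 0x4004], symbols)
```

With periodic sampling, each fraction approximates the share of execution time spent in that function, and the estimate improves as more samples are collected.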

51
Trace of the functions invoked in one execution of the local2Global function

52
Overhead of a Simple Software Profiler
Software profiling is used to collect and arrange different statistics about function calls in a program
– such as the time spent in each function, how many times a function was called, etc.

56
Processors
For the purpose of creating a flexible prototype
– more powerful processors are used than are required for the current implementation
The MCU firmware uses 8.7 kB of program ROM and 5 kB of RAM
– which are 18% and 49% of the MSP430’s total capacity, respectively

57
FPGA
The FPGA firmware uses the following resources of the Actel FPGA
– 35% of the core logic
– 2.8 kB or 62% of the RAM

58
Where to store the logged events
The target mode of operation
– the TDB coupled with the application processor board in field deployments
There is the consideration of where to store the logged events
For debugging, it is sometimes sufficient to have a history of the last few events before some condition in the application was reached

59
Where to store the logged events
For example, the OCDM of the MSP430 provides a history of the last 8 watchpoint events with this type of debugging in mind
If this is the case, the TDB’s main processor can maintain a circular buffer of events in RAM

60
Where to store the logged events
If a larger history is required, it is possible to store events into the TDB MCU’s flash memory
For very long term storage
– an external USB storage host could be attached to the TDB
– a compressed trace can even be sent to a computing platform over wireless communication

62
Conclusions
In this paper, AVEKSHA has been presented
A hardware-software solution to the problem of tracing events at runtime
– in an embedded wireless node
– without slowing down the application
AVEKSHA can trace a variety of events
– particular PC addresses
– reads from peripherals
– entry and exits from tasks and interrupt service routines
– arbitrary user-defined events

63
Conclusions
We have shown through two applications in TinyOS and one in Contiki
– that such tracing is useful for profiling the execution times of different tasks and event handlers
While the watchpoint mode of operation is capable of capturing a practically limitless variety of events
– it cannot keep pace with events that occur more frequently than once every 122 clock cycles
– on a sustained basis

64
Conclusions
The PC polling mode of operation is restricted in the kinds of events that it can detect
– but being faster, it can keep pace with events that occur no more often than once every 7 clock cycles
AVEKSHA has the ability to do energy monitoring over the μA to mA range
– coupling it to execution regions between two events of interest
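The two sustained-rate limits quoted in these conclusions (122 clock cycles for the watchpoint mode, 7 for PC polling) can be turned into a quick feasibility check. The function names and the 4 MHz clock in the usage lines are assumptions chosen only to make the arithmetic concrete.

```python
WATCHPOINT_MIN_CYCLES = 122   # minimum sustained event spacing, watchpoint mode
PC_POLLING_MIN_CYCLES = 7     # minimum sustained event spacing, PC polling mode


def modes_that_keep_pace(event_interval_cycles):
    """List the non-halting modes able to sustain one event per given interval."""
    modes = []
    if event_interval_cycles >= PC_POLLING_MIN_CYCLES:
        modes.append("pc_polling")
    if event_interval_cycles >= WATCHPOINT_MIN_CYCLES:
        modes.append("watchpoint")
    return modes


def max_event_rate_hz(clock_hz, min_cycles):
    """Upper bound on sustained events per second for a given clock frequency."""
    return clock_hz / min_cycles


# Hypothetical example: events every 50 cycles on an assumed 4 MHz clock.
usable = modes_that_keep_pace(50)
watchpoint_limit = max_event_rate_hz(4_000_000, WATCHPOINT_MIN_CYCLES)
```

An event stream with one event every 50 cycles is within reach of PC polling but not of the watchpoint mode, which is the trade-off between expressiveness and speed described above.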

65
Conclusions
This work points to the feasibility of tracing a wide variety of events of interest in a low-cost and non-intrusive manner
– while the embedded node is deployed in the field
– without having to rely on expensive and custom-built hardware to achieve this
The events provided by AVEKSHA can be used by a variety of existing and yet-to-be-developed solutions
– such as replay-based debugging, performance profiling, and energy monitoring