Thursday 16 April 2015

To CC or not to CC

Earlier this year TI announced a new generation of wireless MCUs code-named CC26xx. This is a family of CM3-powered SoCs with varying 2.4GHz RF capabilities:
  • CC2630: IEEE 802.15.4
  • CC2640: BLE
  • CC2650: BLE and IEEE 802.15.4
If you read around the web, you may also get glimpses of a CC2610 and a CC2620.

The CC2650 is basically a superset of the CC2630 and the CC2640.

On the same day as the hardware announcement, TI also released a port of the Contiki OS for two CC2650-based boards.

This port supports all the cool stuff one can do with Contiki, but it also provides a first glimpse of BLE, which is not officially supported by the OS.

But... What about the CC2538?


One might think that this is a little too early, less than 2 years from the release of the CC2538. But are the two chips really competitors? Is the CC26xx meant to replace the CC2538? What are the key differences? These are the questions I'm pondering in this post, and I shall try to answer them by taking a deeper look at some of the features of the CC26xx and putting them side-by-side with its predecessor.

Oh and by the way, since we live in a capitalist, {disclaimer, indemnity}-fueled world, I must say that: "Naturally, this is just an overview, so if you want more details you should read the respective specs. Especially so if you are trying to choose between those two parts for a product."

Moving swiftly on...

Power management


A quick look at the two datasheets [1, 2] immediately reveals the ultra low-power nature of the CC26xx. Let's have a quick look side-by-side:


CC26xx test condition | CC26xx Typ | CC2538 Typ | CC2538 Max | CC2538 test condition
Standby. With RTC, CPU, RAM and (partial) register retention. RCOSC_LF | 1μA | 1.3μA | 2μA | Power mode 2. Digital regulator off. 16-MHz RCOSC and 32-MHz crystal oscillator off. 32.768-kHz XOSC, POR, and sleep timer active. RAM and register retention
Active. Core running CoreMark (peripherals inactive) | 1.45 mA + 31 μA/MHz ( = 1.946mA at a theoretical 16MHz) | 7mA | | Digital regulator on. 16-MHz RCOSC running. No radio, crystals, or peripherals active. CPU running at 16-MHz with flash access

Now those test conditions are obviously not identical, but they can still help us move forward. We see a difference under low-power operation, and a big difference while running. In fact, even with the CC26xx configured to fire on all cylinders (48MHz), its active consumption (2.938mA) will be lower than the CC2538's 7mA.
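The arithmetic behind those CC26xx figures is easy to reproduce. Below is a minimal Python sketch that uses only the datasheet formula quoted in the table (1.45 mA + 31 μA/MHz). The function name is mine, not something from TI's documentation.

```python
def cc26xx_active_current_ma(freq_mhz):
    """Typical CC26xx active current in mA, per the datasheet formula above."""
    base_ma = 1.45       # frequency-independent part
    per_mhz_ma = 0.031   # 31 uA/MHz, expressed in mA
    return base_ma + per_mhz_ma * freq_mhz


# The two operating points discussed in the text.
for mhz in (16, 48):
    print(f"{mhz} MHz: {cc26xx_active_current_ma(mhz):.3f} mA")
```

The 16MHz figure is the "theoretical" one from the table, and the 48MHz figure is the one compared against the CC2538's 7mA.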

I have always taken those datasheet values with a pinch of salt, since it's often difficult to understand the exact test conditions in terms of what was running and what was not.

What happens under the hood?

Power management, old style


Up to and including the CC2538, TI's chips traditionally had preset "power modes" (I like calling them profiles). Those were built into the chip. For example, in the table above, we see that the datasheet mentions "Power Mode 2". In a nutshell, to enter a low power mode, one would roughly follow the steps below:
  1. Shut down / configure board peripherals (LEDs, sensors, etc)
  2. Select one of the pre-defined power modes by writing some hardware register
  3. Configure the chip for this power mode (e.g. wakeup sources)
  4. Enter this power mode
Let me emphasise this: The CC2538's PM2 is built into the chip itself, and so are the remaining power profiles (active, sleep, PM[0..3]).

Now this is very very simple to write software for, but it does have a problem: Imagine that you want to enter low-power operation and that you need some functionality X while operating in this low-power mode. You have a quick look at the device datasheet and you see that PM2 turns off feature X. Feature X is only available in PM1, which will result in a higher consumption. However, you do in fact really need feature X, so you are stuck with PM1. You are also stuck with all the other things that run (and draw current) while in PM1, even those that you couldn't care less about.
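The "stuck with PM1" problem can be sketched in a few lines of Python. The feature sets below are entirely hypothetical (they are not the CC2538's real power mode definitions, and X, Y, Z are placeholders); the point is only to show that with fixed profiles you get whatever else the profile keeps alive.

```python
# Hypothetical profiles, shallowest first and deepest last.
# These are NOT the CC2538's actual PM contents.
POWER_MODES = [
    ("PM0", {"X", "Y", "Z"}),
    ("PM1", {"X", "Y"}),
    ("PM2", {"Y"}),
]


def deepest_mode_for(required):
    """Return (mode name, unwanted extras) for the deepest mode that keeps `required` alive."""
    for name, features in reversed(POWER_MODES):
        if required <= features:
            # Everything in the profile beyond what we asked for still draws current.
            return name, features - required
    raise ValueError("no power mode provides: %s" % sorted(required))


mode, extras = deepest_mode_for({"X"})
print(mode, sorted(extras))
```

Asking for feature X lands us in PM1, with Y running whether we care about it or not.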

Power management, CC26xx-style


The CC26xx re-writes the textbook: It does not have pre-defined power modes. In the table above, we see the term "Standby", but this is not built into the chip, it is just a software-defined profile in TI-RTOS. The reality with the chip itself is a lot more flexible, which also makes it a lot more complex to write software for. Let's have an introductory look.

First things first, in typical CM3 fashion, at any given point in time the micro on the CC26xx can be in one of three states:
  1. Active / Running
  2. Sleeping
  3. Deep Sleeping
The CC26xx digital power system is partitioned:
  • The CC26xx has two separate Voltage Domains (VDs): MCU and AON (Always-On). The MCU VD can be turned off when we want to fully shut the chip down, but it will otherwise be on. AON is... well... AON.
  • Each VD is further partitioned into Power Domains (PDs). For example, the MCU VD has the following PDs:
    • MCU AON: This is in fact AON and cannot be turned off by software, except by turning off the entire MCU VD.
    • CPU
    • SYSBUS
    • VIMS
    • RFCORE
    • PERIPH
    • SERIAL
  • Each PD contains digital modules.
    • VIMS: Flash, cache and ROM
    • PERIPH: Supplies the DMA controller, Crypto engine, the TRNG, GPTs [3:0], GPIO, SSI1 and the I2S module
    • SERIAL: Supplies the UART, SSI0, I2C
  • Some of the modules have retention which can be enabled/disabled by software
  • Additionally, each of the peripherals / modules has a clock gate and of course the module itself can also be enabled / disabled by software, as usual.
  • RAM is partitioned in 4 blocks, and each block can have its retention enabled or disabled
This drawing tries to capture the VD / PD / Module / Clock Gate logical structure:

    The SERIAL and PERIPH PDs can be turned on/off by software on demand.  For the CPU domain, we can request it to be off, but this will only happen when the CM3 drops to deep sleep. We can also request VIMS / SYSBUS on or off, but their behaviour depends on some additional factors and, generally speaking, we cannot explicitly force them on or off. For instance, SYSBUS will only actually turn off if neither the micro nor the RF are requesting access to it.

    While active, the chip can be supplied by a DC-DC or a Global LDO. When we drop to deep sleep, we can request a switch to a micro-LDO (uLDO), to reduce leakage.

    This system is very flexible and we can do various shenanigans. For example, we can do the following:
    • Configure the SSI0 clock to run while the CM3 is active, as well as during sleep and deep sleep.
    • Keep the SERIAL PD on (it's where SSI0 sits).
    • Disable all clocks under sleep and deep sleep for the remaining modules within SERIAL (e.g. for the UART).
    • Turn off PERIPH and RFCORE (this will turn off the RF as well as all modules within the PERIPH PD).
    • Request SYSBUS, VIMS and CPU PDs off. RFCORE is off, so SYSBUS will turn off when the MCU drops to deep sleep.
    • Request retention to be disabled for a part of our RAM (we have made sure there's nothing we need in there).
    • Disable VIMS and RFCORE retention.
    • Turn off the AUX PD within the AON VD (The AUX being a secondary, 16-bit sensor controller).
    • Drop to deep sleep.
    • Wait for SSI0 trigger to wake-up.
    This is merely an example. If we choose to do so, we can even leave RFCORE running and turn off everything else. We can configure pretty much any combination we like (within some loose constraints of course).
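The difference between requesting a PD off and the PD actually turning off can be captured in a toy model. This is a deliberately simplified Python sketch of the rules described above, not TI's or Contiki's API. I am assuming that "the micro is not requesting SYSBUS" can be approximated by "the CM3 is in deep sleep" and that "the RF is not requesting SYSBUS" can be approximated by "RFCORE is off". VIMS is left out because its behaviour depends on factors not covered here.

```python
def effective_domains_off(requested_off, cm3_deep_sleep):
    """Return the set of PDs that actually power down in this simplified model.

    requested_off: set of PD names that software asked to turn off.
    cm3_deep_sleep: True once the CM3 has dropped to deep sleep.
    """
    off = set()
    # SERIAL, PERIPH and RFCORE follow software requests directly.
    for pd in ("SERIAL", "PERIPH", "RFCORE"):
        if pd in requested_off:
            off.add(pd)
    # The CPU PD only turns off once the CM3 drops to deep sleep.
    if "CPU" in requested_off and cm3_deep_sleep:
        off.add("CPU")
    # SYSBUS only turns off when neither the micro nor the RF needs it
    # (approximated here; see the assumptions in the lead-in).
    if "SYSBUS" in requested_off and cm3_deep_sleep and "RFCORE" in off:
        off.add("SYSBUS")
    return off


requested = {"PERIPH", "RFCORE", "SYSBUS", "CPU"}
print(sorted(effective_domains_off(requested, cm3_deep_sleep=False)))
print(sorted(effective_domains_off(requested, cm3_deep_sleep=True)))
```

While the CM3 is awake only PERIPH and RFCORE go down; the CPU and SYSBUS requests take effect once we drop to deep sleep. If RFCORE is left running, SYSBUS stays up in this model.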

    Unlike TI-RTOS, Contiki does not attempt to pre-define power profiles. Instead, it provides a set of helper functions that control the power down/up sequence and aim to help developers configure the state of the chip under low-power operation. In my humble opinion, CC26xx power mode support is the most mature power-related system among those present in the official Contiki source code repository.

    The CC26xx's power management features are described in greater detail in the CC26xx TRM [3].

    I love the CC2538, but - despite the complexity - the CC26xx wins.

    One RF to rule them all


    It only makes sense to compare the CC2538 with the CC2630 and the CC2650. The key difference here is that the CC2650 radio can operate in IEEE 802.15.4 as well as in BLE mode. Therefore, if for whatever reason you need to support both a 6LoWPAN / ZigBee and a BLE stack, you can do so without having to spin a dual-RF board. The CC2650 can do both with a single radio, albeit not simultaneously. The switching can be done on-demand and it is controlled by software.

    Controlling the Radio


    The CC2538 adopts a standard approach whereby the RF is controlled exclusively by hardware registers (roughly 70 are documented in the user guide). The only thing that's perhaps worthy of a comment is that, even though the MCU is a 32-bit one, the registers are 8 bits wide, with bits 31..8 being reserved. Makes me think that this radio has been used in a product powered by an 8-bit micro in the past...

    The CC26xx adopts an entirely different approach. Reading the TRM, one can count fewer than 15 RF-related hardware registers, with another few power-related ones.

    The RF core is powered by its own CM0 (also called the CPE), which communicates with the CM3 using a shared memory interface. The two cores can raise interrupts to one another. Software running on the CM3 passes commands to the CPE through a single register, called CMDR. The module responsible for the communication between the CM3 and the CPE is called "Radio Doorbell" (RFC_DBELL).

    Each command from the CM3 to the CPE can be one of three types:
    • Direct command
    • Immediate command
    • Radio Operation command (Radio Op)
    Direct commands are very straightforward: The CM3 writes the command and its parameters directly to CMDR. The 16 high bits [31..16] represent the command ID, the following 14 bits [15..2] represent optional parameters, whereas the 2 least significant bits must be 01. An example of a direct command is CMD_ABORT, which will signal the CPE to abort any current ongoing operations immediately.
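To make the bit layout concrete, here is a minimal sketch of packing a direct command into a 32-bit CMDR word, following the layout described above. It is written in Python purely for illustration (on the chip this would be a single register write from C). The function name is mine, and the CMD_ABORT ID of 0x0401 is what I recall from TI's headers, so treat it as an assumption and check it against the TRM.

```python
CMD_ABORT = 0x0401  # assumed command ID; verify against the TRM / TI headers


def encode_direct_command(cmd_id, params=0):
    """Pack a direct command: ID in bits [31..16], params in [15..2], LSBs = 01."""
    if not 0 <= cmd_id <= 0xFFFF:
        raise ValueError("command ID must fit in 16 bits")
    if not 0 <= params <= 0x3FFF:
        raise ValueError("parameters must fit in 14 bits")
    return (cmd_id << 16) | (params << 2) | 0b01


word = encode_direct_command(CMD_ABORT)
print(f"0x{word:08X}")
```

The range checks are there because anything spilling out of the 14-bit parameter field would corrupt either the command ID or the two type bits.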

    Immediate commands and Radio Ops are more complicated. The command itself is represented by a data structure stored in the CM3's RAM, the chip's flash or the radio's RAM. The value written to CMDR is merely a pointer to this data structure. The CPE can tell that a command is an Immediate one or a Radio Op by inspecting the 2 least significant bits of CMDR, which for those commands must be clear. Therefore, the data structure representing those commands must be aligned on a 4-byte boundary.
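The way the two least significant bits distinguish the command types, and why that forces 4-byte alignment, can be sketched as follows. This is an illustrative Python model of the rule described above, not the CPE's actual firmware logic. The function names and the example address are made up, and the text does not say what the other two bit patterns mean, so the sketch simply reports them as undefined.

```python
def classify_cmdr(value):
    """Classify a 32-bit CMDR value by its two least significant bits."""
    low_bits = value & 0b11
    if low_bits == 0b01:
        return "direct"
    if low_bits == 0b00:
        # The whole word is a pointer to an Immediate command or Radio Op.
        return "pointer"
    return "undefined"  # patterns 10 and 11 are not covered in the text


def is_valid_command_address(addr):
    """A command structure must sit on a 4-byte boundary so its 2 LSBs are clear."""
    return addr % 4 == 0


example_addr = 0x20001000  # hypothetical, 4-byte aligned address
print(classify_cmdr(example_addr), is_valid_command_address(example_addr))
print(classify_cmdr(0x04010001))
```

A structure placed at a misaligned address would be misread, since its low address bits would no longer be clear.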

    Immediate commands change or inspect the status of the radio and can be issued at any time, although some of them are only relevant when the radio is actually doing something. An example of an Immediate command is CMD_IEEE_CCA_REQ, which requests CCA and RSSI information.

    Radio Ops will access the actual radio hardware, to perform operations such as enter RX mode or TX a frame.

    Direct commands are basically Immediate commands that either don't need parameters at all, or that need parameters which can be accommodated within the space available in CMDR.

    A second hardware register, called CMDSTA, is used so that the CPE can communicate to the CM3 the status of the most recent command written to CMDR. The 8 least significant bits of CMDSTA represent the command's result, while the 24 most significant bits may contain command-specific signalling.
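Splitting a CMDSTA value into its two fields is a matter of masking and shifting. A minimal Python sketch of the layout described above (the function name and the example value are made up, and I am deliberately not assigning any meaning to specific result codes; those are in the TRM):

```python
def decode_cmdsta(value):
    """Split CMDSTA into (result, command-specific signalling)."""
    result = value & 0xFF                # 8 least significant bits
    signalling = (value >> 8) & 0xFFFFFF  # 24 most significant bits
    return result, signalling


result, signalling = decode_cmdsta(0x00001201)  # made-up register value
print(f"result=0x{result:02X} signalling=0x{signalling:06X}")
```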

    Specifically for Radio Ops, the data structure representing the command has a status field, which is used by the CPE to inform about command execution status and is used in addition to the signalling via CMDSTA. Normally, CMDSTA will be updated immediately upon command reception (e.g. the command was successfully submitted for execution, or an error occurred), while the command's status field will keep getting updated over time as the command is being executed (e.g. "not started yet", "running", "done / complete" or "done / aborted").

    It certainly took me a while to get my head around this new way of working. From the CM3 developer's perspective, the CPE is a bit of a black box. You send commands to it and it does things. It's easy to tell what you think you asked the CPE to do. It's not always easy to tell what the CPE is actually doing. The API is very thorough in terms of sending commands to the CPE, but I find it lacking in terms of requesting status information. CMDSTA and the status field of Immediate commands and Radio Ops provide some information, but this information only relates to the most recent command sent (CMDSTA) or the respective Immediate command or Radio Op (command's status field).

    Thus, if you send an "Enter IEEE RX" command, you can then inspect the status field and see if the command is currently running. But if you do not send a command, then you have no way of knowing what the CPE is up to. For example, if you configure it to send ACKs for frames that pass frame filtering, it will every now and then interrupt RX mode, switch to TX, send the ACK and go back to RX. This happens without any commands sent from the CM3, so there is no status field to inspect. The API does not provide an "Are you currently in TX?" command, nor a "Have you started receiving a frame?" or "What is your current state?". Some (but not all) of those questions can be answered by inspecting interrupt flags, but the ability to directly enquire about those and other similar status updates would be nice to have in the API.

    Hardware crypto


    TI seem to have taken a step backwards on this front. The CC26xx only has an AES crypto engine (that can only do 128-bit AES). Nothing to write home about; this feature has been around since the CC2420 / CC2430 days.

    Conversely, the CC2538 provides hardware acceleration for AES-128 as well as 256, but it also provides acceleration for SHA2 and, optionally, PKA for ECC-128/256 and RSA.

    As IoT applications are gaining traction, hardware manufacturers, product developers, regulators, end-users and, generally speaking, all stakeholders are becoming more interested in security. The devices we are discussing here are very constrained and they cannot run the latest and greatest security-related technological advances. The RERUM EC-funded R&D project aims to tackle some of the security- and privacy-related challenges for Smart City applications. During the course of RERUM, Zolertia (one of the twelve consortium partners) developed the Re-Mote, a new hardware platform, which was designed around the requirements of the use cases considered by the project. Zolertia, in collaboration with RERUM consortium partners, chose the CC2538 SoC to power the Re-Mote, partly due to its cryptography-related capability.

    I am sure TI had their reasons for taking this decision, and time will tell whether they were right or not.

    For the time being, the CC2538 wins. Hands down.

    USB


    The CC2430 was what got me involved with Contiki in the first place. The CC2530 and multicast support were the main reasons why I was invited to join the Contiki team of maintainers. Naturally, I feel emotionally attached to the CC2530, especially so to the almighty CC2531 USB dongle!

    The CC2530s are powered by 8051-based, 8-bit micros. I still think that the CC2530 is a great chip, but it does have two problems:
    • Harvard-based architecture: Very very simply speaking, this means our toolchain is not GCC-based. We use SDCC to build images, which is not a bad thing at all in itself, but SDCC does have a much smaller community. This means it doesn't evolve as quickly as GCC-based toolchains and the code optimisations it can do are nowhere near as sophisticated.
    • As I have said in various places, at various times (on occasion kicking and screaming), the 8051's stack is limited to 256 bytes, when building with SDCC at least. This is a big big problem. This renders the chip plain and simple unsuitable for Contiki and 6LoWPAN-networked applications. It would be possible to develop all those things in an 8051-friendly way, but when writing software for this chip one should implement some things in a different way than the way one would use for Von Neumann-based machines. Contiki is standards-compliant and cross-platform, therefore it is impossible to adopt 8051-friendly software design decisions for platform-independent code.
    Back on topic: Despite all of the above, the CC2531 does have this amazing feature: A USB controller. But for the limitations listed above, I would consider the USB dongle to be the perfect solution for 6LoWPAN border routers and dummy radios (SLIP radios using Contiki nomenclature). With Contiki, this functionality has traditionally been achieved by UART-based solutions, and the interface between the border router and its host has always been a bottleneck.

    The CC2538 also has native USB support, but it doesn't suffer from any of the CC2530's limitations. It is high time we replaced the CC2531 dongle with something newer, and I am hoping that someone will soon produce a CC2538-powered USB stick!

    In fact, I heard it through the grapevine that one is around the corner... It has a smaller form factor than the CC2531; I believe that the curved part at the right-hand side has cut lines and can be taken off. The button on the side seems like a great idea and it also has an epic LED that can turn pink!

    Reproduced with Zolertia's express permission

    The CC26xx does not feature a USB controller. This is probably not a huge deal for wireless applications. Depending on what one is trying to achieve, one can see the CC2538's USB as a useful feature, or one can see it as an unnecessary burden.

    Other noteworthy differences


    In the previous sections, I focused on the features that I consider to be key differentiating factors between the two chips. However, the CC2538 and the CC26xx have additional subtle (or not quite as subtle) differences. I am going to try to summarise them in the table below.

    Feature | CC26xx | CC2538
    Key features
    Power management | Advanced. Flexible, but complex to harness | Standard. Not very flexible, but simple to implement
    RF | CC2630: 2.4GHz IEEE 802.15.4; CC2640: BLE; CC2650: 2.4GHz IEEE 802.15.4 and BLE. Very few hardware registers. Most operations use a new API | 2.4GHz IEEE 802.15.4. Traditional approach based on H/W registers
    Crypto | 128-bit AES | 128- and 256-bit AES; SHA2. Optionally: ECC 128/256; RSA
    USB Controller | Nay | Yay
    OS Support | Both very-well supported by Contiki
    Features not discussed extensively
    MCU Speed | Up to 48MHz | Up to 32MHz
    RAM | 20KB, all capable of retention (+8KB VIMS Cache +2KB on the AUX) | 16 or 32KB (16KB with retention in all PMs)
    Flash | 128 KB | 128, 256 or 512 KB
    UARTs | 1 | 2
    SSIs | 2 | 2
    I2Cs | 1 | 1
    AUX | 16-bit sensor controller with 2KB SRAM | Nay
    RNG | TRNG [1] | PRNG
    I2S | Yay | Nay
    Can iron your shirts | Nay | Nay

    References

    [1] "CC2650 SimpleLink™ Multistandard Wireless MCU", SWRS158, February 2015
    [2] "CC2538 Powerful Wireless Microcontroller System-On-Chip for 2.4-GHz IEEE 802.15.4, 6LoWPAN, and ZigBee® Applications", SWRS096D, December 2012 - Revised April 2015
    [3] "CC26xx SimpleLink™ Wireless MCU Technical Reference Manual", SWCU117A, February 2015 - Revised March 2015