Bit Fields in Python

binary data

<Shameless Plug>

We use a home-grown tool at Promenade called Parlay that allows us to write Python scripts to talk to embedded devices, script behaviors, unit test hardware, and simulate hardware we don’t have yet. It’s a super easy-to-use solution, written in Python and JavaScript, that can get non-embedded programmers (like me) up and running on hardware fast, and allows non-programmers (like scientists or testers) to interact and script behaviors without having to take any programming courses. It’s dual licensed GPL and Commercial, so feel free to clone the GPL version on GitHub.

</Shameless Plug>

So that means we’re interfacing Python code with packed binary protocols over RS-232, CAN, GPIB, etc. all the time.

Python’s struct library makes interfacing with binary protocols a snap.  For instance, if there is a struct like this in C (and it’s packed on a little endian machine before being sent over a serial line)

then we can read it from a binary buffer in python like this

and we can write to a binary buffer like this

struct.pack and struct.unpack make communicating over a binary protocol a snap. They are some of my favorite examples of how Python makes tedious tasks simple, easy, and readable.
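As a rough illustration of the round trip described above, here is a minimal sketch. The three-field layout (msg_id, length, value) and the name PACKET_FORMAT are made up for this example and are not taken from any real protocol.

```python
import struct

# Hypothetical packed little-endian layout, standing in for a C struct such as:
#   struct { uint8_t msg_id; uint16_t length; float value; }
# "<" means little endian with no padding; B = uint8, H = uint16, f = float32.
PACKET_FORMAT = "<BHf"

# Write: pack Python values into a 7-byte buffer.
buffer = struct.pack(PACKET_FORMAT, 7, 512, 1.5)

# Read: unpack the same buffer back into Python values.
msg_id, length, value = struct.unpack(PACKET_FORMAT, buffer)
```

The format string is the only place the wire layout is described, so a change to the C struct should only require a change to that one line.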

What about Bitfields?

struct.pack doesn’t work well with a C struct that uses bitfields though. For example, the following C struct only takes up 32 bits (size of unsigned int). 4 bits for x, 3 bits for y, 5 bits for z and 20 bits for c.

The struct library will let us get a 32 bit int out of the buffer, but it doesn’t give us the ability to break apart arbitrary bits. So we’re stuck with an ugly solution like this

Yuck! That’s verbose, prone to math errors, and a pain to maintain if there are any changes to the struct. Seriously, even this toy example took multiple attempts and a pad of paper to get right. You have to mentally keep track of endianness and convert the bit masks from binary by hand. There is way too much chance for human error here.
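For reference, the manual shift-and-mask approach might look something like the sketch below. It assumes the first declared field (x) sits in the least significant bits, which is typical for GCC on a little endian target but is not guaranteed by the C standard. The function names are made up for this example.

```python
import struct

def unpack_bitfields(buffer):
    """Split a 4-byte little-endian word into the x, y, z and c fields by hand."""
    (word,) = struct.unpack("<I", buffer)
    x = word & 0xF                # lowest 4 bits
    y = (word >> 4) & 0x7         # next 3 bits
    z = (word >> 7) & 0x1F        # next 5 bits
    c = (word >> 12) & 0xFFFFF    # top 20 bits
    return x, y, z, c

def pack_bitfields(x, y, z, c):
    """Inverse of unpack_bitfields: mask each field and shift it into place."""
    word = (x & 0xF) | ((y & 0x7) << 4) | ((z & 0x1F) << 7) | ((c & 0xFFFFF) << 12)
    return struct.pack("<I", word)
```

Every shift amount and mask here has to be recalculated by hand whenever a field changes width, which is exactly the maintenance problem.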

CTypes to the Rescue

Time to bring in my good friend ctypes.  CTypes is used when you want to interface Python with a C library. It’s typically used when trying to leverage  C code that is already written for legacy or performance reasons.

It turns out that CTypes can be used to make our life a lot easier, even when talking over a remote protocol like Serial. Check out this example:


Looks an awful lot like the struct we defined in our C code, doesn’t it? That’s because that’s exactly what it is!

PacketBits inherits from  ctypes.LittleEndianStructure, which means it will be packed into a little endian structure. Each field has 3 arguments (name, ctypes type, bit-length) just like in a C struct.

The class Packet is a union between the bit field struct, and a simple 32 bit int, so we can easily pack the full structure to and from struct.pack and struct.unpack for transport.

For example
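Here is a minimal sketch of how the pieces could fit together, assuming a little endian host. The union member names bits and as_int and the field values are picked for illustration only.

```python
import ctypes
import struct

class PacketBits(ctypes.LittleEndianStructure):
    # (name, ctypes type, bit length), mirroring the C bit field declaration
    _fields_ = [
        ("x", ctypes.c_uint32, 4),
        ("y", ctypes.c_uint32, 3),
        ("z", ctypes.c_uint32, 5),
        ("c", ctypes.c_uint32, 20),
    ]

class Packet(ctypes.Union):
    # Both members share the same 4 bytes of storage.
    _fields_ = [
        ("bits", PacketBits),
        ("as_int", ctypes.c_uint32),
    ]

# Sending: set fields by name, then pack the whole word for transport.
packet = Packet()
packet.bits.x = 9
packet.bits.y = 5
packet.bits.z = 21
packet.bits.c = 0xABCDE
wire_bytes = struct.pack("<I", packet.as_int)

# Receiving: unpack the word, then read the fields by name.
received = Packet()
(received.as_int,) = struct.unpack("<I", wire_bytes)
```

After this, received.bits.x, received.bits.y, and so on hold the decoded values, with no shifts or masks in sight.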


That’s all there is to it. ctypes.Structure is easy to use, easy to maintain, and best of all makes the code look Pythonic.

Intro to Twisted and Event Driven Programming – a simple chat server

This is a companion to a simple tutorial I gave at the OCPython Meetup in Irvine on October 6th, 2015.

What is Event-Driven programming?

Twisted is an asynchronous event-driven communication library that implements the Reactor Pattern. What this means is that Twisted takes all of the asynchronous events coming from things like TCP/UDP streams, serial lines, etc., and turns them into discrete ‘events’. The reactor then ‘reacts’ to these events one by one by calling your code.

For example, instead of writing code like this:

you would set up an event listener  like this:

This inversion of control makes it simple and straightforward to write certain types of servers. Also, since your code only ever deals with one event at a time, you don’t have to worry about threads or the race conditions that come with them.
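To make the idea concrete, here is a toy reactor built only on the standard library. This is not Twisted and not how Twisted is implemented. TinyReactor, on_readable and run_once are names invented for this sketch, which only shows the shape of "register a handler, let the loop call you".

```python
import selectors
import socket

class TinyReactor:
    """Toy reactor: waits for readable sockets and calls the registered handler."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()

    def on_readable(self, sock, handler):
        # Remember which handler to call when this socket has data.
        self._selector.register(sock, selectors.EVENT_READ, handler)

    def run_once(self, timeout=1.0):
        # React to every ready event, one at a time, by calling user code.
        for key, _mask in self._selector.select(timeout):
            handler = key.data
            handler(key.fileobj)

received = []

def data_received(sock):
    # User code never blocks waiting; it only runs when data is ready.
    received.append(sock.recv(1024))

left, right = socket.socketpair()
reactor = TinyReactor()
reactor.on_readable(right, data_received)
left.sendall(b"hello\n")
reactor.run_once()
left.close()
right.close()
```

A real reactor runs that loop forever and handles writes, timers and errors too, but the inversion of control is the same.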

Things event-driven programming is good at:

  • I/O bound applications (Where you spend most of your time ‘waiting’ for other things)
  • Applications with heavy resource sharing
  • High throughput but low processing overhead applications (i.e. get in and get out fast)

Things event-driven programming is bad at (and all the ways twisted tries to mitigate these negatives):

  • Interfacing with blocking code (Twisted has easy mapping between deferred and threads with deferToThread() and blockingCallFromThread())
  • Long procedures a.k.a. ‘Callback Hell’ (Twisted’s @inlineCallbacks uses generators to make async code that looks and smells like synchronous procedural code.)
  • Processor intensive applications. Since the reactor can only ‘react’ to one event at a time, any cpu-hogging code effectively blocks the entire server. (Twisted’s deferToThread() can offload CPU intensive functions to another thread, but it has its limitations)
  • Parallel code. The reactor can only ‘react’ to one event at a time, and this cannot be parallelized. (Twisted offers no mitigation here. If your code is embarrassingly parallel, you’re better off going with threads)


Now let’s talk about Twisted.

The main objects in Twisted are the Reactor, Transports, Protocols, Factories, and Deferreds.

  • The Reactor is the heart of Twisted. It listens on ports, manages events, and pulls the whole system together. There should only ever be one in your application, so treat it like a singleton.
  • Transports  are the physical wire (or effective equivalent abstraction) that carry bytes. E.g.: TCP, UDP, Serial, Modbus, etc.  Protocols use transports to send and receive events (e.g.: messages/packets/data). This de-coupling is important because it means that the same protocol code can work with any transport it is bound with.
  •  Protocols  are application specific objects that represent a single connection. In the below chat-server examples, each client connection will have an associated Protocol object to manage that specific connection.
  • Factories create new protocols dynamically. Factories are attached to Transports through the Reactor, and when a new connection is established, the Factory creates the appropriate protocol object to manage that connection in its buildProtocol() method.
  • Deferreds are a way to schedule asynchronous callbacks without blocking. They are like JavaScript’s Promises, or Scala’s Futures. I won’t go into them in this post, but they are such an integral part of Twisted that they had to be mentioned.

See Twisted’s excellent getting started guide for more details.

Chat Server

Here is the first version of the chat server. We create a custom protocol that inherits from LineReceiver. LineReceiver is a helper Protocol class that takes a stream of bytes from a transport, and calls lineReceived() each time it hits a line terminator. (\r\n by default, but this can be customized to any byte sequence by setting the delimiter attribute)
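If the buffering sounds mysterious, this toy class shows the general idea in plain Python. It is a simplified stand-in written for this post, not Twisted’s actual implementation. The TinyLineBuffer name and its delimiter value are choices made for the sketch.

```python
class TinyLineBuffer:
    """Toy stand-in for the buffering a line-based protocol performs."""

    delimiter = b"\r\n"  # configurable, like a line terminator setting

    def __init__(self, line_received):
        self._buffer = b""
        self._line_received = line_received

    def data_received(self, data):
        # Bytes arrive in arbitrary chunks, so accumulate them and
        # hand off one complete line at a time.
        self._buffer += data
        while self.delimiter in self._buffer:
            line, self._buffer = self._buffer.split(self.delimiter, 1)
            self._line_received(line)

lines = []
line_buffer = TinyLineBuffer(lines.append)
line_buffer.data_received(b"hel")
line_buffer.data_received(b"lo\r\nwor")
line_buffer.data_received(b"ld\r\n")
```

Note that the callback fires once per complete line no matter how the bytes were chopped up on the wire.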

The ChatServerFactory is very simple. When a new protocol is requested by the reactor, it constructs it and returns it.

The second to last line instructs the reactor to listen on port 1234 for TCP connections, and to ask a ChatServerFactory for protocols when it gets a new connection. The last line actually runs the reactor. The reactor takes control of the main thread, and this function call will not return until the reactor is stopped.

Test it out! Run the script and open up Telnet (‘telnet localhost 1234’ on the command line) or launch PuTTY and connect to localhost:1234 .

Ok so that version isn’t really a chat server. It just prints the lines that were sent to it. Let’s add a feature so we know which of our connected clients typed each line.

How can we get the info of the connected client? Ask the transport of course! Each protocol gets a reference to its transport layer in the self.transport member variable.

Alright, so the server is now printing each line it gets sent, but we want each connected client to get the messages, not just have it printed on the console by the server.  LineReceiver has a helper method called sendLine() that sends a string down its associated transport. All we need is a reference to the protocol we want to send the line to.

Since the Factory constructs each protocol as it gets new connections, the Factory can keep track of them in a list, and can pass a reference to itself to each protocol’s constructor.

Hey! This is starting to actually look like a chat server now. Let’s add some more features like a MOTD, user enter/leave notifications and an easy way for clients to exit by typing /exit.

All protocols have connectionMade() and connectionLost() methods that get called when connections are established and destroyed. I refactored out the logic to send a line to all clients because it was called in multiple places.

Qt5 Signals and slots over a network

In a recent project, we wanted to be able to call Qt5 signals/slots from a network connection, for unit testing and for communicating with other parts of the system that are written in other languages. After spending way too much time in duck typed dynamic languages like Python and JavaScript, I naively thought this would be pretty easy given the introspection abilities of C++11 and Qt’s MOC. I was wrong. Below is a solution that involved about 48 straight hours of tinkering and coding over a weekend, all while saying “This should be so easy. I’m an idiot, and must be missing something obvious.”


Before I jump in to the code, I want to point out that I did *not* use C++11 variadic templates for this, even though this problem totally screams “USE VARIADIC TEMPLATES”. That was because after a full weekend with maybe 4 hours of sleep total, just manually brute forcing a solution for up to 9 arguments was sufficient for my needs. I’ll leave it as an exercise for the reader to take this and make it work with a variadic template solution. If you do, be sure to let me know.

The Use Case:

We have JSON messages coming in over a QWebSocket with information on which signal/slot to call like this: '{"obj": "obj_id", "name": "slot_name", "args": {"foo": 1, "bar": "hello world"}}'. I want to call the Qt signal named ‘slot_name’ on the object with the unique id obj_id, and with the values foo=1, bar="hello world". (Basically I want to do a call like: obj_list[obj_id]->slot_name(1, "hello world"); ) This is great for integration testing, causing weird faults, testing pieces of code that are only triggered by signals, and just in general integrating our Qt application with the rest of our system that communicates over WebSockets with JSON.
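For comparison, here is roughly what that behavior looks like in a dynamic language, where it really is easy. This Python sketch only illustrates the goal and is not part of the Qt solution. The Motor class and the dispatch() function are hypothetical names.

```python
import json

class Motor:
    """Hypothetical object standing in for a QObject with a slot."""

    def __init__(self):
        self.calls = []

    def slot_name(self, foo, bar):
        self.calls.append((foo, bar))

# Objects addressable by their unique id.
obj_list = {"obj_id": Motor()}

def dispatch(message, objects):
    """Look up the object and method named in the message and call it."""
    request = json.loads(message)
    target = objects[request["obj"]]
    method = getattr(target, request["name"])
    return method(**request["args"])

dispatch(
    '{"obj": "obj_id", "name": "slot_name", "args": {"foo": 1, "bar": "hello world"}}',
    obj_list,
)
```

Everything that getattr() and ** do here at runtime is what has to be reconstructed by hand in C++.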

The straightforward part:

We can already figure out type information for the arguments (e.g. foo is an int, bar is a QString) from Qt’s MetaObject / MetaMethod system and return an error code to the remote caller if the argument value can’t be coerced into the appropriate type. The actual coercing can be done with QVariant’s value() template function. Tweet me if you want a write-up of how I did that, but it’s pretty straightforward with a look at the Qt documentation.

Alright, so now I have this QJsonArray of values, I know their types, and I can coerce them at will using a combination of QMetaMethod’s type list and QVariant’s value(). How do I actually call the signal/slot? The problem is that any type coercion for a function call has to be known at compile time.


The harder part:

This solution will take a  list of QVariants (QVariantList) of unknown types, coerce them into the correct types for the method arguments, and then call the methods with the values. The only caveat is that this method will not work with reference arguments (e.g. int &foo).

Alright, so there are two parts to this: macro magic, and template magic.


Macro magic:

All this macro does is define a method named __COERCE_func_name, construct a Coercer with the signature of CLASS::func_name, and then call the templated coercer object’s coerce method with a pointer to the appropriate function and the argument variant list.


Template magic:

This is where the coercion actually happens. I’ve only included the 0- and 1-argument templates for brevity, but I also manually copied and pasted 2-9 argument versions. The hard part was figuring out this syntax:

This takes a method pointer from the class Obj that takes a single argument and returns type R. Now that we have the types of the arguments, we can call QVariant::value<Arg>() to turn the QVariant into the correct value, and call the function with it!


There you go. That’s how you can call slots, with arguments, over a WebSocket. All you need to do is ‘register’ your slot like below, and the rest is handled auto-magically by the template.

Easy huh?



Next Steps: Do it in pure C++11


DEFCON 23 Retrospective

This year’s DEFCON was especially exciting because there were a lot more opportunities to get down and dirty with actual hardware. The return of the ICS (Industrial Control Systems) village, and the brand new IoT (Internet of Things) village, gave people the opportunity to play with everything from home office routers to water treatment plants.

I also got to do a first for me. A workshop led by Lyon Yang (@l0Op3r) walked us through exploiting a real buffer overrun vulnerability on an actual SOHO router.


buffer overrun in IoT workshop


The badges were a little disappointing this year honestly. I appreciate the nod to retro technology, but they were huge, kept falling off of the clip, and didn’t have any electronic modding opportunities.



The convention took up both Bally’s and Paris. It was so spread out, and there was so much going on, that it was impossible to see even 50% of the events, but overall it was a productive, fun convention. I look forward to catching up with all of the talks and workshops I missed in preparation for next year.

glibc, yocto, and cross compiling

(There is a TL;DR for those of you just interested in the fix at the bottom)

I have a love/hate relationship with yocto. It promises to make building custom embedded linux variants as easy as ordering from a menu, and it seems to get me 90% there every time. The problem is with the last 10%. Hob, the UI, is constantly crashing, so I have to fall back to the command line interface. It uses absolute paths EVERYWHERE, making portability a problem, and debugging issues is a nightmare. Finding where recipes are, and which layers possibly amended or patched them, is like trying to find a vaguely worded meta-needle in a haystack (grep is always your friend in situations like this).

My latest issue was when I tried to cross compile an application built with Qt and Boost for an iMX6 board. The meta-qt5 layer seems to be mature enough that it built the sysroot and Qt Creator environment without a hitch (minus the fact that I still had to patch it to explicitly choose ALSA over gstreamer). Cross compiling was a different experience. The compiling itself worked fine, but the linker complained it couldn’t find /lib/ . I checked the sysroot, and sure enough it had a /lib/ . I spent the next hour banging my head on the keyboard and yelling “IT’S RIGHT THERE!!!” at the screen before I realized, “Wait a minute, is it really looking for /lib/” (emphasis on that first slash, aka the root of the host’s root filesystem). Sure enough, if I copied from my sysroot’s /lib to my host’s /lib, it compiled fine! “Great, more absolute paths, thanks yocto”, I thought.

Well, after digging a little deeper it turns out this isn’t entirely yocto’s fault. It’s glibc’s. If you look at the linker script made by glibc, you’ll see that it hard codes the path it gives to the linker


I wasn’t sure about why it did this, until I read some snarky comments in the makefile of uClibc:

The amount of time saved by this optimization is actually too small to
measure. The linker just had to search the library path to find the
linker script, so the dentries are cache hot if it has to search the
same path again.  But it’s what glibc does, so we do it too.

Seems about as good a reason as any I guess. Hopefully this post helps someone avoid a couple hours of debugging and searching.

TL;DR: If you’re trying to cross compile an application and get a linker error like:

ld: cannot find /lib/

Then edit the following files:


and make the absolute paths relative paths so they look like this: