Re: Auto-update protocol



D Yuniskis <not.going.to.be@xxxxxxxx>
wibbled on Tuesday 09 March 2010 20:48


I would use TFTP for the actual image transfer - simple, standard and the
client initiates the connection so as long as it is happy it is talking
to the right IP it doesn't have to worry too much about connection based
attacks - code signing should sort the rest of the security out.

The issue (I think) with TFTP is that it doesn't protect against
the image file being updated "between packets". So, you run
the risk of pulling down part of one image and part of another,
flashing this combination -- only to discover the resulting image
is corrupt (fails checksum).

I suppose it depends on who controls the TFTP server platform and how much
confidence you have in them using management software and not fiddling
directly?

With a connection oriented protocol, the server side can open()
the desired file and keep a link to it throughout the transaction
(even if the original file is unlink(1)-ed, etc.)

That's true - although the original file can still be modified in place
whilst the server is reading it - unless you apply the "immutable" bit to
the file, assuming linux. You could read the file into RAM when a client
requests it, and check the integrity against the signature/checksum before
doling it out to the client on the net. That would be pretty bombproof
AFAICS.
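
A minimal sketch of that read-then-verify step, in Python for brevity. The function name and the use of SHA-256 are my own assumptions; nothing in the thread fixes a digest or a signature scheme.

```python
import hashlib

def load_verified_image(path, expected_sha256_hex):
    """Read the whole image into RAM and verify it before serving any of it."""
    with open(path, "rb") as f:
        data = f.read()
    # If the file was modified while we were reading it, the digest will not
    # match and the image is never handed to a client.
    digest = hashlib.sha256(data).hexdigest()
    if digest != expected_sha256_hex.lower():
        raise ValueError("image does not match its published checksum")
    return data  # serve blocks from this buffer, not from the file
```

Once the buffer has passed the check, later changes to the file on disk cannot affect a transfer already in progress.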

I think there is some merit to the simplicity of the TFTP protocol even if
you implement a stronger server as you may find ready made code for the
client side.

If that doesn't suit, then perhaps it is time to write your own protocol. In
which case, I would do something based on simple frames with a length, frame
type, sequence number and checksum. That gives you the possibility to use
different frame types as commands, with one type for shifting blocks of
image data (hence the sequence numbers).
Replies including ACKs use the same sequence number as the original request
which allows marrying up things again.
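
To make that concrete, here is a rough sketch of such a frame in Python. The field widths, the type codes and the use of CRC32 are all assumptions for illustration; pick whatever suits the device.

```python
import struct
import zlib

# Hypothetical layout: payload length (u16), frame type (u8), sequence (u16),
# then the payload, then a CRC32 (u32) over header + payload. Network byte order.
HEADER = struct.Struct("!HBH")
TRAILER = struct.Struct("!I")

TYPE_DATA = 1  # carries one block of image data
TYPE_ACK = 2   # reply; echoes the sequence number of the request

def pack_frame(ftype, seq, payload=b""):
    header = HEADER.pack(len(payload), ftype, seq)
    crc = zlib.crc32(header + payload) & 0xFFFFFFFF
    return header + payload + TRAILER.pack(crc)

def unpack_frame(frame):
    if len(frame) < HEADER.size + TRAILER.size:
        raise ValueError("frame too short")
    length, ftype, seq = HEADER.unpack_from(frame, 0)
    end = HEADER.size + length
    if len(frame) != end + TRAILER.size:
        raise ValueError("length field does not match frame size")
    (crc,) = TRAILER.unpack_from(frame, end)
    if zlib.crc32(frame[:end]) & 0xFFFFFFFF != crc:
        raise ValueError("checksum mismatch")
    return ftype, seq, frame[HEADER.size:end]
```

An ACK for data block 7 would then be `pack_frame(TYPE_ACK, 7)`, and the sender matches it to the outstanding request by sequence number.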

Do you already have such a protocol in place for other functions of the
device?


The client can try to *recover* from this mixup. But, it means
starting the entire process over again. In a pathologic scenario,
it could *never* recover completely (highly unlikely, but possible).

- Minimize unnecessary network traffic as well as
load on the server (the goal is for the user *not*
to notice this activity -- though I am not trying
to "keep it secret")

Unless you go multicast/broadcast (with the extra complexity that will
force onto the clients) I don't think you could do much better. Would you
really need to reduce net traffic below one image transmission per
device?

My point was to avoid *needless* traffic. E.g., don't download
the image ALL THE TIME if you don't need to do so. (imagine
a RAM-based environment which *would* need that sort of support)

OK, polling a tiny manifest file would seem to fit the bill? The image is
only pulled if the client determines there is actually an upgrade it needs.
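
Something like the following would do for the client-side check. The manifest format here (one "type version filename" line per device type) is invented for the example; the thread does not define one.

```python
def parse_manifest(text):
    """Return {type_id: (version_tuple, filename)} from 'type version filename' lines."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        type_id, version, filename = line.split()
        entries[type_id] = (tuple(int(p) for p in version.split(".")), filename)
    return entries

def needs_update(manifest_text, type_id, running_version):
    """Return the image filename to pull, or None if nothing newer is listed."""
    entry = parse_manifest(manifest_text).get(type_id)
    if entry is None:
        return None
    available, filename = entry
    return filename if available > running_version else None
```

Comparing versions as integer tuples means 1.10.0 correctly sorts after 1.9.0, which a plain string comparison would get wrong.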

(seems like I have forgotten something -- but I
can't recall what! :< Too early in the day...)

Now the interesting bit is advertising the availability of images to the
clients. TFTP of course cannot do directory listings, though the client
could periodically retrieve a manifest file.

Or, it can just request a *specific* file name (e.g., related
to its MAC address -- since both sides of the link would need
to know that)

True. You could avoid the need to symlink the image files N-times by using
the broadcast MAC address as a catch-all. Client checks for a file of its
own MAC first, then for the catch-all.

So all your files are of the form

TypeID-MAC

eg

023a-ffffffffffff <- for everything of type 023a
023a-0124ab4314ca <- except this guy
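
The client-side lookup for that naming scheme could be as small as this sketch. `fetch` is a hypothetical stand-in for whatever does the actual TFTP read and returns None when the file is absent.

```python
CATCH_ALL_MAC = "ffffffffffff"  # broadcast MAC used as the per-type default

def candidate_names(type_id, mac):
    """File names to request, most specific first."""
    own = mac.replace(":", "").replace("-", "").lower()
    return [f"{type_id}-{own}", f"{type_id}-{CATCH_ALL_MAC}"]

def pick_image(type_id, mac, fetch):
    """Try the device's own file, then the catch-all; return (name, data) or None."""
    for name in candidate_names(type_id, mac):
        data = fetch(name)
        if data is not None:
            return name, data
    return None
```

With the two files listed above on the server, the device 01:24:ab:43:14:ca gets its own image and every other 023a device falls through to the catch-all.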

Well, audio clients differ from video clients differ from...

But, I don't want to have to "personalize" each instance of
an "audio client" from each other instance. They configure
themselves at runtime.

So your "typeid" would be "audio client" or a numerical representation
thereof? One image to all audio clients - don't see that affecting anything.

However, I may want to try a different image on a particular
client (e.g., during development or when testing new features).
So, I would want to be able to say:

"device having MAC xx:xx:xx:xx:xx:xx please use *this* image"

As such, if you can support this, then you might as well
specify the image for *each* device in this same way.

For example, I manage my X Terminal clients with a file
hierarchy of:

ModelA/
    Version1/
        ConfigurationX
        ConfigurationY
    Version2/
        ConfigurationW
    Version3/
        ConfigurationQ
ModelB/
    Version2/
        ConfigurationP
    Version47/
        ...
Devices/
    MACxxxxxxxxxxxx -> ../ModelA/Version1/ConfigurationY
    MACyyyyyyyyyyyy -> ../ModelA/Version3/ConfigurationQ
    MACzzzzzzzzzzzz -> ../ModelB/Version2/ConfigurationP

but this is very "manual" -- not the sort of thing I want to
impose on others.

My scheme does give both options with a certain conciseness - not to say
that that is the only scheme. Your users/customers are going to have to be
aware of something or do you remote manage all this as part of the contract?
Any scheme that the user sees can always be dressed up with a pretty web
page or gui.

I don't want the "image server" to be aware of this stuff.
I want to burden the devices with it all. After all, *they*
are the things "benefiting" from this... :>


OTOH if you need to write your own image server, then why not? My last place
had a motto - do the fiddly stuff on the big computer and make the embedded
stuff as dumb as possible as the embedded stuff is far harder to program and
debug.

I'd just go for Type and Version numbers, it's all you need if I
understand your problem correctly. Timestamps are not really necessary.

Timestamp is a way around explicit version numbers.
Their advantage is that they are viewable "outside"
the image itself.

But fragile and to some extent meaningless except in as much as they are a
monotonically increasing sequence. Also much larger. I would strongly urge
using formal version numbers that can be tied back to a branch of a version
controlled source tree. I worked somewhere once that did not use version
numbers in a consistent way and worse had no source control at all. To say
it was a mess is an understatement. I left when the full horror became
apparent - more fool me for not looking harder at the interview...

Put a management system in place that people do not manage the TFTP
directory directly, but give new firmware to a script that gracefully
puts it in place, atomically updating the manifest file (hint: rename()
under linux is atomic)
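
A sketch of what such a script might do for each file it publishes. The function name and the file mode are assumptions; the important part is writing to a temporary file in the same directory and renaming it over the target.

```python
import os
import tempfile

def publish_atomically(dest_path, data):
    """Replace dest_path so readers see either the old file or the new one, never a mix."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # The temp file must live on the same filesystem as the target for the
    # rename to be atomic, so create it in the destination directory.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())      # make sure the bytes are on disk first
        os.chmod(tmp_path, 0o644)     # mkstemp creates 0600; the server must be able to read it
        os.replace(tmp_path, dest_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Publish the image first and the manifest last, so a client never sees a manifest that points at an image which is not there yet.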

Again, this forces changes on the server. I can build something
into the makefile (e.g., "make release") that automates some of this.
But, that will be happening on a different host so that adds
more things that can go wrong.

And, it doesn't address "production deployments" -- build an image
and distribute it to others for *them* to deploy locally.

OK - fair point. Perhaps you would be better implementing your own server
then.

I think the hardest part is managing encryption and/or code signing in a
way that doesn't overload the client if they are lightweight. The basic
approach

That's another reason why I want to avoid unnecessary updates.
(network traffic, slower boot times, more windows of vulnerability,
etc.). The "easy" fix is something that *pushes* updates to
each device. But, that also requires the most "support"
(on the servers as well as "by the developer") :<

I want a push deployment in a pull implementation! :>

Yeah - I think pulling a manifest file OR (based on earlier comments here)
probing for the correct image and just pulling the header block from that
image to determine version number would both be lightweight enough. If you
have 100 devices and they probe once every 2 minutes, that's still no more
than one very short transfer per second and no modern network is going to
notice that. If it were me, I would back off to checking once per hour or
slower and have a magic packet (or other command) that could be used to
force a check on a specific client for that edge case when you want to force
it *now*. IME the edge case use will be rare but when you need it, you
*really* need it...
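
The arithmetic behind that, as a throwaway Python helper (the function name is mine). It assumes the probes are spread evenly over the interval rather than all firing at once.

```python
def polls_per_second(devices, interval_seconds):
    """Average probe rate seen by the server, assuming evenly spread clients."""
    return devices / interval_seconds

two_minute_rate = polls_per_second(100, 2 * 60)   # about 0.83 probes per second
hourly_rate = polls_per_second(100, 60 * 60)      # about 0.028 probes per second
```

Adding a little random jitter to each client's interval would be one way to keep the probes from bunching up after a power cut.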

Cheers

Tim

--
Tim Watts

Managers, politicians and environmentalists: Nature's carbon buffer.
