Re: Binary files, little&big endian setting bits

From: Eric Sosman (eric.sosman_at_sun.com)
Date: 10/28/04


Date: Thu, 28 Oct 2004 11:56:16 -0400

Steve wrote:
> Hi, i know this is an old question (sorry)
>
> but its a different problem, i need to write a binary file as follows
>
> 00000011
> 00000000
> 00000000
> 00000101
> 00000000
> 11111111
> 11111111
> 00000000
>
> program will be compiled in Microsoft Visual C++
>
> was thinking of just writing it as chars (afaik chars are the only
> unsigned int thats only 1 byte) so basicly i'll be writing
> 3,0,0,5,0,256,256,0
>
> question is if i write a file like that will it come out as the bits
> above, does VC++ write little or big endian and other than endian
> issues if it doesn't come out as above, why not??

    I think you're asking about the order in which the
individual bits of each byte will be written: will the
first bit of the 3 be the high-order zero or the low-
order one?

    To begin with, there may not *be* any order at all.
For example, suppose the output is sent to a parallel
interface that presents all eight bits simultaneously:
which bit is "first among equals" when they all march
in line abreast? The individual bits may not even
exist as discrete units: consider writing to a modem
that encodes many bits in each signal transition, or
one that uses data compression and winds up transmitting
2.71828 bits to encode the eight you presented. At the
C language level -- and even at the machine language
level, for most machines -- the byte is an indivisible
unit of I/O, and since it's indivisible the "order" of
its components cannot be discerned.

    The question does eventually arise, at the level of
the medium on which the data is stored or through which
it is transmitted. And here, each storage device or
transmission medium has its own standards for the encoding
of these "indivisible" bytes. Some, like serial interfaces,
will indeed "split the atom" and transmit the individual
bits in a specified order. Others, like SCSI controllers,
designate specific signal lines for specific bits. Still
others, like card punches (anybody remember punched cards?)
will produce a pattern of holes that encodes the character
designated by 3; this pattern will probably not have any
obvious relation to the original eight bits.

    But you needn't worry about this unless you're the
person charged with implementing the electrical interface
to the storage or transmission medium. It is the job of
that interface to accept the serialized bits or the SCSI
signals or the holes in a punched card and to reconstitute
the indivisible byte value from them. As a programmer you
almost never care about the details (unless, perhaps, you're
writing diagnostic code that wants to produce specified
patterns in the signal lines to detect cross-talk, or that
sort of thing). You write out a 3, and it's the business
of the various media through which that 3 travels to ensure
that a 3 comes out at the other end. No huhu, cobber.
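
    For the eight bytes in the question, a minimal sketch
might look like the following. The function name and the
file path passed to it are made up for illustration. Two
details matter: the sixth and seventh values are 255 (binary
11111111), not 256, which does not fit in an 8-bit byte; and
the file is opened in binary mode ("wb"), because text mode
on Windows expands a byte of value 10 into the pair 13, 10.

```c
#include <stdio.h>

/* Hypothetical helper: writes the eight byte values from the
   question to `path`.  Returns 0 on success, -1 on failure. */
int write_pattern(const char *path)
{
    /* 11111111 is 255 (0xFF); 256 would need a ninth bit. */
    static const unsigned char pattern[8] =
        { 3, 0, 0, 5, 0, 255, 255, 0 };
    FILE *fp = fopen(path, "wb");   /* "b": no newline translation */
    size_t written;

    if (fp == NULL)
        return -1;
    written = fwrite(pattern, 1, sizeof pattern, fp);
    if (fclose(fp) != 0 || written != sizeof pattern)
        return -1;
    return 0;
}
```

    Calling write_pattern("out.bin") should leave a file of
exactly eight bytes whose values are 3, 0, 0, 5, 0, 255, 255, 0,
whatever the underlying medium does with the individual bits.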

    Where you *do* need to worry about endianness issues
is when you're dealing with multi-byte data objects: the
low-level media take care of individual bytes for you, but
you're responsible for arranging those bytes into larger
structures. Different systems have different conventions
for such arrangements, and that's why you can't just use
`fwrite(&int_value, sizeof int_value, 1, stream)' to send
an integer from one system to another. But once you've
settled on an "exchange format" that specifies the order
and meaning of the individual bytes, all you need to do is
decompose your larger objects into those bytes before
writing them, and reassemble the bytes into the larger
objects when reading. The actual form of the bytes "in
flight" is not your problem.
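
    As a rough sketch of that decompose-and-reassemble step,
assume an exchange format (invented here purely as an example)
in which a 32-bit unsigned value travels as four 8-bit bytes,
most significant byte first. The function names are likewise
hypothetical. Because the code uses shifts and masks rather
than copying the object's memory, it gives the same bytes
regardless of how the host machine lays out its integers.

```c
/* Hypothetical exchange format: a 32-bit unsigned value stored
   as four 8-bit bytes, most significant byte first. */

/* Decompose `value` into four bytes in exchange-format order. */
void put_u32_be(unsigned char out[4], unsigned long value)
{
    out[0] = (unsigned char)((value >> 24) & 0xFF);
    out[1] = (unsigned char)((value >> 16) & 0xFF);
    out[2] = (unsigned char)((value >>  8) & 0xFF);
    out[3] = (unsigned char)( value        & 0xFF);
}

/* Reassemble four exchange-format bytes into a value. */
unsigned long get_u32_be(const unsigned char in[4])
{
    return ((unsigned long)in[0] << 24)
         | ((unsigned long)in[1] << 16)
         | ((unsigned long)in[2] <<  8)
         |  (unsigned long)in[3];
}
```

    The writer would call put_u32_be() and then fwrite() the
four bytes; the reader would fread() four bytes and hand them
to get_u32_be(). Neither side needs to know the other's native
byte order.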

    The only possible worry you might have with byte-by-
byte data exchange is if the machines use different byte
sizes: Exchanging data between machines with 8-bit and
9-bit bytes, for instance, can be tricky. But if you're
dealing with a common byte size, all is well.
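
    If you would like the program itself to confirm that
assumption, the macro CHAR_BIT in <limits.h> gives the number
of bits in a byte on the machine the code is compiled for.
A small sketch (the function name is made up) might be:

```c
#include <limits.h>

/* Hypothetical check: returns 1 if this implementation's bytes
   are 8 bits wide, 0 otherwise. */
int byte_is_octet(void)
{
    return CHAR_BIT == 8;
}
```

    This says nothing about the machine at the other end, of
course; it only confirms what the local compiler provides.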

-- 
Eric.Sosman@sun.com

