Re: Binary v. Text, why is it faster?



Arctic Fidelity wrote:
I have constantly seen and heard that reading binary data is faster than
reading textual data. I have always presumed this to be a fact. But now I
am at the point where I would like to understand why.

I was trying to think about it, and it has rather confused me. To my
understanding, reading a text file is reading in the bytes which
correspond to, for example, ASCII character codes. But if we are dealing
with a 1-byte character encoding, how is it slower to read in 'a' rather
than some binary representation of that?

They are the same, because a text file usually IS a binary file. On some
ancient machines text and binary files are different, but most
modern OSes/machines only implement binary files, so these days a text
file is nothing more than a binary file that contains only ASCII
characters. So reading an 'a' is in essence reading one byte from a
binary file.
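
To make that concrete, here is a minimal C sketch of the "text is just a restricted set of bytes" idea. The function name is_ascii_text() and the exact byte ranges it accepts are my own assumptions for illustration, not any standard definition of a text file.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte in buf is printable
   ASCII (0x20..0x7E) or common whitespace (tab, newline, carriage
   return), and 0 otherwise. This is one informal notion of "text",
   assumed here only for illustration. */
int is_ascii_text(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        int printable  = (c >= 0x20 && c <= 0x7E);
        int whitespace = (c == '\t' || c == '\n' || c == '\r');
        if (!printable && !whitespace)
            return 0;   /* a byte outside the "text" subset */
    }
    return 1;           /* every byte is in the "text" subset */
}
```

A buffer holding 'a' followed by a newline passes the check, while the two raw bytes 0x00 0x64 (the 16-bit integer 100 in big-endian order) do not. Both are still just bytes as far as read() or fread() is concerned.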

And in addition to this, what is the actual difference between binary and
textual files? I had always thought that a binary file was simply a file
composed of any combination of bytes, whereas a text file was a file
composed of a limited subset of the bytes available to a binary file. Am I
misunderstanding something here?

What people usually mean by this is storing data either in its binary
representation or as an ASCII string. Say, for example, you have four
variables: the first two are short integers (assume 16 bits) and the
second two are long integers (assume 32 bits). You want to store the
values in a file. In text format there are many ways to represent this,
but the most common is the newline-separated file:

100
4050
234262
400000

Reading the text file requires reading 22 bytes (19 digits plus the
three newlines between them). You then have to convert each ASCII
string back to an integer using atoi() etc., which consumes CPU time.
Compare this to reading a pure binary file: that only requires reading
12 bytes (2 + 2 + 4 + 4), and at most you'd have to handle endianness
by flipping the bytes over (or calling ntohs() and friends), which
consumes a lot less CPU time than atoi().
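
Here is a rough C sketch of both encodings for those four values, working on in-memory buffers instead of real files to keep it short. The struct and function names are made up for illustration, the binary layout (big-endian, fields packed back to back) is an assumption, and strtoul() stands in for atoi() because it reports where each number ended.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record: two 16-bit and two 32-bit values. */
struct record {
    uint16_t a, b;
    uint32_t c, d;
};

/* Text form: decimal digits separated by newlines, no trailing
   newline. Returns the number of bytes produced. */
size_t encode_text(const struct record *r, char *buf, size_t cap)
{
    int n = snprintf(buf, cap, "%u\n%u\n%lu\n%lu",
                     (unsigned)r->a, (unsigned)r->b,
                     (unsigned long)r->c, (unsigned long)r->d);
    assert(n >= 0 && (size_t)n < cap);   /* buffer must be big enough */
    return (size_t)n;
}

/* Text decode: every field needs a digit-by-digit conversion. */
void decode_text(const char *buf, struct record *r)
{
    char *end;
    r->a = (uint16_t)strtoul(buf, &end, 10);
    r->b = (uint16_t)strtoul(end, &end, 10);  /* skips the newline */
    r->c = (uint32_t)strtoul(end, &end, 10);
    r->d = (uint32_t)strtoul(end, &end, 10);
}

/* Store the low nbytes of v at p, most significant byte first. */
static void put_be(unsigned char *p, uint32_t v, int nbytes)
{
    for (int i = nbytes - 1; i >= 0; i--) {
        p[i] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
}

/* Rebuild an integer from nbytes big-endian bytes at p. */
static uint32_t get_be(const unsigned char *p, int nbytes)
{
    uint32_t v = 0;
    for (int i = 0; i < nbytes; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Binary form: fixed 12-byte layout (2 + 2 + 4 + 4), big-endian. */
size_t encode_binary(const struct record *r, unsigned char buf[12])
{
    put_be(buf + 0, r->a, 2);
    put_be(buf + 2, r->b, 2);
    put_be(buf + 4, r->c, 4);
    put_be(buf + 8, r->d, 4);
    return 12;
}

/* Binary decode: a few shifts per field, no digit parsing. */
void decode_binary(const unsigned char buf[12], struct record *r)
{
    r->a = (uint16_t)get_be(buf + 0, 2);
    r->b = (uint16_t)get_be(buf + 2, 2);
    r->c = get_be(buf + 4, 4);
    r->d = get_be(buf + 8, 4);
}
```

For the values 100, 4050, 234262 and 400000, encode_text() produces 22 bytes and encode_binary() produces 12. Writing the bytes out explicitly instead of dumping the struct with fwrite() sidesteps both endianness and struct padding, at the cost of a few shifts per field.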

Now, with four values you'll not see much difference, but imagine doing
this for very large amounts of data. Compare, for example, the size of a
256-color 1024x768 image as a GIF file (which is binary) to the same
image as an X-pixmap (XPM) file (which is ASCII text).
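
As a back-of-the-envelope estimate for that image, the sketch below compares the raw pixel payload with the pixel section of an XPM file. It assumes two characters per pixel, which is what a 256-color XPM typically needs, and about 4 bytes of punctuation per row. Both figures are my assumptions, the function names are made up, and the header and color table are ignored.

```c
#include <stddef.h>

/* Raw pixel payload: one fixed-size value per pixel. */
size_t raw_pixel_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Estimated XPM pixel section: each row is a quoted string holding
   chars_per_pixel characters per pixel, plus roughly 4 bytes of
   punctuation (two quotes, a comma, a newline). This is an
   approximation, not an exact XPM size calculation. */
size_t xpm_pixel_bytes(size_t width, size_t height, size_t chars_per_pixel)
{
    return height * (width * chars_per_pixel + 4);
}
```

With these assumptions, raw_pixel_bytes(1024, 768, 1) gives 786432 bytes and xpm_pixel_bytes(1024, 768, 2) gives 1575936 bytes, so the text form is about twice the raw binary size. A GIF is additionally LZW-compressed, so it is usually smaller than the raw figure.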
