Re: Detecting line terminators in a CSV file



Daniel Kasak wrote:
Greetings.

Hello,

I'm trying to write a CSV import routine for MySQL ( possibly will
extend it with a flashy GUI and release open-source ).

I'm having great difficulty doing this under Windows. If I use the code
below under Linux, it works perfectly, detecting correctly whether the
file has a Windows ( \r\n ) EOL sequence or a Unix ( \n ) EOL sequence.

However when I run it under Windows, the 1st substr() function that gets
the last 2 characters doesn't see things properly! Instead, it sees the
\n character and then a double quote ( " ) ... all the fields are
wrapped in double-quotes.

Why doesn't this code see the \r portion under Windows?

Because on Windows, Perl reads files in text mode by default, and the :crlf layer converts each CR LF pair to a single "\n" newline character before your code ever sees the data.

perldoc PerlIO
[snip]
:crlf
A layer that implements DOS/Windows like CRLF line endings. On
read converts pairs of CR,LF to a single "\n" newline character.
On write converts each "\n" to a CR,LF pair. Note that this layer
likes to be one of its kind: it silently ignores attempts to be
pushed into the layer stack more than once.

Also see the "Newlines" section of perlport:

perldoc perlport


How should I be doing the stuff below so that it works?

Are you sure that you want to "fix" this?

perldoc -f binmode
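If you do want to see the raw bytes, a minimal sketch using binmode might look like the following. The sub name detect_line_terminator and its return values are my own invention for illustration, not part of your code; the only assumption is that the first line of the file is representative of the rest.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original code): returns "\r\n", "\n",
# or '' depending on how the first line of the file ends.
sub detect_line_terminator {
    my ($path) = @_;

    open my $fh, '<', $path or return;   # caller can inspect $! on failure
    binmode $fh;                         # raw bytes: no CRLF translation on Windows

    my $first_line = <$fh>;              # still split on "\n", but "\r" is preserved
    close $fh;

    return unless defined $first_line;   # empty file
    return "\r\n" if $first_line =~ /\r\n\z/;
    return "\n"   if $first_line =~ /\n\z/;
    return '';                           # single line with no terminator
}
```

With binmode in place the same code should behave identically on Linux and Windows, because no layer rewrites the line endings on read.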


# Parse the 1st line of the import file and extract fieldnames
eval{
    open SOURCE, $options->{source}
        || die "Failed to open file $options->{source}.\nIs the file already open?";
};

if ( $@ ) {
    Gtk2::Ex::Dialogs::ErrorMsg->new_and_run(
        title => "Error opening file!",
        text  => $@
    );
    return FALSE;
}

You don't *have* to die if open doesn't work! And besides, using the high
precedence || operator means it won't die even if you wanted it to: the ||
binds to $options->{source}, which is true, rather than to the result of open.

open SOURCE, '<', $options->{source} or do {
    Gtk2::Ex::Dialogs::ErrorMsg->new_and_run(
        title => 'Error opening file!',
        text  => $!
    );
    return FALSE;
};
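To see the precedence difference for yourself, here is a small sketch that tries both forms on a file that is assumed not to exist. The path and the variable names are made up for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on your system
my $missing = 'no-such-dir/no-such-file.csv';

# Parsed as: open( my $fh1, ( $missing || die ... ) ).
# $missing is a true value, so die is never evaluated and the
# failed open goes unnoticed.
my $died_with_pipes = !eval { open my $fh1, $missing || die "open failed\n"; 1 };

# The low precedence "or" applies to the return value of open itself.
my $died_with_or = !eval { open my $fh2, '<', $missing or die "open failed\n"; 1 };

print "|| died: ", ( $died_with_pipes ? "yes" : "no" ), "\n";
print "or died: ", ( $died_with_or    ? "yes" : "no" ), "\n";
```

As long as the file really is missing, the first line printed should say "no" and the second "yes". Wrapping the arguments in parentheses, as in open( ... ) || die, would also work.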


# Read the 1st line
my $fieldnames = <SOURCE>;

# Close file
close SOURCE;

# Figure out what happens at the end of each line
# This should either be \n ( Unix ), or \r\n ( Windows )
my $line_terminator;

if ( substr( $fieldnames, length( $fieldnames ) -2, 2 ) eq "\r\n" ) {

You don't have to call the length() function; you can just use a negative offset, which counts from the end of the string:

if ( substr( $fieldnames, -2, 2 ) eq "\r\n" ) {
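As a quick illustration of the negative offset, here is a sketch using a made-up header line in place of the one read from the file. The sample data is an assumption; the variable names follow your code.

```perl
use strict;
use warnings;

# Sample header line standing in for the one read from the file (assumption)
my $fieldnames = qq{"id","name"\r\n};

# A negative offset counts from the end of the string; with the length
# argument omitted, substr returns everything from there to the end.
my $tail = substr( $fieldnames, -2 );

# Same two characters as the longer spelling with length()
my $same = $tail eq substr( $fieldnames, length( $fieldnames ) - 2, 2 );

my $line_terminator = $tail eq "\r\n" ? "\r\n" : "\n";
```

Bear in mind that this only finds "\r\n" if the file was read without CRLF translation, for example after calling binmode on the filehandle.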




John
--
use Perl;
program
fulfillment
.


