Re: How to Read csv Files with both Characters and Numbers?



On 27 Mar, 16:33, nos...@xxxxxxxxxxxxx (Richard Maine) wrote:
relaxmike <michael.bau...@xxxxxxxxx> wrote:
- string_split(string,chars,nbcomponent,component) splits
the given string every time one character in the chars argument is
found and adds each piece to the array of strings "component".

- string_is(string,class) returns true or false according to whether
the given string is an element of the given class, where class
is a string chosen from the list: "integer", "character", "string",
"real", "double", etc...
...
Of course, the basic blocks string_split and string_is do not
exist, therefore one would have to develop it...

...which isn't too hard to do. I've done similar string utility routines
myself and I use them all the time. The interface to my procedures isn't
exactly as above, but they do follow the basic idea of having string
utility procedures that get used for parsing tasks.

I think that most people who have done much string manipulation in
Fortran have eventually developed such a set of personal string
manipulation tools (or borrowed one from someone else). It's just the
"obvious" thing to do once one notices that the same basic tasks keep
coming up. That old "code reuse" buzz phrase; us old Fortraners have
been reusing code from long before the term was popularized.

But then, at a lower level, the procedures use some of the kinds of
techniques discussed here. They just add a slightly higher level of
interface.
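
For illustration only, a string_is along the lines described above might look like the sketch below. It is not code from this thread: the module name, the RESULT variable and the decision to handle only the "integer" and "real" classes are assumptions. The integer test is a plain character scan; the real test leans on an internal READ with IOSTAT, and assumes the compiler reports malformed input through IOSTAT.

```fortran
MODULE string_class_sketch
  IMPLICIT NONE
  PRIVATE
  PUBLIC :: string_is
CONTAINS
  ! Hypothetical helper (not from the thread): returns .TRUE. when "string"
  ! looks like a member of "class". Only 'integer' and 'real' are handled.
  FUNCTION string_is (string, class) RESULT (ok)
    CHARACTER(LEN=*), INTENT(IN) :: string
    CHARACTER(LEN=*), INTENT(IN) :: class
    LOGICAL :: ok
    CHARACTER(LEN=LEN(string)) :: s
    CHARACTER(LEN=32) :: fmt
    INTEGER :: n, first, istat
    REAL :: rtest

    ok = .FALSE.
    s = ADJUSTL(string)
    n = LEN_TRIM(s)
    IF (n==0) RETURN                     ! blank string: matches no class
    IF (INDEX(s(1:n), ' ')>0) RETURN     ! embedded blanks are rejected

    SELECT CASE (class)
    CASE ('integer')
      ! Optional sign followed by at least one digit, nothing else.
      first = 1
      IF (s(1:1)=='+' .OR. s(1:1)=='-') first = 2
      IF (n>=first) ok = (VERIFY(s(first:n), '0123456789')==0)
    CASE ('real')
      ! Let the run-time library decide: read all n characters with an
      ! F edit descriptor and treat a nonzero IOSTAT as "not a real".
      WRITE (fmt, '(A,I0,A)') '(F', n, '.0)'
      READ (s(1:n), fmt, IOSTAT=istat) rtest
      ok = (istat==0)
    CASE DEFAULT
      ok = .FALSE.                       ! unknown class name
    END SELECT
  END FUNCTION string_is
END MODULE string_class_sketch
```

Compilers can be lenient about what an F edit descriptor accepts, so a stricter "real" test may need an explicit character scan like the one used for the integer case.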

--
Richard Maine                    | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle           |  -- Mark Twain

You're so right!

Here, for example, is my latest "split_string" using Fortran 95:
SUBROUTINE split_string (string, majdel, majsplit, ierror, mindel, &
                         minsplit)
CHARACTER(LEN=*), INTENT(IN) :: string
CHARACTER(LEN=*), INTENT(IN) :: majdel
CHARACTER(LEN=LEN(string)), DIMENSION(:), POINTER :: majsplit
INTEGER, INTENT(INOUT) :: ierror
CHARACTER(LEN=*), INTENT(IN), OPTIONAL :: mindel
CHARACTER(LEN=LEN(string)), DIMENSION(:,:), POINTER, OPTIONAL :: minsplit
! This subroutine splits character "string" into a number of substrings,
! returned by one-dimensional character pointer array "majsplit".
! The substrings of "string" are delimited by character "majdel".
! If requested ("mindel" and "minsplit" PRESENT) each of the "majsplit"
! character array elements will, in turn, be split by delimiter "mindel".
! The result in this case is, in addition to "majsplit", returned in the
! two-dimensional pointer character array "minsplit", where
! SIZE(minsplit,1)==SIZE(majsplit) and SIZE(minsplit,2) is
! determined by the element of the "majsplit" array that has the most
! "mindel"s in it.
! The possibly "empty" substrings in "minsplit" are filled with blanks.
! "ierror" returns an error code:
!   0 OK = no errors discovered
!   1 "majsplit" is ASSOCIATED on entry
!   2 "majsplit" cannot be allocated
!   3 "minsplit" is ASSOCIATED on entry
!   4 Only one of "mindel", "minsplit" is PRESENT
!   5 "minsplit" cannot be allocated
! Examples:
!   CALL split_string ('1-3:A;2-9:D;4:D', ';', majsplit, ierror)
! returns
!   majsplit = "1-3:A"
!              "2-9:D"
!              "4:D"
!   CALL split_string ('1-3:A;2-9:D;4:D', ';', majsplit, ierror, &
!                      ':', minsplit)
! returns the same "majsplit" as above and
!   minsplit = "1-3" "A"
!              "2-9" "D"
!              "4  " "D"

INTEGER :: i, nmajdels, nmajsplit, istat, p, dim2, j, nminsplit
CHARACTER(LEN=LEN(string)) :: work

ierror = 0

! Check non-association of "majsplit":
IF (ASSOCIATED(majsplit)) THEN
ierror = 1
RETURN ! -------------------------------->
ENDIF

! How many major delimiters, "nmajdels" are there?
nmajdels = 0
i = 0
DO
i = i + 1
IF (i>LEN_TRIM(string)) THEN
EXIT ! ------------------>
ENDIF
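! Compare only when the candidate substring lies inside "string"
! (a multi-character "majdel" could otherwise index past the end):
IF (LEN(string)>=i+LEN(majdel)-1) THEN
IF (string(i:i+LEN(majdel)-1)==majdel) THEN
nmajdels = nmajdels + 1
i = i + LEN(majdel) - 1
ENDIF
ENDIF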
END DO

! "nmajsplit" must be one bigger:
nmajsplit = nmajdels + 1
ALLOCATE (majsplit(1:nmajsplit), STAT=istat)
IF (istat/=0) THEN
ierror = 2
RETURN ! -------------------------------->
ENDIF

! Fill with blanks:
majsplit = ' '

work = string
! "work" is shrunk in DO loop below:

DO i=1, nmajsplit
p = INDEX (work, majdel)
IF (p>0) THEN
majsplit(i) = work(1:p-1)
ELSE
! Last one:
majsplit(i) = work
ENDIF
! Ready for next substring:
work = work(p+LEN(majdel):)
END DO

! RETURN if "minsplit" not requested, indicated by BOTH arguments,
! "mindel" and "minsplit" being PRESENT.
IF (.NOT.PRESENT(mindel).AND..NOT.PRESENT(minsplit)) THEN
RETURN ! ------------------------------->
ENDIF

! Check them further:
IF (PRESENT(mindel).AND.PRESENT(minsplit)) THEN
IF (ASSOCIATED(minsplit)) THEN
ierror = 3
RETURN ! -------------------------------->
ENDIF
ELSE
! Here only one of them is PRESENT.
ierror = 4
RETURN ! ---------------------------------->
ENDIF

! Scan through the "majsplit"s to get max 2nd dimension, "dim2", for
! "minsplit":
dim2 = -HUGE(0)

DO i=1, SIZE(majsplit)
nminsplit = 0

j = 0
DO
j = j + 1
IF (j>LEN_TRIM(majsplit(i))) THEN
EXIT ! ----------------------->
ENDIF
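! Compare only when the candidate substring lies inside "majsplit(i)":
IF (LEN(majsplit(i))>=j+LEN(mindel)-1) THEN
IF (majsplit(i)(j:j+LEN(mindel)-1)==mindel) THEN
nminsplit = nminsplit + 1
j = j + LEN(mindel) - 1
ENDIF
ENDIF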
END DO
dim2 = MAX (dim2, nminsplit)
END DO

! Similar treatment as for one-dimensional "majsplit" above:
dim2 = dim2 + 1
ALLOCATE (minsplit(1:SIZE(majsplit),1:dim2), STAT=istat)
IF (istat/=0) THEN
ierror = 5
RETURN ! -------------------------------->
ENDIF

minsplit = ' '

DO i=1, SIZE(minsplit,1)
work = majsplit(i)
DO j=1, SIZE(minsplit,2)
p = INDEX(work, mindel)
IF (p>0) THEN
minsplit(i,j) = work(1:p-1)
ELSE
! Last for this "i":
minsplit(i,j) = work
EXIT ! -------------------->
ENDIF
work = work(p+LEN(mindel):)
END DO
END DO
END SUBROUTINE split_string
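
For the original question (a CSV record mixing text and numbers), a much smaller variant of the same idea may be enough. The sketch below is not part of the routine above: the module name, the procedure name split_fields and its interface are hypothetical. It assumes a single-character delimiter and a caller-supplied array, so no POINTER allocation is involved, and it does not handle quoted fields that contain the delimiter.

```fortran
MODULE csv_line_sketch
  IMPLICIT NONE
CONTAINS
  ! Hypothetical helper: split "line" at every single-character "delim".
  ! At most SIZE(fields) fields are stored; "nfields" is the number stored.
  SUBROUTINE split_fields (line, delim, fields, nfields)
    CHARACTER(LEN=*), INTENT(IN)  :: line
    CHARACTER(LEN=1), INTENT(IN)  :: delim
    CHARACTER(LEN=*), INTENT(OUT) :: fields(:)
    INTEGER,          INTENT(OUT) :: nfields
    INTEGER :: start, p, last

    fields  = ' '                 ! empty fields come back as blanks
    nfields = 0
    last    = LEN_TRIM(line)      ! ignore trailing blanks of the record
    start   = 1
    DO
      IF (nfields>=SIZE(fields)) EXIT
      ! Position of the next delimiter, relative to "start" (0 if none):
      p = INDEX(line(start:last), delim)
      nfields = nfields + 1
      IF (p==0) THEN
        fields(nfields) = line(start:last)       ! last field of the record
        EXIT
      END IF
      fields(nfields) = line(start:start+p-2)    ! text before the delimiter
      start = start + p                          ! skip past the delimiter
    END DO
  END SUBROUTINE split_fields
END MODULE csv_line_sketch
```

After CALL split_fields (line, ',', fields, nfields), a field expected to be numeric can be converted with an internal READ such as READ (fields(2), *, IOSTAT=istat) x, checking istat before trusting x.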