Re: join enhancements



Andreas Leitgeb wrote:
Neil Madden <nem@xxxxxxxxxxxxx> wrote:
....

I wanted to also extend split, but this opened a can of worms, which
I found myself unable to handle yet:
What if the value (in your example) contains an "="? (Yes, I know, it could also contain a \n, breaking things completely, but that's not what I'm worrying about here.)
My question is: will multi::split create a three-element sublist then,
or will it ignore any further occurrence of the "inner" split-char until it finds the next "outer" split-char? And will it ignore any
"outer" split-chars until it finds an "inner" one?

It will create a three-element sub-list. This is what should happen: split (and multi::split) take a *string* as input -- i.e. they assume no structure in what they are given, and just split on the given chars. Thus, talking about "inner" and "outer" chars doesn't make sense in this context. The caller is assumed to have taken care of quoting (or rather, eliminating) any stray delimiter characters. A join/split pair *could* be created that takes care of quoting/unquoting, something like:


proc quote-join {list delim} {
  join [map [quote $delim] $list] $delim
}
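proc quote-split {string delim} {
  # An element is a run of plain chars (neither delimiter nor backslash)
  # and backslash-escaped chars, so a quoted backslash directly before a
  # delimiter is not mistaken for an escaped delimiter.
  set re [format {(?:[^%s\\]|\\.)+} $delim]
  map [unquote $delim] \
      [regexp -all -inline $re $string]
}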

with some reasonable definitions for the other funcs:

proc map {func list} {
  set ret [list]
  foreach item $list {
    lappend ret [uplevel 1 [linsert $func end $item]]
  }
  return $ret
}
proc quote {delims} {
  set map [list \\ {\\}]
  foreach char [split $delims {}] {
    lappend map $char \\$char
  }
  return [list string map $map]
}
proc unquote {delims} {
  set map [list {\\} \\]
  foreach char [split $delims {}] {
    lappend map \\$char $char
  }
  return [list string map $map]
}

With these definitions:

quote-split [quote-join $str $delim] $delim

and

quote-join [quote-split $str $delim] $delim

should be the identity function for all inputs (well, except for the list normalisation performed by regexp: the pattern needs at least one character per element, so empty elements are dropped). However, the quoting mechanism now alters the structure of the original input beyond a simple split/join, so it is quite a big change to the semantics and should be given a different name to avoid confusion.
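
For anyone who wants to try the round trip, here is a self-contained sketch. The helper procs are repeated so the snippet runs on its own. The quote-split pattern used here treats a backslash as always escaping the character that follows it, which is an assumption on my part and not necessarily the only reasonable choice. The sample list is made up for illustration, and the delimiter is assumed to be a single character that is not special inside a regexp bracket expression.

```tcl
# Apply a command prefix to every element of a list.
proc map {func list} {
    set ret [list]
    foreach item $list {
        lappend ret [uplevel 1 [linsert $func end $item]]
    }
    return $ret
}
# Build a command prefix that backslash-escapes backslashes and delimiters.
proc quote {delims} {
    set map [list \\ {\\}]
    foreach char [split $delims {}] {
        lappend map $char \\$char
    }
    return [list string map $map]
}
# Build the inverse command prefix.
proc unquote {delims} {
    set map [list {\\} \\]
    foreach char [split $delims {}] {
        lappend map \\$char $char
    }
    return [list string map $map]
}
proc quote-join {list delim} {
    join [map [quote $delim] $list] $delim
}
# Assumption: an element is a run of plain characters (neither delimiter
# nor backslash) and backslash-escaped characters.
proc quote-split {string delim} {
    set re [format {(?:[^%s\\]|\\.)+} $delim]
    map [unquote $delim] [regexp -all -inline $re $string]
}

set items [list a=b {c\d} e]
set joined [quote-join $items =]
puts $joined                    ;# a\=b=c\\d=e
puts [quote-split $joined =]    ;# a=b {c\d} e
```

Note that a list containing an empty element does not survive the trip, because the pattern never matches an empty string.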


So, back to the proposed non-nesting cycling split: should split {a=b=c,r,x=y} "=" "," return {a b=c r,x y} or {a b=c r {} x y}?

The former, as that makes it the inverse of your cycling join:

% cycle-join {a b=c r,x y} = ,
a=b=c,r,x=y

Why would it produce the latter? With my multi::split it would produce:

a b {c r x} y

which seems more natural to me.
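
To make the cycling semantics concrete, here is a rough sketch of a cycling join and its inverse. Neither proc is from an existing package: the name cycle-split and both bodies are hypothetical, written only to match the behaviour described above, and they assume at least one delimiter is given.

```tcl
# Hypothetical sketch: join elements, cycling through the delimiters.
proc cycle-join {list args} {
    set result ""
    set i 0
    set n [llength $args]
    foreach item $list {
        if {$i > 0} {
            append result [lindex $args [expr {($i - 1) % $n}]]
        }
        append result $item
        incr i
    }
    return $result
}

# Hypothetical inverse: cut at the first occurrence of the current
# delimiter, then move on to the next delimiter in the cycle.
proc cycle-split {string args} {
    set result [list]
    set i 0
    set n [llength $args]
    while {1} {
        set delim [lindex $args [expr {$i % $n}]]
        set pos [string first $delim $string]
        if {$pos < 0} break
        lappend result [string range $string 0 [expr {$pos - 1}]]
        set string [string range $string [expr {$pos + [string length $delim]}] end]
        incr i
    }
    # Whatever is left after the last delimiter is the final element.
    lappend result $string
    return $result
}

puts [cycle-join {a b=c r,x y} = ,]    ;# a=b=c,r,x=y
puts [cycle-split a=b=c,r,x=y = ,]     ;# a b=c r,x y
```

The split never looks for a delimiter other than the current one, which is why "b=c" and "r,x" stay intact.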
....

Well, I think this functionality would really nicely fit into the one tcl join.


Perhaps. I'm not so sure that it's the right behaviour, and I don't see a compelling case for the inclusion when it is so easy to do in a few lines of Tcl. I think tcllib is a better place for these sorts of simple functions, at least until the "right" behaviour can be agreed on.


-- Neil