Re: code critique



Daniel Leidisch <news@xxxxxxxxxxxx> writes:

Since I'm learning Lisp on my own, I would highly appreciate any tips
concerning my code. What could be done better, what is already ok?

(defun convert-hex-encoded-chars (string)
"Converts %HEX encoded chars found in string."
(if (cl-ppcre:scan "%.." string)
(cl-ppcre:register-groups-bind
(before match after) ("([^%]*)(%..)(.*)" string)
(concatenate 'string before
(string (code-char
(parse-integer (subseq match 1) :radix 16)))
(convert-hex-encoded-chars after)))
string))

This is fine; the following are just a few suggestions:

* it may be more efficient to use a string-output-stream than to
repeatedly construct intermediate strings. See
WITH-OUTPUT-TO-STRING.

* A portability caveat: ANSI Common Lisp does not require characters
to be encoded in ASCII or any other encoding, and so CODE-CHAR may
not do what you expect here.

* While it's evident when you look at the code for a few seconds (if
one knows regular expressions), neither the name of the function nor
the docstring actually says which way this function "converts"; it
could be "convert /from/", or "convert /to/". Why not name it
something like DECODE-HEX-ENCODED-CHARACTERS, or
DECODE-URLENCODED-STRING?

* Note that you don't really need regular expressions here: you can go
over characters in the string fairly easily with LOOP or DOTIMES, or
by constructing a string-input-stream and looping with READ-CHAR,
decoding and accumulating the first two characters after a percent
sign, and accumulating other characters verbatim. (This isn't a
claim that regular expressions are to be avoided: rather, that if
you're looking for exercises, then since you demonstrate facility
with regular expressions already, you might like to try writing this
routine using only standard CL routines.)

(defun split-query-string (query-string)
"Splits a query-string and returns a list of variable=value assignments."
(cl-ppcre:split "&" (substitute #\Space #\+
(convert-hex-encoded-chars query-string))))

If you split after decoding, what happens in case one of the decoded
variables or values contains an ampersand? For example:

(split-query-string "foo=b%26r")

(defun get-parameters (&optional
(query-string
(osicat:environment-variable "QUERY_STRING")))
"Returns an alist of (variable . value) pairs for a cgi query-string, which
may be given as an optional parameter. Otherwise, the environment-variable
QUERY_STRING is used."
(loop for variable in (split-query-string query-string)
collect (cl-ppcre:register-groups-bind
(key value) ("(.*)=(.*)" variable)
(cons (intern (string-upcase key))
value))))

Likewise, if you split variables from values after decoding, what
happens in case one of the decoded variables or values contains an
equal sign? For example:

(get-parameters "foo=b%3Dr")

--
RmK
.



Relevant Pages

  • Re: How to convert Infix notation to postfix notation
    ... If this is for an error message, why isn't it using stderr for its output? ... array of 15 characters, and you call this function with the limit 15 on ... Making sure that the only string I allocate and append to, ... because mulFactor in all versions must needs incorporate the functions ...
    (comp.lang.c)
  • Re: Prothon should not borrow Python strings!
    ... """It does not make sense to have a string without knowing what encoding ... same cul de sac as Python. ... Prothon_String_As_ASCII // raises error if there are high characters ... Python's split between byte strings and Unicode strings is ...
    (comp.lang.python)
  • Re: Letter to US Sen. Byron Dorgan re unpaid overtime
    ... put them in stupid places. ... Programming is difficult (as you must surely appreciate, ... > strings will be in the range 1...1000 characters. ... impose an artificially small limit on string length." ...
    (comp.programming)
  • Re: Byte Array to String
    ... retrieved text will mismatch the original characters. ... encoding the characters. ... Dim strFileData as String ...
    (microsoft.public.dotnet.framework.aspnet)
  • Re: A note on personal corruption as a result of using C
    ... impossible to write effective string validation routines by definition ... (Note that a string literal may contain embedded null characters; ... without resorting to abusive language. ... In practice, programmers typically use "struct" ...
    (comp.programming)