Re: code critique
- From: Daniel Leidisch <news@xxxxxxxxxxxx>
- Date: Wed, 30 May 2007 17:13:27 +0200
In article <87r6oyigvo.fsf@xxxxxxxxxxxxxxxxxx> you wrote:
Daniel Leidisch <news@xxxxxxxxxxxx> writes:
Since I'm learning Lisp on my own, I would highly appreciate any tips
concerning my code. What could be done better, what is already ok?
(defun convert-hex-encoded-chars (string)
"Converts %HEX encoded chars found in string."
(if (cl-ppcre:scan "%.." string)
(cl-ppcre:register-groups-bind
(before match after) ("([^%]*)(%..)(.*)" string)
(concatenate 'string before
(string (code-char
(parse-integer (subseq match 1) :radix
16)))
(convert-hex-encoded-chars after)))
string))
This is fine; the following are just a few suggestions:
* it may be more efficient to use a string-output-stream than to
repeatedly construct intermediate strings. See
WITH-OUTPUT-TO-STRING.
Although performance was not my foremost concern, that seems to be
a good tip. I'll keep it in mind, and maybe I'll change the code
according to your suggestion.
* A portability caveat: ANSI Common Lisp does not require characters
to be encoded in ASCII or any other encoding, and so CODE-CHAR may
not do what you expect here.
Do you have any suggestions as to how to implement this
more portably? After a quick (apropos 'ascii) I recognized
that all fbound symbols are in the SB-IMPL package (sbcl is my
cl-implementation of choice), so I assume that there is no standard
way to do it. The first thing that came to my mind was to add an
ascii table lookup myself. Is there a better solution?
* While it's evident when you look at the code for a few seconds (if
one knows regular expressions), neither the name of the function nor
the docstring actually says which way this function "converts"; it
could be "convert /from/", or "convert /to/". Why not name it
something like DECODE-HEX-ENCODED-CHARACTERS, or
DECODE-URLENCODED-STRING?
You're right. I like the names you proposed, so they are used now.
* Note that you don't really need regular expressions here: you can go
over characters in the string fairly easily with LOOP or DOTIMES, or
by constructing a string-input-stream and looping with READ-CHAR,
decoding and accumulating the first two characters after a percent
sign, and accumulating other characters verbatim. (This isn't a
claim that regular expressions are to be avoided: rather, that if
you're looking for exercises, then since you demonstrate facility
with regular expressions already, you might like to try writing this
routine using only standard CL routines.)
That would make an interesting exercise, indeed. I think I'll give
it a try.
(defun split-query-string (query-string)
"Splits a query-string and returns a list of variable=value
assignments."
(cl-ppcre:split "&" (substitute #\Space #\+
(convert-hex-encoded-chars
query-string))))
If you split after decoding, what happens in case one of the decoded
variables or values contains an ampersand? For example:
(split-query-string "foo=b%26r")
(defun get-parameters (&optional
(query-string
(osicat:environment-variable
"QUERY_STRING")))
"Returns an alist of (variable . value) pairs for a cgi
query-string, which
may be given as an optional parameter. Otherwise, the
environment-variable
QUERY_STRING is used."
(loop for variable in (split-query-string query-string)
collect (cl-ppcre:register-groups-bind
(key value) ("(.*)=(.*)" variable)
(cons (intern (string-upcase key))
value))))
Likewise, if you split variables from values after decoding, what
happens in case one of the decoded variables or values contains an
equal sign? For example:
(get-parameters "foo=b%3Dr")
Cripes, that's awkward. Of course you're right. In my defence: To
me the major problem was implementing convert-hex-encoded-chars,
so I seem to have become sloppy after it worked. Lack of discipline,
I guess. Now it works!
Yet again, thank you very much for your help -- it is appreciated.
Regards,
dhl
--
RmK
Here is a quick overhaul, I hope that there are no major slips of
the pen:
(defun decode-urlencoded-string (string)
"Decodes an urlencoded string: Occurrences of %HEX encoding are
transformed to the represented characters and + is substituted by
#\Space."
(substitute #\Space #\+
(decode-hex-encoded-chars string)))
(defun decode-hex-encoded-chars (string)
"Occurrences of %HEX in stringencoding are transformed to the
represented characters."
(if (cl-ppcre:scan "%.." string)
(cl-ppcre:register-groups-bind
(before match after) ("([^%]*)(%..)(.*)" string)
(concatenate 'string before
(string (code-char
(parse-integer (subseq match 1) :radix 16)))
(decode-hex-encoded-chars after)))
string))
(defun get-parameters (&optional
(query-string
(osicat:environment-variable "QUERY_STRING")))
"Returns an alist of (variable . value) pairs for a cgi query-string,
which may be given as an optional parameter. Otherwise, the environment-
variable QUERY_STRING is used."
(loop for variable in (cl-ppcre:split "&" query-string)
collect (cl-ppcre:register-groups-bind
(key value) ("(.*)=(.*)" variable)
(cons (intern (string-upcase (decode-urlencoded-string key)))
(decode-urlencoded-string value)))))
(defparameter *parameters* (get-parameters)
"Alist of QUERY_STRING variables.")
(defun get-parameter (key)
"Returns the value of a QUERY_STRING variable named key."
(cdr (assoc key *parameters*)))
.
- Follow-Ups:
- Re: code critique
- From: Richard M Kreuter
- Re: code critique
- From: Vassil Nikolov
- Re: code critique
- References:
- code critique
- From: Daniel Leidisch
- Re: code critique
- From: Richard M Kreuter
- code critique
- Prev by Date: Re: Specializing slot-value-using-class in OpenMCL
- Next by Date: Re: randomize-list
- Previous by thread: Re: code critique
- Next by thread: Re: code critique
- Index(es):
Relevant Pages
|