Re: input stream 101



In article <1123100574.396477.35760@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
hawat.thufir@xxxxxxxxx <hawat.thufir@xxxxxxxxx> wrote:
>blmblm@xxxxxxxxxxxxx wrote:
>...
>> So, the reason for having a separate interface definition is to have
>> something short that just lists method signatures? Well, okay.
>> I still think it's a bit redundant to have both an interface and
>> a class, though perhaps there's an additional benefit I'm not
>> thinking of.
>
>Keep in mind that, for me, this is a chance to experiment. The idea
>of:
>
>Foo fooBar;
>fooBar = new Bar();
>
>I've scratched the surface with non-generics collections and that idea
>seems to be the best way to use them, like referring to an ArrayList as
>a List (if I have that right).

Yes -- but the reason for referring to an ArrayList as a List is
that several classes implement the List interface. If your code
needs only the functionality defined by List, you write it in terms
of List; then, if you ever want to use some other implementation
(e.g., LinkedList) instead of ArrayList, the change is relatively
easy -- you just replace "new ArrayList" with "new LinkedList".
Or your code might not be creating the lists at all (maybe it's
library code intended to be called by other code that actually
creates the lists), in which case there's even more reason to have
it work in terms of List rather than some particular class that
implements List.
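
A minimal sketch of the List idea, in case it helps. The class and
method names (ListDemo, joinAll) are made up for illustration, and
I'm assuming a generics-capable compiler; with the older non-generic
collections the same point holds, you'd just drop the <String> parts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the code under discussion.
class ListDemo {
    // Written against the List interface, so it accepts any
    // implementation (ArrayList, LinkedList, ...).
    static String joinAll(List<String> items) {
        StringBuilder sb = new StringBuilder();
        for (String s : items) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Only this line names the concrete class; replacing it with
        // "new java.util.LinkedList<String>()" changes nothing else.
        List<String> names = new ArrayList<String>();
        names.add("foo.html");
        names.add("bar.html");
        System.out.println(joinAll(names));
    }
}
```

The only place the concrete class appears is the "new" expression,
which is what makes the swap cheap.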

Maybe the idea still makes sense even if you only intend to have
one class implement your interface; I can't think of a reason, but
there might be one.

>Partly it's just to apply that idea and see what happens, I think
>you've shot down all my other reasons ;)

But in a kind and gentle way, I hope. :-)

>> With regard to documentation, do you know about the "javadoc"
>> tool?

[ snip ]

>I knew about javadoc and have seen code commented with it. It seemed
>to make the code so ugly, my opinion, that I couldn't justify that.
>However, it's interesting to know that undocumented code can be run
>through javadoc, that's neat.

Wow. To me the benefit (automatically generating documentation that
looks like the "standard" documentation) justifies a certain amount
of ugliness in the source -- and anyway aside from the use of the
javadoc tags (e.g., "@param"), I don't find that my comments look
much different from the way they did before I discovered javadoc.
May be a "YMMV" thing, though.

>I think I'll have to start doing some documentation, it, err "seemed"
>"faster" to not deal with that...I know that's erroneous thinking, but
>there you are.

I think it can help to write the comments first, since that forces
you to try to define what it is you're doing, rather than just diving
in and hacking out code. However, I admit that sometimes I find
myself just hacking out code (and feeling a bit guilty about it).

I think there's also a school of thought that claims that if you
choose your method and variable names really carefully, you won't need
comments -- e.g., if you're writing a Circle class, does getArea()
really need a comment to explain what it does? I don't think this
always works, but sometimes it does.
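
To make both points concrete, here is a rough sketch of such a
Circle class. The details (a constructor that rejects negative
radii, the exact comment wording) are my own assumptions, not
anything from your code. The constructor gets a javadoc comment
with "@param" because the name alone doesn't say what values are
allowed; getArea() is arguably fine with just its name, though I've
given it a one-line comment anyway.

```java
/** A circle with a fixed radius (hypothetical example class). */
class Circle {
    private final double radius;

    /**
     * Creates a circle.
     *
     * @param radius the radius, in any length unit; must not be negative
     * @throws IllegalArgumentException if radius is negative
     */
    Circle(double radius) {
        if (radius < 0) {
            throw new IllegalArgumentException("radius must not be negative");
        }
        this.radius = radius;
    }

    /** Returns the area of this circle, in squared length units. */
    double getArea() {
        return Math.PI * radius * radius;
    }

    public static void main(String[] args) {
        System.out.println(new Circle(2.0).getArea());
    }
}
```

Running javadoc over a file like this should produce pages in the
same layout as the standard API documentation.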

[ snip ]

>Yes, Test16 is a crappy name, I kept it largely because I just wanted
>to use it as someone else's code and not muck with it. I believe that
>it now makes sense to go back, get rid of the BasicTidy interface and
>rename Test16 to BasicTidy.

That's what I'd probably suggest.

>I'm not sure I would've arrived there without the input here, because,
>err, "it's my code and therefore it's perfect and doesn't need
>changing"; that kind of thinking. It's good to see what others think
>of what I have, definitely :)

And as I'm sure you know -- you probably progress faster if you're
willing to listen to other people's ideas, and keep the ones that make
sense to you.

[ snip ]

>Where I'm going with this is that I have more than 100 HTML files on my
>hard drive. I want to take those files, tidy-ize them, extract some
>XML, use XSLT to insert (?correct term?) that data into a database,
>probably MySQL. All the HTML files are the same pattern.
>
>What I have right now is the capability to hard-code in the URL for one
>of these HTML files and tidy-ize it. I want to just stick with this
>one file, for now. The entire process is:
>
>1.) hard code url
>2.) run black box (java -jar ControlTidy.jar)

If you're going to run from the command line, you could pass in
the URL and output file name as command-line parameters. main()
has a parameter of type String[], which at runtime holds one
String per command-line argument. I think this would be a very
easy change from your current code.
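
Something along these lines, as a sketch only. The class name
TidyArgsDemo and the helper describe() are invented for the
example, and I'm assuming exactly two arguments (URL, then output
file); your real code would hand those two strings to whatever does
the tidying.

```java
// Hypothetical class; stands in for the real entry point.
class TidyArgsDemo {
    /** Summarizes the two expected arguments; throws if the count is wrong. */
    static String describe(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException(
                "usage: java TidyArgsDemo <url> <output-file>");
        }
        String url = args[0];        // first command-line argument
        String outputFile = args[1]; // second command-line argument
        return "tidy " + url + " -> " + outputFile;
    }

    public static void main(String[] args) {
        try {
            System.out.println(describe(args));
        } catch (IllegalArgumentException e) {
            // Wrong number of arguments: print the usage line and fail.
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
}
```

You'd then run it as "java TidyArgsDemo <url> <output-file>", with
no need to recompile for each file.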

>3.) get foo.html
>4.) extract xml from foo.html for foo.xml
>5.) put foo.xml into, for example, MySQL
>
>I could work more on the front end, like a nice GUI, but I'd rather get
>some output.

And if the intended purpose of this code is to mass-convert 100+
files, my guess is that you'd be better served by something you can
run from the command line. I'm thinking in Unix/Linux terms, where
the obvious way to do this would be a shell script that loops over
all the files. I seem to remember messages from you to a Linux
mailing list, so maybe that approach would work for you.

>To that end I've been looking into some stuff. I'm not
>at home, so I don't have the links, but I looked at some sample code at
>sun that simply copies xxx.txt to yyy.txt, named FileCopy.
>
>(My time on this computer is expiring, so I can't look it up right now.
> It uses like FileReader, if memory serves. I was thinking of using
>that in conjunction with ByteStreamReader to read something directly
>off the hard drive.)
>
>I'm more interested in the back-end, though. gotta go!

[ snip ]

>> Also, if BasicTidy extends Runnable -- which
>> I'm not sure makes object-oriented-design sense, but whatever -- then
>> there's no need for it to explicitly include a signature for run().
>> I'm not really sure whether the duplication could be harmful.
>
>I'd like to get rid of that, it seems out of place.

I'm guessing this is another artifact of your having adapted the
code .... Nothing wrong with that, by the way; I think it's a fine
way to get started using stuff you don't understand 100%. But then
as you learn, you figure out what parts can be modified to suit your
purposes better.

Anyway, an easy change to your code would be to remove "extends
Runnable" from your interface definition, and just call your run()
method directly from the main program, rather than creating a Thread
object and calling its start(). If you wanted to avoid possible
confusion for readers who think run() must have something to do with
threads, you could rename it convert() or doIt() or whatever.
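
Roughly like this. It's a sketch with made-up names (Converter,
DirectCallDemo) and a placeholder body, since I don't know what
your run() actually does internally.

```java
// Hypothetical main program; stands in for your current one.
class DirectCallDemo {
    public static void main(String[] args) {
        Converter converter = new Converter();
        // Plain method call: runs in the main thread and returns
        // when the work is finished. No Thread object, no start().
        converter.convert();
        System.out.println("converted: " + converter.isDone());
    }
}

// Hypothetical stand-in for the class that used to implement run().
// Note that it no longer needs to implement Runnable.
class Converter {
    private boolean done = false;

    /** Formerly run(); renamed so nobody expects threads. */
    void convert() {
        // Placeholder for the real tidy-and-write work.
        done = true;
    }

    boolean isDone() {
        return done;
    }
}
```

Since the call doesn't return until the work is done, main() can
safely use the result on the very next line.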

--
| B. L. Massingill
| ObDisclaimer: I don't speak for my employers; they return the favor.