Re: My First C# (warning - long post)
- From: LX-i <lxi0007@xxxxxxxxxxxx>
- Date: Mon, 05 Feb 2007 20:45:41 -0600
andrewmcdonagh wrote:
On Feb 5, 1:42 am, LX-i <lxi0...@xxxxxxxxxxxx> wrote:
andrewmcdonagh wrote:
snipped
Ouch! ;)
// This is loaded with the keywords from the below table
private List<String> cobolKeywords = new List<String>();
That's OK, but the performance is going to suffer.
Every time you create an instance of this class, it's going to:
1) Create an array of the string literals
2) Copy that array, one element at a time, into a List
Then every time you call 'isKeyword()' on your class, it asks the List
if it contains the keyword, which basically loops through
its internal data structure, one element at a time, until it finds the
keyword or reaches the end without finding it.
Whereas......
using System;
using System.Collections.Generic;
using System.Text;

namespace CodeStats
{
    class FastCobolKeywordDictionary : KeywordDictionary
    {
        // Only the keys matter here; the values are never read.
        static Dictionary<String, String> keywords =
            new Dictionary<String, String>();

        // Static constructor - the runtime calls it once and only
        // once, before the first instance is created or any
        // static member of the class is used.
        static FastCobolKeywordDictionary()
        {
            //...
            keywords.Add("AUTHOR", null);
            keywords.Add("BEFORE", null);
            keywords.Add("BINARY-1", null);
            // ....
        }

        public bool isKeyword(String keyword)
        {
            return keywords.ContainsKey(keyword);
        }
    }
}
This class has a 'static constructor', which is called once and only
once by the runtime system for you, at some point between the program
starting and the first usage of the class.
Plus, when 'isKeyword()' is called, it asks the Dictionary if it
contains the keyword, which does a super-fast lookup without
looping through the entire collection.
Dictionaries (and Hashtables) internally separate their contents into
buckets based upon the hash code of the key. So when asked
'ContainsKey(...)', they only need to look in the relevant (small)
bucket for a match, instead of the entire collection.
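To make the difference concrete, here is a minimal sketch that runs the same lookup against both containers. The class name LookupDemo and the three-word sample are assumptions for illustration only; the real keyword table is much longer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; the three words are a sample, not the real COBOL table.
public static class LookupDemo
{
    private static readonly string[] Sample = { "AUTHOR", "BEFORE", "BINARY-1" };

    private static readonly List<string> AsList = new List<string>(Sample);

    private static readonly Dictionary<string, bool> AsDictionary = BuildDictionary();

    private static Dictionary<string, bool> BuildDictionary()
    {
        Dictionary<string, bool> result = new Dictionary<string, bool>();
        foreach (string word in Sample)
        {
            result[word] = true; // the value is a placeholder; only the key is used
        }
        return result;
    }

    // List<T>.Contains compares the argument with each element in turn.
    public static bool InList(string word)
    {
        return AsList.Contains(word);
    }

    // ContainsKey hashes the key and inspects only the matching bucket.
    public static bool InDictionary(string word)
    {
        return AsDictionary.ContainsKey(word);
    }
}
```

Both methods give the same answers; only the amount of work per call differs, and with a table of a few hundred keywords the gap may or may not matter in practice. Note that both lookups are case-sensitive by default.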
Ah... I shied away from that because I thought, "I don't need a key-value pair - just the value." Would it still be acceptably efficient to do the Add()s off the array defined in the constructor? I could convert it to hard-coded Add()s, that would be easy enough.
Also, regarding the above - would there be anything wrong with making isKeyword() a static method as well? Then, you wouldn't actually have to create an object - you could just say something like
if (FastCobolKeywordDictionary.isKeyword(words[i])) {
// do something really cool
}
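A rough sketch of what that static version might look like, filling the dictionary from an array inside the static constructor. The class name StaticCobolKeywords and the three-word array are made up for illustration. One trade-off to keep in mind: a static class cannot implement the KeywordDictionary interface, so callers end up tied to this one concrete class.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical name; a static class can never be instantiated.
public static class StaticCobolKeywords
{
    // Sample only; the full keyword list is elided in the thread.
    private static readonly string[] Words = { "AUTHOR", "BEFORE", "BINARY-1" };

    private static readonly Dictionary<string, bool> Keywords =
        new Dictionary<string, bool>();

    // Runs once, before the first call to IsKeyword. Field initializers
    // above have already run by the time this body executes.
    static StaticCobolKeywords()
    {
        foreach (string word in Words)
        {
            Keywords[word] = true; // placeholder value; only the key matters
        }
    }

    public static bool IsKeyword(string word)
    {
        return Keywords.ContainsKey(word);
    }
}
```

The array is walked only once per program run, so loading from it should cost about the same as hard-coded Add() calls.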
I did that today with the NormalizeSpace method - I created a class called CSCSFuncs (as plain Funcs was already taken), and made NormalizeSpace() a class method.
(As a *way* aside, I found out that my web host, for $1 more a month, offers ASP.NET with SQL Server 2005. I'm really tempted to cough up the extra $12 to start working with some of my websites with this. Plus, the plan still includes PHP and MySQL, so my current stuff should continue to work as-is. That way, I'd be able to get more experience in the environment.)
This is really cool stuff! The intent of this look-up is to keep us
from hitting the database to determine if the name is a database name.
However, it just hit me when I was typing this e-mail - I could make
DataSets from the database tables, convert them to lists, and use the
list lookups for the cull building! No repetitive database access!
(You were just the witness to a huge light coming on in my head... And
no, it's not just the glare from where I don't have hair there
anymore... ;) )
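The in-memory lookup idea above could be sketched roughly like this. The NameCache class, the "Name" column, and the sample rows are all assumptions; in the real program the DataTable would presumably be filled from the database by a DataAdapter rather than built by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper; not part of the original program.
public static class NameCache
{
    // Copies one string column of a DataTable into a dictionary so later
    // lookups never touch the database.
    public static Dictionary<string, bool> FromTable(DataTable table, string columnName)
    {
        Dictionary<string, bool> names = new Dictionary<string, bool>();
        foreach (DataRow row in table.Rows)
        {
            names[(string)row[columnName]] = true;
        }
        return names;
    }

    // Stand-in for a table loaded from the database; the rows are made up.
    public static DataTable SampleTable()
    {
        DataTable table = new DataTable("DatabaseNames");
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add("HOME_ELC");
        table.Rows.Add("HOME_ZIP");
        return table;
    }
}
```

The cache is a snapshot taken at load time, so it only stays correct while the underlying table does not change.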
Oh OK... be careful, this caching technique is widely known, and so are
its problems.
These wouldn't be updateable. Basically, I've got to look at every word in a COBOL program to see if it's a database name. By excluding keywords, I would not have to "hit the database" for those, which reduces my database access time. Ditto for words with symbols. All I'm doing is looking to see if that word is in the database of "database words". If it is, I determine how it's being used (we track used vs. updated, although it's kind of weak because an update of the group level item updates all the elementary items; I think that the existing retrieval walks up the hierarchy if you ask it to), then update the count.
I may have an opportunity to use datasets for the update counts - but I would be AcceptChanges()ing often. Also, this process runs when a user checks in a piece of source code - and I can say with near certainty that two people are not going to be checking in the same item at the same time! :) (If they are, we have other problems...)
(And yes, I know that this is "the original intent" of the code, which may be changed over the years... Maybe straight ExecuteSQL()s would be better (wrapped in another class, of course).)
And - your interface - I could come up with something like
NetworkRecordKeywordDictionary : KeywordDictionary
NetworkItemKeywordDictionary : KeywordDictionary
RelationalTableKeywordDictionary : KeywordDictionary
RelationalItemKeywordDictionary : KeywordDictionary
I actually *will* use a dictionary on the relational items, as I'll have
a proc data name (ex. R119-Home-ELC) that references an actual database
name (HOME_ELC in the HOME_USER table). I may have to add another
method to the interface for those, or change the return type - of
course, I guess at that point it's not really using the
KeywordDictionary interface. :)
Not sure I follow you here...
I'm not sure I followed myself... :)
The interface is there to generalise the method, so if you change it
to specialise again, then your code will need to do something like
if (cobol)
    return cobolKeywordDictionary.isKeyword(something);
else
    return someOtherKeywordDictionary.isKeyword(something,
        andSomethingElse);
Always try to 'push' the knowledge of which one to call to the other
class (our keyword dictionaries in this case); that way we can simply
write
someDictionary.isKeyword(something);
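Here is a small sketch of that idea. The KeywordDictionary interface is never shown in the thread, so its single-method shape is an assumption, and the two implementations and the WordCounter class are invented stand-ins.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the interface discussed in the thread.
public interface KeywordDictionary
{
    bool isKeyword(string word);
}

// Hypothetical implementation backed by a fixed sample of COBOL keywords.
public class SampleCobolKeywords : KeywordDictionary
{
    private static readonly List<string> Words =
        new List<string>(new string[] { "AUTHOR", "BEFORE", "BINARY-1" });

    public bool isKeyword(string word)
    {
        return Words.Contains(word);
    }
}

// Hypothetical implementation with a different rule: it matches by prefix.
public class SamplePrefixNames : KeywordDictionary
{
    public bool isKeyword(string word)
    {
        return word.StartsWith("R119-", StringComparison.Ordinal);
    }
}

public static class WordCounter
{
    // The caller never asks which kind of dictionary it was given.
    public static int CountMatches(KeywordDictionary dictionary, string[] words)
    {
        int count = 0;
        foreach (string word in words)
        {
            if (dictionary.isKeyword(word))
            {
                count++;
            }
        }
        return count;
    }
}
```

CountMatches contains no if/else on the dictionary type; each implementation carries its own rule, which is the "push the knowledge" point made above.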
Unless I return an array... Hmmm... [blink] Man, my power bill is going to be sky-high this month! :)
But yes, I agree - the lookup for the relational items probably wouldn't be an isKeyword() call, because I'm not implementing the interface if I make an isKeyword() method that doesn't return a "bool" value, right?
Man - I'm excited. :) I really wish that there was a generic way to
code, though, that didn't have to know about what sort of database
backend you have. SqlDataAdapter is the DataAdapter for SQL Server,
while OdbcDataAdapter is the one for ODBC. If I could "genericise" (is
that a word?) those in the code, and just have it use whatever type of
connection I passed to it, I could actually run this against a copy of
the SQL Server database on my laptop (in MySQL or Access or something
else). (And yes, I know I could just code it with ODBC, but I don't
want it to run generically in production when there are libraries
specifically enhanced for it.)
And that is a great technique - one I'd endorse... with a twist, as
above: push the knowledge of the connection to the data source class
itself, or pass it in when you create an instance of the class.
I tend to always abstract the data reading/writing behind a general
purpose 'plain old C#' interface, and let the implementors of that
interface decide how/where to get their data from.
e.g.
using System;
using System.Data;

interface LineCountReporter
{
    void WriteTotalLineCount(String sourceFilename, int lineCount);
}

class MySqlLineCountReporter : LineCountReporter
{
    // IDbConnection (System.Data) stands in for the unspecified
    // 'Connection' type.
    private IDbConnection dbConnection;

    public MySqlLineCountReporter()
    {
        // set the dbConnection here...
    }

    // provide the connection at object creation time only.
    public MySqlLineCountReporter(IDbConnection connection)
    {
        dbConnection = connection;
    }

    public void WriteTotalLineCount(String sourceFilename, int lineCount)
    {
        // use MySql database code to write to a table.
    }
}
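One possible reading of how this plays out across machines, sketched with stand-in classes: the calling code depends only on LineCountReporter, and a single place picks the implementation. The ReporterFactory and CheckIn classes, the backend names, and the in-memory and console reporters are all hypothetical; real MySQL and SQL Server reporters would take their place.

```csharp
using System;
using System.Collections.Generic;

public interface LineCountReporter
{
    void WriteTotalLineCount(string sourceFilename, int lineCount);
}

// Hypothetical stand-in for a database-backed reporter; keeps counts in memory.
public class InMemoryLineCountReporter : LineCountReporter
{
    public readonly Dictionary<string, int> Counts = new Dictionary<string, int>();

    public void WriteTotalLineCount(string sourceFilename, int lineCount)
    {
        Counts[sourceFilename] = lineCount;
    }
}

// Second hypothetical backend, to show that callers do not change.
public class ConsoleLineCountReporter : LineCountReporter
{
    public void WriteTotalLineCount(string sourceFilename, int lineCount)
    {
        Console.WriteLine(sourceFilename + ": " + lineCount);
    }
}

public static class ReporterFactory
{
    // The backend name would come from configuration on each machine.
    public static LineCountReporter Create(string backend)
    {
        switch (backend)
        {
            case "memory":
                return new InMemoryLineCountReporter();
            case "console":
                return new ConsoleLineCountReporter();
            default:
                throw new ArgumentException("Unknown backend: " + backend);
        }
    }
}

public static class CheckIn
{
    // Identical on every machine; it never names a concrete reporter.
    public static void Record(LineCountReporter reporter, string file, int lines)
    {
        reporter.WriteTotalLineCount(file, lines);
    }
}
```

Under this reading, the laptop and the server would differ only in the value handed to ReporterFactory.Create (or in which reporter is constructed at startup), while CheckIn.Record stays the same.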
I guess I'm not seeing this... How does this help me run the code on my laptop against MySQL, e-mail to work, and run it on our server there against SQL Server? (I don't mean this to sound as harsh as it probably does, but I can't think of another way to word the question... Like I said, I'm probably just not connecting the dots.)
Thanks again... :) I think I'm going to have to figure out how to use Google Groups, because I don't want to lose this newsgroup in a couple of weeks when my cable modem is disconnected!
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ / \ / ~ Live from Montgomery, AL! ~
~ / \/ o ~ ~
~ / /\ - | ~ daniel@thebelowdomain ~
~ _____ / \ | ~ http://www.djs-consulting.com ~
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
~ GEEKCODE 3.12 GCS/IT d s-:+ a C++ L++ E--- W++ N++ o? K- w$ ~
~ !O M-- V PS+ PE++ Y? !PGP t+ 5? X+ R* tv b+ DI++ D+ G- e ~
~ h---- r+++ z++++ ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Who is more irrational? A man who believes in a God he doesn't see, or a man who's offended by a God he doesn't believe in?" - Brad Stine