Synchronization routine



I'm looking into a writitng a efficient routine and am looking for different
views of the problem:

Basically I have two lists, one on my firewall and one in a database. The
list on the firewall must become the same as the list in the database. The
current sizes are over 1 million entries and I expect to see 2 million rows
in the future.
Unfortuanlly the firewall vendor (Microsoft) made loading that list very
very slow when it gets large enough. Loading this list will take several
days on a very decent machine. Unfortuanlly I have only a slow system for
todays standards (would be top notch 6-7 years ago) but for the purpose.


So I have:

from the firewall:
Set A, which I can randomly access (though slowly)
from the database:
Set B, which is ordered, but can only be traversed sequentially once(!).
I expect the sets to scale up to 2 million elements.

The results should be:
C = A / B = all elements from A that are not in B
D = B / A = all elements from B that are not in A

I do not update set A until after retreiving the results. I have absolutely
no clue how they remove or add an enty to the list. So I don't want to jump
to make any assumptions in this regard.

Of course this must be done with only very limited computing power and I
mind the memory use.


My current solutions to the issue were:

Under the assumption that set A was ordered I used a 'merge join' which was
very fast. Unfortunally the assumption was flawed.

Using a sort routine has not yet worked out. I used quicksort, but I don't
believe its the correct solution. The entries in A are largely ordered so
quicksort wasn't too efficient and finally caused a stack overflow.

Next is using a 'hash join'. In my previous (probably inefficient)
implementation I saw memory usage reach over 100 MB for only 40000 entries.

Maybe anyone has a good suggestion which direction I should look for?
I thought it is a very nice problem, but hard to get it right.

- Joris


.



Relevant Pages

  • Re: Synchronization routine
    ... >> Basically I have two lists, one on my firewall and one in a database. ... >> list on the firewall must become the same as the list in the database. ... >on memory, so this isn't practical. ... For about one million entries, ...
    (comp.programming)
  • Re: How to get rid of persistent virus programs.
    ... > Long query about dealing with Pesky trojans and spyware ... > At least something like before and after lists, ... I'll mainly work around Windows XP, as that is what the bulk of this ... Why you should use a computer firewall.. ...
    (microsoft.public.windowsxp.help_and_support)
  • [fw-wiz] Re: Best Practices
    ... people separate network level (firewall, proxy, router acls, etc.) from ... so a security policy might be a base best practice;> Only part ... best practices aren't as much about giving people specific lists ... practices, I know I have other things to do and I assume you and Paul do ...
    (Firewall-Wizards)
  • RE: Looking for ipfw info.
    ... > legacy stateless rules when only stateful rules should be used to ... Yes for an firewall without an lan behind it ... You can access this lists archives at ... Then search the questions list archives at ...
    (freebsd-questions)
  • Spyware Blocklist 12/22/2002
    ... Entries for spyware which are new will say so. ... Changed references to the old Tiny Personal Firewall to the newer ... In the Network Mask box, enter the number from the middle ... might be able to delete rules in the previous lists which block ...
    (comp.security.firewalls)