Re: Challenging problem



Time for a dual lesson here :-)


Nameless wrote:
nth1 is defined as:

    nth1(1, [Head|_], Head) :- !.
    nth1(N, [_|Tail], Elem) :-     % (+,+,-)(+,+,+)
        nonvar(N),
        M is N-1,
        nth1(M, Tail, Elem).
    nth1(N,[_|T],Item) :-          % (-,+,+)
        var(N),
        nth1(M,T,Item),
        N is M + 1.

It's the last clause which is of interest. The first thing
we notice is that it is non-tail-recursive, which means
that it is generating choice points.

Non-tail-recursivity and [shallow] non-determinism (the cause of choicepoints) are orthogonal. In the above code, choicepoints are created if the call is with N a nonvar and the Prolog compiler does not exploit the manifest mutual exclusiveness of the last two clauses [most Prolog systems don't]

You can rewrite this code with an accumulating parameter (so that
it becomes tail-recursive) and the same non-determinism and choicepoints
will remain.

Tail-recursion is related to using constant environment stack space
(call frames in languages without backtracking). Non-tail-recursive
clauses tend to use a non-constant number of call frames.

Both (non-determinism and non-tail-recursiveness) are space inefficient,
but do not change the time complexity necessarily. [they don't in this
case]


Ok, then let's make it tail recursive by executing the
counter before the recursive call. Sure, we rid ourselves
of the choicepoints

I hope you can believe by now this isn't true. If you don't, try to understand it. If you don't, ask explanation.


but we then get (1 + L) * L // 2
additions. It's a no-win situation! One way out of the
predicament is to use member/2 to lookup values, and nth1/3
only when we know the lookup value exists:

         ...
         ( member(Mod, Mods) ->
              nth1(I, Mods, Mod),
              length(L, I),
              append(L, _, [Quot|Quots]),
              reverse(L, P)
         ...

For N = 77373 and 77377 the recurring period lengths are
respectively 2149 and 77376. Let period/2 be your original
program and period1/2 the modified program, which uses
member/2 for lookup functionality. In all other respects
period1/2 is the same as period/2. Then the results of
extracting the recurring periods of 77373 and 77377 are:

    ?- period(77373, R3).
    R3 = [0, 0, 0, 0, 1, 2, 9, 2, 4, 4, 0, 5, 1, 5, 4, ...]
    Yes (0.51s cpu)

    ?- period1(77373, R4).
    R4 = [0, 0, 0, 0, 1, 2, 9, 2, 4, 4, 0, 5, 1, 5, 4, ...]
    Yes (0.30s cpu)

    ?- period(77377, R3).
    R3 = [0, 0, 0, 0, 1, 2, 9, 2, 3, 7, 3, 7, 0, 2, 7, ...]
    Yes (658.97s cpu)

    ?- period1(77377, R4).
    R4 = [0, 0, 0, 0, 1, 2, 9, 2, 3, 7, 3, 7, 0, 2, 7, ...]
    Yes (365.70s cpu)

Now, I'm no expert in complexity theory, but I should think
that period1/2 is slightly better than O(N^2). Anyone?

While my earlier remarks mostly addressed terminology (choicepoints and call frames) and issues (tail recursion and determinism) that one could perhaps not expect every programmer to have a grasp on, the last cited comments of Nameless show some misunderstanding of complexity theory that was beyond my imagination and needs correcting.

First: concluding from two measurements for each algorithm that the
asymptotic behaviour of one is better than the other ... how to make
it clear this is wrong in a way most people understand it ... ok

take the value of two shares A and B on the stock market on two
different days; suppose A has risen more in absolute value than B; now
conclude that A will always be more profitable than B in the long run.
Does that make sense ? I do not think so.

Complexity theory is about what happens in the long run. If one has
already established that what happens in the long run is regular (like
polynomial) than a few measurements might give the impression to support
that. But the other way around is not ok.


Second: a shallow inspection of member/2 and nth1/3 (whichever version) and the program that executes member(Mod,Mods) as a precheck to executing nth1/3, teaches us that the complexity is exactly the same, but that the constant factors may differ.

Constant factors are important in practice, but not to complexity
theory: look up: "linear speedup theorem" if you want to know why.


If the conclusion of Nameless were carried a bit further, the nth1/3 predicate should be implemented as:

	nth1(N,Xs,X) :-
		member(X,Xs),
		workhorse_nth1(N,Xs,X).

because this would result in a "slightly better than O(N^2)" complexity.

While the latter implementation of nth1/3 might be a good idea sometimes
(because of the constant factor), it doesn't lead to a change in
complexity, neither in a systematic improvement of performance, i.e. it
is also sometimes a bad idea.


Cheers

Bart Demoen

ps. I have rewritten the above post three times trying to remove all
my usual acid from it; if I failed, it is because I lack time to make it
nicer: it's just not my style; my sincere apologies
.



Relevant Pages

  • Re: Multi-threaded Performance Pitfalls presentation
    ... CPU, thus preventing each one of those threads to run concurrently. ... system resources latencies; network vs. disks vs. memory transfers can ... "overall" tasks, as periodic controls, memory pool cleaning, monitoring ... may depend on the complexity you need for the ...
    (comp.programming.threads)
  • Re: PHP Read Text File
    ... Complexity of my algorithm (the same as GNU "tail") is O, ... which has an complexity of O. ... What is your basis to say that parsing the entire file is more CPU efficient ...
    (comp.lang.php)
  • Re: Direcshow problems with WMV9 Decoder for XP platform
    ... If you set the complexity to 1 and your processing takes ... that's the typical performance of a 400 MHz CPU I ... The block errors in your decoded frames are not the result ... of lowering the complexity (lowering the complexity produces ...
    (microsoft.public.win32.programmer.directx.video)
  • Re: Best FPGA for floating point performance
    ... we could see much more massive reduction in complexity ... > management is really a small portion of a CPU implementation. ... > cache and integer and fp alus which you'd need anywhere. ... and it now does with RLDRAM and as long as the latency hiding is in the ...
    (comp.arch.fpga)