Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: "Richard Cooper" <spamandviruses@xxxxxx>
- Date: Tue, 11 Oct 2005 10:48:35 GMT
On Tue, 11 Oct 2005 03:04:28 -0400, ¬a\/b <al@xxx> wrote:
this should be a/b =(a.sn.sn<<(32*precision))/b.sn; ?
Yes.
It's because when you store a number as a fixed point number, all that you are really doing is storing the number multiplied by a constant value. We usually make that constant 256 or 65536 or something of that sort, but it could just as well be 7, 69, 111, anything.
Now, that fourth grade math I was talking about...
When I was in fourth grade, we were taught how to do long division with decimal numbers. We were doing problems like this:
_____
1.2 ) 37
The way we were taught to do this was to convert 1.2 to an integer, by multiplying it by 10 enough times to do so. So for this particular problem, we would multiply it by 10 once to get 12. So that we came up with the correct answer, we also multiplied 37 by 10, so that the problem looked like this:
______ 12 ) 370
Then it was just an ordinary integer division problem.
This is similar to what we're doing with fixed point numbers. We're multiplying them so that they become integers, and then preforming operations on integers instead.
Now in 4th grade, we multiplied both numbers in the problem by the same amount, because we wanted to get the correct answer when we were done. Doing the division would remove the constant that we multiplied each number by, so that the answer we got was the correct one.
For a division problem, "n / d", we basically turned it into "(10 * n) / (10 * d)" and of course the "10 / 10" part of that doesn't affect the answer since 10 / 10 is 1, and so the answer came out multiplied by 1.
Now in fixed point math we actually want the answer to also be multiplied by 10 (or whatever). So we've got to do the equivelant of "(10 * 10 * n) / (10 * d)" so that the 10 in the denominator eliminates one of the 10s in the numerator, leaving us with the other 10 that we want the answer to be multiplied by.
Now before you had written: a/b =(a.sn.sn<<(2*32*precision))/b.sn;
Which is basically: a/b =(a.sn.sn<<(32*precision)<<(32*precision))/b.sn;
A single "<<(32*precision)" is just like a "* 10"
The problem with this is that a.sn.sn (what's with the two 'sn' there anyway?) already has one "<<(32*precision)" applied to it when it was converted to the fixed point format. So even though we want the numerator to be "n *10 * 10", it's representation in the floating point format is already "* 10" so it only needs one more "* 10" to make it "* 10 * 10".
The same is true with the denominator, which also needs to be "* 10" for it to all work out, but it's already "* 10" because it's in fixed point format, so we don't do anything to it.
i think it is more easy than what you say for input something like this: if Lstring is the string of numbers and "a" is the number result fnum a, b; unum c; char *p1, *p2; b.sn=read(Lstring, p1);
a=b<<(32*precision);
if(*p1==".") ++p1; else goto label; c=read(p1, p2);
if(p1==p2) {label:;
return a; }
if(precision<c.len) {k=c.len-precision; c>>= 32*k; } a+=c; return a;
for output, something like above, should be possible
That certainly doesn't look any easier. It won't work either. (not that I claim to understand it, but what of it that I do understand, it won't work)
The problem is that it's just not that simple of a conversion. For example, 0.69 in binary is 0.1011000010100011, however 69 in binary is 1000101, and as you see, there's just no similarity in the bit patterns.
Now if you took the binary form of 69 and divided it by 100 (decimal), then you'd have the correct bit pattern. However, that's basically just doing what I said, except that it breaks the number into two halves and does each seperately. I fail to see how going through all of that trouble is any easier than the method I outlined.
if(precision<c.len) {k=c.len-precision; c>>= 32*k; }
You particularly want to account for the cases with lots of extra decimal places. For example, if I were doing 32-bit numbers with 8 fractional bits, I'd likely be reading the number into a 64-bit integer. The integer part is 24 bits, so that means I have 40 bits left over that I can toss extra decimal places into.
So say I get this number as input: 3.1415926535897932384626433832795028841971693993751
Those 40 bits can hold log_10(2^40) or 12.04119983 decimal digits. (Or if you wanted to do it the easy way, you'd just divide 40 by 3.321928095 and get the same number.) So what I'd do in my number reading routine is make it just always read 12 digits after the decimal point.
So if I gave it the above number, it would treat it as 3.141592653589, and if I gave it 3, it would treat it as 3.000000000000. So it works like this:
1. let x = 0 and let c = 12 2. Read next character. 3. If character is a number, multiply x by 10 and then add that number. 4. If character is a decimal point, go to step 10. 5. If character is anything else, go to step 20. 6. Go to step 2.
10. Read next character 11. If character is a number, multiply x by 10 and then add that number. 12. If character is anything else, go to step 20. 13. Subtract 1 from c. 14. If c == 0, go to step 30. 15. Go to step 10.
20. Multiply x by 10. 21. Subtract 1 from c. 22. If c != 0, go to step 20.
30. Shift x left by 8 bits (or whatever your "32*precision" is). 31. Add 500000000000 (half of 10^12) to x. 32. Divide x by 1000000000000 (which is 10^12) 33. Rejoice that you now have the number in fixed point format.
So steps 1 through 22 convert 3.141592653589 to 0x2DB75839F15.
You may notice that the next digit in pi is a 7, but that I didn't provide for rounding up the 9. This is because so many of those digits aren't even going to make it into the final number that the error caused by not doing so is very very very small. Like 0.00000001% small. So it's just not worth the trouble.
Step 30 adds in the normal "<< 8" format of the numbers, which turns it into 0x2DB75839F1500. Then step 31 takes care of rounding. Then step 32 moves the decimal point 12 places left, turning the number into 0x324. That makes it 11.00100100 in binary, which in decimal is 3.140625, which is as close to pi as you're going to get with only 8 fractional bits. So for the most part, most of those 12 digits we looked at were unnecessary, however on extremely rare cases they actually do make a difference, for example 1.005859374 would become 0x101, but 1.005859376 would become 0x102. Not really worth making a fuss over, but as long as you've got those extra 40 bits and can input 12 digits, you might as well.
.
- Follow-Ups:
- References:
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: Richard Cooper
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: ¬a\\/b
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: Richard Cooper
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: ¬a\\/b
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: Richard Cooper
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: ¬a\\/b
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: Richard Cooper
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: ¬a\\/b
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: Richard Cooper
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: ¬a\\/b
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- From: ¬a\\/b
- Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- Prev by Date: Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- Next by Date: nasm segment directive
- Previous by thread: Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- Next by thread: Re: Assembly Language - Mathematics WITHOUT maths coprocessor
- Index(es):
Relevant Pages
|