libera/#sicl - IRC Chatlog
Search
8:55:44
beach
I looked into how functions such as the trigonometric ones are computed, which is something that prov worked (still works?) on.
8:55:54
beach
So, take SIN for instance. It appears the first quadrant is split into small intervals and for each interval, there is an appropriate polynomial approximation to evaluate.
8:56:01
beach
So then there are considerations such as how many intervals to use, and the degree of the polynomial approximation. These considerations lead to an interesting optimization problem.
8:56:09
beach
For one thing, it might be good to have a non-uniform interval size. For small values of the argument, the interval can likely be greater for some fixed degree of polynomial.
8:56:21
beach
Then comes the way that the right interval is determined. One could use a tree of comparisons much like we do for generic dispatch, but comparing floats is not fast. So perhaps translate the float into a fixed-point value between 0 and π/2 and do integer comparisons instead. So the number of such comparisons also influences the performance.
8:57:45
beach
I guess the desired precision is fixed to be the precision of the floating-point number. But I wonder if that full precision is possible to maintain when the final value is the result of several, more elementary, floating-point operations.
8:58:38
moon-child
'the desired precision is fixed to be the precision of the floating-point number. But I wonder if...' most math libraries provide some bounded error. Eg 1.5 ulp, 5ulp, whatever
8:59:46
beach
I see. So you are saying that some error is tolerated, as long as the library specifies an upper bound?
9:00:12
moon-child
there are also correctly rounded math libraries, but those tend to be appreciably more expensive. There was some recent work on making a high-performance one, but it is still more expensive than ones with a small error bound
9:00:44
moon-child
ieee 754 specifies that some operations--like basic arithmetic and sqrt--must be correctly rounded, but that others--in particular transcendentals--do not have to be
9:00:52
beach
I can imagine, yes. For the reason I mentioned. One would have to use higher precision to do the calculation and then round.
9:01:38
moon-child
from what I understand, it is frequently possible to obtain correctly rounded results without using higher precision, using fma to implement newton-raphson iteration or similar. But I may be wrong about that
9:02:00
moon-child
I think the most important thing is to provide consistent results across (hardware) platforms. This is not guaranteed, but I think it is a very valuable thing to provide
9:03:35
moon-child
reading more ... comparing floats is rather fast. But branches are not so nice in dense math code, especially if there is the potential for vectorisation (and I think it would be good to provide vectorisation, a la sleef)
9:04:18
beach
Float comparison must be like a subtraction, and I understand that takes a few cycles.
9:07:48
moon-child
an interesting result I stumbled upon recently is that lexicographic comparison is cheaper than subtraction. Not that it matters, at these bit widths. But anyway: you are half-right; I forgot that vector ops have higher latency than int ops (but generally as good or better throughput), and all float ops are vector ops. The difference is less pronounced on intel, but it is still a couple of cycles
9:09:33
beach
So you are saying, prefer fewer intervals to have fewer comparisons, and instead use higher-degree polynomials because they can be evaluated using vector instructions?
9:14:54
moon-child
my understanding was that, at least for sin, they only had a polynomial approximation over one interval, and then mirrored over the rest
9:15:40
moon-child
a quick skim of sleef appears to weakly confirm this, as it only has one exit point; it has a couple of branches, but they seem to do only preprocessing
9:20:19
moon-child
https://github.com/shibatch/sleef/blob/master/src/libm/sleefdp.c#L789 is sleef's sin
9:22:20
beach
I don't understand much of it. I would need to understand the general technique first.
9:23:12
beach
Anyway, it is time for a lunch break. Feel free to say more. I'll read up when I get back.
9:23:48
moon-child
I understand. I think there is a paper somewhere about sleef, but it does not go into details. glibc's math library might have more links; I think the gnu people are usually pretty good about citing their sources