Wednesday, 17th of May 2017, 13:41:13 UTC
17:20:18
drmeister
I'm still working away on this. This was a wrenching, breaking change.
17:21:22
drmeister
I am making progress though - I discovered that the single-dispatch-generic-functions that Clasp uses to wrap virtual functions and methods were broken. Fixed now.
17:22:05
drmeister
The C calling convention was really, really not designed for this.
17:50:26
Shinmera
Cause get ready for that windows port where the standard calling convention is different
19:14:14
phoe
drmeister: Networking issues on origin should be resolved now.
19:14:19
phoe
If they aren't - let me know immediately.
23:50:35
drmeister
Why would an llvm 'add' be converted to an 'or'?
23:51:13
stassats
sometimes they're equivalent
23:51:24
stassats
when some of the bits are clear
23:52:00
drmeister
https://www.irccloud.com/pastebin/arOT3lvV/
23:52:19
drmeister
closure is a tagged pointer with the low bit 1
23:53:18
drmeister
It's being converted to this llvm-ir
23:53:19
drmeister
https://www.irccloud.com/pastebin/vvcqmpVq/
23:53:43
stassats
but OR and ADD have the same throughput and latency on, say, haswell
23:53:55
stassats
maybe some other arches have cheaper OR
23:54:12
drmeister
but they aren't equivalent here.
23:54:38
drmeister
I need a: add i64 %ptrtoint, 7
23:54:47
drmeister
I asked for an add - I need an add.
23:58:24
drmeister
I do. Essentially I have a tagged pointer with low bits 0001, I need to add 7 (0111), I need to get 1000
23:58:35
drmeister
The 'or' will give me 0111
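[Editor's sketch, not part of the log] The arithmetic drmeister describes can be checked in a few lines of C. The helper names and the sample address 0x1000 are hypothetical; only the tag bit (1) and the offset (7) come from the conversation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: tag value 1 in the low bit, offset 7. */
#define TAG    ((uintptr_t)1)
#define OFFSET ((uintptr_t)7)

/* What was asked for: a real addition, carries included. */
uintptr_t with_add(uintptr_t p) { return p + OFFSET; }

/* What the optimizer produced: a bitwise OR, no carries. */
uintptr_t with_or(uintptr_t p) { return p | OFFSET; }

/* Self-check on a hypothetical 8-byte-aligned address. */
void demo_add_vs_or(void) {
    uintptr_t aligned = 0x1000;        /* low three bits clear */
    uintptr_t tagged  = aligned | TAG; /* low bits 0001 */

    /* On an aligned value the two operations agree. */
    assert(with_add(aligned) == with_or(aligned));

    /* On the tagged value they differ: 0001 + 0111 = 1000, 0001 | 0111 = 0111. */
    assert(with_add(tagged) == 0x1008);
    assert(with_or(tagged)  == 0x1007);
}
```

The rewrite is only valid if the value really is aligned, which is why the type information attached to the pointer matters.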
0:03:19
stassats
then your types are incorrect
0:04:53
drmeister
Hmmm, maybe my types are incorrect.
0:05:21
stassats
do they assume alignment?
0:05:26
drmeister
My types are wrong.
0:05:41
drmeister
I need to dereference closure
0:05:44
stassats
llvm may also be broken
0:06:24
stassats
doesn't llvm have better instruction for offsets for loading instructions?
0:06:43
stassats
cause effective addresses can express addition and whatnot
0:08:19
stassats
you have %entry-point-addr, align 8
0:08:24
stassats
what does align 8 mean?
0:09:38
stassats
googling stuff just confirms my opinion that llvm is badly documented and has a terrible API
0:11:17
drmeister
No, I needed to dereference the pointer. The {}** must be being treated as aligned and or is equiv to add in that case.
0:11:49
stassats
but there should be no OR in real machine code
0:12:07
drmeister
align 8 means what it says - the pointer is aligned to 8-byte words
0:12:11
stassats
it should be MOV ABC, [PTR+7]
0:18:10
stassats
just tried to look at what ((int*)x)[10] from C would look like in IR
0:18:19
stassats
it's %4 = getelementptr inbounds i32, i32* %3, i64 10 %5 = load i32, i32* %4, align 4
0:23:59
stassats
yours should look something like %4 = getelementptr inbounds i8, i8* %3, i64 7
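[Editor's sketch, not part of the log] A rough C analogue of the byte-wise getelementptr stassats suggests: treat the tagged pointer as a byte pointer and load from offset 7. The struct layout (an 8-byte header followed by the entry point) and the function name are assumptions made for this example, not Clasp's actual closure layout.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: one 8-byte header, then the entry point at byte 8. */
struct closure {
    uint64_t header;
    void    *entry_point;
};

/* Load the entry point through a pointer tagged with low bit 1.
 * The tag contributes 1 and the byte offset contributes 7, so the load
 * lands on byte 8 of the object without any separate untagging step. */
void *load_entry_point(uintptr_t tagged) {
    const char *base = (const char *)tagged; /* still carries the tag */
    void *entry;
    memcpy(&entry, base + 7, sizeof entry);  /* byte-wise offset, like a gep on i8 */
    return entry;
}
```

A compiler can typically fold such a constant byte offset into the addressing mode of the load, which is the saving on a temporary register that comes up later in the conversation.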
0:25:20
drmeister
Yeah - but casting and pointer arithmetic is easier on my brain at the moment.
0:25:52
drmeister
I'll do a search for ptrtoint later and change things to getelementptr - there's only a handful of these.
0:28:06
stassats
well, presumably this gep thing will allow to encode the offset in the load instruction and save on a temporary register
0:32:27
stassats
just looking at return (x & ~7) + 3;
0:32:34
stassats
%2 = and i32 %0, -8 %3 = or i32 %2, 3
0:32:44
stassats
so it does convert to OR when it thinks the low bits are clear
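[Editor's sketch, not part of the log] The condition behind this rewrite is that a + b equals a | b exactly when a and b share no set bits, since a + b = (a | b) + (a & b). A small C check, with hypothetical function names:

```c
#include <stdint.h>

/* The expression stassats compiled: clear the low three bits, then add 3. */
uint32_t masked_add(uint32_t x) { return (x & ~7u) + 3u; }

/* The form the optimizer emits once it knows the low bits are zero. */
uint32_t masked_or(uint32_t x) { return (x & ~7u) | 3u; }

/* ADD and OR agree exactly when the operands have no set bit in common. */
int add_equals_or(uint32_t a, uint32_t b) { return (a & b) == 0; }

/* Returns 1 if the two forms agree for every x in [lo, hi). */
int forms_agree(uint32_t lo, uint32_t hi) {
    for (uint32_t x = lo; x < hi; ++x) {
        if (masked_add(x) != masked_or(x)) return 0;
    }
    return 1;
}
```

For the tagged pointer discussed earlier, `add_equals_or(1, 7)` is 0, so the rewrite is not valid there.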
0:33:13
stassats
but that's in IR, shouldn't that be the job of whatever optimizes stuff to machine code?
0:34:20
drmeister
Optimization happens at the IR level - it may happen at others - but I'm really familiar with the IR level.
0:35:29
stassats
ADD to OR is a bit silly, at least in this case
0:35:38
stassats
though i can see (x & ~7) + 7 just going to OR 7
0:36:41
drmeister
The dereference did the trick - now cleavir is compiling things.
0:37:09
drmeister
I still have an exception handling bug. I have a stack unwind that is skipping a landing pad.
0:37:25
drmeister
This almost certainly means I have a CALL where I need an INVOKE.
0:37:31
drmeister
These are tough to find.
0:38:43
stassats
now google isn't working for me, great
0:41:47
stassats
"It's not you, it's us Bing isn't available right now, but everything should be back to normal very soon."
0:41:51
stassats
are you kidding me
0:43:13
Bike
how mysterious, it works here, except that getelementptr returns the wikipedia page on praseodymium
0:43:41
stassats
things work intermittently
0:48:01
stassats
i doubt any architecture would have different OR and ADD performance characteristics
0:50:04
stassats
on x86-64, that OR goes down to leaq 3(%rdi), %rax
0:50:32
stassats
so it does pass around the information about set bits
1:13:08
drmeister
Bike: dictionary.lisp - there is an apply with no test for call-arguments-limit
1:14:04
drmeister
What do I do with that?
1:14:18
drmeister
Clasp now has a 64 argument limit for funcalls.
1:16:18
drmeister
https://www.irccloud.com/pastebin/i6iA2Bc0/
1:17:07
drmeister
The issue is I have to set the limit somewhere and what do I do with APPLY's like this?
1:17:22
stassats
64 is far too small
1:17:46
drmeister
A few days ago you said that was enough for anyone?!?
1:17:56
drmeister
I based my life on your teachings.
1:18:15
stassats
you're mixing me up with Bill Gates
1:18:18
drmeister
But whatever I set it to - someone is going to hit the limit.
1:19:07
stassats
that's why you make it unlimited
1:19:12
drmeister
What would you set it to if you had to set a limit?
1:19:31
stassats
65535, if I really had to
1:19:43
drmeister
The only way I can see to make it unlimited would be to generate them for higher arities as they are needed and cache them.
1:19:54
stassats
but if you can make it 64, you can make it any number
1:20:12
drmeister
I have these monsters though...
1:21:13
drmeister
https://gist.github.com/drmeister/7495fed5dff16eb9203da7e7062449b2
1:21:52
stassats
is that to prove my point that llvm is poorly designed?
1:22:05
drmeister
That's more of a C/C++ problem
1:22:31
drmeister
I think so. ECL has the same thing.
1:22:46
stassats
ecl doesn't have access to assembly
1:22:59
drmeister
Neither does llvm.
1:23:18
drmeister
It generates assembly - but it doesn't have any better access to it.
1:25:03
stassats
well, what can i say
1:25:12
stassats
this is all really bad
1:26:16
stassats
i'd inline some assembly
1:33:48
Bike
ecl has the same thing but has a higher call-arguments-limit somehow.
1:34:13
drmeister
it may be a limit in the byte compiler but not the C compiler.
1:34:18
drmeister
I'm guessing here.
1:38:46
drmeister
Bike: Could we convert that into something that doesn't blow APPLY?
1:40:14
stassats
(reduce #'bag-join (cleavir-ir:predecessors instruction) :key (lambda (pred) (arc-bag pred instruction dictionary))) you mean?
1:40:21
Bike
ecl allows 200 arguments every way i can think of to compile something
1:40:38
Bike
and yes, that could be done, but i'd rather there be a higher call-arguments-limit
1:41:10
drmeister
What I'm wondering - but haven't stated explicitly is...
Thursday, 18th of May 2017, 1:41:13 UTC