Search
Monday, 18th of October 2021, 19:40:23 UTC
19:40:45
karlosz
thats how i got the move to be inside the vop, rather than a move vop
19:41:10
karlosz
as in https://pastebin.com/Ar9frwHb
19:41:14
karlosz
it might be a load-if thing
19:42:52
stassats
load-if is probably already too late
19:44:42
karlosz
yeah, arm doesn't seem to special case it anyway
19:44:45
stassats
oh, and arm64's cmov is three operand
19:45:57
stassats
so it doesn't do the whole dance of loading immediate in move-if
19:46:40
karlosz
true, but it seems unrelated to the issue at hand, since the problem seems to be at the vop level, right?
19:47:52
stassats
i think it just inverts the condition incorrectly
19:48:39
karlosz
what do you mean by that? which condition?
19:48:42
stassats
it should be conditionally moving in nil, not t
19:49:10
stassats
but since it's the reverse, it has to first load the destination with a nil
19:49:22
stassats
i.e. no three operands
19:49:34
karlosz
oh, you're totally right
19:50:17
karlosz
but it could just be LEA RDX, [T] \\ CMOVEQ RDX, R12 ideally
19:50:36
stassats
i was very glad to avoid having to do that
19:53:19
stassats
i also didn't do the multi-flag option right, but arm64 doesn't have any, so it's under a rag at the moment
19:53:45
stassats
or maybe it's right, three operands and all
19:54:06
karlosz
so i think i can just invert the condition inside the move-if on x86
19:54:19
karlosz
that should allow erasing the need for the temp
19:57:22
stassats
i hadn't done move-if on arm64 previously because then it pessimized cbz/cbnz (branch if zero), but with the instcombine pass it's no longer an issue
19:58:00
stassats
and it will even work with tbz/tbnz (branh if bit set)
19:59:44
stassats
but i really want a real code model for peephole optimization, which registers definitions are available, maybe which constant values are where
19:59:54
stassats
so it can do more than just adjacent instructions
20:02:41
karlosz
you'd need ssa for that
20:03:03
karlosz
or, i mean you can do reaching definitions the way copyprop odes
20:03:19
karlosz
but that's not a super efficient data structure for register definitions
20:03:40
karlosz
ir2 was not meant to be optimized on, really
20:05:55
stassats
the hard part is adding descriptions to all instructions
20:06:07
stassats
including their flag usage
20:10:59
karlosz
yeah, that's just a data input problem
20:11:16
karlosz
mips kinda has those annotations
20:11:23
karlosz
for the scheduling assembler
20:11:33
stassats
it's tedious and easy to get wrong
20:11:38
karlosz
it was actually pretty easy to solve the move-if problem
20:11:51
karlosz
canonicalize then to always be the register and inverting the condition worked
20:12:08
karlosz
i only did it for the single flag case though
20:12:17
karlosz
i don't understand multiple flags
20:12:25
karlosz
the not-p case makes not much sense to me
20:12:47
stassats
i was very scared by that too
20:13:26
karlosz
im just going to assume not much stuff cares about multiple flags
20:14:00
karlosz
yeah, then not a priority
20:14:08
stassats
(declare (double-float x)) (> x 1d0)
20:14:08
karlosz
not that float code doesnt matter
20:14:20
karlosz
but xchg is not something i want to deal with
20:22:24
karlosz
of course crashes in cold init
20:22:30
karlosz
because it can never be that easy
20:23:31
karlosz
https://pastebin.com/gdhyCZvN
20:24:12
karlosz
works fine for stuff like functionp
20:28:14
karlosz
(if (functionp x) t 'a) is broken
20:37:34
karlosz
moving :target res to the then branch makes it work
23:06:47
karlosz
i guess i could solve the fixup issue by making fixups relative to null-tn
23:15:14
karlosz
oh, ppc already has asm-routine-nil-offset
23:51:35
karlosz
need to make asm-routine-nil-offset* to really make things address independent
Tuesday, 19th of October 2021, 7:40:23 UTC