Search
Monday, 18th of October 2021, 18:14:15 UTC
18:21:21
karlosz
ah, so the issue is that constant-tn-p doesn't understand that null-tn is a constant
18:21:33
karlosz
so compute-from-flags never fires
18:21:47
karlosz
that's okay, we can just rely on move-if/t
18:25:31
karlosz
stassats: do you know how to get rid of the redundant move here, or is it impossible? r12 holds null-tn. https://pastebin.com/a17fmVX7
18:25:48
karlosz
this is for #'functionp
18:26:51
stassats
karlosz: can you do (setf sb-c::*debug-print-types* t) and print again?
18:30:29
karlosz
https://pastebin.com/T1DXtEVp
18:31:59
karlosz
well, that mov looks pretty unnecessary to me
18:32:06
karlosz
do i need to tell cmov it can use it?
18:32:39
stassats
ah wait, is copyprop before ir2opt?
18:33:28
stassats
i recently added cmov to arm64, and i handle nil and zero there specially
18:33:28
karlosz
but surely we don't need copy prop to optimize this away
18:34:02
stassats
btw, i still don't think x86-64 should have null-tn
18:35:27
karlosz
yeah, that's understandable
18:36:28
stassats
at least until we don't have dumb register allocation around loops
18:36:38
karlosz
i need it for the flexible code model though
18:36:51
karlosz
may keep it out of tree until it's bettter justified then
18:36:59
stassats
memory loads seem to kill conditional branch prediction
18:43:25
karlosz
hm, i did the same special casing as in arm
18:43:33
karlosz
it just moved the mov into the vop
18:46:04
karlosz
now i just get this https://pastebin.com/Ar9frwHb
18:46:09
karlosz
which is an improvement
18:46:55
karlosz
if i can get rid of the move, functionp and fixnump will go from 25 to 22 bytes
18:47:23
karlosz
doug said icache was a big issue, so it would be better to squeeze those out
18:51:20
karlosz
oh, i just needed to change the ptype move-if/t accepts to (:or t list)
18:51:27
karlosz
now functionp is 21 bytes
18:51:55
karlosz
down from baseline of 25 in master
19:18:43
stassats
i have it as *, x86-64 has t
19:18:56
stassats
* doesn't seem to break anything, so maybe try that
19:35:45
stassats
what could be interesting, per-frame wired null registers
19:36:09
stassats
i suppose deciding when to use one is going to be hard
19:36:37
stassats
i have some optimization for loading constants, but it's only basic-block wide
19:37:25
stassats
like it breaks things?
19:37:37
karlosz
no, as in the MOV is still there
19:38:50
karlosz
it doesn't do that on arm64
19:39:10
stassats
i have 2: MOVE-IF/DESCRIPTOR t8[R0] :NORMAL DESCRIPTOR-REG T t12[NULL] :COMPONENT DESCRIPTOR-REG LIST {(:EQ)} => t13[R0] :NORMAL DESCRIPTOR-REG T
19:39:58
karlosz
i mean i have the same: 3: MOVE-IF/T 'T!7 :CONSTANT IMMEDIATE T
19:39:58
karlosz
t8[R12(d)] :COMPONENT DESCRIPTOR-REG LIST
19:39:59
karlosz
=> t9[RDX(d)] :NORMAL DESCRIPTOR-REG T
19:40:21
stassats
but i'm returning the null-tn in convert-conditional-move-p
19:40:45
karlosz
thats how i got the move to be inside the vop, rather than a move vop
19:41:10
karlosz
as in https://pastebin.com/Ar9frwHb
19:41:14
karlosz
it might be a load-if thing
19:42:52
stassats
load-if is probably already too late
19:44:42
karlosz
yeah, arm doesn't seem to special case it anyway
19:44:45
stassats
oh, and arm64's cmov is three operand
19:45:57
stassats
so it doesn't do the whole dance of loading immediate in move-if
19:46:40
karlosz
true, but it seems unrelated to the issue at hand, since the problem seems to be at the vop level, right?
19:47:52
stassats
i think it just inverts the condition incorrectly
19:48:39
karlosz
what do you mean by that? which condition?
19:48:42
stassats
it should be conditionally moving in nil, not t
19:49:10
stassats
but since it's the reverse, it has to first load the destination with a nil
19:49:22
stassats
i.e. no three operands
19:49:34
karlosz
oh, you're totally right
19:50:17
karlosz
but it could just be LEA RDX, [T] \\ CMOVEQ RDX, R12 ideally
19:50:36
stassats
i was very glad to avoid having to do that
19:53:19
stassats
i also didn't do the multi-flag option right, but arm64 doesn't have any, so it's under a rag at the moment
19:53:45
stassats
or maybe it's right, three operands and all
19:54:06
karlosz
so i think i can just invert the condition inside the move-if on x86
19:54:19
karlosz
that should allow erasing the need for the temp
19:57:22
stassats
i hadn't done move-if on arm64 previously because then it pessimized cbz/cbnz (branch if zero), but with the instcombine pass it's no longer an issue
19:58:00
stassats
and it will even work with tbz/tbnz (branh if bit set)
19:59:44
stassats
but i really want a real code model for peephole optimization, which registers definitions are available, maybe which constant values are where
19:59:54
stassats
so it can do more than just adjacent instructions
20:02:41
karlosz
you'd need ssa for that
20:03:03
karlosz
or, i mean you can do reaching definitions the way copyprop odes
20:03:19
karlosz
but that's not a super efficient data structure for register definitions
20:03:40
karlosz
ir2 was not meant to be optimized on, really
20:05:55
stassats
the hard part is adding descriptions to all instructions
20:06:07
stassats
including their flag usage
20:10:59
karlosz
yeah, that's just a data input problem
20:11:16
karlosz
mips kinda has those annotations
20:11:23
karlosz
for the scheduling assembler
20:11:33
stassats
it's tedious and easy to get wrong
20:11:38
karlosz
it was actually pretty easy to solve the move-if problem
20:11:51
karlosz
canonicalize then to always be the register and inverting the condition worked
20:12:08
karlosz
i only did it for the single flag case though
20:12:17
karlosz
i don't understand multiple flags
20:12:25
karlosz
the not-p case makes not much sense to me
20:12:47
stassats
i was very scared by that too
20:13:26
karlosz
im just going to assume not much stuff cares about multiple flags
20:14:00
karlosz
yeah, then not a priority
20:14:08
stassats
(declare (double-float x)) (> x 1d0)
20:14:08
karlosz
not that float code doesnt matter
20:14:20
karlosz
but xchg is not something i want to deal with
20:22:24
karlosz
of course crashes in cold init
20:22:30
karlosz
because it can never be that easy
20:23:31
karlosz
https://pastebin.com/gdhyCZvN
20:24:12
karlosz
works fine for stuff like functionp
20:28:14
karlosz
(if (functionp x) t 'a) is broken
20:37:34
karlosz
moving :target res to the then branch makes it work
23:06:47
karlosz
i guess i could solve the fixup issue by making fixups relative to null-tn
23:15:14
karlosz
oh, ppc already has asm-routine-nil-offset
23:51:35
karlosz
need to make asm-routine-nil-offset* to really make things address independent
Tuesday, 19th of October 2021, 6:14:15 UTC