libera/#shirakumo - IRC Chatlog
14:12:50
Colleen
<shinmera> selwyn: good. means I'm at least not mentally bought out by the bourgeoisie
14:16:15
hayley
ACTION uploaded an image: (63KiB) < https://libera.ems.host/_matrix/media/r0/download/matrix.org/QLuNAZMTlLUwcJfvGoLBnPYs/q1pjjbw55qv71.jpg >
14:23:42
hayley
Trying to avoid the "broken and slow" corner of things this week. I managed to write two of three loops in a way that GCC can auto-vectorise, but the last just defies it (and Clang too) and sticks out like a sore thumb in profiling. You know anything on trying to provoke either compiler into working?
14:24:34
Colleen
<shinmera> try to write everything in a way that's: {read into vars; compute; store}
14:25:08
Colleen
<shinmera> it results in very large-to-look-at code but usually compilers (and machines) eat it well.
14:26:10
Colleen
<shinmera> but I haven't actually looked at auto vectorise results much to really have more of an idea than that.
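A minimal sketch of the {read into vars; compute; store} shape described above, on a loop that is not from the chat. The function name rotate_pairs, the float element type and the parameters c and s are all assumptions made for illustration. Whether a particular compiler vectorises it still depends on the compiler version, flags and target.

```c
#include <stddef.h>

/* Hypothetical example: rotate each (x[i], y[i]) pair by the rotation
   whose cosine is c and whose sine is s. */
void rotate_pairs(float *x, float *y, size_t n, float c, float s) {
    for (size_t i = 0; i < n; i++) {
        /* read into vars: every memory read happens first */
        float xi = x[i];
        float yi = y[i];

        /* compute: only local values are involved here */
        float xr = c * xi - s * yi;
        float yr = s * xi + c * yi;

        /* store: every memory write happens last */
        x[i] = xr;
        y[i] = yr;
    }
}
```

Keeping the loads, the arithmetic and the stores in separate groups makes the loop body longer to read, but no store sits between two loads, so the compiler does not have to prove that an earlier write leaves a later read unchanged.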
14:27:16
hayley
The code looks something like for (int n = 0; n < end; n++) A[n] = M[n] ? A[n] : B[n]; much like select() in OpenCL, if I remember right, and this somehow completely confounds both compilers.
14:29:27
hayley
I kinda suspect having A[n] on the right hand side might throw it off, as that is unique to this loop, but when I skimmed the papers it looked like the algorithms were smarter than that.
14:33:57
hayley
The sweep() in https://godbolt.org/z/h9ezqbqf1 - Clang says it "cannot identify array bounds" which is an odd thing to report on, I think.
14:39:45
hayley
Doesn't seem to help, though indeed the compiler couldn't vectorise if there was aliasing.
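The message being answered here is not in the log. Assuming it was about ruling out aliasing, one common way to do that in C99 is the restrict qualifier, sketched below. The name sweep_restrict and the int element type are assumptions; the real sweep() lives behind the godbolt link and may differ.

```c
#include <stddef.h>

/* restrict is a promise from the caller that A, B and M do not overlap.
   Calling this with overlapping arrays is undefined behaviour. */
void sweep_restrict(int *restrict A, const int *restrict B,
                    const int *restrict M, size_t end) {
    for (size_t n = 0; n < end; n++)
        A[n] = M[n] ? A[n] : B[n];
}
```

As reported in the chat, this kind of change was not enough on its own for this loop. Compiler remarks such as GCC's -fopt-info-vec-missed or Clang's -Rpass-missed=loop-vectorize can show what still blocks vectorisation.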
15:03:33
hayley
It did! Took me some time (and reading an llvm-dev thread) to figure out why though. Thanks.
15:04:25
hayley
When I say M[n] ? A[n] : B[n] the compiler can't emit unconditional loads to A and B, but it can when I write it your way, as the source already has unconditional loads.
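The difference described above can be sketched as two versions of the same loop. The function names and the int element type are assumptions, and this is not the actual sweep() from the godbolt link. Whether either version vectorises still depends on the compiler, its version and the flags used.

```c
#include <stddef.h>

/* Original shape: B[n] is read only when M[n] is zero, so the source
   never promises that every B[n] is safe to load. */
void sweep_ternary(int *A, const int *B, const int *M, size_t end) {
    for (size_t n = 0; n < end; n++)
        A[n] = M[n] ? A[n] : B[n];
}

/* Rewritten shape: all three loads are unconditional in the source,
   so the selection becomes a pure computation on local values. */
void sweep_unconditional(int *A, const int *B, const int *M, size_t end) {
    for (size_t n = 0; n < end; n++) {
        int a = A[n];       /* read into vars */
        int b = B[n];
        int m = M[n];
        A[n] = m ? a : b;   /* compute; store */
    }
}
```

Both functions produce the same result for valid inputs. The second one reads B[n] for every n, so it is only correct when B really has at least end readable elements, which is exactly the guarantee the compiler could not assume by itself for the first one.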