freenode/#clasp - IRC Chatlog
14:22:32
Bike
clasp translate believes really deeply that variables are all in memory locations, so i have to rewrite a lot
14:22:55
Bike
why do we have precalc-value-instruction handle immediates? does it actually get immediates as an argument? we shouldn't need an instruction...
14:26:06
Bike
i'm trying to move my discussion of cleavir things to #sicl, though the boundary is sometimes porous
14:50:15
drmeister
Bike: there is a very low barrier to improve things in clasp. Let’s talk about it tomorrow or Tuesday.
14:51:32
drmeister
My talk went extremely well. I had about a dozen people tell me it was the most entertaining and impressive talk.
15:26:52
drmeister
Bike: I had the backend and then tagged immediates - so they have a bolted on feel.
15:28:59
drmeister
Be wary that bclasp and cclasp share a lot of code and bclasp needs to keep working. But it can be updated as well.
15:32:55
Bike
i think maybe all we need to do is not generate precalc-value-instruction for immediates, and set up instructions to be able to take non-alloca inputs which i'm doing anyway
15:53:08
drmeister
Bike: It shouldn’t be too hard to add methods on translate-xxx-instruction and generate PTX - right? Feel free to ask questions for more context.
15:54:51
drmeister
There are some new barrier and synchronization instructions that we would need to add.
15:59:52
drmeister
Then an api call says “run this kernel on this warp of 1024 threads with this memory setup”
16:02:13
drmeister
The kernels support a tiny subset of Common Lisp. Arithmetic with unboxed values, if, tagbody/go
16:05:10
drmeister
And the HIR gives us a way to optimize the sequence of operations with barriers and to keep the chip busy.
16:07:38
Bike
well, that one we might be able to lose, it's pretty much just one operation (discrimination) that needs special codegen
16:12:45
drmeister
That’s a very interesting thought - I’ll look at the ptx calling convention with that in mind.
16:14:20
drmeister
Functions taking lots of inputs and returning one output is stupid. It should be symmetric. I bet PTX can support that.
16:15:51
drmeister
There doesn’t seem to be much of a distinction between local memory and registers.
16:18:02
drmeister
I described this to a google engineer who helped me figure out exception handling and debug info. He got very agitated and said I needed to talk to a team of google engineers that he knew.
16:19:35
drmeister
The inlined code - but I’ve been toying with the idea of moving compiler optimizations to the gpu.
16:35:51
beach
If so, why do we need multiple values? I guess that depends on what kind of code you want to run on the GPU.
16:40:31
beach
I should be quiet. I had a very long day and a very long week, so I am too tired to think clearly. Plus, dinner is imminent.
16:46:51
Bike
i cannot understand how precalc value reference is supposed to work. it gets an immediate input, but the "immediate" is one of these "literal" objects somehow even though that's not what translate-datum returns...