freenode/#clasp - IRC Chatlog
Search
10:38:33
drmeister
They probably have some kind of microcode - but PTX is the documented, best low-level target.
10:43:41
heisig
cl-cuda generates CUDA C code and calls the NVIDIA compiler on it. It has nice CFFI wrappers for it, too.
10:46:55
heisig
That would be a huge improvement over the status quo. The question is how far you want to go. Should GPUs be able to signal conditions to the host? Should GPUs be able to run generic functions?
10:49:38
heisig
GPUs are not particularly good at running general-purpose code. A technique I have seen is to have a Lisp interpreter on the GPU, as a fallback for the tricky stuff.
10:52:22
heisig
On a CPU, you typically have a cache coherence protocol that ensures some order of reads and writes. GPUs are much more liberal when it comes to that.
10:54:29
drmeister
I'm going to reduce the problem of molecular design to some table look ups and matrix multiplications and distance calculations.
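[The reduction described here can be sketched in plain Python: pairwise squared distances decompose via the identity ||a − b||² = ||a||² + ||b||² − 2(a·b), so the distance table becomes one matrix multiplication plus lookups of precomputed norms - exactly the operations GPUs are optimized for. A minimal, pure-Python illustration (function names are made up for the sketch; on a GPU the matmul would be the hot kernel):]

```python
# Sketch: a pairwise squared-distance table via the identity
# ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * (a . b),
# which turns distance calculation into one matrix multiplication.
# Pure-Python stand-in for what a GPU matmul kernel would do.

def matmul(A, B):
    """Multiply an (n x d) matrix by a (d x m) matrix (lists of rows)."""
    n, d, m = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(d)) for j in range(m)]
            for i in range(n)]

def pairwise_sq_dists(points):
    """Squared-distance table for a list of d-dimensional points."""
    sq_norms = [sum(x * x for x in p) for p in points]   # table lookup part
    transposed = [list(col) for col in zip(*points)]
    dots = matmul(points, transposed)                    # Gram matrix (matmul part)
    n = len(points)
    return [[sq_norms[i] + sq_norms[j] - 2.0 * dots[i][j]
             for j in range(n)]
            for i in range(n)]

pts = [[0.0, 0.0], [3.0, 4.0], [3.0, 0.0]]
table = pairwise_sq_dists(pts)
# table[0][1] == 25.0, i.e. distance 5 squared
```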
10:55:04
drmeister
Then I'm going to generate custom kernels on the fly to search for solutions to specific problems.
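[Generating kernels on the fly typically means splicing a problem-specific expression into a kernel source template and handing the result to a compiler such as nvcc or nvrtc - this is essentially what cl-cuda does by emitting CUDA C. A hedged sketch of the source-generation step only (the template, function names, and expression are illustrative, not a real Clasp or cl-cuda API; the compilation step is omitted):]

```python
# Sketch of on-the-fly kernel generation: specialize a CUDA C template
# with a per-problem scoring expression, producing source that could
# then be compiled with nvcc/nvrtc. Names are hypothetical.

KERNEL_TEMPLATE = """\
extern "C" __global__
void {name}(const float *in, float *out, int n) {{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {{
        float x = in[i];
        out[i] = {expression};
    }}
}}
"""

def make_kernel_source(name, expression):
    """Return CUDA C source specialized for one search problem."""
    return KERNEL_TEMPLATE.format(name=name, expression=expression)

src = make_kernel_source("score", "x * x + 2.0f * x")
# src now holds compilable CUDA C text for this specific expression.
```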
10:56:02
heisig
For scientific computing, it is probably sufficient if you have a function CLASP-CUDA:COMPILE that signals an error if the given lambda expression contains anything other than arithmetic functions on floats or fixnums, IF, and TAGBODY.
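[The restricted-subset idea above can be illustrated with a small whitelist check over an expression tree: accept a lambda only if its body is built from arithmetic, comparisons, and conditionals, and signal an error before any GPU code generation otherwise. A Python analogue using the standard `ast` module (CLASP-CUDA:COMPILE itself is a proposal, not an existing function; this only demonstrates the validation pattern):]

```python
# Python analogue of the proposed restriction: reject any lambda whose
# body uses anything but arithmetic, comparisons, and conditionals.
# Illustrates the whitelist idea only; not a real Clasp API.
import ast

ALLOWED = (ast.Expression, ast.Lambda, ast.arguments, ast.arg,
           ast.Name, ast.Load, ast.Constant,
           ast.BinOp, ast.UnaryOp, ast.IfExp, ast.Compare,
           ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
           ast.Lt, ast.Gt, ast.LtE, ast.GtE, ast.Eq)

def check_compilable(source):
    """Raise ValueError if the lambda uses a non-whitelisted form."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED):
            raise ValueError(f"not GPU-compilable: {type(node).__name__}")
    return True

check_compilable("lambda x, y: x * x + y if x > 0 else -y")  # accepted
try:
    check_compilable("lambda x: print(x)")   # function calls are rejected
    rejected = ""
except ValueError as e:
    rejected = str(e)
```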
10:58:00
heisig
Other projects that might be relevant for such an undertaking: https://github.com/cbaggers/varjo and https://github.com/digego/extempore.
10:58:23
drmeister
Sure. With Cleavir generic functions it will be very straightforward to write some additional methods and generate code for a new backend.
10:59:19
heisig
The former project compiles a subset of Lisp to OpenGL shaders, the latter is (among other things) a high-performance compiler for audio and physics processing.
11:13:24
heisig
If you have the time and resources and a straightforward problem, ASICs or FPGAs are fastest, but typically no one has that much time and resources.
11:14:05
heisig
GPUs are pretty damn fast, but at the same power budget one should not compare a GPU to a single CPU, but to two 18-core CPUs.
11:23:03
drmeister
Society is going to need really smart molecules to solve its problems in the coming decades. I think the resources will be available.
11:24:20
Shinmera
GPUs are good at doing really simple, very low-branching arithmetic in massively parallel fashion
11:27:48
heisig
My colleague (who works on molecular dynamics and with whom I share my office) also recommends GPUs :)
11:35:27
drmeister
Oh - I'm not delusional - I'm not selecting between GPUs, FPGAs, or ASICs like I have a choice right now.
11:36:07
drmeister
I'm talking at NVIDIA on Friday - I want to get the lay of the high-performance landscape.
11:37:46
drmeister
Cando has multi-threading - so I can use 18-core CPUs no problem. Clasp's arithmetic isn't so good right now - but I can write custom math (and I do a lot of this) in C++ and run it from Common Lisp.
11:41:27
drmeister
This guy made a lot of money and then had custom ASICs built to simulate molecular dynamics.
11:42:46
drmeister
The value of this has been dubious. GPUs are the current sweet spot for performance/effort.
11:45:05
heisig
Yes, I think you are right. The only problem I see in the long run is that GPU currently means CUDA, which in turn is a proprietary toolchain that puts you at the whim of a single company.
11:57:10
drmeister
And there are no good comparisons between OpenCL on other GPUs vs. CUDA on NVIDIA GPUs, because that's a really, really hard thing to compare.
12:14:20
Shinmera
OpenCL has the advantage that you can run it practically everywhere, even on laptops where you typically only have Intel CPUs.