freenode/#clasp - IRC Chatlog
1:42:22
Shawn_
Hi! I was playing with ECL to get C++, CL, and CUDA working together, and found Clasp. It is amazing. Based on my understanding, Clasp generates LLVM IR and Clang supports compiling CUDA code, so is it possible to compile CUDA code with nvcc/clang and then link it with the code Clasp generates?
1:44:19
drmeister
Hi Shawn_ - we were exploring that very question just this past weekend. I have the exact same question you do, and I couldn't find an answer.
1:45:13
drmeister
Wait - not the exact same question, now that I read yours again. Yes - I think you could link the code generated by nvcc.
1:57:14
Shawn_
Perhaps that is the ultimate solution, i.e., compiling CL directly to CUDA device assembly code, but it does not feel necessary. What CUDA mainly achieves is offloading the computationally intensive parts to the GPU. If we could link the generated code, we could just write a small wrapper on the C++ side. Whenever any manipulation happens on the Lisp side, it could be translated to the CUDA side, e.g., to variables living on the GPU, through the C++ wrapper.
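To make the wrapper idea concrete, here is a minimal sketch (not from the discussion) of what such a C++ shim might look like: extern "C" entry points that a Lisp image could call, e.g. through Clasp's C++ interop or CFFI, to manage GPU-resident data. All names here (gpu_alloc, scale_kernel, and so on) are invented for illustration.

```cpp
// Hypothetical C++/CUDA wrapper exposing a C ABI the Lisp side can call.
#include <cuda_runtime.h>
#include <cstddef>

// A trivial device kernel: multiply each element by a scalar.
__global__ void scale_kernel(float* data, float factor, size_t n) {
    size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

extern "C" {

// Allocate a buffer on the GPU; returns nullptr on failure.
float* gpu_alloc(size_t n) {
    float* dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) return nullptr;
    return dev;
}

// Copy host data (e.g. the contents of a Lisp array) to the GPU.
int gpu_upload(float* dev, const float* host, size_t n) {
    return cudaMemcpy(dev, host, n * sizeof(float),
                      cudaMemcpyHostToDevice) == cudaSuccess ? 0 : -1;
}

// Launch the kernel over n elements and wait for it to finish.
int gpu_scale(float* dev, float factor, size_t n) {
    const int threads = 256;
    const int blocks = (int)((n + threads - 1) / threads);
    scale_kernel<<<blocks, threads>>>(dev, factor, n);
    return cudaDeviceSynchronize() == cudaSuccess ? 0 : -1;
}

// Copy results back to the host.
int gpu_download(const float* dev, float* host, size_t n) {
    return cudaMemcpy(host, dev, n * sizeof(float),
                      cudaMemcpyDeviceToHost) == cudaSuccess ? 0 : -1;
}

void gpu_free(float* dev) { cudaFree(dev); }

} // extern "C"
```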
1:59:07
Shawn_
Not much, I guess - I am planning to use deep-learning-related CUDA code written by others.
2:00:23
drmeister
I spoke at Nvidia a couple of weeks ago - I also spoke at the LLVM developers' meeting - did you see the llvm-dev meeting talk?
2:00:51
drmeister
It just hit YouTube today - I mentioned that I wanted to integrate GPU programming into Clasp.
2:02:09
drmeister
This one just hit YouTube today - so it's just chance that it arrived the same day you did.
2:03:30
drmeister
At about 19 minutes in, I mention a few thoughts about generating Nvidia PTX code from the Cleavir compiler - but I agree with you - compiling CL directly to CUDA device assembly (I guess that's PTX) is not necessary, though it might be fun.
2:05:08
drmeister
The problem I had was nvcc - it seems like it's forced on us for both the CUDA compilation and the C++ that integrates with it. The CUDA C++ API doesn't seem to have public calls for binding device code to host code - it seems like you need nvcc to do that.
2:06:12
drmeister
This weekend I got nvcc to generate the host code that it passes to clang or gcc - it's straight-up C++ - but it references API calls that are not public in order to actually bind the device code to the host code.
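For reference, the CUDA driver API is a documented, public route for binding device code to host code at runtime: load a PTX module, look a kernel up by name, and launch it. The sketch below assumes a kernels.ptx file exporting an extern "C" kernel named scale_kernel taking (float*, float, size_t); both names are hypothetical.

```cpp
// Minimal sketch: bind device code to host code via the public CUDA
// driver API instead of nvcc's internal fatbinary registration calls.
#include <cuda.h>
#include <cstdio>

int main() {
    cuInit(0);

    CUdevice dev;
    CUcontext ctx;
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    // Load a PTX module from disk and look up a kernel by name.
    CUmodule mod;
    CUfunction fn;
    if (cuModuleLoad(&mod, "kernels.ptx") != CUDA_SUCCESS) {
        fprintf(stderr, "could not load kernels.ptx\n");
        return 1;
    }
    cuModuleGetFunction(&fn, mod, "scale_kernel");

    // Put one float on the device and launch the kernel on it.
    CUdeviceptr d_data;
    float h_data = 2.0f, factor = 3.0f;
    size_t n = 1;
    cuMemAlloc(&d_data, sizeof(float));
    cuMemcpyHtoD(d_data, &h_data, sizeof(float));

    void* args[] = { &d_data, &factor, &n };
    cuLaunchKernel(fn, 1, 1, 1,   // grid dimensions
                       1, 1, 1,   // block dimensions
                       0, nullptr, args, nullptr);

    cuMemcpyDtoH(&h_data, d_data, sizeof(float));
    printf("result: %f\n", h_data);  // expect 6.0

    cuMemFree(d_data);
    cuModuleUnload(mod);
    cuCtxDestroy(ctx);
    return 0;
}
```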
2:17:29
Shawn_
It reminds me of something I read about clang. I just read it again. It seems that clang can compile CUDA code itself, without using nvcc, and I believe that's where you may find how the device code gets bound to the host code. The link with the info is here: https://llvm.org/docs/CompileCudaWithLLVM.html#dialect-differences-between-clang-and-nvcc
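The linked document describes clang accepting .cu files directly. A minimal self-contained file to try it might look like the following; the exact --cuda-gpu-arch value and library paths depend on the local GPU and CUDA installation.

```cpp
// saxpy.cu - compile with something like (paths and arch will vary):
//
//   clang++ saxpy.cu -o saxpy --cuda-gpu-arch=sm_35 \
//       -L/usr/local/cuda/lib64 -lcudart -ldl -lrt -pthread
//
#include <cuda_runtime.h>
#include <cstdio>

__global__ void saxpy(float a, const float* x, float* y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1024;
    float *x, *y;
    // Unified memory keeps the example short.
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    saxpy<<<(n + 255) / 256, 256>>>(3.0f, x, y, n);
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);  // expect 5.0
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```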
2:28:32
Shawn_
Clasp is really important work. It may not be the choice for every programmer, but if C++, CUDA, and Lisp can work together seamlessly, it would be the ultimate solution for scientific computing for quite a long time. I guess I will try a minimal working example combining CL with CUDA soon. Looking for any examples or documentation now ...
2:30:04
drmeister
(1) LLVM is slow. (2) I think we are generating dead code and useless type checks, and our type inference isn't working well enough yet to remove them - it will soon.
2:34:32
drmeister
Bike: What do you think of me adding a 'fields/fieldsp' method to Instance_O? I can serialize the class name and the slots as a vector. Upside: it is a small change that will work. Downside: it will be brittle and won't tolerate changes in the class slot layout at all.
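A rough, hypothetical sketch of that idea (not actual Clasp source) might look like this; every identifier below is invented, and the positional restore at the end is exactly where the brittleness lives.

```cpp
// Sketch only: serialize an instance as its class name plus its
// slots in positional order, then restore by position.
#include <string>
#include <vector>

struct T_sp {};  // stand-in for a tagged Lisp object pointer

struct Record {
    std::string className;   // serialized first, to rebuild the class
    std::vector<T_sp> slots; // slot values in positional order
};

// Serialize: class name + slots as a flat vector.
Record serialize_instance(const std::string& name,
                          const std::vector<T_sp>& slots) {
    return Record{name, slots};
}

// Deserialize: look the class up by name and poke the slots back in
// by index. This is the brittle part: if the class definition gained,
// lost, or reordered a slot since the record was written, slot i no
// longer means what it used to.
void fill_instance(std::vector<T_sp>& instanceSlots, const Record& r) {
    instanceSlots = r.slots;  // blind positional restore
}
```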
2:42:21
drmeister
I've been scratching my head about how one would define a method to make this work.