freenode/#lisp - IRC Chatlog
21:33:10
figurelisp
I am still in my very early lisp phases. It was just that this question popped into my head
4:05:20
Bike
depends on implementation. the obvious is n², but i think some implementations use hash tables and stuff to reduce it
4:37:36
no-defun-allowed
i put a basic chip8->cl compiler in my emulator and it's 10 times faster, averaging 800 x86 cycles/chip8 cycle
4:38:22
no-defun-allowed
thanks! i'd like to emulate something more difficult but i'm not sure what to do next
4:39:40
no-defun-allowed
(the way it works is i save the dispatched function bodies and i try to get all strings of code where no jumps are done)
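[Editor's sketch] One way to picture "strings of code where no jumps are done" is to cut the instruction list after every jump. The function name, the instruction format, and the JUMP-P predicate below are hypothetical and are not taken from the emulator being discussed. A fuller version would also start a new block at every jump target.

```lisp
(defun split-into-blocks (instructions jump-p)
  "Split INSTRUCTIONS into runs that each end at the first instruction
satisfying JUMP-P (or at the end of the list)."
  (let ((blocks '())
        (current '()))
    (dolist (instruction instructions)
      (push instruction current)
      ;; A jump ends the current straight-line run.
      (when (funcall jump-p instruction)
        (push (nreverse current) blocks)
        (setf current '())))
    ;; Keep a trailing run that did not end in a jump.
    (when current
      (push (nreverse current) blocks))
    (nreverse blocks)))
```

For instance, (split-into-blocks '((:add 1) (:jump 4) (:mul 2) (:neg)) (lambda (i) (eq (first i) :jump))) yields two blocks, the first ending with the jump.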
4:46:55
no-defun-allowed
it seems a lot of the issues with emulators are C problems, like not having the system compiler (well, maybe not with llvm or libgcc) and having to write half a compiler
4:46:57
beach
I don't need such a thing immediately myself, so I won't be of much use for testing it.
4:48:15
no-defun-allowed
yes, i think that would simplify a lot of emulator writing as you don't have to invent an IR and compiler for it
4:49:11
no-defun-allowed
i read someone's documentation on writing a dynamic recompiler and most of the issues they listed seemed to be "inventing your own compiler" problems?
4:50:28
beach
I do something similar (I think) in my SICL bootstrapping procedure. I take the intermediate code generated by the Cleavir compiler, and I translate it to Common Lisp and compile it with the native compiler.
4:52:23
beach
What I mean to say is that I think your analysis is right that a lot of difficulties in this domain are solved by having the compiler around.
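[Editor's sketch] The idea of "having the compiler around" can be illustrated with a toy translator: build a LAMBDA form from guest instructions and hand it to the standard COMPILE function. The three-instruction accumulator machine and both function names are hypothetical; this is neither the chip8 emulator nor the SICL bootstrapping code.

```lisp
;; Hypothetical toy instruction set: (:add n), (:mul n), (:neg).
(defun translate-instruction (instruction acc)
  "Return a Lisp form that computes the new accumulator value."
  (destructuring-bind (op &optional operand) instruction
    (ecase op
      (:add `(+ ,acc ,operand))
      (:mul `(* ,acc ,operand))
      (:neg `(- ,acc)))))

(defun compile-block (instructions)
  "Translate a jump-free block into a LAMBDA form and compile it natively."
  (let ((body (loop for instruction in instructions
                    collect `(setf acc ,(translate-instruction instruction 'acc)))))
    ;; COMPILE with a NIL name returns the compiled function object.
    (compile nil `(lambda (acc)
                    ,@body
                    acc))))
```

Under these assumptions, (funcall (compile-block '((:add 2) (:mul 10) (:neg))) 1) computes -((1 + 2) * 10). No intermediate representation beyond ordinary Lisp forms is needed, which is the point being made in the conversation.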
4:52:43
no-defun-allowed
i was reading through https://github.com/marco9999/Dynarec_Guide which goes over doing an emulator/"compiler" in C++
4:57:34
drmeister
Shoot - I went to the trouble of implementing a parallel compile-file and now I'm running into a problem with special variables and quicklisp.
5:02:54
drmeister
I compile-file that form in a child thread - when I load the compile-file'd result it should create the special variable and bind it globally.
5:04:56
beach
So, there should be no compile-time reference to the value in the same file as the DEFVAR form.
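[Editor's sketch] For a top-level DEFVAR, COMPILE-FILE only records that the name is special; the initial value is not computed until the compiled file is loaded. Code that needs the value during compilation of the same file, such as a macro that reads it while expanding, therefore needs something like EVAL-WHEN. The variable and function names below are hypothetical and unrelated to the Quicklisp sources.

```lisp
;; EVAL-WHEN makes the value exist at compile time as well as at load time.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defvar *table-size* 16))            ; hypothetical variable

;; This macro reads the VALUE of *TABLE-SIZE* during macroexpansion,
;; which happens at compile time under COMPILE-FILE.
(defmacro table-size-literal ()
  *table-size*)

(defun make-table ()
  (make-array (table-size-literal) :initial-element nil))

;; A plain run-time reference needs only the load-time binding,
;; so it would work with a bare top-level DEFVAR too.
(defun table-size ()
  *table-size*)
```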
5:15:47
drmeister
It might be messing up because I'm hot patching the code and reloading things. I'll wipe everything out and rebuild from scratch.
5:17:34
beach
Does it work as we have discussed, i.e. generate the ASTs sequentially and then work on each AST in parallel?
5:18:04
drmeister
Also - it doesn't fully utilize the machine - I have some mutexes around parts of the compiler that are throttling performance. Compile and discriminating functions drop back to serial.
5:20:00
drmeister
Maybe I can't do anything about the lock around COMPILE because it needs to be serial for AST generation.
5:23:17
drmeister
The parallel compile-file doesn't use all cores - I see maybe a 2x speedup - even though I see like 15 threads for the clasp process.
5:24:48
drmeister
But there are several things that could be throttling performance: my COMPILE lock (which I could get rid of with llvm7), the Boehm GC, llvm malloc.
5:25:58
aeth
drmeister: I suspect parallel compile file in CL won't get you much in general because most people have many small files.
5:28:29
drmeister
Still, we are seeing about 80% of the time being spent in llvm - if there are more than two top level forms then that should all go parallel. I'm a bit mystified.
5:29:16
drmeister
You know what I can do though - I can use dtrace and profile everything. That should illuminate what is locking.
5:30:41
drmeister
ASDF is a test case. With the serial compile-file it takes 190 seconds, with the parallel one it takes 120 seconds.
6:14:38
elderK
It's crazy how like, fundamentally, nothing is really different. But conceptually, wayyyy different. Never returning, always calling our successor. Programs get turned inside out.
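[Editor's sketch] "Never returning, always calling our successor" reads like a description of continuation-passing style. Assuming that is what is meant, here is factorial written both ways; the function names are hypothetical.

```lisp
;; Direct style: each call returns its result to its caller.
(defun factorial-direct (n)
  (if (zerop n)
      1
      (* n (factorial-direct (1- n)))))

;; Continuation-passing style: each call hands its result to K,
;; the function representing "the rest of the computation".
(defun factorial-cps (n k)
  (if (zerop n)
      (funcall k 1)
      (factorial-cps (1- n)
                     (lambda (value)
                       (funcall k (* n value))))))
```

(factorial-cps 5 #'identity) gives the same answer as (factorial-direct 5). Common Lisp does not guarantee tail-call elimination, so whether the CPS version runs in constant stack space depends on the implementation and its settings.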
6:44:57
aeth
drmeister: It might be best to ask someone who has all of Quicklisp for some large files
6:47:17
drmeister
Well, ASDF works. But I can't get quicklisp to work yet with the parallel compile-file.
6:51:38
drmeister
Yeah - it's an interesting bug. ~/quicklisp/quicklisp/package.lisp - it defines many packages, among them :ql-config.
6:51:56
drmeister
If I load the bitcode file generated by my parallel compile-file - the package is defined.
6:55:43
drmeister
Clearly something is wrong - but the compiler is generating the correct bitcode! But lowering it to object code or linking it into the final fasl is messing up.
7:34:15
dim
if you want to benchmark easily, I guess that (ql:quickload "pgloader") could be a nice test as it loads about 64 systems with the build dependencies ;-)
7:45:06
hectorhonn
what is the idiomatic usage of streams? is it possible to take N first characters from a stream?
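[Editor's sketch] One standard way to take the first N characters from a character stream is READ-SEQUENCE into a preallocated string. The helper name is hypothetical.

```lisp
(defun read-first-chars (stream n)
  "Read up to N characters from STREAM and return them as a fresh string.
The result is shorter than N if the stream ends early."
  (let* ((buffer (make-string n))
         ;; READ-SEQUENCE returns the index of the first element
         ;; of BUFFER that was not filled.
         (end (read-sequence buffer stream)))
    (subseq buffer 0 end)))
```

With (with-input-from-string (s "hello world") (read-first-chars s 5)) the result is "hello", and the stream is left positioned after those five characters.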
8:11:06
beach
Although I tested it for lots of random values both of the base and for the number itself.
8:13:32
beach
For example, printing (expt 12 1000000) seems to take less than 5 seconds, whereas SBCL takes more than 6.
8:17:27
dim
maybe your maths background is better than that of the sbcl contributor who implemented that? or maybe your code is using cache lines in a way that is better for modern CPUs?
8:19:14
jdz
beach: How important is it for your implementation to use double floats instead of single floats?
8:28:01
beach
dim: I didn't take cache lines into account when I wrote it. It was the first thing I could think of.
8:31:00
beach
And I am not using (manual) lambda lifting either. But then, perhaps the SBCL compiler does that automatically.
8:31:40
beach
Oh, and I am calling lower recursively on the second part. I could make that step iterative instead.
8:32:18
beach
I mean, the performance of printing an integer probably isn't terribly important. So I am not going to do all that.
8:33:17
beach
Here is the test I ran for SBCL: (progn (with-open-file (stream "/dev/null" :direction :output :if-exists :overwrite) (time (princ (expt 12 1000000) stream))) nil)
8:33:43
beach
And this is for my code: (progn (with-open-file (stream "/dev/null" :direction :output :if-exists :overwrite) (time (print-positive-integer (expt 12 1000000) 10 stream))) nil)
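[Editor's sketch] The log does not show the body of print-positive-integer. As an illustration only, here is one common divide-and-conquer way to print a large integer: split it around powers of the base obtained by repeated squaring and print the halves recursively. The function name and the whole approach are assumptions, not beach's code.

```lisp
(defun print-integer-sketch (n base stream)
  "Print the non-negative integer N in BASE (2 to 36) to STREAM."
  ;; POWERS holds base^(2^i) for i = 0, 1, ...; the last element exceeds N.
  (let ((powers (coerce (loop for p = base then (* p p)
                              collect p
                              while (<= p n))
                        'vector)))
    (labels ((emit (value level pad)
               ;; Invariant: VALUE < base^(2^LEVEL).
               ;; With PAD true, exactly 2^LEVEL digits are written.
               (if (zerop level)
                   (write-char (digit-char value base) stream)
                   (multiple-value-bind (high low)
                       (floor value (aref powers (1- level)))
                     (if (and (zerop high) (not pad))
                         ;; Skip leading zeros at the front of the number.
                         (emit low (1- level) nil)
                         (progn
                           (emit high (1- level) pad)
                           (emit low (1- level) t)))))))
      (emit n (1- (length powers)) nil))
    n))
```

It can be dropped into the same kind of timing form as above in place of print-positive-integer; no performance claim is made for it.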
8:35:06
jdz
beach: No, I would not expect code to be slower with single floats. If anything, I'd expect it to be faster; but that's why I'm asking because expectations like these should not be held :)
8:36:42
jackdaniel
doesn't princ on sbcl first check what type the thing you print is? maybe that's the reason for the different results?
8:39:22
beach
But I am surprised that your computer is so much faster than mine, and mine is brand new.
8:40:31
no-defun-allowed
beach: i remember you usually keep lisp images open for a while, but do you have the same declarations as before?
8:40:57
no-defun-allowed
last time we had that confusion it was cause i had default declarations and you were on...was it (speed 3) (debug 3)? something like that
8:41:09
beach
no-defun-allowed: Changing the OPTIMIZE settings does not significantly influence the performance of my code, probably because there are no declarations.
8:41:52
beach
But that would explain the difference between what jackdaniel is getting and what I am getting.
9:03:30
shka__
well, i have a test file; in this file i have a few testing forms, a few of which test large chunks of random data, and one of those forms runs its test for 2 seconds with debug 3
9:05:39
beach
I have seen very little difference between DEBUG 2 and DEBUG 0 in terms of performance.