freenode/#lisp - IRC Chatlog
23:08:15
sea
I think so. I took that off. Now I'm running with: (declaim (optimize (debug 0) (speed 3) (space 0)))
23:08:32
aeth
(defun foo (l) (let ((sum 0)) (loop for i in l do (incf sum i)) sum)) (defun bar (v) (declare (optimize (speed 3) (debug 1)) ((simple-array fixnum (*)) v)) (let ((sum 0)) (loop for j across v do (incf sum j)) sum)) (let ((l (iota 100000))) (time (foo l))) (let ((v (coerce (iota 100000) '(simple-array fixnum (*))))) (time (bar v)))
23:08:53
sea
837,554 processor cycles vs 6,534,470 processor cycles and this time, it takes 8x as long!
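[Editor's note: aeth's one-line benchmark above, reformatted for readability. `iota` is assumed to come from the alexandria library, which the paste doesn't show being loaded.]

```lisp
(defun foo (l)
  (let ((sum 0))
    (loop for i in l do (incf sum i))
    sum))

(defun bar (v)
  (declare (optimize (speed 3) (debug 1))
           ((simple-array fixnum (*)) v))   ; a bare type specifier is a valid declaration
  (let ((sum 0))
    (loop for j across v do (incf sum j))
    sum))

(let ((l (alexandria:iota 100000)))
  (time (foo l)))

(let ((v (coerce (alexandria:iota 100000) '(simple-array fixnum (*)))))
  (time (bar v)))
```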
23:16:25
jack_rabbit
For me, list took 4,695,880 processor cycles, vector took 722,763 processor cycles
23:17:23
jcowan
cdr-coded lists would help in this situation, but not enough overall for anyone to implement them any more
23:22:03
sea
I tried disassemble on both foo and bar but they're exactly the same as far as I can tell
23:22:41
aeth
Well, first make sure that they're not the sb-profile wrapper. You might have to (sb-profile:unprofile) before disassembling now
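[Editor's note: a sketch of the unprofile-then-disassemble step aeth describes, assuming SBCL's sb-profile, where `unprofile` with no arguments removes the wrapper from all profiled functions.]

```lisp
(sb-profile:unprofile)   ; strip profiling wrappers from every profiled function
(disassemble 'foo)       ; now disassembles the bare function, not the wrapper
(disassemble 'bar)
```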
23:23:24
aeth
Same basic structure around the generic +, but the surrounding code reflects iterating over their respective types
23:24:58
aeth
My latest bar has this: (declare (optimize (speed 3) (debug 1)) ((simple-array fixnum (*)) v))
23:26:50
aeth
Generic sequence and number code is almost always going to lose to type-specific sequence and number code in performance. Those are basically the only two areas where type declarations are very useful for performance, in my experience.
23:27:04
jack_rabbit
The data type doesn't matter if the code iterating through it is written for generic sequences.
23:27:22
sea
I need to alter the coerce as well. How do I coerce something to be a simple array of fixnums?
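[Editor's note: a minimal sketch of the coercion sea asks about. Whether `fixnum` actually yields a specialized array is implementation-dependent; SBCL upgrades it to a true fixnum array.]

```lisp
;; coerce a list to a specialized simple array of fixnums
(coerce '(1 2 3) '(simple-array fixnum (*)))
```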
23:27:27
aeth
jack_rabbit: but my SBCL still optimizes bar once it knows that it is a simple-array fixnum (*)
23:28:15
aeth
sea: If it can only hold elements of one non-T type, it's going to be represented differently from something that holds elements of type T
23:29:27
aeth
You win twice with an array type like I just gave (three times if a length is given): (1) it knows it's a certain kind of sequence and (2) it can infer what type the items are, which usually cannot be done
23:30:19
aeth
Unfortunately, this only applies to a small number of things. Portably just bit and character. Non-portably, a bunch of other numeric types like (almost always) single-float and (unsigned-byte 8) and fixnum
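[Editor's note: `upgraded-array-element-type` is the standard way to check which element types your implementation actually specializes, as aeth describes. The results in the comments are typical, not guaranteed, except for the two portable cases.]

```lisp
(upgraded-array-element-type 'bit)               ; BIT (portable)
(upgraded-array-element-type 'character)         ; CHARACTER (portable)
(upgraded-array-element-type 'fixnum)            ; FIXNUM on SBCL; may be T elsewhere
(upgraded-array-element-type '(unsigned-byte 8)) ; (UNSIGNED-BYTE 8) on most modern implementations
```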
23:31:54
aeth
An array with an element-type should almost always be the most performant kind of sequence (or data structure in general) in Common Lisp. It will even beat lists at some things that lists are supposed to be better at.
23:32:45
sea
That's how I discovered this in the first place. I was timing an 'optimized' program, and found it got slower
23:34:20
sea
and the thing is that along with the time: 445,976 processor cycles , 4,636,812 processor cycles I get a tonne of time results printed as well, and they all basically look like this. The vector one is much larger
23:36:13
sea
Okay, restarted and re-evaluated what I had in the paste before. 0.148 seconds for bar, and 0.014 seconds for foo
23:39:24
pierpa
arrays with an element-type are not necessarily more performant than arrays with generic element types. It depends on what/when/how much the elements need unboxing and reboxing.
23:48:02
sea
Why does it do that in one case and not the other? What's the behavior of 'being the elements of' supposed to be, and 'across'?
23:50:34
pierpa
nobody can tell you why "being the elements" is slow since "being the elements" is not CL. It must be an extension of the implementation you are using.
23:51:20
pillton
sea: It is defined here http://www.doc.gold.ac.uk/~mas01cr/papers/ilc2007/sequences-20070301.pdf.
23:52:06
aeth
So it's the sequence-generic version, but unlike most sequence-generic things it doesn't un-generic when the type is known
23:53:24
jack_rabbit
Is there another free CL implementation out there that works well aside from SBCL?
23:56:34
aeth
CCL has a superior GC to SBCL's and is fairly comparable to SBCL in performance. ECL is apparently better in some niche areas like bignum performance.
23:57:01
aeth
SBCL, though, in general is pretty nice. It's usually the fastest, the most helpful, and the most feature-rich.
23:57:31
aeth
You could definitely beat SBCL in performance, though, if you really tried. There's lots of room for improvement all over the place.
23:59:10
aeth
SBCL is pretty fast, but its optimizations don't really compare to some of the ridiculous optimizations compilers with big budgets can do these days.
23:59:31
jack_rabbit
pierpa, ccl gave me an error compiling some quicklisp library. I assume that is the library's fault. clisp crashes trying to load swank, which I assume is clisp's fault.
0:00:08
aeth
In my experience, libraries will usually work on CCL, often work on ECL, and give issues with just about any other implementation, especially 32-bit ones.
0:01:01
aeth
It's hard to not write for SBCL, though. There are so many ways to figure out what's going on in SBCL.
0:01:16
aeth
I'm pretty sure of how my code behaves in SBCL, at least at the default optimization levels.
0:02:28
jack_rabbit
The library is static-vectors, and the error is: "Foreign function not found: X86-LINUX64::|memset|"
0:02:48
aeth
Really? static-vectors works for me in CCL. It gives me issues in ECL, though, even though it's supposed to support it.
0:05:26
aeth
But that does seem to match my experience. Things that use CFFI are the most problematic.
0:10:27
aeth
It's unfortunate that unless CLX works for you there's no way to avoid at least some foreign code.
0:31:57
pillton
White_Flame: I'm not sure what problem static-vectors solves. Do some implementations invoke the GC during foreign function calls?
0:32:29
White_Flame
you can't pass a pointer to foreign code if it could be moved at any time in the future
0:35:11
White_Flame
and in a lot of I/O cases, including graphics, the call does not synchronously encapsulate all access to the buffer you give it
0:43:52
aeth
pillton: Without static-vectors, you're either going to be working with a foreign array through stuff like mem-aref (not a pleasant experience) or you're going to copy from a CL-native vector into a foreign array at some point (which can kill your performance).
0:44:49
aeth
With static vectors, there's no need to do either, as long as you're in control of the allocation and not the foreign library.
0:46:23
aeth
The downside is that you're going to either have to use with-static-vector/with-static-vectors or you'll have to explicitly call free-static-vector in your own unwind-protect at some point.
0:47:15
aeth
I'm guessing you also can't use (declare (dynamic-extent foo)) on a static-vector to stack allocate, so that's another restriction.
0:48:12
aeth
Another downside is that it seems to fool SBCL's type inference, so I have to (declare (whatever-type foo)) after with-static-vector or a let initializing the static-vector in order to get efficient sequence code, which is unnecessary with a normal vector.
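[Editor's note: a sketch of the pattern discussed above, using the static-vectors library's documented names; element-type support varies by implementation, and the `locally` declaration is the workaround aeth mentions for the type-inference issue.]

```lisp
(static-vectors:with-static-vector (v 1000 :element-type 'fixnum)
  ;; declare the type explicitly, since inference through with-static-vector may fail
  (locally (declare ((simple-array fixnum (*)) v))
    (loop for i below 1000 do (setf (aref v i) i))
    ;; a pointer that can be handed to foreign code: v is never moved by the GC
    (static-vectors:static-vector-pointer v)))
```

The vector is freed automatically when the `with-static-vector` body exits; outside the macro, `free-static-vector` must be called explicitly (e.g. in an unwind-protect), as noted above.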
2:20:45
jack_rabbit
Can anyone with CCL execute (read-from-string "#_memset") and let me know what happens?
2:38:14
jack_rabbit
huh. I didn't even need to rebuild. Just used the download from the clozure.com site rather than my distro repo.
3:39:19
aeth
Everything on QL has to run on at least two implementations, so supporting #1 and #2 by popularity is pretty much the absolute minimum.
6:52:20
TMA
jack_rabbit: (read-from-string "#_memset") => (values 'WIN32::|memset| 8) or (values 'WIN64::|memset| 8)
7:20:51
rme
TMA: Support for running the 32-bit lisp on 64-bit Windows was added (by yours truly) in ccl 1.7.
7:48:32
TMA
rme: oh, I never knew. I have an old 1.6 sitting in a directory transferred from an old 32-bit XP system and I tried to run it.
8:10:30
TMA
schweers: I do not. I ran XP in 32-bit mode; I've run everything in 64 bits since. I would return to 32-bit mode on low-memory devices like low-end netbooks, though.
8:11:56
schweers
okay, that explains it. I was wondering how windows 10 would perform with a maximum memory of ... 3GB? I know that 4GB or close to that are possible, but if I remember correctly, windows has a weird limit on 32-bit systems.
8:12:43
schweers
It sure is, but then again, I was thinking about running windows 10, which -- I presume -- needs lots of memory just to boot.
8:13:54
flip214
TMA: but having more registers in 64-bit mode might mean less memory is needed (e.g. for temporary data) and faster computation as well
8:19:13
TMA
flip214: that's true for arithmetic-intensive workloads. I guess most of what I do is data traversing, not number crunching
8:32:35
aeth
The specifics of Lisp complicate 32v64 bit further. e.g. larger fixnums and unboxed single-floats
8:36:38
jack_rabbit
schweers, windows 32-bit (IIRC) reserves 1-2GB of address space for kernel stuff (1GB with the /3GB boot option), so only 2-3GB is available to user programs.
8:37:26
schweers
I thought there was something else, but I may be wrong. Not my main platform anyway ;)
9:00:49
flip214
TMA: still, being able to hold much more data (eg. pointers!) in CPU registers might help, not only when doing arithmetic.
9:56:12
hlavaty
hi, i have a fileSystems."/var/lib/foo" entry in configuration.nix. the disk failed and now on boot the machine goes into rescue mode. how do i disable the disk so that the machine starts normally again and i can ssh in and upload and activate a new configuration?
10:15:09
TMA
flip214: I refuse to take a position on that matter when I have no data. I am saying I can pack twice as many conses into the same amount of memory. I do not have performance data to tell whether it will be faster. execution speed is nowadays usually severely constrained by memory access time (that's why beach's generic function dispatch scheme that removes one memory access is so awesome)
10:16:53
TMA
flip214: so I guess it might be faster for some workloads. your guess of the registers helping might be better or worse than mine. without data there is nothing we can do to tell them apart
10:51:39
makomo
how can i define a function in a different package from within a file that has a different package in the (in-package ...) form at the top?
10:52:04
makomo
i tried using (in-package) right before the defun (and then again to switch back). i've also tried rebinding *package*. none worked
10:53:30
Shinmera
If the symbol is already exported from the other package you can also do foo:bar, of course.
10:55:30
Bike
i don't know how slime decides a package to read code in, it might just look for the first in-package in the file
10:56:15
Shinmera
I'm gonna go ahead and guess it errors because of something that isn't related to the name of the function
10:57:54
Bike
as in, (in-package #:foo) (defun bar () 'foo) (in-package #:foo2) (defun bar () 'foo2), C-c C-c the last, get (foo2::bar) => foo2::foo2
11:01:02
Shinmera
The reader reads only complete forms. By the time a form is evaluated, and the package switch would happen, it has already been read.
11:01:09
makomo
i remember the same issue i had with quicklisp, loading a library in the same form and using it, but that made sense because reading happens before evaluation
11:03:15
Shinmera
(in-package foo) (progn (in-package bar) (defun baz ..)) is read as: (cl-user::in-package cl-user::foo) (foo::progn (foo::in-package foo::bar) (foo::defun foo::baz ..))
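[Editor's note: a sketch of the straightforward answer to makomo's question, following Shinmera's foo:bar remark: name the function with a package-qualified symbol. The package OTHER and function GREET here are hypothetical.]

```lisp
(defpackage #:other
  (:use #:cl)
  (:export #:greet))

(in-package #:cl-user)

;; this form is read in CL-USER, but it defines the function named by OTHER:GREET
(defun other:greet ()
  'hello)
```

No package switch is needed at all; the qualified symbol is resolved at read time, which sidesteps the read-versus-evaluation ordering problem Shinmera describes.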
11:04:32
makomo
would there still be a way to create a macro which would temporarily switch packages, evaluate a body and then switch back?
11:05:01
Bike
no, because you read before evaluating/compiling, and macroexpansion happens during evaluation/compilation
11:05:22
Shinmera
A macro could do nasty things with trying to guess what the read form of a symbol was and translate it according to that
11:05:53
Bike
for the purpose of this question, i'm assuming that evil magic is prohibited by the ancient treaty