libera/commonlisp - IRC Chatlog

15:28:16 saltrocklamp[m] https://bpa.st/LRTQ does anyone see an obvious reason why the lisp version of my code (with sbcl) is not only ~5x slower than the python version of my code, but also isn't giving the right answers? the correct output numbers should be something like `249.9008` and `288.6628`

15:28:41 saltrocklamp[m] any recommendations for a profiler would be appreciated too

15:30:32 Bike sbcl has two built in profilers http://sbcl.org/manual/#Profiling

15:32:03 Bike if i had to guess, slow points could be parse-number and read-line

15:32:30 Bike the latter of which you could deal with by reusing a preallocated string instead of allocating a new one each time read-line is called; maybe python is smart enough to do that for the "for line in" construct

15:37:25 Krystof I think python defaults to double float; SBCL definitely defaults to single-float. That might be enough to explain the different answers
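
A minimal sketch of the reader default Krystof mentions, for the REPL. The sample literal is borrowed from the expected output quoted above; the rest is illustrative and assumes an implementation where `single-float` and `double-float` are distinct types, as in SBCL.

```lisp
;; With the standard default, a literal without an exponent marker
;; is read as a single-float.
(let ((*read-default-float-format* 'single-float))
  (typep (read-from-string "249.9008") 'single-float))  ; => T

;; Rebinding the variable makes the same text read as a double-float.
(let ((*read-default-float-format* 'double-float))
  (typep (read-from-string "249.9008") 'double-float))  ; => T

;; An explicit exponent marker overrides the default either way.
(typep (read-from-string "249.9008d0") 'double-float)   ; => T
```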

15:39:24 Krystof well, actually: your wrapper around parse-number returns (values 0.0 nil) on a parse failure, but your check tests for the primary value being null

15:39:39 saltrocklamp[m] oop, that was from an old version

15:39:41 Krystof so you don't handle invalid lines correctly in your Lisp version
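
The pasted code is not reproduced in this log, so the following is only a sketch of the kind of mismatch Krystof describes. `safe-parse`, `count-valid-buggy` and `count-valid` are hypothetical names, and `parse-integer` stands in for parse-number to keep the example self-contained.

```lisp
;; Hypothetical wrapper: returns (values number t) on success and
;; (values 0 nil) on failure, so the primary value is never NIL.
(defun safe-parse (string)
  (handler-case (values (parse-integer string) t)
    (parse-error () (values 0 nil))))

;; Buggy check: tests the primary value, which is 0 on failure and
;; therefore never null, so every line counts as valid.
(defun count-valid-buggy (lines)
  (loop for line in lines
        count (not (null (safe-parse line)))))

;; Corrected check: looks at the secondary value instead.
(defun count-valid (lines)
  (loop for line in lines
        count (nth-value 1 (safe-parse line))))

(count-valid-buggy '("1" "oops" "3"))  ; => 3
(count-valid '("1" "oops" "3"))        ; => 2
```

If invalid lines are silently parsed as zero and still counted, anything computed from the count (a mean, for instance) would come out wrong.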

15:39:48 saltrocklamp[m] yeah let me try fixing

15:40:37 saltrocklamp[m] that said, i had a previous version of this that used READ + TYPEP to "parse" floats (returning NIL if it didn't actually read a float), and i think the answer was wrong there too, even when i set *READ-DEFAULT-FLOAT-FORMAT* to double-float

15:43:12 saltrocklamp[m] Bike: i'll try `sb-sprof`, that looks nice and easy

15:43:46 saltrocklamp[m] and i will try re-using the string, i assume you mean i should `setq`/`setf` it instead of using a step-form in `do`?

16:27:07 saltrocklamp[m] urgh, i ended up having to rewrite this with `prog`

16:27:32 Bike by reusing the string i was thinking more like read-sequence

16:27:48 pjb saltrocklamp[m]: I'd use loop instead: https://termbin.com/4snu

16:30:13 saltrocklamp[m] that's much nicer pjb, i was wondering if there was a tidy loop version. Bike, wouldn't that cause problems if the current line is shorter than the previous line?

16:30:57 Bike getting it right is more involved, yeah
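
One way to sidestep the shorter-line problem is a reusable string with a fill pointer. This is a swapped-in technique based on `read-char`, not the `read-sequence` approach Bike has in mind, and `read-line-into` is a hypothetical name. Whether it actually beats `read-line` would need measuring.

```lisp
(defun read-line-into (buffer stream)
  "Read one line from STREAM into BUFFER, an adjustable string with a
fill pointer. Return BUFFER, or NIL at end of file if nothing was read."
  ;; Resetting the fill pointer discards the previous line, so a short
  ;; line never shows leftovers from a longer one.
  (setf (fill-pointer buffer) 0)
  (loop for char = (read-char stream nil nil)
        do (cond ((null char)
                  (return (if (plusp (fill-pointer buffer)) buffer nil)))
                 ((char= char #\Newline)
                  (return buffer))
                 (t
                  ;; Grows the underlying storage only when a line is
                  ;; longer than anything seen so far.
                  (vector-push-extend char buffer)))))

(let ((buffer (make-array 128 :element-type 'character
                              :adjustable t :fill-pointer 0)))
  (with-input-from-string (in (format nil "abcdef~%xy~%"))
    (list (copy-seq (read-line-into buffer in))
          (copy-seq (read-line-into buffer in)))))
;; => ("abcdef" "xy")
```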

16:34:24 saltrocklamp[m] this is a great demo of advanced `loop`ing. setting `*read-default-float-format*` did fix the accuracy, but it's still ~7.5 seconds while the python version is ~1.5

16:35:15 saltrocklamp[m] hard to tell which calls are slow, as opposed to just frequent. let me try the deterministic profile

16:40:13 saltrocklamp[m] yep it looks like `read` is really the culprit, 0.000002 seconds per call at 2909618 calls, that's ~5.8 seconds spent on just `read`ing
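
For reference, a sketch of how the two SBCL profilers are typically driven. It is SBCL-specific and meant to be evaluated form by form at the REPL. `parse-all` and `*lines*` are hypothetical stand-ins for the real workload, which is not shown in this log.

```lisp
(require :sb-sprof)

;; Hypothetical workload: read a float from each line and sum them.
(defun parse-all (lines)
  (let ((*read-default-float-format* 'double-float))
    (loop for line in lines
          sum (read-from-string line))))

(defparameter *lines*
  (loop repeat 100000 collect "249.9008"))

;; Statistical profiler: samples the call stack while the body runs.
(sb-sprof:with-profiling (:max-samples 40000 :report :flat)
  (parse-all *lines*))

;; Deterministic profiler: wraps the named functions and counts calls.
(sb-profile:profile parse-all)
(parse-all *lines*)
(sb-profile:report)
(sb-profile:unprofile parse-all)

;; The estimate quoted above: seconds per call times number of calls.
(* 0.000002d0 2909618)  ; roughly 5.8 seconds
```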

16:46:45 saltrocklamp[m] i'm open to suggestions for how to fix this.. i admit i'm disappointed, i was expecting lisp to at least be comparable to python

16:52:18 Bike how's parse-number compared to read? i'd expect parse-number to be faster

17:13:44 saltrocklamp[m] not appreciably faster in some of the tests i ran, but i can try again

17:14:01 Bike hm. well that sucks.

17:14:08 saltrocklamp[m] i'm not sure if it's possible to write the `loop` version using it

17:16:58 saltrocklamp[m] weird that there's `parse-integer` in the standard but not `parse-float` - i saw some discussion about it above, maybe sicl will turn out to be the fast implementation :)

17:17:47 Krystof the more generic your parsing thing, the slower it is likely to be

17:18:11 Krystof I'd start by trying to use parse-float, though I don't know how optimized it is

17:18:41 Krystof read is a full parser; parse-number is presumably a limited tokenizer; parse-float will be even more limited

17:19:13 Krystof I would also be a bit suspicious of the deterministic profiler; the overhead is substantial and subtracting the overhead off is not necessarily 100% correct

17:25:10 saltrocklamp[m] does sbcl have `parse-float`?

17:25:42 saltrocklamp[m] i thought only lispworks had it

17:25:49 Catie It's a library, loadable through quicklisp

17:25:59 pjb saltrocklamp[m]: now, reading in lisp involves decoding octet sequences from files, into text, sequences of characters.

17:26:15 saltrocklamp[m] oh, the library. i was wondering if it would be faster than `parse-number`

17:26:19 saltrocklamp[m] i can try it

17:26:36 pjb saltrocklamp[m]: in C, and python does like C, one only processes the octets, and almost never decodes them into actual characters.

17:26:49 pjb saltrocklamp[m]: so called, "utf-8 octet sequences"…

17:27:04 pjb saltrocklamp[m]: that's where a lot of time (and memory) is spent when reading in CL.

17:27:26 pjb saltrocklamp[m]: if you want to attain the same I/O performance, you must read octets in CL as well.

17:28:41 saltrocklamp[m] python 3 strings are sequences of unicode code points, by default the input encoding is utf-8. so it's definitely doing full string parsing in my example (although i could probably make the python version faster by dropping down to use raw bytes)

17:29:23 saltrocklamp[m] and in fact i think the internal storage is utf-16 or something like that, so not only is it parsing utf-8 but it's also converting it to another format

17:29:41 saltrocklamp[m] however it is written in c, and i'm sure it's been heavily optimized

17:29:57 saltrocklamp[m] not sure if that's what you meant, or something else?

17:30:13 saltrocklamp[m] (in python 2, strings were raw octet/byte sequences)

17:31:42 saltrocklamp[m] i'm definitely open to suggestions though, maybe this is a missing piece in the library ecosystem

17:33:06 pjb saltrocklamp[m]: https://termbin.com/r555

17:34:13 saltrocklamp[m] wow, you just wrote all that?

17:34:14 pjb this is an intermediate solution: we still convert to string, but assuming pure ascii input (this could also be done with :external-format :us-ascii, but it is highly implementation dependent whether it's possible to set the external format of *standard-input*).

17:34:25 pjb saltrocklamp[m]: no, copy-and-paste from my libraries.

17:35:06 pjb we could avoid converting to characters (which in sbcl take 32-bit each), by processing the octets directly. The float parsing function would have to be changed to use vectors of octets instead of strings.

17:35:18 saltrocklamp[m] is this slurping the entire thing into memory? i forgot to mention this earlier, but one of the other requirements was to assume that the data is "huge" to the point where it can't be reasonably loaded all at once

17:35:20 pjb eg. testing for 48 instead of #\0 etc.

17:35:47 saltrocklamp[m] i see, hm. i wonder if that's what `parse-number` is doing

17:36:00 pjb saltrocklamp[m]: then yes, looping on reading a buffer with read-sequence, and processing the octets instead of converting to string would be best.
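
A rough sketch of that idea, not pjb's code: read fixed-size chunks into one reused octet buffer with `read-sequence` and scan the octets directly, carrying the per-line state across chunk boundaries. `sum-decimal-octets` is a hypothetical name. It assumes ASCII input, LF line endings, and plain decimals with no exponent notation.

```lisp
(defun sum-decimal-octets (stream &key (buffer-size 65536))
  "Sum the decimal numbers found one per line in STREAM, a binary stream
of (unsigned-byte 8). Return the sum as a double-float and the number of
valid lines. Lines with anything other than an optional leading sign,
digits and at most one decimal point are skipped."
  (let ((buffer (make-array buffer-size :element-type '(unsigned-byte 8)))
        (sum 0d0) (valid-lines 0)
        ;; State of the line currently being scanned.
        (mantissa 0) (scale 0) (digits 0) (pos 0)
        (negative nil) (in-fraction nil) (valid t))
    (flet ((finish-line ()
             (when (and valid (plusp digits))
               (incf sum (float (/ (if negative (- mantissa) mantissa)
                                   (expt 10 scale))
                                1d0))
               (incf valid-lines))
             (setf mantissa 0 scale 0 digits 0 pos 0
                   negative nil in-fraction nil valid t))
           (scan (octet)
             (cond ((<= 48 octet 57)                      ; digits 0-9
                    (setf mantissa (+ (* 10 mantissa) (- octet 48)))
                    (incf digits)
                    (when in-fraction (incf scale)))
                   ((and (= octet 46) (not in-fraction))  ; decimal point
                    (setf in-fraction t))
                   ((and (or (= octet 43) (= octet 45))   ; + or -
                         (zerop pos))
                    (setf negative (= octet 45)))
                   (t (setf valid nil)))
             (incf pos)))
      ;; The same buffer is refilled on every iteration, so memory use
      ;; stays constant however large the input is.
      (loop for end = (read-sequence buffer stream)
            until (zerop end)
            do (dotimes (i end)
                 (let ((octet (aref buffer i)))
                   (if (= octet 10)                       ; newline
                       (finish-line)
                       (scan octet)))))
      (finish-line)          ; the last line may lack a trailing newline
      (values sum valid-lines))))
```

Usage would look like `(with-open-file (in "data.txt" :element-type '(unsigned-byte 8)) (sum-decimal-octets in))`, where `data.txt` is a placeholder file name. Getting octets out of `*standard-input*` is implementation dependent, so this sketch sticks to files.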

17:36:28 saltrocklamp[m] i am curious what python, lua, and nim are doing that make this so much more efficient than whatever sbcl and ccl are doing

17:36:58 pjb saltrocklamp[m]: you can use https://github.com/informatimago/lisp/blob/master/common-lisp/cesarum/ascii.lisp#L382 to help processing octets of ascii codes.

17:37:41 pjb Notably, if you want to split the lines on newlines: https://github.com/informatimago/lisp/blob/master/common-lisp/cesarum/ascii.lisp#L584

17:38:36 pjb saltrocklamp[m]: I told you: they process octets, instead of characters.

17:39:56 saltrocklamp[m] i think i was confused before. you are talking about the number parsing part?

17:40:00 pjb which, for a file that contains mostly 10, 43, 45, 46, and 48-57, lets you avoid a lot of processing…

17:41:49 saltrocklamp[m] `for line in sys.stdin` in python iterates over true unicode strings, not byte sequences. but it would make sense if e.g. `float(s)` operated on the underlying bytes of the string `s`

17:43:16 saltrocklamp[m] your `contents-from-stream` function seems to implement Bike's suggestion to re-use the string

17:47:40 pjb AFAIK, python keeps the string as a utf-8 sequence.

17:53:30 saltrocklamp[m] it's not utf-8 internally in cpython at least, they use some wider encoding in order to do string lookups in constant time

17:54:22 saltrocklamp[m] i'll have to spend some time reading these code snippets, my understanding of how "streams" work in lisp is hazy still

17:55:10 saltrocklamp[m] i do wonder about the memory allocation, i am trying to look at the generated c code from the nim version to see if that's what nim does

18:00:08 saltrocklamp[m] found it, but holy moly that's a complicated c program

20:13:47 ski_ ** NICK ski

20:24:29 James` Hey I got a question and I read this chat is active

20:25:13 James` Anybody know how I can write a defmacro within a defun? Similar to this post on SO, but for defmacros

20:25:31 James` https://stackoverflow.com/questions/3772365/how-to-defun-a-function-within-a-defun

20:26:12 Bike you can use macro-function instead of symbol-function there, but you probably don't want to actually do this

20:27:01 James` I see, I can try that. Yeah I wasn't sure if it's the right way...

20:27:16 Bike What are you trying to do?

20:30:42 James` So I am writing some tests (fiasco), and doing stuff like (deftest t1-fail () ....

20:31:23 James` Now I want to automate the name generation, so I can write each test like a reader macro #T(...) and have the names automatically created for each

20:32:01 James` So I just write the basic test, and I extract from it which function I'm testing and a counter for that function (stored in a hash table) to then build the deftest form

20:37:43 Bike can you just do like, (defmacro my-deftest (&body body) `(deftest ,(generate-a-name) ,@body)), or what am i missing here

20:41:38 Nilby Lisp seems like a drug that causes the human mind to create test frameworks.

20:45:59 James` That's what I initially tried, but couldn't get it to work

20:46:23 Bike what happened?

20:49:34 James` It doesn't seem to create the test macro (e.g. I do my-test-1231 and I get an error 'No test ....)

20:49:54 James` Whereas if I manually do it, it works. I think it may have something to do with interning symbols...

20:50:14 James` But not sure to be honest

20:50:39 James` (defun generate-a-name () (incf *counter*) (make-symbol (concatenate 'string "my-test-123" (write-to-string *counter*))))

20:51:45 Bike wait, okay, so then you need the name later?

20:51:58 Bike i'm not sure i understand your goal here. You want nameless tests except you actually do need to know the name?

20:52:31 James` Yeah I'm going to write all the tests in a file, so I need the names

20:53:31 Bike okay, well how about you store the names somewhere. do (defvar *test-names* nil) and then (defmacro my-deftest (&body body) (let ((name (generate-a-name))) `(progn (push ',name *test-names*) (deftest ,name ,@body))))

20:57:54 James` Thanks, I will play around with it, don't want to take up more of your time. I faced the same issue, 'no test ....' after running it, but I gotta figure this out for myself, sounds like it's possible to do it the way you say and what I thought initially so it's something else

20:58:04 James` That's causing the issue
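
One possible explanation, in line with the interning suspicion above: `make-symbol` returns an uninterned symbol, and the lower-case name would not match what the reader produces for `my-test-1231` anyway. A sketch using `intern` with an upper-case name follows. The `defun` is a hypothetical stand-in for fiasco's `deftest`, which is not loaded here, and the counter is bumped at macroexpansion time, so names may skip if a form is expanded more than once.

```lisp
(defvar *counter* 0)
(defvar *test-names* '())

;; MAKE-SYMBOL returns a fresh uninterned symbol on every call, so the
;; reader can never produce that same symbol again from typed text.
(symbol-package (make-symbol "MY-TEST-1"))                ; => NIL
(eq (make-symbol "MY-TEST-1") (make-symbol "MY-TEST-1"))  ; => NIL

;; INTERN registers the symbol in the current package. The name is
;; upper case because the default reader upcases what you type.
(defun generate-a-name ()
  (intern (format nil "MY-TEST-~D" (incf *counter*))))

(defmacro my-deftest (&body body)
  (let ((name (generate-a-name)))
    `(progn
       (push ',name *test-names*)
       ;; Hypothetical stand-in for (deftest ,name () ,@body).
       (defun ,name () ,@body))))

(my-deftest (= (+ 1 2) 3))
(funcall (first *test-names*))  ; => T
```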

20:58:31 James` Thanks Again

21:00:01 Bike Good luck

21:03:19 James` Cheers mate

21:50:51 jcowan (= 8.7799997 8.78) => T because round to even, whereas (= 8.7800002 8.78) => NIL

22:54:39 JeromeLon jcowan: both return T for me with sbcl, ecl and clisp.

22:56:51 jcowan ah, I had one too many 0s

22:57:19 Catie Conversely, CCL returns NIL for both under FreeBSD with *READ-DEFAULT-FLOAT-FORMAT* set to DOUBLE-FLOAT

22:58:08 Catie With SINGLE-FLOAT I get both as T
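
The same comparisons with explicit exponent markers, so the result does not depend on `*read-default-float-format*`. This assumes IEEE 754 single and double floats and a correctly rounding reader, as in the implementations named above.

```lisp
;; As single-floats, all three literals round to the same value,
;; since neighbouring single-floats near 8.78 are about 9.5e-7 apart.
(= 8.7799997f0 8.78f0)  ; => T
(= 8.7800002f0 8.78f0)  ; => T

;; As double-floats there is enough precision to tell them apart.
(= 8.7799997d0 8.78d0)  ; => NIL
(= 8.7800002d0 8.78d0)  ; => NIL

;; Widening the single-float shows the value that is actually stored.
(float 8.78f0 1d0)      ; approximately 8.779999733d0
```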

0:06:19 zacque Good morning

1:56:31 Xach new quicklisp this day

1:56:34 Xach (dist, that is)

2:03:27 hayley Xach announces Quicklisp 2: lisp quicker

2:03:58 philnum[m] ** NICK ArgoLargo[m]