freenode/lisp - IRC Chatlog
Search
5:21:02
jackdaniel
McCLIM progress report: https://common-lisp.net/project/mcclim/posts/Progress-report-7.html :-)
6:32:39
beach
I had some insight about Earley-style parsing, and about parsing in general. This insight is relevant to Cleavir, because I want to use Earley-style parsing for lambda lists, so as to make it possible for client code to customize what specific lambda-list keywords are allowed, and how to treat them.
6:32:41
beach
The insight is that the tokenizer is context free, so it must be possible to determine the nature of a token without knowing how it will be used. For lambda lists, a `token' can in fact be a pattern, requiring a recursive parsing task to be started. But whether a list is a pattern or (say) an optional parameter definitely depends on the context.
6:35:11
beach
In terms of Earley parsing, the consequences are that the Earley `scanner' is just a special case of the `completer' in that it uses some equality predicate to check the next token. What I should do is generalize the completer to use a custom test and eliminate the special case represented by the scanner.
6:36:29
beach
If I have a generic function to do that, then, based on the context, I can trigger a subordinate parsing task when a pattern is required and I see a list as the next `token'.
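A minimal sketch of that idea in Common Lisp, folding the Earley scanner into the completer via a generic test. All names here (`match-terminal`, the `:optional-parameter` terminal) are invented for illustration; this is not Cleavir's actual API.

```lisp
;;; Hypothetical sketch: replace the scanner's hard-coded equality
;;; check with a generic function that clients can specialize.

(defgeneric match-terminal (terminal token)
  (:documentation
   "Return true if TOKEN satisfies TERMINAL.  Client code specializes
this to customize which lambda-list keywords are allowed, or to
trigger a subordinate parse when TERMINAL expects a pattern and the
next `token' is itself a list."))

;; Default method: plain equality, i.e. the classic scanner behavior
;; as a special case of the completer.
(defmethod match-terminal (terminal token)
  (eql terminal token))

;; Example specialization: in &optional position, a list is read as
;; (var default) rather than as a pattern to be parsed recursively.
(defmethod match-terminal ((terminal (eql :optional-parameter)) token)
  (or (symbolp token)
      (and (consp token) (symbolp (first token)))))
```

With such a generic function, the completer alone drives both terminal matching and nonterminal completion, as described above.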
6:37:32
beach
But when an optional parameter is required and I see a list, then the list is considered to be an optional parameter with a default value.
6:42:18
pjb
beach: what about *read-base*; what about reader macros implementing context-dependent tokenizers? Almost all programming languages have context-dependent lexers (hence the states in lex/flex).
6:44:18
beach
But for lambda lists, the concept of a token gets generalized a bit, since lambda lists are nested.
6:45:43
beach
But yeah, you are right, most tokenizers need some kind of kludge to determine context, but it is usually not the same mechanism as is used by the parser. Though I am aware that there are parsing techniques that don't require a tokenizer.
7:22:38
shrdlu68
The design of x509 is frustrating. It requires parsing a data structure multiple times. Parse it once, extract x and y, then go back again and extract a and b.
7:24:52
flip214
shrdlu68: uh... why not simply get all attributes out (recursively?) first time around?
7:28:56
shrdlu68
flip214: That's what I'm doing, except "all the attributes" keeps changing as you read the specs.
7:31:09
shrdlu68
I'm discovering now that for OCSP checking, I need the hash of the DER-encoded value of the public-key-info in a certificate, and the distinguished name.
7:32:16
shrdlu68
Which means I either re-encode what I have decoded, which is insane, or go back and save the raw data before decoding _and_ then also decode it and save its attributes.
7:34:59
shrdlu68
The ASN.1 parser includes a :mode option, either :serialized or :deserialized. If you use :deserialized (the default), it recursively deserializes an ASN.1 sequence for you and returns the elements: strings, integers, octet strings, object identifiers, etc.
7:36:25
shrdlu68
Which saves a lot of work, because if you use :serialized then you have to deserialize each element manually, which is soul-corroding because a sequence may contain a very large number of nested elements.
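The two modes might look like this in use. The function name `parse-asn1` and its argument list are invented here; only the :mode keyword and its two values come from the message above.

```lisp
;;; Hypothetical calls illustrating the :mode option described above.

;; :deserialized (the default): recursively decode a SEQUENCE into
;; Lisp values -- integers, strings, octet vectors, OIDs, ...
(parse-asn1 der-octets :mode :deserialized)

;; :serialized: hand back the raw element octets, which is what you
;; need when a spec (e.g. OCSP) wants a hash of the DER encoding
;; itself rather than of the decoded values.
(parse-asn1 der-octets :mode :serialized)
```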
7:36:54
flip214
shrdlu68: I'd recommend to just store displaced arrays into the original data along with the parsed-out data somewhere.
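flip214's suggestion is directly expressible in standard Common Lisp: keep the raw DER octets and make each parsed field a displaced array into them, so the exact encoded bytes (e.g. for OCSP hashing) remain available without copying. The offsets below are placeholders for whatever the parser records.

```lisp
;;; A displaced array is a zero-copy view into another array.
(let* ((raw (make-array 1024 :element-type '(unsigned-byte 8)
                             :initial-element 0))
       ;; View of bytes 100..199 of RAW, e.g. the span holding the
       ;; DER-encoded public-key-info (offsets are made up here).
       (spki-octets (make-array 100
                                :element-type '(unsigned-byte 8)
                                :displaced-to raw
                                :displaced-index-offset 100)))
  ;; Writes through RAW are visible through the displaced view.
  (setf (aref raw 100) 42)
  (aref spki-octets 0))   ; => 42
```

Storing such views alongside the decoded attributes avoids the "re-encode or re-parse" dilemma entirely.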
10:56:15
xificurC
starting a script with quicklisp enabled takes me 0.3 seconds (sbcl). If I skipped quicklisp and went straight with asdf to load the libraries would I get a speedup? I tried putting `(require "asdf") (asdf:load-system "cl-ppcre")` to test the speed but asdf cannot find cl-ppcre
10:57:07
phoe
xificurC: you could explicitly tell ASDF to look in the proper folder, ~/quicklisp/dists/quicklisp/software/cl-ppcre......
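One way to do that (an assumption about the setup, not the only mechanism) is to push the system's directory onto `asdf:*central-registry*` before loading. The versioned directory name below is hypothetical and must match whatever Quicklisp actually installed.

```lisp
;;; Make a bare ASDF find a Quicklisp-installed system by telling it
;;; where the .asd file lives.  The version in the path varies.
(require "asdf")
(push #p"~/quicklisp/dists/quicklisp/software/cl-ppcre-2.1.1/"
      asdf:*central-registry*)
(asdf:load-system "cl-ppcre")
```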
11:00:31
xificurC
phoe: ok, that seems to not crash, however I'm not sure if I'm running the script properly
11:01:49
guaqua
xificurC: what is the reason for optimizing startup speed? if you want to get straight to running your program quickly, i'd suggest you look at saving a core and starting the program with that
11:02:37
xificurC
guaqua: that's a viable solution for 1 script. Imagine all your bash and python scripts were like that
11:03:06
guaqua
that's true. if your scripts seem to share the same set of libraries, they could maybe use the same core
11:03:23
xificurC
I'm optimizing for startup speed because I'm writing a script that can be called many, many times
11:03:46
flip214
and even the monthly updates are just a "rm -rf ~/.cache/common-lisp && ./script" ...
11:04:45
xificurC
so I can create 1 core with all the libs I want to use and then I can write scripts that use that core. Is that correct?
11:04:45
flip214
having a superset of systems loaded doesn't really hurt. My image (including hunchentoot, which requires ~30 other systems(?), and some more) has a 0.03sec "empty" start time.
11:04:54
White_Flame
xificurC: it sounds like you're predicting a launch performance problem, and haven't actually observed that it's a problem yet?
11:05:20
xificurC
I can preload the libraries so I can already call e.g. cl-ppcre:split without doing any asdf:load-system or ql:quickload?
11:06:27
xificurC
White_Flame: not really. I am planning to write a script that I know will be called dozens of times. If a quicklisp startup takes 0.3 seconds here, that's a no-go
11:06:29
_death
you'd also want a shell script `my-sbcl' or something that uses that core.. then you can do #!/path/to/my-sbcl --script or something
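The core-dumping approach sketched in this exchange, SBCL-specific and with assumed file names (`my-lisp.core`, `my-sbcl`):

```lisp
;;; Run once, with Quicklisp available, to bake the libraries into a
;;; core.  SAVE-LISP-AND-DIE writes the core and exits the process.
(ql:quickload "cl-ppcre")
(sb-ext:save-lisp-and-die "~/my-lisp.core")

;;; Then a wrapper script `my-sbcl' (as _death suggests) can be:
;;;   #!/bin/sh
;;;   exec sbcl --core "$HOME/my-lisp.core" --script "$@"
;;; and each script starts with  #!/path/to/my-sbcl
;;; so cl-ppcre:split is callable immediately, with no
;;; ql:quickload or asdf:load-system at startup.
```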
11:07:10
White_Flame
xificurC: stepping back, it seems like you should find a way to batch your requests into a longer-running lisp image, rather than trying to start & stop it so many times
11:12:23
White_Flame
xificurC: given that you're looking to work hard on optimization tests & multiple coding efforts, it might be worth it to put that effort into reducing the number of calls instead
11:17:20
_death
still, there were some issues the last time I used it: https://github.com/death/FFmpeg/commit/91149048ecc8168475889a1a72f97febc13bc88a
12:37:19
xificurC
White_Flame: I understand your point, unfortunately not all programs/scripts fall into that category. You wouldn't e.g. go put all of your database data into 1 big flat table just to be able to get all the necessary data in 1 query, would you?
12:37:46
White_Flame
when speed is necessary, database queries tend to be glommed together into 1 mega-query, yes
15:57:06
phoe
so I'm limited to #. but nonetheless the stack-printing debugger hook gives me all the info I think I need