freenode/lisp - IRC Chatlog
20:32:42
jurov
but i had some disappointments trying to compile stuff there, maybe it's improved since
23:11:01
pillton
borei: The CLIM specification solves this problem by providing move-to and move-to*.
5:21:02
jackdaniel
McCLIM progress report: https://common-lisp.net/project/mcclim/posts/Progress-report-7.html :-)
6:32:39
beach
I had some insight about Earley-style parsing, and about parsing in general. This insight is relevant to Cleavir, because I want to use Earley-style parsing for lambda lists, so as to make it possible for client code to customize what specific lambda-list keywords are allowed, and how to treat them.
6:32:41
beach
The insight is that the tokenizer is context free, so it must be possible to determine the nature of a token without knowing how it will be used. For lambda lists, a `token' can in fact be a pattern, requiring a recursive parsing task to be started. But whether a list is a pattern or (say) an optional parameter definitely depends on the context.
6:35:11
beach
In terms of Earley parsing, the consequences are that the Earley `scanner' is just a special case of the `completer' in that it uses some equality predicate to check the next token. What I should do is generalize the completer to use a custom test and eliminate the special case represented by the scanner.
6:36:29
beach
If I have a generic function to do that, then, based on the context, I can trigger a subordinate parsing task when a pattern is required and I see a list as the next `token'.
6:37:32
beach
But when an optional parameter is required and I see a list, then the list is considered to be an optional parameter with a default value.
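beach's point above — that the Earley scanner is just the completer with a custom test, so terminals can be arbitrary predicates and context decides whether a list is a sub-pattern or an optional-parameter-with-default — can be sketched in a few lines. The grammar and predicate names below are invented for illustration (this is not Cleavir's actual code); terminals are plain Python callables, so the "scanner" step is literally "apply the custom test to the next token":

```python
# Minimal Earley recognizer where terminals are predicates on tokens,
# illustrating the "scanner is a special case of the completer" idea.
# An item is (lhs, rhs, dot, origin); rhs symbols are either
# nonterminal names (strings) or callables (custom tests).

def earley_recognize(grammar, start, tokens):
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))
    for i in range(len(tokens) + 1):
        changed = True
        while changed:                      # fixpoint over chart[i]
            changed = False
            for lhs, rhs, dot, origin in list(chart[i]):
                if dot < len(rhs):
                    sym = rhs[dot]
                    if callable(sym):
                        # "Scanner" as generalized completion: advance
                        # over the next token if the custom test accepts it.
                        if i < len(tokens) and sym(tokens[i]):
                            chart[i + 1].add((lhs, rhs, dot + 1, origin))
                    else:
                        # Predictor: expand the nonterminal.
                        for prod in grammar[sym]:
                            item = (sym, prod, 0, i)
                            if item not in chart[i]:
                                chart[i].add(item)
                                changed = True
                else:
                    # Completer: lhs finished; advance items waiting on it.
                    for plhs, prhs, pdot, porig in list(chart[origin]):
                        if pdot < len(prhs) and prhs[pdot] == lhs:
                            item = (plhs, prhs, pdot + 1, porig)
                            if item not in chart[i]:
                                chart[i].add(item)
                                changed = True
    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])

# Toy lambda-list grammar: context decides that after &optional a
# *list* token means "parameter with default", not a sub-pattern.
is_var = lambda tok: isinstance(tok, str) and not tok.startswith("&")
is_amp_optional = lambda tok: tok == "&optional"
is_opt = lambda tok: is_var(tok) or isinstance(tok, list)

grammar = {
    "ll":   [("vars",), ("vars", is_amp_optional, "opts")],
    "vars": [(), (is_var, "vars")],
    "opts": [(), (is_opt, "opts")],
}
```

With this setup, `earley_recognize(grammar, "ll", ["a", "&optional", ["x", 3], "y"])` accepts the list `["x", 3]` as an optional-with-default only because the `&optional` context predicted `is_opt`; in a context where a pattern is required, the same list token would instead trigger a subordinate parse, as described above.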
6:42:18
pjb
beach: what about *read-base*; what about reader macros implementing context-dependent tokenizers? Almost all programming languages have context-dependent lexers (hence the states in lex/flex).
6:44:18
beach
But for lambda lists, the concept of a token gets generalized a bit, since lambda lists are nested.
6:45:43
beach
But yeah, you are right, most tokenizers need some kind of kludge to determine context, but it is usually not the same mechanism as is used by the parser. Though I am aware that there are parsing techniques that don't require a tokenizer.
7:22:38
shrdlu68
The design of x509 is frustrating. It requires parsing a data structure multiple times. Parse it once, extract x and y, then go back again and extract a and b.
7:24:52
flip214
shrdlu68: uh... why not simply get all attributes out (recursively?) first time around?
7:28:56
shrdlu68
flip214: That's what I'm doing, except "all the attributes" keeps changing as you read the specs.
7:31:09
shrdlu68
I'm discovering now that for OCSP checking, I need the hash of the DER-encoded value of the public-key-info in a certificate, and the distinguished name.
7:32:16
shrdlu68
Which means I must either re-encode what I have decoded, which is insane, or go back and save the raw data before decoding _and_ then also decode it and save its attributes.
7:34:59
shrdlu68
The ASN.1 parser includes a :mode option, either :serialized or :deserialized. If you use :deserialized (the default), it recursively deserializes an ASN.1 sequence for you and returns the elements: strings, integers, octet strings, object identifiers, etc.
7:36:25
shrdlu68
Which saves a lot of work, because if you use :serialized you have to deserialize each element manually, which is soul-corroding when a sequence contains a very large number of nested elements.
7:36:54
flip214
shrdlu68: I'd recommend to just store displaced arrays into the original data along with the parsed-out data somewhere.
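flip214's suggestion can be sketched in Python, where a `memoryview` plays the role of a Lisp displaced array: a zero-copy view into the original DER buffer kept alongside the decoded value, so the exact encoded bytes of any element can be hashed later without re-encoding. The tiny TLV walker below handles only definite lengths and is invented for illustration, not a full ASN.1 codec:

```python
import hashlib

def parse_tlv(buf, offset=0):
    """Parse one DER TLV at `offset`.

    Returns (tag, raw_view, value_view, end), where raw_view covers the
    whole element (tag and length included) and value_view covers only
    the contents -- both are zero-copy views into the original buffer.
    """
    tag = buf[offset]
    i = offset + 1
    length = buf[i]
    i += 1
    if length & 0x80:                 # long form: next N bytes hold the length
        n = length & 0x7F
        length = int.from_bytes(buf[i:i + n], "big")
        i += n
    end = i + length
    raw = memoryview(buf)[offset:end]
    value = memoryview(buf)[i:end]
    return tag, raw, value, end

# SEQUENCE { INTEGER 5 } -- a hand-built two-level example.
der = bytes([0x30, 0x03, 0x02, 0x01, 0x05])
tag, raw, value, end = parse_tlv(der)          # outer SEQUENCE
itag, iraw, ivalue, _ = parse_tlv(der, 2)      # inner INTEGER

# The raw view lets you hash the element's exact DER bytes later
# (e.g. for an OCSP CertID-style digest) without re-encoding anything.
digest = hashlib.sha1(bytes(iraw)).hexdigest()
```

Storing `(raw, decoded)` pairs this way gives both of shrdlu68's needs in one pass: the decoded attributes and the untouched encoded bytes for hashing.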