freenode/#lisp - IRC Chatlog
0:54:06
elderK
mobile_c: Read itself converts the characters and stuff into a tree of lists and things. READ is fundamental.
0:54:22
elderK
If you're used to parsing like, things from another language like C, you will have to roll that mechanism yourself.
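As a small illustration of the point about READ, here is a minimal sketch that turns text into a tree of lists. The helper name `read-form` and the sample input are made up for this example; only `read-from-string` and `*read-eval*` come from the standard.

```lisp
;; READ-FROM-STRING runs the standard reader over a string.
;; Binding *READ-EVAL* to NIL disables #. evaluation, which is
;; a sensible default when the input is not trusted.
(defun read-form (string)
  (let ((*read-eval* nil))
    (values (read-from-string string))))

;; Hypothetical sample input, not taken from the paste in the log.
(defparameter *form*
  (read-form "(rule greeting (hello \"world\" 42))"))

(first *form*)   ; the symbol RULE
(third *form*)   ; the nested list (HELLO "world" 42)
```

This only works when the input already uses Lisp syntax; anything else needs a hand-written or generated parser, as discussed in the messages that follow.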
0:55:24
mobile_c
how can i parse something like this https://paste.pound-python.org/show/K4QGpELlMYw0VWUjbl37/ using lisp
0:56:10
elderK
mobile_c: If you'd like to learn about how the reader works, you can find out here: http://clhs.lisp.se/Body/02_.htm
0:56:34
didi
I think we are dismissing READ too quickly. It's amazing that we can do it. We should praise it more.
0:57:54
elderK
mobile_c: To parse something like that, you'll probably do a lot of stuff you're used to doing in other languages if you've parsed by hand. Or, you can learn to use one of the many parser-generator libraries available for Lisp.
0:59:10
mobile_c
as i want to parse it like a parser grammar (since technically at the moment it is very similar to one)
1:01:31
malice
With problem definition like this, I'd take a look at parser generators: https://www.cliki.net/parser%20generator
1:02:47
mobile_c
as the main problem is figuring out how to parse it like a rule definition/rule expansion
1:03:27
elderK
mobile_c: If you want to do it by hand, you could just write a simple lexer and recursive descent parser :)
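A hand-written lexer plus recursive descent parser might look roughly like the sketch below. The grammar here (integers, `+`, `*`, parentheses) is an assumption chosen for brevity, since the grammar in the paste is not shown in the log, and all function names are hypothetical.

```lisp
;; Lexer: turn a string into a list of tokens (integers and the
;; characters + * ( and )).
(defun tokenize (string)
  (let ((tokens '())
        (i 0)
        (n (length string)))
    (loop while (< i n)
          do (let ((c (char string i)))
               (cond ((member c '(#\Space #\Tab #\Newline))
                      (incf i))
                     ((digit-char-p c)
                      ;; Find the end of the run of digits.
                      (let ((end (or (position-if-not #'digit-char-p string
                                                      :start i)
                                     n)))
                        (push (parse-integer string :start i :end end) tokens)
                        (setf i end)))
                     ((find c "+*()")
                      (push c tokens)
                      (incf i))
                     (t (error "Unexpected character ~S at position ~D" c i)))))
    (nreverse tokens)))

(defvar *tokens* '() "Tokens not yet consumed by the parser.")

(defun peek-token () (first *tokens*))
(defun next-token () (pop *tokens*))

;; factor := integer | "(" expr ")"
(defun parse-factor ()
  (let ((token (next-token)))
    (cond ((integerp token) token)
          ((eql token #\()
           (let ((inner (parse-expr)))
             (unless (eql (next-token) #\))
               (error "Expected closing parenthesis"))
             inner))
          (t (error "Unexpected token ~S" token)))))

;; term := factor ("*" factor)*
(defun parse-term ()
  (let ((left (parse-factor)))
    (loop while (eql (peek-token) #\*)
          do (next-token)
             (setf left (list '* left (parse-factor))))
    left))

;; expr := term ("+" term)*
(defun parse-expr ()
  (let ((left (parse-term)))
    (loop while (eql (peek-token) #\+)
          do (next-token)
             (setf left (list '+ left (parse-term))))
    left))

(defun parse-string (string)
  (let ((*tokens* (tokenize string)))
    (let ((tree (parse-expr)))
      (when *tokens*
        (error "Trailing tokens: ~S" *tokens*))
      tree)))

(parse-string "1 + 2 * (3 + 4)")   ; => (+ 1 (* 2 (+ 3 4)))
```

Each grammar rule becomes one function, and the result is an ordinary list, so the rest of the program can treat the parse tree like any other Lisp data.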
1:03:59
malice
(also note that the site could use updating; some of the entries are 404 and there are probably a couple of new ones not listed)
1:05:34
malice
mobile_c: I'm afraid I don't understand the problem well enough to suggest an optimal solution. One of the things I do not understand is the need for a Lisp parser
1:06:14
malice
then I do not understand the goal - do we want any parser, some specific parser, what representation of AST should we produce, how do we handle the errors, etc.
1:08:59
malice
although writing your own parser won't be much different from other languages, I guess.
1:12:06
aeth
keep in mind that cliki is probably 15 years old, and not as popular as random github pages like https://github.com/CodyReichert/awesome-cl these days
1:14:56
malice
mobile_c: there's also rdp generator here: http://www.informatimago.com/develop/lisp/index.html
1:33:11
aeth
I think that for a reader macro as long as it returns (turns into?) one thing you can just quote it, but I could be wrongly generalizing from read-eval.
2:48:15
antonv
a library (an ASDF system) should choose a dependency (another ASDF system) based on what OS / distro it runs on
2:48:52
antonv
simply speaking, depending on OpenSSL version installed, we should choose an FFI wrapper to load
3:43:35
fiddlerwoaroof
however, if it's something like "which openssl version is installed", you might have to do a bit of work to get the features setup appropriately.
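One way the feature-based approach could look is sketched below, using ASDF's `(:feature ...)` dependency form. The system names and the `:my-openssl-1.1` feature are hypothetical; nothing in the log says which wrappers or feature names were meant, and the feature would have to be pushed onto `*features*` by some detection code before ASDF reads this definition.

```lisp
;; Hypothetical .asd fragment. ASDF only includes a (:feature ...)
;; dependency when the feature expression is true in *features*.
(asdf:defsystem "my-tls-client"
  :depends-on ((:feature :my-openssl-1.1
                "my-openssl-1.1-bindings")
               (:feature (:not :my-openssl-1.1)
                "my-openssl-1.0-bindings"))
  :components ((:file "client")))
```

The "bit of work" is the detection step itself, which is not shown here: something has to inspect the installed library and set the feature early enough.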
9:52:53
ogamita
elderK: to test reader macros easily, you can use read-from-string: (read-from-string "`(foo \"string\" ,x)") #| --> (list* 'foo (list* "string" (list x))) ; 18 |# be sure to escape double-quotes and backslashes!
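For a user-defined reader macro, the same testing trick might be sketched as follows. The bracket syntax and the names `read-bracket-list`, `*bracket-readtable*`, and `read-with-brackets` are invented for this example; a copy of the standard readtable is used so the global one stays untouched.

```lisp
;; Reader macro function: read forms up to the closing ] and
;; wrap them in a LIST call.
(defun read-bracket-list (stream char)
  (declare (ignore char))
  (cons 'list (read-delimited-list #\] stream t)))

(defparameter *bracket-readtable*
  (let ((rt (copy-readtable nil)))      ; fresh copy of the standard readtable
    (set-macro-character #\[ 'read-bracket-list nil rt)
    ;; Make ] terminate tokens the same way ) does.
    (set-macro-character #\] (get-macro-character #\) rt) nil rt)
    rt))

(defun read-with-brackets (string)
  (let ((*readtable* *bracket-readtable*))
    (values (read-from-string string))))

(read-with-brackets "[1 2 foo]")   ; => (LIST 1 2 FOO)
```

Because the result is returned rather than evaluated, it is easy to compare against the expected form in a test.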
9:55:29
ogamita
aeth: when you prefix a reader macro by a quote, this prevents what is read from being evaluated. So it should print what has been read. Unfortunately, the pretty printer, and even the printer, will often print some objects in a special way. For example: (prin1-to-string '(function foo)) #| --> "#'foo" |# instead of printing as a normal (function foo) list. You can use your own printing function to avoid this caveat, eg. (print-conses
9:55:29
ogamita
'(function foo)) #| (function . (foo . ())) --> #'foo |# ; notice how the result after --> is printed by cl:print.
10:23:22
ogamita
elderK: another trick when you are developing a reader macro is to use 'my-reader-macro instead of #'my-reader-macro in set-macro-character or set-dispatch-macro-character.
10:23:41
ogamita
elderK: with ' when you redefine the reader macro, it's taken into account immediately.
10:42:14
no-defun-allowed
The symbol adds indirection, so the name is looked up during a funcall instead of the caller being handed the old function object.
10:42:39
ogamita
it's used with apply or funcall, so a symbol denotes the global function of same name. (actually, symbol-function is used).
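The difference between passing the symbol and passing the function object can be seen in a small sketch. The names `greet`, `*by-object*`, and `*by-name*` are made up for the example.

```lisp
(defun greet () "hello")

(defparameter *by-object* #'greet)  ; captures the current function object
(defparameter *by-name* 'greet)     ; just the symbol, resolved at each call

;; Redefine the global function, as you would while developing.
(defun greet () "goodbye")

(funcall *by-object*)  ; => "hello"   (still the old function object)
(funcall *by-name*)    ; => "goodbye" (looked up through the symbol)
```

The same reasoning applies to a function installed with set-macro-character: a stored symbol picks up the redefinition, a stored function object does not.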
10:49:01
elderK
Is there any particular reason to say #'name rather than just 'name for say, apply or reduce or whatever?
10:50:20
jackdaniel
elderK: if you have (flet ((name () "foo")) …) then 'name will refer to a global function definition, while #'name will refer to the local one
10:51:28
jackdaniel
also #'foo gives you a function itself, so if you (let ((foo #'foo)) (loop (funcall foo))), it will always call the same function when looping (even if you redefine it in a different thread)
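A minimal sketch of the local versus global distinction, with the hypothetical function names `name` and `demo`:

```lisp
(defun name () "global")

(defun demo ()
  (flet ((name () "local"))
    (list (funcall 'name)     ; symbol: always the global definition
          (funcall #'name)    ; FUNCTION: the lexically visible one
          (name))))           ; a plain call also sees the local one

(demo)  ; => ("global" "local" "local")
```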
11:08:46
jackdaniel
you probably want to replace lambda and uiop piggyback with your own function working on strings
11:15:26
elderK
If you have trouble like, navigating the CLHS as it is online, Zeal could really help :)
11:16:51
jmercouris
I know about sort, didn't know how to write the predicate to compare two strings and see which one has alphabetical precedence
11:19:06
jackdaniel
jmercouris: lexicographic in uiop is there for lists, that's why I did coerce them
11:21:28
White_Flame
the inequality tests are based on char< etc, which do defer to implementation specifics
11:24:02
elderK
jmercouris: string<= doesn't seem to be implementation-specific. Although I guess like, if you like, want to compare unicode strings in an ASCII-only Lisp, I guess you would hit trouble.
11:24:39
White_Flame
it relies on character codes, which used to not be very standardized, and thus implementation specific
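For the sorting question, a minimal sketch using the standard string predicates directly. The wrapper `sort-strings` is a hypothetical helper; it copies its input because `sort` is destructive. The standard fixes the relative order of a to z and of A to Z, but not the order between upper and lower case, so `string-lessp` is used for the mixed-case example.

```lisp
(defun sort-strings (strings &key (predicate #'string<))
  "Return a sorted copy of STRINGS, ordered by PREDICATE."
  (sort (copy-seq strings) predicate))

(sort-strings (list "pear" "apple" "banana"))
;; => ("apple" "banana" "pear")

;; STRING-LESSP ignores case.
(sort-strings (list "pear" "Apple" "banana") :predicate #'string-lessp)
;; => ("Apple" "banana" "pear")

;; On success the predicates return the index of the first mismatch.
(string< "abc" "abd")  ; => 2
```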
11:28:01
elderK
White_Flame: Right, so it's "super portable" only if you stick to like, the "standard ASCII" stuff, right?
11:28:29
elderK
As soon as you say, have any kind of international characters - you need to either use an implementation that supports Unicode for its stuff, or implement your own predicates, right?
11:30:09
White_Flame
if they didn't, I don't believe you'd be able to use character types to represent unicode characters. (but I wouldn't bet my life on it)
11:31:04
White_Flame
ah, there's CHAR-CODE-LIMIT, which things will refuse to work with if you go outside fo
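A quick way to check what a given implementation can represent is sketched below. The function name `representable-code-p` is hypothetical; `char-code-limit` and `code-char` are standard. The result for codes beyond ASCII depends on the implementation, so no output is shown for those.

```lisp
(defun representable-code-p (code)
  "True if CODE is below CHAR-CODE-LIMIT and names a character here."
  (and (integerp code)
       (< -1 code char-code-limit)
       ;; CODE-CHAR returns NIL when no character has that code.
       (characterp (code-char code))))

(representable-code-p 65)               ; true: #\A is a standard character
(representable-code-p char-code-limit)  ; => NIL, by definition out of range
(representable-code-p #x1F4A9)          ; depends on the implementation
```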
11:31:49
elderK
White_Flame: Right. So, much like in other languages that don't natively support Unicode strings, you'd have to implement your own stuff.
11:32:09
elderK
That's no major issue though. At least, not if you just want codepoints. Unicode, while annoying, is easy enough to decode.
11:32:46
Bike
you could use flexi streams or the like to manipulate things to some extent, if there were no characters past ascii or anything. it would suck though.
11:34:07
elderK
I imagine as long as you can read binary, you can read Unicode. Just to varying degrees of "annoying."
11:34:23
Bike
well, right, it takes care of that. but you wouldn't be able to manipulate the result as lisp strings.
11:35:27
elderK
How would you add support if you so wanted? Would you have to go as far as creating like, your own types and predicates and everything?
11:35:56
Bike
the set of characters is determined by the implementation and can't be extended by users.
11:35:56
elderK
I guess you would. Maybe implement a code-point type, create predicates for that, then define "unicode strings" on that.
11:36:29
White_Flame
basically, dig into the implementation and send a pull request when you're done :-P
11:37:27
elderK
jmercouris: As far as I am concerned, at least compared to the languages I usually deal with, CL is pretty well built and is quite flexible.
11:38:13
jmercouris
maybe I'm just in an argumentative mood, but it doesn't seem like an oversight to me
11:38:44
elderK
Taken in context, it doesn't. I mean, shit, C's support for strings of any kind is kind of crap :P
11:39:13
elderK
And White_Flame is right: Back then, well, it was by the platform. Maybe you had code-pages or something, maybe not.
11:39:50
White_Flame
and as Bike listed, this decision on what the character encoding is is wound up in all sorts of ways in character & string handling
11:40:16
jmercouris
hindsight is always 20/20 and, as far as I understand, it is mostly up to the implementation to decide
11:40:21
jackdaniel
for instance extending valid character set would require implementations to allow specializing stream encoders and decoders
11:40:40
White_Flame
the ability to define new character code ranges would mess with the low level byte representation of internal characters/strings, which is beyond the recompilation sensibilities of the day
11:41:14
elderK
Still, I guess it only really matters if you're reading stuff that you intend to evaluate or something anyway, right?
11:41:36
elderK
Like, if it's just strictly read in, do stuff, write out. You can deal with Unicode yourself.
11:41:42
White_Flame
anything you specifically want as a character or string would require the characters' codes to be within range
11:43:41
White_Flame
but in actuality that char code limit was established long before the emoji craze
11:43:52
elderK
But still, just to get an answer: It is really an issue if and only if you're actually say, relying on CL's native string stuff, right? If you aren't say, intending on having the user give you stuff to read-from-string, or if you aren't reading "text streams", it's not really an issue, right?
11:44:27
White_Flame
again, anything you specifically want as a character or string type in the runtime at some point would require the characters' codes to be within range
11:44:50
White_Flame
if you're reading from a byte stream, it will decode in whatever ways it supports
11:44:55
jackdaniel
while emoji does seem silly at first sight, adding pictograms to unicode doesn't anymore
11:46:03
White_Flame
emoji is not creating a record of existing glyphs, it's inventing new ones for the purpose
11:46:06
jackdaniel
jmercouris: you don't see any utility in simple pictograms being part of the charset?
11:46:06
elderK
White_Flame: I mean like, if you're reading as binary, not as text, straight bytes. Uninterpreted or altered. And you, yourself, perform decoding and implement your own predicates, etc. As long as you aren't then saying: Yo, CL, read this <bunch of raw stuff>, it's not going to matter.
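The do-it-yourself route described here might be sketched as a UTF-8 decoder that yields integer code points plus a comparison predicate on them. Both function names are hypothetical, and the decoder assumes well-formed input: it does not validate continuation bytes, overlong forms, or surrogates, and it does no normalization.

```lisp
(defun decode-utf-8 (bytes)
  "Return a list of integer code points decoded from BYTES, a sequence of octets."
  (let ((octets (coerce bytes 'list))
        (code-points '()))
    (loop while octets
          do (let* ((lead (pop octets))
                    ;; Number of continuation bytes implied by the lead byte.
                    (extra (cond ((< lead #x80) 0)
                                 ((= (ldb (byte 3 5) lead) #b110) 1)
                                 ((= (ldb (byte 4 4) lead) #b1110) 2)
                                 ((= (ldb (byte 5 3) lead) #b11110) 3)
                                 (t (error "Invalid lead byte ~X" lead))))
                    ;; Payload bits carried by the lead byte itself.
                    (code (if (zerop extra)
                              lead
                              (ldb (byte (- 6 extra) 0) lead))))
               (dotimes (i extra)
                 (let ((next (pop octets)))
                   (unless next
                     (error "Truncated UTF-8 sequence"))
                   ;; Each continuation byte contributes its low 6 bits.
                   (setf code (logior (ash code 6) (ldb (byte 6 0) next)))))
               (push code code-points)))
    (nreverse code-points)))

(defun code-points< (a b)
  "Lexicographic comparison of two lists of code points."
  (loop for x in a
        for y in b
        do (cond ((< x y) (return t))
                 ((> x y) (return nil)))
        finally (return (< (length a) (length b)))))

;; "Hi" followed by U+00E9 and U+20AC, given as raw octets.
(decode-utf-8 #(72 105 #xC3 #xA9 #xE2 #x82 #xAC))
;; => (72 105 233 8364)
```

The octets could come from a file opened with `:element-type '(unsigned-byte 8)`. As the later replies point out, the result is a list of integers, not a Lisp string, so the built-in string, character, and READ utilities do not apply to it.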
11:46:42
elderK
jackdaniel: I would rather useful pictograms be added. Not things like poop or cats.
11:46:46
jackdaniel
such "signs" are universally recognizable disregarding written language knowledge
11:47:06
elderK
White_Flame: Right. So, I'm saying if that isn't necessary and you aren't using string<= and stuff, then not having native Unicode support is not necessarily a killer.
11:47:35
White_Flame
elderK: depends on what you mean by "killer". It means you can't use any string, character, READ, etc utilities
11:48:01
jmercouris
jackdaniel: ok, that's a pretty convincing argument actually, however let's say I disagree about WHICH emojis are necessary or not
11:48:02
White_Flame
it would probably make sense to marshall such a thing into some custom escaped string representation
11:48:20
jackdaniel
this list http://unicode.org/emoji/charts/full-emoji-list.html doesn't seem half bad
11:48:29
jmercouris
jackdaniel: for example, I can imagine a pictogram indicating "restroom" would be very useful
11:49:15
Bike
elderK: i mean it's like implementing arithmetic yourself. you can do it but it's not great.
11:50:07
jackdaniel
you disagree with about, say, 256 characters in a million, I'm sure someone found the cowboy character useful if it's there
11:50:27
elderK
Bike: Aye. A couple years back I implemented a whole set of such encoding / decoding libraries in C for a project I was doing.
11:50:31
White_Flame
aren't there a ton of combining forms for them as well as skin color modifiers and other meta stuff adding to implementation woes?
11:50:41
elderK
I learned a lot. And it was just pure code-point decoding, no normalization or... you know, the harder stuff