freenode/lisp - IRC Chatlog
9:52:53
ogamita
elderK: to test reader macros easily, you can use read-from-string: (read-from-string "`(foo \"string\" ,x)") #| --> (list* 'foo (list* "string" (list x))) ; 18 |# be sure to escape double-quotes and backslashes!
9:55:29
ogamita
aeth: when you prefix a reader macro with a quote, this prevents what is read from being evaluated. So it should print what has been read. Unfortunately, the pretty printer, and even the printer, will often print some objects in a special way. For example: (prin1-to-string '(function foo)) #| --> "#'foo" |# instead of printing as a normal (function foo) list. You can use your own printing function to avoid this caveat, e.g. (print-conses '(function foo)) #| (function . (foo . ())) --> #'foo |# ; notice how the result after --> is printed by cl:print.
10:23:22
ogamita
elderK: another trick when you are developing a reader macro is to use 'my-reader-macro instead of #'my-reader-macro in set-macro-character or set-dispatch-macro-character.
10:23:41
ogamita
elderK: with ' when you redefine the reader macro, it's taken into account immediately.
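A minimal sketch combining both of ogamita's tips: the reader macro is installed by symbol on a private readtable and exercised with read-from-string. The names bang-reader, *test-readtable* and read-with-test-readtable are made up for illustration, and ! is an arbitrary choice of macro character.

```lisp
;; Hypothetical reader macro: reads the next form and wraps it as (:BANG form).
(defun bang-reader (stream char)
  (declare (ignore char))
  (list :bang (read stream t nil t)))

;; Work on a copy of the standard readtable so the global one stays untouched.
(defparameter *test-readtable* (copy-readtable nil))

;; Note the quoted symbol 'bang-reader rather than #'bang-reader.
(set-macro-character #\! 'bang-reader nil *test-readtable*)

(defun read-with-test-readtable (string)
  "Read one form from STRING using *TEST-READTABLE*."
  (let ((*readtable* *test-readtable*))
    (values (read-from-string string))))

;; (read-with-test-readtable "!42") ; => (:BANG 42)
```

Since the readtable was given the symbol, re-evaluating the defun is meant to be picked up by the next read without calling set-macro-character again, as described above. Whether a particular implementation keeps the symbol or coerces it to a function object is worth checking at the REPL.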
10:42:14
no-defun-allowed
The symbol adds indirection, so the name is looked up during a funcall instead of being handed the old function object.
10:42:39
ogamita
it's used with apply or funcall, so a symbol denotes the global function of same name. (actually, symbol-function is used).
10:49:01
elderK
Is there any particular reason to say #'name rather than just 'name for say, apply or reduce or whatever?
10:50:20
jackdaniel
elderK: if you have (flet ((name () "foo")) …) then 'name will refer to a global function definition, while #'name will refer to the local one
10:51:28
jackdaniel
also #'foo gives you a function itself, so if you (let ((foo #'foo)) (loop (funcall foo))), it will always call the same function when looping (even if you redefine it in a different thread)
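Both of jackdaniel's points can be checked in a few lines. This is only a sketch: greet, demo-flet and demo-capture are illustrative names, and the notinline declamation is there so that the redefinition is honored even under compile-file.

```lisp
(declaim (notinline greet))
(defun greet () "global")

(defun demo-flet ()
  "'GREET names the global function, #'GREET the lexically visible one."
  (flet ((greet () "local"))
    (list (funcall 'greet) (funcall #'greet))))

(defun demo-capture ()
  "A captured function object keeps pointing at the old definition."
  (let ((captured #'greet))
    ;; Redefine the global function after capturing the object.
    (setf (fdefinition 'greet) (lambda () "redefined"))
    (list (funcall captured) (funcall 'greet))))

;; Called in this order on a fresh image:
;; (demo-flet)    ; => ("global" "local")
;; (demo-capture) ; => ("global" "redefined")
```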
11:08:46
jackdaniel
you probably want to replace lambda and uiop piggyback with your own function working on strings
11:15:26
elderK
If you have trouble like, navigating the CLHS as it is online, Zeal could really help :)
11:16:51
jmercouris
I know about sort, I didn't know how to write the predicate to compare two strings and see which one has alphabetical precedence
11:19:06
jackdaniel
jmercouris: lexicographic in uiop is there for lists, that's why I did coerce them
11:21:28
White_Flame
the inequality tests are based on char< etc, which do defer to implementation specifics
11:24:02
elderK
jmercouris: string<= doesn't seem to be implementation-specific. Although I guess like, if you want to compare unicode strings in an ASCII-only Lisp, I guess you would hit trouble.
11:24:39
White_Flame
it relies on character codes, which used to not be very standardized, and thus implementation specific
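For the sorting question itself, the standard string predicates can be passed straight to sort. A small sketch, with sort-strings and sort-strings-ignoring-case as illustrative names; the expected results only rely on the ordering of lowercase letters, which the standard does fix, while the relative order of upper and lower case under string< is left to the implementation.

```lisp
(defun sort-strings (strings)
  "Return a fresh list of STRINGS sorted with STRING< (case-sensitive)."
  (sort (copy-list strings) #'string<))

(defun sort-strings-ignoring-case (strings)
  "Return a fresh list of STRINGS sorted with STRING-LESSP (case-insensitive)."
  (sort (copy-list strings) #'string-lessp))

;; (sort-strings '("pear" "apple" "fig"))               ; => ("apple" "fig" "pear")
;; (sort-strings-ignoring-case '("pear" "Apple" "fig")) ; => ("Apple" "fig" "pear")

;; Every character code is below CHAR-CODE-LIMIT; its value is implementation-defined.
;; (< (char-code #\z) char-code-limit) ; => T
```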
11:28:01
elderK
White_Flame: Right, so it's "super portable" only if you stick to like, the "standard ASCII" stuff, right?
11:28:29
elderK
As soon as you say, have any kind of international characters - you need to either use an implementation that supports Unicode for its stuff, or implement your own predicates, right?
11:30:09
White_Flame
if they didn't, I don't believe you'd be able to use character types to represent unicode characters. (but I wouldn't bet my life on it)
11:31:04
White_Flame
ah, there's CHAR-CODE-LIMIT, which things will refuse to work with if you go outside of it
11:31:49
elderK
White_Flame: Right. So, much like in other languages that don't natively support Unicode strings, you'd have to implement your own stuff.
11:32:09
elderK
That's no major issue though. At least, not if you just want codepoints. Unicode, while annoying, is easy enough to decode.
11:32:46
Bike
you could use flexi streams or the like to manipulate things to some extent, if there were no characters past ascii or anything. it would suck though.
11:34:07
elderK
I imagine as long as you can read binary, you can read Unicode. Just to varying degrees of "annoying."
11:34:23
Bike
well, right, it takes care of that. but you wouldn't be able to manipulate the result as lisp strings.
11:35:27
elderK
How would you add support if you so wanted? Would you have to go as far as creating like, your own types and predicates and everything?
11:35:56
Bike
the set of characters is determined by the implementation and can't be extended by users.
11:35:56
elderK
I guess you would. Maybe implement a code-point type, create predicates for that, then define "unicode strings" on that.
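A rough sketch of the "pure code-point decoding" elderK describes: octets go in, integers come out, and no Lisp character is ever created, so char-code-limit does not come into play. decode-utf-8 is a made-up name, and the function assumes well-formed UTF-8; it does no validation of continuation bytes, overlong forms or truncated input.

```lisp
(defun decode-utf-8 (octets)
  "Return a list of code points (integers) decoded from the vector OCTETS.
Sketch only: assumes well-formed UTF-8 and performs no validation."
  (let ((i 0)
        (n (length octets))
        (result '()))
    (flet ((next-octet ()
             (prog1 (aref octets i) (incf i))))
      (loop while (< i n)
            do (let* ((lead (next-octet))
                      ;; Number of continuation octets implied by the lead octet.
                      (extra (cond ((< lead #x80) 0)
                                   ((< lead #xE0) 1)
                                   ((< lead #xF0) 2)
                                   (t 3)))
                      ;; Payload bits carried by the lead octet itself.
                      (code-point (case extra
                                    (0 lead)
                                    (1 (logand lead #x1F))
                                    (2 (logand lead #x0F))
                                    (t (logand lead #x07)))))
                 ;; Each continuation octet contributes its low six bits.
                 (dotimes (k extra)
                   (setf code-point
                         (logior (ash code-point 6)
                                 (logand (next-octet) #x3F))))
                 (push code-point result))))
    (nreverse result)))

;; "A", U+00E9, U+20AC, U+1F600:
;; (decode-utf-8 #(#x41 #xC3 #xA9 #xE2 #x82 #xAC #xF0 #x9F #x98 #x80))
;; => (65 233 8364 128512)
```

As Bike notes, the result is a list of numbers, not a Lisp string, so none of the string or character utilities apply to it.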
11:36:29
White_Flame
basically, dig into the implementation and send a pull request when you're done :-P
11:37:27
elderK
jmercouris: As far as I am concerned, at least compared to the languages I usually deal with, CL is pretty well built and is quite flexible.
11:38:13
jmercouris
maybe I'm just in an argumentative mood, but it doesn't seem like an oversight to me
11:38:44
elderK
Taken in context, it doesn't. I mean, shit, C's support for strings of any kind is kind of crap :P
11:39:13
elderK
And White_Flame is right: Back then, well, it was by the platform. Maybe you had code-pages or something, maybe not.
11:39:50
White_Flame
and as Bike listed, this decision on what the character encoding is is wound up in all sorts of ways in character & string handling
11:40:16
jmercouris
hindsight is always 20/20 and, as far as I understand, it is mostly up to the implementation to decide
11:40:21
jackdaniel
for instance extending valid character set would require implementations to allow specializing stream encoders and decoders
11:40:40
White_Flame
the ability to define new character code ranges would mess with the low level byte representation of internal characters/strings, which is beyond the recompilation sensibilities of the day
11:41:14
elderK
Still, I guess it only really matters if you're reading stuff that you intend to evaluate or something anyway, right?
11:41:36
elderK
Like, if it's just strictly read in, do stuff, write out. You can deal with Unicode yourself.
11:41:42
White_Flame
anything you specifically want as a character or string would require the characters' codes to be within range
11:43:41
White_Flame
but in actuality that char code limit was established long before the emoji craze
11:43:52
elderK
But still, just to get an answer: It is really an issue if and only if you're actually say, relying on CL's native string stuff, right? If you aren't say, intending on having the user give you stuff to read-from-string, or if you aren't reading "text streams", it's not really an issue, right?
11:44:27
White_Flame
again, anything you specifically want as a character or string type in the runtime at some point would require the characters' codes to be within range
11:44:50
White_Flame
if you're reading from a byte stream, it will decode in whatever ways it supports
11:44:55
jackdaniel
while emoji does seem silly at first sight, adding pictograms to unicode doesn't anymore
11:46:03
White_Flame
emoji is not creating a record of existing glyphs, it's inventing new ones for the purpose
11:46:06
jackdaniel
jmercouris: you don't see any utility in simple pictograms being part of the charset?
11:46:06
elderK
White_Flame: I mean like, if you're reading as binary, not as text, straight bytes. Uninterpreted and unaltered. And you, yourself, perform decoding and implement your own predicates, etc. As long as you aren't then saying: Yo, CL, read this <bunch of raw stuff>, it's not going to matter.
11:46:42
elderK
jackdaniel: I would rather useful pictograms be added. Not things like poop or cats.
11:46:46
jackdaniel
such "signs" are universally recognizable disregarding written language knowledge
11:47:06
elderK
White_Flame: Right. So, I'm saying if that isn't necessary and you aren't using string<= and stuff, then not having native Unicode support is not necessarily a killer.
11:47:35
White_Flame
elderK: depends on what you mean by "killer". It means you can't use any string, character, READ, etc utilities
11:48:01
jmercouris
jackdaniel: ok, that's a pretty convincing argument actually, however let's say I disagree about WHICH emojis are necessary or not
11:48:02
White_Flame
it would probably make sense to marshall such a thing into some custom escaped string representation
11:48:20
jackdaniel
this list http://unicode.org/emoji/charts/full-emoji-list.html doesn't seem half bad
11:48:29
jmercouris
jackdaniel: for example, I can imagine a pictogram indicating "restroom" would be very useful
11:49:15
Bike
elderK: i mean it's like implementing arithmetic yourself. you can do it but it's not great.
11:50:07
jackdaniel
you disagree about, say, 256 characters in a million, I'm sure someone found the cowboy character useful if it's there
11:50:27
elderK
Bike: Aye. A couple years back I implemented a whole set of such encoding / decoding libraries in C for a project I was doing.
11:50:31
White_Flame
aren't there a ton of combining forms for them as well as skin color modifiers and other meta stuff adding to implementation woes?
11:50:41
elderK
I learned a lot. And it was just pure code-point decoding, no normalization or... you know, the harder stuff
11:51:51
dim
should you use memcpy, strcpy, strncpy, strlcpy, snprintf, xsnprintf or something else?
11:51:54
jackdaniel
while it is not conforming, I think that treating C strings as byte arrays makes much more sense (sanity-wise)
11:52:12
White_Flame
and then there's C++ where there's a kabillion incompatible string implementations
11:53:49
dim
having a character type and then arrays/vectors of characters looks like a simple and effective way of handling strings, I wonder why so few programming languages are doing that
11:55:00
elderK
:) I hail from the land of C so I'm pretty aware of its issues. So, I'm pretty happy with CL :)
11:55:18
jackdaniel
dim: not so fast, C did that already and we all agree it is hell. you need a specific array implementation too
11:56:00
dim
jackdaniel: C doesn't have a type for characters, it has signed/unsigned 8-bits thing named char, that's very different
11:57:21
jackdaniel
I first looked here: https://en.wikipedia.org/wiki/C_data_types , but usually I refer to c99 standard when I look up things
11:58:39
White_Flame
given that string literals become char arrays/pointers, that strongly indicates that a char is in fact a character
11:59:18
dim
I like the character data type in CL where you are actually dealing with characters and then you may represent them in a different external-format if needed, or even read them from a different external-format
11:59:55
dim
White_Flame: but it's not really, it's just a number that fits in a byte. You can't represent multi-byte unicode characters in a single char in C! it's not a character…
12:00:35
elderK
dim: And who's to say some CL implementation's character type also supports multibyte characters?
12:01:15
White_Flame
what C defines as a character does not have to include anybody else's definition of a character, including unicode's
12:01:56
dim
nope, the CL standard makes it so that you may do whatever you want to support the implementation's character range, the C standard says that a char is the smallest addressable unit (usually an 8-bit byte) without respect for the character encodings / code points you might need to fit in there
12:01:59
jackdaniel
dim: i.e here (c99), there is a section about multibyte characters: http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf (section 5.2)
12:02:29
dim
CL is pretty good at UTF-8 and the like, using its standard API and data types from the 80s
12:02:45
White_Flame
C defines what you fit in its character type. It owns that type and its definition, nothing else does
12:03:06
jackdaniel
dim: smallest addressable unit may be UTF-8 or whatever. afair lisp machine compilers for C had address length = 1 and mapped char into host lisp characters
12:03:18
White_Flame
it's not a _useful_ character type in the grand scheme of things, but it is C's character type
12:04:32
jackdaniel
I agree that C strings are disasters as they are handled, but character is implementation dependent (like in CL) and they are kept in arrays (and, yuck, null terminated)
12:05:06
elderK
jackdaniel: Well, it's all trade-offs. It's terminate them in some way or store their length in some other way.
12:05:33
dim
IME it's only been arrays of bytes, for any practical purposes you can't handle them as arrays of something that would know the current encoding…
12:05:33
elderK
And then you have to say: How many bytes will I use to store the length? Or will I encode the strings in an interesting way so that I can tell how many bytes are required for the length?
12:06:29
elderK
dim: Well, sure. But C is arguably a lower-level language than CL. So, it makes sense that if you need such support, you implement it or use a library.
12:06:51
dim
elderK: PostgreSQL has a good variable-length data-type for that with a length header that's either 1-byte or 4-bytes, limiting the size of varchar/text columns to 1GB - 4 byte, which is practical
12:07:32
elderK
dim: I'm just saying that such decisions need to be made. And well, sometimes it is not always so clear cut which to do.
12:07:49
jackdaniel
having a null canary certainly seemed to make sense at some point of time, but from the historical perspective it was a huge mistake
12:09:44
jackdaniel
dim: this is a fun read: https://begriffs.com/posts/2018-11-15-c-portability.html, they even mention symbolics c implementation
12:09:59
jackdaniel
https://archive.org/details/bitsavers_symbolicssGuidetoSymbolicsC_543959/page/n1 ← and Symbolics C manual
12:10:02
shka__
White_Flame: you are saying that essentially it was all because strings were doing double duty as streams?
12:11:21
White_Flame
it's my interpretation of history I wasn't a part of, so here's your grain of salt
12:11:40
jackdaniel
I think that some phrasing in the original ANSI C standard would be different if they hadn't taken Lisp Machines into account
12:12:33
jackdaniel
elderK: this may be also interesting to you: https://github.com/vsedach/Vacietis
12:16:20
elderK
God, it would be a dream to understand that kind of thing. To know CL well enough to help out froggey.
12:17:03
jackdaniel
elderK: I believe there are tasks which may be worked out with a modest CL knowledge
12:17:18
jmercouris
so I have a bit of a problem, I have a struct, that has a field which is a list of strings
12:17:32
elderK
I've always wanted to get into them. The farthest I've got is learning about parsing, like, recursive descent and LL1 and stuff.
12:17:57
jmercouris
HOWEVER, when I try to generate an equivalent hashmap to retrieve the key, it is not retrieved
12:19:13
dim
default is 'eql and that means you're only trying to find the same struct instance (pointer equality), not another struct that shares common elements
12:19:45
jackdaniel
show the code, I'm sure you'll have some mistake there which we'll be able to spot (instead of guessing)
12:23:03
jmercouris
interesting, it is working in the repl, there must be something different going on in my code
12:23:56
jackdaniel
first trying, then preparing code and at last asking may be a better strategy than a reversed one ;-)
12:28:20
elderK
jackdaniel: Yes - but if you compare a slot that is not like, a string or something, say another structure.
12:29:25
ogamita
jackdaniel: yes, 1500 emojis is a big number. Instead, we could have 1444 combining glyphs of 1 pixel (38 high x 38 wide), and you could draw any emoji up to 38x38 using combinations!
12:29:31
elderK
What about if a slot is an array? And that array has references to other structure instances?
12:30:06
ogamita
jackdaniel: or once you've noted the silliness, just extend unicode with a graphic language to draw anything.
12:32:36
jackdaniel
namely this: https://common-lisp.net/project/cdr/document/8/cleqcmp.html (re equals predicate)
12:36:49
ogamita
So #S(NEXT::KEY-CHORD :KEY-CODE NIL :KEY-STRING "l" :MODIFIERS ("C")) and #S(NEXT::KEY-CHORD :KEY-CODE NIL :KEY-STRING "L" :MODIFIERS ("c")) will be equalp.
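The hash-table behaviour under discussion can be reproduced with a stand-in structure. In this sketch key-chord is a local defstruct with the three slots visible in the printed form above, not the real NEXT::KEY-CHORD, and lookup-demo is an illustrative name.

```lisp
;; Stand-in for the structure printed above; slot names taken from that output.
(defstruct key-chord key-code key-string modifiers)

(defun lookup-demo ()
  (let ((eql-table (make-hash-table))                    ; default :test is EQL
        (equalp-table (make-hash-table :test 'equalp))
        (stored (make-key-chord :key-string "l" :modifiers (list "C")))
        (same-contents (make-key-chord :key-string "l" :modifiers (list "C")))
        (other-case (make-key-chord :key-string "L" :modifiers (list "c"))))
    (setf (gethash stored eql-table) :found
          (gethash stored equalp-table) :found)
    (list (gethash same-contents eql-table)       ; different instance, no match
          (gethash same-contents equalp-table)    ; slot-wise comparison matches
          (gethash other-case equalp-table))))    ; EQUALP also ignores case

;; (lookup-demo) ; => (NIL :FOUND :FOUND)
```

The third result is the catch ogamita points out: with an equalp table, "l" with modifier "C" and "L" with modifier "c" are the same key.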
12:39:11
heisig
elderK: Use eq to check for object identity (or, forget about eq and always use eql), use eql for things that might also be numbers or characters, and use equal for s-expressions. Don't use equalp unless you know what you are doing.
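A quick sketch of how the four predicates differ on the same data; equality-demo is an illustrative name and the values are arbitrary.

```lisp
(defun equality-demo ()
  (let ((a (list 1 "two" #\3))
        (b (list 1 "two" #\3)))
    (list (eq a a)                ; same object
          (eql a b)               ; distinct conses, so not EQL
          (equal a b)             ; same structure and contents
          (equal "two" "TWO")     ; EQUAL is case-sensitive on strings
          (equalp "two" "TWO")    ; EQUALP ignores case
          (eql 2 2.0)             ; different numeric types
          (equalp 2 2.0))))       ; EQUALP compares numbers with =

;; (equality-demo) ; => (T NIL T NIL T NIL T)
```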
12:47:06
ogamita
elderK: there is no incorrect use of equalp! or of any other operator. Just read the specification, and see if it's adapted to your need.
12:48:30
ogamita
(list (string= 'pascal "Pascal") (string-equal 'pascal "Pascal")) #| --> (nil t) |#
12:49:10
ogamita
Notice how string= compares string designators (including characters and symbols), while equal compares objects (same type).
12:52:34
ogamita
(= (mismatch "hello world!" "aworlding" :start1 6 :start2 1 :end2 7) (+ 6 (- 7 1) -1)) #| --> t |#
12:53:39
ogamita
can also work with sequences: (= (mismatch "hello world!" '(#\a #\w #\o #\r #\l #\d #\i #\n #\g) :start1 6 :start2 1 :end2 7) (+ 6 (- 7 1) -1)) #| --> t |#
16:10:30
jfrancis
Anyone here happen to be good with cl-json? I've got an issue that's making me insane. Been using it for years, first time I've run into this.
16:11:26
jfrancis
If I have this bit of JSON: {"sources":{"include":[[{"label":{"href":"/orgs/1/labels/41"}}]],"exclude":[{"ip_address":{"value":"10.1.0.27"}}]}}
16:11:39
jfrancis
when I do this: (princ (json:encode-json-alist-to-string (json:decode-json-from-string "{\"sources\":{\"include\":[[{\"label\":{\"href\":\"/orgs/1/labels/41\"}}]],\"exclude\":[{\"ip_address\":{\"value\":\"10.1.0.27\"}}]}}")))
16:11:50
jfrancis
I get this: {"sources":[["include",[{"label":{"href":"\/orgs\/1\/labels\/41"}}]],["exclude",{"ip_address":{"value":"10.1.0.27"}}]]}
16:12:28
jfrancis
Note that "sources" gets magically transformed into an array. Which breaks my REST call horribly. Obviously, I'm not just converting to a list, then back. I'm doing useful stuff in between.
16:13:48
jfrancis
But that's the simplest test case I can boil it down to. I *could* re-write everything with yason. But it's vastly different in the way it works, and will require re-writing three years of code (thankfully, 80% of it is in one file, but not 100%). Would prefer not to do that.
16:15:18
dlowe
the json decoding doesn't leave distinguishing markers on whether something is a list or an object
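The ambiguity dlowe mentions can be shown without any JSON library at all. In this sketch the two variables are hand-built stand-ins for a decoded JSON object and a decoded JSON array; they are not actual cl-json output, and the names are made up.

```lisp
;; A JSON object {"include": [1, 2]} represented as an alist:
;; one (key . value) pair whose value is the list (1 2).
(defparameter *object-as-alist* (list (cons "include" (list 1 2))))

;; A JSON array [["include", 1, 2]] represented as nested lists.
(defparameter *array-of-arrays* (list (list "include" 1 2)))

(defun same-shape-p ()
  "True when the two representations are indistinguishable to EQUAL."
  (equal *object-as-alist* *array-of-arrays*))

;; (same-shape-p) ; => T
```

A pair whose cdr is a list is itself just a longer list, so an encoder handed such data has to guess whether it was meant as an object or as an array. Representations built from hash tables and vectors do not have this problem.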
16:16:49
jfrancis
Yeah. And I could actually live with that if I had a way to force cl-json to do what I want. I already have to do that with things like an empty list (you have to do (make-array 0) to force it to encode "[]" instead of "false").
16:16:56
_death
personally I recommend com.gigamonkeys.json that does roundtripping correctly without tweaking configuration
16:24:00
jfrancis
That does work. But the amount of effort required to convert three years of code from the cl-json way of doing things to the gigamonkeys way makes me sad.
16:24:07
jfrancis
(princ (com.gigamonkeys.json:json (com.gigamonkeys.json:parse-json "{\"sources\":{\"include\":[[{\"label\":{\"href\":\"/orgs/1/labels/41\"}}]],\"exclude\":[{\"ip_address\":{\"value\":\"10.1.0.27\"}}]}}")))
16:24:14
jfrancis
{"sources":{"include":[[{"label":{"href":"/orgs/1/labels/41"}}]],"exclude":[{"ip_address":{"value":"10.1.0.27"}}]}}
16:25:48
jfrancis
I was hoping there was a way to hack/coerce/force cl-json to do what I want. I build most of the objects in my code, anyway. I could just build them the way cl-json wants them to give me what I need.
16:26:21
sjl_
This is why the #1 most important feature for any JSON lib I use is "encodes and decodes unambiguously"
16:27:02
dlowe
json:set-decoder-simple-clos-semantics *seems* like it could work, but it barfs on your example json
16:28:31
_death
iirc there is a way to make cl-json work, but it will of course change the representation
16:28:32
jfrancis
It's much more straightforward to build lists, then convert them, than to build hashes and/or hashes and vectors.
16:29:14
jackdaniel
jfrancis: jsown is unambiguous and fastest (according to benchmarks we did some time ago), though it is a little pita to use it with all `(:obj …) stuff
16:29:54
jackdaniel
yason otoh is the most convenient library (but I use it only for repl, production code works on jsown)
16:30:11
jfrancis
Oddly enough, performance is about #1045 on my list of priorities. The elapsed time of the REST API call dwarfs any compute time in my code.
16:32:08
pfdietz
I think you want an encoder/decoder that converts json boolean into something other than T/NIL. :true and :false, maybe.
16:32:23
sjl_
You can make hash/vector use less awkward with some greasy generic functions, e.g. https://github.com/sjl/cacl/blob/master/src/base.lisp#L77-L90
16:32:55
sjl_
It's still a little clumsy, but at least things will be unambiguous and actually work correctly.
16:34:32
sjl_
And you can always go full Clojure and define a #{} reader macro to wrap alexandria:alist-hash-table
16:35:02
jfrancis
I can work around the :true/:false t/nil issue, given how rigorously the API is defined (ie, I can guess correctly 100% of the time). The issue I can't work around is that cl-json improperly converts my data into an array instead of a key/value.
17:06:08
fiddlerwoaroof
jfrancis: you could probably also use these special variables to make decoding unambiguous: https://common-lisp.net/project/cl-json/cl-json.html#DECODER-CUSTOMIZATION
17:13:11
jmercouris
I'm hesitant to iterate over the hashtable keys and do remhash because I have a feeling it won't like that
17:15:16
jackdaniel
(let ((to-remove nil)) (maphash #'(lambda (k v) (when (pred v) (push v to-remove))) ht) (mapc #'remhash* to-remove))
17:17:00
_death
jmercouris: the standard guarantees that you should be able to do that for the entry currently being passed
17:18:53
sjl_
yeah, (maphash (lambda (k v) (when (funcall predicate k v) (remhash foo k)) foo) should be portable
17:23:16
_death
yep.. that would save you from errors like pushing value instead of key, or rotating the arguments to remhash ;)
17:24:39
sjl_
ugh, yet another instance of *hash functions having the opposite arg order to what I expect
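With the argument order straightened out, the in-place removal could look like the sketch below. remove-entries-if is a made-up helper name; it relies on the rule _death cites, that removing the entry currently being visited by maphash is permitted.

```lisp
(defun remove-entries-if (predicate table)
  "Remove every entry of TABLE for which (PREDICATE key value) is true.
Returns TABLE."
  (maphash (lambda (key value)
             (when (funcall predicate key value)
               ;; REMHASH takes the key first, then the table.
               (remhash key table)))
           table)
  table)

(defun remove-entries-demo ()
  (let ((table (make-hash-table)))
    (loop for i from 1 to 6
          do (setf (gethash i table) i))
    ;; Drop the entries whose value is even.
    (remove-entries-if (lambda (key value)
                         (declare (ignore key))
                         (evenp value))
                       table)
    (list (hash-table-count table)
          (gethash 2 table)
          (gethash 3 table))))

;; (remove-entries-demo) ; => (3 NIL 3)
```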
17:30:08
_death
same argument for MAPCAR taking lists after the function.. while it could make sense for (single) sequence functions to take the sequence first, I guess
17:32:05
sjl_
I'd still probably lean toward taking the predicate first, because (curry #'mapcar #'pred) to make map-pred seems more useful than (curry #'mapcar somelist), which would be... map-over-some-particular-list
17:46:28
ogamita
The nth element of the list (nth n list); Let's reference the array a with the indices i j k (aref a i j k).
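For reference, a small sketch of the argument orders that came up; argument-order-demo is an illustrative name and the data is arbitrary.

```lisp
(defun argument-order-demo ()
  (let ((items (list :a :b :c))
        (grid (make-array '(2 2) :initial-contents '((1 2) (3 4))))
        (table (make-hash-table)))
    (setf (gethash :k table) :v)
    (list (nth 1 items)        ; index first, then the list
          (elt items 1)        ; sequence first, then the index
          (aref grid 1 0)      ; array first, then the indices
          (gethash :k table))))  ; key first, then the table (same for REMHASH)

;; (argument-order-demo) ; => (:B :B 3 :V)
```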