freenode/#lisp - IRC Chatlog
21:07:37
vms14
I have no idea how to convert a string to a list of symbols, I just take the first element
21:09:48
aeth
vms14: So the problem is that you're reading in a line into a string, and you're turning "Hello world" into |Hello world| instead of (HELLO WORLD)
21:11:17
vms14
I'm trying to parse input; I just want to read every symbol from the input until the user presses enter
21:13:37
aeth
vms14: One thing you could do, and it's quite a hack, is (read-from-string (read-line))
21:13:53
vms14
the thing is, usually I won't know how many symbols there will be, and the delimiter is the newline
21:14:04
aeth
vms14: the second value in read-from-string is where it left off so you can loop on that second value
21:14:25
pjb
vms14: or: (loop for element = (extract-one-item (read-line)) until (eof-element-p element) collect element)
21:15:03
pjb
vms14: using READ or READ-FROM-STRING, you allow input to do whatever it wants with your lisp image, by default.
21:15:11
aeth
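vms14: (read-from-string (read-line)) for a line "hello world" will return (values HELLO 6), since the reader also consumes the space that ends the token
21:16:30
aeth
vms14: you can then call read-from-string again on the same line (keep it in a variable instead of calling read-line a second time), e.g. (read-from-string line nil nil :start 6), to get (values WORLD 11)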
21:16:53
pjb
So you would want to bind *read-eval* to NIL, but other reader macros can be problematic: (read-from-string "#8931289312839012*") for example, could DoS your system by trying to allocate all its RAM (or just signal a condition, depending on the implementation).
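A minimal sketch of the "loop on the second value" idea with *read-eval* bound to NIL, as suggested above. The function name read-all-from-line is made up for illustration. It still interns symbols and still honors the other reader macros, so it does not address the remaining concerns raised here.

```lisp
;; Hypothetical helper: collect every object the reader finds in LINE.
(defun read-all-from-line (line)
  "Return a list of every object the Lisp reader finds in LINE."
  (let ((*read-eval* nil)            ; disable #. so input cannot run code
        (eof (gensym "EOF")))        ; unique marker returned at end of string
    (loop :with start := 0
          :for (object position)
            := (multiple-value-list
                (read-from-string line nil eof :start start))
          :until (eq object eof)
          :do (setf start position)  ; resume where the reader left off
          :collect object)))
```

For example, (read-all-from-line "hello world") should return the list (HELLO WORLD), with both symbols interned in the current package.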
21:17:19
pjb
another thing is that reading symbols will intern them, so if there's a loop, the input could fill your memory with useless symbols.
21:17:47
pjb
So you might want to intern the symbols in a throw away package that you can delete-package when you're done.
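One way the throw-away package could look, as a sketch only. The name call-with-scratch-package is hypothetical; the package is created with an empty use list, so even "nil" read inside it is a fresh symbol rather than CL:NIL.

```lisp
;; Hypothetical helper: run THUNK with *PACKAGE* bound to a temporary
;; package, then delete that package so interned symbols can be collected.
(defun call-with-scratch-package (thunk)
  (let ((package (make-package (symbol-name (gensym "SCRATCH-")) :use '())))
    (unwind-protect
         (let ((*package* package)
               (*read-eval* nil))    ; also keep #. disabled while reading
           (funcall thunk))
      (delete-package package))))
```

Usage would be along the lines of (call-with-scratch-package (lambda () (read-from-string (read-line)))). The returned symbols are no longer reachable through any package afterwards, so compare them by SYMBOL-NAME rather than with EQ against symbols in your own package.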
21:17:47
aeth
vms14: The "correct" (safe) way to do things is to parse the string, perhaps with cl-ppcre
21:18:15
aeth
By the time you add in the validation pjb is talking about, the parse solution probably becomes more concise than the elegant solution that pjb and I both said simultaneously
21:18:57
pjb
vms14: (split-sequence #\space (read-line) :remove-empty-subseqs t) is usually all you need.
21:19:36
aeth
Which to use is debatable. split-sequence is a smaller dependency, but if you're doing additional parsing, you might be using cl-ppcre anyway
21:19:40
pjb
(ql:quickload :split-sequence) (use-package :split-sequence) (with-input-from-string (*standard-input* "Hello world! How do you do?") (split-sequence #\space (read-line) :remove-empty-subseqs t)) #| --> ("Hello" "world!" "How" "do" "you" "do?") ; 27 |#
21:21:23
aeth
If you wanted "absolutely 0" overhead, you can get that. Well, not quite 0, you'd have to track start and end positions for each substring. String/sequence functions take in start and end so you can just work like that.
21:22:50
pjb
(com.informatimago.common-lisp.cesarum.array:positions #\space "Hello world! How do you do?") #| --> (5 12 16 19 23) |#
21:24:58
pjb
(let ((string "Hello world! How do you do?")) (loop :for start := 0 :then (1+ end) :for end :in (com.informatimago.common-lisp.cesarum.array:positions #\space string) :collect (cons start end) :into result :finally (return (nconc result (list (cons (1+ end) (length string))))))) #| --> ((0 . 5) (6 . 12) (13 . 16) (17 . 19) (20 . 23) (24 . 27)) |#
21:25:07
aeth
You could store positions in an array with the :element-type alexandria:array-index, which will probably round up to fixnum or "unsigned fixnum" (it will show up as some strange looking unsigned-byte size like (unsigned-byte 62)) or (in 64-bit implementations) (unsigned-byte 64)
21:25:43
pjb
And then you can use (foo string :start (car pos) :end (cdr pos)) with most sequence functions to process the substrings. Or (subseq string (car pos) (cdr pos)) when you need to extract it.
21:28:18
aeth
You could also do that as two vectors or two lists, one for start position and one for end position. (I think to make the vector, the best solution would be to walk the string twice, first to get the length for the allocated vectors and then to set the elements)
21:28:21
pjb
vms14: Notice that displaced arrays just abstract those (car pos) (cdr pos) bounds. So instead of subseq, you can use (make-array (- (cdr pos) (car pos)) :element-type (array-element-type string) :displaced-to string :displaced-index-offset (car pos))
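A small sketch of the displaced-array approach, wrapped in a helper whose name (displaced-substring) is invented here. The standard MAKE-ARRAY keyword for the offset is :displaced-index-offset. The result shares storage with the original string instead of copying it.

```lisp
;; Hypothetical helper: a view onto STRING between START and END.
(defun displaced-substring (string start end)
  "Return a string displaced to STRING, covering [START, END)."
  (make-array (- end start)
              :element-type (array-element-type string)
              :displaced-to string
              :displaced-index-offset start))
```

Because the view is not a copy, (setf (char string 6) #\W) on the underlying string is visible through a view created with (displaced-substring string 6 11).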
21:31:21
aeth
the alternative is to allocate a list or vector of positions, or, as I recently noticed, two sequences instead of one
21:34:57
aeth
splitting isn't the standard way to think about things, the standard way to think about things is with positions, which is why every built-in (and every well-behaved library) has start/end or start1/end1/start2/end2
21:36:55
aeth
the easiest no-library way to do it is probably read-line and do position tracking, but read-char will probably be the most efficient solution
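A no-library sketch of the position-tracking idea, using only POSITION from the standard. The name token-bounds is made up for this example; it returns (START . END) pairs that can be passed to any function taking :start and :end.

```lisp
;; Hypothetical helper: bounds of each non-empty token in STRING.
(defun token-bounds (string &optional (delimiter #\Space))
  "Return a list of (START . END) pairs, one per non-empty token in STRING."
  (loop :with start := 0
        :for end := (position delimiter string :start start)
        ;; Skip empty tokens produced by consecutive delimiters.
        :when (< start (or end (length string)))
          :collect (cons start (or end (length string)))
        :while end
        :do (setf start (1+ end))))
```

With the string from the earlier examples, (token-bounds "Hello world! How do you do?") gives ((0 . 5) (6 . 12) (13 . 16) (17 . 19) (20 . 23) (24 . 27)).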
21:38:38
aeth
Thinking about lists can be done with splitting without a library, but only in one direction, splitting the front parts off and keeping the tail.
21:39:27
pjb
Depending on the size of the string and the substrings, displaced arrays may spare a lot of RAM. However, if the substrings are short, then subseq will be more efficient both in time and space (e.g. on a 64-bit system, we can assume that strings up to 8 or 16 bytes (2-4 unicode characters) are better created rather than (list* string start end) or displaced arrays).
21:42:07
aeth
vms14 might not need a subseq/displacement at all, if it's about determining what to do based on user commands.
21:43:26
pjb
But don't write the state machine by hand! Write a state machine compiler from a high level description!
21:44:07
vms14
yeah, I want to make a transpiler to c, starting with easy stuff like create a variable, output the value, etc
21:45:56
pjb
vms14: or you may have a look at: https://github.com/informatimago/lisp/tree/master/common-lisp/html-generator
21:46:09
pjb
Have a look at https://github.com/informatimago/lisp/blob/master/common-lisp/html-generator/html-generators-in-lisp.txt
21:46:23
aeth
I have a partially complete GLSL generator so I can already essentially transpile to C if I spent a few weeks on it. Very similar syntax.
21:46:42
aeth
Generally, people avoid the parsing problem altogether when generating another language and just work directly in s-expressions
21:48:42
aeth
vms14: the problem is that 90% of the cases where you'd need parsers in other languages, people just avoid them altogether in Lisps and start with s-expressions, so there's probably less work on parsers than you might expect
21:51:39
aeth
vms14: Almost every "transpiler" in Common Lisp starts with s-expressions. If you don't want to start with s-expressions, you should probably act like you're doing the exact same thing as the normal transpilers and use this as the intermediate format.
21:52:42
vms14
what I had is a function wrapping the input from read-line with parens using concatenate 'string xD
21:52:48
aeth
Lisp itself was written in this way. m-expressions were the next step. https://en.wikipedia.org/wiki/M-expression
21:56:56
aeth
This sort of thing in Lisp is always done in at least two stages, where the first stage parses to s-expressions and the last stage turns a direct (or near-direct) s-expression mapping into strings like (:+ 1 2 3) into "(1 + 2) + 3"
22:00:43
aeth
In fact, + is probably one of the harder ones. Mostly you just go (:foo 1 2 3) to "foo(1, 2, 3)" with the only real difficulty being the way to generate the names (e.g. does foo-bar become "fooBar"?)
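A toy sketch of that last stage, turning s-expressions into C-like text. The function name emit-c is hypothetical, and the handling of :+ is simplified on purpose: it wraps the whole sum in one pair of parentheses instead of nesting pairwise, and names are only downcased (foo-bar is not converted to fooBar).

```lisp
;; Hypothetical emitter for a tiny s-expression language.
(defun emit-c (form)
  "Translate FORM into a C-like string."
  (cond ((symbolp form)              ; names: :foo or foo -> "foo"
         (string-downcase (symbol-name form)))
        ((atom form)                 ; numbers and other literals
         (princ-to-string form))
        ((eq (first form) :+)        ; (:+ a b c) -> "(a + b + c)"
         (format nil "(~{~A~^ + ~})" (mapcar #'emit-c (rest form))))
        (t                           ; (:f a b) -> "f(a, b)"
         (format nil "~A(~{~A~^, ~})"
                 (emit-c (first form))
                 (mapcar #'emit-c (rest form))))))
```

Under these assumptions, (emit-c '(:foo 1 (:+ 2 3))) produces "foo(1, (2 + 3))".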
22:51:25
grewal
vms14: Wouldn't (read-from-string (read-line)) do what you want (read-delimited-list #\Newline) to do?
23:04:05
pjb
vms14: you could make read-delimited-list #\newline work. For this, you need to copy the character syntax from #\) to #\newline.
23:06:55
pjb
theoretically. It still doesn't work :-( (let ((*readtable* (copy-readtable))) (set-syntax-from-char #\newline (character ")")) (with-input-from-string (*standard-input* (format nil "hello world~%How do you do~%")) (values (read-delimited-list #\newline) (read-delimited-list #\newline)))) #| ERROR: Unexpected end of file on #<string-input-stream :closed #x3020025DED1D> |#
23:26:28
vms14
and there are more things I'm missing about format, I need to practice a bit with things like ~:* and so on
23:35:59
vms14
I shouldn't be coding yet, but I want to get used to lisp, and the best way is coding
23:36:51
vms14
grewal: I mean I should be reading and doing test stuff and wait a bit to make this program
23:38:03
vms14
also I still think the On Lisp book should teach me nice things, but I need to understand lisp better before this book, or I'll miss some important stuff
23:49:30
pjb
vms14: loop is nice because it's versatile. Instead of having loops for, while, until, etc, loop does everything. (loop :while … :do …) (loop :do … :until …) (loop :for i :from 0 :to 10 :do …) and other variants: (loop :while … :do … :until …) (loop :do … :while … :do …) etc.
23:50:45
pjb
vms14: note that the :finally clause is jumped to as soon as one terminating clause is validated. So (loop … :until … :do … :finally …) doesn't evaluate :do when the :until condition is true.
23:54:09
aeth
What makes LOOP good for reading is that its behavior for :for ... := ... is different from DO's behavior when you do not have an iteration step. With LOOP, it will do the thing initially and then repeat it; with DO it will only do it once, so you wind up having to repeat yourself twice (once for the initial value and once for the step) unless you abstract over this with a custom macro.
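The point about :for ... := ... can be illustrated with the usual read-until-EOF idiom. This is a sketch with an invented function name (read-all-lines); the READ-LINE form is written once and LOOP re-evaluates it on every iteration.

```lisp
;; Hypothetical helper: collect every remaining line of STREAM.
(defun read-all-lines (stream)
  "Return the remaining lines of STREAM as a list of strings."
  (loop :for line := (read-line stream nil nil) ; NIL at end of file
        :while line
        :collect line))
```

The equivalent DO form would need (read-line stream nil nil) both as the initial value and as the step form.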
23:55:39
aeth
So even if you're primarily using DO and/or DO* in your coding style, this is one of those good exceptions where you should use LOOP
23:56:40
aeth
(correction for the nitpickers, you repeat yourself once, which is writing the same code twice, you don't "repeat yourself twice")
23:58:57
pjb
And it's safe: (with-input-from-string (input " #.(delete-file \"~/.bashrc\")") (read-token-list input)) #| --> ("#.(delete-file" "\"~/.bashrc\")") |#
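The definition of read-token-list does not appear in this excerpt. The following is only a guess at what such a function might look like, not pjb's code: it reads characters with READ-CHAR up to the end of the line and splits on spaces, never invoking the Lisp reader, which is why reader macros in the input stay inert text.

```lisp
;; Hypothetical reconstruction: tokens are plain strings, never evaluated.
(defun read-token-list (stream)
  "Read one line from STREAM and return its space-separated tokens."
  (let ((tokens '())
        (token (make-string-output-stream)))
    (flet ((finish-token ()
             ;; GET-OUTPUT-STREAM-STRING also resets the buffer.
             (let ((text (get-output-stream-string token)))
               (when (plusp (length text))
                 (push text tokens)))))
      (loop :for char := (read-char stream nil nil)
            :until (or (null char) (char= char #\Newline))
            :do (if (char= char #\Space)
                    (finish-token)
                    (write-char char token)))
      (finish-token)                 ; flush the last token on the line
      (nreverse tokens))))
```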
23:59:27
aeth
vms14: Imo, you shouldn't think in terms of "list of atoms being read from input". That's eval()-style behavior (CL's EVAL is different, and eval("1 + 1") in other languages is closer to (eval (read-from-string "(+ 1 1)")) in CL)
23:59:45
aeth
vms14: You should be thinking in terms of what kind of syntax you want to support, and parsing that syntax.
0:00:16
aeth
None of this that we've been talking about is strictly necessary with a sufficiently restrictive syntax
0:01:53
aeth
e.g. you could require the user write things like "foo 42\nbar 43\n" (replace \n with newlines in your head; IRC is limited to one-line-per-message) in which case you don't technically need any intermediate strings.
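A sketch of such a restrictive syntax, assuming lines of the form "NAME INTEGER" and only the two command names from the example. The function name parse-command-line and the returned keywords are invented for illustration. It works on positions only: no SUBSEQ, no interning of user-supplied names.

```lisp
;; Hypothetical parser for lines like "foo 42" or "bar 43".
(defun parse-command-line (line)
  "Return (values COMMAND INTEGER), or NIL if LINE is not a known command."
  (let ((space (position #\Space line)))
    (when space
      (let ((number (parse-integer line :start (1+ space) :junk-allowed t)))
        (when number
          ;; Compare the command name in place via :END1.
          (cond ((string= line "foo" :end1 space) (values :foo number))
                ((string= line "bar" :end1 space) (values :bar number))))))))
```

Anything that is not a recognized name followed by an integer simply yields NIL, so the caller decides how to report the error.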
0:07:19
pjb
and macros wouldn't be unsafe (and worse, unhygienic) if CL's macros weren't so powerful.
0:10:39
aeth
I guess my point is that for untrusted user input you don't want power, so you wind up having to write your own (or use a library) functionality. Shortcuts here are bad.
0:12:22
aeth
read-line vs. read-char is up to you (unless you *need* to not hang, then you have to use read-char-no-hang)
2:40:39
Arylla
Is there any way to set up SBCL so that it's possible to (attempt to) recover from heap exhaustion by aborting to the REPL?
2:41:12
Arylla
(or from just having a relatively low amount of heap space; I'm okay with having to abort while there's still some heap left)
3:15:53
remexre
coming from Scheme and Haskell (as my most lisplike languages), what does Common Lisp have over e.g. Haskell?
3:20:06
Arylla
Common Lisp is quite similar to Scheme in a lot of ways, so if you're familiar with the comparative advantages of Scheme and Haskell, you'll probably find the comparison between CL and Haskell similar
3:22:00
remexre
also, minor inflammatory statement, a large part of my attraction to scheme is that guile is installed on school machines and ghci isn't
3:23:24
remexre
I've worked on a python project from hell before, which pretty much scared me into types-for-everything-land
3:23:32
estest
remexre: Common Lisp has types and there are a number of multi million line projects created with it.
3:25:10
Arylla
I think that a large part of it is macros and the domain-specific-languageiness of Lisp
3:25:32
estest
There are a lot... one that might be up your alley is simply declaring your types. SBCL will give you warnings at compile time when it detects type issues.
3:26:01
Arylla
when you look at any particular function or something, which is written with macros, it's
3:26:35
Arylla
usually expressed in terms of what the person meant, rather than in terms of whatever constructs the language already supports
3:27:54
Arylla
so that way, it's a lot easier to keep track in one's head of what code is actually trying to do, so a strong, static type system is somewhat less necessary
3:30:00
remexre
but there, there are all sorts of problems getting them to work together (e.g. embedding code in another DSL within some place in yours)
3:30:32
remexre
since the inner DSL might be expecting certain invariants to be true that are broken by the outer one
3:31:54
remexre
wait nvm actually read past the first page and it doesn't describe the type system much :p
3:34:14
Arylla
I'm not completely sure; I haven't had a whole lot of issues with this in Lisp, but then again I've never done very much DSL-based programming in Haskell
3:34:33
Arylla
or really run into such problems before, so it's totally possible that I just don't have that experience
3:35:10
Arylla
also if you want to learn about the details of the type system, the Hyperspec is probably the best resource: http://clhs.lisp.se/Body/04_.htm
3:35:28
remexre
afaik the canonical example is that nondeterministic choice doesn't compose with pretty much any other control flow effect
3:37:21
estest
remexre: The second reference is better for understanding the type system, and the hyperspec will be the most standard-compliant reference... SBCL goes beyond the standard with its type inference (and use of type information to compile faster code). I believe it's based on Kaplan-Ullman inference, like this: https://web.archive.org/web/20181107011706/http://home.pipeline.com/~hbaker1/TInference.html
3:37:50
p_l
remexre: Common Lisp is strongly-typed but with static analysis being, unfortunately, mostly an optimization method
3:38:17
p_l
CMUCL pioneered advanced static type derivation in Common Lisp, SBCL derives from CMUCL thus also having it
3:38:53
p_l
one of the problems with CL type system is that it's explicitly turing complete, which means you can't run exhaustive static analysis
3:40:03
remexre
e.g. rust and haskell's type systems are turing-complete, but it's pretty rare to infinite-loop the typechecker there
3:40:26
p_l
remexre: Depends on how you code. For a lot of applications, heavy use of classes makes things easier, as classes are also types
3:42:59
Arylla
remexre: hmm I haven't heard much about that before; would you mind linking a resource of some sort which describes the difficulties with nondeterministic choice and other control flow?
3:43:02
p_l
it's however easy to make them form assertions in your code, and debugging facilities even on poorest CL environments are pretty great in comparison
3:44:10
remexre
Arylla: https://wiki.haskell.org/ListT_done_right is basically what I'm talking about
3:49:49
remexre
arguably monads can replace macros as a DSL system most of the time; e.g. http://wall.org/~lewis/2013/10/15/asm-monad.html as a crazy example
3:50:20
remexre
and the free monad is super-nice for being able to write your logic in your DSL, then test it against multiple implementations of the DSL itself
3:50:52
p_l
remexre: I just find it funny that the biggest hack of Haskell became its most recognizable feature
3:51:24
p_l
though I'll admit that I learnt of how hacky Monads are in uni, few years after I started any experience with Haskell
3:51:33
Arylla
hmm usually in Lisp different macros and other such constructs do tend to play nicely together, as a result of the main method of DSL creation involving substituting in forms provided by the user
3:52:18
p_l
Arylla: the moment you stop pushing for mathematically pure functions, Monad stops making sense
3:54:41
Arylla
and also, macros also allow you to do the part where you write your logic in your DSL and test it against multiple implementations
3:55:38
Arylla
for a big example, look at CLIM (Common Lisp Interface Manager); both McCLIM, an open-source implementation, and implementations derived from Allegro CLIM have the exact same semantics
3:58:12
Arylla
yeah they're basically code that generates code; think more "Template Haskell" than "C macros"
3:59:27
p_l
the main difference in contract of MACRO-FUNCTION from FUNCTION is that MACRO-FUNCTION gets arguments unevaluated
3:59:35
Arylla
except that they're easier to write than Template Haskell, owing to the simpler AST of Lisp
4:00:17
remexre
yeah, I only dipped my toe into TH before running for the hills because of the complexity of the AST :P
4:00:59
remexre
at this point I do things at runtime and spam INLINE and RULES annotations when performance matters
4:01:40
Arylla
yeah; Lisp macros don't suffer from those issues as much because of the characteristics of Lisp itself
4:02:20
Arylla
ACTION has also briefly gazed upon Template Haskell and realized that it was not for me
4:02:36
nisstyre
Arylla: the reason Template Haskell is harder to understand is because of the type system I think
4:03:38
Arylla
nisstyre: it would make sense that that's part of it, but I'm sure that the AST isn't helpful either
4:08:10
Arylla
that's true, although macros (unlike lazy evaluation) don't necessarily interact terribly with all impure functions
4:09:53
Arylla
(and while monads do exist and work, they're more complicated to understand and at times to use imo)
4:12:47
remexre
I'm tbh big on having algebraic effects (like in http://www.eff-lang.org/ ) instead of monads-as-a-hack-for-effects
4:13:20
remexre
but it looks like deriving a type system where effects are statically checked is somewhat difficult
4:16:34
Arylla
huh; reading through the "Print" example at least reminds me quite strongly of CL's condition system
4:16:35
remexre
I've played around with trying to fit a type system for these, and it looks like it ends pretty poorly
4:19:43
remexre
the big problem I've found is that you usually end up wanting some sort of subtyping for the "set" of effects you have
4:20:29
remexre
because if you have e.g. a State effect and an Exception effect, the order in which they're handled makes a big difference
4:23:10
aeth
remexre: If you're willing to be a heretic with non-idiomatic style you can have some degree of static typing in Common Lisp (in addition to the dynamic typing, making it gradual typing), at least in SBCL
4:24:57
aeth
remexre: Your toolbox is fairly limited for CL "static typing". You have type declarations (though macros can make the syntax much nicer), :type in defstruct, and :element-type in arrays. Portably, only :element-type must be respected, and only for characters and bits. element-type is also only for simple things like numbers and characters, e.g. (unsigned-byte 8) or octet (a common deftype rename for that) is pretty common
4:25:34
Arylla
I almost think that something interesting would be to have types be data and have the operations on the types (including checking their equality) just be expressed in the same language as the code itself
4:25:44
remexre
aeth: yeah, that got mentioned a bit above; I was wondering more about how people "cope" with not having it
4:25:57
aeth
remexre: The strongest static checking is limited to being within a function, but a lot of it also can take place within a file. sb-ext:*derive-function-types* tells the SBCL compiler to assume that a function type never changes so it can make even more assumptions (unfortunately, you can't do much working with function types)
4:26:21
Arylla
is it? I thought dependent types were more about having the type of a variable or whatever depend on the data that the variable contains
4:26:25
aeth
It also can make static type assumptions with stuff in the COMMON-LISP (CL) package, since those can't be redefined. so e.g. +
4:26:39
p_l
remexre: I'd say that protocols, which exist mostly by social convention though, are pretty common approach
4:27:12
remexre
Arylla: yes, but the way in which they depend ends up being value-level functions in all the dependent languages I've seen
4:27:26
aeth
remexre: Oh and you also have :type in defclass, btw, but that unfortunately is even less respected than :type in defstruct so I made my own metaclass to enforce checking of a :checked-type slot
4:27:32
remexre
though equality checking types isn't a common feature, because you lose parametricity (aka theorems for free)
4:29:19
aeth
remexre: You can do a lot of verification within macros, and the type system is fairly decent for CHECK-TYPE or declarations or typed slots... where supported. Not Haskell-level, but it supports stuff like (integer 4 37) and (or null integer)
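As an illustration of the type specifiers mentioned here, a minimal sketch using CHECK-TYPE. The function scale and its arguments are invented for the example.

```lisp
;; Hypothetical function showing range and union type specifiers.
(defun scale (n &optional factor)
  "Multiply N by FACTOR (default 1) after checking both arguments."
  (check-type n (integer 4 37))          ; integer between 4 and 37 inclusive
  (check-type factor (or null integer))  ; optional integer
  (* n (or factor 1)))
```

Calling (scale 100) signals a correctable TYPE-ERROR at run time; the same specifiers can also be used in declarations or with TYPEP.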
4:29:49
remexre
p_l: hm, ok; is the approach of having laws alongside protocols (e.g. if a type is foldable, it should satisfy foldr f z t = appEndo (foldMap (Endo . f) t ) z) common, or does having side-effects pervasively reduce that
4:30:33
aeth
The real weakness is in typed collections. Arrays are stuck with simple element-types, hash-tables are stuck generic, and if you want typed conses (for e.g. linked lists that can only hold a certain type in the car) you have to define your own struct to do that with :type in its slots for about a 30% performance loss
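A sketch of what such a typed cons struct might look like for a list of integers. The names int-cons, head and tail are made up here; whether the slot types are actually enforced depends on the implementation and its safety settings.

```lisp
;; Hypothetical typed linked-list cell: HEAD must be an integer and
;; TAIL must be another INT-CONS or NIL.
(defstruct (int-cons (:constructor int-cons (head &optional tail)))
  (head 0 :type integer)
  (tail nil :type (or null int-cons)))
```

A two-element list would then be built as (int-cons 1 (int-cons 2)) and walked with int-cons-head and int-cons-tail.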
4:31:08
aeth
Another big weakness is you have elaborate ftypes (in some implementations) but no real way to access those.
4:31:08
remexre
p_l: sorry, that law was a bit pathological, I wanted a non-mathy name for the interface/protocol :p
4:31:12
p_l
aeth: AFAIK only linked lists are truly limited in that, the rest can be done in compliant implementation
4:32:07
remexre
p_l: a better example is that if a type is a functor, it must respect fmap id == id and fmap (f . g) = fmap f . fmap g, where id is the identity function and . is function composition
4:32:07
aeth
remexre: You can declare it, and it will be used for checking (perhaps even at compile time, at least in SBCL with sb-ext:*derive-function-types* as T), potential optimizations, etc., but there's no real way to access it
4:34:13
Arylla
remexre: I feel like this type system I'm thinking up would work rather nicely in Lisp, which already doesn't really have parametric polymorphism per se
4:35:07
aeth
remexre: You can create a foo with a define-foo instead of directly, and as long as the values it uses are simple literal values like 42 or "hello" (probably not even constants) then you can check those things at compile time during macro expansion. Basically, those values need to be available at the time.
4:38:42
Arylla
oh uh I'm not sure if anyone saw but I asked an SBCL question earlier; is there a way to make SBCL abort back to the REPL before it runs out of heap?
4:38:53
aeth
You can also do a lot of compile time typechecks (implementation-specific, obviously) with type declarations as long as it can be encoded in the type system. Stuff like (defun foo () (the (unsigned-byte 8) 31487)) being caught by SBCL
4:39:08
Arylla
e.g. when it's running fairly low on heap but still has enough to invoke the ABORT restart
4:39:23
aeth
remexre: e.g. (with-compilation-unit () (defun foo () 31487) (defun bar (n) (declare ((unsigned-byte 8) n)) (* n 2)) (defun baz () (bar (foo)))) ; this is caught in SBCL at compile time as long as it's in the same compilation unit... that wrapper is so it works in the REPL and isn't needed if they're in the same file
4:40:29
aeth
It derived the return type of foo, accepted the declaration for the return type of bar, and then in baz notes that foo's type is incompatible with bar's type.
4:42:26
aeth
p_l: Afaik, conses can have types, but the way they're declared makes it look like it would be O(n) to verify, whereas a struct version would only be O(1) since you'd only have to verify the first cons, since the cdr slot's type would be the new cons struct itself
4:43:13
aeth
(this is assuming a linked list cons; a tree cons is a bit trickier since it would be (or null the-type-to-store the-struct-itself) for both the car and cdr)
4:44:23
aeth
remexre: sometimes yes, sometimes no, e.g. it doesn't catch it when I replace 31487 with (random 31487) even though it has the same sort of issue there in terms of foo's return type. It normally catches it, though, and you can declaim ftype the rest of the time.
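For the cases where the return type is not derived, a sketch of the DECLAIM FTYPE fallback. The function double-octet is invented for the example; how much checking the declaration buys at compile time is implementation-specific.

```lisp
;; Hypothetical function with an explicit function type:
;; takes an octet, returns a value that fits in 9 bits.
(declaim (ftype (function ((unsigned-byte 8)) (unsigned-byte 9))
                double-octet))
(defun double-octet (n)
  (* n 2))
```

With this in place, a compiler that uses the declaration may flag a call such as (double-octet 31487) without having to derive anything itself.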
4:45:12
aeth
SBCL is normally really good at deriving the return type. You will have to declare the input types every time, though.
4:48:15
aeth
You can recompile functions one at a time but SBCL assumes that you will recompile a file at a time, so recompiling just one function can sometimes cause issues with heavily declared functions if it inlines too many assumptions. Especially e.g. something that returns a constant string "hello".
4:48:20
aeth
Change that to "Hello, world!" and you'll get a surprise runtime type error because it essentially wrapped all calls to foo in (the (simple-array character (5)) (foo))
4:49:09
aeth
It's also limited to functions that it assumes cannot change (mostly CL built-ins) and functions in the same file, unless you do the non-standard behavior with sb-ext:*derive-function-types* that makes it assume function types won't change.
4:49:59
aeth
This is also basically just SBCL. Other implementations have some degree of this, but nowhere near as much as SBCL
4:52:23
aeth
It's an amazing type system for a language that's dynamically typed, though. Compare this to TypeScript where you have "number" and "string" and "boolean"
5:02:59
theemacsshibe
but then there's no ratios, reduced precision floats (JS numbers are doubles?) or complex numbers still
5:03:47
theemacsshibe
if there were circular type specifiers, you could express typed conses among other things but i understand that's a bit much for implementors
5:07:31
aeth
You'd probably want separate types for conses that hold a certain type, immutable versions of various data structures, and arrays that compactly hold things too complex for :element-type (although you could still just add type checking to regular specialized arrays)
5:07:42
aeth
Probably a few other areas that are missing in the standard language that would be nice to have.
5:14:38
aeth
Something more important is probably standardizing various non-standard parts of the condition system, especially which things various standard macros/functions like destructuring-bind raise/throw/whatever-the-correct-term-is.