freenode/#lisp - IRC Chatlog
15:18:09
edgar-rft
lionrouge: there's an XLISP-PLUS interpreter from 2016 that still runs on MS-DOS at the bottom of the page -> http://www.almy.us/xlisp.html
15:53:49
jackdaniel
do you have opinions about data-frame concept in general? is there some good CL library implementing the abstraction? if not, would people find it useful?
15:55:24
jackdaniel
(data frames are basically arrays with named columns and rows where you may select particular columns and/or rows depending on what you need, used e.g. in statistics or in ggplot2)
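A rough sketch of the idea described above, using only standard Common Lisp. The names `data-frame`, `column-index` and `select-columns` are hypothetical, chosen for illustration; this is not the API of any existing library, and it covers only named columns, not named rows.

```lisp
;; Hypothetical minimal data frame: a list of column names plus a 2D array.
(defstruct (data-frame (:constructor make-data-frame (columns rows)))
  columns  ; list of column names (symbols)
  rows)    ; 2D array, one row per observation

(defun column-index (df name)
  "Return the position of the column NAME in DF, or signal an error."
  (or (position name (data-frame-columns df))
      (error "No column named ~S" name)))

(defun select-columns (df names)
  "Return a new data frame holding only the columns in NAMES, in that order."
  (let* ((src (data-frame-rows df))
         (n (array-dimension src 0))
         (idx (mapcar (lambda (name) (column-index df name)) names))
         (dst (make-array (list n (length names)))))
    (dotimes (r n)
      (loop for c from 0
            for i in idx
            do (setf (aref dst r c) (aref src r i))))
    (make-data-frame names dst)))
```

Selecting `(:z :x)` from a frame with columns `(:x :y :z)` copies those two columns, in the requested order, into a fresh array.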
16:55:57
dlowe
I've made ad-hoc data frames and associated functions, but I never put it all into a library
16:59:24
grewal
I've never used a data frame, but if I needed that sort of abstraction, I'd probably just use an array of structs
17:01:44
dlowe
datasets too large to mess with by hand but still fitting comfortably in a corner of memory
17:08:42
[6502]
Hello... in a toy lisp interpreter I'm playing with I "mark" environments that are captured in lambdas (and I follow up the environment parent chain until the global environment or a marked one is found)...
17:09:00
[6502]
...When executing a function call I check if after the body evaluation the environment has been made reachable by a lambda capture and when this is not the case I recycle it immediately instead of leaving the job to the gc...
17:09:21
[6502]
...Is there any reason this trick shouldn't work? It saves me a LOT of consing.
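One way the scheme described above could look, written in Common Lisp for illustration. Everything here (`env`, `*env-pool*`, `fresh-env`, `mark-captured`, `call-with-fresh-env`) is a hypothetical name, and the global environment is assumed to be the one with no parent; the actual interpreter may differ.

```lisp
;; Hypothetical environment record with a "captured by a lambda" flag.
(defstruct env parent (bindings nil) (captured nil))

(defvar *env-pool* '()
  "Environments that were never captured and can be reused.")

(defun fresh-env (parent)
  "Reuse a pooled environment if one is available, otherwise allocate."
  (let ((env (pop *env-pool*)))
    (cond (env
           (setf (env-parent env) parent
                 (env-bindings env) nil
                 (env-captured env) nil)
           env)
          (t (make-env :parent parent)))))

(defun mark-captured (env)
  "Mark ENV and its ancestors, stopping at the global or an already marked one."
  (loop for e = env then (env-parent e)
        while (and e (env-parent e) (not (env-captured e)))
        do (setf (env-captured e) t)))

(defun call-with-fresh-env (parent body)
  "Call BODY with a new child of PARENT; recycle it if nothing captured it."
  (let ((env (fresh-env parent)))
    (prog1 (funcall body env)
      (unless (env-captured env)
        (push env *env-pool*)))))
```

In this sketch, creating a closure would call `mark-captured` on the current environment, so only environments that no closure refers to are returned to the pool.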
17:09:46
dlowe
[6502]: this is a forum for Common Lisp. I think you want ##lisp which focuses on the lisp family
17:15:13
Bike
there's no reason for it not to work. most of the time you don't even need to allocate an environment.
17:34:02
jeosol
jackdaniel: are you planning on writing or working on dataframe? I think it will be useful. This was brought up here a while ago, and I recall someone had something in CL similar to python pandas.
17:35:06
jackdaniel
jeosol: I wrote a working prototype of polyclot (mcclim application to plot data) and now I'm rewriting it into something with arms and legs
17:35:34
jackdaniel
and data-frames are very convenient as a way for mapping dataset to the aesthetics
17:36:28
jackdaniel
(so I'm apparently writing a barebone implementation which will serve these needs)
17:47:30
jmercouris
I do (flexi-streams:make-flexi-stream ...) from a file I get, I do this twice, but you can only read a stream once
17:49:17
jeosol
sjl: it's possible it's this one, but if my memory serves me right, the one I was talking about had functionality for pandas-like manipulations, some of which jackdaniel alluded to above
17:50:36
sjl
jmercouris: once you read from a stream, that's it. that's how streams work. copy-stream isn't what you want (that's for piping one stream to another)
17:50:54
jmercouris
sjl: well, I have a CSV that I am trying to go through twice with two different functions
17:51:25
sjl
and you don't want to (with-open-file (...) (do-thing-1 ...)) (with-open-file (...) (do-thing-2 ...)) ?
17:51:43
jmercouris
sjl: I can't say I do, this is the upload to a server, and so I get a stream from the web framework
17:52:34
sjl
By default Lisp (probably) won't save the whole contents of a stream... if someone uploads a 100gb file, the whole point of using a stream is to not have to hold it all in memory
17:53:40
sjl
Yeah, if it's small enough to keep in memory, read it from the initial stream into a buffer and then work with that.
17:54:02
aeth
jmercouris: What I'd consider doing is use something like a pipe e.g. https://gitlab.com/zombie-raptor/zombie-raptor/blob/17eb01a05e3f2b8ac6edab8c7592fe03df7e5b49/util/stream.lisp
17:54:22
sjl
then use flexi-streams' make-in-memory-input-stream twice to make two streams you can give to the CSV parser
17:54:29
aeth
jmercouris: But under the hood, which you can even see there, the pipe is just going to either (a) read into a buffer or (b) read into a /tmp file (I go with a, many go with b)
17:55:54
sjl
actually if you want to use the flexi-streams stuff on top, you definitely want the byte vector version
17:56:19
aeth
jmercouris: It's just a trivial use of trivial-gray-streams, probably subtly incorrect because I haven't used it enough to hit edge cases
17:57:14
sjl
the flexi-streams docs for make-in-memory-input-stream specifically say "the octets in the subsequence"
17:59:58
jmercouris
sjl: interesting errors: https://gist.github.com/jmercouris/21b9dcf5f18b984c00c0251ad14c9273
18:02:51
jackdaniel
(let ((string (a:read-stream-content-into-string stream))) (w-i-f-s (s string) (deactivate-targets s)) (w-i-f-s (s string) (do-csv (row s) …)))
18:03:08
aeth
with-output-to-string combined with with-input-from-string seems like it might work if it's small enough and it doesn't have to be used interactively
18:08:04
jmercouris
re-uploaded: https://gist.github.com/jmercouris/de6fadd66cf5b77b628acc6fef36ac2b
18:09:27
aeth
Well, you're probably getting UTF-8 and you need to translate that into the internal format of the implementation, which is probably UTF-32. Unless the library is already translating UTF-8 to UTF-32, in which case you just need to CODE-CHAR it.
18:10:00
jmercouris
aeth: what if I do not get UTF-8 though? what if a windows user uses the application?
18:11:23
aeth
Especially all of the legacy ones that look like extended ASCII with special characters in the upper half.
18:14:26
aeth
jmercouris: the thing is, being naive about encodings will work fine for characters 0 through 127
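A small sketch of what "naive" decoding means here, assuming the implementation's character codes agree with ASCII for 0 through 127 (the case in current implementations such as SBCL, though the standard does not guarantee it). The function name `ascii-octets-to-string` is made up for this example.

```lisp
(defun ascii-octets-to-string (octets)
  "Decode OCTETS as ASCII, signalling an error on any octet above 127."
  (map 'string
       (lambda (octet)
         (if (< octet 128)
             (code-char octet)
             ;; Octets 128-255 mean different things in UTF-8, Latin-1,
             ;; Windows-1252, etc., so refuse to guess.
             (error "Non-ASCII octet ~D: an explicit encoding is needed" octet)))
       octets))
```

Octets such as `#(104 105)` decode the same way under UTF-8 and the legacy extended-ASCII encodings; anything in the upper half is where the encodings disagree.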
18:17:39
jmercouris
I however want to be able to write to the echo stream, and then read from it... no?
18:18:37
aeth
Well, you can make your own stream with http://www.nhplace.com/kent/CL/Issues/stream-definition-by-user.html via https://github.com/trivial-gray-streams/trivial-gray-streams
18:20:57
aeth
You could probably add some logic that defaults to UTF-8 but also attempts to detect if it's using whatever Windows's default format is instead.
18:21:54
aeth
It's probably not too hard to do utf-8 on Windows since almost the entire web speaks utf-8 now.
18:33:49
aeth
jmercouris: So your issue is you need to pass a stream from caveman into a CSV library twice, right? The simplest answer is to put it into a string and then with-input-from-string twice if it's small enough, right?
18:35:44
aeth
well, if do-csv takes a stream and with-input-from-string makes a stream from a string, then there's no issue
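A minimal sketch of the string-based approach, using only standard Common Lisp and assuming the upload arrives as a character stream small enough to hold in memory. `slurp-stream` and `call-twice-on-stream` are hypothetical helper names; the two function arguments stand in for whatever consumes the CSV.

```lisp
(defun slurp-stream (stream)
  "Read all remaining characters from STREAM into a fresh string."
  (with-output-to-string (out)
    (let ((buffer (make-string 4096)))
      (loop for n = (read-sequence buffer stream)
            while (plusp n)
            do (write-string buffer out :end n)))))

(defun call-twice-on-stream (stream first-pass second-pass)
  "Run FIRST-PASS and SECOND-PASS, each on its own stream over STREAM's content."
  (let ((content (slurp-stream stream)))
    (values (with-input-from-string (s content) (funcall first-pass s))
            (with-input-from-string (s content) (funcall second-pass s)))))
```

Each pass gets an independent string stream positioned at the start, so neither one sees the other's reads.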
18:37:22
aeth
with-output-to-string is the easiest solution, if it's a bidirectional character stream
21:43:15
Xach
kenu: systems provided by quicklisp will always be stored in the quicklisp directory in a certain structure.
23:54:49
manualcrank
a few days ago i complained about the performance of sbcl on the competitive programming site open.kattis.com, specifically the awful performance (compared to other languages there) of READ-LINE.
23:55:12
manualcrank
You absolutely need to be able to consume input as fast as possible on this site. Using ios_base::sync_with_stdio(false) in C++ or memory mapped input in C can chew through the input in literally 0 seconds vs. TLE (time limit exceeded) in Lisp.
23:56:04
manualcrank
Ditto one based on READ-SEQUENCE (in my tests READ-SEQUENCE --- on *STANDARD-INPUT* --- is _slower_ than READ-LINE).
23:56:26
manualcrank
So anyway i managed to accidentally cobble together a solution using DEFINE-ALIEN-ROUTINE, which I can't really explain how it works because I'm a complete newbie at foreign function stuff.
23:56:48
manualcrank
It calls C's gets(). Performance is dramatically better (finishes in < 0.6s, which is average).
23:56:58
manualcrank
But (1) it still creates a string object for every line and (2) what I really wanted to do was be able to call fgets_unlocked(), but couldn't figure out how to convert between stdin and *STANDARD-INPUT*
0:16:46
sjl
manualcrank: `cat`ing their sample input 36000 times gives me a 1008000 line sample file, which is larger than their limit. Running (time (with-open-file (s "foos") (loop :for l = (read-line s nil nil) :while l :sum 1))) on that in SBCL takes 0.11s on my machine.
0:24:57
sjl
Yeah, even a full solution is <1s on my laptop, but exceeds the limit on the site. So they must be running on something old/virtualized to all hell
1:20:34
Xach
but it will be less convenient and clear. but there are ways to reclaim clarity if not convenience.
1:27:45
aeth
manualcrank: do you have to process by the line? depending on what you're doing you could write a state machine that goes character-by-character (or byte-by-byte)
1:29:59
manualcrank
i did slurp all of stdin in one attempt and subsequence individual lines out of it
1:31:31
aeth
Well subseq isn't a good idea for performance. That's why nearly everything has a start and end optional or keyword parameter (or start1 end1 start2 end2, etc.)
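To illustrate the start/end point with standard functions only: `position` and `parse-integer` both accept `:start` (and `:end`), so a slurped buffer can be walked line by line without allocating a substring per line. The task here (summing one integer per line) and the name `sum-integer-lines` are invented for the example, and empty lines are assumed not to occur.

```lisp
(defun sum-integer-lines (buffer)
  "Sum the integers in BUFFER, one per line, without consing a string per line."
  (let ((sum 0)
        (start 0)
        (end (length buffer)))
    (loop while (< start end)
          do (let ((eol (or (position #\Newline buffer :start start) end)))
               ;; Parse in place between START and EOL instead of calling SUBSEQ.
               (incf sum (parse-integer buffer :start start :end eol))
               (setf start (1+ eol))))
    sum))
```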
1:31:56
manualcrank
largest possible input is 1,000,000 trees of 126 (+1 for newline) char-length each
1:33:11
aeth
if I needed performance and I didn't need to build up buffers of lines I would just work character-by-character (or, again, maybe just byte-by-byte)
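A sketch of the character-by-character style, again in plain Common Lisp: a two-state machine (inside a number or not) that sums whitespace-separated non-negative decimal integers without building any line strings. The task and the name `sum-integers-from-stream` are assumptions made for the example, not the actual Kattis problem.

```lisp
(defun sum-integers-from-stream (stream)
  "Sum whitespace-separated non-negative decimal integers read from STREAM."
  (let ((total 0)
        (current 0)
        (in-number nil))
    (loop for ch = (read-char stream nil nil)
          do (let ((digit (and ch (digit-char-p ch))))
               (cond (digit
                      ;; Still inside a number: accumulate the next digit.
                      (setf current (+ (* current 10) digit)
                            in-number t))
                     (in-number
                      ;; A non-digit or end of file ends the current number.
                      (incf total current)
                      (setf current 0
                            in-number nil))))
          while ch)
    total))
```

The same shape works byte-by-byte with `read-byte` on a binary stream, comparing against the octet values of the digits instead of calling `digit-char-p`.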
1:34:29
aeth
the only hash table has everything as a key and a value, which isn't going to benchmark well against e.g. C++
1:35:45
aeth
I thought the hash table benchmark was one of the places where SBCL was weak in the Benchmarks Game
1:36:31
manualcrank
yeah if you have to read or write a lot of stuff, all the other languages are faster
1:38:16
manualcrank
i'm happy with my alien c-gets, but i'd be happier still if i could get fgets_unlocked to work and if return values weren't converted to new string objects
1:39:33
aeth
manualcrank: I wonder if working with bytes makes a difference over working with characters? A lot of languages use really weird strings but in SBCL afaik it's just utf32 so it reads utf8 into utf32 and there's a translation. Idk if that would make a difference, though
1:45:16
manualcrank
uiop has uiop:read-file-string but that takes a filename arg and anyway uiop isn't an option on the site
1:45:18
aeth
so it takes 123 124 and turns it into 0 0 0 123 0 0 0 124. You could see this yourself reading some FASLs with a binary reader, like hexl-mode in emacs. I think, though, at some point they might have optimized the size to get rid of that, though. It looks like that's just for symbols, though
1:45:48
aeth
literal strings saved in the fasl still have the 000000 between each character (assuming ASCII subset of unicode)
1:47:23
aeth
That still interferes with CL a lot. CL is a very multi-implementation language and each implementation implements extensions subtly differently and we rely heavily on portability libraries to actually use features
1:49:05
Bike
my dipshit solution can do 2.9 mil trees in 1.04s. let me try it with bytes or sorting as it goes or something
1:51:46
manualcrank
i don't mind not being able to use portability libs, it's just a puzzle site, point is to learn some algorithms
2:53:57
manualcrank
update to <http://pasted.co/d921528c>: reason it lops off the \n is because that's what gets() does, duh (fgets() doesn't)
2:54:36
manualcrank
also, returning a c-string according to docs implies a conversion to a Lisp string
2:55:14
manualcrank
i can though specify it returns (* char) and just return the buf i used to alien call gets
2:57:03
manualcrank
in that case there's no conversion, just an array of char (i changed the code to element type (unsigned-byte 8) though)
3:07:56
manualcrank
also, base-char is better than character because the former doesn't convert going from Lisp to C