freenode/#lisp - IRC Chatlog

15:18:09 edgar-rft lionrouge: there's an XLISP-PLUS interpreter from 2016 that still runs on MS-DOS at the bottom of the page -> http://www.almy.us/xlisp.html

15:21:34 lionrouge edgar-rft, thanks

15:37:52 Kabriel ebzzry: https://github.com/bharath1097/matlisp/network/members

15:53:49 jackdaniel do you have opinions about data-frame concept in general? is there some good CL library implementing the abstraction? if not, would people find it useful?

15:55:24 jackdaniel (data frames are basically arrays with named columns and rows where you may select particular columns and/or rows depending on what you need, used e.g. in statistics or in ggplot2)

15:58:48 ck_ ACTION has to resist dark memories of trying to ''program'' in R

16:55:57 dlowe I've made ad-hoc data frames and associated functions, but I never put it all into a library

16:56:33 jackdaniel does it cons when subsets are selected?

16:56:41 jackdaniel (i.e. copies data)

16:59:24 grewal I've never used a data frame, but if I needed that sort of abstraction, I'd probably just use an array of structs

17:01:18 dlowe jackdaniel: no, it didn't. I was only operating on medium sized datasets, though.

17:01:44 dlowe datasets too large to mess with by hand but still fitting comfortably in a corner of memory

17:01:56 jackdaniel dlowe: thank you. is the repository public?

17:02:04 dlowe "I never put it all into a library"

17:02:14 jackdaniel yes, but you might have put it as a file in some other project

17:02:17 jackdaniel I didn't miss that part

17:02:20 dlowe No, sorry :/

17:02:26 dlowe it was for work

17:02:31 jackdaniel understood, thank you
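
A minimal sketch of the data-frame idea discussed above, assuming a frame is nothing more than an alist from column names to vectors. This is not dlowe's unpublished code or any existing library; `make-frame`, `select-columns` and `frame-row` are hypothetical names used only for illustration. It shows one way column selection can avoid copying data: the selected frame shares the original column vectors.

```lisp
;; Hypothetical helpers for illustration; not taken from any existing library.
(defun make-frame (&rest names-and-columns)
  "Build a frame as an alist of (NAME . COLUMN-VECTOR) from alternating arguments."
  (loop for (name column) on names-and-columns by #'cddr
        collect (cons name column)))

(defun select-columns (frame &rest names)
  "Return a frame with only NAMES. Column vectors are shared, not copied."
  (loop for name in names
        for entry = (assoc name frame)
        when entry collect entry))

(defun frame-row (frame index)
  "Return row INDEX as a plist of column names and values."
  (loop for (name . column) in frame
        append (list name (aref column index))))
```

Selecting columns this way allocates only a short list of entries; row subsets would need a different trick (for example an index vector) to stay copy-free.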

17:08:42 [6502] Hello... in a toy lisp interpreter I'm playing with I "mark" environments that are captured in lambdas (and I follow up the environment parent chain until the global environment or a marked one is found)...

17:09:00 [6502] ...When executing a function call I check if after the body evaluation the environment has been made reachable by a lambda capture and when this is not the case I recycle it immediately instead of leaving the job to the gc...

17:09:21 [6502] ...Are there reasons why this trick shouldn't work? It saves me a LOT of consing.

17:09:46 dlowe [6502]: this is a forum for Common Lisp. I think you want ##lisp which focuses on the lisp family

17:10:28 dlowe I mean, unless you're writing it in CL

17:11:11 [6502] no... the interpreter is in C++

17:15:13 Bike there's no reason for it not to work. most of the time you don't even need to allocate an environment.

17:34:02 jeosol jackdaniel: are you planning on writing or working on dataframe? I think it will be useful. This was brought up here a while ago, and I recall someone had something in CL similar to python pandas.

17:34:25 jeosol but not fully implemented or something like that

17:35:06 jackdaniel jeosol: I wrote a working prototype of polyclot (mcclim application to plot data) and now I'm rewriting it into something with arms and legs

17:35:34 jackdaniel and data-frames are very convenient as a way for mapping dataset to the aesthetics

17:36:28 jackdaniel (so I'm apparently writing a barebone implementation which will serve these needs)

17:37:04 sjl jeosol: are you thinking of https://github.com/numcl/numcl ?

17:45:55 jmercouris I'm having a bit of an issue with cl-csv it seems to be swallowing things up

17:46:02 jmercouris or I can't go through the same CSV twice

17:46:13 jmercouris I tried making a stream twice to get past this

17:47:30 jmercouris I do (flexi-streams:make-flexi-stream ...) from a file I get, I do this twice, but you can only read a stream once

17:47:43 jmercouris alexandria:copy-stream?

17:49:17 jeosol sjl: it's possible it's this one, but if my memory serves me right, the one I was talking about had functionality for pandas-like manipulations, some of which jackdaniel alluded to above

17:49:22 sjl ah

17:50:36 jmercouris how can I take a stream and produce two flexi-streams from it?

17:50:36 sjl jmercouris: once you read from a stream, that's it. that's how streams work. copy-stream isn't what you want (that's for piping one stream to another)

17:50:42 sjl what are you actually trying to do?

17:50:54 jmercouris sjl: well, I have a CSV that I am trying to go through twice with two different functions

17:51:04 jmercouris both of those functions use cl-csv to traverse the CSV

17:51:25 sjl and you don't want to (with-open-file (...) (do-thing-1 ...)) (with-open-file (...) (do-thing-2 ...)) ?

17:51:43 jmercouris sjl: I can't say I do, this is the upload to a server, and so I get a stream from the web framework

17:52:34 sjl By default Lisp (probably) won't save the whole contents of a stream... if someone uploads a 100gb file, the whole point of using a stream is to not have to hold it all in memory

17:53:01 jmercouris yes

17:53:04 jackdaniel jmercouris: first read stream into some kind of sequence

17:53:07 jackdaniel and then use said sequence

17:53:15 jackdaniel if you are sure that it is not too much data

17:53:18 jmercouris I am sure

17:53:24 jmercouris it is usually around ~500kb

17:53:40 sjl Yeah, if it's small enough to keep in memory, read it from the initial stream into a buffer and then work with that.

17:54:02 aeth jmercouris: What I'd consider doing is use something like a pipe e.g. https://gitlab.com/zombie-raptor/zombie-raptor/blob/17eb01a05e3f2b8ac6edab8c7592fe03df7e5b49/util/stream.lisp

17:54:22 sjl then use flexi-streams' make-in-memory-input-stream twice to make two streams you can give to the CSV parser

17:54:28 jmercouris how do I copy a stream into a sequence?

17:54:29 aeth jmercouris: But under the hood, which you can even see there, the pipe is just going to either (a) read into a buffer or (b) read into a /tmp file (I go with a, many go with b)

17:54:42 dlowe clhs read-sequence

17:54:42 specbot http://www.lispworks.com/reference/HyperSpec/Body/f_rd_seq.htm

17:54:55 aeth well, read/write into

17:55:10 sjl alexandria:read-stream-content-into-(byte-vector|string)

17:55:20 sjl if you don't want to handle the buffering stuff yourself

17:55:27 jmercouris I do not, I want to be as simple as possible

17:55:38 jmercouris aeth: thanks for the reference I'll take a read later

17:55:40 jmercouris bookmark'd

17:55:54 sjl actually if you want to use the flexi-streams stuff on top, you definitely want the byte vector version

17:56:19 aeth jmercouris: It's just a trivial use of trivial-gray-streams, probably subtly incorrect because I haven't used it enough to hit edge cases

17:56:36 aeth using gray streams means there is some performance penalty, probably

17:56:45 jmercouris sjl: ok, I just put string in :D

17:56:50 jmercouris good thing you said that

17:57:14 sjl the flexi-streams docs for make-in-memory-input-stream specifically say "the octets in the subsequence"

17:59:58 jmercouris sjl: interesting errors: https://gist.github.com/jmercouris/21b9dcf5f18b984c00c0251ad14c9273

18:00:00 jmercouris not sure what they mean

18:00:47 jackdaniel cl-csv operates on character streams

18:00:53 jackdaniel and your stream is a byte vector

18:01:27 jackdaniel how about (with-input-from-string …) ;?

18:01:46 jackdaniel (after you have your sequence read to a string)

18:02:06 jmercouris this is turning way too complex :D

18:02:51 jackdaniel (let ((string (a:read-stream-content-into-string stream))) (w-i-f-s (s string) (deactivate-targets s)) (w-i-f-s (s string) (do-csv (row s) …)))

18:03:03 jackdaniel >afk<

18:03:08 aeth with-output-to-string combined with with-input-from-string seems like it might work if it's small enough and it doesn't have to be used interactively

18:03:29 jmercouris I'll give that a shot...

18:05:47 jmercouris more and more problems

18:06:15 jmercouris https://gist.github.com/jmercouris/699457e2092a7e2ce5d119c07d3289a0

18:06:17 jmercouris I just don't get it

18:06:24 jmercouris why must it be so painful

18:07:00 dlowe because dealing with character encodings is awful

18:08:04 jmercouris re-uploaded: https://gist.github.com/jmercouris/de6fadd66cf5b77b628acc6fef36ac2b

18:08:26 jmercouris I'm getting really close to rewriting my code..

18:08:31 jmercouris to just somehow magically go through the same stream

18:09:19 jmercouris cl-csv:read-csv can take a string

18:09:27 aeth Well, you're probably getting UTF-8 and you need to translate that into the internal format of the implementation, which is probably UTF-32. Unless the library is already translating UTF-8 to UTF-32, in which case you just need to CODE-CHAR it.

18:09:32 jmercouris cl-csv:do-csv cannot, and must take a pathname or whatever

18:10:00 jmercouris aeth: what if I do not get UTF-8 though? what if a windows user uses the application?

18:10:54 aeth I don't think there's a convenient, universal way to detect all encodings.

18:11:23 aeth Especially all of the legacy ones that look like extended ASCII with special characters in the upper half.

18:11:51 aeth You might be able to determine which Unicode encoding it is.

18:12:31 jmercouris :\

18:12:40 jmercouris there must be a simpler way

18:12:47 jmercouris it was working taking the stream and just piping into cl-csv

18:12:55 jmercouris why can I not just duplicate a stream somehow?

18:13:02 jmercouris copy the contents of one stream into two streams

18:13:30 dlowe read-byte from one, write-byte into an echo-stream

18:13:46 dlowe read-byte from the echo streams

18:13:55 jmercouris dlowe: did you mean write-byte into echo streams?

18:14:00 dlowe yes

18:14:26 aeth jmercouris: the thing is, being naive about encodings will work fine for characters 0 through 127

18:14:35 aeth usually.

18:15:01 aeth try adding é to your input and see what happens

18:15:10 jmercouris no thanks :D

18:15:30 aeth it's possible some library or the implementation is already handling things

18:15:55 jmercouris how can I just make a stream?

18:17:27 jmercouris make-in-memory-output-stream?

18:17:39 jmercouris I however want to be able to write to the echo stream, and then read from it... no?

18:17:49 dlowe that's what echo streams do

18:18:02 jmercouris I'm confused here

18:18:10 dlowe clhs echo-stream

18:18:10 specbot http://www.lispworks.com/reference/HyperSpec/Body/t_echo_s.htm

18:18:28 jmercouris oh, I thought you had made up the term "echo stream" to describe this concept

18:18:32 jmercouris I wasn't looking up echo-stream

18:18:33 dlowe ...
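
For reference, a small sketch of what a standard echo stream does: everything read from it is also written to a second stream, so a copy of the input accumulates while it is being consumed. `read-lines-keeping-copy` is a hypothetical helper, and string streams stand in for the real upload stream here.

```lisp
;; READ-LINES-KEEPING-COPY is a hypothetical name used only for illustration.
(defun read-lines-keeping-copy (input)
  "Read all lines from INPUT through an echo stream.
Returns the list of lines and, as a second value, the text echoed while reading."
  (let* ((copy (make-string-output-stream))
         (echo (make-echo-stream input copy))
         (lines (loop for line = (read-line echo nil nil)
                      while line
                      collect line)))
    (values lines (get-output-stream-string copy))))
```

The second value can then be handed to with-input-from-string for another pass. Note that the copy is only complete after the first reader has consumed the whole input.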

18:18:37 aeth Well, you can make your own stream with http://www.nhplace.com/kent/CL/Issues/stream-definition-by-user.html via https://github.com/trivial-gray-streams/trivial-gray-streams

18:18:44 aeth And it looks like flexi-streams is a layer on top of that.

18:19:21 aeth https://edicl.github.io/flexi-streams/

18:19:30 aeth It looks like flexi-streams does its own encoding handling.

18:20:57 aeth You could probably add some logic that defaults to UTF-8 but also attempts to detect if it's using whatever Windows's default format is instead.

18:21:09 aeth That would cover 99.9% of all things, probably.

18:21:29 jmercouris I guess I could, but I don't want to go down that path

18:21:29 dlowe just assume utf-8 and if you find some other encoding, it's wrong :p

18:21:41 aeth yeah, or that

18:21:54 aeth It's probably not too hard to do utf-8 on Windows since almost the entire web speaks utf-8 now.

18:22:18 jmercouris make-in-memory-input-stream does not support an encoding..

18:25:36 aeth I guess because that works on octets? Are you working on (CL) characters or octets?

18:25:53 jmercouris I have no idea, I'm just working wit whatever caveman gives me

18:25:58 jmercouris s/wit/with

18:33:49 aeth jmercouris: So your issue is you need to pass a stream from caveman into a CSV library twice, right? The simplest answer is to put it into a string and then with-input-from-string twice if it's small enough, right?

18:34:04 aeth So the remaining issue then is getting it into a string

18:34:22 aeth If I'm understanding things correctly

18:34:49 jmercouris yeah that would be the simplest

18:35:00 jmercouris but all of my code is using cl-csv:do-csv

18:35:07 jmercouris and not read-csv which can take a string..

18:35:38 dlowe with-input-from-string will turn a string into a stream

18:35:44 aeth well, if do-csv takes a stream and with-input-from-string makes a stream from a string, then there's no issue

18:36:04 aeth The issue is reduced to turning your input into a string
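
A minimal sketch of the approach aeth and jackdaniel describe, using only standard Common Lisp. It assumes the upload is already a character stream and small enough to hold in memory; if the framework hands over an octet stream, decoding has to happen first and is not shown here. `slurp-character-stream`, `count-lines` and `process-twice` are hypothetical names, and `count-lines` merely stands in for a real CSV pass such as do-csv.

```lisp
;; All three functions are hypothetical helpers for illustration.
(defun slurp-character-stream (stream)
  "Read all remaining characters from STREAM into a fresh string."
  (with-output-to-string (out)
    (loop for char = (read-char stream nil nil)
          while char
          do (write-char char out))))

(defun count-lines (stream)
  "Stand-in for any function that consumes a character stream."
  (loop for line = (read-line stream nil nil)
        while line
        count line))

(defun process-twice (stream)
  "Consume STREAM once, then run two independent passes over the saved text."
  (let ((text (slurp-character-stream stream)))
    (values (with-input-from-string (s text) (count-lines s))
            (with-input-from-string (s text) (count-lines s)))))
```

Each with-input-from-string form creates a fresh stream positioned at the start of the text, which is why the second pass sees the same data as the first.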

18:37:11 jmercouris ok, I gave up

18:37:19 jmercouris I just go through the stream once and handle it in both palces

18:37:21 jmercouris plaes*

18:37:22 aeth with-output-to-string is the easiest solution, if it's a bidirectional character stream

18:37:24 jmercouris places*

18:38:51 jmercouris it is

18:38:58 jmercouris but it was having problems reading the encoding for some reason

18:39:02 jmercouris I dont know I might revisit this code

18:39:07 jmercouris but I'm tired of looking at it :\

18:39:09 jmercouris at least it works now

18:39:39 jmercouris thank you all for your help/patience :D

18:39:51 jmercouris at least this code is more efficient by just going through the stream once :D

18:40:09 jmercouris goodbye everyone

20:51:15 kenu hi

20:52:00 kenu Is there a way to have quicklisp and asdf work with each other flawlessly?

20:52:22 kenu I've got my stuff in common-lisp folder as per asdf manual

20:52:49 kenu but quickload doesn't store downloaded packages there

20:53:18 kenu manually moving the folders doesn't feel like the right way...

20:56:32 pjb kenu: you can symlink your stuff into ~/quicklisp/local-projects/

21:43:15 Xach kenu: systems provided by quicklisp will always be stored in the quicklisp directory in a certain structure.

21:43:26 Xach kenu: other systems can go anywhere as long as asdf's find-system can find them.

21:50:47 Xach kenu: I'm curious - what prompts the desire to have downloaded stuff go there?

23:54:14 manualcrank hey, people

23:54:49 manualcrank a few days ago i complained about the performance of sbcl on the competitive programming site open.kattis.com, specifically the awful performance (compared to other languages there) of READ-LINE.

23:55:12 manualcrank You absolutely need to be able to consume input as fast as possible on this site. Using ios_base::sync_with_stdio(false) in C++ or memory mapped input in C can chew through the input in literally 0 seconds vs. TLE (time limit exceeded) in Lisp.

23:55:35 manualcrank Here's an example: <https://open.kattis.com/problems/hardwoodspecies>

23:55:52 manualcrank A sol'n based on READ-LINE doesn't finish in the allotted 1 sec :- (

23:56:03 no-defun-allowed https://shinmera.github.io/mmap/

23:56:04 manualcrank Ditto one based on READ-SEQUENCE (in my tests READ-SEQUENCE --- on *STANDARD-INPUT* --- is _slower_ than READ-LINE).

23:56:26 manualcrank So anyway i managed to accidentally cobble together a solution using DEFINE-ALIEN-ROUTINE, which I can't really explain how it works because I'm a complete newbie at foreign function stuff.

23:56:36 manualcrank See it here: <http://pasted.co/d921528c>

23:56:45 no-defun-allowed nice, i was going to suggest that

23:56:48 manualcrank It calls C's gets(). Performance is dramatically better (finishes in < 0.6s, which is average).

23:56:58 manualcrank But (1) it still creates a string object for every line and (2) what I really wanted to do was be able to call fgets_unlocked(), but couldn't figure out how to convert between stdin and *STANDARD-INPUT*

0:16:46 sjl manualcrank: `cat`ing their sample input 36000 times gives me a 1008000 line sample file, which is larger than their limit. Running (time (with-open-file (s "foos") (loop :for l = (read-line s nil nil) :while l :sum 1))) on that in SBCL takes 0.11s on my machine.

0:17:15 sjl So you still have 89% of the allotted time to actually do stuff with the input.

0:17:56 sjl So maybe whatever hardware they're running on is old or something.

0:21:03 hatchback176 .109s in python

0:24:57 sjl Yeah, even a full solution is <1s on my laptop, but exceeds the limit on the site. So they must be running on something old/virtualized to all hell

1:09:10 ebzzry Kabriel: thanks

1:14:14 manualcrank can't use any library that isn't standard btw, so the mmap library is out

1:17:43 manualcrank also, the allotted time includes the time to load and initialize sbcl

1:20:01 Xach working with unsigned-byte 8 streams will be a lot faster.

1:20:23 manualcrank i'll try that

1:20:34 Xach but it will be less convenient and clear. but there are ways to reclaim clarity if not convenience.

1:27:45 aeth manualcrank: do you have to process by the line? depending on what you're doing you could write a state machine that goes character-by-character (or byte-by-byte)

1:29:28 manualcrank no, it's just the most natural

1:29:59 manualcrank i did slurp all of stdin in one attempt and subsequence individual lines out of it

1:30:05 manualcrank but that timed out too

1:31:31 aeth Well subseq isn't a good idea for performance. That's why nearly everything has a start and end optional or keyword parameter (or start1 end1 start2 end2, etc.)
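
A short sketch of the start/end idiom aeth mentions, in standard Common Lisp: lines inside one large buffer are compared in place with the bounding-index arguments of `position` and `string=`, so no substring is allocated per line. `count-matching-lines` is a hypothetical function, not part of anyone's posted solution.

```lisp
;; COUNT-MATCHING-LINES is a hypothetical name used only for illustration.
(defun count-matching-lines (buffer target)
  "Count the newline-separated lines of BUFFER that are equal to TARGET,
without calling SUBSEQ."
  (loop with start = 0
        for end = (position #\Newline buffer :start start)
        count (string= buffer target
                       :start1 start
                       :end1 (or end (length buffer)))
        while end
        do (setf start (1+ end))))
```

This only helps when the line is needed for comparison or parsing; storing each line as a hash-table key still needs its own string, as manualcrank points out.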

1:31:56 manualcrank largest possible input is 1,000,000 trees of 126 (+1 for newline) char-length each

1:32:06 aeth obviously a lot of the time, especially these days, using subseq doesn't really matter

1:32:20 manualcrank i used read-sequence to read that many characters in at once

1:32:44 manualcrank it was too slow

1:33:11 aeth if I needed performance and I didn't need to build up buffers of lines I would just work character-by-character (or, again, maybe just byte-by-byte)

1:33:11 manualcrank subseq is necessary no matter what 'cuz you're putting stuff in a hash table

1:33:44 aeth well, hash table performance isn't great in CL

1:34:01 Bike uh? i would think it would be fine

1:34:10 manualcrank here, too, you really want an ordered map

1:34:19 aeth Bike: this is a very high performance coding contest or something

1:34:29 manualcrank so you have to extract the keys into an array and sort them

1:34:29 aeth the only hash table has everything as a key and a value, which isn't going to benchmark well against e.g. C++

1:35:00 manualcrank honestly the performance of computation-bound stuff is perfectly fine

1:35:15 aeth strange.

1:35:45 aeth I thought the hash table benchmark was one of the places where SBCL was weak in the Benchmarks Game

1:35:52 manualcrank no problems with array and hash table performance, maybe 0.02s slower than C++

1:35:55 Bike even if it is, i/o is probably going to be worse

1:36:11 aeth yeah, I guess this is just overwhelmingly dominated by I/O

1:36:31 manualcrank yeah if you have to read or write a lot of stuff, all the other languages are faster

1:38:16 manualcrank i'm happy with my alien c-gets, but i'd be happier still if i could get fgets_unlocked to work and if return values weren't converted to new string objects

1:38:44 Bike new strings? that would be hard to avoid.

1:39:07 Bike maybe displaced arrays would help. but i kind of doubt that

1:39:17 manualcrank i get the sense it's possible reading sbcl foreign function docs

1:39:27 manualcrank but i'm too new to fully understand what i'm reading

1:39:33 aeth manualcrank: I wonder if working with bytes makes a difference over working with characters? A lot of languages use really weird strings but in SBCL afaik it's just utf32 so it reads utf8 into utf32 and there's a translation. Idk if that would make a difference, though

1:39:49 Bike oh, with alien you mean

1:39:59 Bike yeah, xach suggested that already.

1:40:11 aeth ah, right

1:41:04 manualcrank (setf sb-impl::*default-external-format* :utf-8) has no noticeable effect

1:41:36 aeth external format.

1:41:49 aeth internal format is utf-32

1:41:53 aeth iirc

1:41:57 Bike yes, probably.

1:45:16 manualcrank uiop has uiop:read-file-string but that takes a filename arg and anyway uiop isn't an option on the site

1:45:18 aeth so it takes 123 124 and turns it into 0 0 0 123 0 0 0 124. You could see this yourself reading some FASLs with a binary reader, like hexl-mode in emacs. I think, though, at some point they might have optimized the size to get rid of that, though. It looks like that's just for symbols, though

1:45:48 aeth literal strings saved in the fasl still have the 000000 between each character (assuming ASCII subset of unicode)

1:46:07 Bike you realize sb-alien is also not standard, i hope

1:46:48 aeth the rule is probably no third party libraries, I hope

1:46:54 manualcrank that's the rule

1:47:23 aeth That still interferes with CL a lot. CL is a very multi-implementation language and each implementation implements extensions subtly differently and we rely heavily on portability libraries to actually use features

1:49:05 Bike my dipshit solution can do 2.9 mil trees in 1.04s. let me try it with bytes or sorting as it goes or something

1:49:27 Bike oh, that is only input, though, i guess

1:49:56 Bike i suspect my solution is so dumb it's cpu bound, though

1:51:46 manualcrank i don't mind not being able to use portability libs, it's just a puzzle site, point is to learn some algorithms

2:13:01 Bike adjustable array crap worse, well that's nice

2:13:16 Bike if displaced arrays are actually useful to me at some point it'll worry me

2:30:31 moldybits how do you specialize on a 2 dimensional array?

2:33:13 moldybits '(array * 2), i guess

2:33:32 Bike specialize as in methods? can't

2:36:24 moldybits oh

2:36:51 moldybits just array?

2:43:25 Bike array, vector, bit-vector, string
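
To illustrate Bike's answer with a small sketch: methods specialize on classes, and `(array * 2)` is a type specifier rather than a class, so the usual workaround is to specialize on `array` and test the rank inside the method body. `shape-kind` is a hypothetical generic function used only for this example.

```lisp
;; SHAPE-KIND is a hypothetical generic function used only for illustration.
(defgeneric shape-kind (object)
  (:documentation "Classify OBJECT by its array shape."))

(defmethod shape-kind ((object array))
  ;; The rank cannot be part of the specializer, so it is checked here.
  (if (typep object '(array * 2))
      :matrix
      :other-array))

(defmethod shape-kind ((object vector))
  ;; VECTOR is a class, so one-dimensional arrays can get their own method.
  :vector)
```

Because `vector` is a subclass of `array`, the vector method is the more specific one and wins for one-dimensional arrays.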

2:53:57 manualcrank update to <http://pasted.co/d921528c>: reason it lops off the \n is because that's what gets() does, duh (fgets() doesn't)

2:54:36 manualcrank also, returning a c-string according to docs implies a conversion to a Lisp string

2:55:14 manualcrank i can though specify it returns (* char) and just return the buf i used to alien call gets

2:57:03 manualcrank in that case there's no conversion, just an array of char (i changed the code to element type (unsigned-byte 8) though)

2:57:41 manualcrank you have to find the 0 (#\Nul if array of character) yourself then

3:07:56 manualcrank also, base-char is better than character because the former doesn't convert going from Lisp to C

3:09:24 Bike that would be the utf-32