freenode/#lisp - IRC Chatlog
11:34:29
beach
Speaking of data structures, I am collecting different versions of the standard binary search algorithm as published in various text books. So if you have a text book on algorithms that contains a version of binary search, I would like a copy of it (the algorithm, not the book) and the name of the book.
11:35:15
beach
The one in Aho, Hopcroft, and Ullman takes twice the time that it should, for instance.
11:41:08
dim
beach: would you be game to review what PostgreSQL is doing in that area? It's C code, but well commented and with references to original papers and algos when they've been implemented from papers
11:43:05
shka_
beach: binary search limited to simply ordered vectors, or also pointerless trees?
11:52:52
flip214
beach: is that interesting to you too? https://www.pvk.ca/Blog/2012/07/30/binary-search-is-a-pathological-case-for-caches/ I guess you already read that, though.
12:58:56
dim
mmm, what are the subtleties of #+pgloader-image and such wrt compiling code to a binary image?
12:59:51
dim
I've been using that to have interactive error handling in the REPL but backtraces when /usr/bin/pgloader is run, and now I realize it doesn't work like I want, or at all
13:07:14
beach
dim: Sure, I'll look at it if you point me to the source code. I have programmed in C so I should be able to understand it.
13:08:42
beach
dim: Otherwise, it is quite simple. If there are two tests per iteration, the first one testing for equality, then it is wrong.
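For reference, a minimal Common Lisp sketch of the fast variant beach is describing, with the equality test deferred out of the loop (the function and parameter names are illustrative):

    ;; One comparison per iteration; equality is tested once, after
    ;; the loop, instead of on every iteration.
    (defun binary-search (item vector &key (lessp #'<))
      "Return the position of ITEM in VECTOR, or NIL if it is absent.
    VECTOR must be sorted in non-decreasing order under LESSP."
      (let ((lo 0)
            (hi (length vector)))
        ;; Invariant: elements before LO are less than ITEM; elements
        ;; at or after HI are greater than or equal to ITEM.
        (loop while (< lo hi)
              do (let ((mid (floor (+ lo hi) 2)))
                   (if (funcall lessp (aref vector mid) item)
                       (setf lo (1+ mid))
                       (setf hi mid))))
        ;; Single deferred equality test.
        (when (and (< lo (length vector))
                   (not (funcall lessp item (aref vector lo))))
          lo)))

    ;; (binary-search 7 #(1 3 5 7 9)) => 3
    ;; (binary-search 4 #(1 3 5 7 9)) => NIL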
13:09:19
dim
the file would be https://github.com/postgres/postgres/blob/master/src/backend/access/nbtree/nbtree.c
13:12:20
dim
I thought that searching in a binary tree would count as binary search; sorry, realizing the mistake now
13:13:27
dim
Postgres just uses bsearch() and has its own sorting facilities (with abbreviated keys in some cases), but I don't think it has its own binary search actually
13:15:09
dim
have a look at this one for the fun of it: https://github.com/postgres/postgres/blob/97c39498e5ca9208d3de5a443a2282923619bf91/src/include/nodes/pg_list.h
13:17:25
dim
the history of it makes it a little better, I think: the PostgreSQL optimizer is known to have been written in Lisp in the 80s, but then they switched to C like the rest of the system, and they wrote that pg_list.h implementation to help with the porting
13:17:37
beach
Oh, and they even included the stuff that Common Lisp had to include for historical reasons, i.e. that the CAR and the CDR of NIL are NIL.
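For anyone unfamiliar with the detail: in Common Lisp the empty list NIL is a valid argument to both accessors, so list-walking code needs no separate end-of-list case. At the REPL:

    (car nil)        ; => NIL
    (cdr nil)        ; => NIL
    (cdr (cdr '(1))) ; => NIL, not an error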
13:19:22
dim
by then it was a university project by Stonebraker and mainly used to host PhD theses, up to 1995
13:21:18
dim
ok I'm not sure why yet, but it seems that #+pgloader-image isn't defined in the image I'm using, for some reason
13:22:25
dim
I'm doing (push :pgloader-image *features*) in src/hooks.lisp, which I don't load with ASDF but manually on the command line (using --load) when building the image; but maybe that feature is needed at compile time? read time?
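The answer is read time: #+pgloader-image is a reader conditional, so :pgloader-image must already be on *features* whenever a file containing the conditional is read, which for ASDF-compiled files means at compile time. A minimal sketch of one way to arrange that, assuming the push lives in a file loaded before everything that uses the conditional:

    ;; Reader conditionals are resolved when the source is READ, so the
    ;; feature must be present both when this file is compiled and when
    ;; its fasl is loaded later.
    (eval-when (:compile-toplevel :load-toplevel :execute)
      (pushnew :pgloader-image *features*))

Note that fasls compiled before the feature was pushed keep their old reading of the conditionals; ASDF does not recompile just because *features* changed, so stale fasls have to be rebuilt.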
14:31:45
Demosthenex
so i was tinkering with reading some machine generated text (https://bpaste.net/show/eb139faf9503) i wanted to throw into a table for searching. i could do this with regexp only, but given i have a variety of outputs to look at i was considering a lexer and parser. i found alexa, but cl-yacc seems to be all over. sound like the right tool? what's the right cl-yacc home?
14:36:47
beach
Demosthenex: There are plenty of parser libraries for Common Lisp. One is esrap, which is maintained by scymtym. He might know more about the subject.
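As a taste of what esrap rules look like, a sketch for "key: value" lines (the grammar is invented for illustration, not tied to Demosthenex's data):

    (ql:quickload "esrap")

    (esrap:defrule whitespace (+ #\space)
      (:constant nil))

    (esrap:defrule key (+ (alphanumericp character))
      (:text t))

    (esrap:defrule value (+ (graphic-char-p character))
      (:text t))

    ;; Parse one "key: value" line into a (KEY . VALUE) cons.
    (esrap:defrule key-value (and key #\: (? whitespace) value)
      (:destructure (k colon ws v)
        (declare (ignore colon ws))
        (cons k v)))

    ;; (esrap:parse 'key-value "hostname: example.org")
    ;; => ("hostname" . "example.org")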
14:37:39
Demosthenex
beach: yeah, i was reading about a few on cliki, but all the links to google code are dead
14:41:36
shka_
Demosthenex: also i find monadic parser combinators to be interesting and easy to use
14:42:20
Demosthenex
i haven't used a lexer/parser since cs classes in school years ago. i just thought that, given the regularity of my input data, a regexp for each one was a waste of time if lex/parse could do it faster.
14:57:26
LdBeth
Demosthenex: actually, for the kind of text you've shown, you'd be better off using awk-like tools rather than a parser generator
15:07:51
Demosthenex
LdBeth: that's an interesting assertion. i generally have used shell tools like awk to pull out that data, i was trying to make it a bit more formal/flexible to load into postgres, and since i was already using postmodern ;]
15:11:03
Demosthenex
i think the key difference is i've used awk for line operations instead of longer paragraph records and data extraction.
15:18:10
dim
Load the data into PostgreSQL, then process it in SQL; that's very flexible and powerful, and you get advanced, fast support for data processing
15:19:20
Demosthenex
dim: i only ever insert finished data into sql... do you have a link to an example?
15:22:35
dim
use COPY to load the data quickly, then process it, I think I have many examples of that on my blog
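A minimal sketch of that flow from CL via Postmodern (the connection parameters, file name, and table names are all invented, and a destination table records(name text, value text) is assumed to exist):

    (ql:quickload "postmodern")

    (postmodern:with-connection '("mydb" "me" "secret" "localhost")
      ;; Staging table: one raw input line per row, no parsing at load time.
      (postmodern:execute "CREATE TABLE IF NOT EXISTS raw_lines (line text)")
      ;; COPY is the fast bulk-load path (server-side file here; the
      ;; client-side COPY streaming protocol works too, see below).
      (postmodern:execute "COPY raw_lines FROM '/tmp/input.txt'")
      ;; Then extract fields in SQL, e.g. with regexps over the raw lines:
      (postmodern:execute
       "INSERT INTO records (name, value)
        SELECT substring(line FROM '^(\\w+):'),
               substring(line FROM '^\\w+:\\s*(.*)$')
          FROM raw_lines
         WHERE line ~ '^\\w+:'"))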
15:23:35
Demosthenex
well, i understood loading the raw text in. i just hadn't considered parsing the text to extract useful information inside sql... i was leaning on lisp for that
15:29:01
Demosthenex
i hope in your book you say "using multiline regexps to try and match data is a bad idea" ;]
15:35:54
Demosthenex
hrm. i see the GROUP BY on a replaced string, but that's not the actual record extraction. you're extracting from xml
15:39:31
dim
I've been doing that a lot, and I thought I had the idea covered better on the blog; I have lots of articles where I thought I would explain that, but it turns out I did the work to prepare the data and that's usually not the topic of the blog post
15:39:57
dim
if you want to see something very involved, maybe you'll like https://tapoueh.org/blog/2017/09/on-json-and-sql/
15:48:45
dim
anyway, either filter on the fly with CL and use Postmodern support for the COPY streaming protocol (see cl-postgres:db-write-row and friends and an example at https://github.com/dimitri/pubnames/blob/master/pubnames.lisp); or just load a CSV-like input file and then process in SQL, that's the easiest way to do it, mainly because then the input data and their transformations are well defined
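The COPY streaming calls dim mentions look roughly like this (the database credentials and the pubs table are invented for the sketch):

    (ql:quickload "cl-postgres")

    ;; Stream rows over the COPY protocol, assuming a table like
    ;;   CREATE TABLE pubs (name text, lon double precision, lat double precision)
    (let* ((conn (cl-postgres:open-database "mydb" "me" "secret" "localhost"))
           (writer (cl-postgres:open-db-writer conn "pubs" '("name" "lon" "lat"))))
      (unwind-protect
           (progn
             ;; Each call sends one row down the COPY stream.
             (cl-postgres:db-write-row writer (list "The Crown" -0.12d0 51.51d0))
             (cl-postgres:db-write-row writer (list "The Anchor" -0.09d0 51.50d0)))
        ;; Ends the COPY and makes the rows visible.
        (cl-postgres:close-db-writer writer)
        (cl-postgres:close-database conn)))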
16:03:13
Lycurgus
looking for a van Wijngaarden grammar based parser in cl, will check log for any replies, ty in advance
16:57:12
Demosthenex
dim: the filter and such sounds applicable to single-row records; i'm having to collate data from a multiline record and transpose it into a single row for insertion, and i don't think i can do that during a COPY
16:58:23
dim
that's what I'm doing in the pubnames example, where the input format is XML key/value with a single k/v per input line, and I'm using several values to form a single row in the COPY stream
16:58:49
dim
if you process in CL then just call db-write-row when you have all the tuple information
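A sketch of that collation step (the blank-line-separated key/value record format and the field names are invented): accumulate pairs until a record boundary, then emit one complete row, e.g. through the writer from the snippet above.

    ;; Collate multiline records into single rows; EMIT receives one list
    ;; per record, e.g. (lambda (row) (cl-postgres:db-write-row writer row)).
    (defun map-records (stream emit)
      (let ((record '()))
        (flet ((flush ()
                 (when record
                   (funcall emit
                            (list (cdr (assoc "name" record :test #'string=))
                                  (cdr (assoc "lon" record :test #'string=))
                                  (cdr (assoc "lat" record :test #'string=))))
                   (setf record '()))))
          (loop for line = (read-line stream nil)
                while line
                do (let ((colon (position #\: line)))
                     (if colon
                         (push (cons (subseq line 0 colon)
                                     (string-trim " " (subseq line (1+ colon))))
                               record)
                         ;; A line without "key:" (e.g. blank) ends the record.
                         (flush)))
                finally (flush)))))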