freenode/#shirakumo - IRC Chatlog
19:49:58
Shinmera
they don't, but it can be emulated and/or it'll just write anyway and "succeed" every commit.
19:50:43
SAL9000
might be cleaner to emulate transactions as the depot library deferring everything until commit occurs?
19:50:59
SAL9000
so long as it's clear to the API user that it doesn't guarantee atomicity, of course
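The deferral idea above could look roughly like the following. This is a minimal sketch, not depot's API: the class, its slots, and the function names are all hypothetical, and a hash table stands in for a backend that has no native transactions.

```lisp
;; Hypothetical sketch: none of these names come from the depot library.
(defclass deferred-transaction ()
  ((transaction-target :initarg :target :reader transaction-target) ; hash table standing in for a backend
   (pending-writes :initform '() :accessor pending-writes)          ; queued writes, newest first
   (atomic-p :initform nil :reader atomic-p)))                      ; advertised: commit is NOT atomic

(defun queue-write (transaction id data)
  "Queue a write; nothing reaches the backend until COMMIT-TRANSACTION."
  (push (cons id data) (pending-writes transaction))
  transaction)

(defun commit-transaction (transaction)
  "Apply queued writes in order. A failure midway leaves earlier writes applied."
  (loop for (id . data) in (reverse (pending-writes transaction))
        do (setf (gethash id (transaction-target transaction)) data))
  (setf (pending-writes transaction) '())
  (transaction-target transaction))

(defun abort-transaction (transaction)
  "Discard queued writes without touching the backend."
  (setf (pending-writes transaction) '())
  nil)
```

The ATOMIC-P reader is one way to make the missing guarantee visible to the API user, which is the kind of feature flag the next message warns about.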
19:51:15
Shinmera
sure, my point is just that it's more general to do it like this than to present a raw streams interface.
19:51:48
SAL9000
point being that you're setting yourself up for feature-flags or real-transactions-p or SOMETHING
19:51:48
Shinmera
as for your other question, detecting whether an entry is denoting a certain type is a separate problem
19:53:49
SAL9000
maybe depot can have backends or hooks -- e.g. magic vs. file extension -- which allow CHANGE-CLASSing the depot:entry into any of its subclasses?
19:54:48
Shinmera
that's what I'm asking. I can see three avenues: 1) a function that takes a "type" and returns a class to use. 2) a function that encapsulates the entry in another instance 3) a set of functions that can decide to "accept" the entry and change-class it
19:55:41
SAL9000
#1 seems quite limited. Does it matter for depot whether the caller does change-class or encapsulation or mixins...?
19:55:44
Shinmera
My issue with 1 and 3 is that you might need to multiplex the type depending on the "origin depot"
19:56:33
Shinmera
My issue with 2 is that you need a separate call to convert it to the depot that is denoted by the entry.
19:57:08
SAL9000
caller could theoretically do encapsulation + delegation so you don't need a separate call
19:58:09
SAL9000
(i.e. the wrapper object quacks like a depot:entry for all intents and purposes, without necessarily satisfying DEPOT:ENTRY-P)
19:58:18
Shinmera
Wait, no, 2) is bad because you could not transparently traverse through depots, since now you need to request depot conversion.
19:59:07
Shinmera
One idea of the system is that you can go like (entry* os-depot "home" "linus" "some.zip" "my-file.dat")
19:59:31
Shinmera
or, more transparently, "some" instead of "some.zip" where "some" could be a zip or a directory, or whatever.
20:02:38
Shinmera
though it would be more like (entry* http-depot '("org" "wikipedia" "en") "wiki" "common lisp")
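A toy model of that transparent traversal, assuming every intermediate result can itself be treated as a depot. Only the name ENTRY* is taken from the chat. TABLE-DEPOT, DEPOT-CHILDREN, and LOOKUP-ENTRY are made up for this sketch, and the real protocol may differ.

```lisp
;; Hypothetical toy protocol; only the name ENTRY* appears in the chat.
(defgeneric lookup-entry (id depot)
  (:documentation "Return the child named ID inside DEPOT, or signal an error."))

(defclass table-depot ()
  ((depot-children :initarg :children :reader depot-children))) ; alist of (name . child)

(defmethod lookup-entry (id (depot table-depot))
  (let ((cell (assoc id (depot-children depot) :test #'equal)))
    (if cell
        (cdr cell)
        (error "No entry named ~s." id))))

(defun entry* (depot &rest ids)
  "Walk IDS left to right, treating each intermediate result as a depot."
  (reduce (lambda (current id) (lookup-entry id current))
          ids :initial-value depot))
```

In a real system the step from "some.zip" to a zip depot is where the conversion question from this discussion comes in. Here every level is already a TABLE-DEPOT, so the walk never has to convert anything.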
20:05:44
Shinmera
another thing to consider is that, for instance if you open a depot that's a zip archive, from within another zip archive depot, it needs to know to "extract" that inner file first to be able to present it.
20:07:39
Shinmera
the issue is how to do it efficiently. The primitive way would be to read the entry into memory completely, then treat it as a zip archive.
20:08:17
Shinmera
but for zip depots that are backed by a standard fs that has streams and whatever, it's far more reasonable to use the underlying fs to access it so you /don't/ have to read it into memory.
20:09:04
Shinmera
another layer that should be broken is if you have an in-memory octet vector representation that the zip could be parsed from without having to copy the sequence.
20:11:50
Shinmera
what I'm gathering here is that we need a way to multiplex depending on the backing storage. so whatever converts an entry to another depot (or more specific other type, whatever) needs to be able to dispatch based on parent depot.
20:12:36
Shinmera
so the zip "plugin" would have a method for a directory depot and a method for a sequence-depot (that's sequence-backed) and a method for a generic depot that just reads the entry data in.
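Dispatching on the parent depot maps fairly directly onto CLOS methods. The sketch below is an assumption about the shape only: the class names and OPEN-AS-ZIP are hypothetical, and each method returns a keyword describing the strategy instead of doing any real I/O.

```lisp
;; Hypothetical classes and generic function, for illustration only.
(defclass base-depot () ())
(defclass directory-depot (base-depot) ())  ; backed by a real file system
(defclass sequence-depot (base-depot) ())   ; backed by an in-memory octet vector
(defclass zip-depot (base-depot) ())        ; no cheaper access path assumed here

(defgeneric open-as-zip (parent entry)
  (:documentation "Describe how a zip nested in PARENT would be accessed."))

(defmethod open-as-zip ((parent base-depot) entry)
  ;; Fallback: works for any parent, but reads the whole entry into memory.
  (list :read-into-memory entry))

(defmethod open-as-zip ((parent directory-depot) entry)
  ;; The file system already offers streams, so no in-memory copy is needed.
  (list :file-stream entry))

(defmethod open-as-zip ((parent sequence-depot) entry)
  ;; Address a sub-range of the existing vector rather than copying it.
  (list :shared-subsequence entry))
```

A parent with no specialised method, such as the ZIP-DEPOT above, falls through to the generic method that reads the entry data in.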
20:12:48
SAL9000
hold on, what if you model it as a set of "desirable" properties, which some depots cannot conserve?
20:14:48
Shinmera
Seems really annoying to work with. I'd much rather have some mixins that allow you to extend the protocol
20:16:06
SAL9000
right... but if the directory-depot is inside a zip file, you've lost zero-copy with respect to the backing store already...
20:16:32
Shinmera
you wouldn't have a directory-depot in a zip file, it'd have to be a zip-directory-depot.
20:18:07
Shinmera
I think what I'm seeing is that stuff that's not compressed can nest arbitrarily quite naturally, and stuff that is will "automatically" flatten the hierarchy back to an in-memory sequence anyway and continue from there.
20:19:20
SAL9000
I think there's some compressions where you can start at a keyframe-like thing and go from there
20:20:01
SAL9000
also some archive types (zip?) don't force you to decompress all files, just the files you want to read... right?
20:20:13
Shinmera
also I think it would be worthwhile to do the change-class/conversion as a separate method that's automatically invoked as soon as you try to open the entry but no earlier, to avoid stuff like listing entries opening tons of zip files.
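One way to picture that lazy conversion: listing only touches names, and the CHANGE-CLASS runs the first time an entry is opened. Everything here is hypothetical, including the crude ".zip" suffix check that stands in for whatever type detection ends up being used.

```lisp
;; Hypothetical sketch of convert-on-open; not the depot protocol.
(defclass plain-entry ()
  ((entry-name :initarg :name :reader entry-name)
   (probed-p :initform nil :accessor probed-p)))   ; has the conversion been attempted yet?

(defclass archive-entry (plain-entry) ())

(defun zip-name-p (name)
  "Crude stand-in for type detection: does NAME end in \".zip\"?"
  (let ((tail (- (length name) 4)))
    (and (>= tail 0) (string-equal ".zip" name :start2 tail))))

(defun ensure-converted (entry)
  "Probe ENTRY at most once, switching its class if it denotes an archive."
  (unless (probed-p entry)
    (setf (probed-p entry) t)
    (when (zip-name-p (entry-name entry))
      (change-class entry 'archive-entry)))
  entry)

(defun list-names (entries)
  "Listing only reads names, so it never triggers a conversion."
  (mapcar #'entry-name entries))

(defun open-entry (entry)
  "Opening is the first point at which the conversion runs."
  (class-name (class-of (ensure-converted entry))))
```

Because ARCHIVE-ENTRY subclasses PLAIN-ENTRY, the slot values survive the CHANGE-CLASS and the entry keeps its identity for anyone already holding a reference to it.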
20:20:55
Shinmera
When I say "decompress" here I don't mean when you open the zip depot, but when you access another zip in a zip.
20:21:34
SAL9000
right, which would (unless you go crazy with the layer-breaking) force you to decompress the whole 2nd-level zip file from the 1st-level one
20:23:14
SAL9000
yes, but it's still a zip-in-zip and "default" behaviour still "decompresses" it into a sequence
20:23:46
Shinmera
well the zip backend can make use of the compression attribute to figure out what to do.
20:24:15
Shinmera
or actually, requesting an entry from the depot will automatically figure out what to do
20:25:25
SAL9000
child tries to do zero-copy read, if parent is "store" it works, if parent is compressed it "fails" by making a new sequence...
20:26:43
SAL9000
parent constructs new sequence, stores in a cache slot, which is then used to serve future zero-copy calls.
20:26:58
Shinmera
zip1 itself can't be compressed since it has to be decompressed to see what it contains. zip2 in zip1 can be compressed and will only decompress if needed.
20:28:13
SAL9000
"fails" in that it ends up having to do more work -- decompress the whole zip -- than the child may immediately need
20:29:29
SAL9000
if zip2 (child) is not compressed, zip1 (parent) can read direct from backing store just those bytes that child wants
20:30:45
Shinmera
zip1 would not do anything. zip2 is an entry that contains a stream and an offset in that case.
20:31:43
SAL9000
in your draft protocol, WRITE-TO/READ-FROM start/end parameters are referring to the sequence, not the entry, right?
20:34:53
Shinmera
this entry/depot knows it's coming from a directory-depot and as such internally has a file-stream to do its stuff with
20:35:54
Shinmera
when you request a list of entries, it parses the zip file structure and generates zip-entry-depots, each of which knows it's coming from a zip file, so they have the corresponding zip entry structure in them.
20:35:54
SAL9000
so... you end up with a zip-file-depot-from-directory-depot, zip-file-depot-from-zip-file-depot, etc.?
20:37:23
Shinmera
so that entry is a zip-file-depot with a file-stream that starts at a certain position
20:37:55
SAL9000
Maybe I'm looking at it wrong, but I'm seeing that file-stream as a (severe) layering violation
20:38:43
SAL9000
Why does the child zip-file-depot need to know that it comes from a real file-system?
20:41:30
Shinmera
in your story the depot now also needs to have a table to store where entries start/end or whatever rather than that info and whatever else just more naturally being part of the entry.
20:42:30
Shinmera
well now you're forcing every entry to have a start/end index even when there's no need to have one
20:43:31
Shinmera
my point is, why not just store all the info that's related to how to get the entry data in the entry itself
20:45:40
Shinmera
yes because in the above scenario we have depots that are connected to file-streams.
20:45:56
SAL9000
so you either need to duplicate the code -- file-stream version vs sequence version
20:46:32
Shinmera
you don't want to do the latter because it's slow and would just lead to more copying.
20:46:55
Shinmera
having two variants is not much more effort and is far more efficient because now you can create sub-entries that just address sub-sequences.
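For the sequence-backed variant, a displaced array is one standard Common Lisp way to address a sub-sequence without copying. Whether the library does it this way is not stated in the chat, and SUB-OCTETS is a hypothetical helper.

```lisp
;; Hypothetical helper: a zero-copy view onto part of an octet vector.
(defun sub-octets (vector start end)
  "Return a view onto VECTOR from START below END that shares its storage."
  (make-array (- end start)
              :element-type (array-element-type vector)
              :displaced-to vector
              :displaced-index-offset start))
```

Writes through the view are visible in the parent vector, since no data is copied. The trade-off is that a displaced array is not a simple array, so element access is typically slower than on the underlying vector.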
20:47:31
SAL9000
...would it make sense to provide something that quacks like a sequence but is backed by a file-stream?
20:48:28
Shinmera
in zippy I have an "io" structure that has its own API and has two implementations for vectors or for streams.
20:49:08
Colleen
github.com/Shinmera/zippy/b... Website (HTML), Title: zippy/io.lisp at master · Shinmera/zippy · GitHub
20:51:31
Shinmera
it's still not "very fast" because each individual call of these functions will dispatch, rather than dispatching once and then the rest knows.
20:52:07
Shinmera
the defmethod is because size is used elsewhere and I was too lazy to separate them.
20:53:16
SAL9000
...right, you still don't win unless you have two (identical) copies of the body of the caller, under an etypecase
20:54:19
Shinmera
having some magic that eliminates further dispatchers once it knows the type would be nice.
20:56:27
Shinmera
but usually emitting typecase is "good enough" and while slow at compiling, SBCL should eliminate the branching if it knows the variable type.
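The "two identical copies of the body under an etypecase" approach can be generated by a small macro, so the dispatch happens once and each copy of the body is compiled against a known type. WITH-KNOWN-TYPE and OCTET-CHECKSUM are hypothetical names, and a list stands in for the second backend so the sketch runs without any file I/O.

```lisp
;; Hypothetical macro: duplicate BODY under each ETYPECASE branch.
(defmacro with-known-type ((var &rest types) &body body)
  "Dispatch on the type of VAR once; each branch gets its own copy of BODY."
  `(etypecase ,var
     ,@(loop for type in types
             collect `(,type ,@body))))

(defun octet-checksum (source)
  "Sum the elements of SOURCE modulo 256."
  (with-known-type (source (simple-array (unsigned-byte 8) (*)) list)
    (let ((sum 0))
      (dotimes (i (length source) sum)
        (setf sum (mod (+ sum (elt source i)) 256))))))
```

Inside each branch a compiler that does type inference can treat LENGTH and ELT as operating on a known type, which is the effect described above. How much of the generic dispatch actually gets eliminated depends on the implementation and optimisation settings.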