I like filesystems, because they're rugged and survivable and import/exportable between multiple OSes and storage formats.

But I guess I would really love a computer which had a "datasystem" instead of a filesystem. And where some of that data could be automatically consistency-constrained based on some definable criterion (which itself would have to be able to be stored as data). I say "criterion" because it *might* be a "type" but many of today's type systems don't cut it for databases.

And I say a "datasystem" rather than a singular "database" because the point would be that, like a filesystem, you could link any part of the data to any other part.

(subject to some kind of object capability system)

It wouldn't be a bunch of separated silos that you'd have to jump completely out of an "app" environment to make links between.

I *suppose* you could build such a system on top of today's indie darling, SQLite, but, I don't know that it would be especially table-oriented data.

I mean it would probably be one table and that table would be either pairs or triples. Cos it would be something like either RDF or JSON objects at the node level, and there'd be nothing but that one node type in it.
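
(A minimal sketch of the "one table of triples" idea, using Python's standard `sqlite3` module. The table name `triples`, the column names, and the `note:1` style identifiers are all made up for illustration; this is not a claim about how RDF stores are really built.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table, one row shape: (subject, predicate, object).
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.execute("CREATE INDEX idx_subject_predicate ON triples (subject, predicate)")

# Hypothetical facts; a link is just another triple.
facts = [
    ("note:1", "title", "Datasystem ideas"),
    ("note:1", "links_to", "note:2"),
    ("note:2", "title", "Pick OS"),
]
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", facts)


def titles_linked_from(subject):
    """Follow 'links_to' edges from a subject and return the targets' titles."""
    rows = conn.execute(
        """
        SELECT t2.object
        FROM triples AS t1
        JOIN triples AS t2 ON t2.subject = t1.object
        WHERE t1.subject = ?
          AND t1.predicate = 'links_to'
          AND t2.predicate = 'title'
        """,
        (subject,),
    ).fetchall()
    return [row[0] for row in rows]


linked = titles_linked_from("note:1")
```

Following a link is a self-join on the same table, which is roughly why this shape feels less "table-oriented" than ordinary SQL schemas.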

SQLite seems like it would be kinda overengineered for the purpose.

But if you absolutely had to, sure, you probably *could* use SQLite as the engine, if that was the only last resort persistent storage possibility left open to you by today's Programming Super Best Practices (tm).

@natecull what are you afraid of? it’s not like it’s SQLheavy

@zens

If I can make queries to it without making up a big old clumsy string of "lol quotation marks are escapes lol" then I'd be a little happier.

Maybe I can? I've not used it.
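
(For what it's worth, a small sketch of bound parameters in Python's standard `sqlite3` module. The `notes` table and the sample text are invented for the example. The query itself is still a string, but the values travel separately and need no quoting or escaping.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")

# A value full of quotes and SQL-looking text, passed as a bound parameter.
awkward = 'She said "it\'s fine"; DROP TABLE notes; --'
conn.execute("INSERT INTO notes (body) VALUES (?)", (awkward,))

# The same value is used as a parameter again when reading it back.
stored = conn.execute(
    "SELECT body FROM notes WHERE body = ?", (awkward,)
).fetchone()[0]
```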

@dredmorbius @zens

Oh dear. Oh no. Putting files inside SQLite, which is itself a file in a perfectly good filesystem? That feels like the opposite of the right way to do anything.

I guess it's a clever hack, but ugh no thanks.

@dredmorbius @zens

The sort of thing I'm longing for is really the Pick OS, where instead of files you had this systemwide, OS-provided JSON-like storage layer. So you automatically just get recursive hash objects inside your "files", which are a thing everyone always wants instead of streams of bytes. That was nice and humane and I'm very sorry that Pick died and Unix won because this is a world where we don't deserve nice things

But, maybe with types and capabilities as well.

@natecull As I did with Alexandra earlier today: how familiar are you with IBM mainframes?

They had files with specific structured data formats.

You needed specific programs to read those files. As in, custom-written.

@zens

@dredmorbius @zens

Needing custom-written programs to read and write data files produced by another application sounds like the exact opposite of the thing I'm wanting. Is there a reason why you mentioned this?

@natecull Note that the access methods aren't customised.

But the parsing procedures are.

I mention it because it all but inevitably remains the case. If you're dealing with complex data and working with it, your processes for doing so tend to reflect that data and must be adapted to it.

I've seen five to eight decades of attempts to ignore this fact. They've gone poorly.

@zens

@dredmorbius @zens

and the types, obviously, have to be both tagged and described in terms of JSON-like objects. Obviously, because that's the only kind of object that "Pick 2022" can store. Obviously.

It turns out that that part isn't obvious at all to anyone working on typesystems today, because they're all focusing on writing compilers, where they can invent their own weird moon-syntax for the type definitions, use something else entirely for tagging, and then erase both after compilation.
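
(A rough sketch of what "types tagged and described as JSON-like objects" could look like, in plain Python. The `kind`, `fields`, and `items` keys and the `conforms` function are hypothetical names invented here, not any existing type system.)

```python
def conforms(value, type_desc):
    """Check a JSON-like value against a type that is itself a JSON-like object."""
    kind = type_desc["kind"]
    if kind == "string":
        return isinstance(value, str)
    if kind == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if kind == "array":
        return isinstance(value, list) and all(
            conforms(item, type_desc["items"]) for item in value
        )
    if kind == "object":
        return isinstance(value, dict) and all(
            name in value and conforms(value[name], field_type)
            for name, field_type in type_desc["fields"].items()
        )
    return False


# The type description is ordinary data, so it could live in the same store.
person_type = {
    "kind": "object",
    "fields": {
        "name": {"kind": "string"},
        "tags": {"kind": "array", "items": {"kind": "string"}},
    },
}

# A value carrying its type tag alongside the payload.
tagged = {"type": "person", "value": {"name": "Ada", "tags": ["x", "y"]}}

ok = conforms(tagged["value"], person_type)
bad = conforms({"name": 42, "tags": []}, person_type)
```

Nothing here is erased after "compilation": both the tag and the description stay around as data.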

@natecull @dredmorbius it is also very much a “core” part of how mac os is designed: storing the kind of data you are talking about in sqlite dbs is how all in house apple apps work en.m.wikipedia.org/wiki/Core_D

@zens @dredmorbius

That sounds quite nice, and very much like what I want!

But as long as it's limited to the Apple ecosystem, it's not cross-platform and therefore fails the "data must be robustly stored" requirement.

Are there cross-platform, multi-language, open-source implementations of Core Data?

@zens @dredmorbius

Also, the system I'm thinking about MUST include a way to join or link data from multiple files and from remote sources (and do so in a secure manner, however security is defined; that part is probably Hard). Otherwise it's still just per-app data restricted to app silos.
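
(One existing mechanism that covers a small part of this: SQLite's `ATTACH DATABASE` lets a single query join tables from separate database files. The sketch below uses Python's standard `sqlite3` module with invented file names and tables. It says nothing about remote sources or the security side, which remain the hard part.)

```python
import os
import sqlite3
import tempfile


def make_db(path, statements):
    """Create a small database file by running the given SQL statements."""
    conn = sqlite3.connect(path)
    for sql in statements:
        conn.execute(sql)
    conn.commit()
    conn.close()


# Two independent files, standing in for two "app silos".
tmp = tempfile.mkdtemp()
people_path = os.path.join(tmp, "people.db")
notes_path = os.path.join(tmp, "notes.db")

make_db(people_path, [
    "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)",
    "INSERT INTO people VALUES (1, 'Ada')",
])
make_db(notes_path, [
    "CREATE TABLE notes (author_id INTEGER, body TEXT)",
    "INSERT INTO notes VALUES (1, 'hello from another file')",
])

# Open one file, attach the other, and join across both.
conn = sqlite3.connect(people_path)
conn.execute("ATTACH DATABASE ? AS other", (notes_path,))
rows = conn.execute(
    """
    SELECT p.name, n.body
    FROM people AS p
    JOIN other.notes AS n ON n.author_id = p.id
    """
).fetchall()
conn.close()
```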

@natecull @dredmorbius i admit i was actually talking about several things. Core Data is stored in just normal sqlitedb files, and sqlite with its ecosystem (see datasette in particular) is getting pretty close to text files in terms of versatility and interchangeability. like the list of affordances for text files i shared the other day, sqlite has quite a lot of them.

but i was also talking about

en.m.wikipedia.org/wiki/Extend and resource forks

@natecull @dredmorbius i will be the first to admit that SQL is bad, but it’s more than made up for by the other cool things SQLite can do

@zens @dredmorbius

Does it at least have a non-string query API these days?

@natecull @dredmorbius for instance… you can make a table column that’s just json, and there’s a library of functions for querying json using jsonpath
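
(A small sketch of that, using Python's standard `sqlite3` module. It assumes the SQLite build behind Python includes the JSON functions, which is common but not guaranteed. The `docs` table and the document contents are invented. The path syntax is SQLite's own `$.key[index]` form.)

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")

# The column holds JSON text; json.dumps does the serialising on the way in.
doc = {"title": "example", "tags": ["os", "database"], "meta": {"count": 3}}
conn.execute("INSERT INTO docs (body) VALUES (?)", (json.dumps(doc),))

# json_extract pulls individual values out by path, inside the query.
count, first_tag = conn.execute(
    """
    SELECT json_extract(body, '$.meta.count'),
           json_extract(body, '$.tags[0]')
    FROM docs
    """
).fetchone()
```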

@zens @dredmorbius

Yeah, but that's JSON *literally stored as a UTF-8 string* isn't it?

Please tell me it isn't, because that would be just silly to the point of being an absurdist's nightmare.

@natecull @dredmorbius it is, and the justification for it is there’s no real performance or functionality benefit from doing something else

@zens @dredmorbius

Literally *the Javascript VM* tells me the opposite.

Constructing and searching strings every time I want to query a JSON property! No! That's silly! That's not being a database!

@zens @dredmorbius

If I wanted to construct and parse JSON strings as a means of persisting JSON *I'd bloody well use JSON.stringify and write to a bloody text file*!!!!!

@natecull @dredmorbius unless you’re writing embedded C, in which case what JSON.stringify?
i have looked at all the json libraries for C and sqlite is seriously the best option

@zens @dredmorbius

I mean imagine if I have literally a terabyte of JSON storage, you want me to construct a terabyte-sized JSON string each time I change one integer halfway in the middle of it?

No.

@natecull @dredmorbius if you’ve got a terabyte json string you got worse problems than what we’re talking about

@zens @dredmorbius

I think you're not understanding me.

When I say "JSON" I don't mean "a string", I mean "the JSON *data model*".

That is, a data model consisting of

* string key/value objects
* strings
* arrays
* numbers
* true/false/null

There is no need for this to have any contact with string serialization except for the part which is literally strings.
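
(To make the distinction concrete, here is a rough sketch that persists the data model as individual nodes rather than as one serialised string, so changing one number touches one row. It uses Python's standard `sqlite3` module purely as a convenient store; the `nodes` table, its columns, and the helper functions are invented for illustration. Booleans would need their own kind tag, since SQLite would store them as integers; that is left out here.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE nodes (
        id INTEGER PRIMARY KEY,
        parent INTEGER REFERENCES nodes(id),
        key,        -- object key (text) or array index (integer); NULL at the root
        kind TEXT,  -- 'object', 'array', or 'scalar'
        value       -- scalar payload; NULL for containers
    )
    """
)


def store(value, parent=None, key=None):
    """Store a nested dict/list/scalar as one row per node; return the node id."""
    if isinstance(value, dict):
        kind, scalar, children = "object", None, list(value.items())
    elif isinstance(value, list):
        kind, scalar, children = "array", None, list(enumerate(value))
    else:
        kind, scalar, children = "scalar", value, []
    cur = conn.execute(
        "INSERT INTO nodes (parent, key, kind, value) VALUES (?, ?, ?, ?)",
        (parent, key, kind, scalar),
    )
    node_id = cur.lastrowid
    for child_key, child in children:
        store(child, node_id, child_key)
    return node_id


def find(node_id, path):
    """Walk a list of keys/indexes down from a node; return the target node id."""
    for step in path:
        node_id = conn.execute(
            "SELECT id FROM nodes WHERE parent = ? AND key = ?", (node_id, step)
        ).fetchone()[0]
    return node_id


def load(node_id):
    """Rebuild the nested Python value rooted at a node."""
    kind, value = conn.execute(
        "SELECT kind, value FROM nodes WHERE id = ?", (node_id,)
    ).fetchone()
    if kind == "scalar":
        return value
    children = conn.execute(
        "SELECT id, key FROM nodes WHERE parent = ? ORDER BY id", (node_id,)
    ).fetchall()
    if kind == "object":
        return {child_key: load(child_id) for child_id, child_key in children}
    return [load(child_id) for child_id, _ in children]


root = store({"name": "disk", "sizes": [10, 20, 30]})

# Change one integer in the middle: a single-row update, no re-serialisation.
leaf = find(root, ["sizes", 1])
conn.execute("UPDATE nodes SET value = ? WHERE id = ?", (21, leaf))

result = load(root)
```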

And yes I could want to have my entire personal hard drive ( on the order of a terabyte) structured as this kind of storage.

@zens @dredmorbius

(though I'd probably want to extend it with at least one other type: "binary blob", so I can store "files" or chunks of files in it)

@zens @dredmorbius

In other words, I'm basically talking about the Pick operating system, but with JSON semantics.

en.wikipedia.org/wiki/Pick_ope

@natecull @dredmorbius the ways of doing this are
1. proprietary binary serialisation format. because- something’s gotta go on the disk!
2. one of the 3000 binary json “standards”
3. just store a json string and find a way to make that efficient. and well, it turns out there’s a bunch of really clever strategies for making that efficient

@zens @dredmorbius

"3. just store a json string and find a way to make that efficient"

No, because, I WANT A TERABYTE OF PERSISTENT JSON STORAGE.

I don't want that in a single JSON string thank you very much.

See again, as above:

THE PICK OS.

We've done this before. It was nice. It is possible to do, I promise you.

It is not done by serializing strings into a SQL database.

@zens @dredmorbius

This sort of gap is what NoSQL databases evolved to fill, but I think they've made things super-complicated. But if every Windows came with a single built-in MongoDB or similar that was fully supported like it was a filesystem, and you could just copy chunks of the object storage in and out like you can folders, I guess I wouldn't be too sad.

@natecull @dredmorbius i did see that the s3 model of object storage is starting to become a viral meme: hardware is available whose bus binary interface is object storage instead of block storage. i think designed for the data centre industry, but may get support in desktop operating systems

@natecull @dredmorbius
what i am saying is:

open and cross platform
no sql (not nosql tho)
fast and efficient

pick 2

@natecull @dredmorbius @zens What if SQLite was just the front end, and behind that SQLite managed the raw block devices?

Ceph does this. There is the ceph frontend and behind it are services which manage the raw devices.

This would be a lot easier in a microkernel where different services communicate with the FS service over IPC.

@jollyrogue @natecull @dredmorbius this is also a super good option in embedded situations where there literally is no filesystem and you have the choice between SQLite and FAT

@zens @natecull @dredmorbius Does it actually do this right now because I want to try this.

@natecull @dredmorbius @zens Nice! I’ll have to check that out. I wonder if I could point it to a ZFS volume or if it has to be a disk. 🤔

@jollyrogue @natecull @dredmorbius it’s more an experiment than a real serious feature: basically implementing the minimal set of fs functions that SQLite expects to exist, as a plug-in. see test_onefile.c

sqlite.org/vfs.html

@zens @natecull @dredmorbius This is awesome!

I was thinking about how to archive data across external devices of different sizes, and this would solve part of the problem. I was thinking about having a “.manifest” SQLite file storing parity, checksums, and file list on the device roots already, but if I can turn the devices into a giant db that would reduce the complexity on the drive side. Or create a crazy amount of complexity in the program logic. 🤷🏽‍♂️
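
(A minimal sketch of the “.manifest” idea, using Python's standard `sqlite3`, `hashlib`, and `os` modules. The `files` table layout and the `build_manifest` function are assumptions made for illustration. It records only the file list, sizes, and SHA-256 checksums; parity is left out.)

```python
import hashlib
import os
import sqlite3
import tempfile


def build_manifest(root, manifest_path):
    """Walk a directory tree and record path, size and SHA-256 for each file."""
    conn = sqlite3.connect(manifest_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(path TEXT PRIMARY KEY, size INTEGER, sha256 TEXT)"
    )
    manifest_abs = os.path.abspath(manifest_path)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            if os.path.abspath(full) == manifest_abs:
                continue  # do not checksum the manifest itself
            digest = hashlib.sha256()
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large files are not loaded whole.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            conn.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
                (os.path.relpath(full, root), os.path.getsize(full), digest.hexdigest()),
            )
    conn.commit()
    return conn


# Demo on a throwaway directory standing in for a device root.
device_root = tempfile.mkdtemp()
with open(os.path.join(device_root, "a.txt"), "wb") as fh:
    fh.write(b"hello")

manifest = build_manifest(device_root, os.path.join(device_root, ".manifest"))
rows = manifest.execute("SELECT path, size, sha256 FROM files").fetchall()
```

Verifying a device later would mean recomputing the digests and comparing them against the stored rows.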

@jollyrogue @dredmorbius @zens

"What if SQLite was just the front end, and behind that SQLite managed the raw block devices?"

If I were going to put any kind of database onto a bare metal blocksystem, I think I'd rather it be a native Lisp database.

I mean if I had any choice in the matter, and it wasn't forced on the world by edict that "thou must do SQLite because it's the New Cool Thing".

Not all the data in the world is going to be SQL-shaped. I'm pretty sure most of it in fact is not.

@jollyrogue @dredmorbius @zens

See, the use case I'm constantly thinking about here, is one which doesn't yet exist. But if it did exist, it wouldn't be an application, but rather a desktop-sized, user-driven platform. Key to the concept is that it has no a priori assumptions about the shape of the data, because, user-driven and extensible.

I think such a platform just isn't going to come with database schemas or things that resolve neatly to the SQL one-database-per-application model.

@natecull @dredmorbius @zens Probably not. It will probably be a hybrid.

The way I’ve conceptualized a data oriented DE, the VFS manages the indexes and metadata. Much like the work Tracker does today, except much more integrated. Data is imported, structured, and stored. When I say stored, I mean recording the inodes/blocks/URL of the file.

Everything we would see would be queries against the indexes and metadata.

There would be many db concepts, but it wouldn’t necessarily be a full db.

@natecull @dredmorbius @zens This is probably incredibly naive with lots of details missing, but it’s what I’ve come up with.

@natecull @dredmorbius @zens I’d rather it be Postgres, but Postgres doesn’t have an embedded option.

It would be nice if more LISPs had that option built in. PicoLISP is the only LISP I know of which has an embedded database in stdlib.

It’s not, which is why Postgres with JSONB is so nice. 🙂 Of course, that’s only data that’s being forced into a database. More than likely, metadata about the data is what is needed, and would fit into a database.

@jollyrogue @natecull @dredmorbius Postgres doesn’t have an embedded option but it can operate in “raw” mode.

@jollyrogue @natecull @dredmorbius correction: no it can’t, i got confused cos other db systems could and in my mind postgres always has the superset of all db features

@zens @natecull @dredmorbius No problem. I didn’t think Postgres could be embedded via a library, but was going to double check. 🙂
