Specify existing practice for Container data about contents #343

Closed
timbl opened this issue Nov 7, 2021 · 12 comments · Fixed by #352

Comments

@timbl
Contributor

timbl commented Nov 7, 2021

The data about each contained resource which is given by the containing Container has been stable in NSS for at least 6 years, so it is appropriate to take this well-established practice as the current Solid spec, and as the default for any future versions.

This issue requires that the current situation be documented in the Solid Protocol before any additions or subtractions are considered.

@jeff-zucker
Member

It seems to me that a blob of a .png stored in a database still has the media-type "image/png". And that if the user wants to download it, its size is knowable.

@timbl
Contributor Author

timbl commented Nov 7, 2021

The data about contained resource ?x currently seems to be:

| Subject | Predicate | Object | Object type |
|---|---|---|---|
| ?x | rdf:type | A class whose URI is formed by concatenating 'http://www.w3.org/ns/iana/media-types/', the Internet Content Type of the resource, and '#Resource' | rdfs:Class |
| ?x | stat:size | An integer giving the size of the resource in bytes | xsd:integer |
| ?x | dct:modified | The time when the resource was last modified | xsd:dateTime |
| ?x | stat:mtime | The unix time when the resource was last modified | xsd:integer |

using the usual prefix conventions.
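
As a rough illustration of the table above, here is a minimal Python sketch that emits the four statements for one contained resource as N-Triples lines. The function name `containment_triples` and the example URL are hypothetical, and the namespace URIs for `stat:` and `dct:` are assumed to be the usual ones (`http://www.w3.org/ns/posix/stat#` and `http://purl.org/dc/terms/`); this is not taken from the NSS code.

```python
from datetime import datetime, timezone

# Namespace URIs assumed from "the usual prefix conventions".
MEDIA_TYPES_NS = "http://www.w3.org/ns/iana/media-types/"
STAT_NS = "http://www.w3.org/ns/posix/stat#"
DCT_NS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_NS = "http://www.w3.org/2001/XMLSchema#"


def containment_triples(resource_url, content_type, size_bytes, mtime_unix):
    """Return N-Triples lines describing one contained resource (hypothetical helper)."""
    # dct:modified is the same instant as stat:mtime, rendered as xsd:dateTime in UTC.
    modified = datetime.fromtimestamp(mtime_unix, tz=timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )
    subject = f"<{resource_url}>"
    return [
        # Class URI = media-types namespace + content type + "#Resource".
        f"{subject} <{RDF_TYPE}> <{MEDIA_TYPES_NS}{content_type}#Resource> .",
        f'{subject} <{STAT_NS}size> "{size_bytes}"^^<{XSD_NS}integer> .',
        f'{subject} <{DCT_NS}modified> "{modified}"^^<{XSD_NS}dateTime> .',
        f'{subject} <{STAT_NS}mtime> "{mtime_unix}"^^<{XSD_NS}integer> .',
    ]


if __name__ == "__main__":
    for line in containment_triples(
        "https://pod.example/photos/cat.png", "image/png", 2048, 1636243200
    ):
        print(line)
```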

@timbl
Contributor Author

timbl commented Nov 7, 2021

@RubenVerborgh I'd not be happy just making this a SHOULD, as basically that makes it something which a client could never depend on. Maybe something like "MUST unless that information does not exist or is really expensive to find out".

The "wider solid ecosystem" will still need to support HEAD on each of those contained resources, and will have to respond to that HEAD with content-type and content-length and last-modified surely -- so even if there is a triple store back end, the PNG images must be strings in there.

If this is a SHOULD, then client code which browses Solid containers could be modified to do an explicit HEAD on each resource where it is missing that information. (That would be a bit of a pain in rdflib.js, as the cache currently knows about GET but not HEAD, so we would have to introduce a new cache state, "HEAD done but not GET".)
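
The client-side fallback described above could look roughly like the following Python sketch. It is only an illustration under assumptions: `fill_missing_metadata`, `http_head`, and the dictionary keys `content_type`, `size`, and `modified` are hypothetical names, not rdflib.js API, and the HEAD call is passed in as a function so the logic can be exercised without a network.

```python
import urllib.request

# Which HTTP response header supplies each piece of container metadata.
HEADER_FOR = {
    "content_type": "content-type",
    "size": "content-length",
    "modified": "last-modified",
}


def http_head(url):
    """Issue a real HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())


def fill_missing_metadata(listing, head=http_head):
    """Complete per-resource metadata, doing a HEAD only where something is missing.

    listing maps each contained resource URL to the metadata the container
    already supplied (possibly an empty dict).
    """
    completed = {}
    for url, known in listing.items():
        meta = dict(known)
        if any(key not in meta for key in HEADER_FOR):
            # Header names are case-insensitive, so normalise before lookup.
            headers = {name.lower(): value for name, value in head(url).items()}
            for key, header in HEADER_FOR.items():
                if key not in meta and header in headers:
                    meta[key] = headers[header]
            if isinstance(meta.get("size"), str):
                meta["size"] = int(meta["size"])
        completed[url] = meta
    return completed
```

The cost is one extra round trip per contained resource that lacks metadata, which is the overhead the comment is worried about.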

I am sympathetic to a triple store worried about the size... would number of triples be an alternative for those? Or it could just fake a byte size by multiplying the number of triples by 100 (at least the resources would order by size ok).
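
A tiny sketch of the "fake byte size" idea, assuming the factor of 100 bytes per triple floated above; the names `estimated_size` and `ASSUMED_BYTES_PER_TRIPLE` and the sample counts are made up for illustration.

```python
# Rough factor suggested in the comment; not a measured value.
ASSUMED_BYTES_PER_TRIPLE = 100


def estimated_size(triple_count):
    """Approximate a byte size for a resource that only exists as triples."""
    return triple_count * ASSUMED_BYTES_PER_TRIPLE


if __name__ == "__main__":
    # Hypothetical triple counts per resource.
    triple_counts = {"profile": 40, "contacts": 1200, "notes": 7}
    # Multiplying by a positive constant keeps the relative order intact.
    by_size = sorted(triple_counts, key=lambda name: estimated_size(triple_counts[name]))
    print(by_size)
```

Because the factor is a positive constant, sorting by the estimate gives the same order as sorting by triple count, which is all the comment asks for.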

@bourgeoa
Member

bourgeoa commented Nov 7, 2021

Could we not just retain the original content type for RDF resources? Is there any harm?
If the data has never been re-serialised (modified), the user can expect, at least on certain servers, everything to have been kept, including comments where allowed.

@justinwb
Member

justinwb commented Nov 8, 2021

Noting that the Solid Editors organized a session last Friday (11/5) pertinent to this topic (see minutes). MIME type and last-modified didn't receive pushback in-session.

@timbl
Contributor Author

timbl commented Nov 8, 2021

"That is the issue indeed; size and content type are just not meaningful in those cases."

Then what happens when I do a HEAD to it?

@timbl
Contributor Author

timbl commented Nov 8, 2021

So in principle you could do conneg in each case, using the HTTP request we have.

Of course, in the case where you have a file-backed or string-store-backed system, there is an actual resource representation. So the server should return that metadata. Shall we leave it that if the server has a store where there is "no such thing" as internet content type, or byte length, we allow the server to leave that data out? But otherwise it's a MUST?

In those cases, the client will have to iterate over the contained resources doing HEAD requests, so that would be a bunch of work for rdflib, but we could do it.

Of course, in the case of the server being able to supply many different translated forms with conneg, we could also suggest that the container return a complete documentation of all the options with a graph of the alternatives... probably not for this release.

@timbl
Contributor Author

timbl commented Nov 8, 2021

If we can get this through I'd be happy to make it a MUST as a compromise.

@justinwb
Member

justinwb commented Nov 8, 2021

…specifically for non-RDF resources though:

Yes that's exactly right - should've been more clear about that. The focus on Friday was to start with the least controversial items.

Shall we leave it that if the server has a store where there is "no such thing" as internet content type, or byte length, we allow the server to leave that data out? But otherwise it's a MUST?

This seems to be another use case where it would be useful for a client to be able to understand what a server does and doesn't support in a programmatic way. Otherwise it is hard to know if a client (in the scenario above) is dealing with a server that is non-conformant, vs. one who cannot produce a valid size for a legitimate reason (for example).

@csarven
Member

csarven commented Nov 8, 2021

If ldp:RDFSource is to be interpreted as Ted explains, I strongly suggest that the Solid Protocol does not adopt it. Instead, do
#342 (comment) :

If we need a class to distinguish RDF stuff from non-RDF stuff, we can use solid:RDFDocument or solid:RDFSource (based off rdf11-concepts)

To date, Solid servers/clients have neither adopted ldp:RDFSource nor interpreted it along the lines mentioned. We only really cared about it being serialized as a concrete RDF syntax as per RDF 1.1.

@TallTed

This comment has been minimized.

@kjetilk
Member

kjetilk commented Nov 9, 2021

My main issue with the current behavior is that if you have predicates such as dct:modified and st:mtime, then modifying a resource updates its container (as designed), but since that container has now also changed, its parent container also has to be updated, and that container's parent in turn, all the way up to the root. NSS does not do that now, but that means these metadata do not reflect the actual changes in the container representation, which I think is bad. You could not, for example, use them for conditional requests.
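
To make the cascade concrete, here is a small Python sketch that lists every container whose representation would have to change when one resource is modified. It assumes the Solid convention that container URLs end in a slash and that the hierarchy follows the URL path; `ancestor_containers` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def ancestor_containers(resource_url):
    """Return container URLs from the immediate parent up to the root."""
    parts = urlsplit(resource_url)
    # Dropping empty segments also handles a trailing slash on a container URL.
    segments = [segment for segment in parts.path.split("/") if segment]
    ancestors = []
    for depth in range(len(segments) - 1, -1, -1):
        container_path = "/" + "".join(segment + "/" for segment in segments[:depth])
        ancestors.append(
            urlunsplit((parts.scheme, parts.netloc, container_path, "", ""))
        )
    return ancestors


if __name__ == "__main__":
    # Each printed container would need a new modification time after the edit.
    for container in ancestor_containers("https://pod.example/a/b/c.ttl"):
        print(container)
```

The number of containers touched grows with the depth of the resource, which is the cost being objected to.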

My proposal has been to have a .stat auxiliary resource or something like that. That would solve this problem elegantly, and be analogous to POSIX, which does not have this problem since metadata isn't on the directory, but on an inode. Alternatively, we might not add the problematic metadata to descriptions of children that are containers themselves, but that is a departure from current NSS, AFAICS.
