You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Running a query is currently always done at a Server level. This makes sense for typical single-tenant deployments, but for server like atomicdata.dev, this can lead to suboptimal performance. Say you query all Folders. If a high percentage of these are private, the server will have to iterate over many items and check their rights. This is slow.
I think we should also index information about either the hierarchy or the drive. Let's explore some options.
Store the drive in index rows
Most (but not all) resources are scoped to a Drive. We could add a drive option to every QueryFilter. Most Atoms we index will have a drive. We'll need to derive the Drive either from the URL or by iterating over the parents. In every query filter, we store the drive, which causes it to be part of the index.
It would be nice if we could query the store using every single parent from a Resource. Let's say we have this nested structure:
a > b > c > d
We may want to find all Article resources inside b, which also includes resources that have their parent set to c. If we only find the direct children we'll not find d. We could find d like so:
Find all children using the existing indexes where parent: b. This will find c. Now, we also need to find all resources where parent: c. This will return d. Since this is a recursive process, we'll also try finding the children of d, which do not exist in this example. This entire process now took 3 lookups. Two with results, and one for the leafs. I think this should be pretty fast on its own, but it does more than n lookups, where n is the amount of leaves / results. If we need to do this for every single changed atom that is updated, things will be slow!
Now let's do an update. Let's say we're adding article d, and the Query looking for articles in b should be updated. How do we achieve this?
When we make the relevant change (e.g. change a parent from x to c), we create a new Atom. This Atom is turned into an IndexAtom. This new IndexAtom now has an array of parents. To do this, we need to know all the parents for this atom, the entire hierarchy. We find the direct parent, the parent above that, etc. With this list of parents, we can try all the QueryFilters, and check if the parents match.
Add hierarchy index
This is a new sled::Tree that stores hierarchical data. In here, each key is a Parent + (sub)Child. There is no value. Each row simply represents that that Child subject has the Parent as (in)direct parent.
Whenever we do add_atom_to_index (or remove), we check if the property is parent. If that's the case, we find all parents and then update these indexes to add the children to the list.
For every resource, we might have various index items. A child with 5 parents will have 5 index items. That's not necessarily a problem, as the items itself will be small.
I think we'll want two new methods on Store: get_parents and get_children. These should return an iterator, as the amount could be big.
The text was updated successfully, but these errors were encountered:
Running a query is currently always done at a Server level. This makes sense for typical single-tenant deployments, but for server like atomicdata.dev, this can lead to suboptimal performance. Say you query all Folders. If a high percentage of these are private, the server will have to iterate over many items and check their rights. This is slow.
I think we should also index information about either the hierarchy or the drive. Let's explore some options.
Store the drive in index rows
Most (but not all) resources are scoped to a Drive. We could add a
drive
option to everyQueryFilter
. MostAtom
s we index will have adrive
. We'll need to derive the Drive either from the URL or by iterating over the parents. In every query filter, we store the drive, which causes it to be part of the index.Similar to #288
Index parent
It would be nice if we could query the store using every single parent from a Resource. Let's say we have this nested structure:
a > b > c > d
We may want to find all
Article
resources insideb
, which also includes resources that have their parent set toc
. If we only find the direct children we'll not findd
. We could findd
like so:Find all children using the existing indexes where
parent: b
. This will findc
. Now, we also need to find all resources whereparent: c
. This will returnd
. Since this is a recursive process, we'll also try finding the children ofd
, which do not exist in this example. This entire process now took 3 lookups. Two with results, and one for the leafs. I think this should be pretty fast on its own, but it does more thann
lookups, wheren
is the amount of leaves / results. If we need to do this for every single changed atom that is updated, things will be slow!Now let's do an update. Let's say we're adding article
d
, and the Query looking for articles inb
should be updated. How do we achieve this?When we make the relevant change (e.g. change a parent from
x
toc
), we create a newAtom
. ThisAtom
is turned into anIndexAtom
. This newIndexAtom
now has an array ofparents
. To do this, we need to know all the parents for this atom, the entire hierarchy. We find the direct parent, the parent above that, etc. With this list of parents, we can try all theQueryFilters
, and check if the parents match.Add hierarchy index
This is a new
sled::Tree
that stores hierarchical data. In here, each key is aParent + (sub)Child
. There is no value. Each row simply represents that that Child subject has the Parent as (in)direct parent.Whenever we do
add_atom_to_index
(or remove), we check if theproperty
isparent
. If that's the case, we find all parents and then update these indexes to add the children to the list.For every resource, we might have various index items. A child with 5 parents will have 5 index items. That's not necessarily a problem, as the items itself will be small.
I think we'll want two new methods on
Store
:get_parents
andget_children
. These should return an iterator, as the amount could be big.The text was updated successfully, but these errors were encountered: