Hello everyone,
I have a couple of ideas that may benefit ptrack, and I'd like to hear your comments before I start coding or open a PR. Or are these already in ptrack's development plans?
1. Compress the ptrack.map file.
ptrack.map.mmap is no longer used since ptrack switched from the mmap() system call to PostgreSQL shared memory, which already saves the time spent copying the ptrack.map file at PostgreSQL startup.
I have to set a larger ptrack.map_size when $PGDATA is large, e.g. 1024MB for a 1TB cluster or 40960MB for a 40TB cluster, in order to track the changed blocks and reduce hash collisions. With such a large map, a lot of I/O on the ptrack.map file is needed every time PostgreSQL restarts or performs a checkpoint, and it takes a long time. Furthermore, this may cause a switchover/failover of the PG cluster to hit timeouts.
How about compressing ptrack.map in shared memory before writing it to the physical file during a checkpoint, and decompressing it back into shared memory when the file is loaded at startup? A rough sketch follows.
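To make the idea concrete, here is a minimal sketch (not ptrack's actual code) of what the checkpoint write path and the startup read path could look like if the map were compressed. zlib, Z_BEST_SPEED, the file layout (a small original-size header followed by the compressed payload) and all helper names are my assumptions for illustration only:

```c
/*
 * Minimal sketch (not ptrack's actual code) of compressing the in-memory
 * map before writing it at checkpoint, and decompressing it at startup.
 * zlib, the file layout and all helper names are assumptions.
 */
#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

/* Checkpoint path: write the shared-memory map to disk in compressed form. */
static int
write_map_compressed(const char *path, const void *map, size_t map_size)
{
    uLongf  comp_len = compressBound(map_size);   /* worst-case compressed size */
    Bytef  *comp_buf = malloc(comp_len);
    FILE   *f;
    int     rc = -1;

    if (comp_buf == NULL)
        return -1;

    /*
     * Z_BEST_SPEED keeps checkpoint latency low; the ratio should still be
     * decent because an oversized map is mostly zero bytes.
     */
    if (compress2(comp_buf, &comp_len, map, map_size, Z_BEST_SPEED) == Z_OK)
    {
        f = fopen(path, "wb");
        if (f != NULL)
        {
            /* small header with the original size, then the compressed payload */
            if (fwrite(&map_size, sizeof(map_size), 1, f) == 1 &&
                fwrite(comp_buf, 1, comp_len, f) == comp_len)
                rc = 0;
            fclose(f);                 /* real code would also fsync() here */
        }
    }

    free(comp_buf);
    return rc;
}

/* Startup path: load the compressed file back into the shared-memory map. */
static int
read_map_compressed(const char *path, void *map, size_t map_size)
{
    FILE   *f;
    Bytef  *comp_buf = NULL;
    size_t  orig_size;
    long    comp_len;
    uLongf  out_len = map_size;
    int     rc = -1;

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* header holds the uncompressed size and must match the shmem map size */
    if (fread(&orig_size, sizeof(orig_size), 1, f) == 1 && orig_size == map_size)
    {
        /* the rest of the file is the compressed payload */
        fseek(f, 0, SEEK_END);
        comp_len = ftell(f) - (long) sizeof(orig_size);
        fseek(f, (long) sizeof(orig_size), SEEK_SET);

        comp_buf = malloc(comp_len);
        if (comp_buf != NULL &&
            fread(comp_buf, 1, comp_len, f) == (size_t) comp_len &&
            uncompress(map, &out_len, comp_buf, comp_len) == Z_OK &&
            out_len == map_size)
            rc = 0;
    }

    free(comp_buf);
    fclose(f);
    return rc;
}
```

LZ4 or zstd could be substituted for zlib if a better speed/ratio trade-off is wanted; since an oversized, mostly-empty map should compress very well, either choice ought to cut the checkpoint and restart I/O substantially.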
2. Use multiple threads in ptrack_get_pagemapset() to scan the files under $PGDATA concurrently.
When $PGDATA is large, a single process scanning the files sequentially looks slow. My idea is to start worker threads on the first call to ptrack_get_pagemapset(): the workers would scan and hash the data files and push the resulting tuples into a shared-memory queue, and each subsequent call to ptrack_get_pagemapset() would take one tuple from that queue under a proper mutex and condition variable. A sketch of the queue logic follows.
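To illustrate only the producer/consumer part of this idea, below is a self-contained pthread sketch; the struct names, fields and QUEUE_CAP are made up for the example. Note that PostgreSQL backends are generally not thread-safe, so a real implementation might need background workers (or threads that call nothing from the backend) rather than plain pthreads inside ptrack_get_pagemapset():

```c
/*
 * Illustrative producer/consumer sketch (not PostgreSQL code): worker
 * threads scan data files and push results into a bounded queue; the
 * set-returning function pops one result per call until the workers
 * are done and the queue is drained.
 */
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define QUEUE_CAP 64            /* bounded queue size, arbitrary for the sketch */

typedef struct ScanResult
{
    char    path[1024];         /* relation file that was scanned */
    long    changed_blocks;     /* whatever the scan/hash step produced */
} ScanResult;

typedef struct ResultQueue
{
    ScanResult      items[QUEUE_CAP];
    int             head, tail, count;
    int             producers_running;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    pthread_cond_t  not_full;
} ResultQueue;

/* Call once before launching nworkers scan threads. */
static void
queue_init(ResultQueue *q, int nworkers)
{
    memset(q, 0, sizeof(*q));
    q->producers_running = nworkers;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Worker side: block while the queue is full, then append one result. */
static void
queue_push(ResultQueue *q, const ScanResult *r)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[q->tail] = *r;
    q->tail = (q->tail + 1) % QUEUE_CAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side: returns false once all producers finished and the queue is empty. */
static bool
queue_pop(ResultQueue *q, ScanResult *out)
{
    bool    got = false;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && q->producers_running > 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    if (q->count > 0)
    {
        *out = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        got = true;
        pthread_cond_signal(&q->not_full);
    }
    pthread_mutex_unlock(&q->lock);
    return got;
}

/* Each worker calls this when it has scanned its last file. */
static void
producer_done(ResultQueue *q)
{
    pthread_mutex_lock(&q->lock);
    q->producers_running--;
    pthread_cond_broadcast(&q->not_empty);   /* wake consumers waiting on an empty queue */
    pthread_mutex_unlock(&q->lock);
}
```

In the proposal, the scan/hash workers would be the producers and each call of ptrack_get_pagemapset() would be the consumer, building its output tuple from the ScanResult it pops.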
Those are my thoughts; I'm looking forward to your comments.
Thanks,
vegebird