@despens@campground.bonfire.cafe
iPRES 2023
SOFTWARE PRESERVATION AFTER THE INTERNET
Dragan Espenschied, Klaus Rechert
conference-paper
September 23, 2023
Software preservation must consider knowledge management as a key challenge. We suggest a conceptualization of software preservation approaches that are available at different stages of the software lifecycle and can support memory institutions to assess the current state of software items in their collection, the capabilities of their infrastructure, and completeness and applicability of knowledge that is required to successfully steward the collection.

For those who are interested, here is an explanation of why the campground had become so slow and how we fixed it (browsing feeds is now up to 95% faster!):

When you're browsing your feed on Bonfire, each page on the app displays 20 activities, but there are tens of thousands of activities in total. Fetching all the activities at once, including related data like their authors and threads, and checking permissions for each activity can be slow and inefficient.

To address this challenge, we implemented deferred joins and boundary checking. Instead of fetching all the data and performing boundary checks for every activity before selecting the ones for a specific page, we defer these operations until after an initial pagination has been applied.

Here's how it works: First, the app creates an optimized query that loads only the information needed for pagination. This includes filtering activities based on a specific feed, ensuring that only relevant activities are considered. This optimized query is designed to load less data from disk and thus is faster. It returns just the IDs of one or two pages, which could be up to 40 activities.

Next, the database takes the results of the optimized query and fetches the complete details of those activities. At this stage, it also computes the boundaries for each of those 40 activities, checking permissions to determine if the user has access to view them. The database filters out any activities that the user is not allowed to see. Finally, another round of pagination is applied to the filtered activities, ensuring that 1 to 20 activities are returned as the final result.

By deferring the retrieval of complete details and boundary computation until after pagination, the database minimizes the amount of data it needs to process. This significantly improves performance, even when the instance contains a large number of activities. It reduces the time and resources required to handle pagination, resulting in a much faster and more responsive user experience.

For those familiar with SQL, this looks something like (though the real query is much more complex):

SELECT * FROM activities
INNER JOIN (SELECT id FROM activities WHERE [...] LIMIT 40) AS page
  ON page.id = activities.id
LEFT OUTER JOIN [...]
WHERE EXISTS (SELECT [...] WHERE boundaries.id = activities.id)
LIMIT 20
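For illustration, the shape of that query can be exercised end to end with an in-memory SQLite database. Everything below is made up for the sketch: the table layouts, the boundaries table, and the feed and user IDs do not reflect Bonfire's actual schema.

```python
# A minimal sketch of deferred-join pagination with boundary checks.
# Schema and IDs are hypothetical, not Bonfire's real data model.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activities (id INTEGER PRIMARY KEY, feed_id INTEGER, body TEXT);
CREATE TABLE boundaries (activity_id INTEGER, user_id INTEGER);
""")

# Seed 1000 activities in feed 1; only even-numbered ones are
# visible to user 7, standing in for per-activity permissions.
for i in range(1, 1001):
    conn.execute("INSERT INTO activities VALUES (?, 1, ?)", (i, f"post {i}"))
    if i % 2 == 0:
        conn.execute("INSERT INTO boundaries VALUES (?, 7)", (i,))

def fetch_page(user_id, feed_id, page_size=20):
    # Phase 1 (inner subquery): a cheap ID-only scan that over-fetches
    # two pages (40 IDs), so permission filtering rarely leaves the
    # final page short.
    # Phase 2 (outer query): join back on those few IDs to load full
    # rows, apply the boundary check, and paginate down to 20.
    return conn.execute(
        """
        SELECT a.* FROM activities a
        INNER JOIN (SELECT id FROM activities
                    WHERE feed_id = ?
                    ORDER BY id DESC
                    LIMIT ?) page ON page.id = a.id
        WHERE EXISTS (SELECT 1 FROM boundaries b
                      WHERE b.activity_id = a.id AND b.user_id = ?)
        ORDER BY a.id DESC
        LIMIT ?
        """,
        (feed_id, page_size * 2, user_id, page_size),
    ).fetchall()

page = fetch_page(user_id=7, feed_id=1)
print(len(page))  # 20 visible activities
```

The point of the inner subquery is that the database only touches the narrow ID column for the bulk of the feed; full rows and permission checks are computed for at most 40 candidates instead of every activity.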

@mayel I like that you write your own SQL queries. So many projects use ORMs, which in most cases create pretty bad queries.

Hey @ivan, @bonfire@indieweb.social I love the latest Bonfire release. Is there a migration pathway to convert existing Mastodon instances? It might become a requirement for communities that want to take advantage of Bonfire features without having to start over from scratch.

@ivan @dajb That's an interesting feature, but it is indeed confusing when the user interface looks like something could be changed. I ran into this misunderstanding with "pages". For users who cannot change settings, a more tabular display would be better, perhaps like how MediaWiki lists installed extensions?

@despens The threading works ok, but the [-] and [+] widgets to fold and unfold thread branches are not that great, because they're not used to add or remove something, just to show and hide nested items. I'd suggest using the classic caret instead, as is common for directory structures.