Archive Statistics (June 2013)

We last generated statistics for the archive last November. Quite some time has passed since then, and we have made a number of changes to the database as well.

We have added a few columns and started storing integers that record the post deletion time on the archive boards. On its own, this would have added a few bytes to every post stored in the tables. However, we recently converted all of our tables from InnoDB to TokuDB, so every table is now compressed with TokuDB’s standard compression. The result is a smaller database that fits nicely on our SSD RAID-1.

We will be providing our old statistics for reference and a table containing our new statistics below.
Automatic Archivers

There are two open-source programs that allow users to archive the entirety of 4chan automatically. This means that entire boards on 4chan can be archived locally without any manual interaction beyond starting the software. Simply put, everything can and will be archived. This is very different from other available services and software, which require users to specify which threads they want archived or downloaded.

Anyway, we will be comparing the two programs against each other to give you a broader idea of what each one brings to the table.
The Return of /vg/

It is now official. We will be hosting the /vg/ archive at Foolz Archive again.

What does this actually mean? Nothing much. You will just need to update your 4chan extensions accordingly to redirect back to Foolz Archive. If you are using 4chan X v3 by Mayhem, all you need to do is press the “update now” button in the “Archives” tab and you should be good to go. If you are using something else, you will need to notify the developers about the change.
Major Hardware Upgrade

There were some obstacles that we needed to overcome because of the size of our databases. Since we have stored a large amount of data over the past few years, our database has grown to about 120G. We inherited only about 55G of that data from EasyModo and InstallGentoo; the other 65G or so we archived ourselves over the past two years alone. There is a reason for this sudden increase: we started to archive both /v/ and /vg/, which are very fast-moving boards on 4chan. For example, after one year of archiving each board, the database sizes of /v/ and /vg/ were 26G and 16G, respectively. We didn’t have any issue storing the data with our configuration, but to access all of that data with reasonable speeds on the current hardware? That is a challenge.
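
As a rough sanity check on the figures above, here is a small Python sketch that adds them up. The variable names and the `share` helper are ours, purely for illustration, and every number is the approximate value quoted in the post rather than a fresh measurement.

```python
# Approximate figures quoted in the post, in gigabytes.
total_gb = 120          # overall database size
inherited_gb = 55       # data inherited from EasyModo and InstallGentoo
self_archived_gb = 65   # data archived over the past two years
first_year_gb = {"/v/": 26, "/vg/": 16}  # size after one year of archiving each board


def share(part, whole):
    """Return `part` as a percentage of `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Combined first-year footprint of the two fast-moving boards.
fast_boards_gb = sum(first_year_gb.values())

print(f"inherited + self-archived: {inherited_gb + self_archived_gb}G")
print(f"inherited share of total:  {share(inherited_gb, total_gb)}%")
print(f"/v/ + /vg/ after one year: {fast_boards_gb}G")
```

The two components do add up to the stated total, and the first-year sizes of /v/ and /vg/ together come to 42G, which gives a sense of how much two fast-moving boards contribute.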
