| — |
blog:pushbx:2025:1109_repacked_ecm_files_in_download:old [2025-11-09 12:26:44 +0100 Nov Sun] (current) ecm created |
||
|---|---|---|---|
| Line 1: | Line 1: | ||
| + | ====== Repacked ecm files in download/ | ||
| + | |||
| + | Today I finished repacking all .zip files as yet | ||
| + | found within the https:// | ||
| + | and uploading the lzipped tarball instead. | ||
| + | This reduces disk space usage from 22 GB to about 2.3 GB. | ||
| + | |||
| + | ===== Downloading the ecm/ | ||
| + | |||
| + | '' | ||
| + | |||
| + | This ran from 17:26 until 18:06 on Friday. It downloaded 2338 files, totalling 23 GB. | ||
| + | |||
| + | ===== Unpacking the zipballs ===== | ||
| + | |||
| + | There are 2056 files in the old subdirectory, | ||
| + | |||
| + | '' | ||
| + | |||
| + | ===== Hardlinking old/msdos4/ files to old/ldos/ ===== | ||
| + | |||
| + | '' | ||
| + | |||
| + | ===== Hardlinking identical files ===== | ||
| + | |||
| + | '' | ||
| + | |||
| + | I started this command at 20:38 on Friday and left it running; it had finished by 09:57 on Saturday at the latest. | ||
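The idea behind this step can be sketched in Python. This is not the command used above: it is a minimal illustration that assumes duplicates are detected by SHA-256 digest, and the function names `file_digest` and `hardlink_identical` are made up for this example.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hardlink_identical(root):
    """Replace duplicate regular files under root with hard links.

    Returns the number of files that were replaced by a link.
    """
    first_seen = {}  # digest -> path of the first file with that content
    replaced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            digest = file_digest(path)
            original = first_seen.setdefault(digest, path)
            if original == path or os.path.samefile(original, path):
                continue  # first occurrence, or already linked
            # Link under a temporary name, then rename over the duplicate,
            # so the duplicate is never missing if the link call fails.
            temporary = path + ".hardlink-tmp"
            os.link(original, temporary)
            os.replace(temporary, path)
            replaced += 1
    return replaced
```

Hashing every file is what makes a run over millions of files slow; a dedicated tool would typically compare file sizes first and only read files whose sizes match.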
| + | |||
| + | ===== Creating the sorted list of files ===== | ||
| + | |||
| + | This was done using a scriptlet that sorts the 1.8 million files | ||
| + | first by their basename (the part after the last slash) | ||
| + | and second by their directory path (the part before the last slash). | ||
| + | I attempted using perl's spaceship operator ''< | ||
| + | at first, but it compares its operands | ||
| + | as numbers, so the list was not sorted as desired. | ||
| + | The correct operator for text operands is '' | ||
| + | |||
| + | '' | ||
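The sort order described above can be illustrated with a short Python sketch. It is not the Perl scriptlet itself, and the example paths are hypothetical. Python compares strings as text by default, so the numeric-versus-string pitfall does not arise here.

```python
import posixpath


def sort_key(path):
    """Sort by basename first, then by the directory part."""
    directory, basename = posixpath.split(path)
    return (basename, directory)


# Hypothetical paths, only to show the resulting order.
paths = [
    "./old/ldos/20250101/readme.txt",
    "./old/ldos/20240101/ldos.com",
    "./old/ldos/20240101/readme.txt",
    "./old/ldos/20250101/ldos.com",
]
ordered = sorted(paths, key=sort_key)
```

Sorting this way puts all builds of the same file name next to each other in the list, which presumably helps the compressor find the similarities between them.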
| + | |||
| + | ==== Converting the sorted list to NUL separators ==== | ||
| + | |||
| + | '' | ||
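As a rough Python equivalent of this conversion (not the command used above), the following sketch replaces each newline with a NUL byte. It assumes that no file name itself contains a newline; the helper names and example names are made up for illustration.

```python
def newlines_to_nul(data: bytes) -> bytes:
    """Turn a newline-terminated list of names into a NUL-terminated one."""
    return data.replace(b"\n", b"\0")


def parse_nul_list(data: bytes) -> list:
    """Split a NUL-terminated list back into individual names."""
    return [name for name in data.split(b"\0") if name]


# Hypothetical file names; the second one contains a space.
listing = b"./old/a.txt\n./old/b c.txt\n"
converted = newlines_to_nul(listing)
```

NUL is the one byte that cannot occur in a path name, which is why tools that read file lists (such as GNU tar with its `--null` option) offer it as a separator.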
| + | |||
| + | ===== Creating the tarball ===== | ||
| + | |||
| + | '' | ||
| + | |||
| + | Terminal output: | ||
| + | |||
| + | '' | ||
| + | |||
| + | This ran from 11:36 on Saturday until 12:42. The resulting file is 89 GB in size. | ||
| + | |||
| + | ==== Creating the tarball without --hard-dereference switch ==== | ||
| + | |||
| + | '' | ||
| + | |||
| + | This ran from 10:34 on Saturday until 11:32. The resulting file is 36 GB in size. | ||
| + | |||
| + | This is more efficient to depack in full as tar will recreate the hardlinks | ||
| + | between files with identical contents. | ||
| + | (Some files have as many as 9000 hardlinks pointing to the same file.) | ||
| + | However, it is more difficult to use this for depacking individual files. | ||
| + | Quoth the GNU tar manual: | ||
| + | |||
| + | < | ||
| + | |||
| + | < | ||
| + | tar: ./one: Cannot hard link to ' | ||
| + | tar: Error exit delayed from previous errors</ | ||
| + | |||
| + | The reason for this behavior is that tar cannot seek back in the archive to the previous member (in this case, ‘one’), to extract it. If you wish to avoid such problems at the cost of a bigger archive, use the following option: | ||
| + | |||
| + | < | ||
| + | |||
| + | |||
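How a tar archive records hard links can be seen with Python's `tarfile` module, used here as a stand-in for GNU tar. This is a sketch under that assumption: the file names `one` and `two` follow the manual's example, and the helper functions are invented for illustration.

```python
import io
import os
import tarfile
import tempfile


def archive_linked_pair(workdir):
    """Create two hard-linked files and return the bytes of a tar of both."""
    one = os.path.join(workdir, "one")
    two = os.path.join(workdir, "two")
    with open(one, "wb") as handle:
        handle.write(b"shared contents\n")
    os.link(one, two)  # "two" is a second name for the same inode
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        archive.add(one, arcname="one")
        archive.add(two, arcname="two")
    return buffer.getvalue()


def describe_members(tar_bytes):
    """Return (name, is_hardlink, linkname, size) for each member."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as archive:
        return [(m.name, m.islnk(), m.linkname, m.size) for m in archive]


with tempfile.TemporaryDirectory() as workdir:
    members = describe_members(archive_linked_pair(workdir))
```

The second member is stored as a link entry with size 0 that only names `one`, so it carries no data of its own. That is why extracting it alone fails unless the first member is already on disk.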
| + | ===== Compressing the repacked tarball ===== | ||
| + | |||
| + | This uses pv (pipe viewer) for the progress display and lzip to compress. | ||
| + | |||
| + | '' | ||
| + | |||
| + | Final pv progress line: | ||
| + | |||
| + | '' | ||
| + | |||
| + | The file is 2.1 GB in size. | ||
| + | |||
| + | ==== Compressing the repacked tarball without --hard-dereference switch ==== | ||
| + | |||
| + | '' | ||
| + | |||
| + | Final pv progress line: | ||
| + | |||
| + | '' | ||
| + | |||
| + | The file is 2.0 GB in size. | ||
| + | |||
| + | Because of the difficulty of extracting individual files, | ||
| + | and because the compressed file size savings come to only about 5%, | ||
| + | we did not upload this file to the server. | ||
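The size comparison can be checked with a few lines of Python. The inputs are the rounded sizes quoted in this post, so the results are only approximate; `relative_saving` is a name made up for this sketch.

```python
def relative_saving(larger, smaller):
    """Fraction by which `smaller` undercuts `larger`."""
    return (larger - smaller) / larger


# Sizes in GB as quoted above (rounded values).
compressed = relative_saving(2.1, 2.0)   # lzipped tarballs
uncompressed = relative_saving(89, 36)   # plain tarballs
```

Keeping the hardlinks saves more than half of the uncompressed tarball size, but just under 5% after compression with lzip.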
| + | |||
| + | |||
| + | ===== Depacking the lzipped tarball ===== | ||
| + | |||
| + | '' | ||
| + | |||
| + | Final pv progress line: | ||
| + | |||
| + | '' | ||
| + | |||
| + | ===== Uploading the repacked file ===== | ||
| + | |||
| + | '' | ||
| + | |||
| + | ===== Deleting the old files ===== | ||
| + | |||
| + | The following command was used to delete old zipballs from the old subdirectories. | ||
| + | It only processes files that have fewer than 2 (i.e., exactly 1) hardlinks | ||
| + | pointing to them. | ||
| + | Therefore, each most recent build (with a hardlink from the | ||
| + | https:// | ||
| + | |||
| + | '' | ||
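The link count test can be sketched in Python as follows. This is not the command used above: the `.zip` suffix filter, the `dry_run` default and the function names are assumptions made for this illustration.

```python
import os


def singly_linked_files(root, suffix=".zip"):
    """Yield files under root with the suffix and exactly one hard link."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not name.endswith(suffix) or os.path.islink(path):
                continue
            if os.stat(path).st_nlink < 2:
                yield path


def delete_singly_linked(root, suffix=".zip", dry_run=True):
    """Delete (or, with dry_run, only list) the files found above."""
    victims = sorted(singly_linked_files(root, suffix))
    if not dry_run:
        for path in victims:
            os.remove(path)
    return victims
```

Running with `dry_run=True` first lists what would be removed, which is a cheap safeguard before deleting anything from a server.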
| + | |||
| + | |||
| + | {{tag> | ||
| + | |||
| + | |||
| + | ~~DISCUSSION~~ | ||