
> Since both are instant on every codebase I care about, not so sure about that.

It is very possible that there is no performance difference between ripgrep and ag for your use cases, but that does not mean there isn't a performance difference between ripgrep and ag. For example, in my checkout of the chromium repository the difference is easily measurable.

> But for features: there's no lookaround in the regex.

That's not true and hasn't been true for a long time. ripgrep supports PCRE2 with the `-P/--pcre2` flag. You can even put `--engine auto` in an alias or ripgreprc file and have ripgrep automatically select the regex engine based on whether you're using "fancy" features or not.

In general, I also claim that ripgrep has far fewer bugs than ag. ag doesn't really get gitignore support correct, although if you only have simple gitignores, its support might be good enough. It's not weird, because ripgrep has had more features than ag for a long time.
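For anyone who hasn't tried it, the setup looks roughly like this (the config path and the search pattern are just examples):

    # Put the flag in a ripgreprc and point ripgrep at it.
    echo '--engine=auto' >> ~/.ripgreprc
    export RIPGREP_CONFIG_PATH="$HOME/.ripgreprc"

    # This pattern needs lookahead, so ripgrep quietly switches to PCRE2;
    # plain patterns keep using the default, faster engine.
    rg 'foo(?=bar)' src/

    # Or ask for PCRE2 explicitly for a one-off search.
    rg -P 'foo(?=bar)' src/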
These tools are only I/O bound when searching data that isn't in cache. It's often the case that you're searching, for example, a repository of code repeatedly, so the data is already in memory.
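A rough way to see the difference on Linux (the repository path and pattern are placeholders):

    # Cold cache: flush the page cache first, so the search has to hit disk.
    sync && echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
    time rg 'some_pattern' ~/src/some-big-repo

    # Warm cache: run the exact same search again; the files are now in
    # memory, so the run time is dominated by the CPU, not I/O.
    time rg 'some_pattern' ~/src/some-big-repo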
Seems convenient to allow optimization for high-speed sequential reads and random reads/writes at different parts of the life cycle, along with indexing, CRCs, signatures, etc. One big issue with zip storing the index at the end, of course, is that a truncated file has basically lost most of its context and is generally unrecoverable even in part, which this could also help with from a durability perspective. Storing it at the beginning (without an end pointer) opens up the possibility that you have a valid-looking archive that is truncated and missing a lot of data, and you won't know until you look past the end (or validate total bytes or whatever, which doesn't work well when streaming). Storing the index at the beginning, a pointer and file signature at the end, and all the other format extensions does solve for all of this.
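You can see the index-at-the-end behaviour with stock tools (file names are just examples):

    # Build a tiny archive, then chop off its tail, which is where zip keeps
    # the central directory and the end-of-central-directory record.
    printf 'hello\n' > file1.txt
    printf 'world\n' > file2.txt
    zip example.zip file1.txt file2.txt
    cp example.zip truncated.zip
    truncate -s -100 truncated.zip

    # Listing the intact archive works; the truncated copy is rejected
    # because the index at the end is gone, even though most of the actual
    # file data is still sitting at the front of the file.
    unzip -l example.zip
    unzip -l truncated.zip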
Offline high-speed data ingestion of multi-thousand-file, multi-hundred-GB data sets, followed by rapid transfer to permanent online storage (and replication fan-out, etc.).
I've recently been looking into this same issue because I analyse a lot of data like sosreports or other tar/compressed data from customer systems. Currently I untar these onto my zfs filesystem, which works out OK because it has zstd compression enabled, but I end up decompressing and recompressing, which is quite expensive as often the files are GBs or more compressed.

But I've started using a tool called "ratarmount", which creates an index once (something I could automate our upload system to generate in advance, but you can also just build it locally) and then lets you FUSE-mount the file. This works pretty great, with the only exception that I can't create scratch files inside the directory layout, which in the past I'd wanted to do.

I was surprised how hard a problem it is to get a bundle file format that is indexable and compressed with a good and fast compression algorithm, which mostly boils down to zstd at this point. While it works quite well, especially with gzip and bzip2, sadly zstd and xz (and some other compression formats) don't allow for decompressing only parts of a file by default; even though it's possible, the default tools aren't doing it. The nitty-gritty details are summarised here:

The other main option I found was squashfs, which recently grew zstd support, and there is some preliminary zip file support for zstd, but there are multiple standards, which is not helpful!
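In case it's useful, the basic ratarmount workflow looks something like this; the archive name and mount point are placeholders, and the exact flags may differ between versions:

    pip install ratarmount

    # The first mount builds a sidecar index (a SQLite file); later mounts
    # reuse it, so they start almost instantly.
    mkdir -p mnt
    ratarmount sosreport-host.tar.xz mnt/

    # Browse or grep the contents without extracting anything.
    ls mnt/
    rg 'some error string' mnt/

    # Unmount when done.
    fusermount -u mnt/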

It is important to distinguish between tar the format and tar the utility. Tar the utility is a program that can produce tar files, but it is also able to then compress that file. When you produce a compressed tar file, the contents are written into a tar file, and this tar file as a whole is compressed. Sometimes the compressed files are named in full, like .tar.gz or .tar.xz.

In a way you could consider those files a format in their own right, but at the same time they really are simply plain tar files with compression applied on top. You can confirm this by decompressing the file without extracting the inner tar file, using gunzip, bunzip2 or unxz respectively. This will give you a plain tar file as a result, which is a file and a format of its own.

You can also see the description of compressed tar files with the "file" command: it will say, for example, "xz compressed data" for a .tar.xz file (assuming of course that the actual data in the file is really this kind of data). And a plain uncompressed tar file described by "file" will say something like "POSIX tar archive".
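For example (file names are placeholders, and the exact wording of file's output varies by version):

    # "file" reports the outer compression layer of a compressed tar.
    file backup.tar.xz       # e.g. "XZ compressed data"

    # Strip the compression without touching the inner archive...
    unxz backup.tar.xz       # leaves backup.tar behind

    # ...and now "file" sees the tar format itself.
    file backup.tar          # e.g. "POSIX tar archive"

    # gunzip does the same for .tar.gz, and bunzip2 for .tar.bz2.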
