2 minute read

undoing a wayword unzip

Say you’ve archived some results by zipping the folder they are in. This is a common part of my workflow when I’m getting ready to ship a set of results. While I’m working I pipe all my tables to a folder called /tabs/, which is sync’d to the overleaf document that holds the draft. Then before submitting I’ll zip the folder and rerun my code (I always do this twice),1 this way I don’t ship anything stale.

Now, if you’ve ever unzipped a file in a GUI you probably did it by double clicking on the zipped folder and waited happily while your OS unzipped the contents into a sensibly named folder. So, based on this habit, if you’re like me, once and awhile you accidentally:

unzip archive.zip

which obligingly sprays the contents of archive.zip all the files all over the current directory. This is probably never what you wanted.

Doing it right is easy enough, unzip just needs a destination (-d):

mkdir archive 
unzip archive.zip -d ./archive

Undoing this is a one line thing… but since I don’t make this mistake more than a couple of times per year I never remember exactly how to clean it up without having to manually delete all of the files (sometimes this can be a lot of files). The solution takes advantage of the fact that unzip knows all the names of the files in the .zip so you can use unzip -Z -1 archive.zip to get the names (-Z is actually right at the top of the man page and often aliased to zipinfo, and -1 restricts the output of unzip -Z to just the file names. This is great, because a list of filenames is just the sort of think that we can pipe via xargs to rm and have our nice tidy directory back. Since the output of unzip -Z -1 is one file name per line, we’ll use the -I option to execute command (rm) once for each line of what we are piping.

unzip -Z -1 archive.zip | xargs -I{} rm -v {}

The first set of {} is telling the pipe (|) where to put the list of files, and the second set is telling xargs where to put each filename. The -v option (--verbose) just tells you the names of the files as they are removed. And just like that you’re back where you started.

  1. My great, and likely only, contribution to the practice of applied statistics is that I reminded Andrew Gelman that he is indeed the one who said: “Computer programming is basically the last bastion of rigor in the modern world. I don’t trust anything until it runs. In fact, until it runs twice on the computer.”