Twenty-five years after it joined the Windows toolkit, CMD still has it. This post collects a few everyday file manipulation scenarios that can be handled with the command-line interpreter.
Windows Command Prompt is a command-line interface for file and process management on Windows. A big deal in the 1990s, the tool is not overwhelmingly popular today among data scientists, or among Windows users in general. But this old-school tool still proves useful for basic file manipulation. It might not have the capabilities of, say, Python, but when you cannot use a programming language, or you are simply looking for a challenge, CMD will always be there for you. Recently, I helped a friend automate some tedious copy-and-paste operations and cut his workload by days. Our collaboration is documented below, along with some code snippets.
*The title is a reference to the awesome Automate the Boring Stuff with Python by Al Sweigart.
My friend Richard, a food technologist, was burdened with the most mundane of tasks: he needed to rename a bunch of files. He had reserved a couple of days for it, as a prerequisite to performing his actual job. The files were supposedly photos, but they had been delivered without any file extension. Any relevant metadata was missing: the file names simply read “Attachement” with a corresponding number. My friend needed these pictures in the right format and under a standardised name to be able to proceed with his technical review.
To work around the problem, Richard designed a manual process relying on the tools he knew best: copy and paste, MS Paint, and MS Excel. He would first open each attachment in MS Paint and save it as a JPEG. He then generated a massive Excel spreadsheet with new file names following a simple convention of Photo-Number-Date. Finally, he would carefully rename the original attachments by copying the names across from the spreadsheet.
Together we worked on automating this mind-bogglingly boring and error-prone task. With the help of the standard Windows command-line interface we reduced his work effort from a couple of days to a mere minute. CMD is cool because it comes with every Windows system, unlike Python and R, which you need to install and configure first. It’s one click away and it’s not intimidating to get started with. It can be used for operations on files with very little “programming” skill. The following list presents some of the scenarios Richard and I tackled, for anyone struggling with a similar problem to reuse.
To use Windows Command Prompt, open the Start menu, type cmd and hit Enter. CMD will come up. Navigate to the folder that contains your files. You can use the cd command, like below:
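As a minimal sketch, assuming the files live in a hypothetical folder C:\Users\Richard\Documents\photos (substitute your own path), the navigation step could look like this:

```batch
rem Hypothetical path: replace it with the folder that holds your files.
rem /d also switches the drive when the folder is not on the current one.
cd /d "C:\Users\Richard\Documents\photos"
```

Quoting the path is a safe habit for folders whose names contain spaces.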
Importantly, before performing any operations on files, make sure you have created a backup copy of the original set.
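One possible way to take that backup from the prompt itself is XCOPY. This is only a sketch: the destination folder name photos-backup is an assumption, and any location outside the working folder would do.

```batch
rem Copy the current folder, including subfolders, to a sibling folder.
rem "..\photos-backup" is a hypothetical name. /E includes subfolders
rem (even empty ones) and /I treats the destination as a folder.
xcopy . "..\photos-backup" /E /I
```

Keeping the backup next to the working folder, rather than inside it, means the rename commands that follow cannot touch the copies.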
Let’s start by listing my current files:
>dir /b
photo-1
photo-2
photo-3
We have 3 files with no extensions and meaningless names… Aargh! Let’s fix this.
>ren * *.jpg
The asterisk * is a wildcard that tells the REN (rename) command to take every file in the folder into consideration. For files that have no extension, as here, the command keeps the current file name and appends .jpg to it.
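Because the wildcard matches everything in the folder, it can be reassuring to preview the result first. The line below is a sketch of such a dry run: it prints the REN command for each file that has no extension instead of executing it. The extension check is my own addition, not part of the original command.

```batch
rem Dry run at the prompt: print the REN command for every extensionless file
rem without executing it. In a .bat file, write %%f instead of %f.
for %f in (*) do @if "%~xf"=="" echo ren "%f" "%f.jpg"
```

Once the printed commands look right, removing the word echo makes the loop perform the renames.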
>ren *.png *.jpg
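Every file in the folder with a .png extension will get a .jpg extension instead. Note that this only changes the name: the image data itself is not converted to another format.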
>ren *.jpg ??????????%DATE%.jpg
%DATE% is a dynamic variable that expands to today’s date, formatted according to the regional settings of your system. If that format contains characters that are not allowed in file names, such as slashes, the command will fail. Here I assumed we know the file extension. Each question mark stands for one character of the original name, so make sure there are enough of them to cover the longest name in the set. More about the rename command and its options can be found in the REN section of SS64.
These were some basic rename commands. Below I have included some more advanced uses of REN. These examples require us to create a Windows batch file (.bat). In the same folder that contains your files, right-click and create a new text file. Give the file a name with a .bat extension (like program.bat). Open the file in Notepad or Notepad++: this is where we’ll put our code.
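On systems where the date contains slashes or spaces, the date can be cleaned up before it is used in a file name. The batch file below is a minimal sketch of that idea; the variable name today is my own choice, and the replacement characters are arbitrary.

```batch
@echo off
rem Replace characters that are awkward or illegal in file names.
rem %VAR:old=new% substitutes every occurrence of "old" with "new".
set "today=%DATE:/=-%"
set "today=%today: =_%"

rem Same rename as before, now with the cleaned-up date.
ren *.jpg "??????????%today%.jpg"
```

If the date format on your system already uses only dashes or dots, both substitutions simply leave the value unchanged.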
Start by creating a file that will store new names for your files (like names.txt). Here is my sample file:
>more names.txt
meaningful_name1
meaningful_name2
meaningful_name3
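Now copy the following code into your batch file and save it:
@echo off
setlocal EnableDelayedExpansion

rem read the new names from names.txt into name[1], name[2], ...
rem "delims=" keeps the whole line, so names may contain spaces
set i=0
for /F "delims=" %%a in (names.txt) do (
    set /A i+=1
    set "name[!i!]=%%a"
)

rem rename each .jpg file to the name with the same index, keeping its extension
set i=0
for /F "delims=" %%a in ('dir /b *.jpg') do (
    set /A i+=1
    for %%i in (!i!) do ren "%%a" "!name[%%i]!%%~Xa"
)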
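The first loop reads names.txt and stores each line in a numbered variable (name[1], name[2], and so on). The second loop walks through the .jpg files and renames each one to the stored name with the same index, so names are matched to files by position, in the order DIR lists them. %%a holds the current line or file name, i is our counter, and the ~X modifier extracts the file extension, which we keep.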
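Since names and files are paired purely by position, it may be worth checking that there are as many names as files before renaming anything. The batch file below is a sketch of such a check, not part of the original program; the variable names names and files are my own.

```batch
@echo off
setlocal

rem Count the non-empty lines in names.txt.
set names=0
for /F "delims=" %%a in (names.txt) do set /A names+=1

rem Count the .jpg files in the current folder.
set files=0
for /F "delims=" %%a in ('dir /b *.jpg') do set /A files+=1

rem Stop with an error code if the two counts differ.
if %names% neq %files% (
    echo Mismatch: %names% names for %files% files.
    exit /b 1
)
echo OK: %names% names for %files% files.
```

Running it before program.bat catches a names.txt that is one line short or long, which would otherwise leave some files with the wrong name or with no new name at all.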
Run the program and confirm that the files have been renamed.
>program.bat
>dir *.jpg /b
meaningful_name1.jpg
meaningful_name2.jpg
meaningful_name3.jpg
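@echo off
setlocal EnableDelayedExpansion

rem read the extra text from names.txt into name[1], name[2], ...
rem "delims=" keeps the whole line, so entries may contain spaces
set i=0
for /F "delims=" %%a in (names.txt) do (
    set /A i+=1
    set "name[!i!]=%%a"
)

rem insert the extra text between the original name and the extension
set i=0
for /F "delims=" %%a in ('dir /b *.jpg') do (
    set /A i+=1
    for %%i in (!i!) do ren "%%a" "%%~Na!name[%%i]!%%~Xa"
)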
This example is very similar to the one above. The difference is that we keep the original file name (hence the N modifier) and insert the extra text from names.txt between the name and the file extension. All sorts of things can be added this way: an ordered collection of dates, the file author’s name, or the company name. Assuming names.txt now holds the suffixes -something1, -something2 and -something3, running the program will return:
>program.bat
>dir *.jpg /b
photo-1-something1.jpg
photo-2-something2.jpg
photo-3-something3.jpg
First, we need to create a file with two tab-separated columns. The first column contains the original file names, the second the new names we want to use. My file:
>more names.txt
photo-1	meaningful_name1.jpg
photo-2	meaningful_name2.jpg
photo-3	meaningful_name3.jpg
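To rename the files, run the following loop at the command line. The tokens=1* option puts the first column in %A and the rest of the line in %B. If names.txt starts with a header row, add skip=1 to the options so that the first line is ignored:
>for /f "tokens=1*" %A in (names.txt) do ren "%A" "%B"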
This is especially useful if the files do not follow any logical order.
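With a hand-made mapping file, a typo in the first column is easy to miss. As a hedged sketch, the same loop can be turned into a preview that prints each rename and flags entries whose source file does not exist. The existence check is my addition.

```batch
rem Preview at the prompt: show each planned rename and report missing files.
rem In a .bat file, write %%A and %%B instead of %A and %B.
for /f "tokens=1*" %A in (names.txt) do @if exist "%A" (echo ren "%A" "%B") else (echo missing: "%A")
```

When every line prints a ren command and nothing is reported as missing, the mapping file is ready to be used for the real rename.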
CMD comes with some seriously useful capabilities. For a detailed list of commands, refer to SS64. There you will find all you need to know about REN (rename), the FOR /F loop, EnableDelayedExpansion, which is essential when variables change inside loops, and other commands from the CMD universe.