By Kevin Keenan
Despite my advocacy of the rule 'blame yourself first and the computer second' when something goes wrong, computers still baffle me sometimes. Why did my laptop crash first thing on Monday morning? I still haven't found out.
The story goes: I logged in at about 0730, checked my emails, read a couple of paper abstracts and installed some automatic updates. I noticed that my system needed a reboot for one of the updates to take effect, so I complied. Following the restart I was greeted with an empty desktop, without the Ubuntu Unity launcher icons or the system menu. I could still operate the system from the terminal, but it seemed very unstable. Besides, that's no way to operate a computer with so much graphics power to spare. I went trawling through Google searches for 'missing unity launcher', 'unity launcher and menu bar don't appear' and the like. I tried lots of the proposed solutions but nothing worked. After a full day of trying to fix the problem, and knowing that I had a looming deadline for some stuff I needed a fully operational system for, I decided to do a clean re-install of Ubuntu. I had been meaning to do this for a while because previously I was only using the Wubi version, and I had some problems with activating the hibernate function as well as my fn key functionality.
The re-install went smoothly once I finally got some disk space back from my greedy Windows 7 installation. I restored my backups and installed my preferred packages. Within a couple of hours I was back to normal with everything working beautifully, or so I thought.
I realised that I had forgotten about Mendeley, my preferred reference manager. I quickly installed it and logged in, and that was when things got messy. I had stupidly forgone the option to sync my library with a cloud source, instead choosing to host my library in a single directory on my Windows 7 partition. This had been no problem because on the Wubi installation the Ubuntu file system was just a virtual extension of the Windows system, so the directory path was pretty standard. Now, however, this was not the case and I was greeted with thousands of broken links in Mendeley, which could no longer locate the relevant PDFs :-(
All I could do was note my error and move on. I needed to re-read all of my PDFs into Mendeley and regenerate my library. The problem was that I had stupidly (again) asked Mendeley to organise my PDFs in a really cumbersome way. All PDFs were located in folders divided by publication/author_name/paper_name. If you can't imagine that, it basically means that for every single file (2,384 of them) I would have to navigate through three folder levels to copy it and paste it into my new library. F**k that!
I knew there would be a nice easy Linux solution to collapse the directory tree and copy all PDFs, but I don't like easy. I also like to see just what can't be done in R. I went about creating a function to basically take a file pattern argument (e.g. .pdf, .txt, .xlsx), look for all files containing this pattern and copy them to a single directory of my choice. Sounds simple, doesn't it? It is. The breadth of capabilities that R has never ceases to amaze me. I always tell people that "it is much more than just a place to do your t-tests or regressions. It's a programming language too".
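One detail about the pattern argument is worth a quick sketch. R's file-listing functions treat the pattern as a regular expression, so a bare `.pdf` matches any character followed by "pdf", not just the file extension. The file names below are made up purely for illustration; an escaped dot plus an end-of-string anchor is one way to be stricter.

```r
# Hypothetical file names, used only for illustration
files <- c("paper.pdf", "Thesis.PDF", "mypdf_notes.txt")

# ".pdf" as a regular expression: "." matches any single character,
# so "ypdf" inside "mypdf_notes.txt" also counts as a match
loose <- grepl(".pdf", files, ignore.case = TRUE)

# Escaped dot and end-of-string anchor: only names ending in ".pdf"
strict <- grepl("\\.pdf$", files, ignore.case = TRUE)

print(data.frame(files, loose, strict))
```

For a library that only contains PDFs the loose form probably does no harm, but the stricter pattern is the safer default when the tree holds mixed file types.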
The code below will basically look in the current working directory (i.e. the top level directory) for any other directories recursively. Then, for each located directory it will check if there are any of the relevant file types (i.e. those matching the specified pattern) and copy them to a single destination directory (the 'new.location' argument, which defaults to the working directory). The code will not make any changes to the source files so your data is safe.
Click this link to download a source code file.
edit: Following a useful suggestion, I have included a new argument, 'new.location', which allows the files to be written to a user-defined location on the system.
# This function will look for all instances of files matching the pattern
# specified in the argument 'patterns', and copy them to the single directory
# given by 'new.location' (the working directory by default).
# Source files are unmodified
# Written by Kevin Keenan 2013
# Feel free to use, modify and redistribute as you wish
fileSort <- function(patterns = NULL, new.location = getwd()){
  ptn <- patterns
  # Define the file copier/mover function
  cpmvFILE <- function(dirs = NULL, patterns = NULL){
    if(!is.null(dirs)){
      sapply(dirs, function(x){
        # files in directory x whose names match the pattern
        names <- dir(path = x, pattern = patterns, ignore.case = TRUE)
        if(length(names) != 0){
          sapply(names, function(y){
            file.copy(from = paste(x, "/", y, sep = ""),
                      to = paste(new.location, "/", y, sep = ""),
                      overwrite = FALSE, recursive = FALSE)
          })
        }
      })
    }
  }
  # list all directories below the working directory, recursively
  dirsIn <- list.dirs(full.names = TRUE, recursive = TRUE)
  # Create the directory into which all relevant files will be written
  # (no warning is shown if it already exists)
  dir.create(new.location, showWarnings = FALSE)
  # run cpmvFILE, assigning res to x to prevent printout
  x <- cpmvFILE(dirs = dirsIn, patterns = ptn)
  # remove x
  rm(x)
}
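One thing the function above does not report is name clashes: because files are copied one at a time with overwrite = FALSE, a second file with the same name as one already copied is quietly skipped. As a rough sketch only (the helper name collectFiles and its arguments are my own invention for illustration, not part of the function above), the same job can be done with a single recursive listing that also records which files were left behind. It is run here on a throwaway directory tree under tempdir().

```r
# collectFiles() is a hypothetical helper, shown only as a sketch
collectFiles <- function(pattern, from, to) {
  # full paths of every matching file below 'from', at any depth
  found <- list.files(path = from, pattern = pattern, recursive = TRUE,
                      full.names = TRUE, ignore.case = TRUE)
  dir.create(to, showWarnings = FALSE, recursive = TRUE)
  # flag files whose name repeats an earlier one in the listing
  clash <- duplicated(basename(found))
  copied <- logical(length(found))
  # copy only the first file of each name; never overwrite existing files
  copied[!clash] <- file.copy(from = found[!clash],
                              to = file.path(to, basename(found[!clash])),
                              overwrite = FALSE)
  data.frame(source = found, clash = clash, copied = copied,
             stringsAsFactors = FALSE)
}

# Build a small made-up library: publication/author/paper
src <- tempfile("library")
dir.create(file.path(src, "journalA", "smith"), recursive = TRUE)
dir.create(file.path(src, "journalB", "jones"), recursive = TRUE)
writeLines("a", file.path(src, "journalA", "smith", "paper1.pdf"))
writeLines("b", file.path(src, "journalB", "jones", "paper1.pdf"))  # same name
writeLines("c", file.path(src, "journalB", "jones", "paper2.pdf"))
writeLines("d", file.path(src, "journalB", "jones", "notes.txt"))

out <- tempfile("flat")
res <- collectFiles("\\.pdf$", from = src, to = out)
print(res)               # one row per PDF found, with clash/copied flags
print(list.files(out))   # the flattened directory
```

The rows where clash is TRUE are the files that would need renaming before they could share a folder with the rest. The source tree is left untouched, as before.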