normalization-insensitive
Earlier this year, I went on a trip to Texas. On my Android phone, I ended up with photos with names like 20160515_201343_East César E. Chávez Boulevard.jpg. When I used the script I had written to transfer the photos to my computer, I discovered that it was re-copying some of the photos every time I ran the script, instead of recognizing that these photos had already been copied. Specifically, it was re-copying the photos whose names contained accent marks.
This comes down to a question of Unicode normalization. Linux (and thus Android) does not normalize filenames at all, so in practice they usually end up stored in NFC, while the Mac OS X filesystem (HFS+) stores filenames in a decomposed form that is close to NFD. So when comparing the filenames on the phone with the filenames on the laptop, the filenames with accents never matched.
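To make the mismatch concrete, here is a minimal sketch using only the Prelude. The names composed and decomposed are just for illustration; they spell the same word in NFC and NFD respectively, and an ordinary String comparison treats them as different.

```haskell
-- Two spellings of "César" that render identically but are different strings.
composed, decomposed :: String
composed   = "C\233sar"    -- U+00E9 LATIN SMALL LETTER E WITH ACUTE (NFC)
decomposed = "Ce\769sar"   -- 'e' followed by U+0301 COMBINING ACUTE ACCENT (NFD)

main :: IO ()
main = do
  print (composed == decomposed)              -- False
  print (length composed, length decomposed)  -- (5,6)
```

This is exactly the comparison a naive "have I already copied this file?" check performs, which is why the accented filenames never matched.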
What I wanted was something like case-insensitive, but for normalization, not case. So, I fulfilled this need by “writing” the normalization-insensitive package. (I put “writing” in quotes, because all I did was make some straightforward changes after copying the case-insensitive package wholesale.)
Under the hood, normalization-insensitive uses the unicode-transforms package. This avoids having to pull in any heavyweight dependencies like ICU.
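As a rough sketch of how the package is meant to be used, the example below wraps two spellings of the same name with NI.mk and compares them. It assumes the normalization-insensitive package is installed and relies only on NI.mk and NI.original, the same two functions the script uses; the sample strings are made up for illustration.

```haskell
import qualified Data.Unicode.NormalizationInsensitive as NI

-- The same name in composed (NFC) and decomposed (NFD) form.
composed, decomposed :: String
composed   = "Ch\225vez"    -- U+00E1 LATIN SMALL LETTER A WITH ACUTE
decomposed = "Cha\769vez"   -- 'a' followed by U+0301 COMBINING ACUTE ACCENT

main :: IO ()
main = do
  -- Plain String equality sees two different code point sequences.
  print (composed == decomposed)
  -- Wrapped in NI, equality is decided on the normalized forms instead.
  print (NI.mk composed == NI.mk decomposed)
  -- The wrapper keeps the original string, so nothing is lost by wrapping.
  print (NI.original (NI.mk decomposed) == decomposed)
```

As with case-insensitive, the wrapped value only changes how comparisons behave; the original string is still available when it is time to actually name a file.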
Here is the new version of the android-photos script:
#!/usr/bin/env runhaskell

import Data.List
import System.Directory
import System.IO
import System.Process

import Data.Unicode.NormalizationInsensitive ( NI )
import qualified Data.Unicode.NormalizationInsensitive as NI

remoteDir = "/storage/extSdCard/DCIM/Camera"
localDir = "/Users/ppelleti/Pictures/Android"

-- List the files on the phone. adb emits CRLF line endings, so drop the '\r'.
getRemoteFiles :: String -> IO [String]
getRemoteFiles remoteDir = do
  files <- readProcess "adb" ["shell", "ls", remoteDir] ""
  return $ lines $ filter (/= '\r') files

-- List the local files, skipping dotfiles (including "." and "..").
getLocalFiles :: String -> IO [String]
getLocalFiles localDir = do
  files <- getDirectoryContents localDir
  return $ filter shouldKeep files
  where shouldKeep ('.':_ ) = False
        shouldKeep _ = True

copyOneFile :: String -> String -> IO ()
copyOneFile remoteFile localFile =
  cmd ["adb", "pull", "-a", remoteFile, localFile]

-- Echo the command, then run it.
cmd :: [String] -> IO ()
cmd (exe:args) = do
  putStrLn $ unwords $ exe : map show args
  callProcess exe args

copyFiles :: String -> String -> [String] -> IO ()
copyFiles rDir lDir files =
  mapM_ cfile files
  where cfile file = copyOneFile (rDir ++ "/" ++ file) (lDir ++ "/" ++ file)

-- Compare names modulo Unicode normalization, but copy using the original names.
copyNewFiles :: String -> String -> IO ()
copyNewFiles rDir lDir = do
  rFiles <- map NI.mk <$> getRemoteFiles rDir
  lFiles <- map NI.mk <$> getLocalFiles lDir
  let files = rFiles \\ lFiles
  copyFiles rDir lDir $ map NI.original files

main = copyNewFiles remoteDir localDir
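The key step is the list difference in copyNewFiles. The sketch below isolates it, with made-up file listings standing in for the output of adb and getDirectoryContents (remoteNames, localNames, naiveNew and niNew are hypothetical names, not part of the script). It again assumes the normalization-insensitive package is installed.

```haskell
import Data.List ((\\))
import qualified Data.Unicode.NormalizationInsensitive as NI

-- Hypothetical listings: the phone reports a composed name,
-- the Mac reports the same file with a decomposed name.
remoteNames, localNames :: [String]
remoteNames = ["a_Ch\225vez.jpg", "b_plain.jpg", "c_new.jpg"]
localNames  = ["a_Cha\769vez.jpg", "b_plain.jpg"]

-- Difference on raw strings: the accented file looks new on every run.
naiveNew :: [String]
naiveNew = remoteNames \\ localNames

-- Difference on NI-wrapped names, unwrapped again for copying.
niNew :: [String]
niNew = map NI.original (map NI.mk remoteNames \\ map NI.mk localNames)

main :: IO ()
main = do
  print naiveNew
  print niNew
```

With plain strings, the accented name is reported as new even though it is already on the laptop; with the NI wrapper, only c_new.jpg should remain to be copied.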