Archive for the ‘PHP’ Category

Removing Duplicate Files from iTunes

Saturday, September 29th, 2007

While iTunes has its issues, for the most part it makes for a fairly passable media organizer. As long as I let it do its job, I’m pretty happy with how it handles the files on the backend. When I moved my iTunes library to a new external drive, however, I came across a pretty egregious “bug” that left me with thousands of duplicate files and no good way to remove them.

By default, iTunes stores files in “/Users//Music/iTunes/iTunes Music”. When you drag and drop a file from your desktop, or wherever, it copies the file into this directory, tossing it into a subdirectory based on Artist and Album name and renaming it according to the track info you specify in iTunes. When you change the Album name or Track name, the file is updated and moved automatically, so the location and name of the file always reflect what you see in iTunes. Personally, I like this behavior.

I didn’t run into problems until I attempted to move my files from the default directory to a directory on my new drive, a 500 GB FireWire affair. At first, everything worked properly when I changed the settings; it took a while, but it slowly copied all the files to the new location, “/Data/iTunes”, and once it was done, everything was great. And everything remained great until I made the mistake of opening iTunes while my new drive was unmounted.

Little did I know that when iTunes couldn’t find “/Data/iTunes”, the new directory, it changed itself back to its default: “/Users//Music/iTunes/iTunes Music”. Upon finding all the files still there, it quickly settled in and changed all the entries in the database to point to these files. When I finally realized what had happened, I remounted the new drive and changed the iTunes directory again- and that’s when all my troubles began.

iTunes once again began to copy my files to the new drive- only this time, the files already existed. Rather than simply use the existing files, it created new copies with “ 1” appended to their names, and I was left with several thousand duplicate files and no easy way to remove them.

To deal with this issue, I wrote a simple PHP script to crawl through my iTunes directory and deal with files that end in “ 1”. In the case where both files exist (with and without “ 1”), the script compares the MD5 hashes of the two files. If they are identical, it tries to rectify the situation by removing the old file and re-adding the remaining file to iTunes using osascript, OS X’s command-line AppleScript tool. Adding the file triggers iTunes to automatically rename the file properly, thus removing the “ 1”. By default, you’re prompted before any file is deleted. You can disable this by hitting ‘a’ at the prompt.

In the case where the two files are not equal, they are skipped. You’re going to have to deal with those manually. Sorry. It also has problems with files that contain special characters. You’ll have to deal with those as well… but come on- how many Blue Öyster Cult, Mötley Crüe or Björk songs do you have, anyway?

In the case where only one file exists, it will simply re-add the file to iTunes, on the theory that iTunes will fix it if the name is wrong, and ignore it otherwise.
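
The matching rule described above can be sketched in isolation. This is a minimal illustration, not the script itself: the function names `original_name_for` and `is_true_duplicate` are made up for the example, and the pattern is the kind the script uses, with the dot before the extension escaped.

```php
<?php
// Hypothetical helper for illustration; the name is not from the script.
// Given ".../07 Asleep at the Wheel 1.mp3", return ".../07 Asleep at the Wheel.mp3".
// Returns null when the name does not end in a space, one digit, and an extension.
function original_name_for($dupe) {
    if (preg_match('/^(.+) [0-9]\.([^.]+)$/', $dupe, $m)) {
        return "{$m[1]}.{$m[2]}";
    }
    return null;
}

// Hypothetical helper: two files count as duplicates only if both exist
// and their contents hash to the same MD5 value.
function is_true_duplicate($dupe) {
    $original = original_name_for($dupe);
    return $original !== null
        && is_file($original)
        && is_file($dupe)
        && md5_file($original) === md5_file($dupe);
}
```

Note that a legitimately named file such as “3-03 Exodus_ Part 1.mp4” also fits the name pattern, which is why the name alone is never enough and the contents have to be compared before anything is deleted.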

fix_itunes_dupes.php

#!/usr/bin/php
<?php

/*
* Name: fix_itunes_dupes.php
* Author: patrick
* Usage: fix_itunes_dupes.php
*
* This script is provided AS-IS.  No warranty is either expressed or implied.
* Use at your own risk.
* Feel free to use, modify, duplicate, or take credit.
*/
// Set this variable to your iTunes directory.
// getenv() is used because $_ENV can be empty, depending on variables_order in php.ini.
$itunes_dir = "/Users/" . getenv('USER') . "/Music/iTunes/iTunes Music";

if (!file_exists($itunes_dir)) {
    exit("{$itunes_dir} does not exist.");
}

// Collect itunes files that end in " 1".
$find = "find \"{$itunes_dir}\" -type f -name \"* 1.*\"";
$files = explode("\n", `$find`);
$file_count = count($files) - 1;
// The output of 'find' ends with a newline, so explode() leaves an empty
// last element.  Drop it so we never try to process an empty path.
unset($files[$file_count]);
// Set some defaults.
$all = false;
$count = 0;

// Loop through the files.
foreach ($files as $file1) {
    $count++;
    print "Processing {$count}/{$file_count}:\n";
    // Skip Movies and TV Shows.
    if (preg_match("/\/Movies\//", $file1) || preg_match("/\/TV Shows\//", $file1)) {
        print "Skipping TV Show or Movie:\n";
        print "{$file1}\n";
    } else {
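        // Split "<name> <digit>.<ext>" into name and extension; the dot is escaped so it only matches a literal ".".
        preg_match("/^(.+) [0-9]\.([^.]+)$/", $file1, $matches);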
        $file = "{$matches[1]}.{$matches[2]}";
        if (file_exists($file)) {
            // md5_file() hashes the file without loading it all into memory.
            $md5 = md5_file($file);
            $md51 = md5_file($file1);
            print "{$file} ({$md5})\n";
            print "{$file1} ({$md51})\n";
            if ($md5 == $md51) {
                // If the two files are identical, we can remove one of them.
                print "{$file} = {$file1}\n";
                if (!$all) {
                    // Options are:
                    //  y - delete the file. default.
                    //  n - skip this file.
                    //  a - stop asking, just do them all.
                    //  q - quit.
                    print "Delete? [Y/n/a/q]: ";
                    $char = substr(fgets(STDIN), 0, 1);
                }
                if ($char == "q") {
                    exit;
                } elseif ($char != "n") {
                    if ($char == "a") {
                        $all = true;
                    }
                    print "Deleting {$file}\n";
                    unlink($file);
                    add2itunes($file1);
                }
            } else {
                print "Files do not match.  Skipping.\n";
            }
        } else {
            // There is only one file; go ahead and add it to iTunes-
            // shouldn't hurt.
            print "{$file1} exists on its own.\n";
            add2itunes($file1);
        }
    }
    print "\n";
}

function add2itunes($file) {
    $file = preg_replace("/\//", ":", $file);
    // Please forgive the toothpicks- numerous files have 's.
    $command = "osascript -e \"tell application \\\"iTunes\\\" to add file \\\"{$file}\\\" to playlist \\\"Library\\\" of source \\\"Library\\\"\"\n";
    print $command;
    passthru($command);
    return;
}

?>
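
One likely contributor to the special-character trouble mentioned earlier is quoting: the path passes through both the shell and AppleScript, and each has its own escaping rules. The sketch below is an assumption about how that could be handled, not a tested fix; `build_add_command` is a hypothetical name, and it only builds the command string without running it.

```php
<?php
// Hypothetical helper, not part of the original script: builds the osascript
// command with explicit quoting for both AppleScript and the shell.
function build_add_command($path) {
    // iTunes is given an HFS-style path, so swap "/" for ":" as add2itunes() does.
    $hfs = str_replace("/", ":", $path);
    // Inside an AppleScript string literal, backslash and double quote must be escaped.
    $hfs = strtr($hfs, array("\\" => "\\\\", "\"" => "\\\""));
    $script = "tell application \"iTunes\" to add file \"{$hfs}\" "
            . "to playlist \"Library\" of source \"Library\"";
    // escapeshellarg() wraps the whole script in single quotes for the shell,
    // which also takes care of apostrophes in file names.
    return "osascript -e " . escapeshellarg($script);
}

print build_add_command("/Data/iTunes/Some Artist/01 It's a Song.mp3") . "\n";
```

This covers quotes and apostrophes only. Accented names such as Björk may still fail for encoding reasons, and how escapeshellarg() treats non-ASCII bytes can depend on the locale, so those files may still need manual attention.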

Sample Output

Processing 31/39:
/Data/iTunes/The Beastie Boys/Licensed To Ill/13 Time to Get Ill.m4a (42cbc19b9d2feb83899201793ee1e00e)
/Data/iTunes/The Beastie Boys/Licensed To Ill/13 Time to Get Ill 1.m4a (e2f49dfc12db5fb8f8d3954b9671870c)
Files do not match.  Skipping.
Processing 32/39:
/Data/iTunes/The Bloodhound Gang/One Fierce Beer Coaster/07 Asleep at the Wheel.mp3 (6d12ce65c4cba6748224e2262feba67a)
/Data/iTunes/The Bloodhound Gang/One Fierce Beer Coaster/07 Asleep at the Wheel 1.mp3 (6d12ce65c4cba6748224e2262feba67a)
/Data/iTunes/The Bloodhound Gang/One Fierce Beer Coaster/07 Asleep at the Wheel.mp3 = /Data/iTunes/The Bloodhound Gang/One Fierce Beer Coaster/07 Asleep at the Wheel 1.mp3
Delete? [Y/n/a/q]: y
Deleting /Data/iTunes/The Bloodhound Gang/One Fierce Beer Coaster/07 Asleep at the Wheel.mp3
osascript -e "tell application \"iTunes\" to add file \":Data:iTunes:The Bloodhound Gang:One Fierce Beer Coaster:07 Asleep at the Wheel 1.mp3\" to playlist \"Library\" of source \"Library\""
file track id 53184
Processing 33/39:
Skipping TV Show or Movie:
/Data/iTunes/TV Shows/Battlestar Galactica/3-03 Exodus_ Part 1.mp4

Fixing a PHP CLI segmentation fault

Monday, September 17th, 2007

Another Short Answer to a Specific Question:

Sometimes, after an upgrade, any PHP script run from the command line causes a segmentation fault, even though the script seems to run fine. For example:

[03:13:01 argon:~]$ php -v
PHP 5.2.4 with Suhosin-Patch 0.9.6.2 (cli) (built: Sep 17 2007 02:28:25)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies
Segmentation fault: 11 (core dumped)
[03:13:02 argon:~]$

Whenever this happens, I can usually resolve the problem by changing the order of the modules listed in extensions.ini. Three times out of four, moving “extension=session.so” to the top of the file fixes the problem.
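
The reordering can be done by hand in an editor, but as a rough sketch it looks like this. The function name `move_extension_first` and the sample module list are made up for illustration; only “extension=session.so” comes from the fix described above, and the location of extensions.ini varies by system.

```php
<?php
// Hypothetical helper: move the line that loads $module to the front,
// keeping every other line in its original order.
// $lines holds the file's lines without trailing newlines.
function move_extension_first(array $lines, $module) {
    $target = "extension={$module}";
    $first = array();
    $rest = array();
    foreach ($lines as $line) {
        if (trim($line) === $target) {
            $first[] = $line;
        } else {
            $rest[] = $line;
        }
    }
    return array_merge($first, $rest);
}

// Illustrative contents; only session.so is taken from the post.
$lines = array(
    "extension=ctype.so",
    "extension=session.so",
    "extension=simplexml.so",
);
print implode("\n", move_extension_first($lines, "session.so")) . "\n";
```

To apply it to a real file, the lines could be read with file() and written back with file_put_contents(), after making a backup copy first.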