PfpFileTree

While there are lots of other tools available for managing file trees, I have endeavoured to write something which provides optimal flexibility in some clean and well documented code.

The pfpFileTree API is implemented as a single class. Out of the box it allows you to filter and manipulate a set of files which exist within a directory tree.

The constructor takes a single parameter – the path to the directory which is the 'root' of the tree.

Most of the other methods take a parameter (the selector) which defines search criteria: an array of key/value pairs. The key is the attribute to filter on, and the value is the value to match (optionally preceded by an operator).

The name attribute is the only one which accepts globbing patterns (and any value passed for it is treated as a globbing pattern). The others are a little different. If the value passed begins with any of the characters <, >, +, -, ! then that character is treated as a comparison operator (less than, greater than, less than, greater than and not). Note that the comparison operators only work on string, integer and time values, not booleans. A list of the standard attributes is provided in Appendix A.
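To make the matching rules above a bit more concrete, here is a minimal standalone sketch of how a selector might be tested against one entry. The function matchesSelector() is hypothetical and is not the code inside pfpFileTree. It only handles the <, > and ! operators, and it assumes the glob is applied to the file's base name.

```php
<?php
// Hypothetical helper: a sketch of the selector rules, not pfpFileTree's own code.
function matchesSelector(string $relPath, array $attrs, array $selector): bool
{
    foreach ($selector as $key => $want) {
        if ($key === 'name') {
            // 'name' is always treated as a globbing pattern.
            // Assumption: the pattern is matched against the base name only.
            if (!fnmatch($want, basename($relPath))) {
                return false;
            }
            continue;
        }
        if (!array_key_exists($key, $attrs)) {
            return false;
        }
        $have = $attrs[$key];
        // A leading <, > or ! on a string value is taken as the operator.
        $op = (is_string($want) && $want !== '') ? $want[0] : '';
        if ($op === '<' || $op === '>' || $op === '!') {
            $operand = substr($want, 1);
            if ($op === '<' && !($have < $operand)) {
                return false;
            }
            if ($op === '>' && !($have > $operand)) {
                return false;
            }
            if ($op === '!' && !($have != $operand)) {
                return false;
            }
        } elseif ($have != $want) {
            // No operator: plain equality (pass real booleans for boolean attributes).
            return false;
        }
    }
    return true; // an empty selector matches everything
}
```

With this sketch, a selector such as array('name' => '*.jpeg', 'size' => '>1024') would match JPEG files larger than 1024 bytes.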

The current fileset is held in the $data member variable. This is an array with the path relative to the base directory (tree root) as keys. Each value is an array of attributes.

Because it's very easy for a recursive search to run away from you, there is built-in checking that halts execution, sets an error and returns false when memory usage rises above a predefined threshold (90% by default).
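A guard of this kind can be sketched with PHP's memory_get_usage() and the memory_limit ini setting. The helper names below (iniSizeToBytes, memoryExhausted) are made up for illustration, and measuring the threshold against memory_limit is my assumption about what the 90% refers to.

```php
<?php
// Hypothetical helpers sketching a memory guard; the names are not from pfpFileTree.

// Convert a php.ini shorthand size such as "128M" into bytes.
function iniSizeToBytes(string $value): int
{
    $value = trim($value);
    if ($value === '') {
        return 0;
    }
    $unit = strtoupper(substr($value, -1));
    $number = (int) $value; // the integer cast ignores the trailing unit letter
    switch ($unit) {
        case 'G':
            return $number * 1024 * 1024 * 1024;
        case 'M':
            return $number * 1024 * 1024;
        case 'K':
            return $number * 1024;
        default:
            return $number;
    }
}

// True when current usage exceeds $threshold (a fraction) of memory_limit.
function memoryExhausted(float $threshold = 0.9): bool
{
    $limit = iniSizeToBytes((string) ini_get('memory_limit'));
    if ($limit <= 0) {
        return false; // -1 means no limit is configured
    }
    return memory_get_usage() > $limit * $threshold;
}
```

A recursive directory walk could call memoryExhausted() once per directory and bail out as soon as it returns true.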

readTree($attr)

Adds files to $data which match the selector specified by $attr (pass an empty array to select all files) with the standard attributes populated. Returns false on failure, or a reference to the current $data on success. This method will also decorate $data with any additional attributes returned by a callback specified with addCallback.


filter($attr)

Removes entries from $data which do not match the selector specified.

Returns false on failure, or a reference to the current $data on success.


subset($attr)

Like filter, but operates on (and returns) a new pfpFileTree with the filter applied. The set held in the current object is not modified.


applyCallback($callback, $arg)

Runs through the current list of files in $data and calls the function/method specified in $callback with two parameters: the first is the canonical path of the file, the second is a copy of $arg.

Execution stops when the function/method specified returns false (and this method returns false) or when all iterations of the method return true (and this method returns true).


Note that this callback is independent of, and behaves differently to, the callback used in the addCallback() decorator method.
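The stop-on-false behaviour of applyCallback can be pictured with the following standalone sketch. applyToAll() is a hypothetical stand-in, not the method itself, and building the path as base directory plus relative key is an assumption.

```php
<?php
// Hypothetical stand-in for the iteration described above; not the class's code.
function applyToAll(string $baseDir, array $data, callable $callback, $arg): bool
{
    foreach (array_keys($data) as $relPath) {
        // Assumption: the path handed to the callback is baseDir + '/' + key.
        $path = rtrim($baseDir, '/') . '/' . $relPath;
        if ($callback($path, $arg) === false) {
            return false; // stop at the first callback that reports failure
        }
    }
    return true; // every callback succeeded
}
```

Note that files after the first failure are never visited, so a callback with side effects may leave the set half processed.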


compareTree($otherBaseDir)

Populates the cmp attribute in the current tree depending on the existence of a file with the same relative path:

0 if both files are the same

1 if the file in otherBaseDir is different and older

-1 if the file in otherBaseDir is different and newer

2 if the file does not exist in otherBaseDir

3 if the file is a regular file in one tree and a directory in the other


Returns false on failure, or a reference to the current $data on success. In the event of finding a type mismatch (file/directory) processing stops immediately and the method returns false.
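For a single pair of paths, the cmp values listed above could be worked out roughly as follows. compareEntry() is a hypothetical helper, not the method's real implementation. How two matching directories are scored, and what happens when different files share the same mtime, are assumptions flagged in the comments.

```php
<?php
// Hypothetical helper mirroring the cmp values listed above; not the class's code.
function compareEntry(string $mine, string $other): int
{
    if (!file_exists($other)) {
        return 2; // missing in the other tree
    }
    if (is_dir($mine) !== is_dir($other)) {
        return 3; // regular file in one tree, directory in the other
    }
    if (is_dir($mine)) {
        return 0; // assumption: matching directories count as the same
    }
    // Hash only when the sizes match; differing sizes already prove a difference.
    if (filesize($mine) === filesize($other) && md5_file($mine) === md5_file($other)) {
        return 0;
    }
    // Assumption: equal mtimes are reported as "other is older".
    return filemtime($other) > filemtime($mine) ? -1 : 1;
}
```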

canWriteTo($otherBaseDir)

Tests if all the files in the current data set can be written to the corresponding path in otherBaseDir. This populates the 'can_w' attribute for the files and returns true if all are writeable. Note that this method will iterate through the complete current list of files – even if some cannot be written.

Returns false if one or more files cannot be written.

writeTo($otherBaseDir)

If, and only if, canWriteTo($otherBaseDir) returns true, this method will attempt the actual file copying. Except in very unusual circumstances, all the files will be copied and the method will return a reference to the current $data. If something unexpectedly fails, it will return false (and in this case, some of the files may have been copied).

canDelFiles()

Guess what? Yes, it's the same as canWriteTo, but tests whether all the files in the current set can be deleted. In the case of directories, it only tests the permissions related to the directory; it does not check that the directory will be empty (and hence that rmdir will succeed).

delFiles()

As with the writeTo operation, the corresponding 'can' method is checked first to see if the code is likely to succeed. If it's not going to, then this method returns false without changing any files.


Where files have been deleted, the 'exists' attribute is updated to false. If there are directories which would normally be deletable but which still contain files, these will fail silently.


Returns true if the operation was completely successful – otherwise false.

clear()

Resets the current data set (but not the baseDir).

addCallback($callback)

Configures extra decorators which are invoked when readTree is called. Each callback (you can add multiple ones) is called with a single parameter – the canonical file path. It should return an array of key/value pairs. The keys should not be any of the standard keys listed in Appendix A.

e.g.


$instance->addCallback('my_fileperms');
// Decorator: adds a 'fileperms' attribute to each entry read by readTree()
function my_fileperms($file)
{
   return array('fileperms'=>fileperms($file));
}

sortFiles($attr)

It shouldn't come as much of a shock that this changes the order of the files in $data. If we're going to be deleting stuff, then it makes more sense to delete the contents of a directory before we try to delete the directory. If we're rotating log files, then we want to start with the oldest file in each set.

Again $attr is a set of key/value pairs – but the order is important! The values stored in the array can be '+' or '-' depending on whether that key should be sorted ascending or descending.

e.g.


    sortFiles(array('size'=>'+','mtime'=>'-'));

Small files appear first (ascending); if two files have the same size, the newest appears first (descending).
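An ordered multi-key sort like this can be built on uasort() with a comparator that walks the keys in order. The sketch below is only an illustration of the idea: sortBySpec() is a hypothetical function rather than the class's method, and it needs PHP 7 or later for the <=> and ?? operators.

```php
<?php
// Hypothetical sketch of a multi-key sort; sortBySpec() is not part of pfpFileTree.
function sortBySpec(array $data, array $spec): array
{
    uasort($data, function (array $a, array $b) use ($spec): int {
        // Keys are tried in the order given; later keys only break ties.
        foreach ($spec as $key => $direction) {
            $cmp = ($a[$key] ?? null) <=> ($b[$key] ?? null);
            if ($cmp !== 0) {
                return $direction === '-' ? -$cmp : $cmp;
            }
        }
        return 0;
    });
    return $data; // uasort() keeps the relative-path keys attached to their entries
}
```

Because uasort() preserves keys, the relative paths stay attached to their attribute arrays after sorting.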

ls()

Dumps the current $data to stdout.

Portability

I've written it to be as portable as possible, but I don't have an MS Windows based system to test on. While PHP provides the DIRECTORY_SEPARATOR constant, which is populated with '\' on such platforms and '/' everywhere else, PHP is quite happy to process paths on MS Windows systems using '/' as the directory separator. So pfpFileTree uses '/' throughout.
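If you are feeding in paths that came from a Windows source, normalising them first is a one-liner. toForwardSlashes() is a hypothetical helper of my own, not something the class provides. Bear in mind that on non-Windows systems a backslash is a legal character in a file name, so only apply this where you know the input is a Windows-style path.

```php
<?php
// Hypothetical helper: convert Windows-style separators to '/'.
function toForwardSlashes(string $path): string
{
    return str_replace('\\', '/', $path);
}
```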

The constructor sets the 'caseSensitive' property to false for MS Windows systems, and to true for anything else, but this can be changed later. However, the semantics of this can get very complicated very quickly: here be dragons!

Symbolic links

These are processed as if they were normal files. When writeTo() is invoked for a link, it creates a copy of the linked file. But when delFiles() is invoked, it deletes the link, leaving the original file in place.

Appendix A: Attributes

| Attribute | Type | Meaning | Example |
|-----------|------|---------|---------|
| exists | Boolean | Whether or not the file exists (a bit metaphysical, but there are reasons for having this). Populated by readTree(), updated by delFiles(). | true |
| size | Integer | Size in bytes as returned by filesize(). Populated by readTree(). | 14219 |
| mtime | Integer | Unix timestamp of last modification. Populated by readTree(). | 1296866170 |
| type | String | File (f), Directory (d) or something else (o). Something else includes symbolic links. Populated by readTree(). | f |
| w | Boolean | Whether the file is writeable. Populated by readTree(). | true |
| r | Boolean | Whether the file is readable. Populated by readTree(). | false |
| name | String | This is not stored as an attribute: the path relative to the base directory is used as the key for an array holding the other attributes. However, glob filtering can be performed by passing a selector with a name member (see the description of selectors above). | *.jpeg |

Conditionally populated

| Attribute | Type | Meaning | Example |
|-----------|------|---------|---------|
| cmp | Integer | The result of the last (tree) compare operation: 0 if same in both trees, 1 if different and the file in this tree is newer, -1 if different and the other file is newer, 2 if the file was not found in the other tree. Note that this does not detect the case where a file exists in the other tree but not in this one. Populated by compareTree(). | -1 |
| can_w | Boolean | Whether the file or directory can be written to another location. Populated by canWriteTo(). | true |
| can_d | Boolean | Whether the file or directory can be deleted. Populated by canDelFiles(). | false |
| md5 | String | The value returned by md5_file() for a file or, in the case of a directory, the md5 hash of the string representing the path of the directory relative to the base dir, i.e. md5($effPath). This is only populated by compareTree() when files exist in both trees and are the same size. | a5b4ea05ea755c5af59f6ecdfde9d38 |