Sixty-Five Software, Inc.
DownloadHome


 
You are here:
  Home
 
Quick Links:
 

 

SpaceMonger: About Treemaps


About Treemaps in General

In case you don't already know, SpaceMonger's rectangle-layout system is not unique; it uses a technique called treemaps that were originally developed at the University of Maryland's Human-Computer Interaction Laboratory by Dr. Ben Shneiderman, all the way back in 1989. Dr. Shneiderman is a well-known force in user-interface-design circles. Treemaps have found their way into a few different products, but for no readily apparent reason, they don't really seem to have made it very far outside academia.

We are pleased to say that we have received permission from Dr. Shneiderman to use his wonderful treemap concept in SpaceMonger. SpaceMonger doesn't use any of his code or anyone else's code, and it doesn't use his layout algorithms either, but it does use the concept. Thank you for your permission, Dr. Shneiderman!

You can learn more about treemaps on his very complete Treemaps Page.


How SpaceMonger Implements Treemaps*

So how does SpaceMonger's treemapping algorithm work? While we can't give you the whole details in a short space, we can give you an overview of it. The text below is somewhat technical in nature, but any second-year computer-science student should be able to follow it without too much difficulty.


The scanner and in-memory database

SpaceMonger maintains the disk's data as a tree of ScanFile and ScanFolder objects in RAM. A ScanFolder can be thought of as an array that stores multiple ScanFiles. Each ScanFile object contains information about a file, such as its name and size and type, and a pointer to a possible deeper ScanFolder.

ScanFolders, in addition to storing an array of ScanFiles, also store a sum of the sizes of all the data under them. During the scan, or when data changes, SpaceMonger updates higher ScanFolder objects when their descendants get added, changed, or removed to ensure that every ScanFolder in the tree is always valid for the known data.

SpaceMonger maintains a "watch" on the disk (using the ReadDirectoryChangesW API), updates the tree when the disk changes, and notifies anything displaying parts of the disk (the treemap, the pie chart, etc.) that they need to update their displays accordingly.


The treemapper

SpaceMonger's treemapper is broken into three parts: The layout engine, the renderer, and the widget. The layout engine is responsible for determining where files should be positioned in the treemap. The renderer is responsible for drawing files and folders in the treemap. The widget is repsonsible for accepting and processing user input, such as mouse clicks and key presses.

(This design is vaguely model-view-controller-esque, in that the model is the tree itself, the view is jointly managed by all three parts of the treemapper, and the widget is entirely responsible for the controller aspects of the design.)

These three components represent roughly 4,200 lines of code, so again, we'll only give a broad overview to its design, mainly focusing on the layout engine (the renderer and the widget have a lot of details, but are conceptually fairly straightforward).

The layout engine is responsible for apportioning the space in the display to files and folders. SpaceMonger does this using a pair of recursive algorithms. Our tests determined that a single good general-purpose algorithm beats special-purpose algorithms, so SpaceMonger uses the same algorithms to divide up space no matter how that space is structured.

The layout algorithm is embodied in a function named LayoutFolder, which is given a folder from the tree and the coordinates of the folder** in the display. If the folder is the root of the tree, then the coordinates are those of the entire display.

LayoutFolder starts by sorting the files within the folder by size (it uses a counting sort and radix sort to do this especially quickly, but any sorting algorithm would work). It then takes the sorted files and passes them and the folder's coordinates to a function named DivideDisplayArea. DivideDisplayArea does the majority of the hard layout work. If there's only one file or less, the file obviously consumes all of the folder's space. If there are two or more, though, its task is a bit more complicated.

DivideDisplayArea takes all of the files it's given, and, using a greedy algorithm, divides them into two lists, A and B, whose total sizes are nearly equal***. Then the available area is divided in half, with each sub-area proportional to the total size of each list that must fit in it. This split is done horizontally if the available area has a high aspect ratio, and vertically if it has a low aspect ratio. The remaining partial lists and partial areas are then fed back into DivideDisplayArea, which will eventually recurse until each area only has one file to place inside it.

When every file has been positioned, DivideDisplayArea returns back to LayoutFolder, which then goes through each of its file again and calls LayoutFolder on it if the file is actually a folder. This, then, causes every file to be fully positioned in the display.

Once a file is positioned for rendering, rendering it to the display is a relatively simple phase that can be performed after layout. Rendering is left as an exercise for the reader.


Pseudocode for a SpaceMonger-like treemapping engine

The algorithms described above can be approximated by the following (rather lengthy) pseudocode. This pseudocode is written in a vaguely C-ish style, but should be fairly comprehensible to any programmer familiar with C, C++, Perl, Java, Javascript, or PHP (and is likely to be at least somewhat readable to fans of Pascal and Ada too).


// Compute the complete layout for a folder and all of its
// descendants.
function LayoutFolder(folder_data, folder_rectangle) {

    // Sort the folder's children by their sizes, largest first.
    // We leave the result in a separate array along with the
    // sizes of the files.  If you wanted to lay out the data
    // by a different sizing technique (by logical size, physical
    // size, count, etc.), this would be the place to assign
    // "true" sizes to each entry.
    sorted_array = new array(folder_data.num_children);
    sort(folder_data.children, sorted_array);

    // Divide up the display area for the children.
    DivideDisplayArea(sorted_array, folder_rectangle,
        0, folder_data.num_children,
        folder_data.total_size_of_descendants);

    // Go within each child, and if it's a folder, lay it out too.
    foreach (child in folder_data.children) {
        if (child is a folder) {
            area_for_children = child.rectangle;

            // If you want to "steal" pixels from the children
            // for the folder (for example, to display the
            // folder's title), simply shrink area_for_children
            // here before calling LayoutChildren().

            LayoutChildren(child, area_for_children);
        }
    }
}

// DivideDisplayArea fully positions a group of children in
// the given rectangle.  It is very recursive.
function DivideDisplayArea(children, area_rectangle,
    first_child, num_children_to_layout, totalsize) {

    if (num_children_to_layout <= 0) {
        // Easy case:  No files to position.
        return;
    }

    if (num_children_to_layout == 1) {
        // Equally easy case:  Only one file to position.
        children[first_child].rectangle = area_rectangle;
    }

    // General case:  Position two or more files.  Start by
    // creating two imaginary empty lists.  We will eventually
    // move all the files from the original array to these lists
    // such that the two lists are of approximately equal size.
    // There are a variety of ways to do this; below, we'll
    // show the method that produces the 'squarified' layout
    // used in default installs of SpaceMonger.
    children_in_list_a = 0; size_a = 0;
    children_in_list_b = 0; size_b = 0;

    // Add the first file to the first list.  Because the incoming
    // list is sorted by size, we know this will always be the
    // largest child.
    size_a = children[first_child].size;
    children[first_child].list = LIST_A;
    children_in_list_a++;

    // Now the fun part.  We want to try to make the two lists
    // approximately equal in size.  We do this by linearly
    // searching for the midpoint of the sorted list such that
    // a few large files on one side are approximately equal
    // in size to many small files on the other side.

    // Walk through the files and find the "center of gravity"
    // of the data.  Each file before the "center of gravity"
    // will be put into list A.
    for (child = first_child + 1;
        child < first_child + num_children_to_layout
            && (size_a + children[child].size) * 2 < totalsize
            && children[child].size > 0;
        child++) {
		size_a += children[child].size;
        children[child].list = LIST_A;
        children_in_list_a++;
    }

    // Keep walking through the list and mark the rest of the
    // files as being for list B.
    for (; child < first_child + num_children_to_layout;
        child++) {
        size_b += children[child].size;
        children[child].list = LIST_B;
        children_in_list_b++;
    }

    // Now copy the files from the original array into
    // two temporary arrays.  These will both still be
    // sorted by size.
    temp_array_a = new array(children_in_list_a);
    temp_array_b = new array(children_in_list_b);
    dest_a = 0;
    dest_b = 0;
    for (src = first_child;
        src < first_child + num_children_to_layout; src++) {
        if (children[src].list == LIST_A)
            temp_array_a[dest_a++] = children[src];
        else temp_array_b[dest_b++] = children[src];
    }

    // The temporary arrays really aren't necessary,
    // so now that we've "unsorted" our data into their
    // separate lists, we can move it back to the
    // original array.  Then we can free the temporary
    // arrays (note that if you're clever, you can share
    // the temporary arrays across multiple invocations
    // of DivideDisplayArea, thus avoiding allocating
    // or freeing them at all.)
    dest = first_child;
    for (src = 0; src < dest_a; src++)
        children[dest++] = temp_array_a[src++];
    delete temp_array_a;
    for (src = 0; src < dest_b; src++)
        children[dest++] = temp_array_b[src++];
    delete temp_array_b;

    // We can now divide up the available area into two
    // parts according to the lists' sizes.
    x = area_rectangle.x;
    y = area_rectangle.y;
    width = area_rectangle.width;
    height = area_rectangle.height;

    if (size_a + size_b <= 0) {
        // Degenerate case:  All size-zero entries.
        midpoint = 0;
        orientation = HORIZONTAL;
    }
    else {
        // If the available area is wider than it is tall,
        // split it vertically.  Otherwise, split it
        // horizontally.  The position of the split is
        // proportional to the sizes of the two lists.
        if (width >= height) {
            midpoint = (size_a * width) / totalsize;
            orientation = HORIZONTAL;
        }
        else {
            midpoint = (size_a * height) / totalsize;
            orientation = VERTICAL;
        }
    }

    // Once we've split, we recurse to divide up the two
    // new areas further, and we keep recursing until
    // we're only trying to fit one entry into any
    // given area.  This way, even size-zero entries will
    // eventually be assigned a location somewhere in the
    // display.  The rectangles below are created in
    // (x, y, width, height) format.

    if (orientation == HORIZONTAL) {
        DivideDisplayArea(children,
            Rectangle(x, y, midpoint, height),
            first_child, children_in_list_a,
            size_a);
        DivideDisplayArea(children,
            Rectangle(x + midpoint, y, width - midpoint, height),
            first_child + children_in_list_a, children_in_list_b,
            size_b);
    }
    else {
        DivideDisplayArea(children,
            Rectangle(x, y, width, midpoint),
            first_child, children_in_list_a,
            size_a);
        DivideDisplayArea(children,
            Rectangle(x, y + midpoint, width, height - midpoint),
            first_child + children_in_list_a, children_in_list_b,
            size_b);
    }
}

Conclusion

SpaceMonger uses many, many optimizations on these techniques to ensure that the layout is performed very, very fast. Some of those optimizations are Sixty-Five trade secrets, so we can't release them. Others include avoiding copies in the loops above by using pointers; caching size lookups in the sort arrays so that sizes can be sampled from a variety of sources (logical, physical, count, etc.); and splitting off the free space and system space specially so that they are always separate from the rest of the display. But even without all these optimizations and tricks, the basic algorithm above will produce a treemap layout that's very similar to that of SpaceMonger's, even if it's not as fast or as flexible as SpaceMonger's.


Footnotes

* Why are we releasing information about how our treemap algorithm works? Well, we believe in the free exchange of ideas, for one thing; and for another, we have benefited from public programming documents ourselves. It would be hypocritical then to build on knowledge we learned from the public and not return at least some of our work to the public. You're welcome to use any of the information you see above any way you see fit, if it benefits you.

** LayoutFolder, in SpaceMonger, is not actually given the coordinates of the folder because it doesn't need them; instead, it's merely given the aspect ratio of the folder. But most typical implementations would probably want to include the coordinates of the folder instead.

*** It's impossible for DivideDisplayArea to do this splitting perfectly. First, it's very unlikely that the lists can be made to have equal sizes. And second, it's been shown that finding the "most equally-sized sublists" is a problem that's NP-complete. Thus SpaceMonger uses a greedy algorithm that, while not always perfect, usually achieves very, very good results.


 


© 2014 Sixty-Five Software, Inc.
All rights reserved.

Privacy Policy and Terms of Use