Last Accessed Date

Post by wumpheno on Tue Feb 26, 2008 2:38 pm

I don't really know if this is already a part of the SpaceMonger product, but it would really be great to see the date last accessed of files and then be able to prune files based on both size and access date. If it could also be ported into the reports page, that would be even better. It would make a really great business tool for sysadmins. Again, I'm very new to the product, but I love the features already in it.

Thanks
wumpheno
 
Posts: 1
Joined: Tue Feb 26, 2008 2:35 pm

Re: Last Accessed Date

Post by seanw on Wed Feb 27, 2008 3:07 am

wumpheno wrote: I don't really know if this is already a part of the SpaceMonger product, but it would really be great to see the date last accessed of files and then be able to prune files based on both size and access date. If it could also be ported into the reports page, that would be even better.

Well, it is, but it isn't. Internally, SpaceMonger does collect all three timestamps (creation time, access time, and modification time), but it only presents the modification time to the user. We do intend to include the other two as visible attributes in the future, but as a general rule, the creation time and last-access time are rarely accurate. Let me give you some examples as to why:
  • The standard Windows thumbnail preview causes the last-access time to be updated for a file. So the mere act of viewing a directory full of photos in the Windows Explorer causes every single one of them to have their last-access time updated to today, regardless of the time the photo was actually last accessed.
  • Many simple operations, like right-clicking on a file and clicking "Properties...", can cause the last-access time to be updated, even if Windows isn't supposed to do that. A large number of third-party property-page extensions open the file to examine its contents, and that's all it takes.
  • There are many other nearly-invisible ways the last-access time can be updated, and often is. Generally, the last-access time is not so much a viable measure of "what's been worked with recently" as it is a measure of "which files are such incredibly old detritus that no-one cares about them".
  • The creation time is often wrong, too. When you unzip a .zip file, or extract contents from a .tar file or .rar file, the file's modification time is usually updated to show the time stored in the archive --- but the creation time isn't, and is set to the time when you unpacked the archive. So, for example, it's not unusual to have a file with a modification time of July 2003, but with a creation time of March 2007 --- despite how nonsensical that may seem.

Given the general unreliability of these two parameters, it didn't seem worthwhile to spend the time to implement a more sophisticated set of user-interfaces to work with them. I realize that in some specialized circumstances they can be useful, but for anything smaller than a large corporate network server, they tend to be pretty useless.

But like I said, SpaceMonger does collect the data: It simply doesn't report it in the current version. If you can drum up enough demand, these can be included in a future release, but right now, the argument for adding them doesn't seem to me to be very strong.
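
The archive example above is easy to reproduce. What follows is only a minimal Python sketch, not anything taken from SpaceMonger itself: the function names file_times and modified_before_created are made up for illustration, and it assumes os.stat reports the timestamps the way the file system stores them. Note that st_ctime has historically meant creation time on Windows, while on Unix it is the time of the last metadata change.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_times(path):
    """Return the three timestamps os.stat reports for a file, as UTC datetimes."""
    st = os.stat(path)

    def to_utc(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    return {
        "modified": to_utc(st.st_mtime),
        "accessed": to_utc(st.st_atime),
        # Historically the creation time on Windows; on Unix this is the
        # time of the last metadata change, not the creation time.
        "created_or_changed": to_utc(st.st_ctime),
    }


def modified_before_created(path):
    """True when the modification time is older than st_ctime."""
    st = os.stat(path)
    return st.st_mtime < st.st_ctime


if __name__ == "__main__":
    # Simulate unpacking an archive: the file is created now, then its
    # modification time is set back to the date stored in the archive.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        path = handle.name
    archived = datetime(2003, 7, 1, tzinfo=timezone.utc).timestamp()
    os.utime(path, (archived, archived))  # sets (atime, mtime)
    print(file_times(path))
    print("modified before created:", modified_before_created(path))
    os.remove(path)
```

A file that reports a modification time older than its creation time is not corrupt; it has most likely just been unpacked or copied, which is why the creation time is a weak basis for pruning.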
seanw
Administrator
 
Posts: 773
Joined: Mon Oct 10, 2005 2:58 pm
Location: Pennsylvania, USA

Post by ppass on Thu Feb 28, 2008 9:33 pm

I am personally not very interested in "Last accessed date", sorry.

However, I wish that SpaceMonger could show many more columns (see my separate post, 'Add "Author" field to the file selection list').

I use explorer2 lite as a file explorer, and it can display whatever columns you like. Among the interesting columns, I usually use:
- author (for Office documents)
- duration (for music files)
- date taken, camera type (for photographs)


I must admit that displaying those columns does slow down the software a bit, but it remains reasonably fast for normal use.


I tend to view SpaceMonger as a visual file explorer for the whole drive, with advanced search features. So I wish that I could access the same fields with SpaceMonger as is possible with explorer2 lite.

Ideally, I would love to be able to make such a query inside SpaceMonger:

"Show me all photos taken between 2007/1/1 and 2007/3/1 with 'Canon' in the camera type".
or
"Show me all Office documents with author 'Spacemonger'"
ppass
 
Posts: 224
Joined: Sun Feb 05, 2006 3:05 am

Post by seanw on Mon Mar 03, 2008 4:20 pm

ppass wrote: I tend to view SpaceMonger as a visual file explorer for the whole drive, with advanced search features. So I wish that I could access the same fields with SpaceMonger as is possible with explorer2 lite.

Ideally, I would love to be able to make such a query inside SpaceMonger:

"Show me all photos taken between 2007/1/1 and 2007/3/1 with 'Canon' in the camera type".
or
"Show me all Office documents with author 'Spacemonger'"

Well, the question, then, is how slow is acceptably slow?

SM can only display information it knows, which makes sense. So, for example, during the initial scan of the disk, SM has to collect (at a minimum) file sizes, which are necessary to render the treemap. Certain other attributes that are useful (and that are easy to collect) are also collected at the same time: Modification date, for example.

However, there is a division in Windows itself between attributes that are "easy" to collect and attributes that are "difficult." "Easy" attributes are stored as part of the directory information itself: For example, it costs no extra time during the scan to collect the file's size and modification date, because those are provided by the OS along with the filename. However, the "difficult" attributes are stored externally: They are either stored in alternate streams, or as part of the file data itself.
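
To make the "easy" side of that division concrete, here is a rough Python sketch of a scan that collects only what the directory listing provides. It is an illustration under assumptions, not SpaceMonger's scanner: fast_scan is a hypothetical name, and the sketch relies on os.scandir, whose entries on Windows carry size and modification time from the directory listing itself, so entry.stat() needs no extra system call for ordinary files. On Unix one stat call per file is still needed, but no file is ever opened.

```python
import os


def fast_scan(root):
    """Walk a tree collecting only attributes that need no file to be opened.

    Returns a list of (path, size_in_bytes, modification_time) tuples.
    """
    records = []
    pending = [root]
    while pending:
        current = pending.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        pending.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        # Size and mtime come from directory metadata;
                        # the file's contents are never read.
                        st = entry.stat(follow_symlinks=False)
                        records.append((entry.path, st.st_size, st.st_mtime))
        except OSError:
            # Unreadable directory: skip it rather than abort the scan.
            continue
    return records


if __name__ == "__main__":
    for path, size, mtime in fast_scan(".")[:5]:
        print(size, path)
```

Anything beyond these fields, such as an author or camera type, is not in the directory entry at all, so a loop like this cannot supply it without opening each file.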

This means that to simply extract "camera type" from every image during the scan would cut the speed of the scan to about 10% of what it currently is --- maybe slower. (Why? Every file that might contain "camera type" --- or any other nonstandard attribute --- has to be opened (slow), searched through (slow), and fed to each possible Explorer extension that might provide these kinds of columns based on the file data (very, very, VERY slow). In contrast, right now, SpaceMonger never opens the file or accesses its data until you request a file preview, and contains a number of optimizations to even avoid moving the disk head during the scan if necessary.)

Alternatively, SpaceMonger could choose to not collect this information during the initial scan, and could collect it only when you do a search or display the file in the selection bar. That would push the "slow" part to a later stage of the program, so the initial scan would be fast, but complex searches --- like "find by camera type" --- could be incredibly slow, and displaying this information in the selection bar might take some time as well (but you're used to it appearing after a delay in other programs and in the Windows Explorer, I suspect, so that's not that unusual).
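
The deferred approach described here could be sketched roughly as follows. This is a hypothetical Python illustration, not SpaceMonger's design: LazyAttributeIndex and first_line are invented names, and first_line merely stands in for a genuinely expensive extractor (an EXIF reader, say) by reading the first line of a text file. The point is only that the slow work happens on the first search and is cached afterwards.

```python
class LazyAttributeIndex:
    """Extract an expensive attribute only on demand, caching the result."""

    def __init__(self, paths, extractor):
        self._paths = list(paths)
        self._extractor = extractor  # slow: must open and read the file
        self._cache = {}
        self.files_opened = 0  # counts how often the slow path ran

    def get(self, path):
        if path not in self._cache:
            self.files_opened += 1
            try:
                self._cache[path] = self._extractor(path)
            except OSError:
                self._cache[path] = None  # unreadable: remember the failure
        return self._cache[path]

    def search(self, predicate):
        """Return paths whose attribute satisfies predicate (slow the first time)."""
        matches = []
        for path in self._paths:
            value = self.get(path)
            if value is not None and predicate(value):
                matches.append(path)
        return matches


def first_line(path):
    """Stand-in extractor: treat the first line of a file as its attribute."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return handle.readline().strip()
```

With this shape, building the index costs nothing, the first call such as index.search(lambda value: "Canon" in value) opens every file once, and repeated searches are answered from the cache. The trade-off is the one already noted: the first complex search pays the full price.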

So how slow is acceptable? I don't think it's worth cutting the speed of the scan to a tiny fraction just to collect data that most people aren't likely to use. While it's nice to be able to perform searches through the extended attributes, I don't consider massive cuts in performance to be a justifiable tradeoff for the ability.
seanw

Post by ppass on Wed Mar 05, 2008 12:19 am

seanw wrote: This means that to simply extract "camera type" from every image during the scan would cut the speed of the scan to about 10% of what it currently is --- maybe slower.

Increasing the scan time by *only* 10% looks impressive. I thought that opening each file, reading nonstandard attributes, and closing the file would multiply the scan time by a factor of at least 3 or 4.

Still, you are right, you should not penalize the user who is happy with the current attributes (modified date, etc) with a longer scan time. I totally agree with you, and I agree with your idea of postponing the scanning of those attributes to when the user launches a search on a specific attribute. This is fine.

How slow is acceptable? Well, I guess that it should take less than 30 seconds on a reasonably powered computer (1 GB RAM) with roughly 100,000 files in total.

Side question: could you make it possible to define a user-slice related to those attributes? Example: files with author field = "ppass". In this case, obviously, you will have to scan the attributes during the 1st scan.

Note: when you have scanned those attributes, could you make them available as a column inside the file selection list?
ppass
 

Post by seanw on Thu Mar 06, 2008 8:28 am

ppass wrote:
seanw wrote: This means that to simply extract "camera type" from every image during the scan would cut the speed of the scan to about 10% of what it currently is --- maybe slower.

Increasing the scan time by *only* 10% looks impressive. I thought that opening each file, reading non standard attributes and closing the file would multiply the scan time by a factor of at least 3 or 4.

I think you misunderstood what I wrote, probably because I didn't explain it well: What I meant was that the efficiency of scanning a single file could drop to 10% or less --- each file could take 10 times longer to scan, or more. The additional head-seek time alone to read a single attribute would probably raise the scan time to 2 or 3 times what it currently is. So a scan that currently takes 3 minutes could take a half hour, or more. I don't consider that acceptable performance for the marginal benefit gained.
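
The two readings of "10%" give very different numbers, which a few lines of Python arithmetic make plain. The function names here are purely illustrative, and the 3-minute baseline is the one used in the paragraph above.

```python
def time_at_reduced_speed(baseline_minutes, speed_percent):
    """Scan time when the scanner runs at speed_percent of its normal speed."""
    return baseline_minutes * 100 / speed_percent


def time_with_overhead(baseline_minutes, extra_percent):
    """Scan time when extra_percent more time is added to the baseline."""
    return baseline_minutes * (100 + extra_percent) / 100


if __name__ == "__main__":
    baseline = 3  # minutes, the current scan time in the example
    print("speed cut to 10%:", time_at_reduced_speed(baseline, 10), "minutes")
    print("time raised by 10%:", time_with_overhead(baseline, 10), "minutes")
```

Running at 10% of the current speed turns the 3-minute scan into a half hour, whereas adding 10% to the scan time would only bring it to a little over 3 minutes.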

ppass wrote: Side question: could you make it possible to define a user-slice related to those attributes? Example: files with author field = "ppass". In this case, obviously, you will have to scan the attributes during the 1st scan.

Note: when you have scanned those attributes, could you make them available as a column inside the file selection list?

Again, the same performance and memory issues apply. Your "one minute" of scan time to scan 100,000 files is likely to be more like "five minutes" or "ten minutes" if you have that information enabled.

_______________________________


That said, there are ways to mitigate this problem for the common case of installing SpaceMonger on a single computer (and not on a network), and I'm considering those solutions for the distant design of SM 3.0, but given that SM 2.2 isn't even out yet, that's still a long way off.
seanw

