A Search: Best Practices for Massive Amounts of Mysterious Images?

By: sr178@duke.edu

Here’s a scenario that I imagine is not entirely uncommon: Users have been putting photos onto various shared drives for years. There are probably a few thousand images, many without clear labels, spread across a few different folder systems, and none of them tagged. So, how do you tackle this?

I’ve had discussions with a few people on how best to handle this. Here is some of what was suggested:
1) Leave the archive as a wild no-man’s land. Focus on making an easy system for organizing images moving forward.
2) Cherry-pick the most useful-seeming folders, organize them, forget the rest (a friend’s advice)
3) Run all public images through Picasa, tag moving forward, and tag past images when convenient with a bit of assistance from the facial recognition (my current tentative plan)
4) Bite the bullet and manually go through all the images
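
Whichever option wins, it may help to know what is actually on the drives first. Below is a minimal Python sketch of that kind of inventory, not something anyone in these discussions suggested: it counts images per folder (useful for cherry-picking in option 2) and groups byte-identical copies (less to go through in option 4). The function names, the extension list, and the choice of SHA-256 are all my own assumptions, and it only catches exact duplicates, not resized or re-saved versions.

```python
import hashlib
from collections import Counter, defaultdict
from pathlib import Path

# Assumed list of extensions to treat as images; adjust for your drives.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff", ".bmp"}


def find_images(root):
    """Yield every file under root whose extension looks like an image."""
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            yield path


def count_by_folder(root):
    """Return a Counter mapping each folder to the number of images directly in it."""
    return Counter(path.parent for path in find_images(root))


def find_exact_duplicates(root):
    """Group byte-identical images by SHA-256 digest; return groups of 2+ files."""
    by_digest = defaultdict(list)
    for path in find_images(root):
        # Reads each file fully into memory; fine for photos, slow on huge drives.
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        by_digest[digest].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling `count_by_folder(some_root).most_common(20)` on a shared drive's top-level folder would list the twenty folders holding the most images, which seems like a reasonable place to start deciding what is worth organizing.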

Was just wondering if anyone at Duke had dealt with this successfully, and/or had suggestions for tools and best practices for working through this. (Note: This was previously posted to WebCom, though I feel some folks here might have suggestions as well.) Thoughts?

Categories: DDMC Info

One comment

  1. I have well over 47,000 images in my home collection. Don’t ask. 🙂 It was like a brilliant light shining upon me when I switched to Google’s Picasa. It’s free, it can be run locally, and it has amazing facial recognition software built in that made tagging my family super easy. Loading your images into it might be a solution for you if identifying people in your images is important. It also has an experimental “search by color” feature, which is interesting. I think the web-based version also allows you to integrate with Google Image Search, which will find visually similar photos to make tagging easier…
