
need advice on cleaning out unused files in a web server...

i am on the verge of making some (drastic) changes to my company's webpage, which was built before i got here... i have replaced everything i want to with new, updated content, and now i have the go-ahead to go through and destroy everything old and unneeded. aside from some *.swf buttons, is there a way i could map out every link and path to every file currently in use, redirect (">") that list into a file, then "grep" the output of "ls" for all other files not in the map, so that i can delete them? i know dreamweaver (which i use) has a feature called a "link checker". would that produce a sufficient map of every file my web page needs to function?

i guess the first step is finding a reliable way to map out the "used" and "essential" files recursively from the web root. then i can figure out a way to filter out the leftovers.
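For the filtering step you're describing, one possible sketch (the directory name, file names, and `used.txt` map are all invented here) is to list every file under the web root, sort both lists, and let `comm` pick out the files that are on disk but not in the map:

```shell
# Invented demo layout -- in real life, "demo" would be your web root and
# used.txt would come from your link checker / mirroring pass.
mkdir -p demo/img
touch demo/index.html demo/img/logo.gif demo/old-page.html
printf '%s\n' 'index.html' 'img/logo.gif' > used.txt

# List every file under the web root, relative to it.
( cd demo && find . -type f | sed 's|^\./||' ) | sort > all.txt
sort used.txt > used.sorted.txt

# comm -23: keep only lines that appear in all.txt but not in the used map.
comm -23 all.txt used.sorted.txt > unused.txt

cat unused.txt
```

Review `unused.txt` by hand before deleting anything; `comm` needs both inputs sorted, or it will silently give wrong answers.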

Maybe wget can help. It's designed to copy pages and the files they link to, so if you wget your site out of the current directory (skipping dead links), you should be able to clean house and then copy the files back in. If you try this, experiment first to make sure it works for you.

Wget can follow links in HTML and XHTML pages and create local versions of remote web sites, fully recreating the directory structure of the original site. This is sometimes referred to as "recursive downloading."

When you say you want to "destroy everything old, and unneeded", I take that to mean everything that is no longer linked from anything you want to keep. So I was thinking you could wget your home page; if that's the top of everything you want to keep, everything reachable from it should get copied. I've only used wget for downloading from the internet, so it may be that it can only pull content from a running web server, not from a local directory. But after you've pulled the good stuff off your web server, you can move it back (post-housekeeping) by a normal copying method. If all of your links are relative (and wget can make them so), that should do it. If you have external links, make sure wget doesn't wander outside of your domain.
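Roughly, the invocation I have in mind looks like the one below. The domain and target directory are placeholders, not your real site, and since actually running it would hit the network, the sketch just builds and prints the command:

```shell
# Placeholders: example.com and site-in-use/ are made up.
# --mirror           recursive download, infinite depth, with timestamping
# --convert-links    rewrite links in saved pages so they work locally
# --page-requisites  also grab inline files a page needs (images, css, .swf)
# --no-parent        never ascend above the starting directory
# --domains          stay inside this domain even if pages link elsewhere
cmd='wget --mirror --convert-links --page-requisites --no-parent --domains example.com --directory-prefix=site-in-use http://example.com/'
echo "$cmd"
```

The `--page-requisites` flag matters for things like flash buttons embedded in a page rather than linked with `<a href>`.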

There are plenty of ways to screw up downloads, as I've learned, but with a little effort I think it might do what you need. As for the flash buttons: if you mean files linked from your web directory, then yes, I think those will be copied the same as HTML pages (if they're embedded rather than linked, wget's --page-requisites option should pick them up). I've been wrong before, though.
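As a cross-check on whatever map the link checker or wget gives you, here's a crude sketch (the sample page and filenames are invented) that greps the `src`/`href` attributes out of a page to see what it actually references, including any .swf files:

```shell
# Invented sample page; a real run would loop over every *.html in the site.
cat > page.html <<'EOF'
<html><body>
<a href="about.html">About</a>
<img src="img/logo.gif">
<embed src="buttons/home.swf">
</body></html>
EOF

# Pull out every href="..." or src="..." value. Naive on purpose: it assumes
# double quotes and well-formed attributes; a real HTML parser is more robust.
grep -Eo '(href|src)="[^"]*"' page.html \
  | sed 's/.*="\(.*\)"/\1/' \
  | sort > used.txt

cat used.txt
```

Anything that shows up on disk but never in a list like this is a candidate for deletion, subject to the usual "keep a backup first" rule.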