Here's one of my most recent scripts. It uses the Win32::Registry module to read values and data from the registry, pulling stored information about the current state of your system, including Windows install information.

I'm trying my best to understand what you're trying to do here. As I read it, you want to save a list of all the links to a file (the value of the "href" attribute, meaning you want href="THIS"), along with the text from inside the title attribute (title="THIS"). Or do you want both the title and the alt attributes?

Here's my latest version of the script anyway. Right now the decode subroutine is just there and doesn't serve a purpose; it's a placeholder in case I find a use for it later.

What this does is add separators and write the links line by line to an output file. The next time the script runs, it appends to the data already in the file (which is why we need separators). The separators also show which link the internalHTML data was derived from, to make things easier.
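To make the append-with-separators idea concrete, here's a minimal sketch. The helper name append_links, the separator layout, and the "Source:" label are my own assumptions for illustration, not the exact ones from the script.

```perl
use strict;
use warnings;

# Hypothetical helper: append one separator block plus its links to the output file.
sub append_links {
    my ($outfile, $source_url, @links) = @_;

    # '>>' opens the file for appending, so data from earlier runs is kept.
    open(my $fh, '>>', $outfile) or die "Cannot open $outfile: $!";

    # The separator block records which page the links below came from.
    print {$fh} "=" x 40, "\n";
    print {$fh} "Source: $source_url\n";
    print {$fh} "=" x 40, "\n";

    # One link per line.
    print {$fh} "$_\n" for @links;

    close($fh) or die "Cannot close $outfile: $!";
    return scalar @links;
}
```

Calling it as append_links('links.txt', $url, @links) on each run would keep stacking new blocks under the old ones (links.txt is a placeholder file name).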

For this one I decided to experiment with a different regex that I modified myself. It matches on http instead of http:// because there MAY (or may not) be an https:// link, and it won't catch that if I specify the full thing. So it looks for http and captures the data AFTER it, and when we write to the file I add "http" back to the beginning so the full link is displayed. That way, if it finds a secure (https) link, it captures s://www.somelink.somedomain, and adding http to the beginning gives the full link, https://www.somelink.somedomain.
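Here's a rough sketch of that match-on-http approach. The pattern below is an assumption on my part (it isn't the exact regex from the script), and extract_http_links is a made-up helper name. It stops the capture at a quote, whitespace, or angle bracket.

```perl
use strict;
use warnings;

# Hypothetical helper: return every http/https link found in a chunk of HTML.
sub extract_http_links {
    my ($html) = @_;
    my @links;

    # Match the literal "http", then capture everything up to the next quote,
    # whitespace, or angle bracket. For an https link the capture starts with "s://".
    while ($html =~ /http([^"'\s<>]+)/g) {
        push @links, "http" . $1;    # put the "http" prefix back on
    }
    return @links;
}

# Sample markup, made up for the example.
my $sample = '<a href="http://www.example.com/a">A</a> '
           . '<a href="https://www.example.com/b">B</a>';
print "$_\n" for extract_http_links($sample);
```

Both the plain and the secure link come back in full, since the "s" of https is just part of the captured text.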

LWP::Simple is the module we're using here to get the source of the webpage, via its built-in get() function. If you'd rather print the full source straight to the console, you can use the getprint() function instead.

Instead of having to modify the script every time for a different link, I can probably change it for you to take the $url from the command-line arguments (@ARGV) when the script is run.
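Something like this would do it. This is only a sketch: url_from_args is a hypothetical helper, and the default URL is a placeholder rather than the one in the script.

```perl
use strict;
use warnings;

# Hypothetical helper: use the first argument as the URL, or fall back to a default.
sub url_from_args {
    my ($default, @args) = @_;
    return @args ? $args[0] : $default;
}

# @ARGV holds whatever was typed after the script name on the command line.
my $url = url_from_args('http://www.example.com/', @ARGV);
print "Fetching: $url\n";
```

Then you'd run it as perl script.pl http://www.somelink.somedomain (script.pl being whatever you named the file), and with no argument it falls back to the default.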

Edit: Not sure what the heck is wrong with my regex though. Some matches run past the closing " and into another HTML element, stopping at all sorts of different places.

I personally didn't like using /href=\"(.*)\"/ at all because you can get a link like href="#", and that's completely useless.
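The overshoot past the closing " is most likely down to (.*) being greedy: it grabs as much as it can and only stops at the LAST quote on the line. A possible fix, sketched below, is to capture [^"]* instead, which can't cross a quote, and to skip fragment-only links like "#". The helper name extract_hrefs and the sample markup are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: collect href values, ignoring empty and "#"-style links.
sub extract_hrefs {
    my ($html) = @_;
    my @hrefs;

    # [^"]* cannot cross a double quote, so each capture ends at the closing quote
    # of its own attribute instead of running into the next element.
    while ($html =~ /href="([^"]*)"/g) {
        my $link = $1;
        next if $link eq '' or $link =~ /^#/;    # skip useless in-page links
        push @hrefs, $link;
    }
    return @hrefs;
}

# Two links on one line: exactly the case where a greedy (.*) overshoots.
my $sample = '<a href="#" title="top">Top</a> <a href="page.html" title="next">Next</a>';
print "$_\n" for extract_hrefs($sample);
```

Note this only handles double-quoted attributes; single-quoted or unquoted href values would need a different pattern or a proper HTML parser.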