I want to download a file (only download if the url ends with a certain string), which URL is in a MySQL database, extract the text and write the text to the database, by replacing the url.
And this have to be done for every row (approx. 25k rows).

Is there a way to do this with a bash script (or something else e.g Java)?

The MySQL server runs on my linux root, I have full access to it. My desktop OS is also linux (Linux Mint with many customizations).

What did you try so far and where exactly did you get stuck? Can you query the DB? Can you save the URLs to a file? Can you download the target? Can you insert data from a text file in the DB?
–
MarcoSep 28 '12 at 16:37

I'm confused as to what is it that you are trying to replace. Text in the URL or text in the file?
–
KarlsonSep 28 '12 at 16:39

I want to replace the url in the db with the content of the file.
–
LeandrosSep 28 '12 at 16:41

There are a number of problems with your script but the most important one is that you are blindly trusting data from a remote website. Even a simple unescaped '; in the txt could update ALL records in the database. and much worse could be done if the remote data was designed to be malicious rather than just accidental. I strongly recommend you look into using Perl and DBI or Python and SQLAlchemy or similar for database work rather than trying to DIY in bash. In particular, you want to use placeholders for all variables being passed to SQL select/insert/update/etc queries.
–
casSep 29 '12 at 9:18

as an added bonus, both perl and python have good web robot libraries to help automate downloading and extracting data from remote URLs.
–
casSep 29 '12 at 9:20

1 Answer
1

To be safe, you have to look at the generated, if code can be injected.

In this command mysql -B -ss -uuser -ppassword -e "USE table; SELECT column FROM table WHERE pid='$i';", you know that i is a number between 1 and 5 so it's ok.

But in that command mysql -B -ss -uuser -ppassword -e "USE table; UPDATE table SET column = '${line:6}' WHERE pid = '$i';", you cannot be sure of ${line:6}; you can imagine it contains a single quote, in this case you have to replace by two single quotes, you can also check the length is less than the length of the column in database (let's suppose here 255) or truncate the data, before replacing quotes.