
I want to create a list of ascii/text/xml files for one folder, recursively. I know I can use the find command to list all the files and the file command to check each file's type, but I am not sure how to combine the two to produce a list of only text/ascii/xml files, excluding data, binary, and other non-text types.

Please guide me through.

Regards,
Suhas

Simple search: all files in the current directory, no recursion.

Code:

file *|grep text|grep -v OpenDocument

This lists, on STDOUT, every file that file describes as some kind of text: ascii, xml, html, shell script, ascii English text (like ".csv" files).
The "-v" option tells grep to print only the lines that do not match, i.e. "not these". So, for example, if you don't want the English text or the UTF-8 files, you could add some more pipes and exclusions.
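
As a rough illustration of chaining more exclusions, here is the same one-liner with two extra "grep -v" stages. The function name list_plain_text_here is made up for this example, and the exact strings file prints ("English text", "UTF-8") vary between versions of file, so treat the patterns as assumptions to adjust.

```shell
# Sketch only: the one-liner above, wrapped in a (hypothetical) function,
# with two more exclusions added. Works on the current directory, no recursion.
list_plain_text_here() {
    file * | grep text | grep -v OpenDocument | grep -v 'English text' | grep -v 'UTF-8'
}
```

Each extra "grep -v" simply drops the lines whose description contains that string.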

You know that "find" will recurse from wherever you tell it. So you could use find to create a temporary list (in /tmp, for example) and then iterate through that list in a simple shell script.
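
A minimal sketch of what such a script might contain, not the poster's own code. The function name list_text_files and the use of mktemp for the temporary list are illustrative choices, and it assumes file names contain no newlines, since the list is read line by line.

```shell
#!/bin/sh
# Sketch: list text-like files under a directory, recursively.
# list_text_files is a hypothetical name; file names with newlines are not handled.
list_text_files() {
    dir=${1:-.}                        # start directory, default: current
    list=$(mktemp) || return 1         # temporary list of every regular file

    find "$dir" -type f > "$list"

    # Check each listed file and keep those that file describes as text.
    while IFS= read -r f; do
        case $(file -b "$f") in
            *OpenDocument*) ;;                     # same exclusion as the one-liner
            *text*) printf '%s\n' "$f" ;;
        esac
    done < "$list"

    rm -f "$list"
}
```

To run it as a standalone script, add a final line that calls list_text_files "$@" so the script prints the list for whatever directory it is given.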

Said shell script could then be given a name, e.g. homework.sh, and then you could execute the script and redirect its standard output to a file, which would then contain your results.

Code:

./homework.sh >results

BTW - Some people view "Geek" as a derogatory term. Perhaps it would be better to address your requests to "Hi helpful people". Or if you felt that too presumptuous, you could put "possibly" in parentheses before the word "helpful".
dave

4) file is, in my experience, notoriously unreliable when it comes to detecting text files. It often mistakenly detects simple text as being, say, a lisp script. It has something to do with the way it parses the beginning of the file to determine its "magic" type. At the very least, try using the -i option and parse the mime-type info instead.

5) Instead of using grep, try generating a simple list of file names, then loop through the list, testing the output of file for each one directly. Perhaps something like this:
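
One possible shape for that loop, offered as a hedged sketch and not as the original poster's code. It assumes a version of file that supports --mime-type (the GNU/Linux one does); the function name list_text_by_mime and the set of MIME patterns treated as "text" are my own assumptions and may need adjusting.

```shell
# Sketch: print regular files under a directory whose MIME type looks textual.
# list_text_by_mime is a hypothetical name. find hands the file names straight
# to the inner shell, so names with spaces or newlines are handled safely.
list_text_by_mime() {
    find "${1:-.}" -type f -exec sh -c '
        for f in "$@"; do
            # Test the output of file for each name directly, no grep needed.
            case $(file -b --mime-type "$f") in
                text/*|application/xml|*+xml) printf "%s\n" "$f" ;;
            esac
        done
    ' sh {} +
}
```

Calling list_text_by_mime /some/folder >results would then write the matching paths to a file.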

P.S. I don't think you'll find many people here who see "geek" as being derogatory. Indeed, "geek pride" is on the rise, as subjects like gaming and comics that were once nerdy are now mainstream, and it's becoming clearer to everyone that those who really know the tech are the ones who hold the reins.