Stop word removal and stemming

I would like to know how to write a program that removes stop words and performs stemming on a given input file, e.g. the minutes of a meeting. I'm new to Java, so I don't have any ideas on how to program this in Java.

Re: Stop word removal and stemming

I'm doing a project in opinion mining. The basic idea is to identify human behavior based on the interactions in a meeting, e.g. proposing an idea, commenting, acknowledging. I plan to give the minutes of a few meetings as input, and in the first module I have to preprocess the data using stop word removal, stemming and POS tagging. Say, for example, "This is a short sentence":
This DT
is VBZ
a DT
short JJ
sentence NN
I want to use POS tagging mainly to identify the names of people in the meetings. I hope that gives you an idea of what I'm trying to explain.

Re: Stop word removal and stemming

See the Scanner class for methods that can be used to read data from a single file.
If you want to read all the files in a folder, you need to use the File class to get a list of the files in the folder and then use the Scanner class to read each file in the list.

Re: Stop word removal and stemming

Sir, what I want is to read information from different word documents that are stored individually in a folder. E.g. the folder XYZ will have 4 word documents: a.txt, b.txt, c.txt, d.txt. Can you please help me write code so I can read from this folder?

Re: Stop word removal and stemming

The File class has methods that return a list of the files in a folder. You can use that list to read the files in the folder one at a time.

The steps are:
get a list of the files in the folder
begin loop
get next file in the list
read data from that file
end loop
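
The steps above can be sketched in Java roughly as follows. This is only a minimal sketch: it assumes the documents are plain-text files (as the .txt names suggest), and the class name FolderReader and the method readAllLines are made up for illustration. The folder name XYZ is taken from the question.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class; the name is an assumption, not from the thread.
class FolderReader {

    // Reads every line of every regular file directly inside 'folder'.
    static List<String> readAllLines(File folder) throws FileNotFoundException {
        List<String> lines = new ArrayList<>();

        // get a list of the files in the folder
        File[] files = folder.listFiles();
        if (files == null) {
            // listFiles() returns null if the path is not a folder or cannot be read
            return lines;
        }
        Arrays.sort(files); // listFiles() does not guarantee any particular order

        // begin loop: get next file in the list
        for (File file : files) {
            if (!file.isFile()) {
                continue; // skip sub-folders
            }
            // read data from that file, one line at a time
            try (Scanner in = new Scanner(file)) {
                while (in.hasNextLine()) {
                    lines.add(in.nextLine());
                }
            }
        } // end loop
        return lines;
    }

    public static void main(String[] args) throws FileNotFoundException {
        // Uses "XYZ" unless another folder path is given as the first argument
        File folder = new File(args.length > 0 ? args[0] : "XYZ");
        for (String line : readAllLines(folder)) {
            System.out.println(line);
        }
    }
}
```

If the folder cannot be found, the sketch simply prints nothing, so an empty result is worth checking against the folder's absolute path.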

OK sir. Can you code it in Java?
I'm also getting an exception on this line: BufferedReader br = new BufferedReader(new FileReader("stopwords.txt"));
The exception is: java.io.FileNotFoundException: stopwords.txt (The system cannot find the file specified). I'm using NetBeans. Where do I save the txt file?

Re: Stop word removal and stemming

The system cannot find the file

The program cannot find the file in the location where it is looking. To find out where that is, create a File object for the file you are trying to read and print that File object's absolute path. The printed path shows exactly where the program is looking for the file.
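
A minimal sketch of that check, assuming the file name stopwords.txt from the exception message. The class name WhereIsMyFile and the method describe are made up for illustration.

```java
import java.io.File;

// Hypothetical helper class; the name is an assumption, not from the thread.
class WhereIsMyFile {

    // Returns the absolute path a (possibly relative) file name resolves to,
    // plus a note on whether a file actually exists there.
    static String describe(String name) {
        File file = new File(name);
        return file.getAbsolutePath() + (file.exists() ? " (exists)" : " (not found)");
    }

    public static void main(String[] args) {
        // "stopwords.txt" is the file name from the exception message
        System.out.println(describe("stopwords.txt"));
        // Relative names are resolved against the working directory
        System.out.println("Working directory: " + System.getProperty("user.dir"));
    }
}
```

Moving the file to the printed location, or passing the file's full path to FileReader, should make the exception go away.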

I don't know what your IDE does with the location of files when a program tries to read a file.

Re: Stop word removal and stemming

I do not know how to code in Java. Can someone help me read the data from a folder instead of using this hard-coded sentence:
for(String word : Split.words("The rain in spain falls mainly on the plains, except when it's not exactly working that way! And I need+some= way. How~ will \"(this)\" \"work\"?"))
    System.out.println(word);
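
One way to take the words from a file rather than from a hard-coded sentence, and drop the stop words along the way, might look like the sketch below. Several things here are assumptions: stopwords.txt is taken to hold one stop word per line, the input is plain text, and a simple split on non-letter characters stands in for Split.words, whose code is not shown in this thread. The class name StopWordFilter and its methods are made up for illustration, and stemming is not covered.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;
import java.util.Set;

// Hypothetical helper class; the name is an assumption, not from the thread.
class StopWordFilter {

    // Loads stop words from a plain-text file, assuming one word per line.
    static Set<String> loadStopWords(File stopWordFile) throws FileNotFoundException {
        Set<String> stopWords = new HashSet<>();
        try (Scanner in = new Scanner(stopWordFile)) {
            while (in.hasNextLine()) {
                String word = in.nextLine().trim().toLowerCase(Locale.ROOT);
                if (!word.isEmpty()) {
                    stopWords.add(word);
                }
            }
        }
        return stopWords;
    }

    // Returns the words of 'input' that are not stop words, in their original order.
    static List<String> removeStopWords(File input, Set<String> stopWords)
            throws FileNotFoundException {
        List<String> kept = new ArrayList<>();
        try (Scanner in = new Scanner(input)) {
            while (in.hasNextLine()) {
                // Split on anything that is not a letter or an apostrophe
                // (a stand-in for Split.words, which is not shown in the thread)
                for (String word : in.nextLine().split("[^\\p{L}']+")) {
                    if (!word.isEmpty() && !stopWords.contains(word.toLowerCase(Locale.ROOT))) {
                        kept.add(word);
                    }
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) throws FileNotFoundException {
        Set<String> stopWords = loadStopWords(new File("stopwords.txt"));
        // "XYZ/a.txt" is one of the files named earlier in the thread
        for (String word : removeStopWords(new File("XYZ/a.txt"), stopWords)) {
            System.out.println(word);
        }
    }
}
```

The same removeStopWords call can be made once for each file in a folder; the comparison is case-insensitive, so "The" is dropped when "the" is in the stop word list.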