Remove duplicate words from String in Java example

This example shows how to remove duplicate words from a String in Java. It covers several approaches: a step-by-step version using LinkedHashSet, a one-line shortcut, and a Java 8 version.

How to remove duplicate words from String in Java?

Consider the following string value.

    String str = "The first second was alright but the second second was tough.";

The following example shows how to remove duplicate words from the string.

    package com.javacodeexamples.stringexamples;

    import java.util.Arrays;
    import java.util.LinkedHashSet;

    public class RemoveDuplicateWordsStringExample {

        public static void main(String[] args) {

            String str =
                "The first second was alright but the second second was tough.";

            System.out.println("Original String: ");
            System.out.println(str);

            /*
             * Since the words are separated by space,
             * we will split the string by one or more space
             */
            String[] strWords = str.split("\\s+");

            //convert String array to LinkedHashSet to remove duplicates
            LinkedHashSet<String> lhSetWords
                = new LinkedHashSet<String>(Arrays.asList(strWords));

            //join the words again by space
            StringBuilder sbTemp = new StringBuilder();
            int index = 0;

            for (String s : lhSetWords) {

                if (index > 0)
                    sbTemp.append(" ");

                sbTemp.append(s);
                index++;
            }

            str = sbTemp.toString();

            System.out.println("String after removing duplicate words: ");
            System.out.println(str);
        }
    }

Output

    Original String:
    The first second was alright but the second second was tough.
    String after removing duplicate words:
    The first second was alright but the tough.

Since the words in our string are separated by spaces, we first split the string on one or more whitespace characters. Once we had all the words in a String array, we converted the array to a LinkedHashSet using the asList method of the Arrays class. Since a Set does not allow duplicate elements, the duplicate words were not added to the LinkedHashSet. Once we had all the unique words in the LinkedHashSet, we joined them with spaces to create a string without duplicate words. Note that the comparison is case sensitive, which is why both "The" and "the" remain in the output, and that punctuation stays attached to its word, so "tough." is treated as a single word.
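The steps described above can also be packaged as a reusable method. The following is only a sketch: the class name RemoveDuplicateWordsHelper and the method name removeDuplicateWords are hypothetical and not part of the original example. It assumes Java 8 or later, because it uses String.join in place of the StringBuilder loop, and it trims the input first so that leading whitespace does not produce an empty first word.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class RemoveDuplicateWordsHelper {

    // Hypothetical helper, not part of the original example.
    static String removeDuplicateWords(String text) {
        // trim first: split() on a string with leading whitespace
        // would otherwise return an empty string as the first element
        String[] words = text.trim().split("\\s+");

        // LinkedHashSet skips repeated words and keeps first-occurrence order
        Set<String> uniqueWords = new LinkedHashSet<>(Arrays.asList(words));

        // String.join (Java 8+) joins the set elements with a single space
        return String.join(" ", uniqueWords);
    }

    public static void main(String[] args) {
        String str = "The first second was alright but the second second was tough.";
        System.out.println(removeDuplicateWords(str));
    }
}
```

As in the longer program, the comparison is case sensitive, so "The" and "the" are kept as two different words.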

Now that we have seen the longer, descriptive version of the program, it is time for the shortcut. The core of the program above can be rewritten in just one line.

The one-line version creates the LinkedHashSet directly from the List of words obtained by splitting the string on whitespace. Instead of iterating over the Set, it uses the toString method, which returns the set elements in "[e1, e2, e3, ..]" format. Since "[", "]" and "," are not needed, they are replaced with an empty string.
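The one-line statement itself is not reproduced in the text above, so the following is a sketch reconstructed from the description. The wrapper class RemoveDuplicateWordsOneLiner, the method name, and the exact regular expression passed to replaceAll are assumptions.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;

public class RemoveDuplicateWordsOneLiner {

    // Hypothetical wrapper method around the one-line statement
    static String removeDuplicateWords(String str) {
        // split -> LinkedHashSet -> "[w1, w2, ...]" -> remove '[', ']' and ','
        return new LinkedHashSet<String>(Arrays.asList(str.split("\\s+")))
                .toString().replaceAll("[\\[\\],]", "");
    }

    public static void main(String[] args) {
        String str = "The first second was alright but the second second was tough.";
        System.out.println(removeDuplicateWords(str));
    }
}
```

One caveat of this shortcut: the replacement also removes any commas or square brackets that belong to the words themselves, so it is best suited to text that does not contain those characters.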

How to remove duplicate words from String using Java 8?

If you are using Java 8 or later, you can remove the duplicate words in just one line of code.
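The Java 8 statement is not shown in the text above. One way to do it, offered here as a sketch rather than as the original author's code, is to stream the words, drop repeats with distinct(), and join the rest with Collectors.joining. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class RemoveDuplicateWordsJava8 {

    // Hypothetical wrapper method around the stream-based one-liner
    static String removeDuplicateWords(String str) {
        // distinct() keeps the first occurrence of each word in encounter order
        return Arrays.stream(str.split("\\s+"))
                .distinct()
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String str = "The first second was alright but the second second was tough.";
        System.out.println(removeDuplicateWords(str));
    }
}
```

Because a stream created from an array is ordered, distinct() preserves the original word order, which gives the same result as the LinkedHashSet approach.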

Final words: Why did we use LinkedHashSet and not HashSet? The simple reason is that HashSet does not maintain the order of its elements. If we had used a HashSet to store the words, we could not guarantee the order of the words in the sentence once we joined them back with spaces. LinkedHashSet maintains insertion order, so after removing the duplicates we got the words in the same order.
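To see the difference, the same words can be collected into both set types and joined. This is a small illustrative sketch: the class SetOrderComparison and the method dedupe are hypothetical names, and String.join requires Java 8 or later. No output is listed for the HashSet case because its iteration order is not specified.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class SetOrderComparison {

    // Hypothetical helper: adds all words to the given (empty) set and joins them
    static String dedupe(String text, Set<String> set) {
        set.addAll(Arrays.asList(text.split("\\s+")));
        return String.join(" ", set);
    }

    public static void main(String[] args) {
        String str = "The first second was alright but the second second was tough.";

        // insertion order is preserved
        System.out.println("LinkedHashSet: " + dedupe(str, new LinkedHashSet<>()));

        // order depends on hashing and may differ from the original sentence
        System.out.println("HashSet:       " + dedupe(str, new HashSet<>()));
    }
}
```

Both sets contain the same unique words; only the LinkedHashSet is guaranteed to return them in the order in which they appeared in the sentence.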

About the author

rahimv

rahimv has over 15 years of experience in designing and developing Java applications. His areas of expertise are J2EE and eCommerce.