In order to compare various languages, I will be implementing a anagram search program in a variety of languages and testing and analyzing the resulting code. This entry describes the VB.NET entrant into that discussion. A naive anagram search program will typically have dismal performance, because the “obvious” brute force approach of looking at every word and comparing it to every other word is a dismal one. The more accepted algorithm is to use a Dictionary or Hashmap, and map lists of words to their sorted versions, so ‘mead’ and ‘made’ both sort to ‘adem’ and will be present in a list in the Hashmap with that key. A psuedocode representation:

Loop through all words. For each word:

create a new word by sorting the letters of the word.

use the sorted word as a key into a dictionary/hashmap structure, which indexes Lists. Add this word (the normal unsorted one) to that list.

When complete, anagrams are stored in the hashmap structure indexed by their sorted version. Each value is a list, and those that contain more than one element are word anagrams of one another.

The basic idea is to have a program that isn’t too simple, but also not something too complicated. I’ll also admit that I chose a problem that could easily leverage the functional constructs of many languages. Each entry in this series will cover an implementation in a different language and an analysis of such.

The Visual Basic.NET implementation I came up with looks like this:

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

Imports System.IO

Imports System.Text

Imports System.Collections.Generic

Module anagrams

'Anagrams Application

'Written By:Michael Burgwin forhttp://bc-programming/blogs

'entry into Anagrams Performance/language comparisons.

Public Function SortChars(sumstring As String) As String

Dim sortme = sumstring.ToList()

Dim sorted = From f In sumstring Order By Char.ToLower(f)

Return New String(sorted.ToArray())

End Function

Sub Main(Args As String())

Dim Anagrams As New Dictionary(Of String, List(Of String))

Dim usefilename As String = "D:\dict.txt"

Dim StartTime = Now

Using fDictionary As New StreamReader(New FileStream(usefilename, FileMode.Open))

Do While Not fDictionary.EndOfStream

Dim LineRead = fDictionary.ReadLine().ToLower()

'get sorted version.

Dim SortedVersion=SortChars(LineRead)

IfNotAnagrams.ContainsKey(SortedVersion)Then

Anagrams.Add(SortedVersion,NewList(Of String))

EndIf

Anagrams(SortedVersion).Add(LineRead)

Loop

EndUsing

Dim Duration=Now-StartTime

Console.WriteLine("Calculation Time:"&Duration.ToString())

Console.ReadKey()

EndSub

EndModule

Performance? Compiling this with vbc /optimize /debug- resulted in an average run-time of around 1.3 seconds. That’s a bit over twice as slow as C#. So, some C# pundits might go, “Ha! Well that proves C# is faster than C#, doesn’t it, Case closed we can all go home”

Not Exactly.

The slowdown, I suspect, is from the use of Linq:

1

2

3

4

5

PublicFunctionSortChars(sumstring AsString)AsString

Dim sortme=sumstring.ToList()

Dim sorted=FromfInsumstring Order Byf

ReturnNewString(sorted.ToArray())

EndFunction

This is different from the C# implementation of the same method, which uses the Sort() method of List. Let’s ‘port’ that C# implementation to VB.NET and see how that affects the performance:

1

2

3

4

5

PublicFunctionSortChars(sumString AsString)AsString

Dim sortme=sumString.ToList()

sortme.Sort(Function(ca,cb)ca.CompareTo(cb))

ReturnNewString(sortme.ToArray())

EndFunction

It now runs in 1.1 seconds. But it turns out this is partly due to a bug in the C# implementation, since it didn’t account for mixed case comparisons. The VB.NET implementation I have lowercases each string input. the C# version uses an iterator method; even changing the source to take account of casing, it still runs in 0.7 seconds, so we have some ground to make up. Let’s start by re-implementing the iterator method from the C# version. Here is the new implementation. it doesn’t actually add an iterator Function because the function in C# just returned an IEnumerable anyway, so we just have a similar function and use that, but it gives the same effect:

The Execution Time of this version hovers around .62 seconds- .2 seconds faster than the C# implementation, which is fairly surprising. Obviously, this doesn’t mean that VB.NET is faster than C# either; just means that this particular implementation is faster. It also Highlights something I did not expect, which is that the language used really does affect speed; not because one language is innately faster, but because you write code differently in different languages and they compile to different IL in many cases. I was expecting nearly zero difference between C# and VB.NET, which is why I didn’t originally create a VB.NET implementation- and yet here we have it performing .2 seconds faster than the C# version with what appears to be equivalent code.