If this is your first visit, be sure to
check out the FAQ by clicking the
link above. You may have to register
before you can post: click the register link above to proceed. To start viewing messages,
select the forum that you want to visit from the selection below.

Syntax Analyzer

I'm wanting to create an interpreter for the language brain****(mask picks it up but the last for letters rhymes with duck) and I'm wanting to make this as efficient as I can. Right now I am on the syntax analysis phase, I skipped the lexical analysis because there really is no need for it considering the language is so small and the syntax will never change. This is what I currently have:

Code:

Private Function Parse(ByVal source As String) As XDocument
Dim parentNode As New XElement("brain****")
Dim commandNode As XElement = Nothing
Dim index As Integer = 0
If index < source.Length AndAlso source(index) = "]"c Then
'If the source code starts with a closed bracket, inform the user of the syntax error
Throw New SyntaxErrorException("The source code cannot start with a closed bracket.")
Else
'Loop while there is still source code to parse
Do While index < source.Length
'Parse the current source code
commandNode = ParseCommand(source, index)
If commandNode IsNot Nothing Then
If commandNode.Name.LocalName = "command" OrElse commandNode.Value.Length > 0 Then
'Only add commands or non-empty loops
parentNode.Add(commandNode)
End If
Else
'If the current lexeme is a ']' then inform the user of the syntax error
Throw New SyntaxErrorException("Bracket mismatch. There are more closed brackets then there are open brackets.")
End If
Loop
End If
Return New XDocument(New XDeclaration("1.0", "utf-8", "yes"), parentNode)
End Function
CONST nonLoopCommands As String = ",.<>-+"
Private Function ParseCommand(ByVal source As String, ByRef index As Integer) As XElement
Dim terminalCommand As XElement = Nothing
Dim lexeme As Char = source(index)
If lexeme = "["c Then
'If the current lexeme is a '[' then start to parse the <loop> terminal and consume the lexeme
index += 1
If index < source.Length Then
lexeme = source(index)
terminalCommand = New XElement("loop")
If lexeme = "]"c Then
'If the current lexeme is a ']' then we have a blank <loop> terminal, consume the lexeme
index += 1
Else
'If the current lexeme is not a ']' then continue parsing the <loop> terminal
Dim terminalCode As XElement = Nothing
'Loop while there is still source left to parse and while something was returned
Do
'Set the current value of the <command> terminal
terminalCode = ParseCommand(source, index)
'If the returning <command> terminal is not a blank loop then...
If terminalCode IsNot Nothing AndAlso terminalCode.Value.Length > 0 Then
'Add it to the <loop> terminal
terminalCommand.Add(terminalCode)
End If
Loop While index < source.Length AndAlso terminalCode IsNot Nothing
'If the last parsed terminalCode was succesfully parsed then inform the program of the syntax error
If terminalCode IsNot Nothing Then
Throw New SyntaxErrorException("Bracket mismatch. There are more open brackets then there are closed brackets.")
End If
End If
Else
'Inform the programmer of the syntax error
Throw New SyntaxErrorException("The source code cannot end with an open bracket.")
End If
ElseIf lexeme = "]"c Then
'If the current lexeme is a ']' then simply consume the lexeme
index += 1
ElseIf nonLoopCommands.IndexOf(lexeme) <> -1 Then
'If the current lexeme is any of the non-loop commands(,.<>-+) then create a new <command> terminal with the lexeme as the value and consume the lexeme
terminalCommand = New XElement("command", lexeme)
index += 1
Else
'If all else fails then inform the user of the syntax error
Throw New SyntaxErrorException(String.Format("'{0}' is an invalid command symbol.", lexeme))
End If
Return terminalCommand
End Function
Private Function FormatSource(ByVal source As String) As String
'Define the commands to keep
Dim commands As String = nonLoopCommands & "[]"
'Declare something to hold the source code to be returned
Dim builder As New Text.StringBuilder(source.Length - 1)
'Loop through each character in the source code
For index As Integer = 0 To source.Length - 1
Dim lexeme As Char = source(index)
'Check if the current character is one of the commands to keep
If commands.IndexOf(lexeme) <> -1 Then
'If so then append it to the builder
builder.Append(lexeme)
End If
Next
Return builder.ToString()
End Function

It is a recursive descent parser that returns an XDocument to represent the abstract syntax tree. Is there anything that y'all notice that I could make this more efficient?

My next phase is optimization where I intend on remove repeating nodes and adding an attribute on the repeating node indicating how many times it repeated. For example:

Code:

>++++++++[<+++++++++>-]

Would build a parse tree with 9 command nodes followed by a loop node with 12 command nodes. Where as the optimization tree would have 2 command nodes followed by a loop node with 4 command nodes, basically transforming it to:

Re: Syntax Analyzer

So, I was able to reduce the size of the code, although I'm unsure if this makes the code more efficient or not. Instead of a recursive descent parser, because of the simplicity of brainf**k's syntax, I was able to roll out a simple brute force parser:

Code:

Private Const commands As String = ",.<>-+"
Private Const openLoop As Char = "["c
Private Const closeLoop As Char = "]"c
Private Function Parse(ByVal source As String) As XElement
Dim parent As New XElement("brain****")
Dim currentParent As XElement = parent
For Each command As Char In source
If commands.IndexOf(command) <> -1 Then
currentParent.Add(New XElement("command", command))
ElseIf command = openLoop Then
Dim loopCommand As XElement = New XElement("loop")
currentParent.Add(loopCommand)
currentParent = loopCommand
ElseIf command = closeLoop AndAlso currentParent.Parent IsNot Nothing Then
currentParent = currentParent.Parent
ElseIf command = closeLoop Then
Throw New SyntaxErrorException("Bracket Mismatch. There are more close brackets than opened.")
Else
Throw New SyntaxErrorException("Invalid Command. '{0}' is an invalid brain**** command lexeme.")
End If
Next
If currentParent IsNot parent Then
Throw New SyntaxErrorException("Bracket Mismatch. There are more open brackets than closed.")
End If
Return parent
End Function