Keywords within your code

An introduction to our language...

The interpreter you will be writing will accept a rudimentary form of code. Statements consist of several tokens, separated by a space. The first token always identifies the command that is going to be used:

COMMAND parameter1 parameter2

Some examples:

MSG Hello!
ADD 1 2

The basic program skeleton

Start up Visual Basic and, on a new form, create a large multiline text box called txtCode. This is the area where code is typed in, ready to be executed. Below the text box, add a list box called out. This will serve as a console to which we can output execution progress and any debug info.

Now, let's add a few global variables:

' For execution
Dim LineNum As Integer
Dim Code() As String
Dim Tokens() As String

The first is a variable to keep track of which line of code we're currently reading. The next two are arrays that store a) every line of code and b) the tokens in the line currently being read.

Now create a subroutine called Parse() and a Subroutine called MainPass(). The code for each is below:


' Parses a complete program
Sub Parse()

MainPass

End Sub

' Performs a pass through the code executing each statement
Sub MainPass()

' Holds the line currently being processed
Dim CodeString As String

' Reset counter
LineNum = 0

' Load code into array in 1-line chunks
Code = Split(txtCode.Text, vbCrLf)

' Display each line
While LineNum <= UBound(Code)
    CodeString = Code(LineNum)
    out.AddItem CodeString
    LineNum = LineNum + 1
    DoEvents
Wend

End Sub

The subroutine Parse simply calls MainPass, which splits the code from the text box into one-line chunks. The While loop then copies each line into the list box.

Now create a button, give it the caption "Run" and make its Click event call Parse. Type some code spread over multiple lines into the text box and click the button. Your code should be duplicated in the list box.
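
As a rough sketch, the Click handler could look like the following. The name cmdRun is an assumption (the tutorial does not name the button), and clearing the list box first is an optional extra so that output from an earlier run does not pile up:

```vb
' cmdRun is an assumed name for the Run button
Private Sub cmdRun_Click()

' Optional: clear output left over from a previous run
out.Clear

' Run the code currently in txtCode
Parse

End Sub
```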

Tokenising

The next step is to split each line of code into its component parts: a command, followed by any number of parameters. Each part is separated by a space, so we can split each line into individual tokens using the same method as above. Note how I'm splitting this up into functions and subroutines to keep the code as tidy and readable as possible:

(Add a new global declaration, Dim TokCount As Integer. You've already declared the array Tokens() above.)

' Splits a line of code into its component parts
Sub Tokenise(strCode As String)

Tokens = Split(strCode, " ")
TokCount = 0

End Sub

' Gets the next token
Function Token() As String

'Exit if there are no more tokens
If TokCount > UBound(Tokens) Then Exit Function

' Return next token
Token = Tokens(TokCount)

' Increment count
TokCount = TokCount + 1

End Function

Now that we have the functions to tokenise a string and retrieve the next token, let's modify MainPass to retrieve the first token (the keyword) and output it:

' Performs a pass through the code executing each statement
Sub MainPass()

Dim CodeString As String
Dim NextToken As String

' Reset counter
LineNum = 0

' Load code into array in 1-line chunks
Code = Split(txtCode.Text, vbCrLf)

' Show the first token of each line
While LineNum <= UBound(Code)
    CodeString = Code(LineNum)
    
    ' Tokenise
    Tokenise CodeString
    
    ' Get the first token (the keyword)
    NextToken = Token()
    
    out.AddItem NextToken
    LineNum = LineNum + 1
    DoEvents
Wend

End Sub

With that in place, run the app and type a few lines, making sure to have a mix of lines with several words and some with just one word. The program should then output the FIRST word of each line.
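
To see every token rather than just the first, a small debugging helper along these lines may be useful. DumpTokens is a hypothetical name, not part of the tutorial's project. It reads the Tokens array directly rather than calling Token(), because Token() returns an empty string both when the tokens run out and when two spaces in a row produce an empty token:

```vb
' Hypothetical debugging helper: lists every token of one line in "out"
Sub DumpTokens(strCode As String)

Dim i As Integer

' Fills the global Tokens array and resets TokCount
Tokenise strCode

' Tokens is zero-based; an empty line gives an empty array, so the loop is skipped
For i = 0 To UBound(Tokens)
    out.AddItem i & ": " & Tokens(i)
Next i

End Sub
```

Calling DumpTokens "ADD 1 2" should add three entries to the list box, one per token. Note that it overwrites the global Tokens array, so it is best used on its own rather than in the middle of MainPass.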

Acting on commands

It should now be relatively straightforward to work out how to react to the commands we give the interpreter. To keep it simple, I'll add two commands to our program: a message box command and a command to terminate the program. Modify MainPass as follows:

' Performs a pass through the code executing each statement
Sub MainPass()

Dim CodeString As String
Dim Key As String

' Reset counter
LineNum = 0

' Load code into array in 1-line chunks
Code = Split(txtCode.Text, vbCrLf)

' Execute each line
While LineNum <= UBound(Code)
    CodeString = Code(LineNum)
    
    ' Tokenise
    Tokenise CodeString
    
    ' Get the keyword, lower-cased so commands are case-insensitive
    Key = LCase(Token())
    
    ' Do something
    Select Case Key
    
        ' Act on the messagebox command
        Case "msg"
            MsgBox "Hello!"
        
        ' Act on the end command by jumping to the last line
        Case "end."
            LineNum = UBound(Code)
        
    End Select
    
    LineNum = LineNum + 1
    DoEvents
Wend

End Sub

Now run the program again, giving the program the following input:

msg

You should see a message box appear. Now type:

msg
end.

The same should happen. Now type:

msg
end.
msg

You should only get one message. The end command moves LineNum to the last line of the code, so once the counter is incremented the interpreter leaves the While loop and execution stops.
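
The Select Case block silently ignores any keyword it does not recognise. Since the list box is meant to act as a console for debug info, one option is to report such lines there. The helper below is only a sketch; ReportUnknown is a hypothetical name, and it relies on the global LineNum and the list box out from earlier:

```vb
' Hypothetical helper: logs an unrecognised command to the "out" list box
Sub ReportUnknown(Key As String)

' Blank lines produce an empty keyword; ignore them
If Key = "" Then Exit Sub

' LineNum is zero-based, so add 1 for display
out.AddItem "Line " & (LineNum + 1) & ": unknown command '" & Key & "'"

End Sub
```

To use it, add a Case Else branch just before End Select in MainPass and call ReportUnknown Key from there.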

A word about parameters

Hopefully it is now clear how to get the interpreter to recognise and use parameters. Let's modify the MSG command to read one word and display it:

        ' Act on the messagebox command
        Case "msg"
            MsgBox Token()

And that's it!

While I'm not worrying about intricacies now, if you want the MSG command to recognise the rest of the line and say it, here's the modification:

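    ' Output the rest of the line
    ' (declare FinalOut and Phr As String at the top of MainPass)
    Case "msg"
        FinalOut = ""
        Do
            Phr = Token()
            ' An empty string means there are no more tokens
            If Phr = "" Then Exit Do
            FinalOut = FinalOut & " " & Phr
        Loop
        ' Trim removes the leading space added before the first word
        MsgBox Trim(FinalOut)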
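
One limitation of the loop above is that it stops at the first empty token, so a double space in the middle of a message cuts it short. A possible alternative, sketched here under the assumption that the global Tokens and TokCount are used exactly as described earlier, is a helper that walks the remainder of the array directly. RestOfLine is a hypothetical name:

```vb
' Hypothetical helper: returns all remaining tokens joined by single spaces
Function RestOfLine() As String

Dim Result As String
Dim i As Integer

' Walk from the current token position to the end of the array
For i = TokCount To UBound(Tokens)
    ' Put a space between tokens, but not before the first one
    If i > TokCount Then Result = Result & " "
    Result = Result & Tokens(i)
Next i

' Mark every token as consumed
TokCount = UBound(Tokens) + 1

RestOfLine = Result

End Function
```

With this in place the msg branch could be reduced to MsgBox RestOfLine().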

That's the end of this tutorial. Next time I will introduce memory addresses to store data, as well as variables.

Download project files

The project files can be downloaded here