This repository has been archived by the owner on Aug 17, 2023. It is now read-only.

shell, tokenizer, parser and interpreter for a custom programming language


ficnawode/very-basic-BASIC

Overview

Shell, tokenizer, parser and interpreter for a custom programming language, loosely inspired by BASIC. More of a proof of concept than a tool of any sort.

Setup

The Makefile uses clang, but the code also compiles with gcc: set the CC variable in the Makefile to gcc instead of clang. To compile the project, navigate to the project's root directory and run make:

make

Then, to enter the shell (the only supported mode at the moment; file input is not supported yet), run:

./bin/exe

How it works

Since this is a relatively big project for me, I first built it to work on simple expressions and then expanded its capabilities step by step. The program consists of a lexer, a parser and an interpreter.

The lexer takes a stream of characters as input and outputs a stream of tokens. The tokens then go into the parser, a hand-built top-down recursive descent parser with error handling, which in turn outputs an AST (abstract syntax tree). Lastly, the interpreter visits the nodes of the AST and executes them left to right.
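To make the lexer stage concrete, here is a minimal sketch of a tokenizer in C. It is not the project's actual code: the token kinds, the Token struct and the function names next_token and tokenize are all assumptions made for illustration, and only numbers, parentheses and a few arithmetic operators are recognised.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; the project's real token set is not shown. */
typedef enum {
    TOK_INT, TOK_FLOAT, TOK_OP, TOK_LPAREN, TOK_RPAREN, TOK_END, TOK_ERROR
} TokenKind;

typedef struct {
    TokenKind kind;
    const char *start; /* first character of the lexeme inside the input */
    size_t length;     /* number of characters in the lexeme */
} Token;

/* Reads one token starting at *cursor and advances *cursor past it. */
static Token next_token(const char **cursor) {
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    while (isspace((unsigned char)*p)) p++;

    Token tok = { TOK_ERROR, p, 1 };
    if (*p == '\0' || *p == '#') {      /* '#' ends the input in the shell */
        tok.kind = TOK_END;
        tok.length = 0;
        *cursor = p;                    /* stay on the terminator */
        return tok;
    }
    if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_INT;
        while (isdigit((unsigned char)*p)) p++;
        if (*p == '.' && isdigit((unsigned char)p[1])) {
            tok.kind = TOK_FLOAT;       /* a fractional part makes it a float */
            p++;
            while (isdigit((unsigned char)*p)) p++;
        }
        tok.length = (size_t)(p - tok.start);
    } else {
        if (*p == '(') tok.kind = TOK_LPAREN;
        else if (*p == ')') tok.kind = TOK_RPAREN;
        else if (strchr("+-*/%", *p)) tok.kind = TOK_OP;
        p++;                            /* unknown characters stay TOK_ERROR */
    }
    *cursor = p;
    return tok;
}

/* Fills out[] with up to max tokens (including the final TOK_END)
 * and returns how many were written. */
static size_t tokenize(const char *src, Token *out, size_t max) {
    size_t n = 0;
    while (n < max) {
        out[n] = next_token(&src);
        if (out[n++].kind == TOK_END) break;
    }
    return n;
}
```

Storing a pointer and a length instead of a copied string keeps the sketch free of allocation; a parser would read the lexeme back from the original input when it needs the value.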

The output of every program is either a number, being the return value of the last expression executed, or a null character if the last operation's return type is null (a function declaration, for instance).

Quick disclaimer: the hash symbol ('#') signifies end of file for the shell. It tells the program when to stop reading user input.

Arithmetic

I started by adding some arithmetic expressions and making sure the precedence was all right.

Screenshot 2022-02-06 at 18 16 32

Added some parentheses.

Screenshot 2022-02-06 at 18 18 03

Also works with the unary minus operator.

Screenshot 2022-02-06 at 18 22 18
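The precedence, parentheses and unary minus handling described above can be sketched with a small recursive descent evaluator in C. This is an illustration under assumptions, not the project's parser: it works on integers only, evaluates while parsing instead of building an AST, and the grammar, the Parser struct and the evaluate function are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Grammar assumed for this sketch (one function per precedence level):
 *   expr   := term   (('+' | '-') term)*
 *   term   := factor (('*' | '/') factor)*
 *   factor := '-' factor | '(' expr ')' | INTEGER
 */
typedef struct {
    const char *pos; /* next unread character */
    int error;       /* set to 1 on a syntax error or division by zero */
} Parser;

static long parse_expr(Parser *ps);

static void skip_spaces(Parser *ps) {
    while (isspace((unsigned char)*ps->pos)) ps->pos++;
}

static long parse_factor(Parser *ps) {
    skip_spaces(ps);
    if (*ps->pos == '-') {              /* unary minus binds tighter than * and / */
        ps->pos++;
        return -parse_factor(ps);
    }
    if (*ps->pos == '(') {
        ps->pos++;
        long inner = parse_expr(ps);
        skip_spaces(ps);
        if (*ps->pos == ')') ps->pos++;
        else ps->error = 1;             /* missing closing parenthesis */
        return inner;
    }
    if (!isdigit((unsigned char)*ps->pos)) {
        ps->error = 1;                  /* expected a number */
        return 0;
    }
    long value = 0;
    while (isdigit((unsigned char)*ps->pos)) value = value * 10 + (*ps->pos++ - '0');
    return value;
}

static long parse_term(Parser *ps) {
    long left = parse_factor(ps);
    for (;;) {
        skip_spaces(ps);
        char op = *ps->pos;
        if (op != '*' && op != '/') return left;
        ps->pos++;
        long right = parse_factor(ps);
        if (op == '*') left *= right;
        else if (right != 0) left /= right;
        else ps->error = 1;             /* division by zero */
    }
}

static long parse_expr(Parser *ps) {
    long left = parse_term(ps);
    for (;;) {
        skip_spaces(ps);
        char op = *ps->pos;
        if (op != '+' && op != '-') return left;
        ps->pos++;
        long right = parse_term(ps);
        left = (op == '+') ? left + right : left - right;
    }
}

/* Evaluates src. Returns 0 and stores the result in *out, or -1 on error. */
static int evaluate(const char *src, long *out) {
    assert(src != NULL && out != NULL);
    Parser ps = { src, 0 };
    long value = parse_expr(&ps);
    skip_spaces(&ps);
    if (*ps.pos != '\0' && *ps.pos != '#') ps.error = 1; /* trailing input */
    if (ps.error) return -1;
    *out = value;
    return 0;
}
```

Because parse_expr only ever calls parse_term, and parse_term only calls parse_factor, multiplication and division are grouped before addition and subtraction without any explicit precedence table.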

I then added some logical operators.

Screenshot 2022-02-06 at 18 19 28

Note: true and false are built-in keywords, signifying 1 and 0 respectively.

And with those logical operators came the comparison operators. Those return either 1 or 0 so they can be used in expressions, like so:

Screenshot 2022-02-06 at 18 24 59

Typing

Everything within the language is an integer or a float. The interpreter can convert between them (one way only) and can tell them apart when it needs to. For instance:

Screenshot 2022-02-06 at 18 31 18

Note, however, that integer division is still possible (and is the default):

Screenshot 2022-02-06 at 18 32 07

And an error will be thrown upon encountering a float in a modulo operation:

Screenshot 2022-02-06 at 18 34 49
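One way to model these typing rules is a tagged union. The sketch below is only an illustration: the Value type and the helper names are hypothetical, and it assumes that the one-way conversion goes from integer to float and that dividing by a float zero is also reported as an error, neither of which is confirmed above.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { VAL_INT, VAL_FLOAT } ValueKind;

/* A runtime value that remembers whether it is an integer or a float. */
typedef struct {
    ValueKind kind;
    union { long i; double f; } as;
} Value;

static Value make_int(long i)     { Value v; v.kind = VAL_INT;   v.as.i = i; return v; }
static Value make_float(double f) { Value v; v.kind = VAL_FLOAT; v.as.f = f; return v; }

/* Assumed one-way conversion: an integer can become a float, never the reverse. */
static double to_double(Value v) {
    return v.kind == VAL_INT ? (double)v.as.i : v.as.f;
}

/* Division: stays an integer division when both operands are integers,
 * otherwise both sides are promoted to float.
 * Returns 0 on success, -1 on division by zero. */
static int value_div(Value a, Value b, Value *out) {
    assert(out != NULL);
    if (a.kind == VAL_INT && b.kind == VAL_INT) {
        if (b.as.i == 0) return -1;
        *out = make_int(a.as.i / b.as.i);
        return 0;
    }
    double divisor = to_double(b);
    if (divisor == 0.0) return -1;
    *out = make_float(to_double(a) / divisor);
    return 0;
}

/* Modulo: only defined for two integers; a float operand is an error.
 * Returns 0 on success, -1 on error. */
static int value_mod(Value a, Value b, Value *out) {
    assert(out != NULL);
    if (a.kind != VAL_INT || b.kind != VAL_INT) return -1;
    if (b.as.i == 0) return -1;
    *out = make_int(a.as.i % b.as.i);
    return 0;
}
```

With this layout the interpreter never needs to guess a type: every operation inspects the kind tags first and decides whether to stay in integers, promote, or fail.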

Variables

A variable is declared with the var keyword and accessed via the variable name. All variables are mutable and must be initialized at declaration.

Screenshot 2022-02-06 at 18 38 06

Variables are stored at runtime in variable tables, each of which belongs to a given context, meaning a variable within a function cannot be accessed outside of that function. The same goes for if-else statements and loops (more on that later).
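The idea of per-context variable tables can be sketched as follows. This is not the project's data structure: the Context struct, its fixed-size arrays and the function names are hypothetical, values are stored as doubles for brevity, and the sketch assumes that an inner context can read variables from its enclosing contexts, which the description above does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_VARS 32
#define MAX_NAME 32

/* One variable table per context (global scope, function body, loop body...). */
typedef struct Context {
    struct Context *parent;           /* enclosing context, NULL for the global one */
    char names[MAX_VARS][MAX_NAME];
    double values[MAX_VARS];
    size_t count;
} Context;

static void context_init(Context *ctx, Context *parent) {
    assert(ctx != NULL);
    ctx->parent = parent;
    ctx->count = 0;
}

/* Declares (or overwrites) a variable in this context only.
 * Returns 0 on success, -1 if the table is full or the name is too long. */
static int context_declare(Context *ctx, const char *name, double value) {
    assert(ctx != NULL && name != NULL);
    for (size_t i = 0; i < ctx->count; i++) {
        if (strcmp(ctx->names[i], name) == 0) {
            ctx->values[i] = value;
            return 0;
        }
    }
    if (ctx->count == MAX_VARS || strlen(name) >= MAX_NAME) return -1;
    strcpy(ctx->names[ctx->count], name);
    ctx->values[ctx->count++] = value;
    return 0;
}

/* Looks a name up in this context, then in each enclosing one (assumption).
 * Returns 0 and stores the value in *out if found, -1 otherwise. */
static int context_lookup(const Context *ctx, const char *name, double *out) {
    assert(name != NULL && out != NULL);
    for (; ctx != NULL; ctx = ctx->parent) {
        for (size_t i = 0; i < ctx->count; i++) {
            if (strcmp(ctx->names[i], name) == 0) {
                *out = ctx->values[i];
                return 0;
            }
        }
    }
    return -1;
}
```

Because a lookup only walks towards the parent, a variable declared in a function's context is invisible from the global context, which matches the scoping behaviour described above.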

If-else statements

An if-statement starts with the if keyword, followed by a condition, the then keyword, and then some statements in brackets (the brackets are optional if the entire if-statement is inline). An elif branch (also followed by the then keyword) and an else branch can come next.

Screenshot 2022-02-06 at 18 45 35

An inline if-statement can be used without brackets, permitting only one expression after each 'then' keyword.

Loops

Only one type of loop is available: the while loop. It must be supplied with a condition and a statement to execute while that condition is true.

Screenshot 2022-02-06 at 18 49 40

Functions

A function is declared with the func keyword, followed by the function name, the arguments in parentheses, an arrow, and some statements in braces.

Screenshot 2022-02-06 at 18 52 23

Error management

The error management system is divided into two main parts: the warning section and the error section. Warnings show up without stopping the program, while errors result in relaunching the shell, effectively never finishing the task which led to the error. Each stage of interpreting has its own error and warning types. Below are some common examples.

Lexer warning

Screenshot 2022-02-06 at 17 28 26

Lexer error

Screenshot 2022-02-06 at 17 46 17

Parser error

Screenshot 2022-02-06 at 17 48 45

Runtime error, traceback

In this example, we get a runtime division by zero error.

Screenshot 2022-02-06 at 17 52 25

The traceback feature gets more sophisticated than that though.

Screenshot 2022-02-06 at 18 07 38
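The split between warnings and errors described in this section could be modelled roughly like this. Everything here is an assumption made for illustration: the Stage and Severity enums, the Diagnostics counters, the message format and the run_division helper are hypothetical, and the traceback itself is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef enum { STAGE_LEXER, STAGE_PARSER, STAGE_RUNTIME } Stage;
typedef enum { SEV_WARNING, SEV_ERROR } Severity;

/* Counts what has been reported while processing one input. */
typedef struct {
    int warnings;
    int errors;
} Diagnostics;

static const char *stage_name(Stage stage) {
    switch (stage) {
    case STAGE_LEXER:  return "Lexer";
    case STAGE_PARSER: return "Parser";
    default:           return "Runtime";
    }
}

/* Prints the message and records it.
 * Returns 1 if processing may continue (warning), 0 if it must stop (error). */
static int report(Diagnostics *d, Stage stage, Severity sev, const char *msg) {
    assert(d != NULL && msg != NULL);
    fprintf(stderr, "%s %s: %s\n", stage_name(stage),
            sev == SEV_WARNING ? "warning" : "error", msg);
    if (sev == SEV_WARNING) {
        d->warnings++;
        return 1;
    }
    d->errors++;
    return 0;
}

/* Hypothetical runtime step: integer division with a division-by-zero check.
 * Returns 0 on success; on -1 the caller abandons the current input and
 * goes back to the shell prompt. */
static int run_division(Diagnostics *d, long a, long b, long *out) {
    assert(out != NULL);
    if (b == 0) {
        report(d, STAGE_RUNTIME, SEV_ERROR, "division by zero");
        return -1;
    }
    *out = a / b;
    return 0;
}
```

Tagging each report with its stage keeps lexer, parser and runtime problems distinguishable while sharing one reporting path.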

And that's pretty much the whole thing. Feel free to play around with it, though I consider the project to be over, since the code would require (and isn't worth) a rewrite for any substantial improvement. It's served its educational/entertainment purpose; now I'm moving on to another one, hopefully a better one this time.
