florium
v1.0.5
This is a library that lets you make your own programming language that can be compiled to machine code!
Installing
Type this to install the package:
npm install florium

Tokenizing
Tokenizing is basically splitting your source code into pieces that mean something: integers, words, strings, comments, symbols and so on.
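To make the idea concrete, here is a tiny hand-rolled tokenizer in plain JavaScript. It is independent of florium; the token types and the regular expressions are just for illustration:

```javascript
// A toy tokenizer: split "a = 10;" into typed tokens.
// The token type names here are illustrative, not florium's.
function toyTokenize(code) {
  const rules = [
    { type: "integer", re: /^[0-9]+/ },
    { type: "word",    re: /^[a-zA-Z]+/ },
    { type: "symbol",  re: /^[=;()]/ },
    { type: "space",   re: /^\s+/ },
  ];
  const tokens = [];
  let rest = code;
  while (rest.length > 0) {
    const rule = rules.find(r => r.re.test(rest));
    if (!rule) throw new Error("Unexpected character: " + rest[0]);
    const value = rest.match(rule.re)[0];
    // skip whitespace, keep everything else
    if (rule.type !== "space") tokens.push({ type: rule.type, value });
    rest = rest.slice(value.length);
  }
  return tokens;
}

console.log(toyTokenize("a = 10;"));
// [ { type: "word", value: "a" }, { type: "symbol", value: "=" },
//   { type: "integer", value: "10" }, { type: "symbol", value: ";" } ]
```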
This is super easy to do because all these come out of the box!
Example:
import {Tokenizer, CharacterList, buildCommentTokenizer, buildSymbolTokenizer, buildRepeatingTokenizer, buildBasicRepeatingTokenizer, buildIgnorantTokenizer} from "florium";
const myTokenizer = new Tokenizer();
myTokenizer.add(
// "//" is the text that will start the comment and it will end with "\n"
buildCommentTokenizer({
"//": "\n"
}),
// introduce the symbolic tokens
buildSymbolTokenizer({
"=": "set",
"(": "open-parenthesis",
")": "close-parenthesis",
"[": "open-square-bracket",
"]": "close-square-bracket",
"{": "open-curly-brace",
"}": "close-curly-brace",
";": "semicolon",
"\n": "line"
}),
buildRepeatingTokenizer({
type: "string",
// so our strings can start with either ' or "
start: ["'", '"'],
// this means our string ends with the thing it started with
end: [".start"],
// this allows you to do things like this: " I can do this: \" "
escape: ["\\"],
// this means it will throw an error if the string isn't closed and the file is over
fileEndThrow: true
}),
// a basic definition of an integer is (anything from 0 to 9) repeating
buildBasicRepeatingTokenizer("integer", CharacterList.Integer),
// a basic definition of a word is (anything from a to z and A to Z) repeating
buildBasicRepeatingTokenizer("word", CharacterList.Word),
// these characters will be ignored
buildIgnorantTokenizer([
" ", "\t", "\r"
])
);

Using the tokenizer
You can use the tokenizer right away with the read function:
import {Tokenizer} from "florium";
const myTokenizer = new Tokenizer();
// assuming you added your tokenizers
const code = "...";
console.log(myTokenizer.read(code));

AST (Abstract Syntax Tree)
After tokenizing your code, you will want to convert the series of tokens into even more meaningful things like variable definitions, if statements, for loops, while loops, function definitions, function calls and so on.
For this, the library gives you a couple of options!
Let's make a syntax that handles variable definitions.
First, let's realise what variable definitions actually are:

As you can see, what is actually happening is that it follows these instructions:
- First, find a word token
- Then ensure there is an equals sign after it
- Then ensure there is at least one token after that which isn't a semicolon
- Then keep collecting tokens until a semicolon is hit
And that's it!
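The four steps above can be sketched in plain JavaScript, independent of florium (the token shapes are the illustrative ones, not florium's internal representation):

```javascript
// Match a variable definition: a word, an equals sign ("set"),
// one or more non-semicolon tokens, then a semicolon.
function matchVariableDefinition(tokens) {
  let i = 0;
  // 1. First, find a word token
  if (!tokens[i] || tokens[i].type !== "word") return null;
  const name = tokens[i++];
  // 2. Ensure there is an equals sign after it
  if (!tokens[i] || tokens[i].type !== "set") return null;
  i++;
  // 3 & 4. Collect tokens until a semicolon is hit
  const expression = [];
  while (tokens[i] && tokens[i].type !== "semicolon") expression.push(tokens[i++]);
  if (expression.length === 0) return null; // step 3: at least one token
  if (!tokens[i]) return null;              // never hit a semicolon
  return { name, expression };
}

const tokens = [
  { type: "word", value: "a" },
  { type: "set", value: "=" },
  { type: "integer", value: "10" },
  { type: "semicolon", value: ";" },
];
console.log(matchVariableDefinition(tokens));
// { name: { type: "word", value: "a" },
//   expression: [ { type: "integer", value: "10" } ] }
```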
Now that we know what actually is going on, let's code it!
Syntax Language
The four instructions explained above can be expressed like this in the syntax language:
label:name,type:word = label:expression,type:*,min:1,max:infinity ;

Okay, now let's see what this syntax means:
label:name,type:word
First, note that terms are separated by spaces, which is why this is the first term.
Second, note that each term's instructions are separated by commas.
So the current term's label is name and the type we expect is word! Great!
This means when the job is done, it will assign .name to this Token.
= and ;
I'm pretty sure these are quite clear: each one just expects that exact character.
label:expression,type:*,min:1,max:infinity
This part is probably the part where you went "Woah, wait a minute!".
Let's dive right into it, first as you can see it's labeled as the expression.
Second, as you can see the type is set to * which means anything can be here!
And lastly, min and max determine how many tokens can appear here: a minimum of 1 and a maximum of, well, infinity.
The more you know!
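To see how a term decomposes, here is a hypothetical parser for a single instruction term. The key names mirror the syntax language above, but the function itself is not part of florium:

```javascript
// Parse one term like "label:expression,type:*,min:1,max:infinity"
// into a plain object. Purely illustrative; not florium's parser.
function parseTerm(term) {
  const result = {};
  for (const instruction of term.split(",")) {
    const [key, value] = instruction.split(":");
    if (key === "min" || key === "max") {
      // min/max are numeric, with "infinity" as a special value
      result[key] = value === "infinity" ? Infinity : Number(value);
    } else {
      result[key] = value;
    }
  }
  return result;
}

console.log(parseTerm("label:expression,type:*,min:1,max:infinity"));
// { label: "expression", type: "*", min: 1, max: Infinity }
```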

You can negate the instruction term by putting a ! at the beginning of the
instruction term. Example:
type:word means it has to be a word, while !type:word means it can be anything except a word.
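Negation can be thought of as flipping a type check. A sketch of that logic (not florium's internals):

```javascript
// A term like "!type:word" matches any token whose type is NOT "word".
function typeMatches(term, token) {
  const negated = term.startsWith("!");
  const expected = (negated ? term.slice(1) : term).split(":")[1];
  const matches = token.type === expected;
  return negated ? !matches : matches;
}

console.log(typeMatches("type:word", { type: "word" }));     // true
console.log(typeMatches("!type:word", { type: "word" }));    // false
console.log(typeMatches("!type:word", { type: "integer" })); // true
```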
You can recursively call the AST:
- Why? Let's say you have a variable definition: a = 10 + (b = 10). You would expect it to run the instructions on the expression that sets the value of a, which is 10 + (b = 10).
- Well, you can do this by using job or jobAll (they are different).
- Usage: :word
There are aliases for everything! Here's a list:
- type:word = t:word = :word - Determines the type of the token
- value:anything! = v:anything! = anything! - Determines the value of the token
- end: = #: - This will pass if there are no tokens left in the current group
- space: = s: - This puts a space ( )
- colon: = d: - This puts a colon (:)
- comma: = c: - This puts a comma (,)
- line: = n: - This puts a line break (\n)
- min:10 = >:10 = >=:10 - Determines the minimum number of the current token there has to be in order to validate the current instruction
- max:10 = <:10 = <=:10 - Determines the maximum number of the current token there has to be in order to validate the current instruction
- label:anything = l:anything - Determines where the current instruction term will be saved in the result AST
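Internally, such an alias table can be a simple lookup applied before parsing each instruction key. This is a hypothetical sketch, not florium's actual code:

```javascript
// Map shorthand instruction keys to their canonical names.
const ALIASES = {
  t: "type",
  v: "value",
  "#": "end",
  s: "space",
  d: "colon",
  c: "comma",
  n: "line",
  ">": "min",
  ">=": "min",
  "<": "max",
  "<=": "max",
  l: "label",
};

// Unknown keys pass through unchanged.
function canonicalKey(key) {
  return ALIASES[key] ?? key;
}

console.log(canonicalKey("t"));     // "type"
console.log(canonicalKey(">="));    // "min"
console.log(canonicalKey("label")); // "label"
```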
Using the Syntax Language with a file
This option of the AST depends on a file. Let's first start by creating that file! Name it anything you want; it can be something like language.syntax
Every line of this file will be considered as a syntax of your language.
You can import your syntax file like this:
import {AST} from "florium";
AST.fromFile("./language.syntax");

Defining the AST definition
You can define your AST by using the @label key:
@label: MyAST

Adding instructions
You can add instructions like this:
@label: MyAST
echo: echo label:text,type:string;

Pushing instructions through ASTs
You can use the @push key to do this:
@label: MyEchoAST
echo: echo type:string;
# now it's switching to another AST
@label: MyNewAST
print: print ( type:string )
# this pushes all instructions of MyEchoAST to MyNewAST
@push: MyEchoAST

Using comments
If the line is empty or starts with the character #, it will be ignored.
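A sketch of that filtering rule in plain JavaScript (not florium's implementation):

```javascript
// Keep only lines that are non-empty and don't start with "#".
function significantLines(source) {
  return source
    .split("\n")
    .filter(line => line.trim() !== "" && !line.startsWith("#"));
}

const file = "# a comment\n\n@label: MyAST\necho: echo type:string;";
console.log(significantLines(file));
// [ "@label: MyAST", "echo: echo type:string;" ]
```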
Example:
# I have serious comments >:c

Using the Syntax Language in code
Same rules apply.
import {AST, ASTSyntax} from "florium";
const ast = new AST("MyAST");
ast.syntaxes.push(
ASTSyntax.fromText(
"print",
`print ( type:word )`
)
);

Using the API
Same rules apply.
import {AST, ASTSyntax} from "florium";
const ast = new AST("MyAST");
ast.syntaxes.push(
new ASTSyntax("create_variable")
.type("word", o => o.labelAs("name"))
.type(["set", "set-add", "set-subtract", "set-multiply", "set-divide", "set-modulo"])
.any(o => o
.labelAs("expression")
.min(1).max(Infinity)
.assignJob("ExpressionAST")
)
.type("semicolon")
);

Using the AST
Important: You have to first add your tokenizers to your Tokenizer instance, then run the AST!
Now you can run the AST by using the .read(code: string) function.
Usages:
import {AST, Tokenizer, groupTokens} from "florium";
const myAst = new AST("MyAST");
const myTokenizer = new Tokenizer();
// set up your AST syntaxes and your tokenizers
const code = "...";
console.log(myAst.read(code, myTokenizer));
console.log(myAst.read(code, myTokenizer.read(code)));
console.log(myAst.read(code, groupTokens(myTokenizer.read(code)), false));