lexer.js

v0.2.6

Published

4 years ago

A lexer and longest common sequence finder (between source code files)

0High
0Medium
0Low

abhisheksoni27

Lexer Lexical Analysis Longest common substring

lexer.js

A lexical analyzer and longest common shared sequence finder between a list of JS files.

What it does

Suppose, you have two files with the same function but different function calls:

test1.js

function add(a, b){
    return a + b;
}

const sum = add(11 + 11);

test2.js

function add(a, b){
    return a + b;
}

let sum = add(11 + 11);

The longest common shared sequence between these two files is the entire function definition.

LCSS

function add(a, b){
    return a + b;
}

Results

Installation

CLI

You can download the module via npm. (To install npm, which ships with node.js, you can download node from nodejs.org for your OS.) node.js > 6.x

$ npm install -g lexer.js

Or, if you prefer yarn

$ yarn global add lexer.js

That's it. 🎉

As module

In your application:

const lexerJS = require('lexer.js');

const result = lexerJS(files, options);

files: An array of files.

options: See options This parameter is (ironically) optional.

Running lexer.js

To run it on your own set of files, you can either provide the files in CSV/JSON, or as command line arguments like this:

lexer.js test1.js test2.js

JSON configuration

The JSON must have a key named files and it's value should be an array of the paths of files you want to test on.

{
    "files": [
        "./example/test1.js",
        "./example/test2.js",
    ]
}

CSV configuration

The CSV config file only has one header (or column) and is called filename. Each new line should contain the path of a source code file.

filename,
./example/test1.js,
./example/test2.js

Result

The result contains the longest common shared sequence found between the set of files. The default format is JSON, but can be configured. (See options below.)

It also asssigns a score to each subsequnce using the following formula:

score = log2(count) * log2(total)

count: Total number of occurences of the subsequence

total: Total number of tokens in the subsequence

Options

`-o` Output Mode `[default: "json"]`

lexer.js supports JSON as well as CSV output. JSON is the defualt output format if you do not specify any during invocation.

lexer.js test.json -o csv

Output would now be a CSV file. To know what that file would contain, check out result.

`-s` Save Tokens `[default: false]`

This is a boolean option, which when set, saves the tokens for each test file.

lexer.js test.json -s

It will generate a tokens folder, and save individual tokens for each file in that directory.

`-f` Output File Name `[default: "result"]`

lexer.js test.json -f YayTheResultsYay

Note: If you provide a file name with extension, such as art.json, then the output mode will be determined from the fileName and the output mode flag ( if passed) will be overriden.

Running Examples

The examples directory contains a minimal example set that you can run lexer.js on. To do so, clone the repo, fire a terminal, and run:

npm install

This will downdload the dependencies. Then, run:

lexer.js test.json

This assumes that you already have lexer.js installed. If you don't, you can directly invoke the node script as follows:

node index.js test.json

As always, you must have node installed.

Test GitHub project

The repo also contains a script to test lexer.js on any GitHub project. The script does the following:

Find all JS files in a project.
Select a file which has more than n commits. n is configurable.
Downloads the file at that point in time (when that commit was made).
Generates a configuration file for lexer.js.
Run lexer.js with that config file.

To run it, fire a terminal and run (assuming you are inside the project directory):

node runGitHubExamples.js --owner OWNERNAME --repo REPONAME -n 20

OWNERNAME: Owner of the repo [default: prettier]

REPONAME: Name of the repo [default: prettier]

n: Minimum commits the selected file must have [default: 10]

The results are saved in result.json. The command line options for lexer.js can also be passed.

Tests

To run tests, clone the repo (That green button above the repo contents) and run the following command:

npm install && npm run test

This will first download the dependencies, and then run the tests (using mocha) and output the result.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

lexer.js

Table of Contents

What it does

Installation

CLI

As module

Running lexer.js

JSON configuration

CSV configuration

Result

Options

-o Output Mode [default: "json"]

-s Save Tokens [default: false]

-f Output File Name [default: "result"]

Running Examples

Test GitHub project

Tests

`-o` Output Mode `[default: "json"]`

`-s` Save Tokens `[default: false]`

`-f` Output File Name `[default: "result"]`