JavaScript Clone Detection - (v0.6.0)

Academic study project on JavaScript code duplication using AST parsing with text similarity.

Usage

Run:

make init
clone-analisys <PATH> <SIMILARITY INDEX>
# Example: clone-analisys src/api-server 0.85

Current Process

First, a piece of code is parsed into an Abstract Syntax Tree (AST). A cleaning and normalization phase follows: unwanted attributes are removed and equivalent structures are standardized, for example by treating an arrow function the same as a regular function.

// both code snippets below are classified as a type 2 clone

const arrowFunction = (value) => {
  const { type } = value
  return type
}

function regularFunction(value) {
  // this is a regular function
  const { type } = value
  return type
}
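The normalization step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the trees are hand-written and heavily simplified ESTree-style objects, and the names `NOISE_KEYS`, `normalize`, `makeFunction`, and the placeholder identifier `FN` are all hypothetical. It assumes that normalization drops position attributes, rewrites a `const` arrow function as a function declaration, and replaces the function name with a placeholder.

```javascript
// Position attributes that parsers commonly attach to nodes (assumed list).
const NOISE_KEYS = new Set(["start", "end", "loc", "range"]);

// Build a function declaration with a placeholder name (hypothetical helper).
function makeFunction(params, body) {
  return {
    type: "FunctionDeclaration",
    id: { type: "Identifier", name: "FN" },
    params,
    body,
  };
}

// Recursively copy a tree, dropping noise keys and unifying function forms.
function normalize(node) {
  if (Array.isArray(node)) return node.map(normalize);
  if (node === null || typeof node !== "object") return node;

  // Copy the node without noise keys, normalizing children first.
  const copy = {};
  for (const [key, value] of Object.entries(node)) {
    if (!NOISE_KEYS.has(key)) copy[key] = normalize(value);
  }

  // `const f = (...) => { ... }` becomes a function declaration.
  if (copy.type === "VariableDeclaration" && copy.declarations.length === 1) {
    const { init } = copy.declarations[0];
    if (init && init.type === "ArrowFunctionExpression" && init.body.type === "BlockStatement") {
      return makeFunction(init.params, init.body);
    }
  }
  // Regular functions only lose their original name.
  if (copy.type === "FunctionDeclaration") {
    return makeFunction(copy.params, copy.body);
  }
  return copy;
}

// Simplified stand-ins for the two snippets (body reduced to `return type`).
const returnType = (start) => ({
  type: "BlockStatement",
  start,
  body: [{ type: "ReturnStatement", argument: { type: "Identifier", name: "type" } }],
});
const param = { type: "Identifier", name: "value" };

const arrowAst = {
  type: "VariableDeclaration",
  kind: "const",
  start: 0,
  declarations: [{
    type: "VariableDeclarator",
    id: { type: "Identifier", name: "arrowFunction" },
    init: { type: "ArrowFunctionExpression", params: [param], body: returnType(32) },
  }],
};

const regularAst = {
  type: "FunctionDeclaration",
  start: 70,
  id: { type: "Identifier", name: "regularFunction" },
  params: [param],
  body: returnType(102),
};

const a = JSON.stringify(normalize(arrowAst));
const b = JSON.stringify(normalize(regularAst));
console.log(a === b); // true
```

A real parser emits more fields per node (for example `async` or `generator`), so a production version would need to handle those as well.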

Several well-established libraries can parse code snippets into an AST:

| Library | Version |
|----------------------|:-------:|
| espree | 7.3.1 |
| @babel/parser | 7.14.7 |
| abstract-syntax-tree | 2.19.1 |

This project uses abstract-syntax-tree because it offers more convenient facilities for manipulating an AST.

Similarity between ASTs

For comparing ASTs, the current version had two options: (i) compare the ASTs directly, which only reports whether they are identical or not; or (ii) convert the ASTs to text (strings) and use libraries that measure the textual similarity between the code snippets.

| Library | Version | Type |
|-------------------|:-------:|:---------------:|
| ast-compare | 2.1.0 | Compare ASTs |
| string-similarity | 4.0.4 | Compare strings |
| string-comparison | 1.0.9 | Compare strings |

Comparing ASTs directly seems the most coherent choice, but so far the ast-compare library can only report whether two pieces are identical. Even so, the AST remains valuable as a uniform, easy-to-manipulate representation for pre-processing and normalization, after which it can be serialized to text and compared as a string.
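To illustrate the difference between the two options, here is a small sketch in plain JavaScript. It does not call ast-compare or string-similarity. `sameTree` is a hypothetical stand-in for a strict identical-or-not check, and `diceSimilarity` is a hand-written Sørensen-Dice coefficient over character bigrams, which may differ in detail from what the libraries compute. The two trees are simplified, hand-written examples.

```javascript
// Strict structural check: a yes/no answer, like a direct AST comparison.
function sameTree(a, b) {
  return JSON.stringify(a) === JSON.stringify(b);
}

// Sørensen-Dice coefficient over character bigrams, in the range [0, 1].
function diceSimilarity(left, right) {
  if (left === right) return 1;
  if (left.length < 2 || right.length < 2) return 0;

  // Count every bigram of the left string.
  const counts = new Map();
  for (let i = 0; i < left.length - 1; i++) {
    const gram = left.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) || 0) + 1);
  }

  // Consume matching bigrams while scanning the right string.
  let shared = 0;
  for (let i = 0; i < right.length - 1; i++) {
    const gram = right.slice(i, i + 2);
    const remaining = counts.get(gram) || 0;
    if (remaining > 0) {
      counts.set(gram, remaining - 1);
      shared++;
    }
  }
  return (2 * shared) / (left.length - 1 + (right.length - 1));
}

// Two trees that differ only in one identifier name.
const treeA = { type: "ReturnStatement", argument: { type: "Identifier", name: "type" } };
const treeB = { type: "ReturnStatement", argument: { type: "Identifier", name: "kind" } };

console.log(sameTree(treeA, treeB)); // false

// A graded score instead of a boolean; higher means more similar.
const score = diceSimilarity(JSON.stringify(treeA), JSON.stringify(treeB));
console.log(score);
```

The strict check rejects the pair outright, while the string-based score still reflects that the trees are nearly the same, which is what makes a similarity threshold usable.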

Results

Using the example code snippets above, we get:

Without pre-processing and normalization

ast-compare:  false
string-similarity (Dice):  0.925351071692535
string-comparison (Cosine):  0.9672041516493517
string-comparison (Levenshtein):  0.9072164948453608
string-comparison (Longest Common Subsequence):  0.9357933579335793
string-comparison (Metric Longest Common Subsequence):  0.9337260677466863

With pre-processing and normalization (v0.3.1)

ast-compare:  true
string-similarity (Dice):  1
string-comparison (Cosine):  1
string-comparison (Levenshtein):  1
string-comparison (Longest Common Subsequence):  1
string-comparison (Metric Longest Common Subsequence):  1
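As a rough illustration of why identical normalized strings score exactly 1, the sketch below turns a Levenshtein edit distance into a similarity value. Dividing by the length of the longer string is one common convention and is an assumption here; string-comparison may normalize differently. The function names are hypothetical.

```javascript
// Classic dynamic-programming edit distance, keeping only two rows.
function levenshteinDistance(a, b) {
  let previous = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      current[j] = Math.min(
        previous[j] + 1,        // deletion
        current[j - 1] + 1,     // insertion
        previous[j - 1] + cost  // substitution or match
      );
    }
    previous = current;
  }
  return previous[b.length];
}

// Map the distance to [0, 1]: 1 means identical (assumed normalization).
function levenshteinSimilarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshteinDistance(a, b) / longest;
}

console.log(levenshteinDistance("kitten", "sitting")); // 3
console.log(levenshteinSimilarity("return type", "return type")); // 1
```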

To learn more about the issues addressed, read: ESTUDO EMPÍRICO SOBRE DUPLICAÇÃO DE CÓDIGO EM APLICAÇÕES REACT.JS (Empirical Study on Code Duplication in React.js Applications).