genehood-cli

v0.2.9-1

License: CC0-1.0

Command-line interface to generate GeneHood datasets.

Dependencies

GeneHood needs Node.js version 10 or later and ncbi-tools+ version 2.6 or later to run.

Install

npm install -g genehood-cli

Usage

GeneHood uses the MiST3 API to collect the information needed for the analysis. Thus, the only inputs required from the user are:

  • a list of reference genes,
  • the number of upstream and downstream genes to include in the analysis,
  • a phylogenetic tree in Newick format (optional).

Reference Genes

GeneHood reads a list of reference genes from the user and searches for the upstream and downstream information from those genes on MiST3.

For this reason, GeneHood uses the MiST3 standard for gene identifiers, called the stable id.

It is a composite of the NCBI genome version and the locus number of the gene.

Here are some examples:

| MiST3 stable id | description |
|-|-|
| GCF_000005845.2-b4355 | Chemoreceptor tsr(b4355) from Escherichia coli str. K-12 substr. MG1655 |
| GCF_000006765.1-PA5040 | Secretin pilQ(PA5040) from the Pseudomonas aeruginosa PAO1 |
| GCF_000006905.1-CC_2066 | part of L-ring flgH(CC_2066) from Caulobacter crescentus CB15 |
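To make the composite structure concrete, here is a minimal sketch that splits a stable id into its two parts. It assumes, based only on the examples above, that the first hyphen separates the genome version from the locus. The names `StableId` and `parseStableId` are hypothetical and not part of genehood-cli or MiST3.

```typescript
// Hypothetical helper, not part of genehood-cli or the MiST3 API.
// Assumption: the first hyphen separates genome version and locus,
// as in the examples listed in the table above.
interface StableId {
  genomeVersion: string; // e.g. "GCF_000005845.2"
  locus: string; // e.g. "b4355"
}

function parseStableId(stableId: string): StableId {
  const cut = stableId.indexOf("-");
  // Reject ids with no hyphen, or with an empty part on either side.
  if (cut <= 0 || cut === stableId.length - 1) {
    throw new Error(`Expected <genome version>-<locus>, got: ${stableId}`);
  }
  return {
    genomeVersion: stableId.slice(0, cut),
    locus: stableId.slice(cut + 1),
  };
}

const example = parseStableId("GCF_000006905.1-CC_2066");
console.log(example.genomeVersion, example.locus);
```

Splitting at the first hyphen rather than the last keeps a locus intact if it happens to contain a hyphen itself; whether such loci exist in MiST3 is not something the examples tell us.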

Performing GeneHood analysis

Once genehood-cli is installed globally (with the -g option), npm makes an executable called genehood available.

genehood takes one argument as the name of the project (in this example myNewProject) and a mandatory --action flag with four possible values:

| value | description |
|-|-|
| init | Initializes the configuration and data files for the project |
| run | Starts a new run from an existing configuration file |
| keepGoing | Restarts a run from the last successful step of the analysis pipeline |
| cleanUp | Deletes the temporary files generated by GeneHood |
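If we want to drive genehood from a script, the four actions can be captured as a small type. The sketch below is a hypothetical wrapper helper, not part of genehood-cli: it only builds the argument list for a call such as `genehood myProject --action init` and rejects values that are not in the table.

```typescript
// The four --action values documented in the table above.
const ACTIONS = ["init", "run", "keepGoing", "cleanUp"] as const;
type Action = (typeof ACTIONS)[number];

// Type guard: narrows an arbitrary string to one of the documented actions.
function isAction(value: string): value is Action {
  return (ACTIONS as readonly string[]).includes(value);
}

// Hypothetical helper (an assumption for illustration): returns the arguments
// that would follow the `genehood` executable on the command line.
function genehoodArgs(project: string, action: string): string[] {
  if (!isAction(action)) {
    throw new Error(`Unknown action "${action}"; expected one of: ${ACTIONS.join(", ")}`);
  }
  return [project, "--action", action];
}

console.log(["genehood", ...genehoodArgs("myProject", "init")].join(" "));
```

The returned array could be handed to a process launcher of our choice; the sketch deliberately stops short of running anything.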

Step 1: Initialize the project

To start a new analysis, we must initialize a new project.

genehood myProject --action init

This command will generate two files:

  • myProject.geneHood.config.json
  • myProject.geneHood.data.json.gz

Now we must edit the config file to tell GeneHood which genes it should collect gene neighborhood information for.

Step 2: Edit the config file to set initial parameters

genehood-cli version 0.2.8 has flags that facilitate this process; see the alternative Step 2 below.

There are several parts in the GeneHood config file, but what matters is under the section user. There we will find four sub-sections:

| section | description |
|-|-|
| settings | This is where all the input data goes |
| newickTree | This is where we should add a Newick tree (optional) |
| startingStep | For advanced users who want to start from a step other than the default |
| stopStep | For advanced users who don't want to run the entire pipeline |

Let's focus on the settings section first. It has three sub-sections that need user input:

| section | description |
|-|-|
| stableIds | This is where we will add reference genes using MiST3 stable identifier |
| upstream | Integer of how many genes should be collected upstream from the reference gene |
| downstream | Integer of how many genes should be collected downstream from the reference gene |
| geneHoodPrefix | This is pre-filled with the name of the GeneHood project. |

For example, let us add as reference genes the cheA genes from the three chemosensory systems in Vibrio cholerae:

| system | stable Ids |
|:-:|-|
| F6 | GCF_000006745.1-VC2063 |
| F7 | GCF_000006745.1-VCA1095 |
| F9 | GCF_000006745.1-VC1397 |

and also, let us include 15 genes upstream and 15 downstream from the reference genes.

To do that, we can edit the config file using any text editor.

The user section of the config file will be something like this:

"user": {
 "newickTree": "",
 "settings": {
  "downstream": 15,
  "geneHoodPrefix": "vibrio",
  "stableIds": [
   "GCF_000006745.1-VC1397",
   "GCF_000006745.1-VC2063",
   "GCF_000006745.1-VCA1095"
  ],
  "upstream": 15
 },
 "startingStep": "fetchData",
 "stopStep": ""
}

Save the file and proceed to the next step.
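Before running the pipeline, it can help to sanity-check what we typed. The sketch below describes the user section shown above as TypeScript types and runs a few basic checks. The type names and `checkUserSettings` are hypothetical, and the rules (non-negative integers, no duplicate ids) are our own assumptions rather than validation that genehood-cli is documented to perform.

```typescript
// Types mirroring the "user" section shown above (names are illustrative).
interface UserSettings {
  downstream: number;
  geneHoodPrefix: string;
  stableIds: string[];
  upstream: number;
}

interface UserSection {
  newickTree: string;
  settings: UserSettings;
  startingStep: string;
  stopStep: string;
}

// Hypothetical check: returns a list of human-readable problems (empty if none).
function checkUserSettings(settings: UserSettings): string[] {
  const problems: string[] = [];
  if (settings.stableIds.length === 0) {
    problems.push("stableIds is empty");
  }
  for (const key of ["upstream", "downstream"] as const) {
    const value = settings[key];
    if (!Number.isInteger(value) || value < 0) {
      problems.push(`${key} must be a non-negative integer`);
    }
  }
  const seen = new Set<string>();
  for (const id of settings.stableIds) {
    if (seen.has(id)) problems.push(`duplicate stable id: ${id}`);
    seen.add(id);
  }
  return problems;
}

// The example configuration from this step.
const exampleUser: UserSection = {
  newickTree: "",
  settings: {
    downstream: 15,
    geneHoodPrefix: "vibrio",
    stableIds: [
      "GCF_000006745.1-VC1397",
      "GCF_000006745.1-VC2063",
      "GCF_000006745.1-VCA1095",
    ],
    upstream: 15,
  },
  startingStep: "fetchData",
  stopStep: "",
};

console.log(checkUserSettings(exampleUser.settings));
```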

Step 2 (alternative): Set parameters using flags.

We can set the number of genes downstream and upstream using the flag --addRange.

We can add the identifiers to a text file (one identifier per line) and pass to genehood using the flag --addStableIds.

If we put the identifiers into a file named vibrioIds.txt, we can accomplish the same setup as before by typing:

genehood myProject --addRange 15 15 --addStableIds vibrioIds.txt
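The identifier file is plain text with one stable id per line. As a minimal sketch, the two hypothetical helpers below (not part of genehood-cli) produce and read text in that layout; skipping blank lines and trimming whitespace are our own assumptions about what makes a tidy file.

```typescript
// Hypothetical helper: turns a list of stable ids into file content,
// one identifier per line, with a trailing newline.
function toStableIdFile(ids: string[]): string {
  const cleaned = ids.map((id) => id.trim()).filter((id) => id.length > 0);
  return cleaned.join("\n") + "\n";
}

// Hypothetical helper: reads the same layout back, ignoring blank lines
// and tolerating Windows line endings.
function fromStableIdFile(text: string): string[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

const vibrioIds = [
  "GCF_000006745.1-VC1397",
  "GCF_000006745.1-VC2063",
  "GCF_000006745.1-VCA1095",
];

console.log(toStableIdFile(vibrioIds));
```

Writing the resulting string to vibrioIds.txt is left out of the sketch; in Node.js, `fs.writeFileSync` would be one way to do it.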

Step 3: Running GeneHood

Make sure we have an Internet connection and that blastp and makeblastdb are available as executables on our system.

Then run:

genehood myProject --action run

That is it. GeneHood should do all the rest.

Step 4: Clean up

If everything goes as expected, we should have a file called myProject.geneHood.pack.json.gz in our directory. The directory will probably also contain a number of other files that GeneHood used temporarily.

We can safely remove these temp files using the action cleanUp from genehood:

genehood myProject --action cleanUp

GeneHood removes all of its files except two: the config file and the pack file. This is a little redundant, since GeneHood's pack also contains the config file. We made it this way so that the user can easily see how the analysis was run, or re-run it with a few changes to the config file if needed.
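As a quick check after cleaning up, we can compare a directory listing against the two files that should remain. This is a minimal sketch with hypothetical helper names; the file name patterns are taken from the examples in this README, and the listing is passed in as an array so the sketch does not depend on any file system API.

```typescript
// File names that should survive cleanUp, following the naming used in this README.
interface KeptFiles {
  config: string;
  pack: string;
}

function keptFiles(project: string): KeptFiles {
  return {
    config: `${project}.geneHood.config.json`,
    pack: `${project}.geneHood.pack.json.gz`,
  };
}

// Hypothetical check: which of the expected files are absent from a listing?
function missingKeptFiles(project: string, listing: string[]): string[] {
  const expected = keptFiles(project);
  return [expected.config, expected.pack].filter((name) => !listing.includes(name));
}

console.log(missingKeptFiles("myProject", ["myProject.geneHood.config.json"]));
```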

Now we just need to visualize the data.

Optional step 4.5: Add Phylogeny

We can add a phylogeny (in Newick format) to the config file at any time, and genehood-cli has a helper option for this: --addPhylogeny. If we add the phylogeny after the pack has been built, genehood-cli will repack the file for us.

Adding a phylogeny lets the viewer order the gene clusters following the order of the phylogenetic tree. The tree can be built in any way: from a single gene, from multiple concatenated genes, and so on. However, for the viewer to work, the names of the leaves need to be exactly the same as the identifiers of the reference genes.

To add a new phylogeny:

genehood myProject --addPhylogeny myPhylogeny.nwk
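Since the leaf names must match the reference gene identifiers exactly, it may be worth checking the tree before adding it. The sketch below is a hypothetical pre-check, not something genehood-cli is documented to do. It uses a deliberately simple Newick reader that handles branch lengths, internal node labels and plain single-quoted labels, but not comments or escaped quotes. The example tree, with its topology and branch lengths, is made up for illustration.

```typescript
// Collects leaf labels from a Newick string. A label counts as a leaf when it
// follows "(" or "," (or starts the string); labels after ")" are internal nodes.
function newickLeafNames(newick: string): string[] {
  const leaves: string[] = [];
  let prev = ""; // last structural character seen
  let i = 0;
  while (i < newick.length) {
    const ch = newick.charAt(i);
    if (ch === "(" || ch === "," || ch === ")" || ch === ";") {
      prev = ch;
      i++;
    } else if (ch === ":") {
      // Skip the branch length that follows the colon.
      i++;
      while (i < newick.length && !"(),;".includes(newick.charAt(i))) i++;
    } else if (/\s/.test(ch)) {
      i++;
    } else {
      let label = "";
      if (ch === "'") {
        // Plain quoted label; escaped quotes ('') are not handled here.
        i++;
        while (i < newick.length && newick.charAt(i) !== "'") label += newick.charAt(i++);
        i++; // skip the closing quote
      } else {
        while (i < newick.length && !"(),:;".includes(newick.charAt(i))) label += newick.charAt(i++);
        label = label.trim();
      }
      if (prev === "(" || prev === "," || prev === "") leaves.push(label);
    }
  }
  return leaves;
}

// Reports ids with no matching leaf, and leaves that are not reference gene ids.
function compareLeavesToIds(newick: string, stableIds: string[]) {
  const leaves = newickLeafNames(newick);
  return {
    missingFromTree: stableIds.filter((id) => !leaves.includes(id)),
    unknownLeaves: leaves.filter((leaf) => !stableIds.includes(leaf)),
  };
}

// Made-up tree for the three Vibrio cholerae cheA genes used earlier.
const exampleTree =
  "((GCF_000006745.1-VC1397:0.1,GCF_000006745.1-VC2063:0.2)n1:0.3,GCF_000006745.1-VCA1095:0.4);";
console.log(newickLeafNames(exampleTree));
```

If both lists returned by `compareLeavesToIds` are empty, the leaf names and the reference gene identifiers agree.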

Step 5: Load the data on genehood.io

Open GeneHood (genehood.io) in a web browser and load the file myProject.geneHood.pack.json.gz.

Now just explore the data.

To learn more about the GeneHood viewer, go to genehood.io and click on Demo.

Developer's Documentation

... to be continued.

Written with ❤ in TypeScript.