NeatSeq-Flow: A Lightweight Software for Efficient Execution of High-Throughput Sequencing Workflows.

What is NeatSeq-Flow?

NeatSeq-Flow is a platform for modular design and execution of bioinformatics workflows on a local computer or, preferably, a computer cluster. The platform has a command-line interface as well as a fully functional graphical user interface (GUI), both used locally without the need to connect to remote servers. Analysis programs comprising a workflow can be anything executable from the Linux command line, either publicly available or in-house programs. Ready-to-use workflows are available for common bioinformatics analyses such as assembly & annotation, RNA-Seq, ChIP-Seq, variant calling, metagenomics and genomic epidemiology. Creating and sharing new workflows is easy and intuitive, and requires no programming knowledge. NeatSeq-Flow is general-purpose and can easily be adapted to analyses other than high-throughput sequencing.

NeatSeq-Flow is fully accessible to non-programmers, without compromising power, flexibility and efficiency. The user only has to specify the location of input files and the workflow design, and need not bother with the location of intermediate and final files, nor with transferring files between workflow steps. Workflow execution is fully parallelized on the cluster, and progress can be inspected through the NeatSeq-Flow “terminal monitor”. All workflow steps, parameters and order of execution are stored in one file. Together with the shell scripts produced by NeatSeq-Flow, this file provides complete documentation of the workflow and enables future execution of the exact same workflow or modifications thereof.

Read more about NeatSeq-Flow.

Available Modules and Workflows

NeatSeq-Flow comes with a basic set of modules, marked here with an asterisk (*).
The complete set of currently available modules and workflows is downloadable from GitHub.
Installation and usage instructions, along with full documentation of the modules and workflows, are available at NeatSeq-Flow’s Module and Workflow Repository.

Quick Start:

Installing with Conda installs NeatSeq-Flow and all its dependencies in one go:
  • First, if you don’t have Conda, install it!

  • Then in the terminal:

    1. Install mamba and create the NeatSeq_Flow conda environment:

    bash
    conda install conda-forge::mamba
    mamba create -n NeatSeq_Flow -c bioconda -c conda-forge levinl::neatseq-flow
    
    2. Activate the NeatSeq_Flow conda environment:

    bash
    source activate NeatSeq_Flow
    
    3. Run NeatSeq_Flow_GUI:

    NeatSeq_Flow_GUI.py --Server
    
    4. Use the information in the terminal:

      https://github.com/bioinfo-core-BGU/NeatSeq-Flow-GUI/raw/master/doc/NeatSeq-Flow_Server.jpg
      • Copy the IP address (red line) into a web browser

      • A login window should appear

      • Copy the “User Name” (blue line) from the terminal to the “User Name” form in the login window

      • Copy the “Password” (yellow line) from the terminal to the “Password” form in the login window

      • Click on the “Login” button.

    5. Managing Users:
      • It is possible to manage users using SSH: NeatSeq-Flow will try to log in by SSH to a host using the provided “User Name” and “Password”.

      • The SSH host can be local or remote.

      • Note: If using a remote host, NeatSeq-Flow needs to be installed on the remote host, and the analysis will be run on the remote host by the user that logged in.

    NeatSeq_Flow_GUI.py --Server --SSH_HOST 127.0.0.1
    
    6. For more options:

    NeatSeq_Flow_GUI.py -h
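
Before launching the GUI, it can help to confirm that Conda is on the PATH and that the NeatSeq_Flow environment was actually created. The following is a minimal sketch, not part of NeatSeq-Flow itself: the helper name `env_exists` is hypothetical, and it assumes that `conda env list` prints one environment per line with the environment name in the first column.

```shell
#!/bin/sh
# Hypothetical helper (not part of NeatSeq-Flow):
# env_exists NAME reads `conda env list` style text on stdin and
# succeeds if some line has NAME as its first whitespace-separated field.
env_exists() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

if command -v conda >/dev/null 2>&1; then
    if conda env list | env_exists NeatSeq_Flow; then
        echo "NeatSeq_Flow environment found"
    else
        echo "NeatSeq_Flow environment not found; create it first"
    fi
else
    echo "conda not found on PATH; install Conda first"
fi
```

If the environment is reported as missing, rerun the `mamba create` command from the installation step before activating it.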
    

Install on Windows with WSL:

On Windows 10 version 2004 and higher (Build 19041 and higher) or Windows 11, it is possible to use both Windows and Linux at the same time on a Windows machine. NeatSeq-Flow can be installed on the Windows Subsystem for Linux (WSL):

  • First install Linux on Windows:

  • Open the Windows PowerShell Terminal as administrator:

    wsl --install
    
  • Open the Ubuntu app/terminal: [it is possible to use the Microsoft Store to install Ubuntu]

    1. Install Conda. Make sure to type YES when asked “Do you wish the installer to initialize Miniconda3 by running conda init?”:

    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    sh Miniconda3-latest-Linux-x86_64.sh
    
    2. Install mamba and create the NeatSeq_Flow conda environment:

    bash
    conda install conda-forge::mamba
    mamba create -n NeatSeq_Flow -c bioconda -c conda-forge levinl::neatseq-flow
    
    3. Activate the NeatSeq_Flow conda environment:

    bash
    source activate NeatSeq_Flow
    
    4. Activate the SSH service:

    sudo apt install openssh-server
    sudo ssh-keygen -A
    sudo service ssh start
    
    5. Run NeatSeq_Flow_GUI:

    export WSL_IP=$(ip addr show eth0 | grep inet | awk '{ print $2; }' | sed 's/\/.*$//' | head -n 1)
    echo $WSL_IP
    NeatSeq_Flow_GUI.py --Server --HOST $WSL_IP
    
  • Open the Windows PowerShell Terminal as administrator:

    1. Run this command in the terminal, replacing WSL_IP with the Linux IP address identified in the previous step [the “echo $WSL_IP” result]:

    netsh interface portproxy add v4tov4 listenaddress=127.0.0.1 listenport=49190 connectaddress=WSL_IP connectport=49190
    
    2. Open a web browser and type localhost:49190 in the address bar
      • A login window should appear

    3. Use the information in the Linux terminal:

      https://github.com/bioinfo-core-BGU/NeatSeq-Flow-GUI/raw/master/doc/NeatSeq-Flow_Server.jpg
      • Copy the “User Name” (blue line) from the terminal to the “User Name” form in the login window

      • Copy the “Password” (yellow line) from the terminal to the “Password” form in the login window

      • Click on the “Login” button.

    4. Managing Users:
      • It is possible to manage users using SSH: NeatSeq-Flow will try to log in by SSH to a host using the provided “User Name” and “Password”.

      • The SSH host can be local or remote.

      • Note: If using a remote host, NeatSeq-Flow needs to be installed on the remote host, and the analysis will be run on the remote host by the user that logged in.
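
Substituting the WSL IP address into the netsh command by hand is easy to get wrong. As an illustration only, the sketch below, run in the Ubuntu terminal, extracts the first IPv4 address of eth0 and prints the complete netsh command ready to paste into PowerShell. The function names `first_ipv4` and `portproxy_cmd` are hypothetical helpers, not part of NeatSeq-Flow, and the sketch assumes the interface is called eth0 and the port is 49190, as in the steps above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of NeatSeq-Flow).

# first_ipv4: read `ip addr show` style text on stdin and print the first
# IPv4 address, without the /prefix-length suffix.
first_ipv4() {
    awk '$1 == "inet" { sub(/\/.*/, "", $2); print $2; exit }'
}

# portproxy_cmd IP [PORT]: print the netsh port-forwarding command for
# the given WSL IP address. PORT defaults to 49190.
portproxy_cmd() {
    pp_ip=$1
    pp_port=${2:-49190}
    printf 'netsh interface portproxy add v4tov4 listenaddress=127.0.0.1 listenport=%s connectaddress=%s connectport=%s\n' \
        "$pp_port" "$pp_ip" "$pp_port"
}

# Only attempt the lookup where the `ip` tool is available (e.g. inside WSL).
if command -v ip >/dev/null 2>&1; then
    WSL_IP=$(ip addr show eth0 2>/dev/null | first_ipv4)
    if [ -n "$WSL_IP" ]; then
        portproxy_cmd "$WSL_IP"
    fi
fi
```

Unlike a plain `grep inet`, matching the first field exactly against `inet` skips `inet6` lines, so only an IPv4 address is printed.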

Authors

  • Menachem Sklarz

  • Liron Levin

  • Michal Gordon

  • Vered Chalifa-Caspi

Bioinformatics Core Facility, Ilse Katz Institute for Nanoscale Science and Technology, Ben-Gurion University of the Negev, Beer-Sheva, Israel

Cite NeatSeq-Flow

NeatSeq-Flow article on bioRxiv

Contact Us

Liron Levin
