"""
## Description

dp_tools is a collection of data processing tools designed to assist in GeneLab data processing operations.

## What This Library Includes

This module currently provides the following major features:

1. [General Validation and Verification (V&V) Framework](dp_tools/core/check_model.html)
2. General Data Model Framework ([base data model](dp_tools/core/entity_model.html), [data model loading functions](dp_tools/core/loaders.html))
3. [YAML Configuration Interface](dp_tools/config/interface.html)
4. [GLDS API Wrapper Functions](dp_tools/glds_api/commons.html)

Additionally, the following assay-specific functionality is packaged:

1. bulkRNASeq
  - [configuration files](https://github.com/J-81/dp_tools/tree/development/dp_tools/config)
  - [check functions](dp_tools/bulkRNASeq/checks.html)
  - [validation protocol](dp_tools/bulkRNASeq/vv_protocols.html)

## Installation

#### Using Containers (e.g. Singularity)

This library is available as prebuilt images hosted at [quay.io](https://quay.io/repository/j_81/dp_tools?tab=tags):
> singularity shell docker://quay.io/j_81/dp_tools:1.3.4

#### Using pip

> pip install git+https://github.com/J-81/dp_tools.git@1.3.4

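After a pip install, one quick sanity check is that the importable module reports the version pinned in the install command. The sketch below assumes the package is importable as `dp_tools` and relies on its module-level `__version__` string; `module_version`, `check_pin`, and `PINNED_VERSION` are hypothetical names used only for illustration.

```python
import importlib

PINNED_VERSION = "1.3.4"  # tag used in the pip command above


def module_version(module_name="dp_tools"):
    # Hypothetical helper: return the module's __version__ string,
    # or None if the module is missing or defines no __version__.
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def check_pin(pinned=PINNED_VERSION, module_name="dp_tools"):
    # True only when the module is importable and reports the pinned version.
    return module_version(module_name) == pinned
```

Calling `check_pin()` in the environment where the install ran should return `True` if the pinned tag and the installed module agree.
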
## CLI Commands

**Note: Most library functionality is only available by importing the package in Python.**

The following command-line scripts are also available once the package is installed; their usage is shown below:

- dpt-get-isa-archive
``` bash
usage: dpt-get-isa-archive [-h] --accession GLDS-001

Script for downloading latest ISA from GLDS repository

options:
  -h, --help            show this help message and exit
  --accession GLDS-001  GLDS accession number
```

- dpt-isa-to-runsheet
``` bash
usage: dpt-isa-to-runsheet [-h] --accession GLDS-001 --config-type CONFIG_TYPE [--config-version CONFIG_VERSION] --isa-archive ISA_ARCHIVE

Script for downloading latest ISA from GLDS repository

options:
  -h, --help            show this help message and exit
  --accession GLDS-001  GLDS accession number
  --config-type CONFIG_TYPE
                        Packaged config type to use. Currently supports: ['microarray', 'bulkRNASeq']
  --config-version CONFIG_VERSION
                        Packaged config version to use
  --isa-archive ISA_ARCHIVE
                        Local location of ISA archive file. Can be downloaded from the GLDS repository with 'dpt-get-isa-archive'
```

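When these scripts are driven from Python, it can help to assemble the argument list in one place and reject unsupported values before anything runs. The following is a minimal sketch, not part of dp_tools: `build_runsheet_command` is a hypothetical helper, the supported config types are copied from the help text above, and the `GLDS-<digits>` accession pattern is an assumption based on the examples.

```python
import re

# Config types listed in the dpt-isa-to-runsheet help text above.
SUPPORTED_CONFIG_TYPES = ("microarray", "bulkRNASeq")


def build_runsheet_command(accession, config_type, isa_archive, config_version=None):
    # Hypothetical helper: returns an argv list for dpt-isa-to-runsheet.
    # Assumption: accessions look like "GLDS-168" (prefix plus digits).
    if not re.fullmatch("GLDS-[0-9]+", accession):
        raise ValueError(f"unexpected accession format: {accession!r}")
    if config_type not in SUPPORTED_CONFIG_TYPES:
        raise ValueError(f"unsupported config type: {config_type!r}")
    command = ["dpt-isa-to-runsheet", "--accession", accession,
               "--config-type", config_type]
    # --config-version is optional according to the usage line.
    if config_version is not None:
        command += ["--config-version", config_version]
    command += ["--isa-archive", isa_archive]
    return command
```

The returned list can be passed directly to `subprocess.run(command, check=True)`, which avoids shell quoting issues with file names.
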
## Examples

### Two-step process to create a runsheet from a GLDS ISA Archive (using Singularity)

> **Note: At this time, no stdout messages are printed by these scripts.**

#### Download ISA Archive
``` bash
# The first two lines tell Singularity to run the dp_tools container with the current working directory bound
singularity exec --bind $(pwd):$(pwd) \\
  docker://quay.io/j_81/dp_tools:1.3.4 \\
  dpt-get-isa-archive --accession GLDS-168 # command we want to run
```

#### Convert ISA Archive into bulkRNASeq Runsheet
``` bash
# The first two lines tell Singularity to run the dp_tools container with the current working directory bound
singularity exec --bind $(pwd):$(pwd) \\
  docker://quay.io/j_81/dp_tools:1.3.4 \\
  dpt-isa-to-runsheet --accession GLDS-168 \\
                      --config-type bulkRNASeq \\
                      --config-version Latest \\
                      --isa-archive GLDS-168_metadata_GLDS-168-ISA.zip # command we want to run
```

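Because the scripts print nothing to stdout, a wrapper that chains the two steps may want to confirm that the archive actually appeared before moving on. The sketch below is one way to do that from Python under stated assumptions: `in_container`, `find_isa_archive`, and `make_runsheet` are hypothetical helpers, and the archive is assumed to be written to the working directory with a name that starts with the accession and ends in `-ISA.zip`, as in the example above.

```python
import os
import subprocess
from pathlib import Path

IMAGE = "docker://quay.io/j_81/dp_tools:1.3.4"  # image used in the examples above


def in_container(command, workdir=None):
    # Prefix a command with the singularity call from the examples above,
    # binding the working directory to the same path inside the container.
    workdir = workdir or os.getcwd()
    return ["singularity", "exec", "--bind", f"{workdir}:{workdir}", IMAGE, *command]


def find_isa_archive(accession, directory="."):
    # Assumption: the archive name starts with the accession and ends in "-ISA.zip".
    matches = sorted(Path(directory).glob(f"{accession}*-ISA.zip"))
    return matches[0] if matches else None


def make_runsheet(accession, config_type="bulkRNASeq"):
    # Step 1: download the ISA archive, then confirm the file exists.
    subprocess.run(in_container(["dpt-get-isa-archive", "--accession", accession]), check=True)
    archive = find_isa_archive(accession)
    if archive is None:
        raise FileNotFoundError(f"no ISA archive found for {accession}")
    # Step 2: convert the archive into a runsheet.
    subprocess.run(in_container([
        "dpt-isa-to-runsheet", "--accession", accession,
        "--config-type", config_type,
        "--config-version", "Latest",
        "--isa-archive", str(archive),
    ]), check=True)
```

With Singularity available, `make_runsheet("GLDS-168")` would run both steps in the current working directory; `check=True` turns a non-zero exit status into an exception, which partly makes up for the missing stdout messages.
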
"""
__version__ = "1.3.4"