Get Started
Vignette in development
This vignette walks through the basic usage of the readthis R package, which provides fast reading of output files from various programs, including Mutect2, Strelka, ASCAT, and FACETS. The examples use the Strelka VCF files that are installed with the readthis package.
library(readthis)
strelka_vcf <- system.file("extdata", "Strelka", "S1.somatic.snvs.vcf.gz", package = "readthis")
VCF files with Strelka somatic SNVs can be read with the read_strelka_somatic_snvs() function. Its main argument is path, which can be a path to a single VCF file, a vector of paths to many files, or a single path to a directory containing many files.
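The three accepted forms of path can be pictured with a small base R sketch. The helper resolve_vcf_paths() and its pattern argument are hypothetical names used only for illustration; this is not how readthis is implemented, just one way such an argument could be normalised to a vector of files.

```r
# Hypothetical helper, not part of readthis: normalise `path` to a character
# vector of file paths, whichever of the three forms it was given in.
resolve_vcf_paths <- function(path, pattern = "\\.vcf(\\.gz)?$") {
  stopifnot(is.character(path), length(path) >= 1)
  if (length(path) == 1 && dir.exists(path)) {
    # A single directory: collect every file inside it that matches `pattern`.
    return(list.files(path, pattern = pattern, full.names = TRUE))
  }
  # A single file or a vector of files: check that all of them exist.
  not_found <- path[!file.exists(path)]
  if (length(not_found) > 0) {
    stop("File(s) not found: ", paste(not_found, collapse = ", "))
  }
  path
}

# Usage on a throwaway directory with two empty placeholder files.
demo_dir <- file.path(tempdir(), "strelka_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_files <- file.path(
  demo_dir,
  c("S1.somatic.snvs.vcf.gz", "S2.somatic.snvs.vcf.gz")
)
invisible(file.create(demo_files))

from_dir    <- resolve_vcf_paths(demo_dir)       # directory form
from_vector <- resolve_vcf_paths(demo_files)     # vector form
from_single <- resolve_vcf_paths(demo_files[1])  # single-file form
```

All three calls return a plain character vector of paths, so any downstream reading code only has to handle one input shape.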
Reading a single file
In the simplest case, path points to a single VCF file:
read_strelka_somatic_snvs(strelka_vcf, verbose = FALSE)
#> # A tibble: 9 × 9
#> sample_id chrom pos ref alt ref_reads alt_reads VAF DP
#> <chr> <chr> <int> <chr> <chr> <int> <int> <dbl> <int>
#> 1 TUMOR chr1 1 T G 40 2 0.0476 1000
#> 2 TUMOR chr2 3 A T 27 3 0.1 554
#> 3 TUMOR chr3 5 C A 23 3 0.115 412
#> 4 TUMOR chr4 7 T C 39 10 0.204 932
#> 5 TUMOR chr5 8 A C 59 3 0.0484 945
#> 6 TUMOR chr6 9 C A 25 2 0.0741 500
#> 7 TUMOR chr7 10 A C 35 2 0.0541 870
#> 8 TUMOR chrX 11 A T 32 1 0.0303 893
#> 9 TUMOR chrY 12 T A 62 3 0.0462 740
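The VAF values printed above agree with alt_reads / (ref_reads + alt_reads) to the displayed precision. The sketch below checks this on the first four rows, with the numbers copied from the output. The formula is inferred from the printed values, not taken from the readthis source, so treat it as an assumption.

```r
# Read counts and VAF copied from the first four rows of the output above.
snvs <- data.frame(
  chrom     = c("chr1", "chr2", "chr3", "chr4"),
  ref_reads = c(40L, 27L, 23L, 39L),
  alt_reads = c(2L, 3L, 3L, 10L),
  VAF       = c(0.0476, 0.1, 0.115, 0.204)
)

# Assumed relation (inferred from the numbers, not from the package source).
snvs$VAF_check <- snvs$alt_reads / (snvs$ref_reads + snvs$alt_reads)

# The tibble prints three significant digits, so compare with a tolerance
# of half a unit in the last printed place.
vaf_matches <- all(abs(snvs$VAF_check - snvs$VAF) < 5e-4)
vaf_matches
```

Note that DP is much larger than ref_reads + alt_reads in these example rows, so the relation above does not involve the DP column.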
Reading a list of files
path can also be a vector of paths to many VCF files. The result then includes a patient_id column that identifies the input each row came from:
strelka_vcf2 <- system.file("extdata", "Strelka", "S2.somatic.snvs.vcf.gz", package = "readthis")
files <- c(S1 = strelka_vcf, S2 = strelka_vcf2)
read_strelka_somatic_snvs(files, verbose = FALSE)
#> # A tibble: 18 × 10
#> patient_id sample_id chrom pos ref alt ref_reads alt_reads VAF DP
#> <chr> <chr> <chr> <int> <chr> <chr> <int> <int> <dbl> <int>
#> 1 S1 TUMOR chr1 1 T G 40 2 0.0476 1000
#> 2 S1 TUMOR chr2 3 A T 27 3 0.1 554
#> 3 S1 TUMOR chr3 5 C A 23 3 0.115 412
#> 4 S1 TUMOR chr4 7 T C 39 10 0.204 932
#> 5 S1 TUMOR chr5 8 A C 59 3 0.0484 945
#> 6 S1 TUMOR chr6 9 C A 25 2 0.0741 500
#> 7 S1 TUMOR chr7 10 A C 35 2 0.0541 870
#> 8 S1 TUMOR chrX 11 A T 32 1 0.0303 893
#> 9 S1 TUMOR chrY 12 T A 62 3 0.0462 740
#> 10 S2 TUMOR chr1 1 T G 40 2 0.0476 1000
#> 11 S2 TUMOR chr2 3 A T 27 3 0.1 554
#> 12 S2 TUMOR chr3 5 C A 23 3 0.115 412
#> 13 S2 TUMOR chr4 7 T C 39 10 0.204 932
#> 14 S2 TUMOR chr5 8 A C 59 3 0.0484 945
#> 15 S2 TUMOR chr6 9 C A 25 2 0.0741 500
#> 16 S2 TUMOR chr7 10 A C 35 2 0.0541 870
#> 17 S2 TUMOR chrX 11 A T 32 1 0.0303 893
#> 18 S2 TUMOR chrY 12 T A 62 3 0.0462 740
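With many files, typing the vector names by hand becomes tedious. Below is a minimal base R sketch for deriving them from the file names. The paths are hypothetical, and the assumption that the vector names are what ends up in patient_id is suggested by the output above rather than confirmed here.

```r
# Hypothetical paths; in practice these could come from list.files().
vcf_paths <- c(
  "results/S1.somatic.snvs.vcf.gz",
  "results/S2.somatic.snvs.vcf.gz"
)

# Strip the directory and the Strelka suffix to get one label per file.
labels <- sub("\\.somatic\\.snvs\\.vcf\\.gz$", "", basename(vcf_paths))

# A named character vector with the same structure as `files` above.
named_paths <- setNames(vcf_paths, labels)
names(named_paths)
```

The resulting named_paths vector could then be passed as path in place of a hand-written c(S1 = ..., S2 = ...).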
Reading all files from a directory
strelka_dir <- system.file("extdata", "Strelka", package = "readthis")
read_strelka_somatic_snvs(strelka_dir, verbose = FALSE)
#> # A tibble: 18 × 10
#> patient_id sample_id chrom pos ref alt ref_reads alt_reads VAF DP
#> <chr> <chr> <chr> <int> <chr> <chr> <int> <int> <dbl> <int>
#> 1 S1 TUMOR chr1 1 T G 40 2 0.0476 1000
#> 2 S1 TUMOR chr2 3 A T 27 3 0.1 554
#> 3 S1 TUMOR chr3 5 C A 23 3 0.115 412
#> 4 S1 TUMOR chr4 7 T C 39 10 0.204 932
#> 5 S1 TUMOR chr5 8 A C 59 3 0.0484 945
#> 6 S1 TUMOR chr6 9 C A 25 2 0.0741 500
#> 7 S1 TUMOR chr7 10 A C 35 2 0.0541 870
#> 8 S1 TUMOR chrX 11 A T 32 1 0.0303 893
#> 9 S1 TUMOR chrY 12 T A 62 3 0.0462 740
#> 10 S2 TUMOR chr1 1 T G 40 2 0.0476 1000
#> 11 S2 TUMOR chr2 3 A T 27 3 0.1 554
#> 12 S2 TUMOR chr3 5 C A 23 3 0.115 412
#> 13 S2 TUMOR chr4 7 T C 39 10 0.204 932
#> 14 S2 TUMOR chr5 8 A C 59 3 0.0484 945
#> 15 S2 TUMOR chr6 9 C A 25 2 0.0741 500
#> 16 S2 TUMOR chr7 10 A C 35 2 0.0541 870
#> 17 S2 TUMOR chrX 11 A T 32 1 0.0303 893
#> 18 S2 TUMOR chrY 12 T A 62 3 0.0462 740
readthis also contains functions for bulk reading of output files from other programs. For the full list of functions implemented in the package, see the Reference page.