Title: | Tools to Create Gene Sets |
---|---|
Description: | A set of functions to create SQL tables of gene and SNP information and compose them into a SNP Set, for example to export to a PLINK set. |
Authors: | Chanhee Yi, Alexander Sibley, and Kouros Owzar |
Maintainer: | Alexander Sibley <[email protected]> |
License: | GPL-3 |
Version: | 0.18.2 |
Built: | 2024-11-12 04:56:47 UTC |
Source: | https://github.com/cran/snplist |
A set of functions to create SQL tables of gene and SNP information and compose them into a SNP Set, for example for use with the RSNPset package, or to export to a PLINK set.
Package: | snplist |
Type: | Package |
Version: | 0.18.1 |
Date: | 2017-12-11 |
License: | GPL-3 |
Please see the example function calls below, or refer to the individual function documentation or the included vignette for more information.
Authors: Chanhee Yi, Alexander Sibley, and Kouros Owzar
Maintainer: Alexander Sibley <[email protected]>
RSQLite, Rcpp
    # Simulation settings: chromosomes, table sizes, and coordinate ranges
    chromosome <- c(1,5,22,"X","Y","MT")
    geneNum <- 5
    snpNum <- 1200
    annoDataNum <- 500
    chrLength <- 1000
    geneLength <- 100

    # Gene table: name, chromosome, start, and end
    gene <- paste("gene",1:geneNum,sep="")
    chr <- sample(chromosome,geneNum,replace=TRUE)
    start <- sample(chrLength,geneNum,replace=TRUE)
    d <- sample(geneLength,geneNum,replace=TRUE)
    end <- start+d
    geneInfo <- data.frame(gene,chr,start,end)

    # SNP table: rsID, chromosome, and position
    rsid <- paste("rs",1:snpNum,sep="")
    chr <- sample(chromosome,snpNum,replace=TRUE)
    pos <- sample(chrLength+geneLength,snpNum,replace=TRUE)
    snpInfo <- data.frame(rsid,chr,pos)

    # Annotation table: a random subset of the rsIDs
    annoInfo <- data.frame("rsid"=sample(rsid,annoDataNum))

    dim(geneInfo)
    dim(snpInfo)
    dim(annoInfo)

    ## Not run:
    setGeneTable(geneInfo)
    setSNPTable(snpInfo)
    geneset <- makeGeneSet(annoInfo)
    exportPLINKSet(geneset,"geneSet.set")
    file.show("geneSet.set")
    ## End(Not run)
Simple function using Rcpp to write the gene set to a file in the PLINK set format.
exportPLINKSet(geneSets, fname)
geneSets | An object created by the |
fname | The name of the PLINK file to be created. |
A Boolean indicating if the file was successfully written.
    # Please see the vignette or the package description
    # for an example of using this function.
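The PLINK set layout lists each set name, then its member SNPs one per line, then `END`, with a blank line between sets. The following is a minimal base-R sketch of writing that layout. `writePlinkSetSketch` is a hypothetical helper for illustration only; it is not part of snplist and does not reproduce the package's Rcpp implementation.

```r
# Hypothetical helper (not part of snplist): writes a named list of rsID
# vectors as set name, member SNPs, END, and a blank separator line.
writePlinkSetSketch <- function(geneSets, fname) {
  stopifnot(is.list(geneSets), !is.null(names(geneSets)))
  lines <- unlist(
    Map(function(gene, snps) c(gene, as.character(snps), "END", ""),
        names(geneSets), geneSets),
    use.names = FALSE
  )
  # Report success as a Boolean; a write error becomes FALSE
  tryCatch({
    writeLines(lines, fname)
    TRUE
  }, error = function(e) FALSE)
}

# SNP sets derived from the BRCA1/BRCA2 coordinates used elsewhere in this manual
geneSets <- list(BRCA1 = c("rs8176273", "rs8176265"),
                 BRCA2 = c("rs9562605", "rs1799943"))
outFile <- tempfile(fileext = ".set")
ok <- writePlinkSetSketch(geneSets, outFile)
written <- readLines(outFile)
```

Inspecting `written` shows one block per gene, which is a quick way to check a set file before passing it to PLINK.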
A function leveraging the biomaRt package to retrieve the chromosome, start position, and end position of each gene from Ensembl.
getBioMartData(genes,verbose=FALSE,...)
genes | A vector of gene names matching |
verbose | A Boolean indicating whether to output the function's progress in terms of the dimensions of the |
... | Additional arguments passed on to the internal call to |
A data.frame object with columns 'gene', 'chr', 'start', and 'end', suitable for input to the setGeneTable function.
At the time of package release, the BioMart community portal was temporarily unavailable. See www.biomart.org for updated status or more information. To access alternative hosts, pass additional arguments to the internal call to biomaRt::useMart(...), as in the second example below.
Durinck S., Spellman P.T., Birney E. and Huber W. (2009) Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt, Nature Protocols, 4, 1184–1191.
    ## Not run:
    # Query the default host
    getBioMartData(c("BRCA1","BRCA2"))

    # Query an alternative host; the extra arguments are passed on to biomaRt::useMart()
    getBioMartData(c("BRCA1","BRCA2"), host="www.ensembl.org", biomart="ENSEMBL_MART_ENSEMBL", dataset="hsapiens_gene_ensembl")
    ## End(Not run)
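The returned data.frame has the same layout as the input to setGeneTable. The sketch below shows that reshaping step on a mock result, so it runs without network access. The source column names (`hgnc_symbol`, `chromosome_name`, `start_position`, `end_position`) are Ensembl attribute names and are an assumption here; this manual does not state which attributes getBioMartData queries internally.

```r
# Mock of a biomaRt-style query result. The column names are assumed Ensembl
# attribute names; the coordinates are the BRCA1/BRCA2 values used in the
# setGeneTable example of this manual.
bmResult <- data.frame(
  hgnc_symbol     = c("BRCA1", "BRCA2"),
  chromosome_name = c("17", "13"),
  start_position  = c(41196312, 32889611),
  end_position    = c(41277500, 32973805),
  stringsAsFactors = FALSE
)

# Rename to the column layout expected by setGeneTable: gene, chr, start, end
geneInfo <- data.frame(
  gene  = bmResult$hgnc_symbol,
  chr   = bmResult$chromosome_name,
  start = bmResult$start_position,
  end   = bmResult$end_position,
  stringsAsFactors = FALSE
)
```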
This function uses existing SQLite tables (from setGeneTable and setSNPTable) to make SNP sets. The SNP Set for each gene is the collection of SNPs located either between the start and end locations of the gene, or within a specified neighborhood around the gene. The SNP Sets are stored in the SQLite database, and returned as a list object.
makeGeneSet(annoInfo=NULL,margin=0,annoTable='anno',geneTable='gene', allTable='allchrpos',db='snplistdb',dbCleanUp=FALSE)
annoInfo | A |
margin | A number indicating the size of the neighborhood (in base pairs) surrounding a gene's start and end positions in which a SNP will be included in that gene's SNP set. Default is 0. |
annoTable | A string indicating the name of the SQLite table for the rsIDs from |
geneTable | Name of the SQLite table containing chromosome, start and end positions for each gene, as previously created by |
allTable | Name of the SQLite table containing chromosome and position for each SNP, as previously created by |
db | Name of the SQLite database in which to find the gene and SNP tables and create the SNP set table. Default is 'snplistdb'. |
dbCleanUp | Boolean indicating if the tables and views created by the function should be dropped after the SNP set is returned. Default is FALSE. |
Note: This function relies on the prior execution of the setGeneTable and setSNPTable functions and the SQLite database and tables they create. If the table or db argument in either of those functions is changed from the default value, it must also be changed here.
Returns a list of SNP sets of the form:
<gene name> | Vector of rsIDs of SNPs within <gene> (or the neighborhood around it) |
setGeneTable, setSNPTable, snplist-package
    # Please see the vignette or the package description
    # for an example of using this function.
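The selection rule described for makeGeneSet (SNPs on the gene's chromosome that fall between start and end, widened by margin) can be illustrated in base R without a database. `makeGeneSetSketch` is a hypothetical stand-in, not the package's SQLite implementation. Two details are assumptions: the interval bounds are treated as inclusive, and annoInfo is treated as a data.frame whose `rsid` column restricts the candidate SNPs. The `rsHypothetical1` entry is made up to show the margin effect.

```r
# Gene and SNP coordinates from the setGeneTable and setSNPTable examples
geneInfo <- data.frame(
  gene = c("BRCA1", "BRCA2"), chr = c("17", "13"),
  start = c(41196312, 32889611), end = c(41277500, 32973805),
  stringsAsFactors = FALSE)
snpInfo <- data.frame(
  chr = c("17", "17", "13", "13"),
  pos = c(41211653, 41213996, 32890026, 32890572),
  rsid = c("rs8176273", "rs8176265", "rs9562605", "rs1799943"),
  stringsAsFactors = FALSE)

# Hypothetical in-memory version of the SNP set rule
makeGeneSetSketch <- function(geneInfo, snpInfo, annoInfo = NULL, margin = 0) {
  # Assumption: annoInfo limits the SNPs considered to its rsid column
  if (!is.null(annoInfo)) {
    snpInfo <- snpInfo[snpInfo$rsid %in% annoInfo$rsid, , drop = FALSE]
  }
  sets <- lapply(seq_len(nrow(geneInfo)), function(i) {
    g <- geneInfo[i, ]
    # Assumption: bounds are inclusive on both sides
    inGene <- snpInfo$chr == g$chr &
      snpInfo$pos >= g$start - margin &
      snpInfo$pos <= g$end + margin
    snpInfo$rsid[inGene]
  })
  names(sets) <- geneInfo$gene
  sets
}

sets <- makeGeneSetSketch(geneInfo, snpInfo)

# A made-up SNP 100 bp before the BRCA2 start is included only when margin >= 100
extra <- rbind(snpInfo,
               data.frame(chr = "13", pos = 32889511, rsid = "rsHypothetical1",
                          stringsAsFactors = FALSE))
setsNoMargin <- makeGeneSetSketch(geneInfo, extra)
setsMargin100 <- makeGeneSetSketch(geneInfo, extra, margin = 100)
```

The result has the documented shape: a list named by gene, each element a vector of rsIDs.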
Takes a data.frame object with columns 'gene', 'chr', 'start', and 'end', and creates an SQLite table of the information. Returns a count of the number of genes in the table.
setGeneTable(geneInfo,table='gene',db='snplistdb')
geneInfo | A |
table | Name of the SQLite table to be created. Default is 'gene'. |
db | Name of the SQLite database in which to create |
Count of genes included in table.
    # Gene name, chromosome, start, and end for BRCA1 and BRCA2
    geneInfo <- cbind(c('BRCA1','BRCA2'),c(17,13),c(41196312,32889611),c(41277500,32973805))
    colnames(geneInfo) <- c('gene','chr','start','end')

    ## Not run:
    setGeneTable(as.data.frame(geneInfo))
    ## End(Not run)
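The example above builds geneInfo with cbind(), which coerces mixed character and numeric vectors to a character matrix, so as.data.frame() yields character columns for 'start' and 'end'. The sketch below shows an alternative that keeps the positions numeric by building the data.frame directly. Whether setGeneTable depends on the column types is not stated in this manual, so treat this as an optional precaution rather than a requirement.

```r
# As in the package example: cbind() produces a character matrix
geneMat <- cbind(c("BRCA1", "BRCA2"), c(17, 13),
                 c(41196312, 32889611), c(41277500, 32973805))
colnames(geneMat) <- c("gene", "chr", "start", "end")
fromMatrix <- as.data.frame(geneMat, stringsAsFactors = FALSE)

# Building the data.frame directly keeps start and end numeric
geneInfo <- data.frame(
  gene  = c("BRCA1", "BRCA2"),
  chr   = c("17", "13"),
  start = c(41196312, 32889611),
  end   = c(41277500, 32973805),
  stringsAsFactors = FALSE
)

# Compare the column classes of the two constructions
matrixClasses <- sapply(fromMatrix, class)
directClasses <- sapply(geneInfo, class)
```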
Takes a file or data.frame object with columns 'chr', 'pos', and 'rsid', and creates an SQLite table of the information. Returns a count of the number of SNPs in the table.
setSNPTable(snpInfo,table='allchrpos',db='snplistdb')
snpInfo | A |
table | Name of the SQLite table to be created. Default is 'allchrpos'. |
db | Name of the SQLite database in which to create |
Count of SNPs included in table.
    # Chromosome, position, and rsID for four SNPs
    snpInfo <- cbind(c(17,17,13,13),
                     c(41211653, 41213996, 32890026,32890572),
                     c("rs8176273","rs8176265","rs9562605","rs1799943"))
    colnames(snpInfo) <- c('chr','pos','rsid')

    ## Not run:
    setSNPTable(as.data.frame(snpInfo))
    ## End(Not run)