GOCompare R package v1.0.2.1

Description:

Installation

GOCompare can be installed as follows:

# CRAN
install.packages("GOCompare")
# Alternative: GitHub
library(devtools)
remotes::install_github("ccsosa/GOCompare")

The full list of libraries required by the package is given below.

Depends: R (>= 4.0.0)

Imports: base, utils, methods, stats, grDevices, ape, vegan, ggplot2, ggrepel, igraph, parallel, stringr

Suggests: testthat
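
Before installing, it can help to see which of the imported packages are still missing from the local library. The sketch below uses only base R; the vector `imports` is copied from the list above, and the names `installed` and `missing_pkgs` are illustrative choices, not part of GOCompare.

```r
# Packages imported by GOCompare, as listed above
imports <- c("base", "utils", "methods", "stats", "grDevices", "ape", "vegan",
             "ggplot2", "ggrepel", "igraph", "parallel", "stringr")

# requireNamespace() returns FALSE when a package is absent;
# it loads the namespace but does not attach the package
installed <- vapply(imports, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- imports[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All imported packages are available.")
}
```

Installing from CRAN with install.packages normally resolves these dependencies automatically, so a check like this is mostly useful when installing from GitHub.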

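Usage

This R package provides six functions that form a simple workflow for comparing the results of functional enrichment analyses.

The functions mostFrequentGOs and graphGOspecies are designed to analyze a single species.
The functions compareGOspecies, graph_two_GOspecies, evaluateCAT_species, and evaluateGO_species let the user compare the GO term lists of two species across the categories of interest.
Finally, the following test datasets are provided with the package: A_thaliana, A_thaliana_compress, H_sapiens, H_sapiens_compress, comparison_example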

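Function usage scheme

Data input

Functional enrichment analysis results

The main input is two data frames containing functional enrichment analysis results from your preferred resource, such as BiNGO, AmiGO, ShinyGO, or topGO. Each file must have the following structure:

A data.frame of functional enrichment analysis results with one column containing the GO terms to analyze and one column containing the categories to compare
Depending on the function, the species names must be specified, e.g. species1 = "H_sapiens" and species2 = "A_thaliana"
A field with the name of the column that contains the GO terms to analyze must be provided (e.g., GOterm_field <- "Functional_Category")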

Functional_Category	feature
Response to stress	AID
Defense response	AID
Regulation of cell size	AID
Defense response	AIM
Response to external biotic stimulus	DCE
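
The input structure described above can be sketched with a small hand-made data frame. The values are illustrative and modeled on the table above; the category column is assumed to be named `feature`, as in the workflow code further down, and `enrich_df` is a hypothetical object name.

```r
# Toy enrichment result with a GO term column and a category column
enrich_df <- data.frame(
  Functional_Category = c("Response to stress", "Defense response",
                          "Regulation of cell size", "Defense response",
                          "Response to external biotic stimulus"),
  feature = c("AID", "AID", "AID", "AIM", "DCE"),
  stringsAsFactors = FALSE
)

# Name of the column that holds the GO terms
GOterm_field <- "Functional_Category"

# Basic sanity checks before handing the table to the package functions
stopifnot(GOterm_field %in% colnames(enrich_df),
          "feature" %in% colnames(enrich_df))

# Number of GO terms per category
terms_per_category <- table(enrich_df$feature)
print(terms_per_category)
```

A second data frame with the same two columns would be needed for the other species before calling the two-species functions.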

Workflow

An example of how to use this package is provided below. In this example, we compare the cancer genes for ten features (hallmarks) related to the carcinogenic process with their possible orthologs in D. melanogaster. Use an appropriate program or platform to obtain the orthologous genes; here we provide an example that uses the gorth function of the gprofiler2 R package.
An alternative function to reduce redundant GO terms is available at https://github.com/ccsosa/reducereduntGO/. An example that accompanies ReducereduntGO.R is provided in the file https://github.com/ccsosa/reducereduntGO/blob/master/Cancer_hallmark_reduce_terms.R.
A plotting function for the undirected graph produced by the Categories option of GOCompare::graph_two_GOspecies is available at https://github.com/ccsosa/Supplementary-information/blob/main/CHAPTER1/PLOT_TWO_SP_GRAPH_CAT.R.
A plotting function for the undirected graph produced by the GO option of GOCompare::graph_two_GOspecies is available at https://github.com/ccsosa/Supplementary-information/blob/main/CHAPTER1/PLOT_TWO_SP_GRAPH_GO.R.
Both plotting functions take a directory and the graph returned by GOCompare as input, and save PDF files named CAT_TWO.pdf and GO_TWO.pdf, respectively. Load them with the source function and the URL:
source("https://raw.githubusercontent.com/ccsosa/Supplementary-information/refs/heads/main/CHAPTER3/PLOT_TWO_SP_GRAPH_CAT.R")
source("https://raw.githubusercontent.com/ccsosa/Supplementary-information/refs/heads/main/CHAPTER3/PLOT_TWO_SP_GRAPH_GO.R")

require(gprofiler2); require(stringr); require(GOCompare)

url_file <- "https://raw.githubusercontent.com/ccsosa/R_Examples/master/Hallmarks_of_Cancer_AT.csv"
x <- read.csv(url_file)
x[, 1] <- NULL
CH <- c("AID", "AIM", "DCE", "ERI", "EGS", "GIM", "IA", "RCD", "SPS", "TPI")


# Build one list of unique, non-empty gene names per hallmark column
x_Hsap <- lapply(seq_along(CH), function(i) {
  x_unique <- unique(na.omit(x[, i]))
  x_unique <- x_unique[which(x_unique != "")]
  x_unique <- as.list(x_unique)
  return(x_unique)
})

names(x_Hsap) <- CH

# Using as background the unique genes for the ten CH.
GOterm_field <- "term_name"
x_s <- gprofiler2::gost(query = x_Hsap,
                        organism = "hsapiens", ordered_query = FALSE,
                        multi_query = FALSE, significant = TRUE, exclude_iea = FALSE,
                        measure_underrepresentation = FALSE, evcodes = FALSE,
                        user_threshold = 0.05, correction_method = "g_SCS",
                        domain_scope = "annotated", custom_bg = unique(unlist(x_Hsap)),
                        numeric_ns = "", sources = "GO:BP", as_short_link = FALSE)

# Rename the first column (the query name) so it holds the category label
colnames(x_s$result)[1] <- "feature"

# Check number of enriched terms per category
tapply(x_s$result$feature, x_s$result$feature, length)

# Running function to get graph of a list of features and GO terms

x <- graphGOspecies(df = x_s$result,
                    GOterm_field = GOterm_field,
                    option = "Categories",
                    numCores = 1,
                    saveGraph = FALSE,
                    outdir = NULL,
                    filename = NULL)

# Visualize nodes
View(x$nodes)

# Get nodes with weights above the 95th percentile
perc <- x$nodes[which(x$nodes$WEIGHT > quantile(x$nodes$WEIGHT, probs = 0.95)), ]
# Visualize filtered nodes
View(perc)



#########

# Running function to get graph of a list of GO terms and categories

x_GO <- graphGOspecies(df = x_s$result,
                       GOterm_field = GOterm_field,
                       option = "GO",
                       numCores = 1,
                       saveGraph = FALSE,
                       outdir = NULL,
                       filename = NULL)

# Visualize nodes
View(x_GO$nodes)

# Get GO term nodes with weights above the 95th percentile
perc_GO <- x_GO$nodes[which(x_GO$nodes$GO_WEIGHT > quantile(x_GO$nodes$GO_WEIGHT, probs = 0.95)), ]

# Visualize filtered GO term nodes
View(perc_GO)


########################################################################################################
# Two-species comparison, assuming they are the same genes in Drosophila melanogaster


orth_genes <- gprofiler2::gorth(query = unique(unlist(x_Hsap)), source_organism = "hsapiens", target_organism = "dmelanogaster")

# Assigning orthologous genes to each category

x_Dmap <- list()
for (i in seq_along(x_Hsap)) {

  D_list <- list()
  for (j in seq_along(x_Hsap[[i]])) {
    # Rows of the ortholog table whose input gene matches the current gene
    x_orth <- orth_genes[orth_genes$input == x_Hsap[[i]][j], ]
    if (nrow(x_orth) > 0) {
      D_list[[j]] <- data.frame(orth = x_orth$ortholog_name)
    } else {
      D_list[[j]] <- NULL
    }
    rm(x_orth)
  }; rm(j)

  # Combine the per-gene results and drop duplicated orthologs
  D_list <- unique(do.call(rbind, D_list))
  D_list <- D_list[which(!is.null(D_list))]
  x_Dmap[[i]] <- D_list
  rm(D_list)
}; rm(i)

names(x_Dmap) <- CH


GOterm_field <- "term_name"
x_s2 <- gprofiler2::gost(query = x_Dmap,
                         organism = "dmelanogaster", ordered_query = FALSE,
                         multi_query = FALSE, significant = TRUE, exclude_iea = FALSE,
                         measure_underrepresentation = FALSE, evcodes = FALSE,
                         user_threshold = 0.05, correction_method = "g_SCS",
                         domain_scope = "annotated", custom_bg = unique(unlist(x_Dmap)),
                         numeric_ns = "", sources = "GO:BP", as_short_link = FALSE)

colnames(x_s2$result)[1] <- "feature"

# Preparing input to compare two species
x_input <- GOCompare::compareGOspecies(x_s$result, x_s2$result, GOterm_field, species1 = "H. sapiens", species2 = "D. melanogaster", paired_lists = TRUE)

# Explore similarities using hierarchical clustering

plot(hclust(x_input$distance, method = "ward.D"))

# Comparing species results

comp_species_graph <- GOCompare::graph_two_GOspecies(x_input, species1 = "H. sapiens", species2 = "D. melanogaster", option = "Categories")

# View nodes ordered by combined weight (SPS and GIM categories have more frequent GO terms co-occurring)
View(comp_species_graph$nodes[order(comp_species_graph$nodes$COMBINED_WEIGHT, decreasing = TRUE), ])

comp_species_graph_GO <- GOCompare::graph_two_GOspecies(x_input, species1 = "H. sapiens", species2 = "D. melanogaster", option = "GO")
# Get GO term nodes with weights above the 95th percentile
perc_GO_two <- comp_species_graph_GO$nodes[which(comp_species_graph_GO$nodes$GO_WEIGHT > quantile(comp_species_graph_GO$nodes$GO_WEIGHT, probs = 0.95)), ]

# Visualize GO term nodes filtered and ordered (more frequent GO terms in both species and categories)

View(perc_GO_two[order(perc_GO_two$GO_WEIGHT, decreasing = TRUE), ])


# Evaluating whether the proportions of GO terms differ between species for each category
x_CAT <- GOCompare::evaluateCAT_species(x_s$result, x_s2$result, species1 = "H. sapiens", species2 = "D. melanogaster", GOterm_field = "term_name", test = "prop")
x_CAT <- x_CAT[which(x_CAT$FDR <= 0.05), ]
# View categories with FDR <= 0.05 (RCD, SPS, GIM, AIM, ERI, DCE)

View(x_CAT)

# Evaluating whether the proportions of categories differ between species for each GO term
x_GO <- GOCompare::evaluateGO_species(x_s$result, x_s2$result, species1 = "H. sapiens", species2 = "D. melanogaster", GOterm_field = "term_name", test = "prop")
x_GO <- x_GO[which(x_GO$FDR <= 0.05), ]
# View GO terms with FDR <= 0.05 (no significant results in proportions)
View(x_GO)


## Optional plots (remove the leading # symbol and run)
# source("https://raw.githubusercontent.com/ccsosa/Supplementary-information/refs/heads/main/CHAPTER3/PLOT_TWO_SP_GRAPH_CAT.R")
# source("https://raw.githubusercontent.com/ccsosa/Supplementary-information/refs/heads/main/CHAPTER3/PLOT_TWO_SP_GRAPH_GO.R")
# plot_twosp_CAT("D:/",comp_species_graph)
# plot_twosp_GO("D:/",comp_species_graph_GO)
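
The percentile filter used several times in the workflow above can be tried without network access on a made-up node table. This is a minimal sketch: the column name `GO_WEIGHT` mimics the workflow output, while the object names `nodes`, `cutoff`, and `top_nodes` and all values are hypothetical.

```r
# Toy node table; the weights are made up for illustration
nodes <- data.frame(
  GO = paste0("term_", 1:20),
  GO_WEIGHT = seq(0.05, 1, by = 0.05),
  stringsAsFactors = FALSE
)

# 95th percentile of the weights
cutoff <- quantile(nodes$GO_WEIGHT, probs = 0.95)

# Keep only nodes strictly above the cutoff, then sort by decreasing weight
top_nodes <- nodes[which(nodes$GO_WEIGHT > cutoff), ]
top_nodes <- top_nodes[order(top_nodes$GO_WEIGHT, decreasing = TRUE), ]
print(top_nodes)
```

Because the comparison is strict, only nodes above the interpolated 95th percentile are kept, so a small table may retain a single row.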

Authors

Main: Christian C. Sosa, Diana Carolina Clavijo-Buriticá, Mauricio Quimbaya, Victor Hugo García-Merchán

Other contributors: Nicolas López-Rozo, Camila Riccio Rengifo, David Arango Londoño, Maria Victoria Diaz

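References

Sosa, Christian C., Diana Carolina Clavijo-Buriticá, Victor Hugo García-Merchán, Nicolas López-Rozo, Camila Riccio-Rengifo, Maria Victoria Diaz, David Arango Londoño, and Mauricio Alberto Quimbaya. «GOCompare: An R package to compare functional enrichment analysis between two species». Genomics 115, n.º 1 (January 2023): 110528. https://doi.org/10.1016/j.ygeno.2022.110528.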