readGEOAnn              package:annotate              R Documentation

Function to extract data from the GEO web site

Description:

     Data files available at the GEO web site are identified by GEO
     accession numbers. Given the url for the CGI script at GEO and a
     GEO accession number, the functions extract data from the web
     site and return a matrix containing the data.

Usage:

     readGEOAnn(GEOAccNum, url = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?")
     readIDNAcc(GEOAccNum, url = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?")
     getGPLNames(url = "http://www.ncbi.nlm.nih.gov/geo/query/browse.cgi?")
     getSAGEFileInfo(url = "http://www.ncbi.nlm.nih.gov/geo/query/browse.cgi?view=platforms&prtype=SAGE&dtype=SAGE")
     getSAGEGPL(organism = "Homo sapiens", enzyme = c("NlaIII", "Sau3A"))
     readUrl(url)

Arguments:

     url: the url for the CGI script at GEO

GEOAccNum: a character string for the GEO accession number of a
          desired file (e.g. GPL97)

organism: a character string for the name of the organism of interest

  enzyme: a character string, either NlaIII or Sau3A, for the enzyme
          used to create SAGE tags

Details:

     'url' is the CGI script that processes the user's request.
     'readGEOAnn' invokes the CGI by passing a GEO accession number
     and then processes the data file obtained.

     'readIDNAcc' calls 'readGEOAnn' to read the data and then
     extracts the columns for probe ids and accession numbers. The
     'GEOAccNum' has to be the id for an Affymetrix chip.

     'getGPLNames' parses the html file that lists GEO accession
     numbers and descriptions of the arrays represented by the
     corresponding GEO accession numbers.

Value:

     Both 'readGEOAnn' and 'readIDNAcc' return a matrix.

     'getGPLNames' returns a named vector of the names of commercial
     arrays. The names of the vector are the corresponding GEO
     accession numbers.

Note:

     This function is part of the Bioconductor project at Dana-Farber
     Cancer Institute to provide Bioinformatics functionalities
     through R.

Author(s):

     Jianhua Zhang

Examples:

     # Get array names and GEO accession numbers
     #geoAccNums <- getGPLNames()
     # Read the annotation data file for HG-U133A, which is GPL96 based on
     # examining geoAccNums
     #temp <- readGEOAnn(GEOAccNum = "GPL96")
     #temp2 <- readIDNAcc(GEOAccNum = "GPL96")
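
The examples above need network access to GEO. The following is a minimal offline sketch of how the returned objects might be handled, using toy stand-ins instead of real downloads. The column names "ID" and "GB_ACC", the placeholder probe ids and accession values, and the shortened array names are assumptions made for illustration; they are not taken from the package.

```r
# Toy stand-in for the matrix returned by readGEOAnn("GPL96").
# Column names and cell values are hypothetical placeholders.
ann <- matrix(
    c("probe_1", "ACC0001", "descr_1",
      "probe_2", "ACC0002", "descr_2",
      "probe_3", "ACC0003", "descr_3"),
    ncol = 3, byrow = TRUE,
    dimnames = list(NULL, c("ID", "GB_ACC", "Description"))
)

# Keep only the probe id and accession number columns, which is what
# readIDNAcc is described as doing. drop = FALSE keeps the result a matrix.
idNAcc <- ann[, c("ID", "GB_ACC"), drop = FALSE]

# Toy stand-in for the named vector returned by getGPLNames():
# names are GEO accession numbers, values are (shortened) array names.
gplNames <- c(GPL96 = "HG-U133A", GPL97 = "HG-U133B")

# Find the GEO accession number that corresponds to an array name
acc <- names(gplNames)[gplNames == "HG-U133A"]
print(acc)
```

With a live connection, 'acc' found this way could be passed as 'GEOAccNum' to 'readGEOAnn' or 'readIDNAcc'.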