\HeaderA{readGEOAnn}{Function to extract data from the GEO web site}{readGEOAnn}
\aliasA{getGPLNames}{readGEOAnn}{getGPLNames}
\aliasA{getSAGEFileInfo}{readGEOAnn}{getSAGEFileInfo}
\aliasA{getSAGEGPL}{readGEOAnn}{getSAGEGPL}
\aliasA{readIDNAcc}{readGEOAnn}{readIDNAcc}
\aliasA{readUrl}{readGEOAnn}{readUrl}
\keyword{manip}{readGEOAnn}
\begin{Description}\relax
Data files available at the GEO web site are identified by GEO
accession numbers. Given the URL of the CGI script at GEO and
a GEO accession number, these functions extract data from the web site
and return a matrix containing the data.
\end{Description}
\begin{Usage}
\begin{verbatim}
readGEOAnn(GEOAccNum, url = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?")
readIDNAcc(GEOAccNum, url = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?")
getGPLNames(url = "http://www.ncbi.nlm.nih.gov/geo/query/browse.cgi?")
getSAGEFileInfo(url =
                       "http://www.ncbi.nlm.nih.gov/geo/query/browse.cgi?view=platforms&prtype=SAGE&dtype=SAGE")
getSAGEGPL(organism = "Homo sapiens", enzyme = c("NlaIII", "Sau3A"))
readUrl(url)
\end{verbatim}
\end{Usage}
\begin{Arguments}
\begin{ldescription}
\item[\code{url}] the URL of the CGI script at GEO
\item[\code{GEOAccNum}] a character string giving the GEO
accession number of the desired file (e.g., GPL97)
\item[\code{organism}] a character string giving the name of the
organism of interest
\item[\code{enzyme}] a character string, either NlaIII or Sau3A,
naming the enzyme used to create the SAGE tags
\end{Arguments}
\begin{Details}\relax
\code{url} is the URL of the CGI script that processes the user's
request. \code{\LinkA{readGEOAnn}{readGEOAnn}} invokes the CGI script by passing it a GEO
accession number and then processes the data file it obtains.
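
As a rough sketch of the first step, the request URL could be assembled
as shown here. The \code{acc} parameter carries the accession number; the
additional parameters \code{targ}, \code{form} and \code{view}, as well as
the helper name \code{buildGEOQuery}, are assumptions made for
illustration, and the string that \code{readGEOAnn} actually builds may
differ. No network access is needed to run the sketch.
\begin{verbatim}
# Hypothetical helper (not part of the package): compose the request URL.
buildGEOQuery <- function(GEOAccNum,
        url = "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?") {
    # Assumed query parameters; readGEOAnn may use different ones.
    paste0(url, "acc=", GEOAccNum, "&targ=self&form=text&view=data")
}

# Build (but do not fetch) the query for platform GPL96
query <- buildGEOQuery("GPL96")
\end{verbatim}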

\code{\LinkA{readIDNAcc}{readIDNAcc}} calls \code{\LinkA{readGEOAnn}{readGEOAnn}} to read the
data and then extracts the columns for probe ids and accession numbers.
The \code{GEOAccNum} has to be the id of an Affymetrix chip.
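
The column extraction can be illustrated on a small in-memory matrix
that stands in for the result of \code{readGEOAnn}. The column names
\code{ID} and \code{GB\_ACC} and all values are placeholders assumed for
this sketch; they are not taken from the package or from a real GEO file.
\begin{verbatim}
# Toy stand-in for the matrix returned by readGEOAnn(); values are made up.
ann <- matrix(c("probe_1", "ACC0001", "geneA",
                "probe_2", "ACC0002", "geneB"),
              ncol = 3, byrow = TRUE,
              dimnames = list(NULL, c("ID", "GB_ACC", "SYMBOL")))

# Keep only the probe id and accession number columns.
# drop = FALSE ensures the result is still a matrix.
idAcc <- ann[, c("ID", "GB_ACC"), drop = FALSE]
\end{verbatim}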

\code{\LinkA{getGPLNames}{getGPLNames}} parses the HTML page that lists GEO
accession numbers together with descriptions of the arrays they
represent.
\end{Details}
\begin{Value}
Both \code{\LinkA{readGEOAnn}{readGEOAnn}} and \code{\LinkA{readIDNAcc}{readIDNAcc}} return a
matrix.

\code{\LinkA{getGPLNames}{getGPLNames}} returns a named vector of the names of
commercial arrays. The names of the vector are the corresponding GEO
accession numbers.
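
A vector of this shape can be searched by array name to recover an
accession number. The sketch below uses a hand-made vector in place of
a real call; the short description strings are illustrative, and the
text returned by \code{getGPLNames} may be worded differently.
\begin{verbatim}
# Toy stand-in for the result of getGPLNames(); descriptions are illustrative.
gplNames <- c(GPL96 = "HG-U133A", GPL97 = "HG-U133B")

# Find the accession number whose description mentions U133A
acc <- names(gplNames)[grep("U133A", gplNames, fixed = TRUE)]
\end{verbatim}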
\end{Value}
\begin{Note}\relax
These functions are part of the Bioconductor project at the Dana-Farber
Cancer Institute, which provides bioinformatics functionality through R.
\begin{Author}\relax
Jianhua Zhang
\end{Author}
\begin{References}\relax
\url{http://www.ncbi.nlm.nih.gov/geo}
\end{References}
\begin{Examples}
\begin{ExampleCode}
# Get array names and GEO accession numbers
#geoAccNums <- getGPLNames()
# Read the annotation data file for HG-U133A which is GPL96 based on
# examining geoAccNums 
#temp <- readGEOAnn(GEOAccNum = "GPL96")
#temp2 <- readIDNAcc(GEOAccNum = "GPL96")
\end{ExampleCode}
\end{Examples}


