Description
The European Bioinformatics Institute (EMBL-EBI) in Hinxton,
UK, grew out of the pioneering work by the European Molecular
Biology Laboratory (EMBL) to provide biological databases
to the research community. Today EBI is a centre for research
and services in bioinformatics, housing a wide variety of
databases of biological data, including genomic, proteomic,
biochemical, structural, interaction, and expression data,
amongst others. In addition, the EBI provides a toolbox
of bioinformatics software and search engines with which
to analyse and collate the wealth of biological data. The
EBI training workshop will provide a thorough grounding
in a selection of fundamental EBI genomic and proteomic
databases, including the appropriate analysis software.
EBI
- ArrayExpress
Description
The Microarray team at the EBI provides public resources
for storing, managing and analyzing microarray data. Its
ArrayExpress database is a public, online repository for
microarray data that conforms to the Minimum Information About
a Microarray Experiment (MIAME) standards. ArrayExpress stores both normalized
and well-annotated raw data from 40,000 hybridization studies
in more than 70 species. Data can be submitted on-line,
and password-protected access to pre-publication data is provided
for authors and reviewers. This course will examine how
to query data in ArrayExpress, deposit data, and give yourself
and other microarray community members access to your data
at the level of the experiment or the gene. ExpressionProfiler,
a tool written to conduct online microarray data analysis,
will also be introduced.
Audience
No prior experience is required.
Format
The course consists of a 1-hour lecture OR a 2-hour hands-on
workshop which duplicates the content of the lecture (see
below). Lecture attendees may bring their wireless-access
enabled computer laptops and follow the lecturer, as conditions
permit.
Lecture
Registration for the lecture is unrestricted and open
to all Stanford University faculty, staff (including
visiting faculty and staff), and graduate students. An
open consultation period with EBI instructors for lecture
participants will be available.
Hands-on Workshop
The workshop duplicates the content of the lecture, but
differs in that it provides hands-on practice of the lecture
demonstration examples. Individual consultation with the
EBI instructor will be available to workshop participants
immediately following the workshop. Workshop computer
space is limited to a single session due to instructor
availability. Register early. Registration will be "first
come, first served." A waiting list will be maintained
if necessary.
Registration
Registration is required for either the ArrayExpress lecture
or workshop.
Description
The Macromolecular Structure Database (MSD) (http://www.ebi.ac.uk/msd/)
group is one of the three partners in the worldwide Protein
Data Bank (wwPDB, http://www.wwpdb.org/), the consortium
entrusted with the collation, maintenance and distribution
of the global repository of macromolecular structure data.
Since its inception in 1996, the MSD group has worked with
partners around the world to improve the quality of PDB
data, through a clean-up programme that addresses inconsistencies
and inaccuracies in the legacy archive. The improvements
in data quality in the legacy archive have been achieved
largely through the creation of a unified data archive,
in the form of a relational database that stores all of
the data in the wwPDB. The implementation of the MSD database,
together with the parallel development of improved tools
and methodologies for data harvesting, validation and archival,
has led to significant improvements in the quality of data
that enters the archive. The MSD has developed different
query systems in order to allow users to interact with the
database and retrieve clean, consistent data for their own
use. In addition, the MSD has also developed new algorithms
and visualization methods to aid the analysis of structures,
and has improved the deposition pipeline with the latest release
of AutoDep4.0 (http://www.ebi.ac.uk/msd-srv/autodep4/),
an in-house archival and web-based structure deposition
tool. This workshop will introduce participants to various
services offered by the MSD that can help in solving research
problems.
Audience
No prior experience is required.
Format
The course consists of a 1-hour lecture OR a 2-hour hands-on
workshop which duplicates the content of the lecture (see
below). Lecture attendees may bring their wireless-access
enabled computer laptops and follow the lecturer, as conditions
permit.
Lecture
Registration for the lecture is unrestricted and open
to all Stanford University faculty, staff (including
visiting faculty and staff), and graduate students. An
open consultation period with EBI instructors for lecture
participants will be available.
Hands-on Workshop
The workshop duplicates the content of the lecture, but
differs in that it provides hands-on practice of the lecture
demonstration examples. Individual consultation with the
EBI instructor will be available to workshop participants
immediately following the workshop. Workshop computer
space is limited to a single session due to instructor
availability. Register early. Registration will be "first
come, first served." A waiting list will be maintained
if necessary.
Registration
Registration is required for either the MSD lecture
or workshop.
Description
Reactome is a collaborative project between the EBI and
the Cold Spring Harbor Laboratory, USA, consisting of a
curated knowledgebase of biological processes described
in terms of reactions, pathways, macromolecules, small molecules,
complexes and catalyst activities. These processes include
both well-defined metabolic pathways, such as glycolysis,
and a diverse range of other pathways, such as signalling
pathways and cell cycle regulation. The Reactome web interface
provides a textbook-like view of cellular processes, plus
features for keyword searching, integrating expression data
and generating protein-protein interactions. This course
describes the underlying concepts behind Reactome, and explores
the main features of the web interface using hands-on exercises.
In addition, participants will be guided through the process
of setting up their own Reactome site, and shown how to
use the curation tools available.
Audience
No prior experience is required. Anyone with an interest
in using and/or discussing in an open forum how to enhance
and/or improve bioinformatic services at Stanford is encouraged
to attend.
Format
The course consists of a 1-hour lecture followed by a
30-minute open consultation period.
Description:
EBI Web Services, built on standards such as SOAP (Simple Object
Access Protocol), are an integration technology that provides information retrieval
and data analysis tools for use with multiple databases,
without the need for manual searching through a web browser
or for installing and maintaining databases in-house. Moreover,
information retrieval and analysis are linked, enabling
the combination and manipulation of search results. Using
these technologies, programmers can build complex applications
in their language of choice (such as Java, Perl, VB, C#,
C++, Python, Ruby, R and PHP), with data access directly
from and to your code. As a result, you can submit large
and complex jobs straight from your data pipeline with all
of the results being returned directly to your software
for processing at the next stage. The number of tools and
databases at the EBI that have Web Services access is now
in double figures, making the adoption of this technology
very advantageous for any laboratory involved in data-intensive
research. You can retrieve data from multiple EBI databases
using the WSDbfetch web service; and you can make use of
sequence similarity tools such as Blast, Fasta and MPsrch
and perform multiple sequence alignments using ClustalW,
T-Coffee or MUSCLE. You can also explore the InterPro protein
families and domains database using InterProScan and perform
powerful queries over 40 different biological ontologies
via the ontology lookup service. In this lecture, the EBI
Web Services will be demonstrated, providing an overview
of what the individual Web Services can offer.
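As a rough illustration of what "data access directly from your code" means with SOAP, the sketch below builds the XML envelope a client would post to a service. Only the SOAP 1.1 envelope namespace is standard; the service namespace, the operation name fetchEntry, its parameter names and the example identifier are all placeholders chosen for this sketch, not the actual interface of WSDbfetch or any other EBI service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:dbfetch"  # hypothetical service namespace (assumption)


def build_fetch_request(query, fmt="fasta"):
    """Return a SOAP 1.1 request, as text, asking a service for one entry."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # "fetchEntry", "query" and "format" are placeholder names for this sketch
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}fetchEntry")
    ET.SubElement(operation, "query").text = query
    ET.SubElement(operation, "format").text = fmt
    return ET.tostring(envelope, encoding="unicode")


# The identifier below is a made-up example, not a recommendation
request_xml = build_fetch_request("exampledb:ID0001")
```

In practice a SOAP toolkit generates this envelope from the service's WSDL file, so a pipeline only calls a function and receives the result; the sketch just shows what travels over the wire.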
Audience:
Format: The course consists of a 1-hour
lecture/demonstration. The workshop is limited to 20 or fewer
participants to maximize interaction between instructor and
student. Register early! A waiting list will be maintained
with one additional session added, if necessary, sometime
during Bioinformatics Week.
Description
Ensembl is a joint project between EBI and the Sanger Institute,
which provides automatic annotation of large genomes and
the relationships among them. Ensembl is rapidly growing,
and currently contains annotation for over 20 species (including
human, chimpanzee, Rhesus macaque, mouse, rat, dog, cow,
opossum, chicken and zebrafish). Ensembl also
imports the manually curated data from other model organisms
to use as comparisons (i.e. Drosophila from FlyBase, C.
elegans from WormBase and yeast from SGD). Ensembl has a
unique gene analysis pipeline with each piece of genomic
information predicted and then supported by biological evidence.
Additionally, ESTs, mRNAs, SNPs, proteins and other biological
information are mapped to the genomes.
This course presents the different data types available in
the browser as a series of "Views", shows how to compare and
contrast information from several sources, and explains how
to get the most out of the Ensembl database with the
data-mining tool BioMart, a "query builder" interface which
allows users to specify genomic regions and refine results.
BioMart is a simple, distributed data integration system with
powerful genomic-based query capabilities. It can generate a
number of different types of output, including sequence and
tabulated list data. The course focuses on how BioMart can be
used by laboratory-based, non-programming researchers.
Audience
No prior experience is required. Basic familiarity with
the steps and processes involved in BLAST sequence searching
is helpful, but not necessary.
Format
The course consists of a 2-hour lecture OR a 4-hour hands-on
workshop which duplicates the content of the lecture (see
below). Lecture attendees may bring their wireless-access
enabled computer laptops and follow the lecturer, as conditions
permit.
Lecture
Registration for the lecture is unrestricted and open
to all Stanford University faculty, staff (including visiting
faculty and staff), and graduate students. An open consultation
period with EBI instructors for lecture participants will
be available Friday, May 20 immediately following the
lecture.
Hands-on Workshop
The workshop duplicates the content of the lecture, but
differs in that it provides hands-on practice of the lecture
demonstration examples. Individual consultation with the
EBI instructor will be available to workshop participants
immediately following the workshop. Workshop computer
space is limited to a single session due to instructor
availability. Register early. Registration will be "first
come, first served." A waiting list will be maintained
if necessary.
Registration
Registration is required for either the Ensembl lecture
or workshop.
A Field Guide to NCBI GenBank and NCBI Molecular Biology Resources
Description
This course, a general overview of the NCBI resources, is
designed to provide instruction on the effective use of
NCBI's databases, search service, the BLAST similarity search
engine, genome data and related resources. Databases covered
will include, but are not limited to, GenBank, RefSeq, UniGene,
variation data (SNPs) and NCBI Structures. Topics will include
NCBI's genomic resources, the NCBI assembly and annotation
process, the updated map viewer genome displays, the new
curated conserved domains and the structural protein viewer
Cn3D 4.1. Developing effective search strategies will also
be covered, including database searching with Entrez and
similarity searching at NCBI with the various versions of
BLAST.
Audience
Aimed at both novice and experienced NCBI users, the Field
Guide is continuously updated to match the changing resources
at NCBI. The course has been substantially updated for 2005.
It is designed for anyone who works with biological sequence
data including principal investigators, professional staff,
postdoctoral fellows, and graduate students.
Format
The course consists of a 3-hour lecture. An accompanying
2-hour hands-on computer workshop, instructor-led, is also
available (see below). Workshop attendees will receive hands-on
experience with NCBI resources and will have the opportunity
to apply these resources to their specific interests and/or
research areas.
Lecture
The lecture is open to all principal investigators, professional
staff, postdoctoral fellows, and graduate students. You
may register only for the lecture OR for both the lecture
and one of the computer workshop sessions.
Hands-on Workshop
You must register for the Field Guide lecture in order
to be eligible to register for one of the offered workshops.
All workshops offered are identical in content. Choose
the workshop that best fits your schedule. Workshop space
is limited due to the number of computer laboratories
available. Register early to ensure a place. Every attempt
will be made by both Stanford Libraries and NCBI to enroll
all interested. A waiting list will be maintained.
Registration
Registration is required for both the Field Guide lecture
and workshop.
Description
The goal of the Gene Ontology (GO) project is to produce
a controlled vocabulary that can be applied to all organisms
even as knowledge of gene and protein roles in cells is
accumulating and changing. GO provides three structured
networks of defined terms to describe gene product attributes:
biological process, molecular function, and cellular component.
This GO demonstration lecture is designed to teach users
about what ontologies are and how they can be used to improve
data integration in biology.
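To make the idea of a "structured network of defined terms" concrete, here is a minimal sketch of how annotations can be grouped through parent links. The term names and the links between them are invented for illustration and are not copied from the real Gene Ontology; the gene names are placeholders as well.

```python
# Toy ontology: each term lists its parent terms (assumed, simplified links).
PARENTS = {
    "biological_process": [],
    "metabolic process": ["biological_process"],
    "signaling": ["biological_process"],
    "carbohydrate metabolism": ["metabolic process"],
    "glycolysis": ["carbohydrate metabolism"],
    # A term may have more than one parent, so this is a network, not a tree
    "glucose signaling": ["signaling", "carbohydrate metabolism"],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def annotated_to(term, annotations, parents=PARENTS):
    """Gene products annotated to `term` itself or to any more specific term."""
    return sorted(
        gene for gene, assigned in annotations.items()
        if assigned == term or term in ancestors(assigned, parents)
    )


# Placeholder annotations: gene product -> most specific term assigned
example = {"geneA": "glycolysis", "geneB": "signaling", "geneC": "glucose signaling"}
metabolic_genes = annotated_to("metabolic process", example)
```

Because a query for a general term also returns gene products annotated to its more specific descendants, databases that annotate at different levels of detail can still be searched together.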
Audience
Registration for the lecture is unrestricted. No prior experience
is required. The course is designed for novice and experienced
graduate students, postdoctoral fellows and faculty/professional
staff researchers, regardless of professional discipline.
Although annotation experience is not necessary, workshop
registrants should have some background knowledge of GO
in order to receive the maximum benefit from the workshop.
Format
The course consists of a 2-hour lecture OR a 4-hour hands-on
workshop which duplicates the content of the lecture (see
below). Lecture attendees may bring their wireless-access
enabled computer laptops and follow the lecturer, as conditions
permit.
Lecture
Registration for the lecture is unrestricted and open
to all Stanford University faculty, staff (including visiting
faculty and staff), and graduate students. If you are interested
in learning about GO and what it can do, register for
the lecture. An open consultation period with EBI instructors
for lecture participants will be available beginning at
3:30 PM Friday, May 20 immediately following the last
EBI lecture for the week.
Hands-on Workshop
If you want to practice annotation and make your own GO
subsets, register for the workshop. The workshop overlaps
the lecture content, but differs in that it provides hands-on
practice of GO annotation as well as the opportunity
to create your own GO subsets. Individual consultation
with the EBI instructor will be available to workshop
participants immediately following the workshop. Workshop
computer space is limited to a single session due to instructor
availability. Register early. Registration will be "first
come, first served." A waiting list will be maintained
if necessary.
Registration
Registration is required for both the Gene Ontology lecture
and workshop.
Description
This workshop will introduce participants to major EBI
proteomic databases (UniProt, InterPro, IntAct and PRIDE)
and explore the wealth of information they
contain that can enhance your research. UniProt is a central
repository of protein sequence and function created by combining
the information in Swiss-Prot, TrEMBL and PIR, while InterPro
and IntAct provide free open source data and resources aimed
at analysing proteomes and interactomes. InterPro provides
information on the function, annotation and classification
of proteins by combining the major signature databases from
Gene3D, Pfam, PIRSF, Prints, ProDom, Prosite, Smart, SuperFamily,
Tigrfams, and PANTHER into a unified protein resource. InterPro
then adds structural annotation by mapping individual proteins
to PDB, MSD, CATH, SCOP, ModBase and SwissModel, as well
as functional annotation through GO mapping, literature
summary and references, and links to several external databases
(Pandit, CAZy, Merops, IUPHAR, COMe, Cluster, Blocks, and
Prosite doc). IntAct provides a comprehensive, annotated
data resource for protein interactions derived from literature
curation and through direct submissions. The interactions
can be viewed individually, or as part of an interaction
network, using GO mappings to help define the proteins involved
in such networks.
Audience
No prior experience is required. Basic familiarity with
the steps and processes involved in BLAST sequence searching
is helpful, but not necessary.
Format
The course consists of a 2-hour hands-on workshop. Lecture
attendees may bring their Stanford registered wireless-access
enabled computer laptops and follow the lecturer, as conditions
permit.
Hands-on Workshop
The workshop considerably duplicates the content of the
lecture, but differs in that it provides hands-on practice
of the lecture demonstration examples. Both workshops are
identical. Choose the one that best fits your schedule. Individual
consultation with the EBI instructor will be available to
workshop participants immediately following the workshop.
Workshop computer space is limited to a single session
due to instructor availability. Register early. Registration
will be "first come, first served." A
waiting list will be maintained if necessary.
Registration
Registration is required for the workshop.
Description
PRIDE provides free open source data and resources aimed
at analysing proteomes. PRIDE is a repository of protein
identification data from large-scale proteomic projects,
such as the Human Proteome Project (HUPO) and the Plasma Proteome
Project. This tutorial is suitable for both biologists and
bioinformaticians, and will focus on the type and organisation
of annotated data, the different query methods possible,
understanding the different visualisations of the data,
as well as exploring the multiple links and cross-references
available between the different databases covered and to
external databases. In addition, the tutorial will provide
information on the submission processes for IntAct and PRIDE.
Audience
No prior experience is required. Basic familiarity with
the steps and processes involved in BLAST sequence searching
is helpful, but not necessary.
Format
The course consists of a 1-hour lecture OR a 1-hour hands-on
workshop which duplicates the content of the lecture (see
below). Lecture attendees with Stanford-registered computers
may bring their wireless-access enabled laptops
and follow the lecturer, as conditions permit.
Hands-on Workshop
The workshop considerably duplicates the content of the
lecture, but differs in that it provides hands-on practice
of the lecture demonstration examples. Individual consultation
with the EBI instructor will be available to workshop participants
immediately following the workshop. Workshop computer
space is limited to a single session due to instructor availability.
Register early. Registration will be "first come,
first served." A waiting list will be maintained
if necessary.
Registration
Registration is required for either the PRIDE lecture or
workshop.
NCBI Mini-Course - Identification of Disease Genes
Description
This mini-course deals with the identification of a disease
gene using NCBI's human genome assembly. The reference genome
assembly, along with integrated maps, literature, and expression
information, comprises a powerful discovery system for exploring
candidate human disease genes. We will start with EST sequences
that might have been obtained from a patient, identify the gene(s)
expressing them, download their sequences, determine the
exon-intron structure and identify known single nucleotide
polymorphisms (SNPs) in the ESTs, if any, that may contribute
to the disease phenotype.
The course will emphasize the integration of NCBI tools such as
BLAST, Map Viewer, Single Nucleotide Polymorphism Database
(dbSNP), and Online Mendelian Inheritance in Man (OMIM).
Audience
No prior experience with any NCBI resources, including LocusLink
or Gene, is required. The course is designed for novice
and experienced graduate students, postdoctoral fellows
and faculty/professional staff researchers, regardless of
professional discipline, who have a need for more focused
instruction on locating resources related to a given gene
than provided in the Field Guide course.
Format
The course consists of a 90-minute overview demonstration-style
lecture. An accompanying 60-minute, hands-on computer workshop
is also available on a limited basis. An individual consultation
period will be provided by the NCBI instructors for workshop
attendees.
Lecture
Open to all Stanford University faculty, staff (including
visiting faculty and staff), and graduate students. You
may register only for the lecture OR for both the lecture
and one of the computer workshop sessions.
Hands-on Workshop
The workshop is highly recommended for the mini-courses.
You must register for the companion NCBI Mini-Course lecture
in order to be eligible to register for the workshop.
Workshop space is limited. Register early to ensure a
place. A waiting list will be maintained if necessary.
Registration
Registration is required for all lectures and workshops.
NCBI Mini-Course - Correlation of Disease Genes to Phenotypes
Description
This mini-course focuses on determining what is known about
a disease and the gene associated with it. It also elucidates
the biochemical and 3-D structural basis for the phenotype
caused by a single nucleotide polymorphism that results in an
altered protein.
The course will emphasize the integration of resources such as
Entrez Gene, Single Nucleotide Polymorphism Database (dbSNP),
GeneTests, Online Mendelian Inheritance in Man (OMIM), Conserved
Domain Database (CDD), and Cn3D.
Audience
No prior experience with any NCBI resources, including LocusLink
or Gene, is required. The course is designed for novice
and experienced graduate students, postdoctoral fellows
and faculty/professional staff researchers, regardless of
professional discipline, who have a need for more focused
instruction on locating resources related to a given gene
than provided in the Field Guide course.
Format
The course consists of a 90-minute overview demonstration-style
lecture. An accompanying 60-minute, hands-on computer workshop
is also available on a limited basis. An individual consultation
period will be provided by the NCBI instructors for workshop
attendees.
Lecture
Open to all Stanford University faculty, staff (including
visiting faculty and staff), and graduate students. You
may register only for the lecture OR for both the lecture
and one of the computer workshop sessions.
Hands-on Workshop
The workshop is highly recommended for the mini-courses.
You must register for the companion NCBI Mini-Course lecture
in order to be eligible to register for the workshop.
Workshop space is limited. Register early to ensure a
place. A waiting list will be maintained if necessary.
Registration
Registration is required for all lectures and workshops.
NCBI Mini-Course - Making Sense of DNA and Protein Sequences
Description
This mini-course will provide an introduction to protein sequence
analysis, beginning with the prediction of a gene within a DNA
sequence and ending with a model 3-D protein structure.
Topics to be covered include:
A. Exon prediction and the generation of a protein sequence
B. Protein function prediction
C. Identification of the characteristic domains and motifs in the protein
D. Mapping of the protein sequence onto the structure of a protein with similar sequence
The course will also address how the tools can be applied to the
analysis of a microbial DNA sequence.
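Topic A ends with "the generation of a protein sequence" from predicted exons. A minimal sketch of that last step is shown below, assuming the exons have already been predicted, are given in order on the coding strand, and use the standard genetic code. The function name and the example sequences are made up for illustration; this is not how any particular NCBI tool is implemented.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_exons(exons):
    """Join exon sequences in order and translate up to the first stop codon."""
    cds = "".join(exons).upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks an ambiguous codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Two made-up exons whose junction falls inside a codon
peptide = translate_exons(["ATGGC", "CTGGTAA"])
```

Joining the exons before translating matters because an exon boundary can split a codon, as it does in the example.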
Audience
No prior experience with any NCBI resources, including LocusLink
or Gene, is required. The course is designed for novice
and experienced graduate students, postdoctoral fellows
and faculty/professional staff researchers, regardless of
professional discipline, who have a need for more focused
instruction on locating resources related to a given gene
than provided in the Field Guide course.
Format
The course consists of a 90-minute overview demonstration-style
lecture. An accompanying 60-minute, hands-on computer workshop
is also available on a limited basis. An individual consultation
period will be provided by the NCBI instructors for workshop
attendees.
Lecture
Open to all Stanford University faculty, staff (including
visiting faculty and staff), and graduate students. You
may register only for the lecture OR for both the lecture
and one of the computer workshop sessions.
Hands-on Workshop
The workshop is highly recommended for the mini-courses.
You must register for the companion NCBI Mini-Course lecture
in order to be eligible to register for the workshop.
Workshop space is limited. Register early to ensure a
place. A waiting list will be maintained if necessary.
Registration
Registration is required for all lectures and workshops.
Bob Kuhn
Overview of the UCSC Genome Browser
This presentation will focus on the Genome Browser main display,
its controls, navigation and configuration, as well as the Known
Genes track and its companion modules, the Gene Sorter and
Proteome Browser. Finally, the Custom Track utility for displaying
user-generated data on the Browser will be demonstrated along with
the Table Browser, which is used for data filtering and
intersections between tables and custom tracks.
Heather Trumbower
Under the Hood: Using the code behind the browser
The focus of this talk will be using the UCSC source code. This
includes how to obtain the source tree and build it. Important
utilities and libraries will be highlighted, and the sequence
formats (nib and 2bit) will be reviewed. The talk will also cover
the basics of setting up a mirror site and a description of how we
document our data processing (makeDocs). If time allows, the
cluster management tool 'parasol' and the GenBank update process
will be reviewed.
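For readers unfamiliar with the 2bit idea mentioned in this abstract, the sketch below packs four bases into each byte. The base-to-bits assignment (T=00, C=01, A=10, G=11, first base in the highest bits) follows the UCSC format description as best recalled here and should be checked against the official specification. A real .2bit file also carries a header, a sequence index, and records for N runs and soft-masked regions, none of which are handled in this sketch.

```python
# Assumed bit assignment; verify against the UCSC .2bit specification
CODES = {"T": 0b00, "C": 0b01, "A": 0b10, "G": 0b11}
LETTERS = "TCAG"  # index in this string equals the 2-bit code


def pack(seq):
    """Pack an unambiguous DNA string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4].upper()
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        byte <<= 2 * (4 - len(chunk))  # pad a short final chunk with zero bits
        out.append(byte)
    return bytes(out)


def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(LETTERS[(byte >> shift) & 0b11])
    return "".join(bases[:length])


packed = pack("GATTACA")
restored = unpack(packed, 7)
```

The sequence length has to be stored separately, because the padding bits in the last byte are indistinguishable from real T bases.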
Daryl Thomas
Comparative Genomics at UCSC
This presentation will focus on comparative genomics: the methods
and applications. The motivation supporting comparative genomics
and the power of comparing many species will be reviewed, and
simple examples of pairwise alignments will be followed by
chaining and netting. Multiple species alignments will be briefly
described, and examples of their use in conservation analysis
(phastCons), exon prediction (exoniphy), microRNA structure
prediction (evoFold), and regulatory potential (RP) will be shown.
Audience
Registration unrestricted.
Format
Two-hour demonstration lecture only. Individual consultation
with the instructor will be available to workshop participants
immediately following the workshop. Workshop computer space
is limited to a single session due to instructor availability!
Register early. A waiting list will be maintained if necessary.
Registration
Registration is required for the lecture.
Lane Medical Library - Summarizing Candidate Gene Data using SOAP and Excel
Description
Many bioinformatics databases support SOAP or Internet SQL
querying. Such services are a boon to researchers who would
otherwise be limited to querying these databases via the
Web, with the concomitant difficulty in extracting the relevant
data into a format more amenable to summarization, such
as that provided by spreadsheet programs. A use case will
be presented which involves a simple Perl application that
queries the MeSH and PubMed databases using SOAP. Its purpose
is to characterize what is known about a collection of candidate
genes in order to facilitate the optimal selection of genes
to be genotyped. The querying tool executes a large number
of queries whose results are summarized automatically for
visual inspection using simple Excel functions. Because
this approach (direct querying and data manipulation within
Excel) is both simple and generic, it can be applied to
many other types of data beyond those provided by NCBI.
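The workflow described above (run many queries, then tabulate the results for a spreadsheet) can be sketched as follows. The talk itself uses Perl; this sketch uses Python and leaves out the remote call entirely. The function names, the gene and topic labels, and the counts returned by the stand-in query function are all placeholders, not real data or a real NCBI interface.

```python
import csv
import io


def summarize(genes, topics, count_hits):
    """Build CSV text with one row per gene and one column per topic."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["gene"] + list(topics))
    for gene in genes:
        writer.writerow([gene] + [count_hits(gene, topic) for topic in topics])
    return buffer.getvalue()


def fake_count_hits(gene, topic):
    """Stand-in for a remote literature query; returns made-up numbers."""
    return len(gene) + len(topic)


# Placeholder gene and topic names; save the text as a .csv file to open it in Excel
table = summarize(["GENE1", "GENE2"], ["asthma", "diabetes"], fake_count_hits)
```

Keeping the query function separate from the tabulation step means the same summary code can be reused with any data source that can answer "how many hits for this gene and this topic".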
Audience
Registration unrestricted.
Format
Two-hour demonstration lecture only. Individual consultation
with the instructor will be available to workshop participants
immediately following the workshop. Workshop computer space
is limited to a single session due to instructor availability!
Register early. A waiting list will be maintained if necessary.
Registration
Registration is required for the lecture.
Description
A key step in understanding genome data is being able to
view and use it in the context of the full metabolic and
regulatory network of an organism. The online BioCyc collection
presents pathway/genome databases for over two hundred organisms
along with tools for interacting with individual organisms
or comparing across many organisms. These databases can
also be worked with via the Pathway Tools software package,
which serves as a navigation tool and can also generate a database
de novo from any annotated genome. This lecture will introduce
the concept of the pathway/genome database via the BioCyc
collection, then demonstrate the additional capabilities
of the Pathway Tools software package, including discussion
of what is involved in making a database for your own organism
of interest.
Audience
Registration for the lecture is unrestricted. This course
is open to everyone from beginners to those who are looking
for more advanced tools to understand one or many organisms.
Format
The course consists of a 90-minute lecture, with opportunities
for Q&A throughout. The instructor will also be available
for additional Q&A following the lecture period.