Teaching plan for the course unit







General information


Course unit name: High-Throughput Analysis of Genomic Data

Course unit code: 568770

Academic year: 2021-2022

Coordinator: Enrique Blanco Garcia

Department: Department of Genetics, Microbiology and Statistics

Credits: 2.5

Single program: Yes



Estimated learning time

Total number of hours: 62.5


Face-to-face and/or online activities



IT-based class




Supervised project


Independent learning




Competences to be gained during study


Basic competences
— Knowledge forming the basis of original thinking in the development and/or application of ideas, typically in a research context. 

— Capacity to apply acquired knowledge and solve problems in new or unfamiliar environments within broader or multidisciplinary contexts related to the area of study. 

— Capacity to integrate knowledge and tackle the complexity of formulating judgements based on incomplete or limited information, taking due consideration of the social and ethical responsibilities involved in applying knowledge and making judgements. 

— Capacity to communicate knowledge and conclusions and the grounds on which they have been reached to specialist and non-specialist audiences in a clear and unambiguous manner. 

— Skills to enable lifelong self-directed and independent learning. 


General competences

— Capacity to structure a reasoned discourse in a logical and rational manner, to discuss any scientific topic in front of larger audiences. 

— Capacity for critical, logical and creative thought. Capacity for analysis and synthesis. 

— Capacity to interact with the surrounding environment and to take part in knowledge-transfer activities. 

— Capacity to work in groups and to collaborate with other researchers. 

— Capacity to read and critically interpret scientific publications related to the subject, mainly in English, and to be able to design, write and defend a research project. 


Specific competences

— Capacity for the proposal of methodological designs for assessing genetic diversity based on knowledge of the evolutionary processes that generate this diversity. 

— Capacity to process and interpret genomic data resulting from gene expression analysis and massive sequencing of genomes. Skill in the use of scientific databases and bioinformatics tools used to access available genomic annotations. 

— Ability to apply knowledge on the origin and function of stem cells during cellular regeneration and reprogramming in order to generate iPS cells and devise applications for them in regenerative medicine.





Learning objectives


Referring to knowledge

Successful completion of the course implies a full command of the basic terminology of gene expression analysis and massive genome sequencing, and of the basic resources needed to visualize results on the reference genome. Students must acquire enough experience to design analysis protocols for massive sequencing data and develop the capacity to make the most appropriate biological interpretation in each case. In this regard, any results obtained with these technologies should always be related back to the biological problem under consideration. Achieving these objectives ensures that students are prepared to continue working in this field, including at a professional level if they so wish.



Teaching blocks



1. UCSC Genome Browser 

— Browsers, tracks, display
— UCSC tracks: RefSeq, phastCons
— UCSC sessions and custom tracks
— UCSC tools: BLAT, Table Browser
— Gene Ontology: DAVID, Enrichr

2. Galaxy working environment 

— Basic operations with files 
— The Galaxy interface
— Analysis of genomic data

II. ChIP-seq analysis

3. Basic Pipeline (I): Mapping 

— Raw data (single-end): FASTQ
— NCBI-GEO: Platform/Samples
— Mapping with Bowtie
— SAM format (single-end)
— UCSC tracks: BedGraph profiles
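
As an illustration of the raw data format listed in this block, the following minimal Python sketch reads four-line FASTQ records and decodes their quality strings. It is not part of the course materials: the function names `read_fastq` and `phred_scores`, the example read, and the Phred+33 quality offset are assumptions made for this example only.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input (a blank line also stops this simple reader)
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield header[1:], sequence, quality


def phred_scores(quality, offset=33):
    """Convert a quality string to integer scores (offset 33 is assumed here)."""
    return [ord(char) - offset for char in quality]


# Made-up single-end read, for illustration only.
example = io.StringIO("@read1\nACGT\n+\nIIII\n")
records = list(read_fastq(example))
for read_id, sequence, quality in records:
    print(read_id, sequence, phred_scores(quality))
```

In practice, mapping tools such as Bowtie read FASTQ files directly; a reader like this is only useful for inspecting a few records by hand.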

4. Basic Pipeline (II): Peak calling 

— Peak calling with MACS
— UCSC tracks: BED format
— Read distribution charts
— Read density heat maps
— ENCODE project: ChIP-seq

5. Characterization of peaks 

— Regulatory information catalogues
— Detection of regulatory sites
— Phylogenetic footprinting
— Searching sequence motifs

III. RNA-seq analysis

6. Basic Pipeline (I): Mapping
— Raw data (paired-end): FASTQ
— Raw data (strand-specific): FASTQ
— NCBI-GEO: Platform/Samples
— Mapping with TopHat
— SAM format (paired)
— UCSC tracks: BedGraph profiles

7. Basic Pipeline (II): Quantification
— RPKM quantification with Cufflinks
— Differential gene expression
— Gene expression heat maps
— ENCODE project: RNA-seq
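
For orientation, RPKM (reads per kilobase of transcript per million mapped reads) normalizes a raw read count by both gene length and sequencing depth. The sketch below applies only the textbook definition; it does not reproduce the statistical model that Cufflinks uses internally, and the function name and the numbers are hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Return reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # Dividing by (length / 1e3) and (total / 1e6) is the same as multiplying by 1e9.
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2,000 bp transcript, 10 million mapped reads.
example_value = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=10_000_000)
print(example_value)  # 25.0
```

For paired-end libraries, the analogous measure counts fragments rather than individual reads (FPKM).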

IV. Microarrays

8. Basic analysis
— Classes of microarrays
— NetAffx/Agilent web portals 
— NCBI-GEO: Platform/Samples
— Babelomics platform



Teaching methods and general organization


Face-to-face learning activities 

All classes are practical and are taught in computer rooms with Internet access. In each session, after a brief lecture by the teacher, the remainder of class time is devoted to working through a specific example in order to gain experience with the most commonly used resources in each case. 


Independent learning activities 

Independent learning involves reviewing the practical exercises completed in class as well as recommended and further reading. 


Tutorial sessions 

Teachers are available for consultation on any doubts or questions related to the subject, either in person during office hours or by email at any time.



Official assessment of learning outcomes


Assessment criteria and procedures 

Assessment is based on a final project involving high-throughput analysis of data for specific regions of the genome published as part of the ENCODE (Encyclopedia of DNA Elements) and modENCODE projects, or of data from other publications related to gene expression (microarrays) specified by the teacher. Each student will receive a specific assignment corresponding to their individual project. The final report for the project must be submitted within 15 days of the date of publication of the final statement. Late submissions will not be evaluated. 

Class attendance is compulsory. Unjustified absences exceeding 20% will incur point deductions from the final grade for the subject.