Teaching plan for the course unit

(Short version)

 


 

General information

 

Course unit name: Multivariate Analysis

Course unit code: 361232

Academic year: 2021-2022

Coordinator: KARINA GIBERT OLIVERAS

Department: Faculty of Economics and Business

Credits: 6

Single program: S

 

 

Estimated learning time

Total number of hours: 150

Face-to-face and/or online activities: 60

-  Lecture with practical component (face-to-face): 30

-  IT-based class (face-to-face): 30

Supervised project: 40

Independent learning: 50

 

 

Learning objectives

 

Referring to knowledge

The aim of the course is to present statistical techniques for analysing large data sets in order to quickly extract the most relevant information from them. The problems addressed are of various types, from the definition of dominant axes to the statistical characterization of subpopulations. This objective is tackled by presenting four large families of multivariate statistical techniques:

1. Multivariate techniques for automated classification, aimed at establishing typologies and characterizing them. Different families of methods are presented, from the most classic to the most recent: partition methods, hierarchical methods and density-based methods. Special emphasis is placed on class interpretation tools, and the suitability of each method for different cases is studied, depending on scalability, data type, etc.

2. Multivariate techniques focused on synthesizing and summarizing information, studying multidimensional relationships between variables and, where appropriate, defining latent indicators. This part centres on three fundamental techniques: principal component analysis, simple correspondence analysis and multiple correspondence analysis. Factor analysis is presented as a general formal framework from which these techniques are derived as specific cases. Particular importance is given to the analysis of graphical results, and some additional extensions are illustrated, including textual analysis.

3. Discriminant analysis techniques, that is, multivariate techniques for obtaining allocation rules; the focus is on their relationship with the techniques introduced above.

4. Textual analysis techniques, working with free texts from documents, web pages or social networks in order to identify the underlying concepts and the relationships between them.

From a conceptual point of view, the aim of the subject is twofold. On the one hand, it seeks to provide students with a solid formal base for the multivariate techniques on the syllabus. On the other hand, students must develop the practical skills to apply these techniques to real data. Thus, the practical sessions, in which students work with real data, follow the syllabus from the perspective of the application of the techniques. To this end, a data pre-processing step is included to prepare the data for analysis.

Finally, taking into account that the course cannot be exhaustive and that other aspects will be dealt with later, various multivariate techniques are presented in an introductory fashion, with less emphasis on the algebra and more on the algorithmic point of view.

 

Referring to abilities, skills

In this subject, particular importance is given to training students in certain transversal skills for the professional development of the statistician, such as the ability to analyse, synthesize, communicate, integrate knowledge, write reports and, above all, work as part of a team, which includes medium-term planning, task sharing and handling incidents that affect the work plan.

Practical sessions are structured in such a way as to train students in these skills with the necessary support of the teachers of the subject.

 

 

Teaching blocks

 

1. Introduction

*  Metrics, angles and projections. Multivariate nomenclature. Matrices of variances and covariances and matrices of correlations. Presentation of points of view, presentation of techniques, presentation of statistical computer systems. Simple examples of multivariate description, data characterization, classification, and discrimination

1.1. Data entry and pre-processing

*  The scope of multivariate analysis. Main relevant elements in data pre-processing

2. Automated classification

*  Conceptual presentation. Partitioning methods. Hierarchical methods. Density-based methods. Relationship with factor analysis. Interpretation of classes. Description of typologies. Application to real cases and practical implications

3. Factor analysis

*  General formalization, theoretical results

4. Principal components analysis

*  Formalization, theoretical results, interpretation, application to real cases, practical implications

5. Simple correspondence analysis

*  Formalization, theoretical results. Interpretation. Applications to real cases. Practical implications

6. Multiple correspondence analysis

*  Formalization. Theoretical results. Interpretation. Application to real cases and practical implications that show advantages for the treatment of survey data

7. Discriminant analysis

*  Formalization, theoretical results. Relationship with factor analysis. Interpretation in the case of two groups. Applications to real cases. Practical implications

8. Other multivariate methods

*  Textual analysis. Canonical correlations. Multidimensional scaling

9. Textual analysis

*  Textual data analysis techniques

 

 

 

 

Reading and study resources

Check availability in CERCABIB

Book

ALUJA, Tomàs, et al. Aprender de los datos: el análisis de componentes principales: una aproximación desde el Data Mining. Barcelona: EUB, 1999

  Basic bibliography.


ESCOFIER, Brigitte, et al. Análisis factoriales simples y múltiples: objetivos, métodos e interpretación. Bilbao: Servicio Editorial. Universidad del País Vasco, 1992

  Basic bibliography.


GREENACRE, Michael J. Correspondence analysis in practice. Boca Raton (Fla.) [etc.]: Chapman & Hall/CRC, 2007

  Basic bibliography.

Spanish edition: 2008

HUSSON, François, et al. Exploratory multivariate analysis by example using R. Boca Raton: CRC Press, 2011

  Basic bibliography.


JOHNSON, Richard Arnold, et al. Applied multivariate statistical analysis. 6th ed. Upper Saddle River, N.J.: Pearson Education, Prentice-Hall, 2007

  Basic bibliography.


BOUROCHE, Jean-M., et al. L’analyse des données. Paris: Presses Universitaires de France, 1980

  Further reading.


JOBSON, J.D. Applied multivariate data analysis. Vol. I and Vol. II. New York; Barcelona [etc.]: Springer, 1992

  Further reading.


LEBART, Ludovic, et al. Tratamiento estadístico de datos: métodos y programas. Barcelona [etc.]: Marcombo, 1985

  Further reading.


SAPORTA, Gilbert. Probabilités, analyse des données et statistique. 3e éd. rév. Paris: Technip, 2011.

  Further reading.


VOLLE, Michel. Analyse des données. 4e éd. Paris: Economica, 1985

  Further reading.
