
Matrix and Tensor methods for Data Science
Part I. Matrix and Tensor tools

a.y. 2023-2024.
Course of the curriculum Advanced Mathematics for Applications, Master's Degree in Mathematics - Bologna


6 CFU
Lectures: I semester (Part I, 15h)
Lecturer: Prof. V. Simoncini

Time schedule (19/09/2023- )

Tuesday 16:00-18:00
Friday 14:00-16:00

Please check the bottom of this page for schedule changes.

Extra-class meeting time with students:

Arrange a meeting time by emailing the lecturer.

Aim

At the end of the course, students have theoretical and computational knowledge of matrix and tensor techniques for analysing large amounts of data. In particular, students are able to examine large samples of discrete data and extract interpretable information of relevance in image and data processing, in medical and scientific applications, and in social and security sciences.
Part I. The course presents fundamental matrix and tensor techniques commonly employed in large-scale data analysis methods, typically arising in data science. These will serve as preparatory material for Part II (Prof. M. Porcelli), on optimization strategies for the analysis of big data.

Details

* Vector and matrix norms (including sparsity-promoting norms)
* Linear regression and Least squares
* Eigenvalues, SVD, pseudoinverse
* Reduction and low rank representation
- Sparse representation with l_0-norm
- CUR factorization
* Tensors
- Dealing with tensors and various representations
- HOSVD, Tensor OMP, Dictionary Learning with tensors

Details: Lecture log notes, a.y. 2023-2024, 15 hours.


Part I of the course consists of 15 hours, alternating in-person lectures (with slides) and computer sessions. The computational environment will be Matlab.

Requirements:

Fundamental concepts of mathematical analysis.
Numerical Linear Algebra (first course).
Basic knowledge of Matlab.


References


* Course Slides (available as we move along).
* R. Horn and C. Johnson, Matrix Analysis, Cambridge Univ. Press, 1985.
* Lars Elden, Matrix Methods in Data Mining and Pattern Recognition, SIAM, 2nd ed., 2019.
* R. Johnson and D. Wichern, Applied Multivariate Statistical Analysis, Prentice-Hall, 5th ed., 2002. Data: tables in the book.

* SVD and applications, Z. Zhang, arXiv:1510.08532v1 (2015).
* From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images, Alfred M. Bruckstein, David L. Donoho, Michael Elad, SIAM Review, 51 (2009).

* article: "CUR matrix decompositions for improved data analysis", M. Mahoney and P. Drineas, PNAS (2009).
* article: "Improving CUR Matrix Decomposition and the Nystrom Approximation via Adaptive Sampling", Shusen Wang, Zhihua Zhang, Journal of Machine Learning Research 14 (2013).
* Tensor Decompositions and Applications, T. Kolda and B. Bader, SIAM Review, 51 (3), 2009.
* Chapter on tensors in Matrix Computations, G. Golub and Ch. Van Loan, 4th ed., Johns Hopkins Univ. Press, 2013.
* Slides on Dictionary Learning with Tensors: DL.pdf

* Data from various sources.
* UCI machine learning repository
* TechTC (Technion Repository of Text Categorization Datasets)


Matlab functions for tensor computations: hosvd3.m, nmodeproduct.m, and the current release of the Tensor Toolbox; see the Toolbox homepage and its "Functionality" section for documentation.

Highlights:

  • Arithmetic Formats for Machine Learning - Working group

    Computational exercises:


    Sept 26, 2023, Text.
    Oct 6-10, 2023, Text. Data: mnist_all.mat. Image plotting: ima2.m (consider also using the Matlab function "imshow").
    Matlab codes for Tensor Dictionary Learning: tensor_example (zipped tar folder).
    More data:
    coil20.tar.gz, orl_faces.tar.gz, yalefaces.tar.gz.
    read_allfaces96.m, faces96.zip.

    Exam dates:


    Final test:

    Possible Projects from Part I.