Difference between revisions of "MAT5153"
(→Course Content: Refined the lesson order so that a week of technical material is followed by a week of conceptual material. This increases the time available to complete technical assignments.)
Revision as of 07:56, 25 April 2023
Data Analytics MDC4153/MAT5153
Catalog entry
Prerequisite: MAT2243 Applied Linear Algebra or (MAT2233 Linear Algebra and MAT2214/MAT2213 Calculus III).
Content: This immersive Data Analytics course equips students with the essential mathematical skills and knowledge required to analyze, visualize, and interpret complex datasets. Students will be exposed to the entire life cycle of data analysis. Throughout the course, participants will explore basic operations in scripting languages, delve into advanced visualization techniques, and investigate linear discriminants, generalized regressions, time series analysis, non-linear discriminants, and clustering. Students will program essential algorithms themselves, instead of using toolboxes, to explore the discrete Fourier transform, generalized regressions, clustering algorithms, and artificial neural networks.
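As an illustration of what programming an algorithm without a toolbox might look like, the following is a minimal sketch of a naive discrete Fourier transform in Python. The function name `dft` and the sample signal are assumptions made for this example and are not taken from the course materials.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a sequence of numbers."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


# Hypothetical input: a constant signal, whose energy should sit entirely
# in the zero-frequency bin.
spectrum = dft([1.0, 1.0, 1.0, 1.0])
magnitudes = [round(abs(c), 6) for c in spectrum]
print(magnitudes)  # [4.0, 0.0, 0.0, 0.0]
```

The direct double loop is slow for long signals, but it keeps every term of the definition visible, which is the point of writing it by hand.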
Furthermore, the course will provide an understanding of relational databases and their integration with programming environments, as well as guidance on creating effective data analysis plans. Emphasis will be placed on solution architecture, reproducibility, configuration management, and generating standardized reports.
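One possible way to integrate a relational database with a programming environment is sketched below using Python's standard `sqlite3` module. The table name `measurements`, its columns, and the sample rows are hypothetical and serve only to illustrate the idea.

```python
import sqlite3

# In-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (station TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements (station, value) VALUES (?, ?)",
    [("A", 1.0), ("A", 3.0), ("B", 10.0)],  # hypothetical sample data
)

# Let the database do the aggregation, then pull the result into Python.
rows = conn.execute(
    "SELECT station, AVG(value) FROM measurements GROUP BY station ORDER BY station"
).fetchall()
conn.close()
print(rows)  # [('A', 2.0), ('B', 10.0)]
```

Parameterized queries (the `?` placeholders) keep data separate from SQL text, which also helps when the same script is rerun for reproducibility.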
By the end of the course, students will have a strong foundation in data analytics, allowing them to transform raw data into valuable insights for decision-making.
Course Content
Week | Source | Topic | Prerequisites | SLOs |
---|---|---|---|---|
1 | | Description of the course project. | | Understanding of government databases. Conduct basic data exploration. Identify questions answerable with data available for a specific problem. |
2 | | Scripts vs. compiled code. | | Basic numeric operations in scripting vs. compiled code. Clarity about the differences between interpreted and compiled code, and how they impact data analysis. Setting up environments. |
3 | | Ethics in data analysis. | | Identification of biases introduced during data collection, storage, analysis, and access. |
4 - 5 | | Linear discriminants I | | Ability to minimize an equation involving matrices and vectors. Mastery of Principal Component Analysis (PCA), Fisher's linear discriminant, and multiple discriminant analysis. Mastery in multi-linear operations in scripting and compiled languages. Understanding of the balance between computational performance and development time. |
6 | | Visualization (basic and advanced). | | Understanding of different families of visualization techniques. Ability to create Circos plots. |
7 | | Generalized regressions | | Understanding of mathematical approaches to produce an infinite family of regressions for the purpose of data smoothing. |
8 | | Relational databases | | Ability to create, access, and use relational databases from within programming environments. Understanding of when to use relational databases. |
9 | | Clustering | | Ability to create basic clusters using multiple definitions of distance. |
10 | | Solution architecture & reproducibility. | | Capacity to design a complex data analysis solution that guarantees reproducibility, interoperability, and maintainability. |
11 | | Non-linear discriminants (i.e. artificial neural networks). | | Capability to program a fully-connected feed-forward artificial neural network from scratch. Understanding of the effect of multiple activation functions. |
12 | | Management of the configuration. | | Ability to program in collaborative multi-layered environments. Capacity to resolve conflicts in code, create code branches, and effectively propagate code changes across multiple environments such as development, test, production, etc. |
13 | | Data analysis plans & standardized reports. | | Dexterity to break a data analysis problem into multiple interconnected components, and then produce automated reports targeting specific audiences. |
14 | | Project presentations | | Exposure to presentation of results in front of an audience of experts. |
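
The week 9 objective of clustering under multiple definitions of distance can be illustrated with a small sketch. It shows only the assignment step of a centroid-based method, not a full clustering algorithm, and the function names, centers, and points are hypothetical values chosen for this example.

```python
def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def chebyshev(p, q):
    return max(abs(a - b) for a, b in zip(p, q))


def assign(points, centers, distance):
    """Return, for each point, the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda i: distance(p, centers[i]))
        for p in points
    ]


# Hypothetical data: two fixed centers and three points.
centers = [(0.0, 0.0), (8.0, 3.0)]
points = [(1.0, 0.0), (3.0, 3.0), (7.0, 4.0)]

labels = {
    metric.__name__: assign(points, centers, metric)
    for metric in (euclidean, manhattan, chebyshev)
}
print(labels)
# {'euclidean': [0, 0, 1], 'manhattan': [0, 1, 1], 'chebyshev': [0, 0, 1]}
```

The middle point changes cluster under the Manhattan distance, which shows that the choice of distance is part of the model rather than a neutral detail.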