
Data Science: Statistical analysis of data with Python

Title Data Science: Statistical analysis of data with Python
Course code CM609-07-2023-C
Objective Python is currently a very popular programming language. It has a large number of powerful statistical libraries, which give it a significant advantage in complex data analysis. This course uses Python to introduce and practice the concepts and methods of basic statistical analysis, teaches students a rigorous process for statistical data work, and helps them master professional data analysis tools.
Content Python Data Loading and Data Visualization
  • Data Loading
  • Data cleaning
  • Plotting of Statistical Data
Distributions and Hypothesis Tests
  • Populations and Samples
  • Probability Distributions
  • Degrees of Freedom
  • Expected Value and Variance
Distributions of One Variable
  • Mean, Median, Mode
  • Quantifying Variability
  • Discrete Distributions
  • Normal Distribution
  • Skewness and Kurtosis
  • Percentiles
Probability
  • Probability Distribution
  • Empirical Distributions
  • Sampling and Central Limit Theorem
Hypothesis Tests
  • Hypothesis Concept, Errors, p-Value, and Sample Size
  • Basic Tests
  • Proportion Testing
  • Sensitivity and Specificity
Tests of Means of Numerical Data
  • One Sample t-Test for a Mean Value
  • Analysis of Variance (ANOVA)
  • Multiple Comparisons
Nonparametric Tests
  • Basic Problems with Nonparametric Tests
  • One-Sample Nonparametric Tests
  • Testing for the median (mean)
  • Test of distribution
  • Sequence of checks
  • Two-Sample Nonparametric Tests
Correlation Analysis and Association Analysis
  • Correlation analysis
  • Functional relationship and correlation
  • Simple correlation analysis
  • Partial Correlation Analysis
  • Association Analysis
Regression analysis
  • Linear regression
  • Basic principles of regression analysis
  • Multiple Linear Regression
  • Nonlinear regression
Contingency Analysis and Correspondence Analysis
  • Contingency Analysis
  • Correspondence analysis
Clustering
  • Basic principles of clustering
  • Steps and Processes of Clustering
  • Systematic clustering
  • K-Means Clustering
  • DBSCAN Clustering
Assessment In-class performance, exercises, and a test.
Target audience Python developers, data analysts
Prerequisite Proficient in computer operation; familiarity with Linux is preferred;
Have a foundation in Python programming or have completed course CM540;
Familiarity with statistical concepts is preferred;
Able to read materials in English;
Please check the self-assessment.
Class size 18
Instructor Microsoft Certified Trainer (MCT) and Oracle Certified Professional (OCP); a results-driven IT professional experienced in hardware and network troubleshooting. Holds MCITP, CCNA, Oracle OCP, and ITIL v3 certifications. Has been teaching for CPTTM since 2004.
Handout All training material provided by CPTTM
Instruction language Cantonese (supplemented with English)
Handout language Handouts in Chinese (supplemented with English terminology)
Duration 24 hours in 8 sessions
Schedule 18:45-21:45 on Jul 5, 2023 (Wednesday) and Jul 13, 2023 (Thursday), then every Wednesday and Thursday from Jul 19, 2023 to Aug 3, 2023.
Fee MOP2,720
Venue Cyber-Lab (Rua Comandante Mata Oliveira, Ed. Associacao Industrial, 3-andar Macau)
Certificate Certificate of Completion issued by CPTTM (requires at least 80% attendance and a pass in the assessment)
PDAC code ---
Remark This course is pending review and approval by the "Continuing Education Development Program" of the Macao SAR Government.