When is the final exam?

  • Our in-class final exam is scheduled for Tuesday, May 5 from 3:00 – 5:30 PM in Forsyth 214.
  • The final exam will be distributed to the class during the final exam window.

What should I bring to the final exam?

  • A pencil with an eraser.
  • A cheat sheet: one 8.5″ × 11″ double-sided, handwritten sheet of notes.

What will be on the final exam?

  • The final exam will consist of 50 multiple-choice and True/False questions. Questions will be drawn from units 2 through 6.

What resources can I use on the final exam?

  • An 8.5″ × 11″ double-sided, handwritten sheet of notes.

How should I study for the final exam?

  • Complete and review the practice exam available on Canvas.
  • Review the study guide provided below.

Study Guide:

  • Unit 2: NumPy

    • What advantages do NumPy arrays offer over native Python lists when working with large numerical datasets?
    • How can you find the maximum value of a NumPy array along a given axis?
    • Describe how multidimensional indexing works in NumPy (e.g. accessing rows, columns, and subarrays).
    • How can boolean and logical indexing be used to filter data in a NumPy array?
  • Unit 3: Pandas

    • Compare and contrast a pandas Series and a DataFrame in terms of structure and typical use cases.
    • Outline the steps you would take to load data from a CSV, clean missing or inconsistent entries, and prepare it for analysis.
    • What are the key attributes and methods you use to inspect a DataFrame’s shape, data types, and summary statistics?
    • Explain how you would rename columns, change a column’s data type, and drop unnecessary columns in a DataFrame.
    • Describe the “split‑apply‑combine” pattern. How does it relate to groupby operations in pandas?
  • Unit 4: Visualizing Data with Python

    • What graphing library is considered the foundation for most Python plotting libraries?
    • When would you choose to use Seaborn instead of Matplotlib directly, and what conveniences does it provide?
    • Explain the difference between a static plot (Matplotlib/Seaborn) and an interactive visualization (Bokeh/Plotly).
    • What are glyphs in Bokeh, and how do they enable interactivity in a plot?
    • How can you add and format labels on a Plotly Express bar chart?
  • Unit 5: Statistical Analysis and Visualization

    • What was the key finding of the 2018 World Happiness Report?
    • How do correlation matrices help in exploratory data analysis?
    • Describe the benefits of correlation and simple linear regression in understanding relationships between variables.
    • What is produced by the corr() method when called on a DataFrame?
  • Unit 6: Dash and Machine Learning with Scikit-Learn

    • What are the differences between Dash Core Components and HTML components when building a Dash app?
    • Describe the differences between a Dash callback decorator and a callback function.
    • What is the difference between supervised and unsupervised learning?
    • Explain the purpose of splitting data into training and test sets before fitting a model.
    • Why is feature scaling (e.g., with StandardScaler) important for many machine learning algorithms?
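
Worked example for review: the short sketch below touches a few of the Unit 2, 3, and 5 questions above (maximum along an axis, multidimensional and boolean indexing, groupby, and corr()). The arrays, column names, and values are made up for illustration and are not taken from the course materials or the exam.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x4 array of scores (rows = students, columns = quizzes).
scores = np.array([
    [70, 85, 90, 65],
    [88, 92, 79, 94],
    [55, 60, 72, 68],
])

# Maximum along an axis: axis=0 gives one max per column,
# axis=1 gives one max per row.
col_max = scores.max(axis=0)
row_max = scores.max(axis=1)

# Multidimensional indexing: a row, a column, and a subarray.
first_row = scores[0]        # all quizzes for the first student
second_col = scores[:, 1]    # the second quiz for every student
sub = scores[0:2, 1:3]       # first two students, second and third quizzes

# Boolean indexing: keep only the entries that satisfy a condition.
high = scores[scores >= 90]

# Hypothetical DataFrame for the split-apply-combine pattern:
# split by "section", apply mean() to "score", combine into one Series.
df = pd.DataFrame({
    "section": ["A", "A", "B", "B"],
    "hours": [2.0, 4.0, 1.0, 3.0],
    "score": [70.0, 90.0, 60.0, 80.0],
})
mean_by_section = df.groupby("section")["score"].mean()

# corr() on a DataFrame returns a square DataFrame of pairwise
# correlation coefficients between the numeric columns.
corr = df[["hours", "score"]].corr()

print(col_max, row_max)
print(high)
print(mean_by_section)
print(corr)
```

Try predicting each printed result by hand before running the code, then change the axis argument or the filter condition and check that the output changes the way you expect.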