Exploratory data analysis

Interactive SAS tutorials supporting the OpenIntro Introduction to Modern Statistics textbook.

When your data sits in a SAS dataset or other database, it’s difficult to observe much about it beyond its size and the types of variables it contains. In this tutorial, we’ll learn how to summarize and visualize your data. The techniques covered here are collectively known as exploratory data analysis, or EDA.

EDA is all about learning the structure of a dataset through a series of numerical and graphical techniques. When you do EDA, you’ll generate summaries of your data that help you understand the distributions of variables, uncover relationships, and find outliers. EDA can also generate questions that will help inform subsequent analysis.
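
As a rough illustration of what these numerical and graphical techniques can look like in SAS, here is a minimal sketch. It assumes the sashelp.cars sample dataset that ships with SAS and its MPG_City and Weight variables; these are illustrative choices, not data used by the tutorial itself, and the lessons may use different procedures or options.

```sas
/* Numerical summary of a single variable (assumed example: MPG_City) */
proc means data=sashelp.cars n mean median std min max;
  var mpg_city;
run;

/* Graphical summary of the same variable: a histogram of its distribution */
proc sgplot data=sashelp.cars;
  histogram mpg_city;
run;

/* Relationship between two variables: a scatter plot also helps reveal outliers */
proc sgplot data=sashelp.cars;
  scatter x=weight y=mpg_city;
run;

/* Numeric measure of association: Pearson correlation by default */
proc corr data=sashelp.cars;
  var weight mpg_city;
run;
```

Each step corresponds loosely to one of the lessons listed below: single-variable summaries first, then plots of two variables, then a numeric measure of their association.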

Learning objectives

Lessons

Exploring single variables

Create graphical and numerical summaries to understand the marginal distributions of single variables

Exploring relationships between two variables

Create graphics that display the relationship between two variables, and calculate numeric measures that summarize it

Measures of association between two variables

Calculate numeric measures that summarize the relationship between two variables
