
SAS Visual Analytics

The document outlines essential steps for exploring and preparing data in SAS Viya before building a machine learning model. Key steps include data import and inspection, cleaning, transformation, exploratory data analysis, and preparation for modeling. Following these guidelines ensures the dataset is well-prepared and relevant for effective model training.

SAS Visual Analytics – Explore Data
Exploring and preparing our data in SAS Viya is a crucial step before building a machine learning model.
Here is a list of steps we can follow to explore and understand our dataset's capabilities and limitations:

1. Data Import and Initial Inspection
• Import Data: Load your dataset into SAS Viya.
• Inspect Data: Use procedures like PROC CONTENTS to review the structure of your data (variable names, types, lengths, and formats), and PROC MEANS to get summary statistics.
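The kind of structural check described above can be sketched in a few lines. This is plain Python rather than SAS code, meant only to illustrate the logic; the sample data, the `profile` function, and the column names are hypothetical. It classifies each column as numeric or character (the two variable types SAS distinguishes) and counts missing values.

```python
import csv
import io

# Hypothetical sample data; an empty field stands for a missing value.
raw = """id,age,region
1,34,north
2,,south
3,45,
"""

def _is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def profile(text):
    """Return the row count and, per column, an inferred type and missing count."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    report = {}
    for name in reader.fieldnames:
        values = [row[name] for row in rows]
        present = [v for v in values if v != ""]
        is_numeric = bool(present) and all(_is_number(v) for v in present)
        report[name] = {
            "type": "numeric" if is_numeric else "character",
            "missing": len(values) - len(present),
        }
    return len(rows), report

n_rows, report = profile(raw)
print(n_rows, report)
```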
2. Data Cleaning
• Handle Missing Values: Identify and address missing values using techniques like imputation or removal.
• Remove Duplicates: Ensure there are no duplicate records in your dataset.
• Correct Errors: Look for and correct any data entry errors or inconsistencies.
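As a minimal sketch of two of the cleaning steps above, the following plain Python (not SAS code) removes exact duplicate records and then fills missing values with the column median. Median imputation is just one possible choice, and the records and function names are hypothetical.

```python
from statistics import median

# Hypothetical records; None marks a missing value.
rows = [
    {"id": 1, "age": 34, "income": 52000.0},
    {"id": 2, "age": None, "income": 61000.0},
    {"id": 2, "age": None, "income": 61000.0},  # exact duplicate of the previous row
    {"id": 3, "age": 45, "income": None},
    {"id": 4, "age": 29, "income": 48000.0},
]

def drop_duplicates(records):
    """Keep the first occurrence of each fully identical record."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def impute_median(records, column):
    """Replace missing values in one column with the median of the observed values."""
    observed = [r[column] for r in records if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in records]

clean = drop_duplicates(rows)
for col in ("age", "income"):
    clean = impute_median(clean, col)
```

Duplicates are removed before imputing so that repeated rows do not distort the median.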
3. Data Transformation
• Normalization/Standardization: Scale numeric features so that variables measured in large units do not dominate models that are sensitive to scale.
• Encoding Categorical Variables: Convert categorical variables into numerical formats, using one-hot encoding for unordered categories or label encoding where the categories have a natural order.
• Feature Engineering: Create new features that may be more predictive for your model.
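The first two transformations above can be illustrated with a short sketch in plain Python (not SAS code). It uses z-score standardization, one of several scaling options, and a simple one-hot encoder; the values and the `is_` column prefix are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def one_hot(values):
    """Return one indicator dict per value, with one 0/1 key per category."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

# Hypothetical feature columns.
incomes = [48000.0, 52000.0, 61000.0, 55000.0]
regions = ["north", "south", "north", "east"]

z = standardize(incomes)
encoded = one_hot(regions)
```

In practice the mean and standard deviation should be computed on the training data only and then reused for the test data.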
4. Exploratory Data Analysis (EDA)
• Summary Statistics: Use PROC MEANS for descriptive statistics on numeric variables and PROC FREQ for frequency counts of categorical variables.
• Data Visualization: Create visualizations like histograms, box plots, scatter plots, and correlation matrices to understand relationships and distributions.
• Correlation Analysis: Identify relationships between variables using correlation coefficients.
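To make the correlation step concrete, here is a small sketch in plain Python (not SAS code) that computes the Pearson correlation coefficient and builds a correlation matrix from it. The column names and values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)

# Hypothetical numeric columns.
columns = {
    "hours": [1.0, 2.0, 3.0, 4.0, 5.0],
    "score": [2.0, 4.0, 6.0, 8.0, 10.0],
    "errors": [5.0, 4.0, 3.0, 2.0, 1.0],
}

# Correlation matrix as a nested dict: matrix[a][b] is the correlation of a with b.
matrix = {a: {b: pearson(columns[a], columns[b]) for b in columns} for a in columns}
```

Pearson correlation only captures linear relationships, so a value near zero does not rule out a nonlinear dependence.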
5. Data Reduction
• Feature Selection: Use techniques like correlation analysis, mutual information, or feature importance from preliminary models to select relevant features.
• Dimensionality Reduction: Apply methods like PCA (Principal Component Analysis) to reduce the number of features while retaining most of the variance.
6. Data Splitting
• Train-Test Split: Divide your data into training and testing sets so that your model's performance is evaluated on data it was not trained on.
• Cross-Validation: Use cross-validation techniques to estimate how well your model generalizes to unseen data.
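Both splitting strategies above can be sketched in plain Python (not SAS code). The function names, the 25% test fraction, and the seed are illustrative assumptions, not values taken from this guide.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def k_fold_indices(n_rows, k=5):
    """Yield (train_idx, valid_idx) pairs; every row is validated exactly once."""
    indices = list(range(n_rows))
    for fold in range(k):
        valid = indices[fold::k]
        valid_set = set(valid)
        train = [i for i in indices if i not in valid_set]
        yield train, valid

data = list(range(20))  # stand-in for 20 records
train, test = train_test_split(data)
folds = list(k_fold_indices(10, k=5))
```

`k_fold_indices` does not shuffle, so rows that are stored in a meaningful order should be shuffled first.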
7. Data Sampling
• Resampling Techniques: Apply techniques like stratified sampling to keep your training data representative of the overall dataset, or bootstrapping (resampling with replacement) to gauge how much your estimates vary from sample to sample.
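Stratified sampling can be sketched as follows in plain Python (not SAS code): the rows are grouped by a label and the same fraction is drawn from each group, so class proportions are preserved. The `churn` column and the 80/20 class mix are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(rows, key, fraction, seed=0):
    """Draw the same fraction from every group defined by rows[i][key]."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sample = []
    for label in sorted(groups):
        members = groups[label]
        # Take at least one row per group so small classes are not lost.
        n_take = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, n_take))
    return sample

# Hypothetical imbalanced dataset: 80 "no" rows and 20 "yes" rows.
population = [{"id": i, "churn": "no"} for i in range(80)]
population += [{"id": i, "churn": "yes"} for i in range(80, 100)]

sample = stratified_sample(population, key="churn", fraction=0.1)
counts = {label: sum(r["churn"] == label for r in sample) for label in ("no", "yes")}
```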
8. Data Exploration for Model Relevance
• Feature Importance: Use preliminary models to identify which features are most important for predicting the target variable.
• Target Variable Analysis: Analyze the distribution and characteristics of the target variable to understand its behavior.
9. Data Preparation for Modeling
• Create Pipelines: Set up data preprocessing pipelines to automate the transformation and cleaning steps.
• Save Processed Data: Save the cleaned and transformed data for use in model training.
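The idea of a preprocessing pipeline is simply an ordered chain of steps applied the same way every time. Here is a minimal sketch in plain Python (not SAS code); `make_pipeline` and the two example steps are hypothetical names chosen for illustration.

```python
from functools import reduce

def make_pipeline(*steps):
    """Combine steps into one function that applies them in order."""
    def run(rows):
        return reduce(lambda data, step: step(data), steps, rows)
    return run

def drop_incomplete(rows):
    """Remove records that contain any missing (None) value."""
    return [r for r in rows if None not in r.values()]

def normalize_region(rows):
    """Trim whitespace and lowercase the region label."""
    return [{**r, "region": r["region"].strip().lower()} for r in rows]

prepare = make_pipeline(drop_incomplete, normalize_region)

# Hypothetical raw records.
raw_rows = [
    {"region": " North ", "age": 34},
    {"region": "south", "age": None},
]
prepared = prepare(raw_rows)
```

Because the steps live in one callable, the same preparation can be reapplied unchanged to new data before scoring.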
10. Documentation and Reporting
• Document Steps: Keep detailed records of all data exploration and preparation steps.
• Generate Reports: Create reports summarizing your findings and the steps taken to prepare the data.

By following these steps, you'll ensure that your data is well-prepared and relevant for building your first machine learning model in SAS Viya.
