Best Practices for Deploying SAS Server
Copyright © 2024 Amazon Web Services, Inc. and/or its affiliates. All rights reserved.
Best Practices for Deploying SAS Server on AWS AWS Whitepaper
Amazon's trademarks and trade dress may not be used in connection with any product or service
that is not Amazon's, in any manner that is likely to cause confusion among customers, or in any
manner that disparages or discredits Amazon. All other trademarks not owned by Amazon are
the property of their respective owners, who may or may not be affiliated with, connected to, or
sponsored by Amazon.
Table of Contents
Abstract and introduction
  Abstract
  Are you Well-Architected?
  Introduction
SAS Architecture
  SAS 9.4 Intelligence Platform
  SAS 9.4 Grid
  SAS Viya
Prerequisites for migrating SAS to AWS
Prerequisites for modernizing SAS 9 to SAS Viya
Instance types
  Physical core requirements
  Additional configurations
  Multi-tenancy
  Placement groups
  SAS 9 systems
  SAS compute tiers + SAS grid node
  Shared file system storage required for SAS grid
  SAS mid-tier and metadata servers
  SAS Viya Servers
  Baseline resource recommendations
Storage types
  Permanent SAS data storage
  Temporary SAS data storage
  Shared file system to use SAS Grid Manager or SAS Viya
Placement groups in SAS
  Clients
  SAS infrastructure
  Source data files
  Authentication tools
Options for deploying on SAS on AWS
Authentication
High availability
Network and security
Conclusion
Contributors
References & further reading
Document revisions
AWS Glossary
Abstract
Many SAS customers are moving their SAS applications from on-premises data centers to the AWS
Cloud. In order to migrate, customers must be aware of all the layers of their SAS infrastructure.
Customers should understand how their SAS applications run, and how to optimize their Amazon
Web Services (AWS) architecture.
This whitepaper addresses performance considerations and best practices for SAS®9 (SAS®
Foundation and SAS Grid Manager) and SAS® Viya® when hosted on AWS. The content is written
for IT professionals familiar with SAS and AWS.
For more expert guidance and best practices for your cloud architecture—reference architecture
deployments, diagrams, and whitepapers—refer to the AWS Architecture Center.
Introduction
SAS is analytics software that provides organizations with a suite of capabilities that enable users
to draw insights from data and make intelligent decisions. The SAS platform includes the software
that underpins SAS product offerings in analytics, data management, and visualization.
SAS 9.4 provides simplified architecture and deployment options for running SAS on a cloud
infrastructure. SAS Viya is a cloud-enabled, in-memory analytics engine that provides quick,
accurate, and reliable analytical insights.
1. Data Management
2. Visual Analytics
3. Governance and Security
4. Forecasting and Text Mining
5. Statistical Analysis
6. Environment Management
SAS is also a 4GL programming language used by data scientists at more than 80,000 customers
globally. SAS 9 does not leverage the benefits of the cloud in terms of managed hosting, elasticity,
and scalability. SAS Viya, on the other hand, is a cloud-enabled, in-memory analytics engine with
features such as elasticity, scalability, and fault tolerance. In this whitepaper, SAS customers can
learn about the best practices for running their SAS 9 workloads on AWS and evaluate how to
modernize their architecture with SAS Viya.
SAS Architecture
• Middle tier that enables users to access intelligence data and functionality through a web
browser, and provides shared services used by the platform’s applications
• Client tier that provides desktop access to intelligence data and functionality through
easy-to-use interfaces
The SAS grid computing environment uses SAS Grid Manager to distribute SAS computing
tasks among multiple computers on a network. Workload distribution enables the following
functionality:
• Accelerated processing allows users to distribute subtasks of individual SAS jobs to a shared pool
of resources.
• Scheduling jobs allows users to schedule automatically routed tasks to the shared resource pool.
SAS Viya
SAS Viya is a cloud-enabled, in-memory analytics engine that provides quick, accurate and reliable
analytical insights. Elastic, scalable, and fault-tolerant processing addresses complex analytical
challenges while effortlessly scaling for future use cases. SAS Viya has the following benefits:
• SAS Viya provides distributed analytical in-memory calculations that are optimized for
unconstrained environments and automatically adjust in constrained environments.
• SAS Viya supports a standardized code base that enables programming in SAS and other
languages like Python, R, Java, and Lua.
• SAS Viya is highly available with distributed processing crafted to handle multiple users
distributing operations across the cores of a single server, or nodes of massive compute clusters.
Prerequisites for migrating SAS to AWS
• Are there any SAS jobs that must run within a certain timeframe? Do you expect your SAS jobs to
execute in the same amount of time as (or faster than) they currently execute in your existing
data center? If so, determine the I/O throughput that the jobs need and that AWS must deliver.
• What is the location of the source data for the SAS job? If the data is not in AWS, you must
consider the time needed to transfer it over the connection. The added time will impact the SLA
for jobs that consume data outside of AWS.
• Is additional security required for the data and/or SAS code?
SAS 9 workloads require instances that support heavy analytical processing and large sequential
I/O; SAS Viya processes data in memory and spills to disk if required. These
behaviors should be taken into consideration when selecting the correct Amazon EC2 instance
types and storage requirements for AWS migration.
Running SAS workloads on the cheapest Amazon EC2 instances does not necessarily provide the best
performance. For example, customers may need instances with more physical cores than their
compute needs alone would require, and more storage capacity than their data initially requires,
in order to obtain the maximum I/O bandwidth for their SAS applications.
Evaluate the following areas to understand the main considerations for optimal AWS performance:
Prerequisites for modernizing SAS 9 to SAS Viya
Knowledge of the following is needed to deploy, update, and manage the SAS Viya software:
• kubectl commands to perform operations, such as kubectl apply, kubectl taint, kubectl
label, and kubectl logs
• Experience with Amazon Elastic Kubernetes Service (Amazon EKS)
• Depending on the deployment, experience with Kubernetes operators, Docker, or Ansible
Optionally, SAS also provides a content assessment tool that helps you migrate your environment
to SAS Viya. The content assessment tool provides an inventory of what is in your SAS 9
environment and details on whether your current code will
transition to SAS Viya.
Instance types
Additional configurations
SAS servers require high I/O throughput and sufficient network bandwidth. You can obtain the
bandwidth through a dedicated network interface card (NIC). Additional CPUs and RAM may also be
required.
Multi-tenancy
Even with a dedicated NIC, sharing the same physical server with multi-tenant applications
can result in inferior performance for virtualized EC2 instances.
Placement groups
Ensure that all the SAS instances and components on the target infrastructure are placed within
the same Availability Zone (AZ) in an EC2 placement group. This is particularly useful for SAS
workloads that require low-latency performance for node-to-node communication.
SAS 9 systems
SAS compute tiers + SAS grid node
SAS compute and SAS grid nodes require a minimum of 8 GB of physical RAM per core and robust
I/O throughput.
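The per-core memory rule above can be checked with simple arithmetic. The following is a minimal sketch, not an official sizing tool: it assumes that each physical core exposes two vCPUs (hyperthreading enabled), treats GB and GiB as interchangeable for a rough check, and uses illustrative instance figures that should be verified against current Amazon EC2 documentation. The names InstanceSpec, physical_cores, ram_per_core_gib, and meets_sas_minimum are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceSpec:
    """Hypothetical container for the instance figures used in the check."""
    name: str
    vcpus: int
    memory_gib: float
    threads_per_core: int = 2  # assumption: hyperthreading is enabled


def physical_cores(spec: InstanceSpec) -> int:
    """Convert vCPUs to physical cores under the threads-per-core assumption."""
    return spec.vcpus // spec.threads_per_core


def ram_per_core_gib(spec: InstanceSpec) -> float:
    """Memory available per physical core."""
    return spec.memory_gib / physical_cores(spec)


def meets_sas_minimum(spec: InstanceSpec, minimum_gib_per_core: float = 8.0) -> bool:
    """True when the instance offers at least the minimum RAM per physical core."""
    return ram_per_core_gib(spec) >= minimum_gib_per_core


if __name__ == "__main__":
    # Illustrative figures only; confirm against the current instance type tables.
    for spec in (InstanceSpec("m5n.8xlarge", 32, 128), InstanceSpec("c5.9xlarge", 36, 72)):
        print(spec.name, physical_cores(spec), ram_per_core_gib(spec), meets_sas_minimum(spec))
```

Counting vCPUs instead of physical cores would halve the apparent memory per core, which is why the conversion is made explicit.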
SAS WORK is a temporary library that is automatically defined by SAS at the beginning of each
SAS session or job. The WORK library stores temporary SAS files that are created by users and is
also used internally by SAS. SAS UTILLOC is a temporary location for operations such as sorting,
statistics, and multi-threaded processes. It can share the location of the WORK folder, but may
also be placed elsewhere.
• I3 family – EC2 I3 instances are the next generation of storage optimized instances for high
transaction, low latency workloads. These instances include Non-Volatile Memory Express
(NVMe) SSD-based instance storage optimized for high random I/O performance, high
sequential read throughput, and high IOPS. To benefit from the high internal I/O bandwidth
of striped NVMe SSD drives for SAS WORK and SAS UTILLOC, users should configure their
environment to explicitly use the NVMe SSD local drives (not EBS volumes).
• I3en family – This family provides Non-Volatile Memory Express (NVMe) SSD instance storage
optimized on Amazon EC2 with enhanced networking via ENA to achieve up to 100 Gbps of
network bandwidth.
• M5n family – The M5 family provides a balance of compute, memory, and networking. M5n
instance variants are ideal for applications requiring improved network throughput and packet
rate performance.
• M5 or M5dn family – M5dn instances support 8 GB of RAM per physical core. Their local
NVMe-based SSDs are physically connected to the host server and provide block-level storage for
the lifetime of the instance.
Both instance types are suitable for workloads requiring a balance of compute, memory, and
networking resources.
• R5 or R5d family – R5/R5d instances are suitable for memory intensive applications such as
in-memory caches, mid-size in-memory databases, and real-time big data analytics.
SAS Viya node groups are identified by the work the associated pods in the application perform. To
manage the workload, node taints and labels are assigned to control scheduling.
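As an illustration of how labels and taints steer pods to a node group, the sketch below only builds the kubectl argument lists; it does not run them. The label key workload.sas.com/class and the class name cas are assumptions made for the example and should be checked against the SAS Viya deployment documentation; the helper names label_command and taint_command are hypothetical.

```python
import shlex

# Assumption: label/taint key used to mark workload classes; verify in the SAS documentation.
WORKLOAD_KEY = "workload.sas.com/class"
VALID_EFFECTS = ("NoSchedule", "PreferNoSchedule", "NoExecute")


def label_command(node: str, workload_class: str, key: str = WORKLOAD_KEY) -> list:
    """Build `kubectl label nodes <node> <key>=<value> --overwrite`."""
    return ["kubectl", "label", "nodes", node, f"{key}={workload_class}", "--overwrite"]


def taint_command(node: str, workload_class: str, key: str = WORKLOAD_KEY,
                  effect: str = "NoSchedule") -> list:
    """Build `kubectl taint nodes <node> <key>=<value>:<effect> --overwrite`."""
    if effect not in VALID_EFFECTS:
        raise ValueError(f"unknown taint effect: {effect}")
    return ["kubectl", "taint", "nodes", node, f"{key}={workload_class}:{effect}", "--overwrite"]


if __name__ == "__main__":
    # Hypothetical node name; print the commands for review instead of executing them.
    for command in (label_command("node-1", "cas"), taint_command("node-1", "cas")):
        print(shlex.join(command))
```

The label lets pods select the node group, while the taint keeps pods without a matching toleration off those nodes; the two are normally applied together.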
In previous SAS Viya deployments (for example, 3.5 and earlier), SAS CAS nodes, SPRE, and
microservices nodes were required and pricing was based on the number of processing cores. With
the SAS Viya Kubernetes option, customers have the flexibility of cloud-native models, in which
they choose their solutions based on the number of users, types of users, and total revenue.
For details on baseline recommendations, refer to the Costs and Licenses section in the Migrating
SAS Viya to the AWS Cloud guide.
Storage types
AWS has many storage types for temporary and permanent requirements. This section
addresses the storage options for SAS on AWS, based on field experience and lab testing.
SAS Viya CAS is not only an analytic and transformation engine; it is also a data server. It loads
data into CAS tables in order to analyze and process it. The source format of these tables
can vary (including SASHDAT, CSV, Oracle, SQL Server, or Hadoop), and the in-memory content
is backed by permanent storage.
The following permanent storage options are suggested to support the SAS data files and SAS Viya
CAS tables:
• Amazon Elastic Block Store (Amazon EBS) – Stripe together a minimum of 4 EBS volumes for I/O
bandwidth aggregation.
• EBS ST1 (throughput optimized HDD) – Storage designed for large block sequential I/O. A 12.5
TB volume can sustain 500 MB/second. Smaller volumes have a lower baseline, and throughput of
up to 500 MB/second is available only during the burst window.
• For high-throughput read-heavy workloads (like in SAS), update the read-ahead setting on
EBS ST1/IO1 from the default 256 KB to 8 MB. EBS IO1 (provisioned IOPS SSD) storage can also be
used.
• Other EBS storage types like GP2 (general purpose storage) and SC1 (cold storage) are not
suitable for permanent SAS 9 or SAS CAS data files.
• RAID 0 configuration is preferred because fault tolerance is not a determining criterion for
these workloads.
• Customers can also choose to have EBS IO1 volumes (provisioned storage). However, costs
would increase because IO1 volumes are charged by storage and by provisioned IOPS. For example,
32,000 IOPS can yield as much as 500 MB/second, but customers would pay an additional amount
for the desired provisioned IOPS.
• Using SAS/ACCESS to Redshift, SAS data sets can be loaded into Amazon S3/Amazon Redshift
using the Amazon S3 capabilities of multi-part upload, transfer acceleration, and COPY/UNLOAD
to Amazon Redshift for relational storage.
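The ST1 sizing figures in the EBS bullets above can be turned into a quick estimate. This is a rough sketch under stated assumptions: ST1 baseline throughput of 40 MB/second per TB and burst throughput of 250 MB/second per TB, both capped at 500 MB/second per volume. These per-TB rates are not given in this paper (they are consistent with its 12.5 TB figure) and should be confirmed against current Amazon EBS documentation. The function names are hypothetical, and the per-instance EBS bandwidth limit is ignored.

```python
import math

# Assumptions (verify against current EBS documentation):
ST1_BASELINE_MBPS_PER_TB = 40.0
ST1_BURST_MBPS_PER_TB = 250.0
ST1_MAX_MBPS_PER_VOLUME = 500.0


def st1_baseline_mbps(size_tb: float) -> float:
    """Sustained throughput of one ST1 volume of the given size."""
    return min(size_tb * ST1_BASELINE_MBPS_PER_TB, ST1_MAX_MBPS_PER_VOLUME)


def st1_burst_mbps(size_tb: float) -> float:
    """Throughput of one ST1 volume while burst credits last."""
    return min(size_tb * ST1_BURST_MBPS_PER_TB, ST1_MAX_MBPS_PER_VOLUME)


def volumes_for_target(target_mbps: float, size_tb: float, minimum_volumes: int = 4) -> int:
    """Volumes to stripe (RAID 0) to sustain the target, never fewer than the suggested 4."""
    return max(minimum_volumes, math.ceil(target_mbps / st1_baseline_mbps(size_tb)))


def read_ahead_sectors(read_ahead_kib: int, sector_bytes: int = 512) -> int:
    """Convert a read-ahead size to sectors, the unit accepted by `blockdev --setra`."""
    return read_ahead_kib * 1024 // sector_bytes


if __name__ == "__main__":
    print(st1_baseline_mbps(12.5), st1_burst_mbps(4))
    print(volumes_for_target(2000, 4), read_ahead_sectors(8 * 1024))
```

The estimate covers volume throughput only; the instance's own EBS bandwidth can be the lower limit, so both should be checked when sizing a stripe.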
• I3 instances feature low latency NVMe SSDs striped together with RAID 0. Use NVMe devices to
support high bandwidth, low latency, and sequential I/O.
• If additional storage is required, default to permanent SAS storage.
• Amazon FSx for Lustre — Provides a high-performance file system optimized for fast processing
of workloads such as machine learning and high performance computing. These workloads
commonly require data to be presented through a fast and scalable file system interface and
typically have data sets stored on long-term data stores like S3.
AWS sets up the Lustre file system with the mount options noatime and flock. SAS prefers the flock
option, and the mount options must be properly implemented for FSx for Lustre.
FSx does not allow dynamic expansion of the size of the Lustre file system. If a larger size is
required, then a new file system must be set up, and data must be copied to the new file system.
• Amazon Elastic File System (Amazon EFS) – Amazon EFS supports the Network File System version 4
(NFSv4) protocol and allows multiple Amazon EC2 instances to interact with EFS. However, the
maximum I/O throughput is 500 MB per second per instance. For more information on limits, refer
to the EFS documentation.
Multiple EFS file systems per instance are required to overcome this I/O throughput limitation,
which applies in addition to the limit of a single NIC per Amazon EC2 instance. These file systems
cannot be striped together. Each file system allows a total of 512 hard locks for any particular
file across all users and instances connected to it.
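The EFS limits above translate into a simple planning calculation. The sketch below assumes the 500 MB/second per-instance figure quoted in this section and treats the NIC bandwidth as a hard ceiling; the function names are hypothetical, and real throughput also depends on the EFS throughput mode and file sizes.

```python
import math


def nic_mb_per_second(nic_gbps: float) -> float:
    """Convert NIC bandwidth from gigabits per second to megabytes per second."""
    return nic_gbps * 1000 / 8


def plan_efs(target_mbps: float, nic_gbps: float, per_file_system_mbps: float = 500.0):
    """Return (file systems needed, achievable MB/second) for one EC2 instance."""
    count = max(1, math.ceil(target_mbps / per_file_system_mbps))
    achievable = min(count * per_file_system_mbps, nic_mb_per_second(nic_gbps))
    return count, achievable


if __name__ == "__main__":
    # A 2,000 MB/second target on a 10 Gbps NIC is capped by the NIC, not by EFS.
    print(plan_efs(2000, 10))
```

Because the file systems cannot be striped together, the data must be partitioned across them at the application level for the aggregate figure to be reached.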
Placement groups in SAS
In a partition placement group, each group is split into logical segments called partitions,
each with its own set of racks that have separate power supplies and networks. This provides
hardware resiliency in case of failure. A partition placement group can have partitions in multiple
Availability Zones in the same Region.
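A placement group request can be prepared and validated before it is sent to Amazon EC2. The sketch below only builds the keyword arguments that could be passed to the boto3 EC2 client's create_placement_group call; it makes no AWS call itself. The helper name placement_group_request, the group names, and the limit of 7 partitions are assumptions to verify against the EC2 documentation.

```python
VALID_STRATEGIES = ("cluster", "partition", "spread")
MAX_PARTITIONS = 7  # assumption: documented per-AZ partition limit


def placement_group_request(name: str, strategy: str = "cluster", partition_count=None) -> dict:
    """Build keyword arguments for an EC2 create_placement_group request."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    request = {"GroupName": name, "Strategy": strategy}
    if partition_count is not None:
        if strategy != "partition":
            raise ValueError("partition_count applies only to the partition strategy")
        if not 1 <= partition_count <= MAX_PARTITIONS:
            raise ValueError(f"partition_count must be between 1 and {MAX_PARTITIONS}")
        request["PartitionCount"] = partition_count
    return request


if __name__ == "__main__":
    # Hypothetical group names; with boto3 installed and credentials configured,
    # a request could be sent as:
    #   boto3.client("ec2").create_placement_group(**request)
    print(placement_group_request("sas-grid-cluster"))
    print(placement_group_request("sas-grid-partitions", "partition", 3))
```

A cluster strategy suits the low-latency node-to-node traffic described for SAS grid nodes, while a partition strategy trades some of that proximity for hardware isolation between partitions.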
Clients
SAS clients like SAS Enterprise Guide, SAS Data Integration Studio, and SAS Studio are most
commonly placed within the same Region where the other SAS infrastructure is located.
These clients can run on a Windows server or a Windows Virtual Desktop, so you must
decide where to place these Windows systems.
Depending on the volume of data being transferred back to the SAS client, having the clients and
backend server in the same Availability Zone, placement group, or Region yields the fastest results.
SAS clients can also be placed on the same instance (Windows Server) as the backend server,
allowing SAS users to access both the clients and the backend server.
SAS infrastructure
SAS has several tools that allow sharing of SAS data files between on-premises systems and SAS
applications that run on AWS, in either direction. Network bandwidth over the internet can be
limited, sometimes as low as 500 KB/second, which can further constrain the I/O required by SAS
applications.
If higher I/O throughput is needed, you can use AWS Direct Connect, which is a private network
connection between AWS and your data center or corporate network that provides a high-bandwidth,
consistent network experience. It is best practice to keep SAS applications and the frequently
accessed data in the same Region and Availability Zone.
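To see why link bandwidth matters for data kept outside AWS, the transfer time can be estimated from the data size and the sustained bandwidth. This is a back-of-the-envelope sketch using decimal units (1 GB = 1,000 MB); the 10 GB data set and the 1 Gbps link are illustrative assumptions, and protocol overhead is ignored.

```python
def gbps_to_mb_per_second(gbps: float) -> float:
    """Convert link speed from gigabits per second to megabytes per second."""
    return gbps * 1000 / 8


def transfer_seconds(size_gb: float, bandwidth_mb_per_second: float) -> float:
    """Time to move size_gb of data at a sustained bandwidth, ignoring overhead."""
    return size_gb * 1000 / bandwidth_mb_per_second


if __name__ == "__main__":
    slow = transfer_seconds(10, 0.5)                         # 500 KB/second internet link
    fast = transfer_seconds(10, gbps_to_mb_per_second(1))    # assumed 1 Gbps dedicated link
    print(f"{slow / 3600:.1f} hours versus {fast:.0f} seconds")
```

Any job with an SLA should include this transfer time in its run-time budget when its source data lives outside AWS.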
Authentication tools
If the interaction is frequent, it is recommended to co-locate the authentication tool in the same
Region, Availability Zone, or placement group. However, in many cases, interaction with the
authentication tool is infrequent once access is obtained, and the tool can be located either
on premises or in another Availability Zone.
Authentication
For Windows Server, AWS provides AWS Directory Service or AD Connector for connecting to an
existing on-premises Microsoft Active Directory. For added security, multi-factor authentication
(MFA) can be enabled; refer to Enable MFA on AD Connector.
High availability
Depending on the need for high availability (HA), a process should be started on a new instance
if the existing instance fails. This failover practice is a benefit of running SAS applications on
AWS.
For SAS Grid Manager customers, a shared file system, such as Intel Lustre or IBM Spectrum Scale,
will remain operational if one of the nodes associated with the shared file system terminates
unexpectedly. However, any data associated with the node will not be available until the node is
restored. Only one copy of the data is stored by default, but it is possible to enable replication
so that two or three copies of the data are kept across the nodes of the
shared file system, especially with FSx for Lustre, which is backed up into
Amazon S3.
If customers would like to implement redundancy mechanisms, they must decide the downtime
SLAs that they are willing to accept and, based on those choices, implement a pilot light, cold
start, warm-standby, or active-active setup. With any of these HA options, customers must mirror
the SAS deployment, SAS files, and data store to the appropriate Region or Availability Zone. At
minimum, customers are expected to build for cold starts, with data stores and deployment files
backed up to Amazon S3 in separate production accounts. For more information, refer to the option
for single host SAS 9.4 on AWS with backups to Amazon S3 Glacier.
Network and security
SAS 9.4 can be deployed within a customer’s VPC, in a private subnet containing the required
EC2 instances and permanent storage devices.
A public subnet can contain a NAT gateway that allows instances in a private subnet to connect to
the internet or other AWS services, while also preventing inbound connections from the internet to
the SAS server.
A bastion host can be placed within the public subnet, with security group rules that allow traffic
between the public bastion host and the SAS servers placed in the private subnet.
An internet gateway can be used for connectivity between the internet and SAS servers in a VPC for
hosting public websites.
Conclusion
This whitepaper helps SAS customers and migration consultants understand the different AWS
capabilities that best meet the needs of their SAS workloads. The key takeaway is that I/O
throughput is crucial and can be a limiting factor in successful SAS deployments on AWS if not
implemented correctly.
At the same time, it is important for AWS migration consultants and solutions architects to broadly
understand SAS infrastructure and its components, so that they can match the base performance
requirements with the corresponding infrastructure and technical improvements being released by
AWS.
This whitepaper raises awareness of SAS workload requirements within its core and ancillary
components, and how to best meet those requirements in AWS. It is crucial to understand that the
choices of compute, storage, and application architecture placement are key to achieving the best
performance that AWS can offer for SAS deployments.
The information in this paper is based on customer experience and expertise from SAS and AWS
working together at the time of writing. AWS offerings are constantly improving, and
therefore it is in the best interest of the reader to understand the rationale used in the selection
process and to consider what was done as a point-in-time design.
Contributors
Contributors to this document include:
Document revisions
To be notified about updates to this whitepaper, subscribe to the RSS feed.
AWS Glossary
For the latest AWS terminology, see the AWS glossary in the AWS Glossary Reference.