Technical Report
Optimizing Mentor Graphics Calibre on NetApp All Flash FAS and Clustered Data ONTAP 8.3.2: Storage Best Practices Guide
Bikash Roy Choudhury, NetApp
Hien T. Vu, Mentor Graphics
Marty Gartner, Mentor Graphics
Min Tsao, Mentor Graphics
March 2016 | TR-4499
Abstract
Mentor Graphics Calibre® is one of the most commonly used tools in the system on chip (SoC) manufacturing
process. Calibre handles different parts of the workflow from Calma Graphic Data System (GDSII) data to
mask flow, helping to provide high wafer yield and to reduce the cost of operation. The input files provided by the
chip design houses describe the physical details of SoCs in GDSII or Open Artwork System Interchange
Standard (OASIS) format. While Calibre as an application continues to be optimized to reduce the time to
mask, the underlying storage infrastructure also plays a significant role in turnaround time (TAT).
The infrastructure consists of network file share storage, the network layer, and the compute farm. NetApp®
storage is primarily used to store the GDSII/OASIS files and the intellectual property files in a shared file
system accessed by Calibre from the compute farm nodes over NFS version 3 (NFSv3). This paper
illustrates how All Flash FAS (AFF) and storage best practices improve the TAT for optical proximity
correction (OPC), pattern matching (PM), and mask decomposition (MDP).
TABLE OF CONTENTS
1 Introduction
2 Target Audience
3 Calibre Functions
4 Clustered Data ONTAP for Calibre Workloads
4.1 Performance
4.2 High Availability and Reliability
4.3 Capacity
4.4 Storage Efficiency
4.5 Agile Infrastructure
4.6 Data Protection
4.7 Manageability
4.8 Cost
5 Calibre Validation with Clustered Data ONTAP 8.3.2
5.1 Performance Validation Objectives
5.2 Test Environment
5.3 Test Results
6 Best Practice Recommendations for Calibre with Clustered Data ONTAP 8.3.2
6.1 Storage Cluster Node Architecture
6.2 File System Optimization
6.3 AFF Optimization
6.4 Storage Network Optimization
6.5 NFSv3 Optimization
6.6 Storage QoS
6.7 NDO
7 Compute Farm Optimization
8 Summary
9 Conclusion
LIST OF FIGURES
Figure 1) Calibre workflow.
Figure 2) Workload balancing for verification workloads.
Figure 3) Test results.
Figure 4) Active-active SAS loop configuration for SSD shelves.
1 Introduction
The chip design lifecycle goes from conceptualizing a logical design to physically manufacturing it on a
silicon chip. Except for a few large semiconductor companies, most companies that design chips do not
have their own foundries to manufacture them. Either way, the design files are handed over from the
design teams or organizations to the foundry, which performs back-end physical verification, fracturing,
and masking of the designs on real silicon wafers before they enter fabrication.
All design organizations perform final physical verification tasks before the design files are sent to the
foundries for manufacture. This process is called tapeout. After the foundries receive the GDSII/OASIS
files, the files go through verification, rule check, pattern matching (PM), optical proximity correction
(OPC), and mask data preparation (MDP). Then they are finally manufactured in the fabrication units.
Mentor Graphics Calibre is one of the most widely used tools among design organizations and foundries.
Calibre has different modules that perform a variety of functions post-tapeout. The challenge of the post-
tapeout workflow is maintaining tight control for high wafer yield, which in turn reduces time to mask and
operating costs. While some of the workloads from modules such as scatter bar and bias in the pre-OPC
stage are memory intensive, the workloads generated by PM, OPC, and MDP are latency sensitive. They
rely on the performance of the storage, network, and compute infrastructure to support and complement
the speed and quality of Calibre.
NetApp has a strong storage footprint with almost all semiconductor and foundry users and has been
successful in meeting business requirements with its performance, high reliability, and storage efficiency.
The NetApp clustered file system in the NetApp Data ONTAP® 8.3.2 operating system provides scale-up
and scale-out storage architectures to store large and complex chip designs. The system addresses the
growing storage needs of customers while efficiently handling the different workloads generated during
the entire chip-design cycle. NetApp clustered Data ONTAP 8.3.2 provides the following key drivers to
shorten the chip-design process with a faster time to market and improved ROI:
• Performance
• High availability and reliability
• Capacity
• Storage efficiency
• Agile infrastructure
• Data protection
• Manageability
• Cost
2 Target Audience
Calibre is a popular tapeout tool for complex digital designs. This technical report is for design engineers,
storage administrators, and architects. The information in this report covers:
• Best practices and sizing required with clustered Data ONTAP 8.3.2 to support the performance, capacity, availability, and manageability requirements of some Calibre workloads
• How using the NetApp scale-out clustered file system solution on All Flash FAS (AFF) for Calibre improves performance for simulation through storage optimizations
• The best practices and optimizations that are dynamic to the NetApp storage, network, and compute nodes and that do not require changing the Calibre application or any existing workflows
3 Calibre Functions
Calibre performs various functions on GDS and OASIS files, such as taking multiple layers as input and producing multiple files each with a single layer of fractured patterns for variable shape beams (VSBs). OPC, PM, and MDP have been identified as workloads that generate a lot of I/O operations to the
underlying storage. NetApp AFF on FAS8080 provides better latency and more performance headroom to
accommodate scalable workloads (see Figure 1).
Figure 1) Calibre workflow.
Figure 1 illustrates the different phases of the Calibre workflow from the design to photo mask before
getting into fabrication. The Calibre tool suite has a solution for every step of the semiconductor manufacturing
design-to-mask flow, which starts with design database signoff and finishes with the photomask pattern.
For the signoff, Calibre nmDRC, Calibre nmLVS, and Calibre xRC are physical and circuit verification
tools to check for design rule compliance, to compare the layout to the schematic, and to extract parasitic
parameters of the designs, respectively. The signoff databases are typically in the GDSII or OASIS
format.
Before the layout data is converted to the photomask patterns, it goes through a series of modifications to
enable the printing of subwavelength geometries. Calibre TDopc, Calibre OPCpro, and Calibre nmOPC
are the rule- and model-based optical proximity correction (OPC) tools that are used to compensate for
diffraction-related effects. Calibre OPCsbar and Calibre nmSRAF are resolution enhancement technology
(RET) tools that are used to insert scattering bars or subresolution assist features to improve the depth of
focus. Calibre Pattern Matching can be used to detect specific regions in the layout where unique
applications of OPC or RET can be applied.
The size of the output databases after these modifications is usually much larger than that of the input.
The number of geometries is increased with the insertion of the subresolution assist features. During the
OPC, the polygonal data is segmented to allow small movements of the segments. This segmentation
increases the number of vertices and hence also increases the output databases’ size.
Calibre FRACTURE is used for converting the GDSII or OASIS data into a photomask pattern. The
specific format of the pattern file depends on the mask writer used to write the photomask.
4 Clustered Data ONTAP for Calibre Workloads
NetApp clustered Data ONTAP provides advanced technologies for software-defined storage that
abstracts data from the underlying hardware by virtualizing the storage infrastructure with storage virtual
machines (SVMs). This process enables an efficient, scalable, nondisruptive environment. Some of these
virtualization capabilities might be similar to past NetApp vFiler® unit functionality. Others go beyond
anything else available today.
Clustered Data ONTAP is built on the same trusted hardware that NetApp has been selling for years. We
bring together the different hardware platforms, connect them, and give them the intelligence to
communicate with each other in a clustered environment. The following sections detail the key benefits
that clustered Data ONTAP provides for Calibre workloads.
4.1 Performance
Tapeout environments mostly use Network File System (NFS) to mount volumes from storage onto
compute nodes. With NFS, scaling the number of nodes in the compute farm is easy. With clustered Data
ONTAP, however, storage can also scale seamlessly to provide the enhanced I/O operations per second
(IOPS), bandwidth, performance, and efficiency that are required by different chip-design tools.
The following are requirements in chip-design production scenarios to provide top-notch performance:
• Larger memory footprint
• Greater number of cores for concurrent processing
• Higher capacity limits
Users might require 1,000,000 IOPS from multiple volumes on the storage for different phases of physical
verification, PM, and OPC from multiple projects running in parallel. In clustered Data ONTAP, symlinks are
replaced by cluster namespace junctions, so all the volumes that are part of a project can be spread
across different nodes in the cluster. Every node in the cluster then contributes to the IOPS requirement for that
project.
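As a rough sketch of this junction-based layout (the vserver, volume, and aggregate names here are hypothetical), each project volume can be created on a different node's aggregate and mounted under a common junction path:
bumblebee::*> volume create -vserver vs1_eda_Cali -volume proj11_vol01 -aggregate aggr1_node1 -size 2TB -junction-path /proj11/vol01 -space-guarantee none
bumblebee::*> volume create -vserver vs1_eda_Cali -volume proj11_vol02 -aggregate aggr1_node2 -size 2TB -junction-path /proj11/vol02 -space-guarantee none
NFS clients mount /proj11 once and transparently reach volumes served by different cluster nodes, so the project's aggregate IOPS scale with the number of nodes hosting its volumes.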
Figure 2 illustrates how the IOPS requirement is spread across different controllers. Proj11 has six
volumes spread out on four FAS nodes in a production cluster. Each node is capable of doing more than
250,000 IOPS from the cache. The 1,000,000-IOPS requirement for Proj11 can be achieved by spreading
the flexible volumes in the cluster namespace within an SVM. These volumes can grow and shrink in
size. They also can be moved seamlessly, without disrupting the application, to any cluster node that is
capable of providing the desired performance.
Figure 2) Workload balancing for verification workloads.
AFF on FAS8080 provides a significant boost in the number of IOPS and reduction in latency to have an
overall improved turnaround time (TAT) for different Calibre workloads. AFF provides improved storage
space efficiency with inline compression and deduplication and very low latency for read and write
operations.
4.2 High Availability and Reliability
With scale-out architectures, it is very important to have volumes that are highly available and accessible
at all times by the verification applications. Clustered Data ONTAP 8.3.2 provides a high level of
availability at the following levels for all verification workloads:
• Storage controller
• Network
• NFS protocol
The cluster storage can be set up to fail over to the surviving partner node in the high-availability (HA) pair
and to another network port in the cluster namespace. NFS access can also fail over through a
different node in the cluster in case the NFS clients cannot reach the desired volumes. A chip verification
setup typically consists of a single large aggregate on each controller.
NetApp RAID DP® technology provides data resiliency against single- and double-disk failures.
Nondisruptive upgrades (NDUs) for clustered Data ONTAP versions and disk shelf firmware keep
operations nondisruptive to the chip design application. This capability enables clustered Data ONTAP to
provide 99.999% reliability.
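As a quick sanity check of HA readiness before placing verification workloads (verify the available fields against your clustered Data ONTAP release), storage failover status can be confirmed from the cluster shell:
bumblebee::*> storage failover show -fields enabled,possible
A takeover-capable HA pair reports both fields as true for each node.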
4.3 Capacity
Clustered Data ONTAP 8.3.2 supports large aggregates and flexible volumes for various hardware
platforms. The number of supported flexible volumes on a single FAS controller is higher for high-end
platforms than for midrange platforms. For further details, refer to the
clustered Data ONTAP 8.3.2 release notes at
https://library.netapp.com/ecm/ecm_get_file/ECMLP2348067 and the system configuration guide at
http://mysupport.netapp.com/documentation/docweb/index.html?productID=62217.
Flexible volumes that host different chip designs can nondisruptively move to an aggregate on a different
controller for capacity load balancing. The flexible volume can move from an aggregate on midrange
platforms to aggregates on high-end platforms to provide a higher capacity limit. This capability provides
more autonomy for applications and services and dynamically responds to the shift in workloads.
4.4 Storage Efficiency
Clustered Data ONTAP 8.3.2 provides all the storage efficiencies required for most electronic design
automation (EDA) tier 1 applications, including NetApp Snapshot® copies, thin provisioning,
space-efficient cloning, deduplication, and data compression:
Thin provisioning. Thin provisioning makes a huge impact when provisioning storage space for volumes that are part of individual projects. Thick-provisioned volumes are guaranteed 100% of their space from the start of a project to the finish, even if the project files do not require the entire space, leaving very little space to provision newer projects; project X cannot borrow space from project Y. Therefore, NetApp recommends enabling thin provisioning for chip-design volumes mounted over NFS.
As the files generated from different projects continue to be created, updated, and deleted, the free space
is managed at the aggregate level. Field statistics show that, at any point over the course of a project,
user data fills about 30% to 60% of the aggregate space, leaving roughly a third of the aggregate
available to accommodate new chip-design projects. The NetApp OnCommand® Workflow
Automation (WFA) and Operations Manager tools provide storage provisioning and alarms, respectively,
that are triggered when the aggregates fill to a configurable limit (normally 80%). Administrators
can then move the volumes nondisruptively to an aggregate on a different controller that has more space
available.
Space efficiency. Clustered Data ONTAP 8.3.2 and the NetApp WAFL® (Write Anywhere File Layout) file system use a 4KB block size. Calibre application workloads combine random reads and writes to the data files with sequential writes to the log files generated during the workflow, across both small and large file sizes. The small files are not mirrored, which improves the space efficiency of the storage holding the design files. Also, deduplication and compression savings are preserved at the destination when data is moved by NetApp SnapVault® technology from the primary storage.
4.5 Agile Infrastructure
With clustered Data ONTAP 8.3.2, volumes and IP addresses are no longer tied to the physical
hardware. The SVM with the cluster namespace spans multiple controllers in the cluster. Storage can be
tiered with different types of disks, such as solid-state drives (SSDs), SAS, or SATA, depending on the
service-level offerings for different chip-design workloads. Other infrastructure features include:
Provisioning or scaling out. New SVMs that consist of chip-design volumes can be created on the existing hardware for different applications and tools. Existing SVMs can grow seamlessly as new hardware is added to the cluster. SVMs can be provisioned on demand for individual departments, companies, or applications (see the sketch at the end of this section).
Multitenancy. Many tenants can use the physical clustered nodes. SVMs provide a secure logical boundary between tenants. The bottom line is that data constituents such as volumes are decoupled from the hardware plane to provide more agility to the storage infrastructure.
Unified storage. Clustered Data ONTAP 8.3.2 offers unified storage that natively supports NFS, Common Internet File System (CIFS), Fibre Channel Protocol (FCP), and Internet Small Computer Systems Interface (iSCSI). Because EDA workloads run mostly on NFS, different versions of NFS, such as NFSv3 and NFSv4.1/pNFS, can coexist and access the same file system that is exported from the storage.
Storage quality of service (QoS). Clustered Data ONTAP 8.3.2 also provides storage QoS, in which IOPS and bandwidth limits can be set on files, volumes, and SVMs to isolate test and development and rogue workloads from production. Storage QoS provides the following:
Enables consolidation of mixed workloads without affecting the performance of different Calibre volumes or files in a multitenant environment
Isolates and throttles resource-intensive workloads to deliver consistent performance
Simplifies workload management
Nondisruptive operations (NDO). Chip-design volumes and logical interface (LIF) movement within the SVM enable nondisruptive lifecycle operations that are transparent to the applications. NDO can be applicable in the following scenarios:
Unplanned events: infrastructure resiliency against hardware and software failures
Planned events:
Capacity and performance load balancing
Software upgrades and hardware technical refreshes
These features make the infrastructure more agile and enable IT-managed data centers to provide IT as
a service.
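As a minimal sketch of provisioning a new SVM for a separate department or tenant (all names here are hypothetical), a single command creates the SVM and its root volume:
bumblebee::*> vserver create -vserver vs2_eda_dept2 -rootvolume vs2_root -aggregate aggr1_node3 -rootvolume-security-style unix
Data volumes, LIFs, and export policies are then added to the new SVM, keeping its namespace and network identity isolated from other tenants on the same hardware.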
4.6 Data Protection
Clustered Data ONTAP provides a high level of data protection through file system–consistent Snapshot
copies, NetApp SnapMirror® technology, and SnapVault. Snapshot copies and SnapVault are the most
commonly used tools for data protection during design verification. In clustered Data ONTAP 8.3.2,
SnapVault performs a logical replication at the volume level that can be done within an SVM, across
SVMs, and across clusters. Because a common use case for SnapVault is for remote or off-site backups,
the remote sites can have single-node and two-node switchless clusters to help EDA customers scale
with minimal cost and complexity.
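A hedged sketch of a volume-level SnapVault relationship to a remote cluster follows (the destination SVM, volume, and daily schedule are assumptions, and the SVM and cluster peering must already exist):
bumblebee::*> snapmirror create -source-path vs1_eda_Cali:VOL06 -destination-path vs_backup:VOL06_dst -type XDP -policy XDPDefault -schedule daily
bumblebee::*> snapmirror initialize -destination-path vs_backup:VOL06_dst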
4.7 Manageability
Manageability becomes a lot easier with SVMs and cluster namespaces in the clustered scale-out
architecture, compared with managing different islands of storage, as was the case with Data ONTAP
operating in 7-Mode. Clustered Data ONTAP 8.3.2 offers a single virtualized pool of all storage. A single
logical pool can be provisioned across many arrays.
In traditional Data ONTAP systems operating in 7-Mode, SnapMirror was used to move volumes for more
capacity, for more compute power, or for archiving purposes. This approach is no longer necessary:
with clustered Data ONTAP, volumes can be moved nondisruptively within the namespace.
Using the OnCommand Unified Manager and WFA tools, clustered storage can be set up and configured
to provision storage and set policies for different types of workloads and nondisruptive operations.
4.8 Cost
Clustered Data ONTAP 8.3.2 can provide a virtual pool of storage across different FAS platforms. A six-
node cluster can have a combination of AFF with FAS8080 nodes for better latency and IOPS
requirements. The cluster can also have FAS8260s and FAS8240s with SAS/SATA disks to handle other
low-priority projects, backup, and so on in the same cluster. The projects can start in the low-end
FAS8260 nodes and can automatically be moved into AFF (FAS8080) nodes for high performance during
the mid to final stages. Finally, project volumes can be moved into FAS8240 nodes in the same cluster for
archiving or into a SnapMirror source, where data is mirrored to different destination targets.
During the entire life of the project, the Calibre volumes can be moved across different tiers of storage
that are set up according to price and performance. This capability is a unique aspect of the SVM in
clustered Data ONTAP: the namespace spans different tiers of storage that are set up with respect to
price and service-level objective (SLO) for different phases of the verification workloads. With declining
prices and increasing capacities for SSDs, these drives can provide the required performance at a very low cost.
5 Calibre Validation with Clustered Data ONTAP 8.3.2
There is an ongoing requirement to improve application runtime through better storage infrastructure,
for reasons that include multiple layers in a single design, faster time to market, and optimized licensing
costs. The metric used to measure storage performance is directly related to the improvement in
application wall clock time.
5.1 Performance Validation Objectives
The primary objectives for validating the Calibre tool on NetApp storage were:
• To enable our customers to optimize and size the storage infrastructure to provide the best job completion time for users
• To highlight that the changes suggested in the best practices are dynamic and do not require changing the workflow or performing any application optimization
5.2 Test Environment
The tests were performed with the Calibre 2015.4 release with 128 cores in the Mentor Graphics location
at Wilsonville, Oregon. Different modules such as pattern matching (PM), optical proximity correction
(OPC), and mask data preparation (MDP) in the Calibre workflow generated a lot of I/O to the storage. All
the best practice recommendations documented in section 6 were followed in the NetApp cluster setup to
test verification workload. The verification test scenario consisted of:
• A FAS6290 with 900GB 10k RPM SAS disks was compared with a FAS8080 EX with 800GB eMLC SSDs.
• The production FAS6290 controller had 8 RAID groups of 18 disks each in a single aggregate. The FAS8080 EX AFF had 2 RAID groups of 23 disks each in a single aggregate.
• The FAS6290 nodes formed the production cluster, running clustered Data ONTAP 8.2.1. The tests run on these nodes were identified as the baseline. Further tests run on the AFF FAS8080 with SSDs, running clustered Data ONTAP 8.3.2, were compared against the FAS6290 baseline.
• The same MDP, OPC, and PM test cases were used on the FAS6290/SAS and FAS8080 EX/SSD aggregates. Both the FAS6290 SAS and AFF FAS8080 EX used aggregated 40GbE connections to the network.
• All the compute nodes had a 1GbE connection to the storage cluster.
• The compute nodes ran CentOS release 6.6 with kernel 2.6.32-504.30.3.el6.x86_64.
• The master process was multithreaded. There were 40 cores used for each of the MDP, OPC, and PM test cases, run in parallel using a total of 120 cores. Each job was assigned a single core.
5.3 Test Results
Tests indicated that MDP showed barely any improvement in wall clock time, while OPC improved by about 5% and PM by about 15% (see Figure 3).
Figure 3) Test results.
The MDP test case generated a very large number of I/Os to the NetApp controller. The storage-level
statistics indicated that the lack of an adequate number of disks in the aggregate on the FAS8080 resulted
in a disk bottleneck: the test scenario had four 10GbE connections to the storage but not enough disks to
commit the writes quickly. NetApp recommends at least 5 RAID groups, each with 22 disks, in the SSD
aggregate. Having the recommended number of disks can further improve application response time.
6 Best Practice Recommendations for Calibre with Clustered Data
ONTAP 8.3.2
Calibre is a popular tapeout tool used for back-end verification and many more functions. An increasing
number of customers deploy clustered Data ONTAP 8.3.2 for storage, supporting the tapeout phase.
Scale-out clustered FAS storage must be properly architected to handle the different Calibre module
workloads. The aggregates and volumes that store the GDSII/OASIS and IP files must be optimally laid
out in the cluster nodes.
The best practice recommendations in this section provide guidance to optimize clustered Data ONTAP,
the network layer, and the compute nodes for Calibre workloads. It is also necessary to validate some of
the key clustered Data ONTAP 8.3.2 features and functions to improve the overall efficiency of the
Calibre application.
6.1 Storage Cluster Node Architecture
NetApp highly recommends implementing the right storage platform in a clustered Data ONTAP setup. It
also recommends adequate storage sizing and configuration to accommodate logical and functional
verification during chip-design processes that have different SLOs. If the workload is performance driven
and has the highest SLO, NetApp recommends storage controllers with multiple cores and a large
memory footprint. Use SSDs integrated with a FAS8080 controller running on clustered Data ONTAP
8.3.2 that has additional functionality to handle verification workloads for designs that require faster
response time.
Choosing the Right Hardware for Calibre Workloads in a Clustered Scale-Out Architecture
The Calibre cluster setup can provide different SLOs for OPC, PM, and MDP functions in the workflow.
Front-end verification, place and route, and design rule check (DRC) can coexist in the same or
different SVMs. The choice of hardware can be different based on the price-to-capacity (GB) and price-
to-performance ratios for various SLOs:
• If the Calibre workload from different modules requires performance at the highest level, NetApp strongly recommends FAS8080 controllers with a minimum of 800GB SSDs and a two-path active-active multipath 12Gb (6Gb + 6Gb) SAS backplane. The AFF system provides very low predictable latency (1–2ms) and high IOPS. This configuration is the one most recommended for Calibre workloads.
• For the next level of SLO performance, you may use a FAS8060 with hybrid aggregates (a combination of SSDs and spinning disks such as SAS/SATA), otherwise known as Flash Pool™, for Calibre workloads.
• A FAS8060 with PCIe-based 1TB Flash Cache™ and 900GB 10k RPM SAS drives may also be used, based on the price-to-performance requirements.
• Flash Pool and Flash Cache can be used together or interchangeably, depending on the cost-to-performance requirements for different design workloads.
• If the cluster setup is designed to accommodate verification files for SnapMirror targets, backup, or archiving, NetApp recommends a minimum of FAS8040/FAS8020 controllers with SATA disks.
• A four-node, eight-node, or larger cluster with different types of disks (AFF, SAS, and SATA) can be configured based on the SLOs for different workloads.
• NetApp highly recommends a minimum of five RAID groups for spinning disk aggregates. Each RAID group should be configured in 22D + 2P format for SAS and 16D + 2P format for SATA.
• For hybrid aggregates, NetApp recommends a minimum of five RAID groups: one RAID group of SSDs and the remaining four of SAS disks, each configured with 22D + 2P disks.
• With AFF, NetApp recommends two or three RAID groups, depending on the capacity requirement, each configured in 20D + 2P format (see the sketch after this list).
NetApp highly recommends engaging with the appropriate NetApp sales account team to evaluate
business requirements before architecting the clustered scale-out setup for the environment.
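As a hedged illustration of the AFF recommendation above (node and aggregate names are hypothetical), 44 SSDs can be laid out as two 20D + 2P RAID groups by capping the RAID group size at 22:
bumblebee::*> storage aggregate create -aggregate aff_aggr1 -node fas8080c-svl01 -diskcount 44 -raidtype raid_dp -maxraidsize 22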
6.2 File System Optimization
After the volumes have been created in the SVM, NetApp recommends certain best practice
configurations on the aggregate and volumes to address the following fragmentation issues in a Calibre
storage environment:
• Constant writes and deletions to the file system during the assembly phase cause files to be fragmented across storage.
• Free space for writes to complete a full stripe becomes scarce.
The file system can be kept healthy at all times with some maintenance and housekeeping activities on
the storage as it ages and grows in size. These activities include the following:
Defragmenting the file system. Reallocating is a low-priority process that constantly defragments
the file system, and it can run in the background. However, it requires sufficient free space to
succeed. NetApp recommends implementing measures to keep the aggregate use under 80%. If the
aggregate runs close to 90% capacity, the following considerations apply:
Some free space is required to temporarily move the data blocks to free space and then rewrite
those blocks in full and complete stripes in contiguous locations on the disk. This action optimizes
the reads that follow.
Insufficient space in the aggregate allows the reallocate process to run in the background, but
defragmentation of the file system never completes.
When there is insufficient free space to defragment the system, an NDO to move the production
chip design volume to another controller that is part of the cluster setup for capacity balancing
must occur.
New shelves must be added to the original controller to provide more space to the aggregate that
is running low on space. Then run the following command for all the volumes in that aggregate:
reallocate start -vserver vs1_eda_Cali -path /vol/VOL06 -force true
The reallocate start operation forces all the existing volumes to spread out on the new disk spindles that
were added to the aggregate. Otherwise, the new writes coming into that aggregate go only to the new
disks.
Defragmenting free space. Continuous segment cleaning, introduced in clustered Data ONTAP
8.1.1 and further optimized in clustered Data ONTAP 8.3, helps coalesce the deleted blocks in the
free pool to use for subsequent writes.
Thin provisioning. The volumes in the cluster namespace can be thin provisioned by disabling the
space guarantee and enabling autogrow. Doing so provides flexibility to provision space for chip
designs and allows different project volumes to autogrow in increments of 1GB (see the sketch below).
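A minimal sketch of the autogrow setting above, assuming volume VOL06 and a 10TB cap (verify the autosize flag names against the running clustered Data ONTAP release):
bumblebee::*> volume autosize -vserver vs1_eda_Cali -volume VOL06 -mode grow -increment-size 1GB -maximum-size 10TB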
NetApp recommends enabling the following storage options to optimize the entire life of the file system.
File-System Optimization Best Practices for FAS8xxx with SAS/SATA Aggregates
The following atime-update setting can be put into place only from the CLI in advanced privilege mode (note the ::*> prompt); disabling access-time updates avoids a metadata update for every read:
bumblebee::*> vol modify -vserver vs1_eda_Cali -volume VOL06 -atime-update false
Volume modify successful on volume VOL06 of Vserver vs1_eda_Cali.
NetApp recommends always setting up an alarm that triggers as soon as an aggregate reaches
80% capacity. Critical chip-design volumes that need more space can then be moved, automatically
by using WFA or manually, to another aggregate on a different controller.
NetApp recommends thin provisioning the volumes. You can do thin provisioning when the
volumes are created, or you can modify them later. Thin provisioning can also be implemented by
using OnCommand System Manager 3.0 from a GUI:
bumblebee::*> vol modify -vserver vs1_eda_Cali -volume VOL06 -space-guarantee none
(volume modify)
Volume modify successful on volume: VOL06
• Adequate sizing is required for the number of files in each directory and for path name lengths; longer path names lead to a higher number of NFS LOOKUP operations.
• Default quotas cannot be implemented for users and groups; include an explicit quota entry for each user and group (see the example below).
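A hedged example of an explicit user quota entry (the user name and disk limit are hypothetical):
bumblebee::*> volume quota policy rule create -vserver vs1_eda_Cali -policy-name default -volume VOL06 -type user -target jdoe -qtree "" -disk-limit 50GB
bumblebee::*> volume quota on -vserver vs1_eda_Cali -volume VOL06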
6.3 AFF Optimization
NetApp highly recommends AFF for very low, predictable latency and better IOPS rather than spinning
disk and hybrid aggregates. Clustered Data ONTAP 8.3.2 provides further read optimizations for AFF.
These optimizations apply to both random and sequential reads. Improvements to inline compression in
flash media also enable better storage efficiency and performance.
AFF Optimization Best Practices
Because of the sequential nature of the workload, it is very important to cable the back-end SAS
loop between the controller and the SSD shelves correctly. Random reads and writes are not a
significant part of the design workload, but the sequential workloads can saturate the back-end SAS
loop between the FAS8080 controllers and the SSD subsystem. NetApp highly recommends an active-
active path between the controller pair and the SSDs. For each SSD shelf, half of the disks are
software-owned by one node, and the other half are owned by its partner. Random read and write
workloads are served by the same setup and do not require a separate configuration.
Figure 4 illustrates the active-active connection between the controllers and the SSD shelves.
Make sure that software ownership of the disks is split equally between the controllers for each
SSD shelf. In clustered Data ONTAP 8.3.2, the disk ownership is automatically split into half
between the two controllers.
Configure a minimum of 5 RAID groups, each with 23 disks, in a single aggregate for Calibre
workloads.
Disable inline compression on the volumes for Calibre workloads (see the sketch after Figure 4).
Figure 4) Active-active SAS loop configuration for SSD shelves.
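A minimal sketch of the inline-compression recommendation above, assuming volume VOL06 (the flags apply only if storage efficiency has been enabled on the volume):
bumblebee::*> volume efficiency modify -vserver vs1_eda_Cali -volume VOL06 -compression false -inline-compression false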
6.4 Storage Network Optimization
After you create the aggregates and volumes based on the recommended sizes to support the Calibre
workload, you must configure the network. At that time, the cluster, management, and data ports are all
physically connected and configured on all the cluster nodes to the cluster switches. Configuring the
network includes the following:
Data port aggregation. Before you configure the LIFs and routing tables for each SVM, it is
important to aggregate at least two 10GbE data ports for handling the Calibre workloads. Depending
on the number of GDSII/OASIS volumes that each controller has, NetApp recommends aggregating a
larger number of data ports than are required to achieve the desired SLO. Using Link Aggregation
Control Protocol (LACP) on the port aggregations, on both the storage and the switch, is
recommended.
LIF failover. In clustered Data ONTAP, LIF IP addresses are no longer tied to physical network ports.
The addresses are part of the SVM. When LIF IP addresses are created, NetApp recommends
configuring a failover path in case the home port goes offline. If a data port failure occurs, the LIF can
fail over nondisruptively to another controller. Doing so enables the application to continue accessing
the volume even though the LIF moved to a different controller in the SVM.
Storage Network Optimization Best Practices
Aggregate at least two 10GbE data ports on each cluster node that interface with the compute farm:
bumblebee::*> network port ifgrp create -node fas6280c-svl07 -ifgrp e7e -distr-func ip -mode multimode
bumblebee::*> network port ifgrp add-port -node fas6280c-svl07 -ifgrp e7e -port e0d
bumblebee::*> network port ifgrp add-port -node fas6280c-svl07 -ifgrp e7e -port e0f
Use the following option to configure the LIF failover for any LIF configured in the SVM clusterwide:
bumblebee::*> net int modify -vserver vs1_eda_Cali -failover-group clusterwide -lif vs1_eda_Cali_data3 -home-node fas6280c-svl09 -home-port e9e -address 172.31.22.172 -netmask 255.255.255.0 -routing-group d172.31.22.0/24
Always follow a ratio of one volume to one LIF. In that way, every volume has its own LIF. If the
volume moves to a different controller, the LIF should move along with it.
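As a hedged sketch of this one-volume-to-one-LIF guideline (the LIF name, address, and home port are hypothetical), a dedicated data LIF can be created on the node that hosts the volume:
bumblebee::*> network interface create -vserver vs1_eda_Cali -lif vs1_eda_Cali_data5 -role data -data-protocol nfs -home-node fas6280c-svl07 -home-port e7e -address 172.31.22.175 -netmask 255.255.255.0
Clients then mount the volume through this LIF; if the volume later moves, the LIF can be migrated to the volume's new home node.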
6.5 NFSv3 Optimization
Almost all Calibre verification workloads access the file system from the back-end storage controllers
over the NFSv3 protocol:
NFSv3 is a stateless protocol and is geared primarily toward performance-driven workloads such as
the verification environment with asynchronous writes. Communication between the NFSv3 client and
storage takes place over remote procedure calls (RPCs).
Red Hat Enterprise Linux (RHEL) 5.x is the most common Linux vendor–supported version and is
used by most of the semiconductor companies in Calibre compute farm environments. However,
using RHEL 6.7 is recommended.
NFS runs in the kernel space of the network stack in the clustered Data ONTAP code. Minimal tuning
is required for NFS running on the network stack.
As one of the benefits of clustered Data ONTAP 8.3.1 and later, a fast path for the local data path is
available for NFSv3.
With a large number of compute nodes accessing files from a single controller, the Transmission Control
Protocol (TCP) receive window on the storage can fill quickly and exhaust the receive buffer. Storage
does not accept additional TCP segments over the wire until the receive buffer is freed up. NetApp
therefore recommends increasing the TCP receive buffer value.
NetApp also recommends enabling NFS failover groups to provide another layer of protection at the
protocol level.
NFSv3 Optimization Best Practices
• Setting force-spinnp-readdir enables more efficient readdir calls from the data stack, and increasing the TCP maximum transfer size further optimizes performance; both can be set in one command:
nfs modify -vserver vs1_eda_Cali -force-spinnp-readdir true -tcp-max-xfer-size 65536
• Follow these steps to configure the NFS failover groups. The example shows how the LIFs vs1_eda_Cali_data3 and vs1_eda_Cali_data4, which are assigned to an NFS failover group, move the NFS traffic over port e7e on node fas6280c-svl07:
bumblebee::*> network interface failover-groups create -failover-group Cali_failover_group -node fas6280c-svl07 -port e7e
bumblebee::*> network interface failover-groups show -failover-group Cali_failover_group -instance
Failover Group Name: Cali_failover_group
Node: fas6280c-svl07
Port: e7e
1 entries were displayed.
bumblebee::*> network interface modify -vserver vs1_eda_Cali -lif vs1_eda_Cali_data3,vs1_eda_Cali_data4 -failover-group Cali_failover_group
2 entries were modified.
6.6 Storage QoS
Storage QoS provides another level of storage efficiency in which IOPS and bandwidth limits can be set
for workloads that are not critical or when setting up SLOs on different workloads. In EDA environments,
storage QoS plays an important role:
Rogue workloads can be isolated with proper IOPS and bandwidth limits. Set a different QoS policy group for users who generate these kinds of workloads in a production environment. This isolation can be done at an SVM, volume, or specific file level.
In an IT-managed cloud infrastructure, storage QoS helps to run multiple tenants with different service-level offerings. New tenants can be added to the existing one as long as the storage platform has the headroom to handle all the workload requirements. Different workloads such as OPC, PM, and MDP have different performance SLOs assigned to them.
Storage QoS Configuration
A QoS policy group must be created for different SVMs in the cluster. In the following example,
two QoS policy groups, business_critical and non_critical, are created with different bandwidth
and IOPS limits:
bumblebee::*> qos policy-group create -policy-group business_critical -vserver vs1_eda_Cali -max-throughput 1.2GB/sec
bumblebee::*> qos policy-group create -policy-group non_critical -vserver vs1_eda_Cali -max-throughput 2000iops
bumblebee::*> qos policy-group show
Name              Vserver      Class        Wklds Throughput
----------------- ------------ ------------ ----- ------------
business_critical vs1_eda_Cali user-defined -     0-1.20GB/S
non_critical      vs1_eda_Cali user-defined -     0-2000IOPS
2 entries were displayed.
The volume CMSGE is then set with the QoS policy group non_critical:
bumblebee::*> vol modify -vserver vs1_eda_Cali -volume CMSGE -qos-policy-group non_critical
(volume modify)
Volume modify successful on volume: CMSGE
The file writerandom.2g.88.log has been set to a non_critical QoS policy group. You cannot set a
QoS policy group on a file when the volume that holds that file already has a QoS policy group set
on it. The QoS policy group on the volume must be removed before the policy can be set on a
particular file in that volume:
bumblebee::*> file modify -vserver vs1_eda_Cali -volume VOL06 -file //OpenSPARCT1/Cloud_free_trial_demo/OpenSparc-T1/model_dir/farm_cpu_test/writerandom.2g.88.log -qos-policy-group non_critical
bumblebee::*> qos workload show
Workload                  Wid   Policy Group Vserver      Volume LUN Qtree File
------------------------- ----- ------------ ------------ ------ --- ----- ----
CMSGE-wid12296            12296 non_critical vs1_eda_Cali CMSGE  -   -     -
file-writerandom-wid11328 11328 non_critical vs1_eda_Cali VOL06  -   -     /OpenSPARCT1/Cloud_free_trial_demo/OpenSparc-T1/model_dir/farm_cpu_test/writerandom.2g.88.log
2 entries were displayed.
6.7 NDO
NDO completely changes the way that clustered Data ONTAP keeps data functioning and available to the
application and the users who access the data. Disruptive scenarios were tested in the lab under
verification workloads to determine whether users were disrupted at the application layer:
When a data port was taken offline, the LIF IP address instantly failed over to another node in the
cluster. This failover did not cause any outage for the user accessing the data under the load.
The chip-design volume was moved to a different cluster node under the active verification load for
capacity- and workload-balancing reasons. The volume and the LIF were moved to the new location
in the cluster namespace without disrupting the user’s running jobs on the chip-design volume.
Nondisruptive Operations with Volume Move
In this example, the volume VOL06 is moved from an aggregate in FAS8080-svl02 to an
aggregate in FAS8080-svl01 while the verification workload is in progress. The application is
not disrupted when the volumes are moved on the storage.
bumblebee::*> vol move start -vserver vs1_eda_Cali -volume VOL06 -destination-aggregate aggr1_fas8080c_svl01_1
(volume move start)
[Job 17268] Job is queued: Move "VOL06" in Vserver "vs1_eda_Cali" to aggregate "aggr1_fas8080c_svl01_1". Use the "volume move show -vserver vs1_eda_Cali -volume VOL06" command to view the status of this operation.
job show <job_id> can be used to check the status of the “vol move.”
bumblebee::*> job show 17268
                               Owning
Job ID Name        Vserver    Node           State
------ ----------- ---------- -------------- ----------
17268  Volume Move bumblebee  fas8080c-svl01 Success
       Description: Move "VOL06" in Vserver "vs1_eda_Cali" to aggregate "aggr1_fas8080c_svl01_1"
NDO can also be performed during hardware technical refreshes when all volumes on a node are
evacuated to another cluster node and moved back nondisruptively to the new controllers after the
refresh.
Nondisruptive upgrades (NDUs) can also be performed on clustered Data ONTAP versions and the
shelf and disk firmware without causing any outage to the application.
7 Compute Farm Optimization
The engineering compute farms in foundries consist of hundreds of cores in pods in a master and slave
setup, which translates to hundreds to thousands of physical compute nodes. Virtualization is not favored,
because of the performance overhead for jobs processed in batch mode. Linux is the most commonly
used operating system in compute farms; the Linux clients in the compute farm provide the cores that
process the submitted jobs.
For better client-side performance with clustered Data ONTAP 8.3.2, the Calibre tool and the schedulers,
such as Sun Grid Engine or Load Sharing Facility, must run on RHEL 6.6 and later.
Considering the high volume of nodes in the compute farm, it is unrealistic to make significant changes
dynamically on each of the clients. Based on the Calibre workload evaluation, the following
recommendations for Linux clients contribute a great deal to improving the job completion times for
various chip-design activities.
Compute Node Optimization for NFSv3 Mounts
Turn off hyperthreading on the BIOS setting of each of the Linux nodes if the nodes are
multisocket. Turning off hyperthreading is not required if the compute nodes are single socket.
Use the recommended mount options when mounting over NFSv3 on the Linux compute
nodes (see the consolidated example after this list):
vers=3,rw,bg,hard,rsize=65536,wsize=65536,proto=tcp,intr,timeo=600
Set sunrpc.tcp_slot_table_entries = 128; this setting increases the number of concurrent RPC
requests that the client can keep in flight. This option applies to pre-RHEL 6.6 kernels that mount
over NFSv3. RHEL 6.6 and later changed the TCP slot table handling, so the following module
option must be set instead when mounting file systems on an RHEL 6.6 kernel over NFSv3 (it is
not required when mounting over NFSv4.1). Without it, NetApp storage might have its network
buffers depleted by a flood of RPC requests from Linux clients over NFSv3:
Create a new file: /etc/modprobe.d/sunrpc-local.conf
Add the following entry: options sunrpc tcp_max_slot_table_entries=128
If the compute nodes use 10GbE connections, then the following tuning options are required.
The following changes do not apply for clients that use 1GbE connections:
Disable irqbalance on the nodes:
[root@ibmx3650-svl51 ~]# service irqbalance stop
Stopping irqbalance: [ OK ]
[root@ibmx3650-svl51 ~]# chkconfig irqbalance off
Set net.core.netdev_max_backlog = 300000 to avoid dropped packets on a 10GbE connection.
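The following consolidated sketch applies the client-side settings above (the storage hostname, export path, and mount point are hypothetical; the kernel parameters are those named in this section):
# Mount a Calibre project volume with the recommended NFSv3 options
[root@ibmx3650-svl51 ~]# mount -t nfs -o vers=3,rw,bg,hard,rsize=65536,wsize=65536,proto=tcp,intr,timeo=600 vs1-eda-cali-lif:/proj11 /mnt/proj11
# Persist the RPC slot table module option for RHEL 6.6 and later
[root@ibmx3650-svl51 ~]# echo "options sunrpc tcp_max_slot_table_entries=128" > /etc/modprobe.d/sunrpc-local.conf
# Apply and persist the backlog setting for 10GbE clients
[root@ibmx3650-svl51 ~]# sysctl -w net.core.netdev_max_backlog=300000
[root@ibmx3650-svl51 ~]# echo "net.core.netdev_max_backlog = 300000" >> /etc/sysctl.conf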
8 Summary
Clustered Data ONTAP 8.3.2 provides the agility, storage efficiency, data protection, capacity, and
compute power needed in a scale-out architecture for Calibre. After a project is completed, the OPC, PM,
and MDP volumes are archived and can move seamlessly through the different tiers of storage on
clustered Data ONTAP 8.3.2. The compute nodes keep scaling to address the back-end verification and
mask preparation requirements. Most of the time, the underlying storage that holds all mask preparation
workflows and processes goes unoptimized; architecting and optimizing that storage can further improve
job completion times for the Calibre tool.
The primary goal of this effort is to help improve job completion time at the application layer. Foundries or
chip manufacturers are always looking for extra performance capability from storage to complete jobs
more quickly and lead to faster time to market. Providing guidance and optimizing the storage to perform
to its capacity have always been the objectives. The increasing complexities in the chip-design process
require more predictive and sustainable performance in a scale-out architecture. The storage
performance of a Calibre workload with clustered Data ONTAP 8.3.2 and AFF is an optimal choice for
providing the best performance.
9 Conclusion
Layout vs. schematic, DRC, parasitic extraction, OPC, and MDP are some of the most important parts of
the pre- and post-tapeout phases before getting into mask preparation and finally fabrication. Although
most of these modules in the design and manufacturing phases are compute and memory intensive,
some critical parts of the Calibre tool drive a lot of I/O to storage. Turnaround time is always a critical
requirement during the chip manufacturing process, and even more so when the number of silicon layers
is increasing with 14nm and FinFETs. Optimizing the storage, network, and compute complements the
overall efficiency of the Calibre tool.
As mentioned earlier, MDP and OPC are highly I/O driven, and having the recommended number of
RAID groups helps to boost the performance of these modules. With SSD prices declining and capacities
growing, using AFF for high performance and predictable low latency is highly recommended. Improving
the wall clock time accelerates the post-tapeout phases before entering fabrication, which improves
overall ROI and optimizes license costs.
Refer to the Interoperability Matrix Tool (IMT) on the NetApp Support site to validate that the exact product and feature versions described in this document are supported for your specific environment. The NetApp IMT defines the product components and versions that can be used to construct configurations that are supported by NetApp. Specific results depend on each customer's installation in accordance with published specifications.
Copyright Information
Copyright © 1994–2016 NetApp, Inc. All rights reserved. Printed in the U.S. No part of this document covered by copyright may be reproduced in any form or by any means—graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an electronic retrieval system—without prior written permission of the copyright owner.
Software derived from copyrighted NetApp material is subject to the following license and disclaimer:
THIS SOFTWARE IS PROVIDED BY NETAPP "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL NETAPP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
NetApp reserves the right to change any products described herein at any time, and without notice. NetApp assumes no responsibility or liability arising from the use of products described herein, except as expressly agreed to in writing by NetApp. The use or purchase of this product does not convey a license under any patent rights, trademark rights, or any other intellectual property rights of NetApp.
The product described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications.
RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.277-7103 (October 1988) and FAR 52-227-19 (June 1987).
Trademark Information
NetApp, the NetApp logo, Go Further, Faster, AltaVault, ASUP, AutoSupport, Campaign Express, Cloud
ONTAP, Clustered Data ONTAP, Customer Fitness, Data ONTAP, DataMotion, Flash Accel, Flash
Cache, Flash Pool, FlashRay, FlexArray, FlexCache, FlexClone, FlexPod, FlexScale, FlexShare, FlexVol,
FPolicy, GetSuccessful, LockVault, Manage ONTAP, Mars, MetroCluster, MultiStore, NetApp Fitness,
NetApp Insight, OnCommand, ONTAP, ONTAPI, RAID DP, RAID-TEC, SANshare, SANtricity,
SecureShare, Simplicity, Simulate ONTAP, SnapCenter, SnapCopy, Snap Creator, SnapDrive,
SnapIntegrator, SnapLock, SnapManager, SnapMirror, SnapMover, SnapProtect, SnapRestore,
Snapshot, SnapValidator, SnapVault, SolidFire, StorageGRID, Tech OnTap, Unbound Cloud, vFiler,
WAFL, and other names are trademarks or registered trademarks of NetApp Inc., in the United States
and/or other countries. All other brands or products are trademarks or registered trademarks of their
respective holders and should be treated as such. A current list of NetApp trademarks is available on the
web at http://www.netapp.com/us/legal/netapptmlist.aspx.
TR-4499-0316