
Post on 22-May-2020


rENIAC Accelerates High-Volume, Transactional Databases with Intel® FPGAs

Data Center

Database Acceleration with Intel® FPGAs

Transforming the Apache Cassandra* NoSQL database for AI and the data-driven economy

Executive summary

The Apache Cassandra* NoSQL database is used widely for the data-intensive use cases that are shaping the modern era, from IoT and fraud detection to personalization and financial services. While Cassandra meets many enterprise-class requirements, it reaches limitations when processing transactional and AI applications. rENIAC offers an innovative “intermediary” layer between Cassandra clients and database nodes. Comprising a Data Engine* and the rENIAC software, the solution brings storage closer to the network through intelligent caching. Running on Intel® architecture and taking advantage of the acceleration possible with Intel® FPGAs, rENIAC delivers outstanding performance, high throughput, and low latency for demanding workloads and applications, simplifying AI adoption.

Challenges

Managing the exponential growth of data without compromising performance is a pressing issue for many businesses and enterprises. The popular Cassandra NoSQL database handles database traffic extremely well across multiple commodity servers, providing high write throughput and scaling linearly through the addition of nodes. But as workloads become more demanding with huge databases, AI model training and neural network inference, high-bandwidth data such as video and imaging, and millisecond transaction speeds, many infrastructures reach limitations impacting throughput, performance, and latency. As with other big data challenges, many data centers have responded by adding more and more DRAM to every Cassandra node. Because each node can hold only a fixed amount of memory, more nodes end up being added just to expand memory at the cluster level. This approach works to a certain extent, but only if one can afford to add large amounts of DRAM to each node.

“Data traffic performance is a critical element to meet response times required for great user experiences and scalability. By accelerating Cassandra* databases using rENIAC’s Distributed Data Engine* technology, we are speeding up critical NoSQL workloads for global solution providers and enterprises.”

—Prasanna Sundararajan, CEO, rENIAC

Solution brief


Solution Brief | rENIAC Accelerates High-Volume, Transactional Databases with Intel® FPGAs

Solution

rENIAC maximizes the capacity of the Cassandra database with a cost-effective, plug-and-play software solution accelerated by Intel® FPGAs. It delivers predictable low latency and high throughput in production environments and simplifies management of vision processing units (VPUs), all without adding more nodes. Running on Intel® CPUs, the solution is designed to enable a wide range of data-driven use cases and simplify adoption of AI on legacy infrastructure.

The solution speeds big data workloads using distributed technology, achieving significantly lower latency and higher performance on Apache Cassandra with no software changes. rENIAC’s innovative approach deploys data store acceleration as a service to enable a new class of databases and improve efficiency.

rENIAC’s software acts as a data engine, using Intel FPGAs, Intel CPUs, and memory to speed up transactions. The rENIAC Data Engine uses an Intel FPGA-based Intel® Programmable Acceleration Card (Intel® PAC) optimized for fast data access. It provides consistently low latency and high throughput; can be deployed on premise or in the public cloud; and follows Cassandra security protocols. For Cassandra NoSQL, this provides a transparent layer that is placed between the Cassandra clients and Cassandra database nodes. It caches data in local storage and responds to queries by serving data either from its local storage or fetching it from the back-end database when the data is not available in local storage. The application layer is built on top of rENIAC’s Data Engine, a technology platform for building acceleration solutions for data-centric workloads.

Increased acceleration: With Intel® FPGAs, the rENIAC solution achieves speedup of >6X [1]

Better TCO: Significantly lower TCO; reduce the number of servers by up to 50% [1]

Lower latency: >25X predictable lower latency at the 99th percentile [1]

Optimizing Apache Cassandra NoSQL performance

Intel FPGAs help optimize network-based transactions. Because rENIAC’s fundamental architecture is distributed, Intel FPGAs are used to speed distributed transactions, with compute instances such as the Alibaba* Cloud F1 instance changed into networking instances through software and firmware, allowing for considerable network acceleration. This enables a new class of databases with the capacity and speed to support intensive ecommerce and transactional environments and on-demand business models. Industry-standard Intel® architecture, including the Intel FPGAs, Intel PAC, and Intel CPU, enables high levels of performance, scalability, and reliability for the rENIAC solutions.

In a typical software stack, everything happens in the CPU. When the packet comes through, the application running on the CPU looks at the packet and executes it on the CPU. rENIAC uses the Intel FPGA as the equivalent of the CPU. Because this bypasses the kernel and CPU, data stays close to the network, and the Intel FPGA can answer requests right from hardware—boosting performance more than 6x.

Since industry-standard Intel architecture is widely used, rENIAC customers can use their legacy architecture with the Data Engine software and automatically gain the benefits, including:

• Ease of deployment: Very easy to install and use, with no software changes required

• Enterprise ready: Clustering functionality provides high availability and built-in redundancy

• Lower operational costs: Reduce the number of servers and decrease TCO

• Future-proof technology: Scalable infrastructure for AI, big data, and IoT with Intel CPUs, FPGAs, and SSDs

“rENIAC has taken a unique approach to accelerating data workloads for server infrastructure and database clusters. rENIAC’s software can provide a throughput increase of up to 6x vs. alternative implementations [1]. In addition, the company’s solution requires no code changes, allowing a seamless transition for customers to easily take advantage of Intel® FPGAs and advanced memory technologies.”

—John Sakamoto, VP, Programmable Solutions Group and general manager,

Data Center and Communications Division, Intel


Sample use case

Displaying relevant ads on digital sites requires automatically identifying browsing patterns and correlating user profile data such as demographics and region. Cassandra is frequently used to track browsing patterns and store user profiles. Digital media companies and ad providers typically have KPIs requiring relevant content delivery within 10 milliseconds. This means Cassandra has about five milliseconds to pull the right content before engagement with a customer is lost. Using Cassandra with rENIAC allows digital media companies to pull information from Cassandra at the lowest possible latency and maximize queries with the fewest nodes.

Companies streaming large data sets and rich media on demand, along with ecommerce, fraud detection, and financial services providers, are just a few examples where traditional Cassandra usage models can benefit from rENIAC acceleration.

Apache Cassandra*

Apache Cassandra* is a free and open-source, distributed, wide column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.

cassandra.apache.org

Rethinking data center architecture

The solution from rENIAC and Intel is also creating opportunities for data center architects and data scientists. Says Matt Pfeil, cofounder of DataStax: “As FPGAs continue to enter the mainstream, Cassandra users look forward to increased performance for demanding workloads and look to solutions to upgrade their existing systems. rENIAC provides promising innovations for the community.”

[Figure: Accelerating databases using rENIAC Data Engine*. Clients connect to a proxy cluster of rENIAC proxies, which sits in front of the database clusters.]

rENIAC and Intel® FPGAs

Field programmable gate arrays (FPGAs) have been recognized for their increasing role in solving performance and analytical problems in data centers. FPGAs are an essential technology to meet demanding high-performance requirements, but software is needed to take advantage of their benefits without changing existing applications.

rENIAC leverages the power and flexibility of Intel FPGA platforms by providing software solutions that bypass the traditional CPU data flow to accelerate data and traffic without requiring changes to the application software. The result is predictable workload performance for private and public clouds running critical NoSQL workloads, without requiring knowledge of designing with or programming FPGAs.


Case study: Content personalization for digital media

A large online digital media company that relies heavily on personalized content was storing user personalization data in the Cassandra NoSQL database. The company was able to meet its service level agreement (SLA) performance targets for database transactions only at the 75th percentile, resulting in unrealized revenue. In addition, it faced a 30 percent year-over-year increase in the number of database servers needed to support current levels of business growth.

Current solutions forced them to scale their Cassandra nodes both for transactions and data storage. This significantly increased their infrastructure footprint and cost structure, without improving SLAs.

Deploying rENIAC’s distributed Data Engine as a network service enabled disaggregation of transaction and data storage scaling. The result was an 8x increase in transaction throughput and 10x reduction in latency (5 to 8 milliseconds SLA at the 99th percentile), improving speed and revenue while lowering costs [1].

Metric | Current infrastructure | With rENIAC | Results
Database servers (#) | 150 | 18 | 8x reduction
Database queries per node (#) | 3,000 | 25,000 | 50% cost reduction [1]
Latency per query (i.e., SLA; P = percentile) | 75th percentile: 7–8 ms; 98th percentile: 60 ms | 99th percentile: 5–8 ms | 10x improved latency; increased revenue
Software API | Cassandra* 2.x | Cassandra 2.x | No SW changes
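The server and per-node query figures in the case study can be cross-checked with a little arithmetic. The sketch below uses only the numbers from the table; the variable names are illustrative, and the time unit for "queries per node" is not stated in the source, so the totals are unitless counts.

```python
# Figures taken from the case-study table; names are illustrative only.
servers_before, servers_after = 150, 18
queries_per_node_before, queries_per_node_after = 3_000, 25_000

# Aggregate query capacity of the database tier in each configuration.
total_before = servers_before * queries_per_node_before
total_after = servers_after * queries_per_node_after

# Ratios behind the "8x" figures quoted in the brief.
server_reduction = servers_before / servers_after
per_node_gain = queries_per_node_after / queries_per_node_before

print(f"aggregate before: {total_before:,}")           # aggregate before: 450,000
print(f"aggregate after:  {total_after:,}")            # aggregate after:  450,000
print(f"server reduction: {server_reduction:.1f}x")    # server reduction: 8.3x
print(f"per-node gain:    {per_node_gain:.1f}x")       # per-node gain:    8.3x
```

Under this reading, the smaller cluster carries the same aggregate load because each remaining node answers roughly 8.3 times as many queries, which matches the rounded "8x" in the text.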

[Figure: rENIAC technology deployed as Database-as-a-Network-Service. Current architecture: transaction and data storage scaling in the database tier. Disaggregated architecture: transaction scaling in the distributed data engine.]

How it works in brief

The Cassandra NoSQL database was designed to efficiently handle write operations. Consequently, there is always a trade-off between achievable read latency and the amount of data that one can store per node. As the data per node increases, the write pipeline exerts more pressure on the system due to increased computational and I/O demands, resulting in degradation of read latency. In addition, a combination of other factors affects overall system performance. For example, network latency and throughput have a direct impact on efficient scaling of a Cassandra cluster. Similarly, the throughput of a Cassandra cluster is affected by the efficiency of access to flash memory such as SATA* SSDs or NVMe* drives. The performance of the compression engine also affects read latency.

Data is cached at the Cassandra cluster; rENIAC does not make a copy of data and the solution follows the same security standards used for Cassandra.

rENIAC’s Data Engine offers a technology platform to build out acceleration solutions for data-centric workloads, including NoSQL databases and searches using Intel FPGAs with commodity servers or cloud instances (e.g., Alibaba Cloud F1 instance). The rENIAC distributed Data Engine enables a decoupling of the data and application layers, simultaneously acting as an I/O accelerator to resolve any bottlenecks. Large volumes of transactions are processed by the engine’s unique proprietary architecture that attaches storage-class memory directly to a low-latency network stack.

rENIAC NoSQL Data Engine for distributed databases has been designed to work with minimal configuration and without requiring any changes to the client code or the database. The rENIAC NoSQL Data Engine nodes listen for incoming queries on the configured port. For read queries, the engine parses the query and looks for the data in the local storage. If found, it returns the result to the client. If not found, it obtains the data from the database cluster, stores a copy in the local storage, and returns the result to the client.

For insert, update, and delete operations, the Data Engine forwards the query to the database cluster. When the database has successfully processed the query, the engine forwards the response to the client and updates its copy of the data.
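The read and write paths described above follow a familiar read-through, write-through caching pattern. The following is a minimal sketch of that pattern only, not rENIAC's implementation: the class names `BackendDatabase` and `CachingProxy` are hypothetical, a plain dictionary stands in for both the Cassandra cluster and the engine's local storage, and CQL parsing, networking, and the FPGA data path are left out entirely.

```python
from typing import Dict, Optional


class BackendDatabase:
    """Hypothetical stand-in for the database cluster (a plain dict)."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}
        self.reads = 0  # counts how often the back end is actually queried

    def read(self, key: str) -> Optional[str]:
        self.reads += 1
        return self.rows.get(key)

    def write(self, key: str, value: str) -> bool:
        self.rows[key] = value
        return True  # acknowledge that the write succeeded

    def delete(self, key: str) -> bool:
        self.rows.pop(key, None)
        return True


class CachingProxy:
    """Hypothetical proxy placed between clients and the database."""

    def __init__(self, backend: BackendDatabase) -> None:
        self.backend = backend
        self.local: Dict[str, str] = {}  # stands in for local storage

    def read(self, key: str) -> Optional[str]:
        # Serve from local storage when the data is present.
        if key in self.local:
            return self.local[key]
        # Otherwise fetch from the back end and keep a copy for next time.
        value = self.backend.read(key)
        if value is not None:
            self.local[key] = value
        return value

    def write(self, key: str, value: str) -> bool:
        # Forward first; update the local copy only after the back end succeeds.
        ok = self.backend.write(key, value)
        if ok:
            self.local[key] = value
        return ok

    def delete(self, key: str) -> bool:
        ok = self.backend.delete(key)
        if ok:
            self.local.pop(key, None)
        return ok


db = BackendDatabase()
db.rows["user:1"] = "profile-a"
proxy = CachingProxy(db)
proxy.read("user:1")  # first read goes to the back end
proxy.read("user:1")  # second read is served from local storage
print(db.reads)       # 1
```

Updating the local copy only after the back end acknowledges the operation keeps the cache from holding data the database never accepted, which is the ordering the text describes for insert, update, and delete.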

The Data Engine generates metrics and log messages that are useful in understanding its performance. The engine sends metrics and logs to a metrics service, which displays them in a console window and browser and stores the data on disk.
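Because the brief states its latency targets as percentiles (75th, 98th, 99th), it may help to see how such a figure can be derived from raw latency samples. The sketch below uses the nearest-rank method, which is one common definition among several; it is an assumption that a metrics pipeline would compute percentiles this way, the function name `percentile` is illustrative, and the sample latencies are made up for the example rather than taken from any measurement.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("at least one sample is required")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up data: 97 fast queries and three slow outliers (milliseconds).
latencies_ms = [1.0] * 97 + [7.0, 8.0, 60.0]

print(percentile(latencies_ms, 75))   # 1.0
print(percentile(latencies_ms, 99))   # 8.0
print(percentile(latencies_ms, 100))  # 60.0
```

The example shows why a service can look healthy at the 75th percentile while a small number of slow queries dominate the tail, which is the gap the case study describes.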


INTEL® FPGA AND INTEL® PAC BENEFITS FOR OEMS AND SOLUTION PROVIDERS

Flexibility

• FPGA functionality can change each time a device is powered up. When a design engineer wants to make a change or run a trial, a new configuration file can simply be downloaded into the device.
• Changes can be made frequently to the FPGA without making costly PC board changes.

Acceleration

• FPGA devices are available off the shelf vs. ASICs (which require manufacturing cycles taking many months).
• Because of the flexibility of FPGAs, OEMs can ship systems as soon as the design is working and tested and get solutions to market faster.
• FPGAs provide off-load and acceleration functions to CPUs, effectively speeding up the entire system performance.

Integration

• Simplified white box integration.
• More functions within the FPGA mean fewer devices on the circuit board, increasing reliability by reducing the number of device failures.
• Today’s FPGAs include on-die processors, transceiver I/O at 28 Gbps (or faster), RAM blocks, DSP engines, and more.

Total cost of ownership (TCO)

• FPGAs reduce risk, allowing prototype systems to ship to customers for field trials while still providing the ability to make changes quickly before ramping to volume production.
• Support long life cycles (15 years or more).
• Eliminate the cost of redesigning and requalifying OEM production equipment if one of the electronic devices on-board is end of life (EOL).
• While ASICs may cost less per unit than an equivalent FPGA, building them requires a nonrecurring expense (NRE), expensive software tools, specialized design teams, and long manufacturing cycles.

Intel® Arria® 10 FPGAs

Intel Arria 10 FPGAs are powerful, versatile accelerators for deep learning in IoT applications. Intel Arria 10 FPGAs deliver fast core performance and up to a 20 percent fmax advantage, using publicly available OpenCores designs [2]. Intel Arria 10 FPGAs and SoCs are also up to 40 percent lower power than previous generation FPGAs and SoCs and feature the industry’s only hard floating point digital signal processing (DSP) blocks with speeds up to 1.5 tera floating-point operations per second (TFLOPS) [2]. In addition to fast processing, high-performance HD video analytics, and deterministic low latency, Intel FPGAs offer the flexibility to interface with many types of image sensors. Furthermore, a product that employs an FPGA can be updated after deployment, ensuring that infrastructure can always take advantage of the latest developments that AI affords.

The rENIAC Data Engine has three primary components:

1. A data engine that implements a complete, transparent application layer for data workloads.

2. A storage engine that implements support for dominant data models such as columnar, document, and graph, along with interfaces to contemporary flash memory technologies such as Intel® 3D XPoint™, NVMe SSDs, and SATA* SSDs.

3. A network engine that supports most networking protocols (e.g., TCP/IP, DHCP, ARP, and ICMP) for interoperability with existing data workloads on Linux* in a data center environment prone to failures.

The main features of rENIAC’s Data Engine from an application perspective are:

• CQL compatible, requiring no changes to applications
• Write-through cache
• Automatic data consistency
• Decoupling of the read and write pipeline, providing read SLAs in microseconds
• Scaling to hundreds of data engine nodes (using the gossip protocol)


1. Cassandra workload setup: Apache Cassandra v3.11; partitions: 5 million; size per partition: 4 KB (5 columns, 780 bytes per column). Cassandra Stress command:

./cassandra-stress read n=1000000 no-warmup -node 11.22.101.106 -port native=8001 -schema keyspace=4kapache310 -col n=FIXED\(5\) size=FIXED\(780\) -pop dist=uniform\(1..5000000\) -rate threads=1

Benchmark tool used: Cassandra Stress Test (industry standardized test): http://cassandra.apache.org/doc/4.0/tools/cassandra_stress.html

2. intel.com/content/www/us/en/products/programmable/fpga/arria-10.html.

Performance results are based on testing as of December 2018 and may not reflect all publicly available security updates. See configuration disclosure for details. No product can be absolutely secure.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to intel.com/benchmarks.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com/iot.

Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

Intel does not control or audit third-party data. You should review this content, consult other sources, and confirm whether referenced data are accurate.

Intel, the Intel logo, 3D XPoint, Arria, and Xeon are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

© Intel Corporation

0219/MG/CMD/PDF 338542-001US

Conclusion

rENIAC’s distributed Data Engine addresses key inefficiencies when executing data applications, providing significantly higher performance at lower TCO on scale-out architecture. No software changes to the legacy software stack are required, yet rENIAC supports core operational needs from business continuity to performance monitoring. Running on high-performance Intel architecture and taking advantage of the significant acceleration provided by Intel FPGAs, rENIAC delivers speedup for Cassandra NoSQL databases to power the data-driven, transactional economy.

With rENIAC and Intel, businesses can manage explosive data growth without compromising performance.

Learn more

Discover Intel FPGAs at intel.com/fpga.

Find out more about Intel innovation for AI at intel.com/ai.

For more information about rENIAC, please visit reniac.com or contact us at [email protected].

Find out more about Apache Cassandra at cassandra.apache.org.

About rENIAC, Inc.

rENIAC (Mountain View, California, US) reduces latency and accelerates throughput for critical workloads in public cloud, hybrid, and on-premise data centers without software changes to existing applications. The company’s distributed Data Engine is architected to benefit database, networking, and storage solutions while freeing more CPU resources to create business value.

reniac.com

Hardware | rENIAC Proxy Server | Cassandra Client | Cassandra Server
Processor | 8 cores (Intel® Xeon® Silver 4109T processor, 2.0 GHz) | 40 cores (2 x Intel® Xeon® Gold 6148 processor @ 2.40 GHz) | 32 cores (2 x Intel Xeon Gold 6130 processor @ 2.10 GHz)
Memory | 64 GB DDR3 | 128 GB RDIMM, 2666 MT/s, dual rank | 192 GB RDIMM, 2666 MT/s, dual rank
NVMe*/storage | 187 GB Intel® Optane™ Memory | 750 GB NVMe* | 224 GB NVMe
Operating system | CentOS* 7.6 | CentOS 7.6 | CentOS 7.6
Software | rENIAC FPGA Data Engine* | Apache Cassandra v3.11 | Apache Cassandra v3.11
