Designing Big Data Clusters with Cisco UCS & Nexus
Hadoop and big data seem to be everywhere these days; few technologies in IT today hold as much promise or garner as much hype and interest as big data. This session will help data centre operations teams move beyond the hype to understand Hadoop, HBase, and similar big data technologies, how they differ from traditional enterprise applications, and how to build scalable, supportable, high-performing clusters using UCS and Nexus technologies. The session will also address the questions that customers most frequently ask when building big data clusters.

After an introduction to the origins of technologies like Hadoop and the business problems they can solve, the session will detail cluster design principles and best practices from a network, compute, and application perspective. Topics include compute node hardware selection (disk, memory, CPU, etc.) and configuration; network characteristics such as buffering, latency, bandwidth, and oversubscription and their effects on the application; and a view into how Hadoop in particular works with the hardware to provide resiliency and scalability.

The session will provide practical configuration advice for running Hadoop and HBase on UCS, covering topics such as disk configuration (JBOD vs. RAID, temporary storage for "shuffle" data), Hadoop "rack awareness", the pros and cons of virtualisation, network redundancy, and cluster sizing and scaling. The session will focus on Cisco-based big data architectures, including UCS C-series servers and Nexus data centre switches. The goal of the session is to demystify big data and associated technologies and to provide practical advice that helps customers move these technologies from the realm of science project to mainstream data centre operations.
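Likewise, the JBOD-vs-RAID discussion comes down to configuration like the following: Hadoop performs best when each data disk is presented individually and listed as its own directory, letting HDFS stripe I/O across spindles rather than relying on a RAID controller. This hdfs-site.xml fragment is a minimal sketch assuming twelve disks mounted at hypothetical paths under /data; mount points and disk counts will differ per C-series configuration.

```xml
<!-- Illustrative hdfs-site.xml fragment: one JBOD disk per directory.
     Paths /data/1 ... /data/12 are assumed mount points, not defaults. -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/1/dfs,/data/2/dfs,/data/3/dfs,/data/4/dfs,/data/5/dfs,/data/6/dfs,/data/7/dfs,/data/8/dfs,/data/9/dfs,/data/10/dfs,/data/11/dfs,/data/12/dfs</value>
</property>
```

Temporary "shuffle" spill directories are typically spread across the same disks in mapred-site.xml (e.g. `mapred.local.dir`), so map output and HDFS traffic share the aggregate spindle bandwidth instead of contending on a single volume.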
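As a taste of the "rack awareness" topic mentioned above: Hadoop learns cluster topology from an administrator-supplied script (configured via `net.topology.script.file.name` in core-site.xml) that maps DataNode addresses to rack paths, so HDFS can place block replicas across racks. The sketch below is a hypothetical example; the subnet-to-rack table is illustrative and would be derived from your own addressing and cabling plan.

```python
#!/usr/bin/env python3
"""Illustrative Hadoop topology script (a sketch, not a supported tool).

Hadoop invokes the configured script with one or more DataNode
IPs/hostnames as arguments and expects one rack path per argument
on stdout. The mapping table here is a made-up example.
"""
import sys

# Hypothetical mapping: first three IPv4 octets -> rack path.
RACK_BY_SUBNET = {
    "10.1.1": "/dc1/rack1",
    "10.1.2": "/dc1/rack2",
}
DEFAULT_RACK = "/dc1/default-rack"


def rack_for(host: str) -> str:
    """Return the rack path for a host, falling back to a default rack."""
    prefix = ".".join(host.split(".")[:3])
    return RACK_BY_SUBNET.get(prefix, DEFAULT_RACK)


if __name__ == "__main__":
    # One rack path per argument, space-separated, as Hadoop expects.
    print(" ".join(rack_for(h) for h in sys.argv[1:]))
```

With this in place, HDFS's default placement policy keeps one replica on the writer's rack and the others on a different rack, which is why accurate topology mapping matters for both resiliency and east-west traffic patterns.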