Show the Business the Money: Utilize Hadoop to Drive Operational Efficiency

Last time you heard from me, I was headed to Strata, excited to announce the upcoming Dell EMC Ready Bundle for Hortonworks Hadoop. This week at DataWorks Summit, I’m proud to announce that it has arrived: the Dell EMC Ready Bundle for Hortonworks Hadoop was released for general availability last week. This solution delivers real value to customers by focusing on three key pain points:

  • Faster time to value: Dell EMC delivers an end-to-end solution guide that simplifies the architecture, use case and configuration, so customers reach a fully implemented solution sooner
  • Reduced risk: The Dell EMC Ready Bundle for Hortonworks Hadoop increases productivity by delivering a certified architecture and infrastructure guide
  • Controlled costs: Realize greater return on investment by reducing the total cost of ownership while integrating seamlessly with existing investments

While this is valuable, customers who want to leverage Hadoop often do not know the critical pieces needed to make it real for the business. The business needs to see the value of investing in new technologies like Hadoop. When I say “value,” what I really mean is doing more with less. Bill Schmarzo, Dell EMC’s Dean of Big Data, has a great saying when talking to business folks about Hadoop: “Don’t make it about the 3-Vs, for the business it has to be about: Make Me More Money!”

If you’re ready to “show the business the money”, then please come have a conversation with the Dell EMC and Hortonworks folks at DataWorks Summit.

Let’s start with a business problem that cuts across many vertical markets: data management. Gartner research found that 70% of all Enterprise Data Warehouses (EDW) are performance and capacity constrained. Software processes that clean and transform data before it can be used are eating up far too many EDW resources. Gartner says that up to 80% of EDW capacity is consumed by data integration and transformation jobs.[1] The result is longer data ingestion and preparation times, an inability to meet SLAs for business reporting, and excessively long ad hoc query response times, all of which lead to fewer business insights. This is a pain of which both Hortonworks and Dell EMC are keenly aware.

Many people think about Hadoop only in terms of data storage and analytics. However, the Hadoop Distributed File System (HDFS) and MapReduce, together with technology from Syncsort, form a high-performance data cleaning and transformation alternative to the “best practices” of traditional EDW ETL approaches.
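For readers who have not seen cleaning and transformation expressed in MapReduce terms, here is a minimal, illustrative sketch in plain Python. It does not use Hadoop or any Syncsort product; the record layout, the sample data, and every function name are assumptions invented for this example. The map step cleans and normalizes each raw line, the shuffle step groups records by key, and the reduce step keeps only the latest record per customer.

```python
from collections import defaultdict

# Hypothetical raw extract: customer_id, email, last-updated date (ISO format).
RAW_ROWS = [
    "c1, Alice@Example.com ,2017-01-05",
    "c2,bob@example.com,2017-02-10",
    "c1,alice@example.com,2017-03-01",
    "not,a,valid,row",
    "c3,,2017-03-02",
]


def map_clean(line):
    """Map step: parse one line, drop it if malformed, normalize the rest."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 3 or not all(parts):
        return  # wrong field count or an empty field: discard the row
    customer_id, email, updated = parts
    yield customer_id, (updated, email.lower())


def shuffle(pairs):
    """Shuffle step: group mapped values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_latest(customer_id, values):
    """Reduce step: keep the most recent record (ISO dates sort as strings)."""
    updated, email = max(values)
    return {"id": customer_id, "email": email, "updated": updated}


def run_job(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    mapped = (pair for line in lines for pair in map_clean(line))
    grouped = shuffle(mapped)
    return [reduce_latest(key, grouped[key]) for key in sorted(grouped)]


if __name__ == "__main__":
    for record in run_job(RAW_ROWS):
        print(record)
```

On a real cluster the map and reduce steps would run in parallel across many nodes against data in HDFS, which is where the offload benefit comes from; this single-process version only shows the shape of the work.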

“Enterprise Data Warehouse has become an organization’s central data repository built to support business decisions. Yet, the complexity and volume of data poses significant challenges to the efficiency of the existing EDW solution, causing a huge impact to the business. Hortonworks is excited to partner with Dell EMC to help solve this problem with the Dell EMC Ready Bundle for Hortonworks Hadoop in the ETL Offload use case configuration.”   – Nadeem Asghar, Field CTO and Global Head of Technical Alliances/Partner Engineering at Hortonworks

We will be highlighting the Dell EMC Ready Bundle for Hortonworks Hadoop in the ETL Offload use case configuration with Syncsort at DataWorks Summit. Let us show you how it is uniquely suited to solve this business problem at lower cost and with higher performance than traditional ETL approaches.

It’s been 7 years since the initial release of Hadoop Version 1 by the Apache Software Foundation, but there is still a shortage of people with experience in all aspects of Hadoop, including design, implementation and operation. Since 2011, Dell EMC has helped organizations close this Hadoop skills gap by providing expert guidance and know-how to streamline the architecture, design, planning, and configuration of Hadoop ETL environments. Dell EMC and Hortonworks help customers by:

  • Removing barriers: Avoiding code generation makes the solution easier to deploy and maintain, with no performance impact
  • Fast-tracking projects: Customers reach value faster because they do not need to develop expertise in Pig, Hive, and Sqoop; instead, they use SILQ to create ETL jobs in MapReduce
  • Closing the skills gap: One of the biggest barriers to offloading from the data warehouse into Hadoop is the body of legacy SQL scripts built and extended over time. SILQ takes a SQL script as input and produces MapReduce output without any coding
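To show why a SQL script can be re-expressed as MapReduce at all, the following sketch runs one simple aggregation two ways: as SQL (using Python's built-in sqlite3 module) and as hand-written map, shuffle, and reduce steps. This is only an illustration of the correspondence and makes no claim about how SILQ works internally; the table name, column names, and sample rows are assumptions.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sales rows: (region, amount).
SALES = [("east", 25.5), ("west", 10.0), ("east", 4.5), ("west", 2.0)]

# The kind of aggregation often found in legacy warehouse scripts.
SQL = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"


def run_sql(rows):
    """Run the aggregation as SQL in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = conn.execute(SQL).fetchall()
    conn.close()
    return result


def run_mapreduce(rows):
    """Run the same aggregation as explicit map, shuffle, and reduce steps."""
    # Map: the GROUP BY column becomes the key, the summed column the value.
    mapped = ((region, amount) for region, amount in rows)
    # Shuffle: collect all values that share a key.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    # Reduce: SUM(amount) per key; sorting mirrors ORDER BY region.
    return [(key, sum(groups[key])) for key in sorted(groups)]


if __name__ == "__main__":
    print(run_sql(SALES))
    print(run_mapreduce(SALES))
```

Both functions return the same per-region totals. A script with joins, filters, and multiple stages takes considerably more effort to translate by hand, which is the effort an automated offload tool is meant to remove.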

Syncsort DMX-h was designed from the ground up to make big data integration simple, combining a long history of innovation with significant Syncsort contributions to the Apache Hadoop ecosystem. With Syncsort’s DMX-h, users can begin developing Hadoop ETL jobs within hours, and the system can become fully productive within days, because users work in a drag-and-drop interface rather than learning additional complex technologies. Adding to this convenience, the SILQ offload utility provides drill-down, detailed information about each step within the data flow, including tables and data transformations. This can reduce expert analysis time from 20-plus hours to less than 30 minutes.

All the Dell EMC Ready Bundles for Hadoop enable companies to go from unpacking the equipment to full productivity within days. The new Ready Bundle for ETL offload extends that impact by reducing the development time of ETL jobs to hours instead of days or weeks.

At DataWorks Summit, we encourage you to stop by to discuss your Hadoop implementation and learn how we can help you build a solution that will “Show the Business the Money!”

[1] Gartner. “The State of Data Warehousing in 2014.” June 19, 2014.

About the Author: Armando Acosta

Armando Acosta has been involved in the IT industry for the last 15 years, with experience in architecting IT solutions and in product marketing, management, planning, and strategy. Armando’s latest role has focused on Big Data and Hadoop solutions, building new capabilities for emerging customer needs and assisting with the roadmap for new products and features. Armando is a graduate of the University of Texas at Austin and resides in Austin, TX.