Altiscale Pushes Apache Spark Into The Cloud

Altiscale announced its support for Apache Spark on the Altiscale Data Cloud. The company's CEO says this move was in response to the growing demand for access to Spark.

Mike Vizard, Contributing Editor

February 18, 2015

Altiscale CEO Raymie Stata.

With the number of application scenarios for Hadoop growing to include in-memory applications based on the Apache Spark framework, adoption of Hadoop in the cloud is rapidly expanding. To address demand for access to Spark, Altiscale announced today at the Strata + Hadoop World 2015 conference that it now supports Apache Spark on the Altiscale Data Cloud.

Altiscale CEO Raymie Stata said the Altiscale Data Cloud, built from the ground up to support big data applications in the cloud, is a vertically integrated cloud optimized for both deploying Hadoop clusters and managing them on an ongoing basis. What many IT organizations don’t always appreciate about Hadoop is how challenging it can be to upgrade a cluster. Not only do all the servers in a Hadoop cluster have to be identical, but upgrades also make it hard to maintain a consistent set of configurations.

For those reasons, Stata said, more IT organizations have been turning to the cloud to deploy Hadoop applications in production. The issue many of those organizations will encounter, however, is that general-purpose cloud computing environments are not optimized for Hadoop, he said. In fact, it was that very issue that led Datameer to unfurl a Big Data analytics application running in the cloud on top of the Altiscale infrastructure-as-a-service (IaaS) platform.

To help secure those applications, Altiscale also announced today that it is adding support for Kerberos authentication to the Altiscale Data Cloud environment.

In general, Apache Spark can run on YARN (Yet Another Resource Negotiator), Hadoop's resource management layer, which allows Spark's in-memory applications to run on a Hadoop cluster. Rather than limiting Hadoop to the role of a massive data repository, YARN makes it more feasible to deploy real-time applications on Hadoop, not only batch-oriented ones.

A recent study that Altiscale commissioned Forrester Consulting to conduct found that reliance on the Altiscale Data Cloud reduced Hadoop job failure rates by over 60 percent. The study also found a 46 percent reduction in the amount of time a Hadoop job took to complete and a 30 percent reduction in wasted compute cycles. The end result, said Stata, is not only enhanced productivity for data scientists, but also reduced IT labor costs.

With more applications heading toward both Hadoop and the cloud, solution providers across the channel need to become savvier about where and when Hadoop can be deployed. There’s no doubt that Hadoop is transforming the cost of storing massive amounts of data. But it’s also a platform for developing and deploying a new generation of Big Data applications that, just like any other application in the cloud, still need to be managed by organizations with particular sets of expertise.

About the Author

Mike Vizard

Contributing Editor, Penton Technology Group, Channel

Michael Vizard is a seasoned IT journalist, with nearly 30 years of experience writing and editing about enterprise IT issues. He is a contributor to publications including Programmableweb, IT Business Edge, CIOinsight and UBM Tech. He formerly was editorial director for Ziff-Davis Enterprise, where he launched the company’s custom content division, and has also served as editor in chief for CRN and InfoWorld. He also has held editorial positions at PC Week, Computerworld and Digital Review.
