Deep Learning Support Added to Apache Spark Platform
Posted by Databricks on 01 Nov 2016
Company Delivers Comprehensive Deep Learning Toolkit for Big Data with GPUs Alongside CPUs
Databricks, the company founded by the creators of the Apache Spark project, has announced the addition of deep learning support to its cloud-based Apache Spark platform. This enhancement adds GPU support and integrates popular deep learning libraries into Databricks’ big data platform, extending its capabilities to enable the rapid development of deep learning models. Data scientists looking to combine deep learning with big data (whether for recognizing handwriting, translating speech between languages, or distinguishing between malignant and benign tumors) can now use Databricks for every stage of their workflow, from data wrangling to model tuning. Databricks is the first to integrate these diverse workloads in a fast, secure, and easy-to-use Apache Spark platform in the cloud.
Apache Spark and Deep Learning
The 2016 Spark Survey found that production use of machine learning has increased 38 percent since 2015, making it one of Spark’s key growth areas. Many leaders in machine learning, such as Yahoo, are choosing Spark for deep learning to achieve groundbreaking results with big data.
In March 2016, Databricks created and open sourced TensorFrames, a software library that enables the popular deep learning framework TensorFlow to run on Spark. The enhancements announced today simplify deep learning on Spark by adding out-of-the-box support for using TensorFrames with GPUs, specialized hardware that can perform large numbers of deep learning-specific computations in parallel. With Databricks, data teams can easily conduct deep learning on highly optimized hardware with a few clicks or API calls.
"We are proud to enable organizations to achieve better results in their mission-critical applications and are always looking ahead at the latest technologies—such as deep learning—to provide the Spark community with the most flexible, approachable big data toolset," said Ali Ghodsi, CEO and Cofounder at Databricks.
End-to-End Deep Learning with Databricks
Databricks allows organizations to perform data wrangling, interactive exploration, stream data processing, and other advanced analytics techniques alongside deep learning in a comprehensive platform. By seamlessly combining these techniques on Databricks, organizations can avoid unwanted system complexities and simplify the development of deep learning applications such as:
- More timely and accurate cancer detection for healthcare providers: To read and interpret pathology images with higher accuracy than humans;
- Faster drug discovery for pharma: To predict therapeutic uses of drugs at earlier stages to speed up the development and sales pipelines;
- More capable artificial intelligence, such as language translation: To translate spoken speech with computers at an accuracy that rivals human performance.
"Today’s dynamic data teams are applying a broad range of analytic tools to more data, but requiring insights and faster ROI," said Tony Baer, Principal Analyst at Ovum. "With the Databricks’ platform, they can easily utilize the latest innovations, whether it’s Spark Streaming or deep learning, enabling them to build and deploy sophisticated business applications, in a simpler and faster way."
Read the blog to learn more: http://dbricks.co/db-deep-learning
Contact Databricks to get started: http://go.databricks.com/contact-databricks
Databricks’ vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team that created Apache® Spark™, a powerful open source data processing engine built for sophisticated analytics, ease of use, and speed. Databricks is the largest contributor to the open source Apache Spark project, providing 10x more code than any other company. The company has also trained over 20,000 users on Apache Spark and has the largest number of customers deploying Spark to date. Databricks provides a just-in-time data platform to simplify data integration, real-time experimentation, and robust deployment of production applications. Databricks is venture-backed by Andreessen Horowitz and NEA. For more information, contact firstname.lastname@example.org.
© Databricks 2016. All rights reserved. Apache, Apache Spark and Spark are trademarks of the Apache Software Foundation. TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.