Google Cloud Dataproc

from Wikipedia, the free encyclopedia

Basic data

Maintainer: Google Cloud Platform
Initial release: 2016
Current version: 1.2.31 (April 13, 2018)
Website: cloud.google.com/dataproc

Google Cloud Dataproc (Cloud Dataproc) is a Platform as a Service (PaaS) offered on the Google Cloud Platform. Cloud Dataproc builds on several Google Cloud Platform technologies, such as Google Compute Engine and Google Cloud Storage, to offer fully managed clusters running popular computing frameworks such as Apache Hadoop and Apache Spark.

History

Cloud Dataproc was released as a public beta service on September 23, 2015, and has been generally available since February 22, 2016.

Design

Cloud Dataproc is a Platform as a Service (PaaS) product that combines the Apache Spark and Apache Hadoop frameworks with many popular cloud computing patterns. It separates compute from storage, a relatively common design among cloud Hadoop offerings: Google Compute Engine virtual machines provide the compute, and Google Cloud Storage holds the files. A number of control and integration mechanisms coordinate the life cycle and management of clusters. Cloud Dataproc also integrates with the YARN application manager to make clusters easier to manage and use.
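
The separation of compute and storage can be illustrated with a small sketch. The Python function below builds a cluster description in the shape used by the Dataproc v1 API; field names such as `config_bucket`, `master_config` and `worker_config` follow the snake_case convention of the Python client library. The function name, the project, cluster and bucket names, and the machine type are hypothetical placeholders, and nothing is sent to Google Cloud.

```python
def build_cluster_spec(project_id, cluster_name, staging_bucket, num_workers=2):
    """Return a dict describing a Dataproc cluster (illustrative sketch only)."""
    return {
        "project_id": project_id,
        "cluster_name": cluster_name,
        "config": {
            # Storage side: a Cloud Storage bucket that exists independently
            # of the cluster's virtual machines.
            "config_bucket": staging_bucket,
            # Compute side: Compute Engine virtual machines for the master
            # and the workers. The machine type is an assumed example value.
            "master_config": {
                "num_instances": 1,
                "machine_type_uri": "n1-standard-4",
            },
            "worker_config": {
                "num_instances": num_workers,
                "machine_type_uri": "n1-standard-4",
            },
            # Image track chosen to match the 1.2.x version named in the infobox.
            "software_config": {"image_version": "1.2"},
        },
    }


# Hypothetical placeholder names, used only to show the resulting structure.
spec = build_cluster_spec(
    "example-project", "example-cluster", "example-staging-bucket"
)
print(spec["config"]["worker_config"]["num_instances"])
```

Because the bucket is referenced only by name, the virtual machines can be created and deleted without touching the stored files. With the `google-cloud-dataproc` client library installed and credentials configured, a dict of this shape would typically be supplied as the `cluster` field of a cluster-creation request; that step is deliberately left out here.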

Cloud Dataproc contains many open source packages that are used for computing, including elements from the Spark and Hadoop ecosystems, as well as open source tools to connect these frameworks with other Google Cloud Platform products.
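
One such connecting tool is the Cloud Storage connector, which lets Spark and Hadoop jobs address files through `gs://` URIs. The sketch below builds a PySpark job description in the shape used by the Dataproc v1 API (`placement`, `pyspark_job`, `main_python_file_uri`, `args`). The function name and all bucket and file names are hypothetical, and the `gs://` check is a restriction of this sketch rather than a rule of the service.

```python
def build_pyspark_job(cluster_name, main_file_uri, input_uri, output_uri):
    """Return a dict describing a PySpark job (illustrative sketch only)."""
    # This sketch only accepts Cloud Storage locations, to keep the example
    # focused on the connector; the check is an assumption made here.
    for uri in (main_file_uri, input_uri, output_uri):
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// URI, got {uri!r}")
    return {
        # The cluster that should run the job.
        "placement": {"cluster_name": cluster_name},
        "pyspark_job": {
            # The driver script itself is read from Cloud Storage.
            "main_python_file_uri": main_file_uri,
            # Input and output locations are passed to the script as arguments.
            "args": [input_uri, output_uri],
        },
    }


# Hypothetical placeholder locations.
job = build_pyspark_job(
    "example-cluster",
    "gs://example-bucket/jobs/wordcount.py",
    "gs://example-bucket/input/",
    "gs://example-bucket/output/",
)
print(job["pyspark_job"]["args"])
```

Inside the job script, the same URIs could then be handed to Spark's ordinary read and write calls, since the connector makes Cloud Storage look like a Hadoop-compatible file system.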
