Raw data are often dirty (difficult for data scientists to use in their existing state) and need to be cleaned before they can be used. A common example is data scraped from the web, which may contain encoding artifacts or HTML tags.
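To make the idea concrete, here is a minimal sketch of the kind of cleaning scraped web text typically needs. The function name and the sample string are illustrative, not part of this tutorial's pipeline; it uses only the Python standard library.

```python
import html
import re

def clean_scraped_text(raw: str) -> str:
    """Remove HTML tags and decode HTML entities from scraped text."""
    # Decode HTML entities such as &amp; and &lt; into plain characters
    text = html.unescape(raw)
    # Strip any remaining HTML tags
    text = re.sub(r"<[^>]+>", " ", text)
    # Collapse the runs of whitespace left behind by removed tags
    return re.sub(r"\s+", " ", text).strip()

print(clean_scraped_text("<p>Spark &amp; Dataproc</p>"))  # Spark & Dataproc
```

Later in the tutorial, this same kind of transformation is applied at scale inside the Spark cluster rather than on a single string.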
In this tutorial, you will learn how to load data from Stack Overflow posts into a Spark cluster hosted on Dataproc, extract useful information, and store the processed data as zipped CSV files in Google Cloud Storage.
When you work on Google Cloud Platform (GCP), you may need to enable or manage some APIs and services. For example, to use Apache Spark on Google Cloud, you need to enable the Compute Engine, Dataproc, and BigQuery Storage APIs. In this article, you will learn how to do this either from the terminal (Cloud Shell) or from the Google Cloud web console.
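From the terminal, the three APIs can be enabled with a single `gcloud` command (assuming the `gcloud` CLI is installed and you are authenticated against the right project):

```shell
# Enable the APIs required for Spark on Dataproc
gcloud services enable \
  compute.googleapis.com \
  dataproc.googleapis.com \
  bigquerystorage.googleapis.com
```

You can verify the result afterwards with `gcloud services list --enabled`.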
Before starting, you need a Google account. Go to the Google Cloud Platform console and log in with your account.
You also need to create a project:
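A project can also be created from Cloud Shell instead of the web console. The project ID below (`my-spark-tutorial`) is a placeholder; project IDs must be globally unique, so choose your own:

```shell
# Create a new GCP project (the ID must be globally unique)
gcloud projects create my-spark-tutorial --name="Spark Tutorial"

# Make it the default project for subsequent gcloud commands
gcloud config set project my-spark-tutorial
```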