Hands-on tutorial for managing Google Drive files with Python

Jun 3, 2020 00:00 · 831 words · 4 minute read GoogleDrive

This is a tutorial on how to use Python to manage Google Drive files.

1. Introduction

Google Drive is awesome! Not only does it provide an easy way to upload, manage, and share files, it is also free up to a certain storage limit. For some edu users, the storage is not only free but also unlimited. Have you wondered how to make full use of this free cloud storage from a data science perspective? It’s actually not that difficult. A simple place to start is accessing and managing Google Drive files with Python (one of the most popular programming languages for data science).

2. Get Authentication for Google Service API

First, we need to get the authentication files for the Google Service API so that our Python code can access Google Drive. To do that, we need to:

1) Create a new project in the Google Developer Console by clicking “CREATE PROJECT”, as shown below. google_console1

You can give your project a name or leave it as default. google_console2

2) Enable APIs and services by clicking “ENABLE APIS AND SERVICES”, as indicated by the red circle in the following picture. google_console3

That will bring you to the API library, shown below. google_console4

Search for “Google Drive” in the API library (indicated by the red circle in the picture above). You’ll see the following snapshot. google_console5

Click the “Google Drive API” icon, which brings you to the next step, shown below. google_console6

Then click “ENABLE” to enable the Google Drive API service. You’ll get to the next step, shown below. google_console7

3) Create credentials by clicking the “CREATE CREDENTIALS” icon (indicated by the red circle in the snapshot above). Here’s what you’ll get.

google_console8

In the snapshot above, we need to click “client ID”, as that’s what the Python program needs. Then click “CREATE” and download the JSON file, as shown in the following snapshots.

google_console9 google_console10

The downloaded JSON file is the one our Python code needs to access Google Drive.

3. Use PyDrive

Once we have the JSON file to access Google Drive, we can install a Python library, PyDrive, using pip install pydrive.

The following code authenticates and lists all files in the root of your Google Drive. Note that every time you run the program, it will open a web browser and ask you to enter your Google account and password.

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

# Rename the downloaded JSON file to client_secrets.json
# The client_secrets.json file needs to be in the same directory as the script.
gauth = GoogleAuth()
drive = GoogleDrive(gauth)

# List files in Google Drive
file_list = drive.ListFile({'q': "'root' in parents and trashed=false"}).GetList()
for file1 in file_list:
    print('title: %s, id: %s' % (file1['title'], file1['id']))
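
The q string above follows the Drive search syntax, and 'root' can be swapped for any folder ID. As a minimal sketch, the listing could be wrapped so that the folder becomes a parameter. The helper names children_query and list_folder are made up for illustration and are not part of PyDrive; drive is assumed to be an authenticated GoogleDrive instance like the one created above.

```python
def children_query(folder_id="root"):
    """Build a Drive search query for the non-trashed children of a folder."""
    return "'%s' in parents and trashed=false" % folder_id


def list_folder(drive, folder_id="root"):
    """Return (title, id) pairs for the files directly inside folder_id.

    `drive` is assumed to be an authenticated pydrive GoogleDrive instance.
    """
    file_list = drive.ListFile({'q': children_query(folder_id)}).GetList()
    return [(f['title'], f['id']) for f in file_list]
```

Calling list_folder(drive) should give the same files as the loop above, while list_folder(drive, some_folder_id) would list a subfolder instead.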

To avoid entering the password every time, we can create a settings.yaml file that tells PyDrive to save the credentials. The details can be found in the official PyDrive documentation. The YAML file looks like the following.

client_config_backend: settings
client_config:
  client_id: your_client_id
  client_secret: your_client_secret

save_credentials: True
save_credentials_backend: file
save_credentials_file: credentials.json

get_refresh_token: True

oauth_scope:
  - https://www.googleapis.com/auth/drive.file

The client_id and client_secret can be found by clicking the edit icon in the following snapshot.

google_console11
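
The same two values are normally also stored in the JSON file downloaded earlier. Here is a small sketch for reading them, assuming the file keeps the usual layout of a single top-level key such as "installed" or "web" that holds the client fields. The function name read_client_config is hypothetical.

```python
import json


def read_client_config(path="client_secrets.json"):
    """Return (client_id, client_secret) from a downloaded OAuth client file."""
    with open(path) as fh:
        data = json.load(fh)
    # Assumption: one top-level key ("installed" or "web") wraps the fields.
    for key in ("installed", "web"):
        if key in data:
            section = data[key]
            return section["client_id"], section["client_secret"]
    raise KeyError("no 'installed' or 'web' section in %s" % path)
```

The returned pair can then be pasted into the client_config section of settings.yaml.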

Rerun the Python code above. The program will ask you to enter your Google password once more and will then create a credentials.json file. Next time, Python will pick up that file and finish authentication automatically, so you don’t need to type your password again.

Now we can upload local files to a Google Drive folder, for example:
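
The same caching can also be done explicitly in code rather than through settings.yaml. The following is only a sketch of that alternative, not the approach used in this tutorial. The function name authenticate and the cred_path parameter are made up; gauth is assumed to be a pydrive.auth.GoogleAuth instance.

```python
def authenticate(gauth, cred_path="credentials.json"):
    """Authenticate `gauth`, reusing credentials cached at cred_path if possible.

    `gauth` is assumed to be a pydrive.auth.GoogleAuth instance.
    """
    gauth.LoadCredentialsFile(cred_path)  # Try the cached credentials first.
    if gauth.credentials is None:
        gauth.LocalWebserverAuth()        # No cache yet: sign in via the browser.
    elif gauth.access_token_expired:
        gauth.Refresh()                   # Cached token expired: refresh it.
    else:
        gauth.Authorize()                 # Cached token still valid.
    gauth.SaveCredentialsFile(cred_path)  # Store credentials for the next run.
    return gauth
```

It would be called as authenticate(GoogleAuth()) before constructing GoogleDrive(gauth).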

# Upload files to your Google Drive
upload_file_list = ['google_console1.png', 'google_console2.png']
for upload_file in upload_file_list:
    gfile = drive.CreateFile({'parents': [{'id': '1pzschX3uMbxU0lB5WZ6IlEEeAUE8MZ-t'}]})
    # Read file and set it as a content of this instance.
    gfile.SetContentFile(upload_file)
    gfile.Upload() # Upload the file.

The code above uploads my two local files, google_console1.png and google_console2.png, to my Google Drive folder test/. To do that, PyDrive creates two file objects in Google Drive, then reads the local files and uploads them to the target folder. Note that we need to provide the ID of the target Google Drive folder. In this example, the test folder’s ID is 1pzschX3uMbxU0lB5WZ6IlEEeAUE8MZ-t. You can get a folder’s ID from the browser. For example, when I open the test folder in my Google Drive, the browser shows the address https://drive.google.com/drive/folders/1pzschX3uMbxU0lB5WZ6IlEEeAUE8MZ-t. The ID of the test folder is the part after the last / symbol, which is 1pzschX3uMbxU0lB5WZ6IlEEeAUE8MZ-t.

Similarly, we can also write a file directly to Google Drive using the following code:
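
Pulling the folder ID out of the browser address can be automated. A minimal sketch, assuming the address has the drive.google.com/drive/folders/<id> shape shown in this section; the helper name folder_id_from_url is made up for illustration.

```python
from urllib.parse import urlparse


def folder_id_from_url(url):
    """Return the last path segment of a Google Drive folder URL."""
    # urlparse drops any query string, so only the path is inspected.
    path = urlparse(url).path.rstrip('/')
    return path.rsplit('/', 1)[-1]


test_folder_id = folder_id_from_url(
    'https://drive.google.com/drive/folders/1pzschX3uMbxU0lB5WZ6IlEEeAUE8MZ-t')
```

The resulting test_folder_id can be passed as the 'id' value in the 'parents' list of drive.CreateFile.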


file1 = drive.CreateFile({
    'parents': [{'id': '1pzschX3uMbxU0lB5WZ6IlEEeAUE8MZ-t'}],
    'title': 'Hello.txt'})  # Create GoogleDriveFile instance with title 'Hello.txt'.
file1.SetContentString('Hello World!') # Set content of the file from given string.
file1.Upload()

Of course, we can also read the file directly from Google Drive.

file2 = drive.CreateFile({'id': file1['id']})
content = file2.GetContentString()  # Fetch the file content as a string.
print(content)
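
To save a Drive file to disk instead of reading it into a string, PyDrive also provides GetContentFile. A minimal sketch; the function download_file and its parameter names are hypothetical, and drive is assumed to be an authenticated GoogleDrive instance.

```python
def download_file(drive, file_id, local_path):
    """Download the Drive file with the given ID to local_path."""
    gfile = drive.CreateFile({'id': file_id})  # Handle to the existing file.
    gfile.GetContentFile(local_path)           # Write its content to disk.
    return local_path
```

For the file created above, this could be called as download_file(drive, file1['id'], 'Hello.txt').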

4. Summary

Today, we learned how to manage Google Drive files directly using PyDrive. Remember the major steps:

* Set up the Google Drive API and create credentials
* Install PyDrive and set up authentication
* Manage Google Drive files using Python (e.g. upload and read)

More file management functionality can be found on the official PyDrive website.
