gdoc_down package


gdoc_down.core module

Save the content of a Google document to a local file.

Author:Jonathan Karr
Copyright:2016, Karr Lab
class gdoc_down.core.GDocDown(credentials=None, service=None)

Bases: object

Downloads Google documents to several formats:

  • HTML (.html)
  • LaTeX (.tex)
  • Open Office document (.odt)
  • Plain text file (.txt)
  • Portable document format (.pdf)
  • Rich text document (.rtf)
  • Word document (.docx)

The class has several special features for handling LaTeX files:

  • The program ignores all images. This allows the user to place images inside the Google document for convenience and to use \includegraphics to embed images in the compiled PDF files.
  • The program will convert all Google document comments to PDF comments.
  • The program ignores all page breaks.

The first time the program is called, the program will request access to the user’s Google account. This will create a client.json file.

  • credentials (oauth2client.client.OAuth2Credentials) – Credentials object for OAuth 2.0.
  • service (apiclient.discovery.Resource) – A Resource object with methods for interacting with the service

APPLICATION_NAME = 'gdoc_down'
CLIENT_SECRET_PATH = '/home/docs/checkouts/'
CREDENTIAL_PATH = '/home/docs/.gdoc_down/auth.json'
SCOPES = ('', '', '', '')

Authenticate with Google server

Returns:A Resource object with methods for interacting with the service
Return type:apiclient.discovery.Resource
static convert_html_to_latex(html_content)

Format Google document content downloaded in HTML format for LaTeX

  • Replace HTML characters with LaTeX commands
  • Remove images
  • Replace comments with PDF comments (using pdfcomment package)
Parameters:html_content (bytes) – HTML version of Google document
Returns:formatted LaTeX
Return type:bytes
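
The first two steps listed above (replacing special characters and removing images) can be sketched roughly as follows. This is not the package's implementation: the helper name html_text_to_latex and the escape table are assumptions made for illustration, and the conversion of comments to pdfcomment annotations is omitted.

```python
import html
import re

# Assumed mapping of characters that LaTeX treats specially.
LATEX_ESCAPES = {
    '\\': r'\textbackslash{}',
    '&': r'\&',
    '%': r'\%',
    '$': r'\$',
    '#': r'\#',
    '_': r'\_',
    '{': r'\{',
    '}': r'\}',
    '~': r'\textasciitilde{}',
    '^': r'\textasciicircum{}',
}


def html_text_to_latex(html_content):
    """Hypothetical helper: HTML bytes in, LaTeX-safe text bytes out."""
    text = html_content.decode('utf-8')
    # Remove images entirely.
    text = re.sub(r'<img\b[^>]*>', '', text, flags=re.IGNORECASE)
    # Turn paragraph ends into blank lines, then drop all remaining tags.
    text = re.sub(r'</p\s*>', '\n\n', text, flags=re.IGNORECASE)
    text = re.sub(r'<[^>]+>', '', text)
    # Decode HTML entities such as &amp; and &lt;.
    text = html.unescape(text)
    # Escape in a single pass so inserted backslashes and braces
    # are not escaped a second time.
    text = re.sub(r'[\\&%$#_{}~^]', lambda m: LATEX_ESCAPES[m.group(0)], text)
    return text.strip().encode('utf-8')
```

Escaping in one regular-expression pass, rather than with chained str.replace calls, avoids re-escaping the backslashes that earlier replacements introduce.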
download(gdoc_file, format='docx', out_path='.', extension=None)
Parameters:
  • gdoc_file (str) – path to Google document
  • format (str, optional) – desired output format (docx, html, odt, pdf, rtf, tex, txt)
  • out_path (str, optional) – path to save document
  • extension (str, optional) – extension for the saved document

Raises:Exception – if the format is unknown, or if an output file path and an extension are both specified
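
A minimal usage sketch of download, assuming the gdoc_down package is installed and Google credentials are available. The wrapper name download_gdoc and its early ValueError check are illustrative additions, not part of the package; only the GDocDown constructor and the download signature are taken from this page.

```python
# Formats listed in the download() documentation.
SUPPORTED_FORMATS = ('docx', 'html', 'odt', 'pdf', 'rtf', 'tex', 'txt')


def download_gdoc(gdoc_file, fmt='docx', out_path='.'):
    """Hypothetical wrapper: validate the format, then call GDocDown.download."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError('Unknown format: {}'.format(fmt))
    # Imported lazily so the format check works without the package installed.
    from gdoc_down.core import GDocDown
    GDocDown().download(gdoc_file, format=fmt, out_path=out_path)


# Example call (the file name is a placeholder):
# download_gdoc('notes.gdoc', fmt='tex', out_path='.')
```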


Get and save user credentials from Google. If credentials haven’t already been stored, or if the stored credentials are invalid, obtain new credentials.

Returns:Credentials object for OAuth 2.0.
Return type:oauth2client.client.OAuth2Credentials
static get_element_text(element)

Get all of the text underneath an XML element

Parameters:element (xml.etree.ElementTree.Element) – XML element
Returns:element’s text
Return type:str
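
One common way to collect all text beneath an element with the standard library is Element.itertext(). The sketch below shows that approach. The package may implement get_element_text differently, and the function name element_text is an assumption.

```python
import xml.etree.ElementTree as ET


def element_text(element):
    """Concatenate the text of an element and all of its descendants."""
    return ''.join(element.itertext())


root = ET.fromstring('<p>Hello <b>bold</b> world</p>')
print(element_text(root))  # prints "Hello bold world"
```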
static get_gdoc_id(gdoc_file)

Get Google document id

Parameters:gdoc_file (str) – path to Google document
Returns:id of Google document
Return type:str
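
A sketch of how a document id might be read from a local .gdoc file. It assumes the file is a small JSON document with a doc_id field or a url field containing /d/<id>. That layout, and the function name read_gdoc_id, are assumptions rather than something this page documents.

```python
import json
import re


def read_gdoc_id(gdoc_file):
    """Hypothetical helper: return the document id stored in a .gdoc file."""
    with open(gdoc_file, 'r') as file:
        data = json.load(file)
    # Assumed field: an explicit document id.
    if 'doc_id' in data:
        return data['doc_id']
    # Assumed fallback: extract the id from a URL of the form .../d/<id>/...
    match = re.search(r'/d/([\w-]+)', data.get('url', ''))
    if match:
        return match.group(1)
    raise ValueError('No document id found in {}'.format(gdoc_file))
```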

Module contents