Async Polling Gmail Subject Lines with asyncio + Gmail API

andreas · Python Code · 3 weeks ago · 46 Views

What you’ll build

  • A small script that authenticates with Gmail (OAuth), then polls the inbox every few seconds and prints the Subject of any new/unseen messages.
  • We’ll use the Gmail API’s users.messages.list to find message IDs and users.messages.get (with format=metadata) to fetch just the headers we need, including Subject, without downloading message bodies. (Google for Developers)
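
To make those two calls concrete, here is a minimal sketch of the response shapes involved and how a Subject can be pulled out of them. The sample dictionaries are hand-written illustrations, not real API output, and message_ids and subject_from_metadata are hypothetical helper names used only for this example.

```python
from typing import Any, Dict, List

# Hand-written sample shaped like a users.messages.list response (illustrative values).
# The list call returns only IDs; the "messages" key may be absent when nothing matches.
sample_list_response: Dict[str, Any] = {
    "messages": [
        {"id": "abc123", "threadId": "t1"},
        {"id": "def456", "threadId": "t2"},
    ],
    "resultSizeEstimate": 2,
}

# Hand-written sample shaped like a users.messages.get response with format=metadata.
# Headers arrive as a list of {"name": ..., "value": ...} pairs under "payload".
sample_metadata_response: Dict[str, Any] = {
    "id": "abc123",
    "threadId": "t1",
    "labelIds": ["INBOX", "UNREAD"],
    "payload": {
        "headers": [
            {"name": "From", "value": "Sender Name <sender@example.com>"},
            {"name": "subject", "value": "Welcome to Async Gmail!"},
        ]
    },
}


def message_ids(list_response: Dict[str, Any]) -> List[str]:
    """Extract message IDs, tolerating a response with no matches."""
    return [m["id"] for m in list_response.get("messages", [])]


def subject_from_metadata(message: Dict[str, Any], default: str = "(no subject)") -> str:
    """Find the Subject header; the name comparison is case-insensitive."""
    for header in message.get("payload", {}).get("headers", []):
        if header.get("name", "").lower() == "subject":
            return header.get("value", default)
    return default


if __name__ == "__main__":
    print(message_ids(sample_list_response))
    print(subject_from_metadata(sample_metadata_response))
```

Matching the header name case-insensitively is a defensive choice: email header names are case-insensitive, so a sender may write "subject" rather than "Subject".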

1) One-time setup: Google Cloud + OAuth

  1. Go to the Gmail API Python Quickstart and follow “Enable the Gmail API” to create a project and an OAuth client ID (Desktop). Download credentials.json. (Google for Developers)
  2. Put credentials.json in your project folder. The first run will open a browser to grant access and store a token.json locally (so you won’t have to log in again). (Google for Developers)

We’ll request a read-only scope: https://www.googleapis.com/auth/gmail.readonly. (Google for Developers)


2) Install dependencies

python -m venv venv
# Windows: venv\Scripts\activate
source venv/bin/activate
pip install google-api-python-client google-auth-httplib2 google-auth-oauthlib

These are the official client libraries used in Google’s quickstart. (Google for Developers)


3) The async idea (important!)

  • Google’s Python client is synchronous. We’ll call it inside asyncio using asyncio.to_thread(...) (available in Python 3.9+) so the event loop stays responsive.
  • We’ll poll every N seconds using await asyncio.sleep(N).
  • We’ll keep a seen_ids set so we only print new messages.
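
The three ideas above can be sketched without touching Gmail at all. In this minimal example, fake_list_ids is a made-up stand-in for a blocking API call, and the loop runs a fixed number of rounds (an assumption for the demo; the real poller loops forever).

```python
import asyncio
import time
from typing import List, Set


def fake_list_ids(poll_number: int) -> List[str]:
    """Stand-in for a blocking API call; returns a growing list of made-up IDs."""
    time.sleep(0.05)  # simulate network latency
    return [f"msg-{n}" for n in range(poll_number + 1)]


async def poll(interval: float, rounds: int) -> List[str]:
    """Poll `rounds` times and return the IDs reported as new, in order."""
    seen: Set[str] = set()
    reported: List[str] = []
    for poll_number in range(rounds):
        # Run the blocking call in a worker thread so the event loop stays free.
        ids = await asyncio.to_thread(fake_list_ids, poll_number)
        for msg_id in ids:
            if msg_id not in seen:
                seen.add(msg_id)
                reported.append(msg_id)
                print(f"[NEW] {msg_id}")
        # Non-blocking pause between polls.
        await asyncio.sleep(interval)
    return reported


if __name__ == "__main__":
    asyncio.run(poll(interval=0.1, rounds=3))
```

Each round returns every ID so far, yet each ID is printed only once, because the seen set filters out anything already reported.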

4) Full example (copy-paste runnable)





# gmail_async_poll.py
import asyncio
import os
from typing import Set, List, Dict

from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

# ---- OAuth scope: read-only inbox access ----
SCOPES = ["https://www.googleapis.com/auth/gmail.readonly"]

def get_service():
    """
    Synchronous: builds an authenticated Gmail API service.
    Based on the official quickstart pattern.
    """
    creds = None
    if os.path.exists("token.json"):
        creds = Credentials.from_authorized_user_file("token.json", SCOPES)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            # refresh silently
            from google.auth.transport.requests import Request
            creds.refresh(Request())
        else:
            # first-time browser OAuth flow
            flow = InstalledAppFlow.from_client_secrets_file("credentials.json", SCOPES)
            creds = flow.run_local_server(port=0)
        # save token for next runs
        with open("token.json", "w") as f:
            f.write(creds.to_json())

    return build("gmail", "v1", credentials=creds)

def list_message_ids_sync(service, q: str, label_ids: List[str], max_results: int = 20) -> List[str]:
    """
    Synchronous: returns a list of recent message IDs using users.messages.list.
    We use a Gmail search query 'q' and optional label filters.
    """
    res = service.users().messages().list(
        userId="me", q=q, labelIds=label_ids, maxResults=max_results
    ).execute()
    msgs = res.get("messages", [])
    return [m["id"] for m in msgs]

def get_subject_sync(service, msg_id: str) -> Dict[str, str]:
    """
    Synchronous: fetch only metadata headers for fast access,
    then extract Subject (and a few extras).
    """
    res = service.users().messages().get(
        userId="me",
        id=msg_id,
        format="metadata",
        metadataHeaders=["Subject", "From", "Date"],
    ).execute()
    headers = {h["name"]: h["value"] for h in res.get("payload", {}).get("headers", [])}
    return {"id": msg_id, "Subject": headers.get("Subject", "(no subject)"),
            "From": headers.get("From", ""), "Date": headers.get("Date", "")}

async def poll_subjects(interval_seconds: int = 10,
                        query: str = "is:unread",
                        labels: List[str] = ["INBOX"]):
    """
    Async polling loop:
     - every `interval_seconds`:
         - list recent messages matching query/labels
         - for any new IDs, fetch Subject via metadata and print it
    """
    print(f"Starting async poll every {interval_seconds}s for query={query!r}, labels={labels}")
    # Build the Gmail service once (sync), then reuse it.
    service = await asyncio.to_thread(get_service)

    seen: Set[str] = set()

    while True:
        try:
            # 1) list IDs (run sync call in a thread to avoid blocking)
            ids = await asyncio.to_thread(list_message_ids_sync, service, query, labels, 20)

            # 2) for any new IDs, fetch metadata concurrently.
            # Caveat: the client is built on httplib2, which Google's client docs
            # describe as not thread-safe; a hardened version would give each
            # worker thread its own client instead of sharing `service`.
            new_ids = [i for i in ids if i not in seen]
            if new_ids:
                tasks = [asyncio.to_thread(get_subject_sync, service, mid) for mid in new_ids]
                results = await asyncio.gather(*tasks)
                for r in results:
                    print(f"[NEW] {r['Date']}  {r['From']}  ::  {r['Subject']}")
                    seen.add(r["id"])
        except Exception as exc:
            # API or network errors (e.g. googleapiclient.errors.HttpError) should
            # not kill the poller; report the failure and retry on the next cycle.
            print(f"Poll failed, will retry: {exc}")

        # 3) sleep and repeat (outside the try, so errors never cause a tight loop)
        await asyncio.sleep(interval_seconds)

if __name__ == "__main__":
    # Change the interval or query if you like:
    # Examples for q:
    # - "is:unread"
    # - "newer_than:1d"
    # - "label:unread from:github"
    # (Note: some queries require scopes beyond gmail.metadata; we use gmail.readonly.)
    try:
        asyncio.run(poll_subjects(interval_seconds=8, query="is:unread", labels=["INBOX"]))
    except KeyboardInterrupt:
        # Ctrl+C is reliably caught around asyncio.run(), not inside the coroutine.
        print("Stopping poller...")

What the code is doing (function-by-function highlights)

  • get_service(): Implements the OAuth flow from Google’s quickstart: it loads saved tokens if present, refreshes them if expired, runs a browser consent if there are none, then builds a Gmail API client. (Google for Developers)
  • list_message_ids_sync(): Calls users.messages.list with a Gmail search query (q) and optional label filters; only returns message IDs (that’s how the API works). (Google for Developers; Stack Overflow)
  • get_subject_sync(): Calls users.messages.get with format=metadata and metadataHeaders=Subject,From,Date to efficiently pull only headers (no full body). (Google for Developers; googleapis.github.io)
  • poll_subjects(): An async loop:
    • Uses await asyncio.to_thread(...) to run each synchronous API call in a thread (so the event loop stays free).
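    • Collects new IDs and fetches their metadata concurrently with asyncio.gather. One caveat: the client library is built on httplib2, which Google’s client documentation describes as not thread-safe, so sharing a single service object across worker threads is fine for a quick tool but should be replaced by one client per thread in anything long-lived.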
    • Waits interval_seconds with await asyncio.sleep(...) and repeats.
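
One way to give each worker thread its own client is a threading.local cache. The sketch below is generic and hedged: thread_local_service, fetch_in_thread and fetch_all are hypothetical names, and the factory argument stands in for whatever builds a client (in this script, get_service could plausibly play that role, though concurrent token refreshes would need extra care).

```python
import asyncio
import threading
from typing import Any, Callable, List

# Each thread sees its own attributes on this object.
_local = threading.local()


def thread_local_service(factory: Callable[[], Any]) -> Any:
    """Return this thread's client, building it with `factory` on first use."""
    service = getattr(_local, "service", None)
    if service is None:
        service = factory()
        _local.service = service
    return service


def fetch_in_thread(factory: Callable[[], Any], msg_id: str) -> str:
    """Runs inside a worker thread and only ever touches that thread's client."""
    service = thread_local_service(factory)
    # With the real client, the users.messages.get call on `service` would go here.
    return f"{msg_id} fetched by {threading.current_thread().name}"


async def fetch_all(factory: Callable[[], Any], msg_ids: List[str]) -> List[str]:
    """Fan the blocking fetches out to worker threads and collect the results."""
    tasks = [asyncio.to_thread(fetch_in_thread, factory, m) for m in msg_ids]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    # `object` is a placeholder factory; a real one would build an API client.
    for line in asyncio.run(fetch_all(object, ["a", "b", "c"])):
        print(line)
```

Because asyncio.to_thread reuses threads from the default executor, each client is built at most once per worker thread rather than once per message.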

5) Run it

python gmail_async_poll.py
  • On first run, your browser opens → choose the Google account → allow the read-only permission → token.json is saved. (Google for Developers)
  • Leave the script running; when new mail arrives that matches your query (default is:unread in INBOX), you’ll see:
[NEW] Tue, 12 Aug 2025 18:03:22 +0000  Sender Name <sender@example.com>  ::  Welcome to Async Gmail!

Tweaks & Tips

  • Change the query (Gmail search syntax) to narrow results, e.g. from:github newer_than:1d label:unread. (Be mindful: certain queries require appropriate scopes; gmail.metadata has restrictions, we’re using gmail.readonly.) (Google Help)
  • Avoid duplicates: we used a simple in-memory seen set, which is lost on restart and grows for as long as the script runs. For reliability across restarts, persist it (e.g., a small SQLite DB or a file) or track Gmail history IDs (advanced). (Google for Developers)
  • Rate limits: every poll costs one users.messages.list call plus one users.messages.get per new message, all of which count against your Gmail API quota. Increase interval_seconds (e.g., 15–60s) for production.
  • Only headers: format=metadata is lighter/faster than fetching full messages. (Google for Developers)
  • Push vs Poll: Gmail also supports push notifications via watch/PubSub, which is more scalable, but polling is simpler for a quick tool. (See the Gmail API guides if you want to evolve this.) (Google for Developers)
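
As a rough illustration of the "persist it" tip, the seen set could be saved to a small JSON file between runs. This is a sketch under assumptions: load_seen, save_seen and the seen_ids.json file name are hypothetical and not part of the script above.

```python
import json
import os
import tempfile
from typing import Set


def load_seen(path: str) -> Set[str]:
    """Load previously seen message IDs; an absent file means nothing seen yet."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return set(json.load(f))
    except FileNotFoundError:
        return set()


def save_seen(path: str, seen: Set[str]) -> None:
    """Write the IDs to a temp file, then swap it in so a crash mid-write
    cannot leave a half-written file behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(sorted(seen), f)
    os.replace(tmp_path, path)


if __name__ == "__main__":
    seen = load_seen("seen_ids.json")  # hypothetical file name
    seen.add("abc123")
    save_seen("seen_ids.json", seen)
    print(f"{len(seen)} ID(s) stored")
```

In the poller, load_seen would run once at startup and save_seen after each batch of new messages. Since both do blocking file I/O, they would be wrapped in asyncio.to_thread like the API calls.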

Why we used asyncio.to_thread

The Google API client is synchronous; wrapping calls with to_thread keeps your app responsive and lets you concurrently fetch multiple subjects without blocking the main loop.
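
The effect is easy to observe with a stand-in for the blocking client. In this sketch, blocking_fetch is a made-up function that just sleeps, and heartbeat is a hypothetical coroutine that records ticks to show the event loop keeps running while the worker threads are busy.

```python
import asyncio
import time
from typing import List, Tuple


def blocking_fetch(msg_id: str) -> str:
    """Stand-in for a synchronous API call that takes about 0.2 s."""
    time.sleep(0.2)
    return f"subject for {msg_id}"


async def heartbeat(ticks: List[float], stop: asyncio.Event) -> None:
    """Record a timestamp every 20 ms for as long as the loop is responsive."""
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def main() -> Tuple[List[str], float, int]:
    ticks: List[float] = []
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(ticks, stop))

    start = time.monotonic()
    # Three blocking calls run in worker threads at the same time.
    results = await asyncio.gather(
        *(asyncio.to_thread(blocking_fetch, m) for m in ["a", "b", "c"])
    )
    elapsed = time.monotonic() - start

    stop.set()
    await beat
    return results, elapsed, len(ticks)


if __name__ == "__main__":
    results, elapsed, tick_count = asyncio.run(main())
    print(results)
    print(f"elapsed: {elapsed:.2f}s, heartbeat ticks: {tick_count}")
```

Run back to back, the three calls would take about 0.6 s; in worker threads they overlap, and the heartbeat keeps ticking throughout, which is the "responsive" part.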

