Automate Your Research: 1

1 minute read

Published:

A Scientific Sink

Almost everyone I know reach research papers in their computers. Even though most papers are online, if you are like me, you ‘d probably love reading the pdf. It is neat, the figures are right around the text (I don’t like a paper where I have to turn a page to read legends :)). There are many software tools that

Move scientific publications to appropriate folder

Step 1. Write a script: off course

#!/usr/bin/env python3

import os
import re
import shutil
from PyPDF2 import PdfReader

# Setup
downloads = os.path.expanduser("~/Downloads")
target = os.path.join(downloads, "Journals_downloaded")
os.makedirs(target, exist_ok=True)

# DOI patterns
doi_patterns = [
    re.compile(r'\b10\.\d{4,9}/[-._;()/:A-Z0-9]+\b', re.IGNORECASE),
    re.compile(r'\bdoi:\s*10\.\d{4,9}/[-._;()/:A-Z0-9]+\b', re.IGNORECASE)
]

def contains_doi(text):
    return any(p.search(text or '') for p in doi_patterns)

# Process PDFs
for file in os.listdir(downloads):
    if file.lower().endswith(".pdf"):
        path = os.path.join(downloads, file)
        try:
            reader = PdfReader(path)
            texts = [p.extract_text() for p in reader.pages[:2]]
            if any(contains_doi(t) for t in texts):
                print(f"DOI found in: {file}")
                shutil.move(path, os.path.join(target, file))
            else:
                print(f"No DOI in: {file}")
        except Exception as e:
            print(f"Failed to process {file}: {e}")

Step 2. Run the script periodically

Save your script as e.g., move_pdfs_with_doi.py

Make the script executable

chmod +x /path/to/move_pdfs_with_doi.py

Open crontab. Crontab (short for “cron table”) is a time-based job scheduler in Unix-like operating systems, including macOS and Linux. It allows you to automatically run commands or scripts at specified times and intervals — daily, weekly, monthly, or even every minute.

crontab -e

Schedule the script to run every Friday at 3.00PM 0 15 * * 5 /usr/bin/python3 /path/to/move_pdfs_with_doi.py

to be continued…