Read a PDF with Python

Install

pip install pdfplumber

Or let the script install it automatically (see below).

Script

import sys

try:
    import pdfplumber
except ImportError:
    # pdfplumber is missing: install it with the current interpreter's pip, then retry the import.
    import subprocess
    subprocess.check_call([sys.executable, "-m", "pip", "install", "pdfplumber", "-q"])
    import pdfplumber

pdf_path = r"<path/to/your/file.pdf>"
out_path = r"<path/to/output.txt>"

with pdfplumber.open(pdf_path) as pdf:
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(f"Total pages: {len(pdf.pages)}\n")
        for i, page in enumerate(pdf.pages):
            # Page header: a separator line, the 1-based page number, another separator.
            f.write(f"\n{'='*60}\n")
            f.write(f"PAGE {i+1}\n")
            f.write("=" * 60 + "\n")
            # extract_text() gives nothing usable for pages without a text layer.
            text = page.extract_text()
            if text:
                f.write(text + "\n")
            else:
                f.write("[Empty page or image-only]\n")

print(f"Done! Output saved to: {out_path}")

Replace <path/to/your/file.pdf> and <path/to/output.txt> with your actual file paths.
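
If editing the script for every file is inconvenient, the two paths can be taken from the command line instead. The following is a minimal sketch of that idea, not part of the original script: the names build_parser, format_page, extract_text_to_file and main are hypothetical, and pdfplumber is assumed to be installed already.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser with two positional arguments: the input PDF and the output text file."""
    parser = argparse.ArgumentParser(
        description="Extract the text of a PDF into a UTF-8 text file."
    )
    parser.add_argument("pdf_path", help="path to the input PDF")
    parser.add_argument("out_path", help="path to the output .txt file")
    return parser


def format_page(number: int, text) -> str:
    """Return the block written for one page, in the same layout as the script above."""
    separator = "=" * 60
    body = text if text else "[Empty page or image-only]"
    return f"\n{separator}\nPAGE {number}\n{separator}\n{body}\n"


def extract_text_to_file(pdf_path: str, out_path: str) -> int:
    """Write every page's text to out_path and return the number of pages."""
    import pdfplumber  # imported here so the helpers above work without it

    with pdfplumber.open(pdf_path) as pdf, open(out_path, "w", encoding="utf-8") as f:
        f.write(f"Total pages: {len(pdf.pages)}\n")
        for number, page in enumerate(pdf.pages, start=1):
            f.write(format_page(number, page.extract_text()))
        return len(pdf.pages)


def main(argv=None) -> int:
    """Entry point: parse the two paths, run the extraction, report the result."""
    args = build_parser().parse_args(argv)
    count = extract_text_to_file(args.pdf_path, args.out_path)
    print(f"Done! {count} pages saved to: {args.out_path}")
    return 0
```

Calling main(["input.pdf", "output.txt"]) runs the extraction on those two example paths; calling main() with no argument reads the paths from the command line instead.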

Notes

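Pages that contain only images or scans have no text layer, so nothing can be extracted from them; the script writes "[Empty page or image-only]" in their place. Reading such pages requires OCR, which pdfplumber does not perform.
Output is saved as a .txt file with UTF-8 encoding.
pdfplumber tends to be more accurate than PyPDF2 for PDFs that contain tables.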
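
For PDFs where the tables matter more than the running text, pdfplumber pages also have an extract_tables() method, which returns each table as a list of rows (each row a list of cell strings, with None for empty cells). The sketch below shows one way to turn those rows into CSV text. The helper names table_to_csv and extract_tables_as_csv are hypothetical, and the default table detection settings are assumed to suit the document, which will not always be the case.

```python
import csv
import io


def table_to_csv(table) -> str:
    """Render one table (a list of rows of cell strings or None) as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in table:
        # pdfplumber reports empty cells as None; write them as empty fields.
        writer.writerow(["" if cell is None else cell for cell in row])
    return buffer.getvalue()


def extract_tables_as_csv(pdf_path: str):
    """Return a list of (page_number, csv_text) pairs, one per table found."""
    import pdfplumber  # assumed to be installed

    results = []
    with pdfplumber.open(pdf_path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            for table in page.extract_tables():
                results.append((number, table_to_csv(table)))
    return results
```

Each CSV string can then be written to its own file or appended to the text output under the matching page header.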