You have a 40-page research paper due tomorrow, a textbook chapter you keep meaning to read, and a 25-minute commute where your eyes are otherwise useless. The obvious move is to let the document read itself to you. The not-so-obvious part is that "convert this PDF to audio" hides a dozen small problems — scanned pages that contain no real text, two-column academic layouts that scramble the reading order, equations that turn into gibberish, and 200-line reference lists that no one wants narrated aloud.
This guide walks through how to actually turn PDFs, papers, textbooks and EPUBs into audio that's pleasant to listen to, what trips most tools up, and how to get clean playback whether you're at your desk or on a train.
First, figure out what kind of PDF you have
Not all PDFs are equal. Whether a file holds real text or only images of pages largely decides whether audio will work.
A text PDF is one where the words are real, selectable characters. If you can open the file, click into a paragraph, and drag-select a sentence, it's a text PDF. Almost anything exported from Word, LaTeX, Google Docs, or a journal's website is text-based. These are the easy case: a reader can pull the text out cleanly and start speaking it in seconds.
A scanned PDF is really just a stack of images — a photo of each page. It looks like text to your eyes, but there are no characters underneath. The giveaway: you try to select a sentence and the whole page highlights as one block, or nothing selects at all. Old books, library scans, lecture handouts photographed on someone's phone, and many "free PDF" downloads fall into this bucket.
To get audio from a scanned PDF, the text has to be recovered first through OCR (optical character recognition). Good tools run OCR automatically; if yours doesn't, you'll get silence or garbage. A quick test before you commit: open the file, press Ctrl/Cmd+F, and search for a word you can see on the page. If search finds it, you have text. If it finds nothing, you have images and need OCR.
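The search test can be approximated in code. The sketch below assumes the text of each page has already been extracted by whatever PDF library you use; the function name `classify_pdf` and the 50-character threshold are illustrative choices, not part of any particular tool.

```python
def classify_pdf(page_texts, min_chars_per_page=50):
    """Guess whether a PDF is text-based, scanned, or a mix of both.

    page_texts: one string per page, as produced by some PDF text
    extractor (an assumption of this sketch).
    min_chars_per_page: illustrative threshold, not a standard value.
    """
    if not page_texts:
        return "empty"
    # Count pages that yielded a meaningful amount of non-whitespace text.
    text_pages = sum(
        1 for text in page_texts
        if len("".join(text.split())) >= min_chars_per_page
    )
    if text_pages == len(page_texts):
        return "text"
    if text_pages == 0:
        return "scanned"
    return "mixed"
```

A "mixed" result is worth watching for: some files carry a real text layer on most pages but have a few scanned inserts that still need OCR.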
How to turn a text PDF into audio (the easy path)
For a normal text-based PDF, the workflow is short:
- Open the document in a text-to-speech reader. With CastReader you can drop the PDF straight into the app, or use the PDF to audiobook flow on the web.
- Let it extract the text and pick a voice. Natural neural voices are worth the extra second of loading — robotic voices make long documents exhausting.
- Hit play. Adjust speed to taste — 1.0x to 1.25x is comfortable for dense material, 1.5x and up once you're warmed up to a topic.
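To see what the speed setting buys you, here is a rough estimate of listening time. The baseline of 150 words per minute at 1.0x and the figure of 500 words per page are assumptions for illustration; real voices and real documents vary.

```python
def listening_minutes(word_count, speed=1.0, base_wpm=150):
    """Estimate listening time in minutes.

    base_wpm is an assumed narration rate at 1.0x, not a measured value.
    """
    if speed <= 0 or base_wpm <= 0:
        raise ValueError("speed and base_wpm must be positive")
    return word_count / (base_wpm * speed)


# A 40-page paper at an assumed 500 words per page is about 20,000 words.
for speed in (1.0, 1.25, 1.5):
    print(f"{speed}x: {listening_minutes(20_000, speed):.1f} min")
# 1.0x: 133.3 min
# 1.25x: 106.7 min
# 1.5x: 88.9 min
```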
Two habits make this dramatically better. First, skip the front matter. Title pages, copyright notices, and tables of contents are noise when listened to; scrub past them to the actual first paragraph. Second, watch the reading order on multi-column layouts. Two-column PDFs (most journals, many textbooks) can confuse extraction so it reads across both columns instead of down one. A good reader handles this, but if a sentence suddenly stops making sense, that's usually why — jump to the next paragraph and keep going.
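The column problem is easier to picture with a small sketch. It assumes the extractor gives you text blocks with page coordinates, and that the page has exactly two columns split at the midline; `Block` and `reading_order` are hypothetical names, and real pages with full-width titles or figures need more care than this.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    x: float  # left edge of the block
    y: float  # top edge, increasing down the page


def reading_order(blocks, page_width):
    """Order blocks left column first, then right column, top to bottom.

    Assumes exactly two columns split at the page midline.
    """
    midline = page_width / 2
    left = sorted((b for b in blocks if b.x < midline), key=lambda b: b.y)
    right = sorted((b for b in blocks if b.x >= midline), key=lambda b: b.y)
    return [b.text for b in left + right]


blocks = [
    Block("left 1", 50, 100), Block("right 1", 320, 100),
    Block("left 2", 50, 200), Block("right 2", 320, 200),
]

# Naive top-to-bottom sorting reads straight across both columns.
naive = [b.text for b in sorted(blocks, key=lambda b: (b.y, b.x))]
print(naive)                       # ['left 1', 'right 1', 'left 2', 'right 2']
print(reading_order(blocks, 612))  # ['left 1', 'left 2', 'right 1', 'right 2']
```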
If you're choosing a tool and weighing it against the usual paid options, it's worth knowing what each one charges per month before you commit. Our Speechify alternative and NaturalReader alternative breakdowns lay out the pricing and limits side by side.
Scanned PDFs and textbooks: getting clean OCR
Scanned material is where most people give up, but it's very doable if you set realistic expectations.
The quality of the original scan is everything. A crisp 300-DPI scan OCRs almost perfectly. A crooked phone photo with shadows, coffee rings, and highlighter marks will produce errors — the OCR will read "rn" as "m," mangle accented names, and choke on handwritten margin notes. If you control the scan, scan straight, in good light, at high resolution.
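Some of these errors can be patched after the fact. As a minimal sketch of the "rn" read as "m" case, the function below tries restoring "rn" wherever an unknown word contains an "m". The `vocabulary` set is an assumption (any word list you trust), and the function name is hypothetical.

```python
def fix_rn_confusion(word, vocabulary):
    """Try to repair a word where OCR may have read "rn" as "m".

    vocabulary: a set of known lowercase words (an assumption of this sketch).
    Returns the word unchanged when it is already known or no repair fits.
    """
    if word.lower() in vocabulary:
        return word
    # Try restoring "rn" at each position where an "m" appears.
    for i, ch in enumerate(word):
        if ch == "m":
            candidate = word[:i] + "rn" + word[i + 1:]
            if candidate.lower() in vocabulary:
                return candidate
    return word


vocabulary = {"learn", "the", "to"}
print(fix_rn_confusion("leam", vocabulary))  # learn
```

The limitation is that this only catches errors that produce a non-word. If the misread happens to be a real word (for example "modem" where the page said "modern"), a dictionary check will not notice it.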
A few realities to plan around with scanned textbooks specifically:
- Headers, footers and page numbers get read aloud. "Chapter 4. Thermodynamics. 87." sprinkled through your audio is normal for raw OCR. It's mildly annoying but not a dealbreaker; your brain learns to ignore it.
- Figures, tables and captions interrupt the flow. A table of numbers narrated linearly is meaningless. When you hit one, skip ahead and come back to it with your eyes; tables and charts are among the things audio genuinely can't replace.
- Footnotes land mid-sentence. Academic scans often interleave footnotes with body text, so a citation can interrupt a sentence. Again, skip past and rejoin the main thread.
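If you have the raw OCR text page by page, the running headers and page numbers from the first point can often be filtered out before listening. The sketch below assumes one string per page and drops any line that repeats on most pages, plus lines that are only digits; `strip_running_lines` and the 60% threshold are illustrative, not how any specific reader works.

```python
from collections import Counter


def strip_running_lines(pages, min_share=0.6):
    """Remove lines that repeat across pages and bare page numbers.

    pages: list of strings, one per OCR'd page (an assumption of this sketch).
    min_share: illustrative share of pages a line must appear on to count
    as a running header or footer.
    """
    page_lines = [
        [ln.strip() for ln in page.splitlines() if ln.strip()]
        for page in pages
    ]
    # Count on how many pages each distinct line appears.
    seen_on = Counter()
    for lines in page_lines:
        seen_on.update(set(lines))
    threshold = max(2, min_share * len(pages))
    cleaned = []
    for lines in page_lines:
        kept = [
            ln for ln in lines
            if seen_on[ln] < threshold and not ln.isdigit()
        ]
        cleaned.append("\n".join(kept))
    return cleaned
```

This only matches headers that are identical from page to page. A header with the page number fused into the same line changes on every page and would slip through, so treat this as a first pass.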
The honest summary: scanned textbooks are great for reading the prose — the explanatory paragraphs that make up most of a chapter — and poor for anything visual or tabular. Listen for the argument, and keep the book open for the diagrams.
Research papers and arXiv: the special cases
Academic papers are their own genre, and a few things about them matter for audio.
Equations don't narrate well. A line of dense math becomes a stream of "x subscript i equals sum..." that's nearly impossible to follow by ear. The practical approach is to listen to the prose around the equations — the intuition, the setup, the interpretation — and pause to actually look at the math when the narration reaches it. Listening to the introduction, related work, and discussion sections is where audio shines; the proofs are where you'll want your eyes.
References are a wall of noise. A paper's last several pages are often nothing but citations. There's no value in hearing "Smith, J., Doe, A., 2021, Proceedings of..." forty times. Stop playback when you reach the bibliography.
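If you are preparing the text yourself, the same cut can be made before playback. This is a minimal sketch that assumes the bibliography starts at a line containing only a heading such as "References" or "Bibliography", optionally numbered; the heading list and the function name are assumptions, and papers with unusual headings will need their own pattern.

```python
import re

# Matches a line that contains only a references-style heading,
# optionally preceded by a section number such as "7" or "7.".
_REFERENCES_HEADING = re.compile(
    r"^[ \t]*(?:\d+\.?[ \t]+)?(?:references|bibliography|works cited)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def drop_reference_list(text):
    """Return the text up to the last references-style heading, if any."""
    matches = list(_REFERENCES_HEADING.finditer(text))
    if not matches:
        return text
    # Use the last match so a "References" entry in a table of
    # contents does not truncate the whole paper.
    return text[: matches[-1].start()].rstrip()
```

Note that anything after the heading is dropped too, including appendices that follow the bibliography.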
Two-column PDFs are the norm, so the column-order caveat from earlier applies doubly to papers. If you read arXiv specifically, you usually have a choice: the PDF, or the increasingly common HTML version of the paper. The HTML version is a single clean column and tends to convert to audio far more reliably than the two-column PDF, so prefer it when it exists. Many people also read papers inside an AI assistant; if that's your workflow, you can listen to the summaries and explanations directly with Listen to Claude instead of wrestling with the source PDF at all.
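For the arXiv case, a small helper can turn a PDF or abstract link into a candidate HTML link. This sketch assumes arXiv's `/html/<id>` URL pattern and only handles new-style identifiers such as `2401.01234`; the function name is hypothetical, and since not every paper has an HTML rendering, the resulting URL may not exist and should be checked before relying on it.

```python
import re

# New-style arXiv identifiers, with an optional version suffix like "v2".
_ARXIV_ID = re.compile(
    r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5}(?:v\d+)?)",
    re.IGNORECASE,
)


def arxiv_html_candidate(url):
    """Return a candidate HTML URL for an arXiv abs/pdf link, or None.

    The /html/ path is an assumption about arXiv's URL scheme; the page
    is only available for papers that have an HTML rendering.
    """
    match = _ARXIV_ID.search(url)
    if match is None:
        return None
    return f"https://arxiv.org/html/{match.group(1)}"
```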
A workflow a lot of researchers settle into: skim the abstract and figures with your eyes first, then put the prose sections on audio during a walk or commute. You get the high-level argument hands-free, and you've already seen the parts that need visual attention.
EPUBs and ebooks: the format built for this
If you have a choice of format, EPUB is almost always the better source for audio. A PDF fixes the layout of each page: fonts, columns, and margins are baked in. An EPUB is reflowable text, like a web page, with a clean chapter structure and no two-column traps or page-number litter. That makes it the ideal source for narration.
To listen to an ebook, load the EPUB into a reader and play it chapter by chapter. The EPUB to audio reader flow is built exactly for this and keeps the chapter breaks intact, so you can jump around naturally. Many non-fiction books and free public-domain titles (Project Gutenberg, for instance) are distributed as EPUB precisely because it's the friendly format.
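The chapter structure is easy to reach because an EPUB is a ZIP archive with a small index inside. The sketch below, using only the Python standard library, reads `META-INF/container.xml` to find the package file and then lists the content documents in spine (reading) order. It assumes a well-formed, DRM-free EPUB with plain relative paths; `spine_documents` is a hypothetical name, and this is not a description of how any particular reader does it.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

_NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def spine_documents(epub_file):
    """Return the paths of an EPUB's content documents in reading order.

    epub_file: a path or a binary file-like object.
    """
    with zipfile.ZipFile(epub_file) as archive:
        # META-INF/container.xml points at the package (OPF) document.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", _NS).attrib["full-path"]
        package = ET.fromstring(archive.read(opf_path))

    # The manifest maps item ids to file paths relative to the OPF file.
    manifest = {
        item.attrib["id"]: item.attrib["href"]
        for item in package.findall("opf:manifest/opf:item", _NS)
    }
    base = posixpath.dirname(opf_path)
    # The spine lists item ids in the intended reading order.
    return [
        posixpath.join(base, manifest[ref.attrib["idref"]])
        for ref in package.findall("opf:spine/opf:itemref", _NS)
    ]
```

Each returned path is an XHTML file inside the archive, usually one per chapter or section, which is why chapter-by-chapter playback comes almost for free with this format.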
A note on Kindle: Amazon's books are EPUB-like under the hood but locked to Amazon's apps. You can still listen to them — see Listen to Kindle for the realistic options and their limits. If you're buying or downloading something new and audio matters to you, choosing the plain EPUB will save you a lot of friction.
Listen anywhere: desk, phone, commute
The point of all this is to not be tethered to a screen, so where you listen matters.
CastReader works as a Chrome and Edge extension for reading at your desk, a Mac app, and iOS and Android apps for everywhere else. A common pattern: queue a paper or chapter on your laptop, then continue on your phone during the commute. The extension is in the Chrome Web Store, and the mobile apps are on the App Store and Google Play. For general web reading beyond documents — articles, docs, anything with text — the same engine powers free text-to-speech across the browser.
CastReader is free to use, so the main decision is simply which surface fits your day. CastReader Pro adds premium ultra-realistic voices, more listening hours, and AI document analysis if you want them.
Quick FAQ
Can I convert a scanned PDF to audio?
Yes, but it requires OCR to recover the text from the page images first. A clean, high-resolution scan works well; a blurry or crooked phone photo will introduce errors. Test by searching for a visible word — if search finds nothing, the file is images and needs OCR before any reader can speak it.
Why does my PDF read in the wrong order?
Almost always a two-column layout. Text extraction sometimes reads straight across both columns instead of down one, so sentences get interleaved. It's most common in academic papers and textbooks. If a passage stops making sense, skip to the next clean paragraph — or use a single-column source (like an HTML version of the paper) when one exists.
What's the best format for listening — PDF or EPUB?
EPUB, by a wide margin. It's reflowable text with real chapter structure, so there are no column traps, no baked-in page numbers, and clean navigation. A PDF fixes the layout of each page and is inherently messier to narrate. If you can choose, choose EPUB.
How should I listen to research papers with equations?
Put the prose on audio — abstract, introduction, related work, discussion — and look at the equations and figures with your eyes when narration reaches them. Stop before the reference list; it's just a long string of citations with no listening value.
Is CastReader free?
Yes. CastReader is a free text-to-speech reader across the Chrome/Edge extension, Mac app, and iOS and Android apps — any text read aloud in a natural voice, no signup. There's also an optional CastReader Pro plan that adds premium ultra-realistic voices, more listening hours, and AI document analysis.
The short version: check whether your PDF has real text or is just images, prefer EPUB and single-column sources when you can, skip the equations and reference lists, and let the prose come to you through your headphones. Once it's set up, a stack of papers turns into a podcast you can clear on your next walk.
Questions or a document that won't cooperate? Email us at support@castreader.ai — we read every message.