Text to audio file on Windows

Post by Larry Wolff
What freeware do you use to convert a relatively large (10-page) PDF text
file to an audio sound file (MP3 is fine, as is WAV) for free on Windows?

If you first convert or extract the PDF text to a TXT file
then you can simply drop that onto the following VBScript.
Go into Control Panel\All Control Panel Items\Speech Recognition
and click "Text To Speech" to choose the voice and speaking
rate that you prefer. Then save the following text in Notepad
as a .VBS file, like "record.vbs". In the line that begins SpFile.open,
edit the path to whatever you like. Then simply drop the TXT file
onto the script.

I tested this script with a file of about 2 pages. It finished
almost instantly. (Watch out for wordwrap below.)

' begin VBS file code:

Dim Voice, SpFile
Dim FSO, s1, Arg, TS

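' Quit with a hint if nothing was dropped onto the script.
' (Reading WScript.Arguments(0) with no arguments raises an error.)
If WScript.Arguments.Count = 0 Then
MsgBox "Drop a text file onto the script to be converted to sound file."
WScript.quit
End If

Arg = WScript.Arguments(0)
Set FSO = CreateObject("Scripting.FileSystemObject")
If FSO.fileexists(Arg) = False Then
MsgBox "Drop a text file onto the script to be converted to sound file."
WScript.quit
End If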

Set TS = FSO.OpenTextFile(Arg, 1)
s1 = TS.ReadAll
TS.Close
Set TS = Nothing
Set FSO = Nothing

Set Voice = CreateObject("Sapi.spVoice")
Set SpFile = CreateObject("Sapi.spFileStream")

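' Edit this path to suit your system. 3 = create the file for writing.
SpFile.open "C:\Windows\Desktop\Recording.wav", 3, False
' Send the speech to the file stream instead of the speakers.
Set Voice.AudioOutputStream = SpFile
Voice.Speak s1, 64
SpFile.close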

Set SpFile = Nothing
Set Voice = Nothing
MsgBox "Done."
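
As a possible variation on the Control Panel step described above, the voice and speaking rate can also be chosen inside a script. This is only a sketch: it assumes at least one SAPI voice is installed, and the choice of voice index 0 and a rate of 2 is arbitrary, for illustration only.

```vbscript
' Sketch: list the installed SAPI voices, then pick one and set the rate.
Dim Voice, Tokens, i, list

Set Voice = CreateObject("Sapi.SpVoice")
Set Tokens = Voice.GetVoices

' Build a numbered list of voice descriptions.
list = ""
For i = 0 To Tokens.Count - 1
  list = list & i & ": " & Tokens.Item(i).GetDescription & vbCrLf
Next
MsgBox list, 0, "Installed SAPI voices"

' Assumption for illustration: use the first voice in the list.
If Tokens.Count > 0 Then
  Set Voice.Voice = Tokens.Item(0)
End If

' Rate runs from -10 (slowest) to 10 (fastest); 0 is normal speed.
Voice.Rate = 2

Voice.Speak "This is a test of the selected voice.", 0
Set Voice = Nothing
```

The same two settings (Set Voice.Voice and Voice.Rate) could be placed before the Voice.Speak line of the recording script if the Control Panel defaults are not wanted.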

david

2024-07-24 17:58:30 UTC

Post by Larry Wolff
What freeware do you use to convert a relatively large (10-page) PDF text
file to an audio sound file (MP3 is fine, as is WAV) for free on Windows?

You can likely find online programs but that would violate your privacy.

I don't think it's possible to create an MP3 or WAV offline from PDF text.
If it was, there would be programs that you could download that do it.

Newyana2

2024-07-24 19:13:59 UTC

Post by Larry Wolff
What freeware do you use to convert a relatively large (10-page) PDF text
file to an audio sound file (MP3 is fine, as is WAV) for free on Windows?

You can likely find online programs but that would violate your privacy.
I don't think it's possible to create an MP3 or WAV offline from PDF text.
If it was, there would be programs that you could download that do it.

Even extracting the text can be tricky. Some PDFs are actually
just storing images, so OCR is necessary first. Even when text
is stored, it's not stored as character encoding but rather as
vector images. Which is why even the best PDF text extractors
will do things like converting u to ii or converting d to cl.

Converting text to WAV once extracted or selected and copied,
however, is very simple with the Windows SAPI5 (speech API) libraries.
SAPI5 has been available since XP.

Paul

2024-07-24 21:33:39 UTC

Post by Larry Wolff
What freeware do you use to convert a relatively large (10-page) PDF text
file to an audio sound file (MP3 is fine, as is WAV) for free on Windows?

You can likely find online programs but that would violate your privacy.
I don't think it's possible to create an MP3 or WAV offline from PDF text.
If it was, there would be programs that you could download that do it.

Even extracting the text can be tricky. Some PDFs are actually
just storing images, so OCR is necessary first. Even when text
is stored, it's not stored as character encoding but rather as
vector images. Which is why even the best PDF text extractors
will do things like converting u to ii or converting d to cl.
Converting text to WAV once extracted or selected and copied,
however, is very simple with the Windows SAPI5 (speech API) libraries.
SAPI5 has been available since XP.

I think that after the PDF to TXT step, you need to stop
and edit the TXT file before the TXT to WAV step.

Too many things could have gone wrong at that point,
and 200 pages of TXT is bound to be unusable as such.
It's going to need edits and removals. At a minimum.

Paul

Newyana2

2024-07-24 21:47:08 UTC

Post by Paul

Post by Larry Wolff
What freeware do you use to convert a relatively large (10-page) PDF text
file to an audio sound file (MP3 is fine, as is WAV) for free on Windows?

You can likely find online programs but that would violate your privacy.
I don't think it's possible to create an MP3 or WAV offline from PDF text.
If it was, there would be programs that you could download that do it.

Even extracting the text can be tricky. Some PDFs are actually
just storing images, so OCR is necessary first. Even when text
is stored, it's not stored as character encoding but rather as
vector images. Which is why even the best PDF text extractors
will do things like converting u to ii or converting d to cl.
Converting text to WAV once extracted or selected and copied,
however, is very simple with the Windows SAPI5 (speech API) libraries.
SAPI5 has been available since XP.

I think that after the PDF to TXT step, you need to stop
and edit the TXT file before the TXT to WAV step.
Too many things could have gone wrong at that point,
and 200 pages of TXT is bound to be unusable as such.
It's going to need edits and removals. At a minimum.

Yes, probably right. Though it seems to vary. I find that if the
PDF has text and I get it with select-all -> copy it seems to work
better than using extractor tools. I used FreeOCR the other day to
convert two versions of talks on the Zen Oxherding pictures. One
was just photos of pages. It worked almost perfectly except with
foreign words. I think there must have been dozens of versions
of "sutra", for example. So maybe there's spellcheck incorporated?

Newyana2

2024-07-24 21:58:17 UTC

That just gave me an idea. I tried FreeOCR on a PDF that seems to
be text that isn't images. It works beautifully.

Joe Beanfish

2024-07-25 14:29:33 UTC

Post by Larry Wolff
What freeware do you use to convert a relatively large (10-page) PDF text
file to an audio sound file (MP3 is fine, as is WAV) for free on Windows?

You can likely find online programs but that would violate your privacy.
I don't think it's possible to create an MP3 or WAV offline from PDF text.
If it was, there would be programs that you could download that do it.

Incorrect. Text is stored as characters (glyphs), not vector images. Though
sometimes it can be laid out in interesting orders. IDK about Windows, but
pdftotext and similar on Linux will extract the text. With that said,
the text may have defects as you describe if it was the result of OCR.
But if it is a PDF generated from a word processor or such, the text will
be perfect, just as it was in the word processor.

Larry Wolff

2024-07-25 20:59:30 UTC

Post by Joe Beanfish

Post by Larry Wolff
What freeware do you use to convert a relatively large (10-page) PDF text
file to an audio sound file (MP3 is fine, as is WAV) for free on Windows?

You can likely find online programs but that would violate your privacy.
I don't think it's possible to create an MP3 or WAV offline from PDF text.
If it was, there would be programs that you could download that do it.

The PDFs I have come across perfectly as text so that's not the problem.
The text can be spoken in English on Windows also, so that's not a problem.

The problem is converting those spoken words to an audio file.

I searched far and wide, and while I found lots of online clickbait, I
can't yet find a freeware Windows 10 program to convert it to audio.

That's why I had asked for help.
But nobody seems to know the answer any better than I do.

I guess that means freeware Windows 10 offline text to audio file
conversion tools probably do not exist since nobody can find any.

Newyana2

2024-07-25 21:40:46 UTC

Post by Larry Wolff
The problem is converting those spoken words to an audio file.
I searched far and wide, and while I found lots of online clickbait, I
can't yet find a freeware Windows 10 program to convert it to audio.
That's why I had asked for help.
But nobody seems to know the answer any better than I do.
I guess that means freeware Windows 10 offline text to audio file
conversion tools probably do not exist since nobody can find any.

I gave you the answer. Windows can do it through the
speech API, using the very simple script that I posted. All
you need to do is give it a TXT file by copying the text out
of the PDF.

Larry Wolff

2024-07-26 02:41:18 UTC

I gave you the answer. Windows can do it through the
speech API, using the very simple script that I posted. All
you need to do is give it a TXT file by copying the text out
of the PDF.

Thanks for that VBS script as it probably would work for nerdy people.
But I'm trying to help non-nerds who won't be running visual basic.

What I want is what everyone wants which is a program that anyone can
download and just run with an input file and an output file without having
to install Microsoft Visual Basic, Pascal, Docker, or other crutches.

What is the setup required to add visual basic to a PC to run your script?

Paul

2024-07-26 05:12:28 UTC

Post by Larry Wolff

I gave you the answer. Windows can do it through the
speech API, using the very simple script that I posted. All
you need to do is give it a TXT file by copying the text out
of the PDF.

Thanks for that VBS script as it probably would work for nerdy people.
But I'm trying to help non-nerds who won't be running visual basic.
What I want is what everyone wants which is a program that anyone can
download and just run with an input file and an output file without having
to install Microsoft Visual Basic, Pascal, Docker, or other crutches.
What is the setup required to add visual basic to a PC to run your script?

Standard protocol is to show interest by copying the offered script
into a text file and running it: following the instructions,
dragging and dropping something, or whatever. Then, if you're having
trouble with your bog-standard PC, post back and ask for a little more help.

My Windows 11 daily driver is Win11Home and is bog-standard (no special features,
Visual Studio is on another OS as H: ).

MsgBox "Drop a text file onto the script to be converted to sound file"

If you just reject it out of hand, what are the odds he'll write one for
the next individual who comes along?

A measure of OS support, is watching what happens to the icon, when
I change the name of the script to

newscript.vbs

I changed the output filename a bit, in the following. C:\users\username\Desktop\Recording.wav
The script worked fine, and it was fast.

************************************** Content of newscript.vbs ********************************
' begin VBS file code:

Dim Voice, SpFile
Dim FSO, s1, Arg, TS

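' Quit with a hint if nothing was dropped onto the script.
' (Reading WScript.Arguments(0) with no arguments raises an error.)
If WScript.Arguments.Count = 0 Then
MsgBox "Drop a text file onto the script to be converted to sound file."
WScript.quit
End If

Arg = WScript.Arguments(0)
Set FSO = CreateObject("Scripting.FileSystemObject")
If FSO.fileexists(Arg) = False Then
MsgBox "Drop a text file onto the script to be converted to sound file."
WScript.quit
End If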

Set TS = FSO.OpenTextFile(Arg, 1)
s1 = TS.ReadAll
TS.Close
Set TS = Nothing
Set FSO = Nothing

Set Voice = CreateObject("Sapi.spVoice")
Set SpFile = CreateObject("Sapi.spFileStream")

Set oShell = CreateObject("WScript.Shell")
strHomeFolder = oShell.ExpandEnvironmentStrings("%USERPROFILE%")
strOutput = strHomeFolder & "\Desktop\Recording.wav"

' SpFile.open "C:\Windows\Desktop\Recording.wav", 3, False
SpFile.open strOutput, 3, False

Set Voice.AudioOutputStream = SpFile
Voice.Speak s1, 64
SpFile.close

Set SpFile = Nothing
Set Voice = Nothing
MsgBox "Done."
************************************** Content of newscript.vbs ********************************

This is the result. Recording.wav, uploaded to sndup.net .

Scroll down a bit, to see the player-box with the triangle-start button on the left of it.

http://sndup.net/d3g9z/
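
Another way to handle the output path, as a sketch only: write the WAV next to the dropped text file, with the same base name, so that no path in the script needs editing. The variable name OutPath is made up for this example. It also passes 0 (the default) as the Speak flags; my understanding is that 64 in the SAPI flags enumeration asks the engine to speak punctuation aloud, so treat that change as an assumption to verify by ear.

```vbscript
' Sketch: "report.txt" dropped onto the script becomes "report.wav"
' in the same folder.
Dim FSO, Arg, TS, s1, OutPath, Voice, SpFile

If WScript.Arguments.Count = 0 Then
  MsgBox "Drop a text file onto the script."
  WScript.Quit
End If

Arg = WScript.Arguments(0)
Set FSO = CreateObject("Scripting.FileSystemObject")
If Not FSO.FileExists(Arg) Then
  MsgBox "File not found: " & Arg
  WScript.Quit
End If

' Same folder and base name as the input, with a .wav extension.
OutPath = FSO.BuildPath(FSO.GetParentFolderName(Arg), _
                        FSO.GetBaseName(Arg) & ".wav")

' Read the whole text file (1 = open for reading).
Set TS = FSO.OpenTextFile(Arg, 1)
s1 = TS.ReadAll
TS.Close

Set Voice = CreateObject("Sapi.SpVoice")
Set SpFile = CreateObject("Sapi.SpFileStream")

' 3 = create the file for writing.
SpFile.Open OutPath, 3, False
Set Voice.AudioOutputStream = SpFile
Voice.Speak s1, 0
SpFile.Close

MsgBox "Saved " & OutPath
```

An existing WAV with the same name is overwritten, and an empty text file will make ReadAll fail, so this is a starting point rather than a finished tool.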

Paul

Newyana2

2024-07-26 12:23:24 UTC

Post by Larry Wolff

I gave you the answer. Windows can do it through the
speech API, using the very simple script that I posted. All
you need to do is give it a TXT file by copying the text out
of the PDF.

Thanks for that VBS script as it probably would work for nerdy people.
But I'm trying to help non-nerds who won't be running visual basic.
What I want is what everyone wants which is a program that anyone can
download and just run with an input file and an output file without having
to install Microsoft Visual Basic, Pascal, Docker, or other crutches.
What is the setup required to add visual basic to a PC to run your script?

No setup. Windows Script Host is on all Windows
systems since Win98. You just copy the script into
Notepad, save as a .vbs file, then drop a text file
onto it. Change the line that provides the path if
you like. Maybe C:\recording.wav would be good
rather than C:\Windows\Desktop. (I use that path
for my Desktop but most people don't.)

In some cases a VBS file might run into permissions
problems, but this one should be fine. It runs fine for me.

Aside from that, no one need look at the code.
If you're wary of running script you can look up the
"objects" used. The script just opens the dropped file
and feeds it to the speech API. The speech API, in
turn, has a built-in method to convert plain text to
a WAV file instead of speaking it.

If you want to test out a simpler script, you can try
saving these 3 lines to a .vbs file and then double-click it.
(Make sure you have working speakers.) -

Set Voice = CreateObject("Sapi.spVoice")
Voice.Speak "It's as easy as this to use SAPI.", 0
Set Voice = Nothing

This functionality is built-in, as I said. Microsoft provides
choice of voices, speed, etc in Control Panel, but they
haven't provided their own software to use it. Instead it's
mainly used by things like screenreaders for the blind.

Stan Brown

2024-07-26 05:00:33 UTC

Post by Larry Wolff
The PDFs I have come across perfectly as text so that's not the problem.
The text can be spoken in English on Windows also, so that's not a problem.
The problem is converting those spoken words to an audio file.
I searched far and wide, and while I found lots of online clickbait, I
can't yet find a freeware Windows 10 program to convert it to audio.

That's the sort of thing Audacity can do, if I'm not mistaken.

<https://www.audacityteam.org/>

--
Stan Brown, Tehachapi, California, USA https://BrownMath.com/
Shikata ga nai...

Paul

2024-07-26 15:38:22 UTC

Post by Stan Brown

That's the sort of thing Audacity can do, if I'm not mistaken.
<https://www.audacityteam.org/>

Audacity is mainly an audio editor. It takes an audio file
as input and produces an audio file as output. For example,
you can "normalize" or "compand" audio, interact with the
amplitude. Or the frequency response.

Historically, Audacity didn't even have FFMPEG in it. It
did not have multimedia conversion capability, like open
an MP4 video and extract an M4A audio stream. The input
formats originally were quite limited. The LAME MP3 module,
was their "feature" (not bundled, had to be acquired in a
brown paper bag). Most of the other formats would be
closer to PCM (pulse code modulation). MP3 is now "legal",
so the brown paper bag is optional.

I don't think working with SAPI would be a high runner.
Mainly because SAPI is Microsoft, and Audacity is FOSS
and SAPI-equivalent on Linux (assuming there is one),
would mean writing the software twice. And you know
how FOSS only accepts cross-platform standards, so they
"only have to write the code once". They hate having to support
custom standards on each platform.

Which is why LibreOffice, Firefox, and the like, might
be using OpenGL for their graphics, rather than
Direct3D. As it is, the Firefox graphics person was
griping about having to support X11 and Wayland at the
same time (during the Wayland change-over interval,
which could be a while). X11 runs GLXGears at 20000 FPS,
while Wayland manages 12000 FPS (slower, gee, thanks).

If you wrote Firefox browser code, and you supported
OpenGL on Linux and DirectX on Windows, the graphics
person would have extra work to do.

On VLC, it means VLC can directly use a TV tuner to
play a TV channel (in Linux). The code does not exist on Windows,
as far as I know you'd need to use Media Center software
interface (on the assumption the TV tuner card is
Media Center compatible and comes with a Media Center
driver). I own two tuner cards, one ancient, and the
ancient one is not Media Center, and the newer one is
Media Center. That means, if VLC ever gets that working,
by writing Windows-specific [Media Center] code in VLC
for it, only the newer card would work.

*******

https://en.wikipedia.org/wiki/Audacity_%28audio_editor%29

"Importing, exporting and conversions

Audacity natively imports and exports WAV, AIFF, MP3, Ogg Vorbis,
and all file formats supported by libsndfile library. Due to
patent licensing concerns, the FFmpeg library necessary to import
and export proprietary formats such as M4A (AAC) and WMA is
not bundled with Audacity but has to be downloaded separately.[23]

In conjunction with batch processing features, Audacity can be used
to convert files from one format to another, or to digitize records,
tapes or MiniDiscs.
"

"Customizability and extensibility

Audacity supports ... audio effect plugins ...

In January 2024, Intel introduced some AI-powered capabilities for
Audacity as part of its OpenVINO plugin suite.[29][30]
"

That last item is more likely to be a lever to expand Audacity, than
perhaps waiting for them to be adventurous enough to include a
"small" version of FFMPEG.

If they were to include calls to SAPI,
they'd probably have heart failure :-) Their lawyer would tell them
to stop :-) By not being adventurous, they can "just edit audio files".
And stay out of court. It's legal chill that makes people less adventurous.

Paul

Stan Brown

2024-07-26 23:39:15 UTC

Post by Paul

[quoted text muted]
That's the sort of thing Audacity can do, if I'm not mistaken.
<https://www.audacityteam.org/>

Your knowledge is greater than mine (I mean that honestly, not as
sarcasm), but there's that red "record" button on Audacity's console.
I used it to create short snippets in my voice that I then save as
AAC files (.m4A extension). I would think that the user changing
settings in Volume Mixer, would let Audacity record sounds generated
by other software on the computer.

Incidentally, the problem I was trying to solve was to create "back
announcements" so that I wouldn't have to get up while listening to
something in my randomized playlist and go over to the iPod to see
what was playing. My files played fine in iTunes on my PC, but the
iPod skipped most of my test announcements and also skipped all or
most files that were immediately before or after one of my
announcements.

--
Stan Brown, Tehachapi, California, USA https://BrownMath.com/
Shikata ga nai...

Paul

2024-07-27 00:08:46 UTC

Post by Stan Brown

Post by Paul

[quoted text muted]
That's the sort of thing Audacity can do, if I'm not mistaken.
<https://www.audacityteam.org/>

Your knowledge is greater than mine (I mean that honestly, not as
sarcasm), but there's that red "record" button on Audacity's console.
I used it to create short snippets in my voice that I then save as
AAC files (.m4A extension). I would think that the user changing
settings in Volume Mixer, would let Audacity record sounds generated
by other software on the computer.
Incidentally, the problem I was trying to solve was to create "back
announcements" so that I wouldn't have to get up while listening to
something in my randomized playlist and go over to the iPod to see
what was playing. My files played fine in iTunes on my PC, but the
iPod skipped most of my test announcements and also skipped all or
most files that were immediately before or after one of my
announcements.

Do you mean "Line In" or "Microphone In" plus <Record> ?

Yes, you can record from microphone.

But that's not SAPI, or Text To Speech Synthesis.

I do not think Audacity accepts a .txt file and outputs
a SAPI-generated .wav, like Newyana2's script does.
The company that owns Audacity now, I don't think their intent
is to turn it into a "VLC-like" product. There was concern
at first, in the news that a private company had acquired it.

But if you load up Notepad, use the TTS capability it has,
configure "What You Hear" (Stereo Mix) in Windows sound,
go into Audacity and select "What You Hear" as its input, punch Record,
what is coming out of the speaker will be recorded.

[Picture] Yes, this still works in Windows 11, where this picture is taken...
It should look like this in Windows 10 too.


Paul

Newyana2

2024-07-26 13:24:23 UTC

Post by Joe Beanfish

Post by Newyana2
Even extracting the text can be tricky. Some PDFs are actually
just storing images, so OCR is necessary first. Even when text
is stored, it's not stored as character encoding but rather as
vector images. Which is why even the best PDF text extractors
will do things like converting u to ii or converting d to cl.

Incorrect. Text is stored as characters (glyphs), not vector images.

Glyph just means shape. The shape must be encoded somehow.
Isn't it encoded as a vector image? There are only two methods
I'm aware of. A raster image is a map of pixel values. A vector image
is a math formula. The latter can be losslessly enlarged because
they're shapes rather than point data. My understanding is that PDFs
are using vector encoding, which is why they can be enlarged
without losing definition.

If that's not true then perhaps you could point to a link. I'd be
curious to understand better how it works.

This makes a difference
because if it's a vector image shape then OCR software might be
the best way to extract the text. Stored text, on the other hand, is
not shapes but rather numbers. For example, in plain ASCII, ANSI,
or UTF-8 text, a byte value of 65 represents "A". Binary data that
directly represents characters would translate perfectly to text. But
I don't think PDFs are storing it that way. First, because fonts must
be stored in the file. Second because PDF converters often make
visual/shape errors, like seeing "u" as "ii".

Joe Beanfish

2024-07-29 16:10:28 UTC

Post by Joe Beanfish

Incorrect. Text is stored as characters (glyphs), not vector images.

Glyph just means shape. The shape must be encoded somehow.
Isn't it encoded as a vector image? There are only two methods
I'm aware of. A raster image is a map of pixel values. A vector image
is a math formula. The latter can be losslessly enlarged because
they're shapes rather than point data. My understanding is that PDFs
are using vector encoding, which is why they can be enlarged
without losing definition.
If that's not true then perhaps you could point to a link. I'd be
curious to understand better how it works.
This makes a difference
because if it's a vector image shape then OCR software might be
the best way to extract the text. Stored text, on the other hand, is
not shapes but rather numbers. For example, in plain ASCII, ANSI,
or UTF-8 text, a byte value of 65 represents "A". Binary data that
directly represents characters would translate perfectly to text. But
I don't think PDFs are storing it that way. First, because fonts must
be stored in the file. Second because PDF converters often make
visual/shape errors, like seeing "u" as "ii".

You're thinking of fonts which are descriptions of how to draw the
letters. But the PDF file contains the text similar to this usenet
message, albeit with a lot of other info such as size and placement,
where each byte (or multi-byte) means a particular letter,
e.g. ASCII says byte with value 65 means A. Then the renderer will
lookup 65 or the letter A in the font definition to display/print it.
Fonts are sometimes stored within a PDF file, but those are separate
from the textual encoding.

Not sure what you mean by "PDF converters". PDFs are typically generated
one of two ways, either directly from source text as a word processor's
"Export to PDF" would do. Or pages are scanned (photographed) and placed
into the PDF as images and, optionally, OCR software is run on those
images to extract text to also include in the PDF. The former (export)
cannot make visual/shape errors; the latter (OCR) can. PDF text
extractors will read whatever text encoding has been placed. If a
scanned PDF was not OCRed as well, a text extractor will find nothing.

I've worked in the bowels of the XPDF text extraction code, so I know
this is how PDF works. Perhaps the wiki can word it better than I.
https://en.wikipedia.org/wiki/PDF
Jump down to the "Text" section.
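
To make the point about character codes concrete, here is a toy sketch. The content stream fragment is hand-written for illustration and is not taken from a real file: "Tf" selects a font, "Td" positions the text, and "Tj" shows the string in parentheses. Real PDFs usually compress their streams and may use font-specific encodings, so this regex approach is an assumption-laden demonstration and not a working extractor.

```vbscript
' Toy example: pull the literal strings out of an uncompressed,
' hand-written PDF content stream fragment.
Dim stream, re, m, out

stream = "BT /F1 12 Tf 72 720 Td (Hello, ) Tj (world) Tj ET"

Set re = New RegExp
re.Pattern = "\(([^)]*)\)\s*Tj"   ' a (string) followed by the Tj operator
re.Global = True

out = ""
For Each m In re.Execute(stream)
  out = out & m.SubMatches(0)
Next

WScript.Echo out   ' shows: Hello, world
```

The strings come out as ordinary character codes. The font named by /F1 only tells the renderer how to draw them.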

2024-07-24 19:54:09 UTC

Post by Larry Wolff
What freeware do you use to convert a relatively large (10-page) PDF text
file to an audio sound file (MP3 is fine, as is WAV) for free on Windows?

You can likely find online programs but that would violate your privacy.
I don't think it's possible to create an MP3 or WAV offline from PDF text.
If it was, there would be programs that you could download that do it.

ocr screenshots of displayed pdf pages, proofread, correct, text to speech

Larry Wolff

2024-08-18 20:47:11 UTC