If you have to deal with those pesky PDF files, pdftohtml from the poppler package and w3m are all you need.

pdftohtml -i -s -stdout filename.pdf | w3m -T text/html

Make it into a convenient function by adding it to your .bashrc.

pdf() { if [ $# -ne 1 ]; then echo "Usage: pdf filename."; else pdftohtml -i -s -stdout "$1" | w3m -T text/html; fi; }
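For anyone who prefers something a bit more defensive, here is a rough multi-line sketch of the same idea. The readability check and the exit codes are my own additions, not part of the original function, so adjust to taste.

```shell
# Sketch only: a more defensive variant of the pdf() function.
# The file check and the return codes are assumptions added for illustration.
pdf() {
    if [ "$#" -ne 1 ]; then
        echo "Usage: pdf filename" >&2
        return 2
    fi
    if [ ! -r "$1" ]; then
        echo "pdf: cannot read '$1'" >&2
        return 1
    fi
    # -i ignores images, -s produces a single HTML document,
    # -stdout writes the result to standard output for w3m to read.
    pdftohtml -i -s -stdout "$1" | w3m -T text/html
}
```

Quoting "$1" keeps filenames with spaces in one piece, and sending the messages to stderr keeps them out of any pipe.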

@storm I have found the conversions from these to be ugly at best, and often worse than useless. Converting pdf, even properly tagged pdf, is still a dicey business.

@dgoodmaniii most actual pdf software is not accessible with orca, and even the few that are can be a pain to use. This method at least gets the text in a format that is usable. I figured the formatting wouldn't be exactly the same as the original, but I didn't realize it would be terrible. In my case, however, the layout isn't usually that important.

@storm Fair enough; as long as the pdf has text in it (many scans don't), you will get something readable. I spend a *lot* of time on document conversion, and find the whole process very frustrating with pdf.

The mutools bundle is also worth a look.
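On the point about scans with no text in them: a quick way to check before converting might look like the sketch below. It assumes pdftotext (also from poppler) is installed; the function name pdf_has_text and its return codes are made up for illustration.

```shell
# Sketch only: guess whether a PDF has an extractable text layer.
# pdf_has_text is a hypothetical helper name; it assumes pdftotext is available.
pdf_has_text() {
    if [ "$#" -ne 1 ] || [ ! -r "$1" ]; then
        echo "Usage: pdf_has_text filename" >&2
        return 2
    fi
    # Extract text to stdout ("-") and succeed if any letter or digit appears.
    pdftotext -q "$1" - 2>/dev/null | grep -q '[[:alnum:]]'
}
```

Something like `pdf_has_text manual.pdf && pdf manual.pdf` would then skip files that are probably pure scans. It is only a heuristic: a scan with a stray page number embedded as text would still pass.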

@dgoodmaniii oh, no doubt about it, pdf is the root of all evil. It's like plain text came out and it was good, so companies immediately set about finding the worst thing possible lol. Now they all use it, no plain text instruction manuals to be found ever. I will look into mutools, thanks for the suggestion.