RAA - doc_convert

doc_convert / 0.0.3a

Short description: Converts documents to text, html and pdf
Category: Application/Text processing
Status: alpha
Created: 2003-07-29 20:02:22 GMT
Last update: -
Owner: MarkWilson
Homepage: none
Download: ftp://ftp.learningruby.com/document_convert-0.0.3a.tar.gz
License: Ruby's
Dependency:
None
Description:

What's new: Fixed a mistake that broke text to pdf conversion. Some refactoring.

What was new: Accepts multiple files to convert on the command line. Some refactoring.

Description: Converts text, html, pdf, rtf, word and postscript files TO text, html or pdf file formats.

Usage: ruby document_convert.rb <file name.format> [<file name.format> ...] <text | html | pdf>, e.g.,

pdf to text - ruby document_convert.rb document.pdf text
text to html - ruby document_convert.rb document.txt html
word to pdf - ruby document_convert.rb document.doc pdf
postscript to html - ruby document_convert.rb document.ps html

Multiple files: ruby document_convert.rb document.ps document.pdf document.html text

Output: Each converted file is written as file_name.format.target_format, e.g., document.pdf converted to text becomes document.pdf.txt
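
The command-line convention above (input files first, target format last, output named by appending the target extension) can be sketched in Ruby as follows. This is only an illustration, not code from document_convert.rb: the names TARGET_EXTENSIONS, split_arguments and output_name are hypothetical, and the "text" to ".txt" mapping is inferred from the document.pdf.txt example.

```ruby
# Hypothetical helpers; none of these names come from document_convert.rb.
# Assumption: the target word "text" maps to the ".txt" extension, as in
# the document.pdf.txt example; "html" and "pdf" are used unchanged.
TARGET_EXTENSIONS = { "text" => "txt", "html" => "html", "pdf" => "pdf" }.freeze

# Split the argument list into input files and the trailing target format.
def split_arguments(argv)
  raise ArgumentError, "need at least one file and a target format" if argv.length < 2

  *files, target = argv
  raise ArgumentError, "unknown target format: #{target}" unless TARGET_EXTENSIONS.key?(target)

  [files, target]
end

# Build the output name by appending the target extension to the input name.
def output_name(input, target)
  "#{input}.#{TARGET_EXTENSIONS.fetch(target)}"
end

files, target = split_arguments(%w[document.ps document.pdf document.html text])
files.each { |file| puts "#{file} -> #{output_name(file, target)}" }
```

Because the original extension is kept in the output name, converting document.ps and document.pdf to the same target in one run would not produce colliding file names.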

Requirements: The following external programs must be installed.

antiword available at http://www.winfield.demon.nl/index.html. Also available through fink.
enscript available at http://people.ssh.fi/mtr/genscript/. Also available through fink.
file present in one form or another on every Unix-based OS. Also available at http://www.qnx.com/developer/docs/qnx_6.1_docs/neutrino/utilities/f/file.html and through fink.
htmldoc available at http://www.easysw.com/htmldoc/htmldoc.html.
lynx available at http://lynx.isc.org/release/. Also available through fink.
pdftohtml available at http://pdftohtml.sourceforge.net/.
pdftotext available at http://www.foolabs.com/xpdf/download.html (part of Xpdf). Also available through fink.
ps2pdf available with Ghostscript (and therefore present on almost every Unix-based OS).
pstotext available at http://research.compaq.com/SRC/virtualpaper/pstotext.html.
txt2html available at http://txt2html.sourceforge.net/txt2html.html.
unrtf available at http://www.gnu.org/software/unrtf/unrtf.html. Also available through fink.
wvHtml available at http://wvware.sourceforge.net/.
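
Before running a conversion, it may help to check which of the programs listed above are installed. The following Ruby sketch searches the directories in PATH for each one. It is not part of doc_convert: REQUIRED_TOOLS, find_executable and missing_tools are hypothetical names, and it assumes the tools are invoked by the exact command names shown in the list.

```ruby
# Command names taken from the requirements list above.
REQUIRED_TOOLS = %w[
  antiword enscript file htmldoc lynx pdftohtml
  pdftotext ps2pdf pstotext txt2html unrtf wvHtml
].freeze

# Return the full path of the first executable called `name` found in
# `path`, or nil if none of the directories contains one.
def find_executable(name, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?

    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# List the tools that could not be found.
def missing_tools(tools = REQUIRED_TOOLS, path = ENV.fetch("PATH", ""))
  tools.reject { |tool| find_executable(tool, path) }
end

missing = missing_tools
if missing.empty?
  puts "All required tools were found on PATH."
else
  puts "Missing tools: #{missing.join(', ')}"
end
```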

Install: Copy document_convert.rb to a directory on your PATH (e.g., /usr/local/bin), add a #!path/to/ruby line as the first line of the file, and make it executable. Copy document_convert_model.rb to the Ruby site library directory (e.g., /usr/local/lib/ruby/site_ruby/1.8).
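
The install steps can also be scripted. The sketch below is one possible way to do it in Ruby and is not shipped with doc_convert: install_doc_convert is a hypothetical function, the directories are arguments you would need to adjust for your system, and it assumes both .rb files sit in the unpacked source directory. It uses RbConfig.ruby as the interpreter path for the shebang line unless another path is given.

```ruby
require "fileutils"
require "rbconfig"

# Hypothetical installer; adjust the directory arguments for your system.
# Returns the path of the installed script.
def install_doc_convert(source_dir, bin_dir, lib_dir, ruby_path = RbConfig.ruby)
  script = File.read(File.join(source_dir, "document_convert.rb"))
  # Prepend a shebang line only if the script does not already start with one.
  script = "#!#{ruby_path}\n#{script}" unless script.start_with?("#!")

  FileUtils.mkdir_p([bin_dir, lib_dir])

  # Write the script into the executable directory and make it executable.
  target = File.join(bin_dir, "document_convert.rb")
  File.write(target, script)
  FileUtils.chmod(0o755, target)

  # Copy the model file into the Ruby library directory.
  FileUtils.cp(File.join(source_dir, "document_convert_model.rb"), lib_dir)
  target
end

# Example call (commented out because it writes to system directories):
# install_doc_convert(".", "/usr/local/bin", RbConfig::CONFIG["sitelibdir"])
```

RbConfig::CONFIG["sitelibdir"] reports the site library directory of the running interpreter, which avoids hard-coding a version-specific path.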
