| Short description: |
Converts documents to text, html and pdf |
| Category: |
Application/Text processing |
| Status: |
alpha |
| Created: |
2003-07-29 20:02:22 GMT |
| Last update: |
- |
| Owner: |
MarkWilson |
| Homepage: |
none |
| Download: |
ftp://ftp.learningruby.com/document_convert-0.0.3a.tar.gz
|
| License: |
Ruby's |
| Dependency: |
|
| Description: |
What's new: Fixed a mistake that broke text to pdf conversion. Some refactoring.
What was new: Accepts multiple files to convert on the command line. Some refactoring.
Description: Converts text, html, pdf, rtf, word and postscript files TO text, html or pdf file formats.
Usage: ruby document_convert.rb <file name.format> [<file name.format> ...] <text | html | pdf>, e.g.,
pdf to text - ruby document_convert.rb document.pdf text
text to html - ruby document_convert.rb document.txt html
word to pdf - ruby document_convert.rb document.doc pdf
postscript to html - ruby document_convert.rb document.ps html
Multiple files: ruby document_convert.rb document.ps document.pdf document.html text
Output: file_name.format.target_format, e.g., document.pdf.txt
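The usage and output lines above can be sketched in Ruby as follows. This is only an illustration, not code from document_convert.rb: the helper names output_name and plan are hypothetical, and the extensions for the html and pdf targets are assumed (the listing only shows text producing .txt).

```ruby
# Hypothetical helpers, not part of document_convert.rb.
# Assumed mapping from target keyword to output extension; only
# "text" => "txt" is confirmed by the example document.pdf.txt.
TARGET_EXTENSIONS = { "text" => "txt", "html" => "html", "pdf" => "pdf" }.freeze

# Build the output name by appending the target extension to the full
# input name, e.g. document.pdf -> document.pdf.txt.
def output_name(input_path, target)
  ext = TARGET_EXTENSIONS.fetch(target) { raise ArgumentError, "unknown target format: #{target}" }
  "#{input_path}.#{ext}"
end

# Split an argument list as the usage line describes: every argument
# except the last is an input file, and the last is the target format.
def plan(argv)
  *inputs, target = argv
  raise ArgumentError, "need at least one file and a target format" if inputs.empty?
  inputs.map { |path| [path, output_name(path, target)] }
end
```

For example, plan(%w[document.ps document.pdf text]) pairs each input with its output name, so document.ps maps to document.ps.txt.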
Requirements: The following external tools must be installed:
antiword available at http://www.winfield.demon.nl/index.html. Also available through fink.
enscript available at http://people.ssh.fi/mtr/genscript/. Also available through fink.
file present in one form or another on every Unix-based OS. Also available at http://www.qnx.com/developer/docs/qnx_6.1_docs/neutrino/utilities/f/file.html. Also available through fink.
htmldoc available at http://www.easysw.com/htmldoc/htmldoc.html.
lynx available at http://lynx.isc.org/release/. Also available through fink.
pdftohtml available at http://pdftohtml.sourceforge.net/.
pdftotext available at http://www.foolabs.com/xpdf/download.html (part of Xpdf). Also available through fink.
ps2pdf available with Ghostscript (and therefore present on almost every Unix-based OS).
pstotext available at http://research.compaq.com/SRC/virtualpaper/pstotext.html.
txt2html available at http://txt2html.sourceforge.net/txt2html.html.
unrtf available at http://www.gnu.org/software/unrtf/unrtf.html. Also available through fink.
wvHtml available at http://wvware.sourceforge.net/.
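Because the script depends on this many external programs, it can help to check which ones are missing before converting anything. The following is a minimal sketch under stated assumptions: it is not part of the distributed script, the names REQUIRED_TOOLS, executable_on_path? and missing_tools are hypothetical, and it assumes each tool is installed under exactly the name listed above.

```ruby
# Hypothetical pre-flight check, not part of document_convert.rb.
# Tool names are taken from the requirements list above.
REQUIRED_TOOLS = %w[
  antiword enscript file htmldoc lynx pdftohtml
  pdftotext ps2pdf pstotext txt2html unrtf wvHtml
].freeze

# True if an executable file called +name+ exists in one of the
# directories of +path+ (a PATH-style string).
def executable_on_path?(name, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).reject(&:empty?).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Return the tools from +tools+ that cannot be found on +path+.
def missing_tools(tools = REQUIRED_TOOLS, path = ENV.fetch("PATH", ""))
  tools.reject { |tool| executable_on_path?(tool, path) }
end

if __FILE__ == $PROGRAM_NAME
  missing = missing_tools
  if missing.empty?
    puts "All required tools were found on PATH."
  else
    puts "Missing tools: #{missing.join(', ')}"
  end
end
```

The check only confirms that an executable with the right name exists; it does not verify versions or that the tool actually works.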
Install: Copy document_convert.rb to a directory for executables (e.g., /usr/local/bin), make it executable, and add a #!path/to/ruby line at the top of the file. Copy document_convert_model.rb to the Ruby site library directory (e.g., /usr/local/lib/ruby/site_ruby/1.8).
|