[nmglug] How do I get the text content only from a website?
Andrew Farnsworth
farnsaw at stonedoor.com
Tue Jul 22 09:09:31 PDT 2008
VA wrote:
> Does WGET have an option to obtain only the text of a webpage?
>
> The recursive downloading option allows me to create the website
> locally, but I want the content (the text) only, from all html's on
> the site in one text (ascii) file.
>
> Any ideas?
>
> Thanks,
> Virginia
Use wget in conjunction with HTML2TXT or HTML2RTF, depending on what you
are really trying to do, and then just cat the converted files together.
A bit of scripting around these tools should get you a single command
line that will give you the complete text of the site in one file.
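
For what it's worth, here is a rough sketch of the convert-and-concatenate
step in Python. It assumes the site has already been mirrored locally with
wget's recursive option, and it swaps in the standard library's
html.parser for the HTML-to-text converter, so the output will not match
what a dedicated converter produces. The names TextExtractor,
html_to_text and site_to_text are made up for this example.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text content of one HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def site_to_text(mirror_dir, out_file):
    """Write the text of every .html/.htm file under mirror_dir to out_file.

    Returns the number of pages processed.
    """
    pages = sorted(
        p for p in Path(mirror_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in {".html", ".htm"}
    )
    with open(out_file, "w", encoding="utf-8") as out:
        for page in pages:
            html = page.read_text(encoding="utf-8", errors="replace")
            out.write(html_to_text(html) + "\n\n")
    return len(pages)
```

After something like "wget -r http://www.example.org/" (a placeholder
URL), the mirror ends up in a directory named after the host, so calling
site_to_text("www.example.org", "site.txt") should leave all the page
text in site.txt. Pages are read as UTF-8 with undecodable bytes
replaced, which is an assumption about the site's encoding.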
Andy