<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">

<HTML>

<HEAD>

  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">

  <META NAME="GENERATOR" CONTENT="GtkHTML/3.18.2">

</HEAD>

<BODY>

I'm not sure exactly what you want to do,<BR>

but for scraping you can use<BR>

Ruby's Hpricot and scrape libraries; they are very good at gathering information from web pages.<BR>
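<BR>

In case a rough sketch helps: the same idea (parse each page, keep only the text) can be done with nothing but Python's standard library. To be clear, this is a swapped-in technique and not Hpricot's API. It assumes the site has already been mirrored to a local directory with wget's recursive option, and the names TextExtractor, html_to_text and site_to_text are made up for illustration.<BR>

<PRE>
```python
import os
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script and style contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names before calling this hook.
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def html_to_text(markup):
    """Return the text of one HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return "\n".join(parser.chunks)


def site_to_text(root, out_path):
    """Append the text of every .html/.htm file under root to out_path."""
    with open(out_path, "w", encoding="utf-8") as out:
        for folder, _dirs, files in os.walk(root):
            for name in sorted(files):
                if name.lower().endswith((".html", ".htm")):
                    path = os.path.join(folder, name)
                    # Assumption: pages are UTF-8; bad bytes are replaced.
                    with open(path, encoding="utf-8", errors="replace") as page:
                        out.write(html_to_text(page.read()) + "\n")
```
</PRE>

Calling site_to_text("mirror", "site.txt") would then write everything into one file; both names are placeholders for wherever the mirrored copy lives and wherever the output should go.<BR>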

<BR>

<BR>

    On Tue, 2008-07-22 at 09:01 -0700, VA wrote:<BR>

<BLOCKQUOTE TYPE=CITE>

    <TABLE CELLSPACING="0" CELLPADDING="0">

<TR>

<TD VALIGN="top">

Does wget have an option to obtain only the text of a webpage?<BR>

<BR>

<BR>

The recursive downloading option allows me to recreate the website locally, but I want only the content (the text) from all the HTML pages on the site, in one text (ASCII) file.<BR>

<BR>

Any ideas?<BR>

<BR>

Thanks,<BR>

Virginia<BR>

<BR>

<BR>

</TD>

</TR>

</TABLE>

    <BR>

<PRE>

_______________________________________________

nmglug mailing list

<A HREF="mailto:nmglug@nmglug.org">nmglug@nmglug.org</A>

<A HREF="https://nmglug.org/mailman/listinfo/nmglug">https://nmglug.org/mailman/listinfo/nmglug</A>

</PRE>

</BLOCKQUOTE>

<TABLE CELLSPACING="0" CELLPADDING="0" WIDTH="100%">

<TR>

<TD>

<PRE>

-- 

Andres Paglayan

CTO, StoneSoup LLC 

Ph: 505 629-4344

Mb: 505 690-2871

FWD: 65-5587

Testi. Codi. Vinci.

</PRE>

</TD>

</TR>

</TABLE>

</BODY>

</HTML>