Quick and dirty article filter



As I've been spending more time on Usenet, I've grown rather annoyed
with the "features" of Google Groups. Yes, it sucks, it makes things a
mess, and so on. But! Instead of just whining about the problem, I've
put together a small, simple Perl program to filter Google Groups (and
other) articles into something resembling sanity. I wrote this for
slrn, and I'm including the slrn macro that calls it as well.

A quick caveat about this code: yes, it uses Email::Simple. For the
most part, email messages and news articles are compatible--but that's
not what prompts the caveat. The code uses Email::Simple to stick its
fingers in its ears and pretend that encodings other than US-ASCII do
not exist. For Usenet, this is still mostly okay (at least for the
parts of it I read, i.e. the Big 8 and a few groups in alt.*). It will
very likely break on multibyte messages, but I tend not to read or
encounter those. You have been warned.
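
If that worries you, one possible guard (a minimal sketch, not part of
fix_article.pl) is to check for bytes outside US-ASCII and pass such
articles through untouched. The names is_ascii_only and filter_if_ascii
are made up for illustration. Note that this only looks at raw bytes, so
it will not notice quoted-printable or base64 bodies, which are plain
ASCII on the wire.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the raw article contains only 7-bit
# bytes, i.e. nothing outside US-ASCII.
sub is_ascii_only {
    my ($raw) = @_;
    return $raw !~ /[^\x00-\x7F]/;
}

# Hypothetical helper: hand the article to a reformatting callback only
# when it is pure ASCII; otherwise return it unchanged.
sub filter_if_ascii {
    my ($raw, $reformat) = @_;
    return is_ascii_only($raw) ? $reformat->($raw) : $raw;
}
```

The callback would wrap whatever rewriting you want done, so anything
with 8-bit content comes back exactly as it went in.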

The slrn macro requires a new-ish version of libslang (2.2.4 works) due
to a recently-fixed bug in process.sl that I stumbled over while
developing it. The bug was fixed in git at the time I found it, but not
in the released version of libslang.

Patches are, of course, welcome. If someone wants to do proper encoding
handling, that'd be pretty awesome. If you really care about licenses,
I'll release it under the same terms as Perl 5.16.0 or any later
version. Provided as-is, no warranty, yadda yadda.

fix_article.pl:

=== cut ===
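#!/usr/bin/env perl
use strict;
use warnings;

use IO::All;
use Email::Simple;
use Text::Autoformat;

# Read the whole article from stdin and split it into headers and body.
my $raw = io('-')->all;
my $article = Email::Simple->new($raw);
my $body = $article->body;
# Collapse spaced-out quote markers ("> > >" becomes ">>>"), keeping depth.
$body =~ s{^(>(?: >)+)}{ (my $q = $1) =~ tr/ //d; $q }mge;
# Make sure a space separates the quote markers from the quoted text.
$body =~ s/^(>+)(\w)/$1 $2/mg;
# Reflow the text. By default autoformat only reflows the first paragraph;
# pass { all => 1 } as a second argument to reflow every paragraph.
$body = autoformat $body;
$article->body_set($body);
print $article->as_string;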
=== cut ===
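
To see what the quote cleanup in fix_article.pl is aiming for, it can be
tried in isolation with core Perl only. This is a sketch, not the script
itself: normalize_quotes is a hypothetical name, and the first
substitution is written so that it keeps the quote depth when it
collapses "> > >" style markers.

```perl
use strict;
use warnings;

# Hypothetical helper: tidy the quote markers at the start of each line.
sub normalize_quotes {
    my ($body) = @_;
    # Collapse "> > >" into ">>>" by deleting the spaces between markers.
    $body =~ s{^(>(?: >)+)}{ (my $q = $1) =~ tr/ //d; $q }mge;
    # Put one space between the markers and the text: ">>foo" -> ">> foo".
    $body =~ s/^(>+)(\w)/$1 $2/mg;
    return $body;
}

# "> > >foo" comes out as ">>> foo"; unquoted lines are left alone.
print normalize_quotes("> > >foo\n>bar\nplain text\n");
```

Collapsing first and spacing second means a line like "> >foo" ends up
as ">> foo" rather than ">>foo".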

fix_article.sl:

=== cut ===
require("process");
define fix_article_stupidity ()
{
   variable a = article_as_string();
   variable p = new_process(["/path/to/fix_article.pl"]; write={1,2}, read=0);
   % Send the article to the filter's stdin, then close it so the
   % filter sees end-of-file.
   fputslines(a, p.fp0);
   fflush(p.fp0);
   fclose(p.fp0);
   % Read the filtered article back from the filter's stdout.
   variable r = fgetslines(p.fp1);
   p.wait();
   variable n = strjoin(r, "");
   replace_article(n);
}
% Run the filter on every article slrn displays.
!if(register_hook("read_article_hook", "fix_article_stupidity"))
   message("Warning: Could not register fix_article_stupidity" +
           " for read_article_hook");
=== cut ===


--
Thanks and best regards,
Chris Nehren