Title:     F5er user's manual
Summary:   F5er is a daemon that lets a user receive notifications about
           changes to web pages that do not offer built-in RSS support. On
           the user's request, it downloads a predefined web page, extracts
           the target information from the page content, and generates an
           RSS feed from that content.
Copyright: (c) 2009-2011 Vitaly Minko
           Content is available under GNU Free Documentation License 1.3 and
           Creative Commons Attribution-Share Alike 3.0 Unported License
Date:      15 Nov 2011
Web:       http://vminko.org/f5er

NAME
----

F5er - RSS notifier about updates on web pages

SYNOPSIS
--------

f5er [__--address__=_hostname_] [__--port__=_number_] [__--config__=`file`]

f5er __--help__

DESCRIPTION
-----------

__F5er__ is a daemon that lets a user receive notifications about changes to web pages that do not offer built-in RSS support. On the user's request, it downloads a predefined web page, extracts the target information from the page content, and generates an RSS feed from that content. Deciding whether the web page has changed is left to the user's RSS aggregator, so F5er acts as a universal intermediary between any web site and any RSS aggregator.

To receive notifications, the user adds a new feed to their RSS aggregator for each F5er channel defined in the configuration file. F5er feeds have the following URL format:

    http://hostname:port/channel_id

Here *channel_id* is the channel ID specified in the channel declaration. See the configuration section for details.

OPTIONS
-------

- __-a__ _hostname_, __--address__=_hostname_
  Local bind address. The default value is `localhost`.

- __-p__ _number_, __--port__=_number_
  Local bind port. The default value is `8080`.

- __-c__ `file`, __--config__=`file`
  Configuration file to use. The default value is `/etc/f5er.conf`.

- __-h__, __--help__
  Print usage information.

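
The feed URL format given in the description, combined with the default bind address and port listed under OPTIONS, can be sketched in a few lines of Python. This is only an illustration of how a feed URL is put together for an aggregator; the function name `feed_url` and the host `example.org` are hypothetical and not part of F5er.

```python
from urllib.parse import urlunsplit

# Defaults taken from the OPTIONS section of this manual.
DEFAULT_ADDRESS = "localhost"
DEFAULT_PORT = 8080


def feed_url(channel_id, address=DEFAULT_ADDRESS, port=DEFAULT_PORT):
    """Build http://hostname:port/channel_id for one channel (hypothetical helper)."""
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    # urlunsplit takes (scheme, netloc, path, query, fragment).
    return urlunsplit(("http", f"{address}:{port}", f"/{channel_id}", "", ""))


if __name__ == "__main__":
    # URL to paste into an RSS aggregator for a channel declared as [perl].
    print(feed_url("perl"))
    # Same channel when the daemon was started with --address and --port
    # (example.org and 9000 are made-up values).
    print(feed_url("perl", address="example.org", port=9000))
```

Each channel declared in the configuration file gets its own URL of this form, so an aggregator subscribed to several channels holds one feed entry per channel ID.
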

CONFIGURATION
-------------

Each line of the config file is either a comment or a directive. Comment lines start with `#`; comment lines and empty lines are ignored. Directives are of two types: channel declarations and parameter assignments.

A channel declaration has the following format:

    [channel_id]

A channel ID may contain only alphanumeric characters and underscores. All parameters that follow a channel declaration, up to the next declaration or the end of the file, belong to that channel.

A parameter assignment consists of a parameter name and a parameter value separated by `=`. For example:

    parameter = value

The supported parameters are listed below.

__title__
Required. Defines the title of the channel.

__description__
Required. Describes the channel.

__link__
Required. Defines the web site URL of the channel.

__post_form__
Optional. HTML form to send when fetching the channel's HTML page. If set, F5er sends a POST request instead of the default GET.

__selection_xpath__
Required. XPath to the selection to extract. See the [XPath Tutorial](#notes) for information on how to compose an XPath expression.

__paginator_xpath__
Optional. XPath to the paginator on the web page. If set, F5er tries to navigate to the last available page. To be handled properly, the paginator must contain numeric links to pages.

__ttl__
Optional. How often to refresh the feed from the source (in minutes).

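
The file format described above is small enough to sketch a reader for. The Python function below is an illustration only and is not F5er's own code. The name `parse_config` is hypothetical, and several behaviours are assumptions that the manual does not specify: surrounding whitespace is stripped, "alphanumeric" is taken to mean ASCII letters and digits, unknown parameters are rejected, and a repeated channel ID extends the earlier declaration.

```python
import re

# Parameter names taken from the list in this manual.
REQUIRED = ("title", "description", "link", "selection_xpath")
OPTIONAL = ("post_form", "paginator_xpath", "ttl")

# Assumption: "alphanumeric" means ASCII letters and digits.
CHANNEL_RE = re.compile(r"^\[([A-Za-z0-9_]+)\]$")


def parse_config(text):
    """Parse F5er-style config text into {channel_id: {parameter: value}}."""
    channels = {}
    current = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()  # assumption: leading/trailing whitespace is ignored
        if not line or line.startswith("#"):
            continue  # comments and empty lines are ignored
        match = CHANNEL_RE.match(line)
        if match:
            # Assumption: a repeated channel ID extends the earlier declaration.
            current = channels.setdefault(match.group(1), {})
            continue
        # Split on the first '=' only, since XPath values may contain '='.
        name, sep, value = line.partition("=")
        name = name.strip()
        if not sep or current is None or name not in REQUIRED + OPTIONAL:
            raise ValueError(f"line {lineno}: invalid directive: {raw!r}")
        current[name] = value.strip()
    for channel_id, params in channels.items():
        missing = [p for p in REQUIRED if p not in params]
        if missing:
            raise ValueError(f"channel {channel_id!r}: missing {', '.join(missing)}")
    return channels


if __name__ == "__main__":
    # Hypothetical channel; example.org and the XPath are made-up values.
    sample = """
# one channel
[news]
title = News
description = Latest headline
link = http://example.org/
selection_xpath = //div[@class='headline'][1]
"""
    print(parse_config(sample))
```

Splitting on the first `=` matters here because `selection_xpath` values commonly contain `=` inside attribute predicates.
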

For example, the following configuration can be used to monitor the current Perl version:

    [perl]
    title = Perl
    description = Current Perl version
    link = http://www.perl.org/
    selection_xpath = //div[@id='short_lists']/div[1]
    ttl = 60

DEPENDENCIES
------------

Besides the Perl interpreter and the core modules, F5er requires the following modules:

- [HTML::Entities](http://search.cpan.org/perldoc?HTML::Entities)
- [HTML::FormatText](http://search.cpan.org/perldoc?HTML::FormatText)
- [HTML::TreeBuilder](http://search.cpan.org/perldoc?HTML::TreeBuilder)
- [HTML::TreeBuilder::XPath](http://search.cpan.org/perldoc?HTML::TreeBuilder::XPath)
- [HTTP::Daemon](http://search.cpan.org/perldoc?HTTP::Daemon)
- [HTTP::Status](http://search.cpan.org/perldoc?HTTP::Status)
- [HTML::Tidy](http://search.cpan.org/perldoc?HTML::Tidy)
- [LWP::UserAgent](http://search.cpan.org/perldoc?LWP::UserAgent)
- [XML::RSS](http://search.cpan.org/perldoc?XML::RSS)

All of them are available on CPAN.

DOWNLOAD
--------

The current stable version is 0.4. The source code is available [here](/storage/f5er/f5er-0.4).

To fetch the latest source code from the git repository, run the following command:

    git clone git://vminko.org/f5er

NOTES
-----

1. [Comparison of feed aggregators](http://en.wikipedia.org/wiki/Comparison_of_feed_aggregators)
2. [XPath Tutorial](http://www.w3schools.com/xpath/default.asp)
3. [RSS Reference](http://www.w3schools.com/rss/rss_reference.asp)

HISTORY
-------

__15 Nov 2011__ - Version 0.4
Fixed a memory leak.

__24 Oct 2011__ - Version 0.3
Added the ability to compose channels based on POST requests (using the `post_form` parameter).

__10 May 2011__ - Version 0.2
Added cleanup of the HTML content so that nodes can be extracted from pages with invalid markup.

__18 Apr 2011__ - Version 0.1
Initial public release.


COPYRIGHT AND LICENSE
---------------------

Copyright (C) 2011 Vitaly Minko

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the [GNU General Public License](http://www.gnu.org/licenses/) for more details.