ReallySimpleSnagger

Version 0.1 - June 4, 2006

Copyright 2005-2006 by Eric Y. Theriault.  All Rights Reserved.
All other brand and product names used herein or in the Software
are trademarks or registered trademarks of their respective owners.

  1. ReallySimpleSnagger
  2. Introduction
    1. Features
    2. License
    3. Acknowledgments
  3. Installation
  4. Properties Menu
  5. Setup
  6. Site Editor
  7. Importing and Exporting OPML
  8. Updating
    1. Automating the Updates
  9. Data Synchronization
    1. Manual Synchronization
    2. PocketPC and ActiveSync
    3. Palm and Plucker
  10. Using The Content
    1. Generated HTML Indexes
      1. HTML Pages
      2. RSS and Atom Feeds
      3. OPML Feeds
    2. Generated RSS Feeds
  11. Troubleshooting and FAQ
    1. “You have been banned”
    2. The Feed I read had articles on multiple pages or requires a login.
    3. Downloading HTML pages (non-feeds)
    4. How much space does it use?  How long does it take?
    5. Pictures are missing from a Blog
    6. What is the Sites.xml file?
    7. I have chosen the Generate RSS Feeds option in the Site Editor, but all the links come back as Not-Found; why?
    8. How do I customize the Look and Feel of the indexes?
    9. Some items in a feed were updated yesterday, but are now in both the recently updated items and the older items.
    10. Firewall Notes
    11. Why am I getting an exception during ActiveSync Synchronization?
    12. ReallySimpleSnagger ActiveSync Service says 'Only one usage of each socket address (protocol/network address/port) is normally permitted'
    13. Where can I get help?

Introduction

ReallySimpleSnagger is an Atom, OPML, and RSS feed aggregator optimized for reading feeds offline. Most feed aggregators only read the feed, and most feeds only contain a summary of referenced items and do not contain images, which makes them difficult and frustrating to read offline.

Normal aggregators do not download the entire document.

By contrast, ReallySimpleSnagger downloads the feeds and the full items they reference, creating an offline experience that is almost like being online.

ReallySimpleSnagger downloads the full feed and referenced documents, making a better offline experience.

Once ReallySimpleSnagger has downloaded your feeds, you can copy the content to your device and not require an Internet connection to read through the referenced content.

1). ReallySimpleSnagger downloads Content. 2). Content can be copied to a device. 3). Use the content on or off-line.

Features

License

ReallySimpleSnagger 0.1 is a pre-release, Public Beta. It is being made freely available so that we can gather feedback on its feature set and compatibility.  The software is provided to you "As-Is" without any warranty.  Future releases of ReallySimpleSnagger may or may not be free.

Acknowledgments

Special thanks go to log4net, IIOP.net, and OpenNETCF.org, whose components are included in this distribution and made this product much easier to create.

Installation

If you have not downloaded the software yet, please download it here.

Note: If you have installed a previous version of ReallySimpleSnagger, please remove it from your system before proceeding with the installation.  You can do this by going to Start, Control Panel, and Add or Remove Programs. In the list, click on ReallySimpleSnagger and click on the Remove button.

When you are ready to install the program, simply double-click SETUP.EXE. If a previous version is still installed, the installer will tell you immediately; follow the instructions above before attempting the installation again.  This application requires the .NET Framework 2.0; if you do not have this installed on your computer, the installer will prompt you to install it.

During the installation, you will be asked whether you want to install the ReallySimpleSnagger ActiveSync Service; if you do not know about this service yet, it is recommended that you not install it until later (see ActiveSync Properties). You will then be asked where to install the software; typically, the default value is fine. After this, the installer will proceed to install the software.

Once the software is installed, go to Start, Programs, ReallySimpleSnagger, and click on ReallySimpleSnagger.  In a few moments, you will see the core interface, but before you get too busy, I would recommend that you go into the Properties menu.

Properties Menu

The Properties menu allows you to set global parameters. Select File from the main ReallySimpleSnagger screen, and select Properties. The Properties dialog contains the following tabs:

General

These settings are general, application-wide properties.

Working Directory

The Working Directory field allows you to indicate where ReallySimpleSnagger will store its data on your system. You can either type in the directory yourself, or click the ... button to select a directory.

NOTE: The selected directory should not contain any sub-directories that ReallySimpleSnagger did not create; otherwise, synchronization issues could occur.

Note: Changing directories will not copy any of the existing data over. It is recommended that you change the directory, exit, copy the data to the new directory, and delete the original directory.

Generated Feed Directory

The Generated Feed Directory is a bit advanced. Usually, this will be blank. If you are not regenerating RSS feeds (Generate RSS Feed in the Site Editor), you can ignore this option completely. If you are using this feature, then this should be the directory where your feeds will eventually be stored; normally this is your Handheld Directory (if configured) or your working directory. If you are copying the files manually, you may need to edit this setting.

ActiveSync

The ActiveSync settings are specific to Pocket PC ActiveSync. If you are not using ActiveSync, simply ignore this tab.

The Device Directory is the directory on the handheld where ReallySimpleSnagger’s data will be stored. You can either type in the directory yourself, or click the ... button if your device is connected. Please note that currently, the ... button takes a moment to scan through your device.

NOTE: It is extremely important that the directory that you select be empty and only used for data generated by ReallySimpleSnagger. Data that is not generated by ReallySimpleSnagger will be erased!

There will either be an Install ActiveSync Service or a Remove ActiveSync Service button on this page, depending on the current state of the service. This button will either install or remove the ReallySimpleSnagger ActiveSync Service from your Startup menu. If you use this service, it is recommended that you install it so that it is always running. Otherwise, in order to run the service, you will have to go to Start, All Programs, ReallySimpleSnagger, and select the ReallySimpleSnagger ActiveSync Service from the menu.

Note: Installing or removing the service does not immediately start or stop the service. The service will only be started when you log into your machine. To start the service manually, go to Start, All Programs, ReallySimpleSnagger, and select the ReallySimpleSnagger ActiveSync Service from the menu.  To stop the service, right-click on the ReallySimpleSnagger icon in the taskbar and select Close.

Site Defaults

This tab sets the defaults for the Site Editor. Please see the Site Editor section for complete details.

The Update Sites button allows you to apply the currently defined defaults to all the current sites.

Site Passwords

Because of the nature of ReallySimpleSnagger, passwords are not associated with a single site. Instead, passwords are associated with URLs. The Site Passwords tab allows you to edit the passwords associated with URLs.

The screen has a list of URLs and user names. Clicking on any entry in the list will load it into the editor, and clicking the New button will create a new entry.

The Settings group has the properties of the URL. The first is the URL of the site. This is treated as a prefix; for the password to be used, the host of the URL must match exactly and the path of the referenced link must begin with this path. For security reasons, you should use the longest prefix possible. The user name and password settings are straightforward.
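
The prefix rule can be sketched as follows. This is only an illustration in Python of the rule as described here, not ReallySimpleSnagger's actual code; the function name credential_applies and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def credential_applies(stored_url: str, link: str) -> bool:
    """Return True if a password stored for stored_url would cover link.

    Assumed rule: the host must match exactly and the path of the
    link must begin with the path of the stored URL.
    """
    stored = urlsplit(stored_url)
    target = urlsplit(link)
    # urlsplit() reports host names in lower case, so this comparison
    # is exact apart from letter case.
    if stored.hostname != target.hostname:
        return False
    return target.path.startswith(stored.path)


# A long prefix covers only the private area of the site.
print(credential_applies("http://example.com/private/",
                         "http://example.com/private/feed.xml"))
print(credential_applies("http://example.com/private/",
                         "http://example.com/public/feed.xml"))
```

A short prefix such as http://example.com/ would match every link on that host, which is why the longest possible prefix is the safer choice.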

Once you are done, select Add or Update depending on your context. For security purposes, you must re-enter the password each time you make an edit to a site.

Filters

To control some of the content that ReallySimpleSnagger will download, you can set up filters. Filters allow you to define rules so that, when a rule matches a given site, some operation is performed.

The matching is done via regular expressions. This allows you to match components of the URL to control what is downloaded. The flexibility of regular expressions also makes them more of an advanced topic and they are not covered here.
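
As a rough illustration of matching parts of a URL with regular expressions, here is a minimal sketch. It uses Python's re module, whose syntax is close to but not identical to the regular expressions ReallySimpleSnagger accepts; the two patterns are hypothetical examples rather than shipped defaults, and the actions a filter can trigger are not modelled.

```python
import re

# Hypothetical filter patterns, for illustration only.
FILTERS = [
    re.compile(r"^https?://ads\."),       # any host whose name starts with "ads."
    re.compile(r"\.(zip|exe)(\?.*)?$"),   # links that end in .zip or .exe
]


def matching_filter(url: str):
    """Return the first pattern that matches url, or None if none match."""
    for pattern in FILTERS:
        if pattern.search(url):
            return pattern
    return None


for url in ("http://ads.example.com/banner.html",
            "http://example.com/files/tool.zip",
            "http://example.com/article.html"):
    hit = matching_filter(url)
    print(url, "->", hit.pattern if hit else "no filter matched")
```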

The screen has a list of regular expressions and actions. Clicking on any entry in the list will load it into the editor, and clicking the New button will create a new filter.

The Settings group has the properties of the Regular Expression, including the regular expression itself. On matching, you can do one of the following things:

Once you are done, select Add or Update depending on your context.

Note: Filters do not apply to links that you have typed into the Site Editor.

Advanced

The Advanced tab contains properties that are for advanced users. For most users, the defaults are fine.

HTTP Time Out

This allows you to change the time-out period of an HTTP connection. In most cases, the default is more than enough and this setting should be left alone.  The value is specified in milliseconds, and the default is 100,000 (100 seconds, or about 1.7 minutes).

Logging Level

This allows you to change the default logging level. For most users, the default of Info is sufficient. For slightly smaller log files, Warn could be used. For troubleshooting, Debug is recommended.

Setup

Clicking on OK will save any changes you have made. Clicking on Cancel will discard any changes made since the last save. If the item was never saved, it will be completely discarded.

See Site Editor for more information, or see Updating for your first download.

Site Editor

The other fields in this section are described below.

Site

This is the URL of the site to download. Typically you will enter a site and use the Test Link button as described in Setup above to ensure that ReallySimpleSnagger will do the right thing.

Title (optional)

This is a title that can be displayed in the generated index file. If you leave it blank, Test Link will populate it with the feed title, if detected, or otherwise, it will default to the site's URL.

Items To Keep

Feeds are composed of a number of items. This defines how many items you want to keep at any given time. A large number will use more space on your device, but too small a number could prevent you from seeing newly posted items.  Please note that for OPML feeds, this number will apply to referenced Atom and RSS feeds.

Note: If you set the maximum number of entries to a number less than the number of items that typically appear in the feed, the number will be updated to the number of current items.

Save Enclosures?

For feeds that are video blogs or podcasts, use this option to capture video and audio attachments. If you do not use this option, they will not be downloaded. Note that since these can be large, you may want to set Items To Keep to a low number and enable this feature only on blogs whose content you will actually use.

New Item Descriptions and Old Item Descriptions

For RSS and Atom feeds, ReallySimpleSnagger will generate an HTML index. If you select Show Description, the description from the feed will be displayed. This can be a good idea for some blogs to get an idea of the content, but for others, it could make the index less useful.

You can also decide when you want the descriptions shown. You can choose to show the descriptions only for new items, only for older items, or for both. This can be useful to control content size or to differentiate between older items and newer items.

If your destination browser supports JavaScript, you can set this to Show via JavaScript, which will place a box [+] where the description will be. By clicking on this box, the description will be shown, and the box will change to a -. Clicking the dash will hide the description.

Generate RSS?

For feeds, ReallySimpleSnagger can create RSS files so that you can use an RSS Reader on your device to read and track the items. If you are not using an RSS reader on your device, you can save some space by leaving this unchecked.

If you are using this feature, see the General Properties discussion of the Generated Feed Directory.

Note: For OPML feeds, the Items To Keep, Save Enclosures, and Show Descriptions in Index parameters are used for the referenced feeds, and are otherwise ignored.

The buttons are as follows:

Importing and Exporting OPML

ReallySimpleSnagger offers you the ability to import and export OPML files as a method to quickly configure and share your feeds with existing applications.

To import an OPML file, go to File and select Import from OPML File. A file dialog will be shown, allowing you to select the file. When completed, the file will be imported and a dialog will indicate if the file was imported successfully or not.

When the import is successful, the dialog box will indicate how many sites were imported and how many entries were ignored. Ignored entries include text-only entries or entries without a URL (or more specifically, a url, httpUrl, or xmlUrl), or sites that already exist in your settings.
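
The counting described above can be sketched roughly as follows. This is an assumption-laden illustration, not the importer itself: the function name count_import and the sample document are hypothetical, and the attribute names are simply the ones listed in the paragraph above.

```python
import xml.etree.ElementTree as ET

# Attribute names as listed in the text above.
URL_ATTRIBUTES = ("url", "httpUrl", "xmlUrl")

SAMPLE = """<opml version="1.1"><body>
  <outline text="Blogs">
    <outline text="Example" xmlUrl="http://example.com/rss.xml"/>
    <outline text="Known" xmlUrl="http://example.org/atom.xml"/>
  </outline>
</body></opml>"""


def count_import(opml_text, existing_urls):
    """Return (imported, ignored) counts for the outlines in an OPML document."""
    imported = ignored = 0
    known = set(existing_urls)
    for outline in ET.fromstring(opml_text).iter("outline"):
        url = next((outline.get(a) for a in URL_ATTRIBUTES if outline.get(a)), None)
        if url is None or url in known:
            ignored += 1      # text-only entry, or a site that already exists
        else:
            known.add(url)
            imported += 1
    return imported, ignored


# "Blogs" has no URL and "Known" already exists, so one site is imported.
print(count_import(SAMPLE, {"http://example.org/atom.xml"}))
```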

To export an OPML file, go to File and select Export to OPML File. A file dialog will be shown, allowing you to choose your filename. Once selected, the file will be created. If an error occurs, a dialog will appear indicating the error.

Note: Newly added feeds may be stored as a link instead of a feed until you have successfully updated them.

Updating

After adding a site or two, you can proceed to use the Update button (or use the F5 hotkey), which will download any content that is new since the last update.

Note: The URL specified in the sites will be downloaded each time. Referenced documents will only be downloaded if they were not seen before.

Once the process is complete, you are now ready for data synchronization.

Automating the Updates

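You can automate the updating of ReallySimpleSnagger by creating a small batch file that would look as follows:

@echo off
rem Quotes are needed because the path contains a space; adjust the path to match your installation.
"C:\Program Files\ReallySimplySnagger\RSSUpdate.exe"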

Once this batch file is written, you can add a Scheduled Task to execute it by going to Start, Control Panel, Scheduled Tasks, and running the Add Scheduled Task wizard.

Note: ReallySimpleSnagger includes update.bat to get you started.  Please note that when you upgrade ReallySimpleSnagger, this file can be overwritten.

Data Synchronization

Once ReallySimpleSnagger is downloading feeds, the next step is to synchronize that data onto your device. There are several ways to accomplish this:

Because ReallySimpleSnagger uses open standards, these are but a few of the potential schemes for synchronization.

To help automate any of these, Automating the Updates discusses how you can automate the process.

Once the content is copied, Using the Content gives you an overview of what is provided and how to use it.

Manual Synchronization

You can open the data folder directly via the Output Folder button. From here, you can select all the files and copy them onto your device. All the directories and files must be copied over.

The step-by-step recipe is as follows:

  1. Click the Output Folder button within ReallySimpleSnagger.
  2. In File Explorer, select Edit, Select All to select all the files.
  3. Select Edit, Copy to copy the files.
  4. In File Explorer, change to the folder where you want to copy the data to.
  5. Select Edit, Paste, and wait for the process to complete.

Note: If a file already exists, you will be asked whether you want to over-write it or not. You will want to select Yes to All.

Note: To save space, you may wish to not copy the Sites.xml document over.
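
If you prefer a script to the manual copy, the same steps can be sketched in Python. This is a minimal sketch under stated assumptions: it requires Python 3.8 or later, the function name copy_output_folder is hypothetical, and the two paths in the usage comment are placeholders for your own working directory and destination folder.

```python
import shutil
from pathlib import Path


def copy_output_folder(working_dir, device_dir, skip_sites_xml=True):
    """Copy every file and sub-directory, overwriting files that already exist."""
    # ignore_patterns skips any file named Sites.xml, at any depth.
    ignore = shutil.ignore_patterns("Sites.xml") if skip_sites_xml else None
    # dirs_exist_ok=True behaves like answering "Yes to All" in File Explorer.
    shutil.copytree(Path(working_dir), Path(device_dir),
                    ignore=ignore, dirs_exist_ok=True)


# Example call with placeholder paths:
# copy_output_folder(r"C:\RSSData", r"E:\RSSData")
```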

There are a couple of caveats with using this mechanism to synchronize:

Once the files are on your device, you can simply view the index.html in the directory that you have synchronized to and view the feeds in a web browser. For example, if you are using a Pocket PC, you can open File Explorer to navigate to that folder, and by clicking on index.html, it will open the document in Pocket Internet Explorer.

PocketPC and ActiveSync

Included with ReallySimpleSnagger is a service that will synchronize your device automatically.  Each time you connect your device, the updated files on your desktop will be copied over to your handheld, and once they are on your handheld, you can read them with a native web browser, such as Pocket Internet Explorer.

To start using this service, you must configure the Handheld Directory in the ActiveSync Properties.  This will be the directory on your handheld device where the content will be stored.  If you do not have this configured, when you start the service, it will state "The handheld is not yet configured.  Please edit your properties in ReallySimpleSnagger."

Once the service is configured, start the service by going to Start, All Programs, ReallySimpleSnagger, and select the ReallySimpleSnagger ActiveSync Service from the menu.  The application will start, and will show you a ReallySimpleSnagger icon in your toolbar.

Your device will automatically be synchronized each time that you connect your device to your computer, and whenever your device is connected and ReallySimpleSnagger is updated.  Generally everything is done in the background, and it will tell you when it is starting and finished synchronizing via bubble windows.

If you right click on the ReallySimpleSnagger icon in your toolbar, you will see these options:

This service can be started automatically when you start your computer.  The ActiveSync Properties has a button to enable and disable this feature.

Once the files are on your device, you can simply view the index.html in the directory that you have synchronized to and view the feeds in a web browser. For example, if you are using a Pocket PC, you can open File Explorer to navigate to that folder, and by clicking on index.html, it will open the document in Pocket Internet Explorer or your preferred web browser.

Palm and Plucker

Plucker is a Web Spider that converts HTML documents into a more compressed format, and there are several readers available on various platforms, including Palm.

To get started with Plucker, download the Plucker Desktop from http://www.plkr.org/. Once installed, you will need to create a new channel based on a Local File, which will be the index.html in your Output Folder. Afterwards, you will want to:

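To automate the updates, replace the contents of the batch file described in Automating the Updates with:

@echo off
rem Quotes are needed because the paths contain spaces; adjust the paths to match your installation.
"C:\Program Files\ReallySimplySnagger\RSSUpdate.exe"
"C:\Program Files\Plucker\plucker_desktop\plucker-desktop.exe" --update-due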

Once the data is updated, Plucker will create a file that the Conduit will copy onto your device the next time you sync. After your next HotSync, you can open Plucker on your Palm or other device to view the pages.

Using The Content

Generated HTML Indexes

In the working directory, the directory where ReallySimpleSnagger has stored its files after an Update, you will find an index.html file. This is typically the file you would start with. The index is separated into two sections, Updated Feeds (if any) and Feeds.

Updated Feeds are the feeds in which ReallySimpleSnagger noticed new content the last time that it updated them. If ReallySimpleSnagger did not notice any updates, the feed will be in the Feeds section. The Updated Feeds list is sorted in the order that ReallySimpleSnagger checked the sites; this is typically the order that you added the sites to ReallySimpleSnagger. The Feeds section, however, is sorted by the time that ReallySimpleSnagger last noticed a change to the feed.

The link name is the Title field in the Site Editor. By clicking a link, it will bring you to some content. The content depends on the actual type of file downloaded.

HTML Pages

If the URL is not a feed, the downloaded page will be displayed. The page should look exactly as it would if you browsed that web site in your current browser. (If there are any differences or a valid feed is not detected, please contact us.)

RSS and Atom Feeds

If the page is an RSS or Atom feed, the page will be a summary page. The page will show you the name and description of the site from the feed (if available), the date that the feed was last updated, and then proceed to show you two lists, the New Items and the Other Items.

New Items are items that were new at the time that you updated the feed, and Other Items are the items that were not updated. The items in the New Items list are ordered as they appear in the feed, and the Other Items are sorted from newest to oldest.
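
The two orderings can be sketched as follows. This is an illustration only: the dictionary keys link and seen, the function name split_items, and the idea that "newest" is judged by a stored date are all assumptions, since the actual data model is not described here.

```python
from datetime import datetime


def split_items(items, new_links):
    """Split feed items into (new_items, other_items).

    items: list of dicts with "link" and "seen" (datetime) keys, in feed order.
    new_links: set of links first seen during the latest update.
    """
    new_items = [i for i in items if i["link"] in new_links]        # feed order kept
    other_items = [i for i in items if i["link"] not in new_links]
    other_items.sort(key=lambda i: i["seen"], reverse=True)         # newest first
    return new_items, other_items


feed = [
    {"link": "post-4", "seen": datetime(2006, 6, 4)},
    {"link": "post-1", "seen": datetime(2006, 6, 1)},
    {"link": "post-3", "seen": datetime(2006, 6, 3)},
    {"link": "post-5", "seen": datetime(2006, 6, 4)},
]
new, other = split_items(feed, {"post-4", "post-5"})
print([i["link"] for i in new])     # feed order: post-4, post-5
print([i["link"] for i in other])   # newest first: post-3, post-1
```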

An item in these lists always appears in the same fashion. The following example demonstrates the fields:

The Item Title [Orig][Enclosure]: The description of the item, and on some blogs, the entire item.

Orig allows you to visit the originally referenced site, and this is great for sharing the link with friends. Depending on the browser you are using, you should be able to copy the link, and then be able to paste it into an e-mail.

Enclosure will only be present if you have selected the Save Enclosures option in the Site Editor and the current item has an enclosure. If it does, your browser will attempt to handle that file type.  If the item has multiple enclosures, you will see this repeated.

The description of the item comes from the site. For many sites, this will be a summary of the link. For other sites, this will be the entire content in text format. For other sites, this will be the entire content in HTML. ReallySimpleSnagger simply presents whatever they present, and it should be noted that images and other data referenced in these descriptions will not be downloaded.  Display of the description can be controlled in the Site Editor.

Clicking on the Item Title will bring you to the HTML downloaded page. See the HTML page section for details.

OPML Feeds

Because OPML feeds can show lists of lists of lists, ReallySimpleSnagger’s approach to OPML feeds is to simply present you a page of lists.

Similar to the RSS and Atom feeds, the title and last updated time will be shown at the top of the page. Following this will be the lists.

Items in the list will be described with the Title (if present), and then the Text or Description, depending on which is present. The link will be under the first of these fields. New items will be highlighted.

If the items have a link, they will be displayed. The next page you will see will either be an HTML page, an RSS or Atom Index, or maybe even another OPML index, depending on the type of page referenced. Please see the above sections for more information.

Generated RSS Feeds

If you have selected the Generate RSS Feed option in the Site Editor, ReallySimpleSnagger will generate two files. The first file is an OPML index that is found where your data is stored, and the other file is an RSS feed that is found in the folder that is named after the domain name of the site.

The OPML file is an open standard file that many RSS Readers support. You will have to consult the documentation of your RSS reader to know if it supports OPML and how to import the index.opml file. If it does have this ability, then it should import all the other RSS files. If your reader supports this, this is the recommended configuration.  The OPML file will also contain sites that are not RSS feeds, and your reader may ignore these.

If your reader does not support OPML or you have specific needs, you will need to look at the folder that is named the same as the domain name of the sites that you are generating RSS feeds for. In these directories, you will find a file that is called SiteRSSnum.xml, where num is some number. This is the file that you will want to have your RSS readers use.
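
To locate these files without browsing each folder by hand, a small script along these lines may help. It is a sketch that assumes the layout described above (one folder per domain name, directly under the data directory); the function name find_generated_feeds and the path in the usage comment are hypothetical.

```python
from pathlib import Path


def find_generated_feeds(data_dir):
    """List SiteRSS<num>.xml files found one level below data_dir."""
    prefix = "SiteRSS"
    return sorted(
        p for p in Path(data_dir).glob("*/SiteRSS*.xml")
        # Keep only names where the part after "SiteRSS" is a number.
        if p.stem[len(prefix):].isdigit()
    )


# Example call with a placeholder path:
# for feed in find_generated_feeds(r"C:\RSSData"):
#     print(feed)
```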

One other note is that you will need to make sure that your RSS reader supports the protocol file:///; unfortunately, most documentation does not refer to this, and so you may need to contact their technical support for more information.  Another consideration is that some readers require an Internet connection in order to update the new items, even though all the content is already downloaded, and this may affect the way that you use that reader.

Troubleshooting and FAQ

“You have been banned”

Certain feeds will prevent you from downloading the feed for various reasons. Usually, this is because you have read it so frequently that the server thinks that it is being attacked.

Since ReallySimpleSnagger updates when you tell it to, it is possible that during other troubleshooting, you will notice this problem. Typically when it occurs, you simply need to contact the e-mail address mentioned in the error and tell them your IP address.

How you find out your IP address depends on how you connect to the Internet. If your modem (cable, DSL, or regular) is connected directly to your machine, you can go to Start, Control Panel, Network Connections, and double-click on your network connection. From here, select the Support tab, and it will say ‘IP Address’ followed by a number. (Alternatively, you can go to Start, Run, type in command, click OK, and at the prompt, type ipconfig /all.)

If you use a router, you will have to read your router’s documentation for complete details.

The Feed I read had articles on multiple pages or requires a login.

At the moment, ReallySimpleSnagger only downloads the page referenced in the feed, and will not download subsequent pages of an article or pages that require you to log in (other than via Basic Authentication). This is, however, planned for a future version.

Downloading HTML pages (non-feeds)

One issue with downloading non-feeds, such as HTML pages, is that they will never go into the Updated list. This is, however, planned for a future version.

How much space does it use?  How long does it take?

ReallySimpleSnagger's space requirements and processing time depend on your feeds and the options relating to them.  To minimize the size, you can limit the number of items to keep or disable the saving of enclosures in the Site Editor.

The first time that you update a lot of new feeds or a large OPML document, it will take some time to update. Subsequent updates should be significantly faster.

Pictures are missing from a Blog

You may want to look through the log file to see if there are any issues processing the page. If not, certain blogs do some JavaScript hacks on images, preventing ReallySimpleSnagger from downloading them. You can contact support for more information.

What is the Sites.xml file?

The Sites.xml file is a custom data format that has a bit of RSS, all of the settings that you access in the User Interface (except those stored in the registry), and it keeps track of the files stored for the purpose of cleaning up.

In general, this file should not be touched. Updates to the file are done regularly by ReallySimpleSnagger. If you change the content, bad things could happen. Just do not do it.

You may wish to maintain a backup of this file, or at the very minimum, an OPML version of your feeds.  See Importing and Exporting OPML.

I have chosen the Generate RSS Feeds option in the Site Editor, but all the links come back as Not-Found; why?

In order to be a standards-compliant RSS feed, the link in the feed cannot be relative. Because of this, ReallySimpleSnagger needs to know the ultimate destination of your feed so that it can properly generate the feed’s content. To set this, open the Properties dialog (see Properties Menu), and in the General tab, set the Generated Feed Directory to this directory.

If this setting is blank, it will default to your Handheld directory (if configured) or your working directory.
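
To see why the destination directory matters, consider how an absolute file:/// link might be built from it. This is a sketch only, using Python's pathlib rather than anything ReallySimpleSnagger ships; the function name absolute_item_link and the example paths are hypothetical, and the directory is assumed to be a Windows path with a drive letter.

```python
from pathlib import PureWindowsPath


def absolute_item_link(feed_directory, relative_path):
    """Join a relative item path onto the feed directory and return a file:/// URL."""
    # as_uri() raises ValueError if the resulting path is not absolute.
    return (PureWindowsPath(feed_directory) / relative_path).as_uri()


print(absolute_item_link(r"C:\RSSData", "example.com/item1.html"))
# file:///C:/RSSData/example.com/item1.html
```

If the files end up in a different directory than the one used to build these links, every link points at a location that does not exist, which is what produces the Not-Found results.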

How do I customize the Look and Feel of the indexes?

To customize the Look and Feel of the indexes, edit the style.css file included in the distribution using a CSS Editor.

Some items in a feed were updated yesterday, but are now in both the recently updated items and the older items.

Some blogs will change the RSS feed at times. When this happens, if the URL changes, ReallySimpleSnagger will treat the changed URL as a new item and can maintain a reference to the old item.

Firewall Notes

The ReallySimpleSnagger ActiveSync Service uses TCP port 7849, and clients communicating with this service will use a TCP port randomly chosen in the range 7000 to 7500. As no outside services need to access these ports, they should be blocked from outside access. These port numbers are currently not configurable.

Why am I getting an exception during ActiveSync Synchronization?

There are a few common reasons that you could be getting an exception during synchronization.

ReallySimpleSnagger ActiveSync Service says 'Only one usage of each socket address (protocol/network address/port) is normally permitted'

This typically means that you have two ReallySimpleSnagger ActiveSync Services running or you have a program that is using port 7849. If you have a program that is using the same port, please let us know.
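
One way to check whether something is already listening on the port is sketched below. This is a generic TCP probe, not a ReallySimpleSnagger tool; the function name port_in_use is hypothetical, and the sketch assumes the service can be reached on the local loopback address.

```python
import socket

SERVICE_PORT = 7849  # port named in Firewall Notes


def port_in_use(port=SERVICE_PORT, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds.
        return probe.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("Port", SERVICE_PORT, "in use:", port_in_use())
```

If this reports the port as in use while no ReallySimpleSnagger ActiveSync Service is running, another program has most likely claimed it.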

Where can I get help?

The best way to get help is by contacting us directly at support@eyt.ca.