Portscout - README

portscout is an automated system designed to search for new versions of software available in the FreeBSD ports tree. It is primarily designed for use by FreeBSD port maintainers, who can avoid trailing around dozens of websites looking for updates. However, I hope that others may find it useful too.

There is currently lots of room for optimization and general improvements. I am gradually working on updating things; any suggestions, comments, patches, etc. are more than welcome -- please let me know if the software is useful!

Those who find portscout doesn't fulfill their needs may be interested in an alternative set of scripts (sysutils/newportsversioncheck in the ports tree) by Edwin Groothuis, which do a similar job.

System Requirements

portscout requires PostgreSQL (see step 1 below) and Perl, plus a few Perl modules, among them DBI and a DBI driver for your database.

portscout will probably work with other SQL databases, although I haven't verified this. If the database system you want to use doesn't support the 'text' datatype, just edit portscout's SQL scripts and substitute the largest varchar the database supports. If the DBI driver for your database doesn't return row count data on SELECTs, portscout may behave a little strangely -- I'll get around to fixing this soon.

Usage Instructions

Steps 1-4 should be carried out after a fresh install; the others need to be repeated for day-to-day use.

1. Set up PostgreSQL

# createuser -U pgsql -P portscout
# createdb -U pgsql -E UNICODE portscout

Execute the included pgsql_init.sql script via psql:

# psql portscout portscout < sql/pgsql_init.sql

This will create the database tables for you.

2. Configure Portscout

Look at portscout.conf, and check it suits your needs. The defaults should be reasonable for most people. You can reduce num_children and workqueue_size if you don't want portscout sucking up all your resources.

Please note that portscout's internal defaults differ from the defaults in portscout.conf. Without a config file, portscout tries to be "portable" and stores everything under its own directory; if a config file is found, it assumes it is installed and being used "system-wide".

Any of the options in portscout.conf can also be set on the command line. E.g.:

$ portscout.pl --precious_data --num_children=8

3. Update ports tree

# cvsup ports-supfile && cd /usr/ports && make index

4. Build Database

This only needs to be done after a fresh install.

$ portscout.pl build

This takes around 70 minutes for me.

5. Rebuild Database

This step needs to be carried out whenever you cvsup / make index. It is NOT required straight after the initial 'portscout.pl build', which already covers the whole tree.

This is designed to save time by only updating the ports that have changed.

# cvsup -L1 ports-supfile > ./_supdata/cvsup-`date '+%Y%m%d'`.log
$ portscout.pl rebuild

portscout will operate on every file under sup_data_dir that matches the glob cvsup*.log; it will rename each file by prepending an underscore when it is finished. If cache_ms_data is set to false, then the files will be removed instead.
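
The log handling described above can be pictured with a small sh(1) sketch. This is not portscout's own code (portscout is written in Perl); the function name process_sup_logs and its KEEP argument are made up for illustration, and the actual parsing of each log is left out.

```sh
# Hypothetical illustration only; not part of portscout.
# usage: process_sup_logs DIR KEEP
#   DIR   directory holding the logs (stands in for sup_data_dir)
#   KEEP  "true" to rename processed logs, anything else to delete them
process_sup_logs() {
    dir=$1
    keep=$2
    for f in "$dir"/cvsup*.log; do
        [ -e "$f" ] || continue    # the glob matched nothing
        # ... a real implementation would read the log here ...
        if [ "$keep" = "true" ]; then
            # Prepend an underscore to mark the log as processed.
            mv "$f" "$dir/_$(basename "$f")"
        else
            rm "$f"
        fi
    done
}
```

Because a renamed file starts with an underscore, it no longer matches cvsup*.log, so a later run will not process the same log twice.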

Make sure that the output of every run of cvsup is given to portscout, or it will miss updates!

Rebuilding Without CVSup Logs

If you are unable to provide CVSup logs to portscout, it is possible, as a work-around, to use INDEX file diffs instead. For example, the following simple sh(1) script could be used to create a log in CVSup format for portscout:

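# Emit a CVSup-style "Touch" line for every port whose INDEX entry changed.
# Field 2 of an INDEX line is the port's directory under /usr/ports.
diff -urN INDEX-6.old INDEX-6         \
    | awk -F'\|' '/^[-+]/ {print $2}' \
    | sed -e 's#^/usr/ports/\([^/]*\)/\(.*\)$# Touch ports/\1/\2/Makefile#' \
    | sort -u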

This might flag a few untouched ports as updated. As with the CVSup method, make sure that portscout knows about any and all ports tree changes.
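
To see what the work-around produces, the awk/sed/sort stages can be fed a few sample diff lines. The port entries below are invented and cut down to their first three fields, and the function names sample_diff and index_diff_to_log are only for this illustration.

```sh
# Sample unified-diff output for two INDEX files (entries are made up).
sample_diff() {
    cat <<'EOF'
--- INDEX-6.old
+++ INDEX-6
@@ -1,2 +1,2 @@
-foo-1.0|/usr/ports/devel/foo|/usr/local|
+foo-1.1|/usr/ports/devel/foo|/usr/local|
 bar-2.0|/usr/ports/net/bar|/usr/local|
EOF
}

# The same awk, sed and sort stages as in the script above, reading stdin.
index_diff_to_log() {
    awk -F'\|' '/^[-+]/ {print $2}' \
        | sed -e 's#^/usr/ports/\([^/]*\)/\(.*\)$# Touch ports/\1/\2/Makefile#' \
        | sort -u
}

sample_diff | index_diff_to_log
```

The changed port shows up once as " Touch ports/devel/foo/Makefile"; the unchanged net/bar entry is ignored. The two diff header lines also match the awk pattern and come through as a single empty line. Redirecting the output to a file matching cvsup*.log under sup_data_dir lets 'portscout.pl rebuild' pick it up.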

Rebuilding Without ANY Logs

If you have some restrictions in place in portscout.conf - i.e. you only intend to use portscout for a handful of ports - then building the database each time from scratch may be easier. In this case, ensure the initial build was done with the restrictions in place, and after each cvsup / make index, run the standard full build:

$ portscout.pl build

6. Run Version Checks

$ portscout.pl check

This will instruct portscout to search for new distfiles for each port in the database.

7. Generate HTML Reports

$ portscout.pl generate

This will put HTML pages inside html_data_dir - existing pages will be deleted.

Checking Algorithm

For anyone who is interested in how portscout operates, here is a summary of the checking algorithm it uses:

Test 1:

  1. Order master sites using previous reliability data.
  2. Attempt to get an FTP listing or web server index from each site.
  3. Extract version from files found; compare to current version.
  4. Skip other tests if new or current version is found.
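
Steps 3 and 4 of Test 1 can be sketched in sh(1) roughly as follows. This is only an approximation under simplifying assumptions: distfiles are assumed to be named NAME-VERSION.SUFFIX with purely numeric, dot-separated versions, and the helper names extract_versions, version_gt and check_listing are invented here. portscout's real matching is done in Perl and may well differ.

```sh
# Hypothetical helpers; names and matching rules are illustrative only.

# Print the version part of every "NAME-VERSION.SUFFIX" line on stdin.
extract_versions() {
    sed -n "s/^$1-\([0-9][0-9.]*\)\.$2\$/\1/p"
}

# Succeed (exit 0) if dotted numeric version $1 is newer than $2.
version_gt() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            p = (i <= na) ? x[i] + 0 : 0    # missing parts count as 0
            q = (i <= nb) ? y[i] + 0 : 0
            if (p != q) exit !(p > q)
        }
        exit 1                              # versions are equal
    }'
}

# usage: check_listing NAME SUFFIX CURRENT < listing
# Print the newest version in the listing if it is newer than CURRENT.
check_listing() {
    current=$3
    best=$current
    for v in $(extract_versions "$1" "$2"); do
        if version_gt "$v" "$best"; then
            best=$v
        fi
    done
    if [ "$best" != "$current" ]; then
        echo "$best"
    fi
}
```

Comparing the parts as numbers rather than as strings matters: a plain string sort would rank 1.4.2 above 1.10.0.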

Test 2:

  1. Increment each part of the port's version string in turn and attempt to download the resulting file, e.g. for 1.4.2, try 2.0.0, 1.5.0 and 1.4.3.
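
The version guessing in Test 2 can be illustrated with a short sh(1) function. It assumes a purely numeric, dot-separated version string, and the name next_versions is made up for this sketch; it only generates the candidate versions and does not attempt any download.

```sh
# Hypothetical helper: print the candidate versions to try for VERSION.
# Each part is incremented in turn and every part after it is reset to 0.
next_versions() {
    echo "$1" | awk -F. '{
        for (i = 1; i <= NF; i++) {
            out = ""
            for (j = 1; j <= NF; j++) {
                if (j < i)       part = $j        # keep leading parts
                else if (j == i) part = $j + 1    # bump this part
                else             part = 0         # reset trailing parts
                out = out (j > 1 ? "." : "") part
            }
            print out
        }
    }'
}

next_versions 1.4.2
```

For 1.4.2 this prints 2.0.0, 1.5.0 and 1.4.3, one per line. Versions with letters or other suffixes would need extra handling that this sketch does not attempt.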


Copyright © 2006-2007, Shaun Amott. All rights reserved.