NAME
    groupstats - create reports on newsgroup usage

SYNOPSIS
    groupstats [-Vhcs --comments] [-m *YYYY-MM*[:*YYYY-MM*] | *all*] [-n
    *newsgroup(s)*] [--checkgroups *checkgroups file*] [-r *report type*]
    [-l *lower boundary*] [-u *upper boundary*] [-b *boundary type*] [-g
    *group by*] [-o *order by*] [-f *output format*] [--filetemplate
    *filename template*] [--groupsdb *database table*] [--conffile
    *filename*]

REQUIREMENTS
    See "README" in doc.

DESCRIPTION
    This script creates reports on newsgroup usage (number of postings per
    group per month) taken from result tables created by gatherstats.pl.

  Features and options
   Time period and newsgroups
    The time period to act on defaults to last month; you can assign another
    time period or a single month (or drop all time constraints) via the
    --month option (see below).

    groupstats will process all newsgroups by default; you can limit
    processing to only some newsgroups by supplying a list of those groups
    via the --newsgroups option (see below). You can include hierarchy
    levels in the output by adding the --sums switch (see below).
    Optionally, newsgroups not present in a checkgroups file can be excluded
    from output, see --checkgroups below.

   Report type
    You can choose between different --report types: postings per month,
    average postings per month or all postings summed up; for details, see
    below.

   Upper and lower boundaries
    Furthermore you can set an upper and/or lower boundary to exclude some
    results from output via the --upper and --lower options, respectively.
    By default, all newsgroups with more postings per month than the upper
    boundary and/or fewer postings per month than the lower boundary will
    be excluded from the result set (i.e. not shown and not considered for
    average and sum reports). You can change the meaning of those boundaries
    with the --boundary option. For details, please see below.

   Sorting and formatting the output
    By default, all results are grouped by month; you can group results by
    newsgroup instead via the --group-by option. Within those groups, the
    list of newsgroups (or months) is sorted alphabetically (or
    chronologically, respectively) ascending. You can change that order (and
    sort by number of postings) with the --order-by option. For details and
    exceptions, please see below.

    The results will be formatted as a kind of table; you can change the
    output format to a simple list or just a list of newsgroups and number
    of postings with the --format option. Captions will be added by means of
    the --captions option; all comments (and captions) can be suppressed by
    using --nocomments.

    Last but not least you can redirect all output to a number of files,
    e.g. one for each month, by submitting the --filetemplate option, see
    below.

  Configuration
    groupstats will read its configuration via Config::Auto from
    newsstats.conf, which should be present in etc/, or from a configuration
    file given with the --conffile option.

    See doc/INSTALL for an overview of possible configuration options.

    You can override some configuration options via the --groupsdb option.

OPTIONS
    -V, --version
       Display version and copyright information and exit.

    -h, --help
       Display this man page and exit.

    -m, --month *YYYY-MM[:YYYY-MM]|all*
       Set processing period to a single month in YYYY-MM format or to a
       time period between two months in YYYY-MM:YYYY-MM format (two months,
       separated by a colon). By using the keyword *all* instead, you can
       set no processing period to process the whole database. Defaults to
       last month.
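
       The accepted values can be illustrated with a minimal Python sketch.
       It is not taken from groupstats itself; the names parse_month_spec
       and last_month are hypothetical, and returning None for *all* is
       just a convention chosen here.

```python
import re
from datetime import date

# One month in YYYY-MM format, months 01 to 12.
MONTH_RE = re.compile(r"\d{4}-(0[1-9]|1[0-2])")


def last_month(today):
    """Return the month before `today` as a 'YYYY-MM' string."""
    year, month = today.year, today.month - 1
    if month == 0:
        year, month = year - 1, 12
    return f"{year:04d}-{month:02d}"


def parse_month_spec(spec=None, today=None):
    """Return (first, last) month as 'YYYY-MM' strings, or None for 'all'."""
    if spec is None:
        # No --month given: default to last month.
        month = last_month(today or date.today())
        return month, month
    if spec == "all":
        return None
    parts = spec.split(":")
    if len(parts) > 2 or not all(MONTH_RE.fullmatch(p) for p in parts):
        raise ValueError(f"not in YYYY-MM[:YYYY-MM] format: {spec!r}")
    # A single month is treated as a period of one month.
    return parts[0], parts[-1]
```

       For instance, parse_month_spec("2012-01:2012-03") yields the pair
       ("2012-01", "2012-03"), and a single month yields that month twice.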

    -n, --newsgroups *newsgroup(s)*
       Limit processing to a certain set of newsgroups. *newsgroup(s)* can
       be a single newsgroup name (de.alt.test), a newsgroup hierarchy
       (de.alt.*) or a list of either of these, separated by colons, for
       example

          de.test:de.alt.test:de.newusers.*
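
       How such a list selects newsgroups can be sketched in Python. This is
       an illustration only: the function name select_groups is
       hypothetical, and shell-style matching via fnmatch is an assumption
       about the meaning of the trailing asterisk, not a description of how
       groupstats queries its database.

```python
from fnmatch import fnmatchcase


def select_groups(spec, groups):
    """Keep the groups matching any entry of a colon-separated list."""
    patterns = spec.split(":")
    # Assumption: "de.alt.*" matches every group below de.alt.
    return [g for g in groups if any(fnmatchcase(g, p) for p in patterns)]


GROUPS = [
    "de.test",
    "de.alt.test",
    "de.alt.admin",
    "de.newusers.questions",
    "de.comp.datenbanken.misc",
]
SELECTED = select_groups("de.test:de.alt.test:de.newusers.*", GROUPS)
```

       With the sample list above, SELECTED holds de.test, de.alt.test and
       de.newusers.questions.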

    -s, --sums|--nosums (sum per hierarchy level)
       Include "virtual" groups for every hierarchy level in output, for
       example:

           de.alt.ALL 10
           de.alt.test 5
           de.alt.admin 7

       See the gatherstats man page for details.

       This option does not work together with the --checkgroups option as
       all "virtual" groups will not be present in the checkgroups file.

       False by default.

    --checkgroups *filename*
       Restrict output to those newsgroups present in a file in checkgroups
       format (one newsgroup name per line; everything after the first
       whitespace on each line is ignored). All other newsgroups will be
       removed from output.

       Contrary to gatherstats, *filename* is not a template, but refers to
       a single file in checkgroups format.

       The --sums option will not work together with this option as
       "virtual" groups will not be present in the checkgroups file.
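
       The file format described above can be read with a few lines of
       Python. This is a sketch under the stated format only; the names
       read_checkgroups and filter_by_checkgroups are hypothetical, and the
       group descriptions in the sample lines are made up.

```python
def read_checkgroups(lines):
    """Collect newsgroup names: the first field of each non-empty line."""
    names = set()
    for line in lines:
        fields = line.split(None, 1)  # split at the first run of whitespace
        if fields:
            names.add(fields[0])
    return names


def filter_by_checkgroups(postings, valid):
    """Drop every newsgroup that is not listed in the checkgroups file."""
    return {group: n for group, n in postings.items() if group in valid}


SAMPLE = [
    "de.alt.test\tmade-up description\n",
    "de.test  another made-up description\n",
    "\n",
]
VALID = read_checkgroups(SAMPLE)
KEPT = filter_by_checkgroups({"de.test": 5, "de.alt.ALL": 10}, VALID)
```

       KEPT contains only de.test; the "virtual" group de.alt.ALL is
       dropped because it does not appear in the checkgroups data, which is
       why --sums and --checkgroups do not combine.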

    -r, --report *default|average|sums*
       Choose the report type: *default*, *average* or *sums*

       By default, groupstats will report the number of postings for each
       newsgroup in each month. But it can also report the average number of
       postings per group for all months or the total sum of postings per
       group for all months.

       For report types *average* and *sums*, the group-by option has no
       meaning and will be silently ignored (see below).

    -l, --lower *lower boundary*
       Set the lower boundary. See --boundary below.

    -u, --upper *upper boundary*
       Set the upper boundary. See --boundary below.

    -b, --boundary *boundary type*
       Set the boundary type to one of *default*, *level*, *average* or
       *sums*.

       By default, all newsgroups with more postings per month than the
       upper boundary and/or less postings per month than the lower boundary
       will be excluded from further processing. For the default report that
       means each month only newsgroups with a number of postings between
       the boundaries will be displayed. For the other report types,
       newsgroups with a number of postings exceeding the boundaries in all
       (!) months will not be considered.

       For example, let's take a list of newsgroups like this:

           ----- 2012-01:
           de.comp.datenbanken.misc               6
           de.comp.datenbanken.ms-access         84
           de.comp.datenbanken.mysql             88
           ----- 2012-02:
           de.comp.datenbanken.misc               8
           de.comp.datenbanken.ms-access        126
           de.comp.datenbanken.mysql             21
           ----- 2012-03:
           de.comp.datenbanken.misc              24
           de.comp.datenbanken.ms-access         83
           de.comp.datenbanken.mysql             36

       With "groupstats --month 2012-01:2012-03 --lower 25 --report sums",
       you'll get the following result:

           ----- All months:
           de.comp.datenbanken.ms-access        293
           de.comp.datenbanken.mysql            124

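       de.comp.datenbanken.misc has not been considered even though it has
       38 postings in total, because it has fewer than 25 postings in every
       single month. de.comp.datenbanken.mysql is listed with 124 instead of
       145 postings because the 21 postings from 2012-02 fall below the
       boundary and are not counted. If you want to list all newsgroups with
       more than 25 postings *in total*, you'll have to set the boundary
       type to *sums*, see below.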

       A boundary type of *level* will show a newsgroup only if it satisfies
       the boundaries in each and every single month. With the above list of
       newsgroups and "groupstats --month 2012-01:2012-03
       --lower 25 --boundary level --report sums", you'll get this result:

           ----- All months:
           de.comp.datenbanken.ms-access        293

       de.comp.datenbanken.mysql has not been considered because it had
       fewer than 25 postings in 2012-02 (only).

       You can use that to get a list of newsgroups that have more (or less)
       than x postings in every month during the whole reporting period.

       A boundary type of *average* will show a newsgroup only if it
       satisfies the boundaries on average. With the above list of
       newsgroups and "groupstats --month 2012-01:2012-03 --lower 25
       --boundary avg --report sums", you'll get this result:

          ----- All months:
          de.comp.datenbanken.ms-access        293
          de.comp.datenbanken.mysql            145

       The average number of postings in the three groups is:

           de.comp.datenbanken.misc           12.67
           de.comp.datenbanken.ms-access      97.67
           de.comp.datenbanken.mysql          48.33

       Last but not least, a boundary type of *sums* will show a newsgroup
       only if the total sum of all its postings during the reporting period
       satisfies the boundaries. With the above list of
       newsgroups and "groupstats --month 2012-01:2012-03 --lower 25
       --boundary sum --report sums", you'll finally get this result:

           ----- All months:
           de.comp.datenbanken.misc              38
           de.comp.datenbanken.ms-access        293
           de.comp.datenbanken.mysql            145
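
       The four boundary types can be compared side by side with a small
       Python sketch that reproduces the results shown above from the same
       sample data. It is an illustration, not the code used by groupstats:
       the names in_bounds and sums_report are hypothetical, and treating
       the boundaries as inclusive is an assumption.

```python
DATA = {
    "2012-01": {"de.comp.datenbanken.misc": 6,
                "de.comp.datenbanken.ms-access": 84,
                "de.comp.datenbanken.mysql": 88},
    "2012-02": {"de.comp.datenbanken.misc": 8,
                "de.comp.datenbanken.ms-access": 126,
                "de.comp.datenbanken.mysql": 21},
    "2012-03": {"de.comp.datenbanken.misc": 24,
                "de.comp.datenbanken.ms-access": 83,
                "de.comp.datenbanken.mysql": 36},
}


def in_bounds(value, lower=None, upper=None):
    """Assumption: both boundaries are inclusive."""
    return ((lower is None or value >= lower)
            and (upper is None or value <= upper))


def sums_report(data, boundary="default", lower=None, upper=None):
    """Total postings per group, honouring the given boundary type."""
    per_group = {}
    for month in sorted(data):
        for group, count in data[month].items():
            per_group.setdefault(group, []).append(count)
    report = {}
    for group, counts in sorted(per_group.items()):
        total = sum(counts)
        if boundary == "default":
            # Months outside the boundaries are simply not counted.
            kept = [c for c in counts if in_bounds(c, lower, upper)]
            if kept:
                report[group] = sum(kept)
        elif boundary == "level":
            # Every single month has to satisfy the boundaries.
            if all(in_bounds(c, lower, upper) for c in counts):
                report[group] = total
        elif boundary == "average":
            if in_bounds(total / len(counts), lower, upper):
                report[group] = total
        elif boundary == "sums":
            if in_bounds(total, lower, upper):
                report[group] = total
        else:
            raise ValueError(f"unknown boundary type: {boundary!r}")
    return report
```

       Calling sums_report(DATA, "default", lower=25) gives 293 and 124 for
       ms-access and mysql, while "level" keeps only ms-access, "average"
       gives 293 and 145, and "sums" additionally lists misc with 38.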

    -g, --group-by *month[-desc]|newsgroups[-desc]*
       By default, all results are grouped by month, sorted chronologically
       in ascending order, like this:

           ----- 2012-01:
           de.comp.datenbanken.ms-access         84
           de.comp.datenbanken.mysql             88
           ----- 2012-02:
           de.comp.datenbanken.ms-access        126
           de.comp.datenbanken.mysql             21

       The results can be grouped by newsgroups instead via --group-by
       *newsgroups*:

           ----- de.comp.datenbanken.ms-access:
           2012-01         84
           2012-02        126
           ----- de.comp.datenbanken.mysql:
           2012-01         88
           2012-02         21

       By appending *-desc* to the group-by option parameter, you can
       reverse the sort order - e.g. --group-by *month-desc* will give:

           ----- 2012-02:
           de.comp.datenbanken.ms-access        126
           de.comp.datenbanken.mysql             21
           ----- 2012-01:
           de.comp.datenbanken.ms-access         84
           de.comp.datenbanken.mysql             88

       Average and sums reports (see above) will always be grouped by
       months; this option will therefore be ignored.
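
       The grouping step can be sketched in Python as follows. The function
       name group_rows and the row layout (month, newsgroup, postings) are
       assumptions made for this illustration.

```python
ROWS = [
    ("2012-01", "de.comp.datenbanken.ms-access", 84),
    ("2012-01", "de.comp.datenbanken.mysql", 88),
    ("2012-02", "de.comp.datenbanken.ms-access", 126),
    ("2012-02", "de.comp.datenbanken.mysql", 21),
]


def group_rows(rows, group_by="month"):
    """Return a list of (group key, rows) pairs in the requested order."""
    desc = group_by.endswith("-desc")
    key = group_by[:-len("-desc")] if desc else group_by
    if key == "month":
        index = 0
    elif key == "newsgroups":
        index = 1
    else:
        raise ValueError(f"unknown group-by value: {group_by!r}")
    grouped = {}
    for row in rows:
        grouped.setdefault(row[index], []).append(row)
    # Months sort chronologically as strings because of the YYYY-MM format.
    return [(k, grouped[k]) for k in sorted(grouped, reverse=desc)]
```

       group_rows(ROWS, "month-desc") lists 2012-02 before 2012-01, and
       group_rows(ROWS, "newsgroups") yields one group per newsgroup with
       its months inside.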

    -o, --order-by *default[-desc]|postings[-desc]*
       Within each group (a single month or single newsgroup, see above),
       the report will be sorted by newsgroup names in ascending
       alphabetical order by default. You can change the sort order to
       descending or sort by number of postings instead.
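
       As an illustration, the ordering inside one group could look like
       this in Python. The name order_rows and the (label, postings) row
       layout are hypothetical; how groupstats breaks ties between equal
       numbers of postings is not specified here.

```python
def order_rows(rows, order_by="default"):
    """Sort (label, postings) rows of one group as requested."""
    desc = order_by.endswith("-desc")
    key = order_by[:-len("-desc")] if desc else order_by
    if key == "default":
        # Label is the newsgroup name (or the month when grouping by group).
        return sorted(rows, key=lambda row: row[0], reverse=desc)
    if key == "postings":
        return sorted(rows, key=lambda row: row[1], reverse=desc)
    raise ValueError(f"unknown order-by value: {order_by!r}")


MONTH_2012_01 = [
    ("de.comp.datenbanken.ms-access", 84),
    ("de.comp.datenbanken.mysql", 88),
    ("de.comp.datenbanken.misc", 6),
]
BY_POSTINGS = order_rows(MONTH_2012_01, "postings-desc")
```

       BY_POSTINGS starts with de.comp.datenbanken.mysql (88 postings) and
       ends with de.comp.datenbanken.misc (6 postings).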

    -f, --format *pretty|list|dump*
       Select the output format, *pretty* being the default:

           ----- 2012-01:
           de.comp.datenbanken.ms-access         84
           de.comp.datenbanken.mysql             88
           ----- 2012-02:
           de.comp.datenbanken.ms-access        126
           de.comp.datenbanken.mysql             21

       *list* format looks like this:

           2012-01 de.comp.datenbanken.ms-access 84
           2012-01 de.comp.datenbanken.mysql 88
           2012-02 de.comp.datenbanken.ms-access 126
           2012-02 de.comp.datenbanken.mysql 21

       And *dump* format looks like this:

           # 2012-01:
           de.comp.datenbanken.ms-access 84
           de.comp.datenbanken.mysql 88
           # 2012-02:
           de.comp.datenbanken.ms-access 126
           de.comp.datenbanken.mysql 21

       You can remove the comments by using --nocomments, see below.
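
       The three formats differ only in how each line is assembled, as the
       following Python sketch shows. The name format_report is
       hypothetical, and the column widths of the *pretty* format are
       chosen to reproduce the sample above; the real script may calculate
       them differently.

```python
def format_report(grouped, fmt="pretty", comments=True):
    """Render [(month, [(newsgroup, postings), ...]), ...] as text lines."""
    width = max((len(name) for _, rows in grouped for name, _ in rows),
                default=0)
    lines = []
    for month, rows in grouped:
        if fmt == "pretty":
            if comments:
                lines.append(f"----- {month}:")
            # Assumption: names padded to the longest one, counts
            # right-aligned in a column ten characters wide.
            lines.extend(f"{name:<{width}} {count:>10}"
                         for name, count in rows)
        elif fmt == "dump":
            if comments:
                lines.append(f"# {month}:")
            lines.extend(f"{name} {count}" for name, count in rows)
        elif fmt == "list":
            # No group headers; the month is repeated on every line.
            lines.extend(f"{month} {name} {count}" for name, count in rows)
        else:
            raise ValueError(f"unknown format: {fmt!r}")
    return lines


GROUPED = [
    ("2012-01", [("de.comp.datenbanken.ms-access", 84),
                 ("de.comp.datenbanken.mysql", 88)]),
]
```

       print("\n".join(format_report(GROUPED, "dump"))) prints the header
       "# 2012-01:" followed by one line per newsgroup; with comments=False
       the header line is left out.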

    -c, --captions|--nocaptions
       Add captions to output, like this:

           ----- Report for 2012-01 to 2012-02 (number of postings for each month)
           ----- Newsgroups: de.comp.datenbanken.*
           ----- Threshold: 10 => x <= 20 (on average)
           ----- Grouped by Newsgroups (ascending), sorted by number of postings descending

       False by default.

    --comments|--nocomments
       Add comments (group headers) to *dump* and *pretty* output. True by
       default as long as --filetemplate is not set.

       Use *--nocomments* to suppress anything except newsgroup names/months
       and numbers of postings.

    --filetemplate *filename template*
       Save output to file(s) instead of dumping it to STDOUT. groupstats
       will create one file for each month (or each newsgroup, according to
       the setting of --group-by, see above), with filenames composed by
       adding year and month (or newsgroup names) to the *filename
       template*, for example with --filetemplate *stats*:

           stats-2012-01
           stats-2012-02
           ... and so on
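
       The naming scheme can be written down as a short Python sketch. The
       function name output_files is hypothetical; joining template and
       group key with a hyphen follows the example above and is assumed to
       apply to newsgroup names as well.

```python
def output_files(template, grouped):
    """Map each output filename to the lines that belong into that file."""
    files = {}
    for key, lines in grouped:
        # One file per group key (a month or a newsgroup name).
        files[f"{template}-{key}"] = list(lines)
    return files


FILES = output_files("stats", [
    ("2012-01", ["de.comp.datenbanken.ms-access 84"]),
    ("2012-02", ["de.comp.datenbanken.ms-access 126"]),
])
```

       FILES has the keys stats-2012-01 and stats-2012-02, each holding the
       lines of one month.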

    --groupsdb *database table*
       Override *DBTableGrps* from newsstats.conf.

    --conffile *filename*
       Read configuration from *filename* instead of newsstats.conf.

INSTALLATION
    See "INSTALL" in doc.

EXAMPLES
    Show number of postings per group for last month in *pretty* format:

        groupstats

    Show that report for January of 2010 and de.alt.* plus de.test,
    including display of hierarchy levels:

        groupstats --month 2010-01 --newsgroups de.alt.*:de.test --sums

    Only show newsgroups with 30 postings or less last month, ordered by
    number of postings, descending, in *pretty* format:

        groupstats --upper 30 --order-by postings-desc

    Show the total of all postings for the year of 2010 for all groups that
    had 30 postings or less in every single month in that year, ordered by
    number of postings in descending order:

        groupstats -m 2010-01:2010-12 -u 30 -b level -r sums -o postings-desc

    The same for the average number of postings in the year of 2010:

        groupstats -m 2010-01:2010-12 -u 30 -b level -r avg -o postings-desc

    List number of postings per group for each month of 2010 and redirect
    output to one file for each month, named stats-2010-01 and so on, in
    machine-readable form (without formatting):

        groupstats -m 2010-01:2010-12 -f dump --filetemplate stats

FILES
    bin/groupstats.pl
        The script itself.

    lib/NewsStats.pm
        Library functions for the NewsStats package.

    etc/newsstats.conf
        Runtime configuration file.

BUGS
    Please report any bugs or feature requests to the author or use the bug
    tracker at <https://code.virtcomm.de/thh/newsstats/issues>!

SEE ALSO
    - "README" in doc

    - "INSTALL" in doc

    - gatherstats -h

    This script is part of the NewsStats package.

AUTHOR
    Thomas Hochstein <thh@thh.name>

COPYRIGHT AND LICENSE
    Copyright (c) 2010-2013 Thomas Hochstein <thh@thh.name>

    This program is free software; you may redistribute it and/or modify it
    under the same terms as Perl itself.

