NAME
newsfeeds - determine where Usenet articles get sent
DESCRIPTION
The file /usr/local/news/etc/newsfeeds specifies how incom-
ing articles should be distributed to other sites. It is
parsed by the InterNetNews server innd(8) when it starts up,
or when directed to by ctlinnd(8).
The file is interpreted as a set of lines according to the
following rules. If a line ends with a backslash, then the
backslash, the newline, and any whitespace at the start of
the next line is deleted. This is repeated until the entire
``logical'' line is collected. If the logical line is
blank, or starts with a number sign (``#''), it is ignored.
All other lines are interpreted as feed entries. An entry
should consist of four colon-separated fields; two of the
fields may have optional sub-fields, marked off by a slash.
Fields or sub-fields that take multiple parameters should be
separated by a comma. Extra whitespace can cause problems.
Except for the site names, case is significant. The format
of an entry is:
sitename[/exclude,exclude,...]\
:pattern,pattern...[/distrib,distrib...]\
:flag,flag...\
:param
Each field is described below.
The sitename is the name of the site to which a news article
can be sent. It is used for writing log entries and for
determining if an article should be forwarded to a site. If
sitename already appears in the article's Path header, then
the article will not be sent to the site. The name is usu-
ally whatever the remote site uses to identify itself in the
Path line, but can be almost any word that makes sense; spe-
cial local entries (such as archivers or gateways) should
probably end with an exclamation point to make sure that
they do not have the same name as any real site. For exam-
ple, ``gateway'' is an obvious name for the local entry that
forwards articles out to a mailing list. If a site with the
name ``gateway'' posts an article, when the local site
receives the article it will see the name in the Path and
not send the article to its own ``gateway'' entry. See also
the description of the ``Ap'' flag, below. If an entry has
an exclusion sub-field, then the article will not be sent to
that site if any of the names specified as excludes appear
in the Path header. The same sitename can be used more than
once - the appropriate action will be taken for each entry
that should receive the article, regardless of the name -
although this is recommended only for program feeds to avoid
confusion. Case is not significant in site names.
The patterns specify which groups to send to the site and
are interpreted to build a ``subscription list'' for the
site. The default subscription is to get all groups. The
patterns in the field are wildmat(3)-style patterns, and are
matched in order against the list of newsgroups that the
local site receives. If the first character of a pattern is
an exclamation mark, then any groups matching the pattern
are removed from the subscription, otherwise any matching
groups are added. For example, to receive all ``comp''
groups, but only comp.sources.unix within the sources news-
groups, the following set of patterns can be used:
comp.*,!comp.sources.*,comp.sources.unix
There are three things to note about this example. The
first is that the trailing ``.*'' is required. The second
is that, again, the result of the last match is the most
important. The third is that ``comp.sources.*'' could be
written as ``comp.sources*'' but this would not have the
same effect if there were a ``comp.sources-only'' group.
There is also a way to subscribe to a newsgroup negatively.
That is to say, do not send this group even if the article
is cross-posted to a subscribed newsgroup. If the first
character of a pattern is an atsign ``@'', it means that any
article posted to a group matching the pattern will not be
sent even though the article may be cross-posted to a group
which is subscribed. The same rules of precedence apply in
that the last match is the one which counts. For example,
if you want to prevent all articles posted to any
"alt.binaries.warez" group from being propagated even if it
is cross-posted to another "alt" group or any other group
for that matter, then the following set of patterns can be
used:
alt.*,@alt.binaries.warez.*,misc.*
If you reverse the alt.* and alt.binaries.warez.* patterns,
it would nullify the atsign because the result of the last
match is the one that counts. Using the above example, if
an article is posted to one or more of the
alt.binaries.warez.* groups and is cross-posted to
misc.test, then the article is not sent.
See innd(8) for details on the propagation of control mes-
sages.
A subscription can be further modified by specifying ``dis-
tributions'' that the site should or should not receive.
The default is to send all articles to all sites that sub-
scribe to any of the groups where it has been posted , but
if an article has a Distribution header and any distribs are
specified, then they are checked according to the following
rules:
1. If the Distribution header matches any of the values in
the sub-field, then the article is sent.
2. If a distrib starts with an exclamation point, and it
matches the Distribution header, then the article is
not sent.
3. If Distribution header does not match any distrib in
the site's entry, and no negations were used, then the
article is not sent.
4. If Distribution header does not match any distrib in
the site's entry, and any distrib started with an exc-
lamation point, then the article is sent.
If an article has more than one distribution specified, then
each one is according to the above rules. If any of the
specified distributions indicate that the article should be
sent, it is; if none do, it is not sent - the rules are used
as a ``logical or.'' It is almost definitely a mistake to
have a single feed that specifies distributions that start
with an exclamation point along with some that don't.
Distributions are text words, not patterns; entries like
``*'' or ``all'' have no special meaning.
The flags parameter specifies miscellaneous parameters.
They may be specified in any order; flags that take values
should have the value immediately after the flag letter with
no whitespace. The valid flags are:
<size
An article will only be sent to the site if it is less
than size bytes long. The default is no limit.
>size
An article will only be sent to the site if it is
greater than size bytes long. The default is no limit.
Achecks
An article will only be sent to the site if it meets
the requirements specified in the checks, which should
be chosen from the following set:
d Distribution header required
p Do not check Path header for the sitename before
propagating (the exclusions are still checked).
Bhigh/low
If a site is being fed by a file, channel, or exploder
(see below), the server will normally start trying to
write the information as soon as possible. Providing a
buffer may give better system performance and help
smooth out overall load if a large batch of news comes
in. The value of the this flag should be two numbers
separated by a slash. The first specifies the point at
which the server can start draining the feed's I/O
buffer, and the second specifies when to stop writing
and begin buffering again; the units are bytes. The
default is to do no buffering, sending output as soon
as it is possible to do so.
Fname
This flag specifies the name of the file that should be
used if it is necessary to begin spooling for the site
(see below). If name is not an absolute pathname, it
is taken to be relative to /news/out.going. Then, if
the destination is a directory, the file togo in that
directory will be used as filename.
Gcount
If this flag is specified, an article will only be sent
to the site if it is posted to no more than count news-
groups.
Hcount
If this flag is specified, an article will only be sent
to the site if it has count or fewer sites in its Path
line. This flag should only be used as a rough guide
because of the loose interpretation of the Path header;
some sites put the poster's name in the header, and
some sites that might logically be considered to be one
hop become two because they put the posting
workstation's name in the header. The default value
for count is one.
Isize
The flag specifies the size of the internal buffer for
a file feed. If there are more file feeds then allowed
by the system, they will be buffered internally in
least-recently-used order. If the internal buffer
grows bigger then size bytes, however, the data will be
written out to the appropriate file. The default value
is (16 * 1024) bytes.
Nmodifiers
The newsgroups that a site receives are modified
according to the modifiers, which should be chosen from
the following set:
m Only moderated groups
u Only unmoderated groups
Ssize
If the amount of data queued for the site gets to be
larger than size bytes, then the server will switch to
spooling, appending to a file specified by the ``F''
flag, or /news/out.going/ sitename if the ``F'' flag is
not specified. Spooling usually happens only for chan-
nel or exploder feeds.
Ttype
This flag specifies the type of feed for the site.
Type should be a letter chosen from the following set:
c Channel
f File
l Log entry only
m Funnel (multiple entries feed into one)
p Program
x Exploder
Each feed is described below in the section on feed
types. The default is Tf.
Witems
If a site is fed by file, channel, or exploder, this
flag controls what information is written. If a site
is fed by a program, only the asterisk (``*'') has any
effect. The items should be chosen from the following
set:
b Size of the article in bytes
f Article's full pathname
g The newsgroup the article is in;
if cross-posted, then the first of the groups this
site gets
m Article's Message-ID
n Article's pathname relative to the spool directory
p The time the article was posted as seconds since epoch.
s The IP address of the site that sent the article
t Time article was received as seconds since epoch
* Names of the appropriate funnel entries;
or all sites that get the article
D Value of the Distribution header;
? if none present
H All headers
N Value of the Newsgroups header
O Overview data
R Information needed for replication
More than one letter can be used; the entries will be
separated by a space, and written in the order in which
they are specified. The default is Wn.
The ``H'' and ``O'' items are intended for use by pro-
grams that create news overview databases. If ``H'' is
present, then the all the article's headers are written
followed by a blank line. An Xref header (even if one
does not appear in the filed article) and a Bytes
header, specifying the article's size, will also be
part of the headers. If used, this should be the only
item in the list; if preceeded by other items, however,
a newline will be written before the headers. The
``O'' generates input to the overchan(8) program. It,
too, should be the only item in the list.
The asterisk has special meaning. It expands to a
space-separated list of all sites that received the
current article. If the site is the target of a funnel
however (i.e., it is named by other sites which have a
``Tm'' flag), then the asterisk expands to the names of
the funnel feeds that received the article. If the
site is fed by a program, then an asterisk in the param
field will be expanded into the list of funnel feeds
that received the article. A site fed by a program
cannot get the site list unless it is the target of
other ``Tm'' feeds.
The interpretation of the param field depends on the type of
feed, and is explained in more detail below in the section
on feed types. It can be omitted.
The site named ME is special. There should only be one such
entry, and it should be the first entry in the file. If the
ME entry has a subscription list, then that list is automat-
ically prepended to the subscription list of all other
entries. For example, ``*,!control,!junk,!foo.*'' can be
used to set up the initial subscription list for all feeds
so that local postings are not propagated unless ``foo.*
explicitly appears in the site's subscription list. Note
that most subscriptions should have ``!junk,!control'' in
their pattern list; see the discussion of ``control mes-
sages'' in innd(8). (Unlike other news software, it does
not affect what groups are received; that is done by the
active(5) file.)
If the ME entry has a distribution subfield, then only arti-
cles that match the distribution list are accepted; all
other articles are rejected. A commercial news server, for
example, might have ``/!local'' to reject local postings
from other, misconfigured, sites.
FEED TYPES
Innd provides four basic types of feeds: log, file, program,
and channel. An exploder is a special type of channel. In
addition, several entries can feed into the same feed; these
are funnel feeds, that refer to an entry that is one of the
other types. Note that the term ``feed'' is technically a
misnomer, since the server does not transfer articles, but
reports that an article should be sent to the site.
The simplest feed is one that is fed by a log entry. Other
than a mention in the news logfile, no data is ever written
out. This is equivalent to a ``Tf'' entry writing to
/dev/null except that no file is opened.
A site fed by a file is simplest type of feed. When the
site should receive an article, one line is written to the
file named by the param field. If param is not an absolute
pathname, it is taken to be relative to /news/out.going. If
empty, the filename defaults to /news/out.going/sitename.
This name should be unique.
When a site fed by a file is flushed (see ctlinnd), the fol-
lowing steps are performed. The script doing the flush
should have first renamed the file. The server tries to
write out any buffered data, and then closes the file. The
renamed file is now available for use. The server will then
re-open the original file, which will now get created.
A site fed by a program has a process spawned for every
article that the site receives. The param field must be a
sprintf(3) format string that may have a single %s parame-
ter, which will be given a pathname for the article, rela-
tive to the news spool directory. The full path name may be
obtained by prefixing the %s in the param field by the news
spool directory prefix. Standard input will be set to the
article or /dev/null if the article cannot be opened for
some reason. Standard output and error will be set to the
error log. The process will run with the user and group ID
of the /news/run directory. Innd will try to avoid spawning
a shell if the command has no shell meta-characters; this
feature can be defeated by appending a semi-colon to the end
of the command. The full pathname of the program to be run
must be specified; for security, PATH is not searched.
If the entry is the target of a funnel, and if the ``W*''
flag is used, then a single asterisk may be used in the
param field where it will be replaced by the names of the
sites that fed into the funnel. If the entry is not a fun-
nel, or if the ``W*'' flag is not used, then the asterisk
has no special meaning.
Flushing a site fed by a program does no action.
When a site is fed by a channel or exploder, the param field
names the process to start. Again, the full pathname of the
process must be given. When the site is to receive an arti-
cle, the process receives a line on its standard input tel-
ling it about the article. Standard output and error, and
the user and group ID of the all sub-process are set as for
a program feed, above. If the process exits, it will be
restarted. If the process cannot be started, the server
will spool input to a file named /news/out.going/sitename.
It will then try to start the process some time later.
When a site fed by a channel or exploder is flushed, the
server closes down its end of the pipe. Any pending data
that has not been written will be spooled; see the descrip-
tion of the ``S'' flag, above. No signal is sent; it is up
to the program to notice EOF on its standard input and exit.
The server then starts a new process.
Exploders are a superset of channel feeds. In addition to
channel behavior, exploders can be sent command lines.
These lines start with an exclamation point, and their
interpretation is up to the exploder. The following mes-
sages are generated automatically by the server:
newgroup group
rmgroup group
flush
flush site
These messages are sent when the ctlinnd command of the same
name is received by the server. In addition, the ``send''
command can be used to send an arbitrary command line to the
exploder child-process. The primary exploder is
buffchan(8).
Funnel feeds provide a way of merging several site entries
into a single output stream. For a site feeding into a fun-
nel, the param field names the actual entry that does the
feeding.
For more details on setting up different types of news
feeds, see the INN installation manual.
EXAMPLES
## Initial subscription list and our distributions.
ME:*,!junk,!foo.*/world,usa,na,ne,foo,ddn,gnu,inet\
::
## Feed all moderated source postings to an archiver
source-archive!:!*,*sources*,!*wanted*,!*.d\
:Tc,Wn:/usr/local/news/bin/archive -f -i \
/usr/spool/news.archive/INDEX
## Watch for big postings
watcher!:*\
:Tc,Wbnm\
:exec awk '$1 > 1000000 { print "BIG", $2, $3 }' >/dev/console
## A UUCP feed, where we try to keep the "batching" between 4 and 1K.
ihnp4:/world,usa,na,ddn,gnu\
:Tf,Wnb,B4096/1024:
## Usenet as mail; note ! in funnel name to avoid Path conflicts.
## Can't use ! in "fred" since it would like look a UUCP address.
fred:!*,comp.sources.unix,comp.sources.bugs\
:Tm:mailer!
barney@bar.com:!*,news.software.nntp,comp.sources.bugs\
:Tm:mailer!
mailer!:!*\
:W*,Tp:/usr/ucb/Mail -s "News article" *
## NNTP feeds fed off-line via nntpsend or equivalent.
feed1::Tf,Wnm:feed1.domain.name
peer.foo.com:foo.*:Tf,Wnm:peer.foo.com
## Real-time transmission.
mit.edu:/world,usa,na,ne,ddn,gnu,inet\
:Tc,Wnm:/usr/local/news/bin/nntplink -i stdin mit.edu
## Two sites feeding into a hypothetical NNTP fan-out program:
nic.near.net:\
:Tm:nntpfunnel1
uunet.uu.net/uunet:!ne.*/world,usa,na,foo,ddn,gnu,inet\
:Tm:nntpfunnel1
nntpfunnel1:!*\
:Tc,Wmn*:/usr/local/news/bin/nntpfanout
## A UUCP site that wants comp.* and moderated soc groups
uucpsite!comp:!*,comp.*/world,usa,na,gnu\
:Tm:uucpsite
uucpsite!soc:!*,soc.*/world,usa,na,gnu\
:Tm,Nm:uucpsite
uucpsite:!*\
:Tf,Wnb:/usr/spool/batch/uucpsite
The last two sets of entries show how funnel feeds can be
used. For example, the nntpfanout program would receive
lines like the following on its standard input:
<123@litchi.foo.com> comp/sources/unix/888 nic.near.net uunet.uu.net
<124@litchi.foo.com> ne/general/1003 nic.near.net
Since the UUCP funnel is only destined for one site, the
asterisk is not needed and entries like the following will
be written into the file:
<qwe#37x@snark.uu.net> comp/society/folklore/3
<123@litchi.foo.com> comp/sources/unix/888
HISTORY
Written by Rich $alz <rsalz@uunet.uu.net> for InterNetNews.
This is revision 1.35, dated 1996/12/17.
SEE ALSO
active(5), buffchan(8), ctlinnd(8), innd(8), wildmat(3).
Man(1) output converted with
man2html