SOME EXPLANATION

News was originally transmitted in "batches," along with e-mail, hopping from box to box using the UUCP protocol. Most of the boxes were UNIX machines of various sorts, running at universities or larger companies, and the UUCP transfers took place late at night when the calls were cheapest. In those days, "Net Access" meant "access to Usenet news and to e-mail," and propagation cross-country and back could take 3 to 5 days (or more). UUCP is still used as a "transport" mechanism for news (and for e-mail), but most of the news and e-mail traffic has long since migrated to that global TCP/IP network called "the Internet." You may have heard of it...
DEFINITIONS"News Transfer" is the process of moving the actual news articles around (articles that have already been "injected" into the Usenet news network). This is now usually done via the NNTP (Network News Transfer Protocol), which runs on top of TCP/IP. "News Reading" is the process of querying a machine's stored database of news articles and groups - and also of "posting" news. News posting refers to the original place that a given article is injected into the news system. When an article is posted (not transferred), it is given a globally-unique message ID which identifies the article as it passes from system to system.
HOW NEWS IS PASSED AROUND

Since the beginning of Usenet, the idea was to avoid having one or more key central sites without which the system would fall apart. So the system was designed for minimal intelligence and maximal redundancy. In general, every news server peers with at least one other news server, and automatically offers any article it receives to all of its news peers (except the peer it heard the article from). So if you have one news peer, you'll offer back only articles that originated locally - but if you have two news peers, you'll offer each peer your own local articles plus the articles learned from the other peer. And if you have 10 news peers and one of them is much faster than the others, you'll end up offering hundreds of thousands of articles a day to each of the other nine peers. Actually, I'm lying a bit here. You do have the ability to restrict which articles you send to which peers - you don't have to offer everyone a "full feed."
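To make the flooding behavior concrete, here is a minimal sketch of that rule - offer each article to every peer except the one that sent it, and use the history of already-seen message IDs to stop duplicates. The names here (Peer, offer(), NewsServer) are invented for illustration; real servers such as INN keep the history in an on-disk database, not a Python set.

    # Sketch of Usenet-style flood propagation with an in-memory history set.

    class Peer:
        def __init__(self, name):
            self.name = name

        def offer(self, message_id, article):
            # In real life this is an IHAVE (or streaming) exchange over NNTP;
            # here we just pretend the peer accepts everything.
            print(f"offering {message_id} to {self.name}")


    class NewsServer:
        def __init__(self, peers):
            self.peers = peers        # list of Peer objects
            self.history = set()      # message IDs we have already seen

        def receive(self, message_id, article, from_peer=None):
            if message_id in self.history:
                return                # duplicate: drop it, don't re-flood
            self.history.add(message_id)
            for peer in self.peers:
                if peer is not from_peer:     # never offer it back to the sender
                    peer.offer(message_id, article)


    # A locally posted article (from_peer=None) goes to every peer; an article
    # learned from peer A goes to everyone except A.
    server = NewsServer([Peer("peerA"), Peer("peerB"), Peer("peerC")])
    server.receive("<1998abc@example.host>", "article body...")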
NEWS PEERING - WHO'S THE CUSTOMER?

With the Usenet system, it's hard to tell who's the downstream or customer end of a news peering session (vs. who's the upstream or provider end). Everyone peers with everyone else, and may the fastest box win.
NEWS STORAGE

The actual news articles are stored in the UNIX file system, and each newsgroup has a directory. So alt.binaries is the /news/alt/binaries directory, and article 5 in alt.binaries is found at /news/alt/binaries/5. alt.binaries.really.really.sticky is found at /news/alt/binaries/really/really/sticky, and so on. Each system has a history database which keeps track of the news articles that have already been seen by the news system. These articles may currently exist on-disk; they may be older articles that at one point were on-disk but have since expired; or they may be articles that were not in a newsgroup carried by the system. Even if an article is not to be stored and kept around on a given news server, its message ID should be noted in the history database, so you don't waste the bandwidth and CPU time to retrieve the article again and then have to make the same "not interested" determination. A point about news: yes, it's true. Unless you tell your news peers "don't send me alt.some.new.group.that.someone.created," you must first receive an article that has that group listed before you can decide to toss it. This means that you can waste tons and tons of bandwidth and CPU time receiving binaries articles that you're never going to store.
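As a rough illustration of that directory mapping (assuming a spool rooted at /news, as in the examples above), converting a group name plus article number into a spool path is just string manipulation:

    import os

    SPOOL_ROOT = "/news"   # spool root used in the examples above

    def article_path(group, artnum):
        """Map a newsgroup name and article number to its spool file."""
        # alt.binaries.really.really.sticky article 5
        #   -> /news/alt/binaries/really/really/sticky/5
        return os.path.join(SPOOL_ROOT, group.replace(".", "/"), str(artnum))

    print(article_path("alt.binaries", 5))   # /news/alt/binaries/5
    print(article_path("alt.binaries.really.really.sticky", 5))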
NEWS SYSTEM ARCHITECTURE

There's a single process, the inn "daemon" - called innd - that lives off in the background and handles all news-connection requests and all of the news-feeding tasks. Any host wanting to talk to the innd and transfer news has to be listed in the hosts.nntp file. Any other host is handed off by innd to an nnrpd (Network News Reading Protocol Daemon, which speaks a reading-oriented subset of NNTP). If that host is listed in the nnrp.access file, the nnrpd will talk to it and let it read news - otherwise it'll deny it access. Each nnrpd handles only one news reader at a time, while the innd process handles many (potentially hundreds of) simultaneous news-transfer sessions.
THE MOST IMPORTANT FILE

The most important file in the news system is the "active file." This is a list of every newsgroup the system will carry; the minimum and maximum article numbers currently on-disk in each newsgroup; and whether or not the group is moderated. The active file is maintained by the innd process. You use the "ctlinnd" program to tell the innd process to add or delete groups. As news is posted and transferred in, the innd process updates its in-memory idea of the maximum article number for each group. innd writes the active file out to disk every N minutes (N is a tunable parameter). The nnrpds (nowadays) all share a read-only copy of the active file - which is good, since it's usually at least a few hundred kilobytes and often a megabyte or two. The size of the active file is one of the major reasons not to throw in thousands of unused extra newsgroups (i.e., "We have all 45,000 newsgroups out there!"). Before the "shared-active patch" (which is now not a patch but is built into the inn distribution), each nnrpd loaded and refreshed its own copy of the active file, which was a huge waste of memory!
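For reference, each line of the active file has the form "groupname himark lomark flag," where the flag is y (posting allowed), n (no local posting), m (moderated), and so on. Here's a minimal parser; the sample group lines are made up for illustration:

    # Parse active-file lines of the form: "groupname himark lomark flag"
    SAMPLE_ACTIVE = """\
    comp.lang.c 0000051234 0000049876 y
    news.announce.newgroups 0000001823 0000001799 m
    alt.binaries.pictures.misc 0000412345 0000398712 y
    """

    def parse_active(text):
        groups = {}
        for line in text.splitlines():
            name, hi, lo, flag = line.split()[:4]
            groups[name] = {
                "max": int(hi),          # highest article number on-disk
                "min": int(lo),          # lowest article number on-disk
                "moderated": flag == "m",
            }
        return groups

    for name, info in parse_active(SAMPLE_ACTIVE).items():
        print(name, info)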
NEWS.DAILY: THE OVERNIGHT "THING"

Currently, inn requires an overnight cleanup session to purge old news from the news store, and to process logs and clean up some of the databases. The script news.daily, usually run by the cron daemon, takes care of this. For most of the time that news.daily runs, the news system is still available to handle new article posts, but you should expect 15 to 45 minutes of news-server unavailability overnight (unless you modify inn) as news.daily finishes. Basically, news.daily's job is to run expire and then re-update the databases. Expire looks at the time stamps of the entries in the history database, figures out which articles (out of the hundreds of thousands or millions you'll probably have on-disk) need to be deleted, and then goes through the process of removing them. Once that's done, the overview indexes in each directory are rebuilt, and then the server pauses to renumber. This is the period where you'll have to modify the inn system if you want to be able to accept posts 24x7. The renumbering process involves looking at each news directory (potentially tens of thousands of them) and updating the active file's notion of the minimum and maximum article numbers for each group. As mentioned, logs are processed, rotated, and compressed, and the summary report(s) are mailed to the news administrator(s) of the system.
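Conceptually, expire is just a sweep over the history entries. The sketch below assumes a much-simplified history format (message ID, arrival time, spool path) purely for illustration - the real INN history file has more fields and a companion dbz index - but it shows the basic "delete what's too old, keep the rest" pass:

    # Simplified sketch of an expire pass -- not the real INN implementation.
    # Assumed toy history format, one entry per line:
    #     <message-id> <arrival-unix-time> <spool-path>
    import os
    import time

    KEEP_SECONDS = 7 * 24 * 3600    # e.g. keep articles for one week

    def expire(history_in, history_out, now=None):
        now = now or time.time()
        kept = removed = 0
        with open(history_in) as src, open(history_out, "w") as dst:
            for line in src:
                msgid, arrival, path = line.split()
                if now - float(arrival) > KEEP_SECONDS:
                    try:
                        os.unlink(path)          # old article: remove the spool file
                    except FileNotFoundError:
                        pass
                    removed += 1
                    # (Real expire remembers the message ID a while longer so a
                    # just-expired article isn't accepted all over again.)
                else:
                    dst.write(line)              # still fresh: keep the history entry
                    kept += 1
        return kept, removed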
NEWS FEEDING: NON-STREAMING
The original NNTP protocol had each peer say to a news peer: "IHAVE <message-id>." The receiving peer answers 335 ("send me the article") or 435 ("I don't want it"), and if the article is wanted, the sender transmits it and waits for a 235 acknowledgment.
But there's a problem with using that protocol when your latency (the round-trip time to send data from site A to site B and back to site A) isn't very low - at least, not with today's news loads.
Suppose you're trying to send six articles per second. Let's do the math. If you assume that transferring each article takes only as much time as the inter-machine latency (not a good assumption, but an excellent simplification), we have: 1 second / 12 = 83ms. Twelve is the number of round-trip communications per second, since each article costs an IHAVE/335 round trip plus an article-transfer/235 round trip, and 6 articles x 2 round trips = 12.
Of course, it usually takes longer than simply the round-trip latency to transfer an article - especially if
the article is a few hundred kilobytes in size.
Anyway, it's apparent that if the latency goes above 83ms between the two ends of a news peering session,
a full feed isn't possible.
The situation is even worse over saturated 56K and T-1 links, and over satellite and trans-oceanic links, where round-trip times of 500ms and up are common.
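To put numbers on that, here's a quick back-of-the-envelope calculation - using the same simplification as above, that each article costs exactly two round trips and nothing else - of the best-case article rate for the non-streaming IHAVE exchange at various round-trip times:

    # Best-case non-streaming feed rate, assuming each article costs exactly
    # two round trips (offer/response, then article/acknowledgment).
    ROUND_TRIPS_PER_ARTICLE = 2

    def max_articles_per_second(rtt_seconds):
        return 1.0 / (ROUND_TRIPS_PER_ARTICLE * rtt_seconds)

    for rtt_ms in (10, 83, 250, 500):
        rate = max_articles_per_second(rtt_ms / 1000.0)
        print(f"RTT {rtt_ms:3d} ms -> at most {rate:5.1f} articles/sec "
              f"(~{rate * 86400:,.0f} articles/day)")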
We'll skip the implementation details for now, but the "streaming extensions to NNTP" are commonly used. Basically, a message is sent saying "Here are 10 message IDs. Which ones do you want?" The responder gives back the list of message IDs it wants, and the sender sends them all. Though roughly the same amount of data is sent, there are far fewer "latency" delays.
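Since the article skips the details, treat this as a rough sketch only: the streaming extension uses the CHECK and TAKETHIS commands. A batch of CHECK <message-id> offers goes out without waiting for replies, the peer answers 238 ("send it") or 438 ("already have it"), and the wanted articles follow via TAKETHIS. The send_line(), read_line(), and get_article() helpers below are assumed placeholders for the real connection handling:

    # Toy illustration of the streaming offer/response flow; dot-stuffing,
    # error handling, and the real socket code are all omitted.

    def stream_batch(message_ids, get_article, send_line, read_line):
        # Fire off all the offers first -- no waiting in between.
        for msgid in message_ids:
            send_line(f"CHECK {msgid}")

        # Read the responses: 238 means "send it", 438 means "already have it".
        wanted = []
        for _ in message_ids:
            code, msgid = read_line().split(None, 1)
            if code == "238":
                wanted.append(msgid.strip())

        # Push the wanted articles, again without per-article round trips.
        for msgid in wanted:
            send_line(f"TAKETHIS {msgid}")
            send_line(get_article(msgid))   # the article text
            send_line(".")                  # end-of-article marker
        # The 239/439 acknowledgments for TAKETHIS are read afterwards.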
Well, what's so hard about designing a news server? Disks. You need disks. Lots of disks. Yes, you also need a fairly powerful machine: something like a Sun Sparc 10 with a 60 to 80 MHz CPU; or a P120 or greater running some sort of BSD or Linux; or an Alpha with a bit of cache RAM (the Multias won't do); and so on. For most architectures, 128 MB of RAM will be enough to support 5 to 30 simultaneous news readers, but more memory never hurts and memory is cheap.
But about disks, also called "spindles": the problem is that each article that comes in causes a write to the news spool (the article file itself), to the history database, to the .overview data for each group the article appears in, and to the news logs.
And a full news feed of 600,000 articles a day means that you have to keep up with roughly 7 articles per second on average - and with peaks of maybe 20 to 30 articles per second. Even if you take only 200,000 articles per day, you still have to write a couple of articles per second, and do the bookkeeping associated with that, every second. Then, overnight, you have to search for a full day's load of articles and expire it!
Additionally, you have to support the nnrpds, which want to retrieve articles and .overview files (most of
the I/O done by nnrpds is now .overview lookups).
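A rough way to size that write load (ignoring reader traffic) is articles per day divided by 86,400 seconds, times the number of separate places each article touches on disk:

    # Back-of-the-envelope write load, assuming each incoming article touches
    # roughly four places on disk (spool file, history, overview, logs).
    WRITES_PER_ARTICLE = 4

    def write_load(articles_per_day):
        per_sec = articles_per_day / 86400
        return per_sec, per_sec * WRITES_PER_ARTICLE

    for label, per_day in (("full feed", 600_000), ("partial feed", 200_000)):
        rate, writes = write_load(per_day)
        print(f"{label:12s}: ~{rate:.1f} articles/sec, ~{writes:.1f} writes/sec "
              "(before peaks or reader traffic)")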
A fairly ideal news disk layout is: one disk for the operating system; one for swap; one for the history database; one for /var/log/news; one for the .overview files; and several disks for the news spool itself.
In reality, few can afford that many disks. So what you do is make trade-offs. The most common trade-off
is to put the .overview files in with the news spool disks. This should be fine until you start getting more
than 50 or so simultaneous news readers (nnrpds) running. Often, there is no separate swap disk, which is
acceptable if you have 256 MB or so of RAM. Under no circumstances, though, should /var/log/news be
on the same disk with the history database - and neither should be on a news spool disk.
With the above configuration, you should be able to hold a week or so of non-binaries groups and two to
three days of binaries groups - binaries groups are any groups with "binaries" or "sex" in the title -
depending on how many of the binaries groups you accept.
NNTP is a text protocol. This means that you can just Telnet to port 119 on a news server; type NNTP
commands; and see the same responses that a news reading or transferring peer would see.
One simple way to know that your innd is very overloaded is to test the time it takes to Telnet to port 119
on it; get a welcome banner; type "QUIT"; and have the connection close. If any of these operations takes
much more than half of a second, your news box is getting overloaded. If it takes many seconds, it's
seriously overloaded.
This is a test of the "select loop" - how fast innd can come around and service each request.
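Here's a small script that automates exactly that check - connect to port 119, wait for the banner, send QUIT, and time the whole exchange. The host name news.example.com is a placeholder; substitute your own server:

    # Time the connect / banner / QUIT exchange against a news server.
    import socket
    import time

    HOST, PORT = "news.example.com", 119   # placeholder host -- use your own

    start = time.time()
    with socket.create_connection((HOST, PORT), timeout=10) as s:
        f = s.makefile("rwb")
        banner = f.readline()       # e.g. "200 ... InterNetNews server ..."
        f.write(b"QUIT\r\n")
        f.flush()
        f.readline()                # the "205" goodbye line
    elapsed = time.time() - start

    print(banner.decode(errors="replace").strip())
    print(f"banner + QUIT took {elapsed * 1000:.0f} ms"
          + ("  <-- looks overloaded" if elapsed > 0.5 else ""))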
Usually the bottleneck is disk I/O. If the innd is waiting for disks to spin so it can deposit articles or log
data, it'll fall behind and not be able to deal with other requests (like requests for new connections and
even simple requests to quit). You can use the "iostat -D 1" command under most UNIX OS flavors to see
this. If the percent use is near 100 for many seconds, you've got overloaded disks.
If you're running an older version of innd (say, before 1.5.1), sites that stream to you can slow down the
innd "select" loop. The problem is that innd would sit and read many articles from a given peer before
coming along to service the next peer waiting for innd's attention.
The fix was to make each read() from a remote site read at most 2K or 4K or so before going back to the select loop to service other peers. Normally this is a bad thing - increasing the number of system calls (read() and select() are system calls) increases operating-system overhead on the machine - but since innd isn't (yet) multi-threaded, the only other choice would have been to disable streaming from remote sites. The fix is called the "streaming patch," and you should apply it if your select loop is slow but there doesn't appear to be any disk I/O bottleneck.
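The idea behind that patch, in miniature: cap how much you read from any one ready connection before returning to the event loop, so a fast streaming peer can't starve everyone else. This is a generic single-threaded sketch using Python's selectors module, not INN's actual C code:

    # Generic event loop with a per-wakeup read cap -- the same fairness idea
    # as the innd "streaming patch", not the actual implementation.
    import selectors
    import socket

    MAX_READ = 4096      # read at most this much per peer per loop iteration

    sel = selectors.DefaultSelector()

    def serve(listen_port):
        listener = socket.socket()
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("", listen_port))
        listener.listen()
        listener.setblocking(False)
        sel.register(listener, selectors.EVENT_READ, data=None)

        while True:
            for key, _events in sel.select():
                if key.data is None:                  # new connection arriving
                    conn, _addr = key.fileobj.accept()
                    conn.setblocking(False)
                    sel.register(conn, selectors.EVENT_READ, data=b"peer")
                else:
                    conn = key.fileobj
                    chunk = conn.recv(MAX_READ)       # bounded read, then move on
                    if not chunk:
                        sel.unregister(conn)
                        conn.close()
                    # A real server would parse NNTP commands from 'chunk' here;
                    # by not draining the socket, other peers get serviced promptly.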
Stock innd will store all of the 100,000+ cancel messages in one huge directory. If allowed to accumulate,
this can cause the expire times to balloon by many hours. Reading from a UNIX directory with tens or
hundreds of thousands of entries takes forever! These "cancel control messages" are stored in the spool
directory for the phantom group control.cancel.
The answer is to run a process every few minutes to wipe out any files (including .overview) in your
control.cancel directory. (Depending on where you put your news spool, this might be called
/var/spool/news/control/cancel).
You can add a UNIX crontab entry to do this (the command "crontab -e news" will edit the news user's crontab file on many UNIX flavors). The line looks like:
0,10,20,30,40,50 * * * * (cd /var/spool/news/control/cancel ; rm *)
This makes sure that there will only be a couple of thousand entries in the control.cancel directory when expire runs.
Running a news server is like having a baby. It's more expensive than you could ever initially imagine,
both in terms of equipment and especially in terms of man-hours. There are companies that will let you
point your users at their news servers. These "news-reading" companies include zippo.com,
supernews.com (the oldest of the bunch), as well as new-comers like newsread.com (owned by the author)
and ispnews.com .
Things to look for in a news-reading provider are:
Almost all news-reading providers provide a free trial period, so take advantage of it and get your "news-hog" users to test the services out for you.