newscache: Thread: Re: Any hints on newscache?


Subject: Re: Any hints on newscache?
From: Tilman Linneweh ####@####.####
Date: 8 Jan 2004 22:36:13 -0000
Message-Id: <20040108220345.GA851@huckfinn.arved.de>

Hello Mike,

* Mike Harding [Thu, Jan 08, 2004 at 11:32:12AM -0800]:
> I got this set up and when it works, it works great, but it seems to
> hang at other times.  I am using a news feed at earthlink.  I have
> been using leafnode for a while but newscache seems to be more of what
> I want.  How does it work for you?  What release are you using?  I am
> using 4.9-STABLE...

I am sorry, I don't use it in a production environment at the moment,
so I can't comment on this. In my test environment (4.9-RC1)
it just caches from the local INN, where I didn't notice any hangs
(but testing was very limited).

I will CC this reply to the newscache mailing list; perhaps someone else
has some ideas. Can you describe the hangs a bit more?

regards
tilman

Subject: AW: Any hints on newscache?
From: Straub Herbert ####@####.####
Date: 9 Jan 2004 08:47:30 -0000
Message-Id: <3365692EA1026A498A5DB41801AE2DB50290D640@xcwrk2.xund.magwien.gv.at>

If you have a slow connection to the upstream host, then downloading the
active db can take some time (list active --> 3.5 MB). What you can do:

1) In /etc/syslog.conf, insert news.debug /var/log/news.debug and see what
newscache is doing when it "hangs".
2) ./configure --enable-debug --> more debug messages


If the list active is what takes the time, then:
3) Change the active database timeout; see the timeouts parameter in
man newscache.conf.
4) You can also use updatenews to fetch the active database.
HTH
Herbert

Subject: Re: AW: Any hints on newscache?
From: Mike Harding ####@####.####
Date: 9 Jan 2004 18:47:20 -0000
Message-Id: <1073672280.16655.18.camel@netcom1.netcom.com>

The connection isn't slow - it just stops.  I have never had a problem
with leafnode or a direct connect.

Here's what I see in the external connection (via ethereal)

...
group rec.pets.cats.rescue
211 789 28054 28931 rec.pets.cats.rescue
group rec.pets.dogs.info
211 334 1992 2454 rec.pets.dogs.info
group rec.pets.ferrets
211 80 2369 2464 rec.pets.ferrets
group relcom.comp.virus
211 36 9792 9841 relcom.comp.virus
^ hangs here
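
The 211 lines in the trace above are GROUP replies in the RFC 977 layout: status code, estimated article count, lowest and highest article number, then the group name. The low/high pair is what shows up as "groupsize [low,high]" in the syslog output. A minimal sketch of reading such a line; the names GroupReply and parse_group_reply are invented for this illustration and are not taken from NewsCache or tin:

```cpp
#include <sstream>
#include <string>

// Fields of a "211" GROUP reply as laid out in RFC 977:
// 211 <count> <low> <high> <group>
struct GroupReply {
    long count;        // estimated number of articles
    long low;          // lowest article number
    long high;         // highest article number
    std::string name;  // group name
};

// Hypothetical helper: parse one reply line into `out`.
// Returns false (leaving `out` untouched) if the line is not a
// well-formed 211 reply, e.g. "480 authentication required".
bool parse_group_reply(const std::string& line, GroupReply& out) {
    std::istringstream in(line);
    int code = 0;
    GroupReply tmp;
    if (!(in >> code >> tmp.count >> tmp.low >> tmp.high >> tmp.name))
        return false;
    if (code != 211)
        return false;
    out = tmp;
    return true;
}
```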

Here's what's in syslog (all.log, showing all messages)

...
Jan  9 09:57:06 netcom1 NewsCache[53892]: RServer::issue: issue group rec.pets.dogs.info
Jan  9 09:57:06 netcom1 NewsCache[53892]: groupsize [1992,2454]
Jan  9 09:57:06 netcom1 NewsCache[53892]: localhost [127.0.0.1] GROUP rec.pets.ferrets
Jan  9 09:57:06 netcom1 NewsCache[53892]: RServer::issue: issue group rec.pets.ferrets
Jan  9 09:57:06 netcom1 NewsCache[53892]: groupsize [2369,2464]
Jan  9 09:57:06 netcom1 NewsCache[53892]: localhost [127.0.0.1] GROUP relcom.comp.virus
Jan  9 09:57:06 netcom1 NewsCache[53892]: RServer::issue: issue group relcom.comp.virus

The external connection is still open

netcom1# sockstat | grep 119
news     newscach 53892    5 tcp4   127.0.0.1:119         127.0.0.1:14227
news     newscach 53892    7 tcp4   192.168.0.2:14228     207.217.77.203:119
mvh      tin      53891    3 tcp4   127.0.0.1:14227       127.0.0.1:119
mvh      tin      53891    4 tcp4   127.0.0.1:14227       127.0.0.1:119

..btw, does each spawned copy of newscache open its own connection?

One thing I noticed is that I seem to always see a bunch of NNTP
request/response in ethereal, then I see a naked ACK, with no data in
it, at the point that the hang occurs.  It's possible that this is what
causes the hang - newscache is stuck in 'select'

53996 news       2   0  4492K  3412K select   0:00  0.00%  0.00% newscache
36475 news       2   0  2456K  1436K accept   0:00  0.00%  0.00% newscache

and the other end has no reason to return any data after a null ack, I
think...

Any thoughts?

- Mike H.


Subject: Re: AW: Any hints on newscache?
From: Herbert Straub ####@####.####
Date: 12 Jan 2004 07:57:32 -0000
Message-Id: <20040112072729.GB5402@rossi.localdomain>

On Fri, Jan 09, 2004 at 10:18:00AM -0800, Mike Harding wrote:
> The connection isn't slow - it just stops.  I have never had a problem
> with leafnode or a direct connect.

OK, this is a new situation. Is this reproducible? Is the client
newsreader program tin? How can I try to reproduce this error?

> Here's what I see in the external connection (via ethereal)

Please send me (h.straub at aon.at) such an ethereal trace.

> Here's what's in syslog (all.log, showing all messages)

I think NewsCache isn't configured with the --enable-debug option.
The debug output contains more messages.  If you are able to configure
it with the debug option, you can also send me the debug output.

> ..btw, does each spawned copy of newscache open its own connection?

Yes. Each client process opens a connection to the upstream server (if
necessary). There is no "link concentration" at the moment.

> One thing I noticed is that I seem to always see a bunch of NNTP
> request/response in ethereal, then I see a naked ACK, with no data in
> it, at the point that the hang occurs. 

I will compare this with the output on my systems.

> It's possible that this is what causes the hang - newscache is stuck
> in 'select'

> 53996 news       2   0  4492K  3412K select   0:00  0.00%  0.00% newscache
> 36475 news       2   0  2456K  1436K accept   0:00  0.00%  0.00% newscache
> 
> and the other end has no reason to return any data after a null ack, I
> think...

Newscache doesn't call select explicitly, but socket++ uses the select
call to determine whether data is available (sockbuf::is_readready).
Dan Muller reported a problem with the buffer handling in the read and
write methods of socket++. I have modified socket++ and am testing
this modification at the moment. Could you try the following patch and
report the results?


Index: sockstream.cpp
===================================================================
RCS file: /home/stb/products/socket++/my_cvs/socket++/socket++/sockstream.cpp,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -r1.3 -r1.4
--- sockstream.cpp	14 Mar 2003 18:26:07 -0000	1.3
+++ sockstream.cpp	29 Nov 2003 12:08:10 -0000	1.4
@@ -565,12 +565,13 @@
 // upon error, write throws the number of bytes writen so far instead
 // of sockerr.
 {
+  char *pbuf = (char*) buf;
   if (rep->stmo != -1 && is_writeready (rep->stmo)==0)
     throw sockerr (ETIMEDOUT, "sockbuf::write", sockname.c_str());
   
   int wlen=0;
   while(len>0) {
-    int	wval = ::write (rep->sock, (char*) buf, len);
+    int	wval = ::write (rep->sock, pbuf+wlen, len);
     if (wval == -1) throw wlen;
     len -= wval;
     wlen += wval;
@@ -582,12 +583,13 @@
 // upon error, write throws the number of bytes writen so far instead
 // of sockerr.
 {
+  char *pbuf = (char *) buf;
   if (rep->stmo != -1 && is_writeready (rep->stmo)==0)
     throw sockerr (ETIMEDOUT, "sockbuf::send", sockname.c_str());
   
   int wlen=0;
   while(len>0) {
-    int	wval = ::send (rep->sock, (char*) buf, len, msgf);
+    int	wval = ::send (rep->sock, pbuf+wlen, len, msgf);
     if (wval == -1) throw wlen;
     len -= wval;
     wlen += wval;
@@ -599,12 +601,13 @@
 // upon error, write throws the number of bytes writen so far instead
 // of sockerr.
 {
+  char *pbuf = (char *) buf;
   if (rep->stmo != -1 && is_writeready (rep->stmo)==0)
     throw sockerr (ETIMEDOUT, "sockbuf::sendto", sockname.c_str());
   
   int wlen=0;
   while(len>0) {
-    int	wval = ::sendto (rep->sock, (char*) buf, len, msgf,
+    int	wval = ::sendto (rep->sock, pbuf+wlen, len, msgf,
 			 sa.addr (), sa.size());
     if (wval == -1) throw wlen;
     len -= wval;
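
The idea behind the patch above is that a write loop must resume at the first unsent byte after a short write. The pre-patch code passed buf unchanged on every iteration, so after a partial write it would send the beginning of the buffer again. A minimal sketch of the corrected pattern follows. The names write_all and ShortWriter are invented for this illustration and are not part of socket++; the writer callback is a stand-in for ::write or ::send so that short writes can be simulated.

```cpp
#include <cstddef>
#include <string>

// Write all `len` bytes through `writer`, resuming after each short write.
// `writer(data, n)` must behave like ::write(): return the number of bytes
// accepted (possibly fewer than n), or -1 on error.
// Returns the number of bytes actually written.
template <typename Writer>
std::size_t write_all(Writer& writer, const char* buf, std::size_t len) {
    std::size_t done = 0;
    while (done < len) {
        // Key point of the patch: start at buf + done, not at buf.
        long n = writer(buf + done, len - done);
        if (n <= 0)
            break;  // error or no progress: report what was written so far
        done += static_cast<std::size_t>(n);
    }
    return done;
}

// Hypothetical test double: accepts at most `chunk` bytes per call and
// records everything it received, mimicking a socket that writes short.
struct ShortWriter {
    std::string received;
    std::size_t chunk;

    long operator()(const char* data, std::size_t n) {
        std::size_t take = n < chunk ? n : chunk;
        received.append(data, take);
        return static_cast<long>(take);
    }
};
```

With a ShortWriter limited to three bytes per call, write_all still delivers the complete message, because each call starts where the previous one stopped.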

Subject: Re: AW: Any hints on newscache?
From: Herbert Straub ####@####.####
Date: 12 Jan 2004 18:55:37 -0000
Message-Id: <4002E685.6080607@aon.at>

Mike Harding wrote:

> On Sun, 2004-01-11 at 23:27, Herbert Straub wrote:
>> On Fri, Jan 09, 2004 at 10:18:00AM -0800, Mike Harding wrote:
>
> I'll generate a trace, and a debug option, and I'll run gdb on the code
> as well, and see where it stops.  I'll try the patch you included to
> libsock++ first, and see if that addresses my problem...  If you don't
> hear from me for a day or so that means that all is going well.
>
> This only happens during tin startup, it always works great once it
> starts working at all...

Mike,

I have tested rtin and found that the descriptions are not loaded. With
rtin -q -d -n at.linux,at.test,de.test
it seems to work. One other point: I found a misfeature described in
the original info file (doc/NewsCache.info):

Bugs and Misfeatures
********************

   The description of newsgroups is currently not cached properly. The
file containing the descriptions (`.newsgroups' in NewsCache's spool
directory) has to be created manually. The file's format is `newsgroup
description'.

     comp.os.linux.alpha Linux on Digital Alpha machines.
     comp.os.linux.m68k  Linux operating system on 680x0 Amiga, Atari, VME.
     comp.os.linux.networking Networking and communications under Linux.
     comp.os.linux.setup Linux installation and system administration.
     comp.os.linux.x     Linux X Window System servers, clients, libs and fonts.

   The file containing the descriptions can be obtained from your News
Server. Initiate a `telnet' session on port `nntp' and issue the `list
newsgroups' command.

     $ telnet news.tuwien.ac.at nntp
     200 news.tuwien.ac.at InterNetNews...
     list newsgroups
     215 Newsgroups in form "group high low flags".
     comp.os.linux.alpha Linux on Digital Alpha machines.
     comp.os.linux.m68k  Linux operating system on 680x0 Amiga, Atari, VME.
     ...
     .

A description database is currently not implemented. Is it possible
that this misfeature is the reason for the failure?
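
The `newsgroup description' format quoted from the info file is simple enough to read line by line. A minimal sketch, assuming the group name ends at the first space or tab and the rest of the line is the description; parse_description is a name invented for this illustration and is not a NewsCache function:

```cpp
#include <string>
#include <utility>

// Split one `.newsgroups' line into (group name, description).
// Leading whitespace is skipped; a line with no description yields an
// empty second element, and a blank line yields two empty strings.
std::pair<std::string, std::string> parse_description(const std::string& line) {
    const char* ws = " \t";
    std::string name;
    std::string desc;

    std::string::size_type start = line.find_first_not_of(ws);
    if (start == std::string::npos)
        return std::make_pair(name, desc);  // blank line

    std::string::size_type name_end = line.find_first_of(ws, start);
    if (name_end == std::string::npos) {
        name = line.substr(start);          // name only, no description
        return std::make_pair(name, desc);
    }
    name = line.substr(start, name_end - start);

    std::string::size_type desc_start = line.find_first_not_of(ws, name_end);
    if (desc_start != std::string::npos)
        desc = line.substr(desc_start);
    return std::make_pair(name, desc);
}
```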

Subject: Re: AW: Any hints on newscache?
From: Herbert Straub ####@####.####
Date: 12 Jan 2004 21:15:01 -0000
Message-Id: <40030733.50701@aon.at>

I think I know why newscache hangs. I played around with the
rtin -D 2 option and saw that, after the hang, the /tmp/NNTP logfile
shows a different group than /var/log/news/news.debug. In
tin.../src/active.c I found:



/*
 * if you are using C-News nntpd set NUM_SIMULTANEOUS_GROUP_COMMAND to 1
 */
#ifdef NNTP_ABLE
/* Straub
#       define NUM_SIMULTANEOUS_GROUP_COMMAND 50
*/
#       define NUM_SIMULTANEOUS_GROUP_COMMAND 1
#endif /* NNTP_ABLE */

I changed this to 1 and recompiled tin. After this, it seems to me that
rtin doesn't hang anymore. At the moment I don't know what I can do about
this situation, but I will do further research. For now you can only
a) change tin's NUM_SIMULTANEOUS_GROUP_COMMAND to 1 or b) use leafnode
until a solution for this problem is found.
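
To make the effect of that constant concrete, here is a sketch of what a pipelining client puts on the wire. It assumes, as the name suggests, that NUM_SIMULTANEOUS_GROUP_COMMAND is the number of GROUP commands tin sends before it starts reading replies; this is not tin's code, and build_group_batches is a name invented for the illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Pack group names into batches of at most `batch_size` GROUP commands.
// Each returned string is what the client would send in one go before
// reading any reply. With batch_size == 1 every command is sent alone,
// so the server never sees more than one pending command.
std::vector<std::string> build_group_batches(const std::vector<std::string>& groups,
                                             std::size_t batch_size) {
    std::vector<std::string> batches;
    if (batch_size == 0)
        return batches;  // nothing sensible to build
    for (std::size_t i = 0; i < groups.size(); i += batch_size) {
        std::string batch;
        for (std::size_t j = i; j < groups.size() && j < i + batch_size; ++j)
            batch += "GROUP " + groups[j] + "\r\n";
        batches.push_back(batch);
    }
    return batches;
}
```

With a batch size of 50, a short group list arrives at the server as a single block of several commands; with a batch size of 1, each command arrives on its own.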


Best regards
Herbert







Subject: AW: Any hints on newscache?
From: Straub Herbert ####@####.####
Date: 14 Jan 2004 09:34:11 -0000
Message-Id: <3365692EA1026A498A5DB41801AE2DB50290D653@xcwrk2.xund.magwien.gv.at>

Yes, yesterday I made a new version which corrects this problem in
NewsCache. One of your first assumptions (it hangs in select) was right! A
forgotten select call mixed with the socket++ and iostream libraries leads
to this situation: the socket++ library reads the data from the socket into
its internal buffer, and the following call to select hangs until the next
data arrives. You can reproduce this with telnet ... 119 by pasting several
GROUP group.name lines with the mouse: you will see that not all of the
requested information is returned. Press Return and you will get more
lines. With the new implementation, all operations are done with the
iostream and socket++ libraries. Positive side effect: newscache -i and the
daemon now work in the same way, and so does the ClientTimeout (handled via
signals [again]).
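
The failure mode described above can be reproduced in isolation. The sketch below is not NewsCache or socket++ code: BufferedReader is a made-up stand-in for a buffering stream layer, and fd_readable is a made-up wrapper around a zero-timeout select(). Once the reader has pulled two pipelined commands into its own buffer, select() on the raw descriptor reports nothing to read, although a complete command is still waiting in user space.

```cpp
#include <cstddef>
#include <string>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

// Minimal line reader with its own buffer (stand-in for socket++/iostream).
class BufferedReader {
public:
    explicit BufferedReader(int fd) : fd_(fd) {}

    // True if a complete line is already held in the internal buffer.
    bool has_buffered_line() const {
        return buf_.find('\n') != std::string::npos;
    }

    // Fetch the next line without its '\n'. The descriptor is read only
    // when the buffer holds no complete line. Returns false on EOF/error.
    bool getline(std::string& line) {
        while (!has_buffered_line()) {
            char tmp[512];
            ssize_t n = ::read(fd_, tmp, sizeof tmp);
            if (n <= 0)
                return false;
            buf_.append(tmp, static_cast<std::size_t>(n));
        }
        std::string::size_type eol = buf_.find('\n');
        line = buf_.substr(0, eol);
        buf_.erase(0, eol + 1);
        return true;
    }

private:
    int fd_;
    std::string buf_;
};

// Ask the kernel whether the raw descriptor has unread data.
// The timeout is zero, so this call itself never blocks.
bool fd_readable(int fd) {
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    timeval tv = {0, 0};
    return ::select(fd + 1, &rfds, 0, 0, &tv) > 0;
}
```

A server loop that waits in select() before every getline() would block in this state; checking has_buffered_line() first, or doing all I/O through one layer as the new implementation does, avoids that.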

I will release 1.1.92 today or tomorrow. Because it contains some other
enhancements, there will be no patch for 1.1.91.

Debugging Session:

gdb src/newscache
GNU gdb 6.0-debian
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-linux"...
(gdb) set arg -i
(gdb) run
Starting program: /home/stb/products/newscache-new/tin_fehler/NewsCache/src/newscache -i
200 NewsCache 1.1.91, accepting NNRP commands
GROUP fr.comp.os.linux.annonces
GROUP linux.redhat.misc
GROUP linux.redhat.install
GROUP microsoft.public.jp.office.setup
GROUP linux.debian.ports.alpha
GROUP linux.dev.scsi480 authentication required
480 authentication required
480 authentication required
480 authentication required
480 authentication required

Program received signal SIGINT, Interrupt.
0x40244a72 in select () from /lib/libc.so.6
(gdb) bt
#0  0x40244a72 in select () from /lib/libc.so.6
#1  0xbffff1b4 in ?? ()
#2  0x400164e0 in ?? () from /lib/ld-linux.so.2
#3  0xbfffe960 in ?? ()
#4  0x0805703f in nnrpd(int) (fd=0) at NewsCache.cc:2029
#5  0x08059333 in main (argc=2, argv=0xbffffdb4) at NewsCache.cc:2601
(gdb)

The source line numbers don't match the 1.1.91 version.

Thanks for your support.

-----Original Message-----
From: Mike Harding ####@####.####
Sent: Tuesday, 13 January 2004 21:19
To: Herbert Straub
Cc: ####@####.####
Subject: Re: AW: Any hints on newscache?


This appears to be the correct diagnosis - I haven't been able to get tin to
hang for a while.

Is there interest in developing a long term fix for this?  I might be able
to help out...

- Mike H.

