Mailing List Archive

Using the stable version per request..
Sorry if these are hard to read.. they are even harder for me, as I am
forced to abandon the project.

This version connected much more regularly than 0.5.6, but still not
consistently.

If you need more information, I may or may not be able to use these machines
for this purpose after today.

----
[Frontal] is primary, with dedicated nic at 10.0.0.1 via crossover cable
[temporal] is secondary, with dedicated nic at 10.0.0.2
-----

[frontal]# drbdsetup /dev/nb0 /dev/hda7 B 10.0.0.1 10.0.0.2 -d 2506108
[frontal]# cat /proc/drbd
version : 55

0: cs:WFConnection st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]# cat /proc/drbd
version : 55

0: cs:Connected st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]# drbdsetup /dev/nb0 PRI
[frontal]# cat /proc/drbd
version : 55

0: cs:Unconfigured st:Primary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]# *woh..* did i do that?
bash: *woh..*: command not found
[frontal]# drbdsetup /dev/nb0 SEC
[frontal]# drbdsetup /dev/nb0 /dev/hda7 B 10.0.0.1 10.0.0.2 -d 2506108
[frontal]# cat /proc/drbd
version : 55

0: cs:Connected st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]# drbdsetup /dev/nb0 /dev/hda7 B 10.0.0.1 10.0.0.2 -d 2506108
[frontal]# cat /proc/drbd
version : 55

0: cs:Connected st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]# cat /proc/drbd
version : 55

0: cs:Connected st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]# drbdsetup /dev/nb0 PRI
[frontal]# cat /proc/drbd
version : 55

0: cs:Connected st:Primary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]# *good*
[frontal]# cat /proc/drbd
version : 55

0: cs:Connected st:Primary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]# drbdsetup /dev/nb0 REPL
[frontal]# cat /proc/drbd
version : 55

0: cs:WFConnection st:Primary ns:8 nr:0 dw:0 dr:0 of:7
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]# drbdsetup /dev/nb0 REPL
[frontal]# cat /proc/drbd
version : 55

0: cs:SyncingAll st:Primary ns:57 nr:0 dw:0 dr:0 of:49
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]# cat /proc/drbd
version : 55

0: cs:WFConnection st:Primary ns:57 nr:0 dw:0 dr:0 of:49
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
----- while these appeared in the logs. ------
May 30 16:58:42 frontal kernel: drbd: module initialised. Version: 55
May 30 16:59:34 frontal kernel: drbd: user provided size = 2506108 KB
May 30 16:59:34 frontal kernel: drbd: vmallocing 78315 B for bitmap. @c827601c
May 30 17:00:22 frontal kernel: drbd: agreed size = 2506108 KB
May 30 17:00:22 frontal kernel: drbd: agreed blksize = 4096 B
May 30 17:00:41 frontal kernel: drbd: send timed out!! (pid=416)
May 30 17:00:41 frontal kernel: drbd: sock_recvmsg returned -512
May 30 17:00:41 frontal kernel: drbd: accept failed! -512
May 30 17:01:15 frontal kernel: drbd: user provided size = 2506108 KB
May 30 17:01:15 frontal kernel: drbd: agreed size = 2506108 KB
May 30 17:01:15 frontal kernel: drbd: agreed blksize = 4096 B
May 30 17:01:25 frontal kernel: drbd: sock_recvmsg returned -512
May 30 17:01:25 frontal kernel: drbd: user provided size = 2506108 KB
May 30 17:01:25 frontal kernel: drbd: agreed size = 2506108 KB
May 30 17:01:25 frontal kernel: drbd: agreed blksize = 4096 B
May 30 17:02:16 frontal kernel: drbd: Synchronisation started blks=7 int=12

May 30 17:02:16 frontal kernel: drbd: sock_recvmsg returned 0
May 30 17:02:16 frontal kernel: drbd: Synchronisation done./m=0
May 30 17:02:19 frontal kernel: drbd: ack timeout detected!
May 30 17:02:39 frontal kernel: drbd: agreed size = 2506108 KB
May 30 17:02:39 frontal kernel: drbd: agreed blksize = 4096 B
May 30 17:02:39 frontal kernel: drbd: Synchronisation started blks=7 int=12

May 30 17:02:39 frontal kernel: drbd: sock_recvmsg returned -104
May 30 17:02:39 frontal kernel: drbd: Synchronisation done./m=0
May 30 17:02:44 frontal kernel: drbd: agreed size = 2506108 KB
May 30 17:02:44 frontal kernel: drbd: agreed blksize = 4096 B
May 30 17:02:44 frontal kernel: drbd: Synchronisation started blks=7 int=12

May 30 17:02:44 frontal kernel: drbd: sock_recvmsg returned -104
May 30 17:02:44 frontal kernel: drbd: Synchronisation done./m=0
May 30 17:02:47 frontal kernel: drbd: agreed size = 2506108 KB
May 30 17:02:47 frontal kernel: drbd: agreed blksize = 4096 B
May 30 17:02:47 frontal kernel: drbd: Synchronisation started blks=7 int=12

May 30 17:02:47 frontal kernel: drbd: Synchronisation done./m=0
May 30 17:02:59 frontal kernel: drbd: Synchronisation started blks=7 int=12

May 30 17:03:00 frontal kernel: drbd: sock_recvmsg returned -104
May 30 17:03:00 frontal kernel: drbd: Synchronisation done./m=0

----- and on temporal ---
[temporal]# drbdsetup /dev/nb0 /dev/hda7 B 10.0.0.2 10.0.0.1 -d 2506108
[temporal]# cat /proc/drbd
version : 55

0: cs:Connected st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[temporal]# cat /proc/drbd
version : 55

0: cs:WFConnection st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[temporal]# cat /proc/drbd
version : 55

0: cs:Unconfigured st:Secondary ns:0 nr:7 dw:7 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[temporal]# grr, now over here..
[temporal]# drbdsetup /dev/nb0 /dev/hda7 B 10.0.0.2 10.0.0.1 -d 2506108
[temporal]# cat /proc/drbd
version : 55

0: cs:Unconfigured st:Secondary ns:0 nr:7 dw:7 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[temporal]# drbdsetup /dev/nb0 /dev/hda7 B 10.0.0.2 10.0.0.1 -d 2506108
[temporal]# cat /proc/drbd
version : 55

0: cs:Unconfigured st:Secondary ns:0 nr:7 dw:7 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[temporal]# drbdsetup /dev/nb0 /dev/hda7 B 10.0.0.2 10.0.0.1 -d 2506108
[temporal]# cat /proc/drbd
version : 55

0: cs:Connected st:Secondary ns:0 nr:7 dw:7 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[temporal]# 3rd try is a charm? eh?
[temporal]# cat /proc/drbd
version : 55

0: cs:Unconfigured st:Secondary ns:0 nr:50 dw:50 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[temporal]# again, after the repl started..

---- logs on temporal ----
May 30 16:58:55 temporal kernel: drbd: module initialised. Version: 55
May 30 17:00:29 temporal kernel: drbd: user provided size = 2506108 KB
May 30 17:00:29 temporal kernel: drbd: vmallocing 78315 B for bitmap. @c827601c
May 30 17:00:29 temporal kernel: drbd: agreed size = 2506108 KB
May 30 17:00:29 temporal kernel: drbd: agreed blksize = 4096 B
May 30 17:00:48 temporal kernel: drbd: agreed size = 2506108 KB
May 30 17:00:48 temporal kernel: drbd: agreed blksize = 4096 B
May 30 17:00:48 temporal kernel: drbd: sock_recvmsg returned 0
May 30 17:01:22 temporal kernel: drbd: agreed size = 2506108 KB
May 30 17:01:22 temporal kernel: drbd: agreed blksize = 4096 B
May 30 17:01:31 temporal kernel: drbd: sock_recvmsg returned 0
May 30 17:01:32 temporal kernel: drbd: agreed size = 2506108 KB
May 30 17:01:32 temporal kernel: drbd: agreed blksize = 4096 B
May 30 17:01:43 temporal kernel: drbd: agreed size = 2506108 KB
May 30 17:01:43 temporal kernel: drbd: agreed blksize = 4096 B
May 30 17:02:23 temporal kernel: drbd: send timed out!! (pid=440)
May 30 17:02:23 temporal kernel: drbd: sock_recvmsg returned -512
May 30 17:02:23 temporal kernel: drbd: accept failed! -512
May 30 17:02:46 temporal kernel: drbd: user provided size = 2506108 KB
May 30 17:02:46 temporal kernel: drbd: send timed out!! (pid=444)
May 30 17:02:46 temporal kernel: drbd: sock_recvmsg returned -512
May 30 17:02:46 temporal kernel: drbd: accept failed! -512
May 30 17:02:51 temporal kernel: drbd: user provided size = 2506108 KB
May 30 17:02:51 temporal kernel: drbd: send timed out!! (pid=448)
May 30 17:02:51 temporal kernel: drbd: sock_recvmsg returned -512
May 30 17:02:51 temporal kernel: drbd: accept failed! -512
May 30 17:02:54 temporal kernel: drbd: user provided size = 2506108 KB
May 30 17:02:54 temporal kernel: drbd: agreed size = 2506108 KB
May 30 17:02:54 temporal kernel: drbd: agreed blksize = 4096 B
May 30 17:03:07 temporal kernel: drbd: send timed out!! (pid=453)
May 30 17:03:07 temporal kernel: drbd: sock_recvmsg returned -512
May 30 17:03:07 temporal kernel: drbd: accept failed! -512
Re: Using the stable version per request..
On Tue, 30 May 2000 you wrote:
>
>Sorry if these are hard to read.. they are even harder for me, as I am
>forced to abandon the project.
>
>This version connected much more regularly than 0.5.6, but still not
>consistently.
>
>If you need more information, I may or may not be able to use these machines
>for this purpose after today.
>

[...]

What type (manufacturer/model) of NICs are you using?

Have you ever tried running the benchmarking script (benchmark/run.sh)?

Your connection is always interrupted by a timeout...
... you can try to increase the timeout value by adding "-t 100" to
the drbdsetup call.

Aren't there some NICs/drivers which cannot send if they have been idle for some
time?

1. frontal is getting the parameter packet from temporal.
Log messages: agreed size...
2. You are switching frontal into primary state, therefore it tries
to send a C-State packet; the networking subsystem is not able to send
this packet within 3 seconds.
Log messages: send timed out!!
3. frontal is closing the connection.
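
The reading of the logs above can be checked mechanically against a saved log file. The following is only a minimal sketch in Python, assuming the syslog line layout seen in this thread; the names `LOG_RE`, `TIMEOUT_PREFIXES` and `find_timeouts` are made up for illustration and are not part of drbd.

```python
import re

# Matches syslog-style lines as they appear in this thread, e.g.
# "May 30 17:00:41 frontal kernel: drbd: send timed out!! (pid=416)"
LOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+"
    r"kernel:\s+drbd:\s+(?P<msg>.*)$"
)

# Message prefixes treated as timeout events (copied from the posted logs).
TIMEOUT_PREFIXES = ("send timed out", "ack timeout detected")


def find_timeouts(lines):
    """Return (timestamp, host, message) for each drbd timeout line."""
    events = []
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match and match.group("msg").startswith(TIMEOUT_PREFIXES):
            events.append(
                (match.group("stamp"), match.group("host"), match.group("msg"))
            )
    return events


# Lines taken from the frontal log posted earlier in the thread.
sample = [
    "May 30 17:00:22 frontal kernel: drbd: agreed blksize = 4096 B",
    "May 30 17:00:41 frontal kernel: drbd: send timed out!! (pid=416)",
    "May 30 17:00:41 frontal kernel: drbd: sock_recvmsg returned -512",
    "May 30 17:02:19 frontal kernel: drbd: ack timeout detected!",
]
for stamp, host, msg in find_timeouts(sample):
    print(stamp, host, msg)
```

Running the same function over the logs of both nodes and comparing timestamps should show which side times out first, keeping in mind that the two clocks in this thread are a few seconds apart.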

At home I am using cheap NE2000s. I know it's working with RTL8139 and
via-rhine.

TO ALL SUBSCRIBERS:
Can you post the type/driver of your network cards, and a note on whether it's
working? Thanks in advance.

-Philipp
Re: Using the stable version per request..
> At home I am using cheap NE2000s. I know it's working with RTL8139 and
> via-rhine.

I use drbd on two systems, one with two RTL8139 cards
and one with two identical "cheap" (18 pounds) cards (tulip driver).
Both systems are working fine.

To set them up, I configure them with a very high "transfer rate" value.
DRBD is then unable to communicate and floods the screen with the
optimal value (or at least, I think it is... Philipp?).
I don't know if this is the best way to do it, but it works fine for me.

With the same hardware and only the network card changed,
I saw a difference of 20% between the two configuration values
(in favor of the Realtek).

Thomas
RE: Using the stable version per request..
OK, here goes. The boss didn't deploy the servers, so I had another day to play
with them.
The machines are identical and equipped like this:
each has a built-in Intel EtherExpress Pro 100 and a second Intel
EtherExpress Pro 100 as a separate card.
I've tried both cards, with similar results.

Even if it is timing out, by my estimation drbd should NEVER go back to
Unconfigured, but it does, and very often.
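
To make that complaint easy to reproduce, successive /proc/drbd readings can be scanned for a device that falls back to Unconfigured. This is only a sketch in Python, assuming the line layout shown in this thread; `connection_state` and `unexpected_drops` are hypothetical helper names, not drbd tools.

```python
def connection_state(status_line):
    """Extract the value of the cs: field from one /proc/drbd device line."""
    for field in status_line.split():
        if field.startswith("cs:"):
            return field[3:]
    raise ValueError("no cs: field in %r" % status_line)


def unexpected_drops(status_lines):
    """Return (index, previous_state) for each fall back to Unconfigured."""
    drops = []
    previous = None
    for index, line in enumerate(status_lines):
        current = connection_state(line)
        # A drop is a reading of Unconfigured right after a configured state.
        if current == "Unconfigured" and previous not in (None, "Unconfigured"):
            drops.append((index, previous))
        previous = current
    return drops


# Three successive readings of device 0 on frontal, taken from the first mail.
frontal_device0 = [
    "0: cs:Connected st:Secondary ns:0 nr:0 dw:0 dr:0 of:0",
    "0: cs:Unconfigured st:Primary ns:0 nr:0 dw:0 dr:0 of:0",
    "0: cs:Connected st:Secondary ns:0 nr:0 dw:0 dr:0 of:0",
]
print(unexpected_drops(frontal_device0))
```

The second reading is flagged because it follows a Connected state, which is the transition described above as one that should never happen.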

I'm not going to post my whole situation, just the beginning. This is after a
clean boot of both machines;
commands are numbered this time.


[frontal]# 1; modprobe drbd
bash: 1: command not found
[frontal]# 3;drbdsetup /dev/nb0 /dev/hda7 B 10.0.0.1 10.0.0.2 -d 2506108 -t100
bash: 3: command not found
[frontal]# 5;cat /proc/drbd
bash: 5: command not found
version : 55

0: cs:Connected st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]# 7;drbdsetup /dev/nb0 PRI
bash: 7: command not found
[frontal]# 8;cat /proc/drbd
bash: 8: command not found
version : 55

0: cs:Unconfigured st:Primary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[frontal]#
May 31 17:14:00 frontal kernel: drbd: module initialised. Version: 55
May 31 17:14:09 frontal kernel: drbd: user provided size = 2506108 KB
May 31 17:14:09 frontal kernel: drbd: vmallocing 78315 B for bitmap. @c827601c
May 31 17:14:15 frontal kernel: drbd: agreed size = 2506108 KB
May 31 17:14:15 frontal kernel: drbd: agreed blksize = 4096 B
May 31 17:15:07 frontal kernel: drbd: send timed out!! (pid=399)
May 31 17:15:07 frontal kernel: drbd: sock_recvmsg returned -512
May 31 17:15:07 frontal kernel: drbd: accept failed! -512
[temporal]# 2; modprobe drbd
bash: 2: command not found
[temporal]# 4;drbdsetup /dev/nb0 /dev/hda7 B 10.0.0.2 10.0.0.1 -d 2506108 -t100
bash: 4: command not found
[temporal]# 6;cat /proc/drbd
bash: 6: command not found
version : 55

0: cs:Connected st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
[temporal]#9;cat /proc/drbd
bash: 6: command not found
version : 55

0: cs:WFConnection st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0

May 31 17:14:14 temporal kernel: drbd: module initialised. Version: 55
May 31 17:14:22 temporal kernel: drbd: user provided size = 2506108 KB
May 31 17:14:22 temporal kernel: drbd: vmallocing 78315 B for bitmap. @c827601c
May 31 17:14:22 temporal kernel: drbd: agreed size = 2506108 KB
May 31 17:14:22 temporal kernel: drbd: agreed blksize = 4096 B
May 31 17:15:14 temporal kernel: drbd: agreed size = 2506108 KB
May 31 17:15:14 temporal kernel: drbd: agreed blksize = 4096 B
May 31 17:15:14 temporal kernel: drbd: sock_recvmsg returned 0
Re: Using the stable version per request..
In order to create a better datadisk script, and especially the init part,
it would be nice to get a more "verbose" or "readable" /proc.
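
As an illustration of what "readable" could mean, here is a minimal sketch in Python that parses the /proc/drbd layout shown in this thread and prints one labelled field per line. The long labels are assumptions made for this example (the `of` field is left unexpanded), and `parse_proc_drbd` and `describe` are hypothetical names, not part of drbd.

```python
# Long names are assumptions for illustration; unknown keys are shown as-is.
LABELS = {
    "cs": "connection state",
    "st": "node state",
    "ns": "network send",
    "nr": "network receive",
    "dw": "disk write",
    "dr": "disk read",
}


def parse_proc_drbd(text):
    """Parse /proc/drbd text into {device_number: {field: value}}."""
    devices = {}
    for line in text.splitlines():
        head, sep, rest = line.strip().partition(":")
        if not sep or not head.isdigit():
            continue  # skips the "version : 55" line and blank lines
        fields = {}
        for token in rest.split():
            key, _, value = token.partition(":")
            fields[key] = int(value) if value.isdigit() else value
        devices[int(head)] = fields
    return devices


def describe(devices):
    """Render the parsed devices as one labelled field per line."""
    lines = []
    for number in sorted(devices):
        lines.append("device %d" % number)
        for key, value in devices[number].items():
            lines.append("  %-17s %s" % (LABELS.get(key, key) + ":", value))
    return "\n".join(lines)


# Sample copied from the frontal transcript earlier in the thread.
sample = """version : 55

0: cs:SyncingAll st:Primary ns:57 nr:0 dw:0 dr:0 of:49
1: cs:Unconfigured st:Secondary ns:0 nr:0 dw:0 dr:0 of:0
"""
print(describe(parse_proc_drbd(sample)))
```

An init script could use the parsed dictionary directly, for example by testing the `cs` value of device 0 instead of matching text with grep.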

In my case I had to write a small server (in shell, with tcpserver) bound to
the private interface, which runs the input it receives. I don't want rsh
installed, and ssh brings some side problems.

This allows me to determine the state of the other computer (ping to check
its presence, and its datadisk state queried remotely), so that I can set
the state of the node at boot time.
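
A rough sketch of such a helper, written in Python rather than shell with tcpserver, is shown below. It deliberately differs from the setup described above: instead of running the input it receives, it only sends back the contents of a status file. The function name `make_status_server` and the port number mentioned afterwards are assumptions made for this example.

```python
import socketserver


def make_status_server(address, status_path="/proc/drbd"):
    """Build a TCP server that sends the status file to each client, then closes."""

    class StatusHandler(socketserver.BaseRequestHandler):
        def handle(self):
            # Read the file on every request so the peer sees the current state.
            with open(status_path, "rb") as status_file:
                self.request.sendall(status_file.read())

    return socketserver.TCPServer(address, StatusHandler)
```

On the primary this could be started with something like `make_status_server(("10.0.0.1", 7788)).serve_forever()`, where 10.0.0.1 is the dedicated interface used in this thread and 7788 is an arbitrary, hypothetical port. Binding to the private address keeps the service off the public interface; the peer then only needs a plain TCP client to read the state at boot time.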

I think most of this could be done more cleanly with drbd. If everything is already
there to do it, I would be pleased to fix my own code.
All comments (even flames :) are welcome.

Thomas
Re: Using the stable version per request..
On Thu, 01 Jun 2000 you wrote:
>In order to create a better datadisk script, and especially the init part,
>it would be nice to get a more "verbose" or "readable" /proc.
>
>In my case I had to write a small server (in shell, with tcpserver) bound to
>the private interface, which runs the input it receives. I don't want rsh
>installed, and ssh brings some side problems.
>
>This allows me to determine the state of the other computer (ping to check
>its presence, and its datadisk state queried remotely), so that I can set
>the state of the node at boot time.
>
>I think most of this could be done more cleanly with drbd. If everything is already
>there to do it, I would be pleased to fix my own code.
>All comments (even flames :) are welcome.
>
>Thomas

Thomas, please tell us more. Which information do you want to see in
/proc/drbd? Maybe you can give us an example "screenshot" of the
/proc/drbd you are thinking of?

-Philipp