Hello,
I am trying to deploy LINSTOR Gateway on a 3-node cluster running Debian 11.7.
I added a "target id" parameter to linstor-gateway so that it passes the
"tid" parameter to the OCF resources, because without it I got:
ocf-exit-reason:Missing resource parameter "tid"!
However, I still get an error from tgt.
Here are the details:
root@linstor-01:~# cat /proc/drbd
version: 9.2.5 (api:2/proto:86-122)
GIT-hash: b44520271e63d4b6f359a6642eb4d475b7cc04e0 build by root@linstor-01, 2023-10-10 01:29:10
Transports (api:18): tcp (9.2.5)
root@linstor-01:~# drbdadm -V
DRBDADM_BUILDTAG=GIT-hash:\ bb297231c27690a31bf527e8bf77dca1fc2ce268\ build\ by\ root@linstor-01\,\ 2023-10-10\ 23:37:11
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x090205
DRBD_KERNEL_VERSION=9.2.5
DRBDADM_VERSION_CODE=0x091900
DRBDADM_VERSION=9.25.0
I am trying to provide a 10G iSCSI device with the command:
root@linstor-01:~# linstor-gateway iscsi create iqn.2023-10.com.example:test05 10.105.0.30/24 10G -r oneRessourceGroup --implementation tgt -t 2
Created iSCSI target 'iqn.2023-10.com.example:test05'
No error is reported on creation.
Beforehand, I created a DRBD device 'linstor_db', which is replicated
across all nodes and mounts successfully.
Here are some "linstor" outputs:
node
+------------+-----------+--------------------------+--------+
| Node       | NodeType  | Addresses                | State  |
+------------+-----------+--------------------------+--------+
| linstor-01 | SATELLITE | 10.105.0.31:3366 (PLAIN) | Online |
| linstor-02 | SATELLITE | 10.105.0.32:3366 (PLAIN) | Online |
| linstor-03 | SATELLITE | 10.105.0.33:3366 (PLAIN) | Online |
+------------+-----------+--------------------------+--------+
physical-storage
+------+------------+-------+
| Size | Rotational | Nodes |
+------+------------+-------+
+------+------------+-------+
storage-pool
+----------------------+------------+----------+----------+--------------+---------------+--------------+-------+---------------------------------+
| StoragePool          | Node       | Driver   | PoolName | FreeCapacity | TotalCapacity | CanSnapshots | State | SharedName                      |
+----------------------+------------+----------+----------+--------------+---------------+--------------+-------+---------------------------------+
| DfltDisklessStorPool | linstor-01 | DISKLESS |          |              |               | False        | Ok    | linstor-01;DfltDisklessStorPool |
| DfltDisklessStorPool | linstor-02 | DISKLESS |          |              |               | False        | Ok    | linstor-02;DfltDisklessStorPool |
| DfltDisklessStorPool | linstor-03 | DISKLESS |          |              |               | False        | Ok    | linstor-03;DfltDisklessStorPool |
| storage              | linstor-01 | ZFS      | storage  | 8.44 TiB     | 10.91 TiB     | True         | Ok    | linstor-01;storage              |
| storage              | linstor-02 | ZFS      | storage  | 8.42 TiB     | 10.91 TiB     | True         | Ok    | linstor-02;storage              |
| storage              | linstor-03 | ZFS      | storage  | 8.42 TiB     | 10.91 TiB     | True         | Ok    | linstor-03;storage              |
+----------------------+------------+----------+----------+--------------+---------------+--------------+-------+---------------------------------+
resource-group
+-------------------+-------------------------+--------+-------------+
| ResourceGroup     | SelectFilter            | VlmNrs | Description |
+-------------------+-------------------------+--------+-------------+
| DfltRscGrp        | PlaceCount: 2           |        |             |
+-------------------+-------------------------+--------+-------------+
| oneRessourceGroup | PlaceCount: 2           | 0      |             |
|                   | StoragePool(s): storage |        |             |
+-------------------+-------------------------+--------+-------------+
resource
+--------------+------------+------+--------+-------+----------+---------------------+
| ResourceName | Node       | Port | Usage  | Conns | State    | CreatedOn           |
+--------------+------------+------+--------+-------+----------+---------------------+
| linstor_db   | linstor-01 | 7001 | InUse  | Ok    | UpToDate | 2023-10-14 00:07:02 |
| linstor_db   | linstor-02 | 7001 | Unused | Ok    | UpToDate | 2023-10-14 00:07:02 |
| linstor_db   | linstor-03 | 7001 | Unused | Ok    | UpToDate | 2023-10-14 00:07:02 |
| test05       | linstor-01 | 7000 | Unused | Ok    | Diskless | 2023-10-27 10:54:47 |
| test05       | linstor-02 | 7000 | Unused | Ok    | UpToDate | 2023-10-27 10:54:58 |
| test05       | linstor-03 | 7000 | Unused | Ok    | UpToDate | 2023-10-27 10:54:58 |
+--------------+------------+------+--------+-------+----------+---------------------+
volume-definition
+--------------+----------+-------------+---------+-------+-------+
| ResourceName | VolumeNr | VolumeMinor | Size    | Gross | State |
+--------------+----------+-------------+---------+-------+-------+
| linstor_db   | 0        | 1001        | 200 MiB |       | ok    |
| test05       | 0        | 1000        | 64 MiB  |       | ok    |
| test05       | 1        | 1002        | 10 GiB  |       | ok    |
+--------------+----------+-------------+---------+-------+-------+
resource-definition
+--------------+------+-------------------+-------+
| ResourceName | Port | ResourceGroup     | State |
+--------------+------+-------------------+-------+
| linstor_db   | 7001 | DfltRscGrp        | ok    |
| test05       | 7000 | oneRessourceGroup | ok    |
+--------------+------+-------------------+-------+
volume
+------------+------------+----------------------+-------+---------+---------------+-----------+--------+----------+
| Node       | Resource   | StoragePool          | VolNr | MinorNr | DeviceName    | Allocated | InUse  | State    |
+------------+------------+----------------------+-------+---------+---------------+-----------+--------+----------+
| linstor-01 | linstor_db | storage              | 0     | 1001    | /dev/drbd1001 | 18.61 MiB | InUse  | UpToDate |
| linstor-02 | linstor_db | storage              | 0     | 1001    | /dev/drbd1001 | 18.61 MiB | Unused | UpToDate |
| linstor-03 | linstor_db | storage              | 0     | 1001    | /dev/drbd1001 | 18.61 MiB | Unused | UpToDate |
| linstor-01 | test05     | DfltDisklessStorPool | 0     | 1000    | /dev/drbd1000 |           | Unused | Diskless |
| linstor-01 | test05     | DfltDisklessStorPool | 1     | 1002    | /dev/drbd1002 |           | Unused | Diskless |
| linstor-02 | test05     | storage              | 0     | 1000    | /dev/drbd1000 | 204 KiB   | Unused | UpToDate |
| linstor-02 | test05     | storage              | 1     | 1002    | /dev/drbd1002 | 3.67 MiB  | Unused | UpToDate |
| linstor-03 | test05     | storage              | 0     | 1000    | /dev/drbd1000 | 204 KiB   | Unused | UpToDate |
| linstor-03 | test05     | storage              | 1     | 1002    | /dev/drbd1002 | 3.67 MiB  | Unused | UpToDate |
+------------+------------+----------------------+-------+---------+---------------+-----------+--------+----------+
+--------------------------------+----------------+---------------+-----+---------------+
| IQN                            | Service IP     | Service state | LUN | LINSTOR state |
+--------------------------------+----------------+---------------+-----+---------------+
| iqn.2023-10.com.example:test05 | 10.105.0.30/24 | Stopped       | 1   | OK            |
+--------------------------------+----------------+---------------+-----+---------------+
The service is stopped, but no error seems to be reported.
root@linstor-01:~# linstor-gateway iscsi start iqn.2023-10.com.example:test05
Started target "iqn.2023-10.com.example:test05"
root@linstor-01:~# linstor-gateway iscsi list
+--------------------------------+----------------+---------------+-----+---------------+
| IQN                            | Service IP     | Service state | LUN | LINSTOR state |
+--------------------------------+----------------+---------------+-----+---------------+
| iqn.2023-10.com.example:test05 | 10.105.0.30/24 | Stopped       | 1   | OK            |
+--------------------------------+----------------+---------------+-----+---------------+
The service is still stopped.
When I watch "drbdadm status", I see the Primary role cycle through all
the servers, each of which then falls back to Secondary.
(on the third node)
test05 role:Secondary
volume:0 disk:UpToDate
volume:1 disk:UpToDate
linstor-01 role:Secondary
volume:0 peer-disk:Diskless
volume:1 peer-disk:Diskless
linstor-02 role:Secondary
volume:0 peer-disk:UpToDate
volume:1 peer-disk:UpToDate
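As an aside, the promotion loop can also be followed in the kernel log
instead of with watch. The sketch below is only an illustration:
role_changes is a hypothetical helper name, and the pattern assumes the
kernel message format quoted in this post ("drbd <resource>: role( ... )"
and "drbd <resource> <peer>: peer( ... )"), with one journal entry per line.

```shell
#!/bin/sh
# role_changes RESOURCE
# Hypothetical helper: reads journal text on stdin and keeps only the DRBD
# kernel messages that report a local role change or a peer role change
# for RESOURCE. Assumes one journal entry per line.
role_changes() {
    grep -E "kernel: drbd $1( [^ :]+)?: (role|peer)\( "
}

# Example (run on a cluster node):
#   journalctl -k --no-pager | role_changes test05
```

Run on each node, this should show in which order the nodes take and drop
the Primary role.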
So, digging into journalctl:
Oct 27 11:04:39 linstor-03 drbd-reactor[1731492]: INFO [drbd_reactor::plugin::promoter] systemd_start: systemctl start drbd-services@test05.target
Oct 27 11:04:39 linstor-03 systemd[1]: Starting Promotion of DRBD resource test05...
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Preparing cluster-wide state change 1823090526 (1->-1 3/1)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Aborting local state change 1823090526 to yield to remote state change 1553699760.
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Aborting cluster-wide state change 1823090526 (0ms) rv = -19
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Preparing remote state change 1553699760
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Aborting remote state change 1553699760
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: Preparing remote state change 2189658367
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: Committing remote state change 2189658367 (primary_nodes=4)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: peer( Secondary -> Primary )
Oct 27 11:04:40 linstor-03 kernel: drbd test05/0 drbd1000 linstor-01: received new current UUID: 1EF05D749E76B63D weak_nodes=FFFFFFFFFFFFFFFC
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Preparing remote state change 1032811290
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Committing remote state change 1032811290 (primary_nodes=5)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: peer( Secondary -> Primary )
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Preparing cluster-wide state change 4274765809 (1->-1 3/1)
Oct 27 11:04:40 linstor-03 kernel: drbd test05: State change 4274765809: primary_nodes=7, weak_nodes=FFFFFFFFFFFFFFF8
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Committing cluster-wide state change 4274765809 (0ms)
Oct 27 11:04:40 linstor-03 kernel: drbd test05: role( Secondary -> Primary )
Oct 27 11:04:40 linstor-03 systemd[1]: Finished Promotion of DRBD resource test05.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839779]: Oct 27 11:04:40 INFO: Running start for /dev/drbd/by-res/test05/0 on /srv/ha/internal/test05
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839775]: Filesystem: fs_cluster_private_test05: NOTIFY READY=1 STATUS=calling monitor every 30 seconds
Oct 27 11:04:40 linstor-03 kernel: EXT4-fs (drbd1000): recovery complete
Oct 27 11:04:40 linstor-03 kernel: EXT4-fs (drbd1000): mounted filesystem with ordered data mode. Opts: (null)
Oct 27 11:04:40 linstor-03 kernel: ext4 filesystem being mounted at /srv/ha/internal/test05 supports timestamps until 2038 (0x7fffffff)
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839857]: portblock: pblock0_test05: NOTIFY READY=1 STATUS=calling monitor every 30 seconds
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40 INFO: Adding inet address 10.105.0.30/24 with broadcast address 10.105.0.255 to device enp4s0f0
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40 INFO: Bringing device enp4s0f0 up
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40 INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /run/resource-agents/send_arp-10.105.0.30 enp4s0f0 10.105.0.30 auto not_used not_used
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839874]: IPaddr2: service_ip0_test05: NOTIFY READY=1 STATUS=calling monitor every 30 seconds
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839943]: Oct 27 11:04:40 WARNING: Configuration parameter "portals" is not supported by the iSCSI implementation and will be ignored.
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839963]: tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839943]: Oct 27 11:04:40 ERROR: tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected
Oct 27 11:04:40 linstor-03 systemd[1]: ocf.ra@target_test05.service: Main process exited, code=exited, status=1/FAILURE
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839990]: tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected
Oct 27 11:04:40 linstor-03 systemd[1]: ocf.ra@target_test05.service: Failed with result 'exit-code'.
Oct 27 11:04:40 linstor-03 systemd[1]: Failed to start drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Dependency failed for drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Dependency failed for drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 drbd-reactor[2839769]: A dependency job for drbd-services@test05.target failed. See 'journalctl -xe' for details.
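To pick the failing unit out of a long journal like this one, a small
filter can help. This is a minimal sketch, not part of drbd-reactor or
linstor-gateway: failed_units is a hypothetical helper name, and it assumes
the systemd message format seen above ("<unit>: Failed with result
'<result>'.") with one journal entry per line.

```shell
#!/bin/sh
# failed_units
# Hypothetical helper: reads journal text on stdin and prints, once each,
# every "<unit> <result>" pair that systemd reported as failed.
# Assumes one journal entry per line.
failed_units() {
    sed -n "s/.* systemd\[[0-9]*\]: \([^ ]*\): Failed with result '\([^']*\)'.*/\1 \2/p" | sort -u
}

# Example (run on the node where the start failed):
#   journalctl -b --no-pager | failed_units
```

For the excerpt above this points at ocf.ra@target_test05.service, the unit
wrapping the iSCSI target resource agent; the other units only fail as
dependencies of it.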
The error occurs in the tgt start action, but I do not know how to fix it.
Launching the daemon manually with "tgtd -f" changed nothing; the device
is still not available.
E.g.:
root@linstor-03:~# tgtd -f
tgtd: iser_ib_init(3431) Failed to initialize RDMA; load kernel modules?
tgtd: work_timer_start(146) use timer_fd based scheduler
tgtd: bs_init(387) use signalfd notification
tgtd: device_mgmt(246) sz:31 params:path=/dev/drbd/by-res/test05/1
tgtd: bs_thread_open(409) 16
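Since the tgtadm message says the request could not be sent to the tgt
daemon, one thing worth ruling out is whether tgtadm can reach tgtd at all
on the node where the target is started. A minimal sketch of such a check:
tgtd_reachable and the TGTADM variable are hypothetical names introduced
here, and the check only assumes that "tgtadm --lld iscsi --mode target
--op show" exits non-zero when it cannot talk to the daemon.

```shell
#!/bin/sh
# tgtd_reachable
# Returns 0 if tgtadm can list targets (i.e. it reached tgtd), non-zero
# otherwise. TGTADM is a hypothetical override to point the check at a
# different binary; by default the tgtadm found in PATH is used.
tgtd_reachable() {
    "${TGTADM:-tgtadm}" --lld iscsi --mode target --op show >/dev/null 2>&1
}

# Example (run as root on the node where the target should start):
#   if tgtd_reachable; then echo "tgtd answers"; else echo "tgtd not reachable"; fi
```

If the check fails while the OCF agent runs, the problem is with the daemon
itself (not running, or not reachable by tgtadm) rather than with the "tid"
parameter.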
Do you have any idea how to bring it up? I have run out of ideas.
Thank you for any help you may provide.
Regards,
Nicolas.
PS: my "fix" is pushed to my fork:
https://github.com/nicolasb827/linstor-gateway/tree/target-id-parameter
_______________________________________________
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user