linstor-gateway 1.3.0 on Debian 11.7
Hello,

I am trying to deploy linstor-gateway on a 3-node cluster running Debian 11.7.

I added a "target id" parameter to linstor-gateway to supply the "tid"
parameter of the OCF resources, because without it I got:

ocf-exit-reason:Missing resource parameter "tid"!

But I still get an error from tgt.

Well, here are the details:

root@linstor-01:~# cat /proc/drbd
version: 9.2.5 (api:2/proto:86-122)
GIT-hash: b44520271e63d4b6f359a6642eb4d475b7cc04e0 build by root@linstor-01, 2023-10-10 01:29:10
Transports (api:18): tcp (9.2.5)

root@linstor-01:~# drbdadm -V
DRBDADM_BUILDTAG=GIT-hash:\ bb297231c27690a31bf527e8bf77dca1fc2ce268\ build\ by\ root@linstor-01\,\ 2023-10-10\ 23:37:11
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x090205
DRBD_KERNEL_VERSION=9.2.5
DRBDADM_VERSION_CODE=0x091900
DRBDADM_VERSION=9.25.0
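
As a reading aid for the output above: each *_VERSION_CODE value appears to
pack the version as one byte each for major, minor, and patch. That layout is
an assumption inferred from the paired values printed here, not taken from the
DRBD documentation, and the function name decode_drbd_version is made up for
illustration. A minimal sketch:

```shell
# Decode a version code such as 0x090205 into major.minor.patch.
# Assumption: one byte per component, as the paired values above suggest.
decode_drbd_version() {
    local code=$(( $1 ))    # accepts hex (0x...) or decimal input
    printf '%d.%d.%d\n' \
        $(( (code >> 16) & 0xff )) \
        $(( (code >> 8) & 0xff )) \
        $(( code & 0xff ))
}

decode_drbd_version 0x090205   # prints 9.2.5
decode_drbd_version 0x091900   # prints 9.25.0
```

Both results match the DRBD_KERNEL_VERSION and DRBDADM_VERSION lines shown
above.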

I am trying to provide a 10G iSCSI device with this command:

root@linstor-01:~# linstor-gateway iscsi create iqn.2023-10.com.example:test05 10.105.0.30/24 10G -r oneRessourceGroup --implementation tgt -t 2
Created iSCSI target 'iqn.2023-10.com.example:test05'

So, no error is reported on creation.

Earlier, I created a DRBD device 'linstor_db', which is replicated across
all nodes and mounts successfully.

I add some "linstor" outputs here:

node
+------------+-----------+--------------------------+--------+
| Node       | NodeType  | Addresses                | State  |
+------------+-----------+--------------------------+--------+
| linstor-01 | SATELLITE | 10.105.0.31:3366 (PLAIN) | Online |
| linstor-02 | SATELLITE | 10.105.0.32:3366 (PLAIN) | Online |
| linstor-03 | SATELLITE | 10.105.0.33:3366 (PLAIN) | Online |
+------------+-----------+--------------------------+--------+
physical-storage
+------+------------+-------+
| Size | Rotational | Nodes |
+------+------------+-------+
+------+------------+-------+
storage-pool
+----------------------+------------+----------+----------+--------------+---------------+--------------+-------+---------------------------------+
| StoragePool          | Node       | Driver   | PoolName | FreeCapacity | TotalCapacity | CanSnapshots | State | SharedName                      |
+----------------------+------------+----------+----------+--------------+---------------+--------------+-------+---------------------------------+
| DfltDisklessStorPool | linstor-01 | DISKLESS |          |              |               | False        | Ok    | linstor-01;DfltDisklessStorPool |
| DfltDisklessStorPool | linstor-02 | DISKLESS |          |              |               | False        | Ok    | linstor-02;DfltDisklessStorPool |
| DfltDisklessStorPool | linstor-03 | DISKLESS |          |              |               | False        | Ok    | linstor-03;DfltDisklessStorPool |
| storage              | linstor-01 | ZFS      | storage  |     8.44 TiB |     10.91 TiB | True         | Ok    | linstor-01;storage              |
| storage              | linstor-02 | ZFS      | storage  |     8.42 TiB |     10.91 TiB | True         | Ok    | linstor-02;storage              |
| storage              | linstor-03 | ZFS      | storage  |     8.42 TiB |     10.91 TiB | True         | Ok    | linstor-03;storage              |
+----------------------+------------+----------+----------+--------------+---------------+--------------+-------+---------------------------------+
resource-group
+-------------------+-------------------------+--------+-------------+
| ResourceGroup     | SelectFilter            | VlmNrs | Description |
+-------------------+-------------------------+--------+-------------+
| DfltRscGrp        | PlaceCount: 2           |        |             |
+-------------------+-------------------------+--------+-------------+
| oneRessourceGroup | PlaceCount: 2           | 0      |             |
|                   | StoragePool(s): storage |        |             |
+-------------------+-------------------------+--------+-------------+
resource
+--------------+------------+------+--------+-------+----------+---------------------+
| ResourceName | Node       | Port | Usage  | Conns |    State | CreatedOn           |
+--------------+------------+------+--------+-------+----------+---------------------+
| linstor_db   | linstor-01 | 7001 | InUse  | Ok    | UpToDate | 2023-10-14 00:07:02 |
| linstor_db   | linstor-02 | 7001 | Unused | Ok    | UpToDate | 2023-10-14 00:07:02 |
| linstor_db   | linstor-03 | 7001 | Unused | Ok    | UpToDate | 2023-10-14 00:07:02 |
| test05       | linstor-01 | 7000 | Unused | Ok    | Diskless | 2023-10-27 10:54:47 |
| test05       | linstor-02 | 7000 | Unused | Ok    | UpToDate | 2023-10-27 10:54:58 |
| test05       | linstor-03 | 7000 | Unused | Ok    | UpToDate | 2023-10-27 10:54:58 |
+--------------+------------+------+--------+-------+----------+---------------------+
volume-definition
+--------------+----------+-------------+---------+-------+-------+
| ResourceName | VolumeNr | VolumeMinor | Size    | Gross | State |
+--------------+----------+-------------+---------+-------+-------+
| linstor_db   | 0        | 1001        | 200 MiB |       | ok    |
| test05       | 0        | 1000        | 64 MiB  |       | ok    |
| test05       | 1        | 1002        | 10 GiB  |       | ok    |
+--------------+----------+-------------+---------+-------+-------+
resource-definition
+--------------+------+-------------------+-------+
| ResourceName | Port | ResourceGroup     | State |
+--------------+------+-------------------+-------+
| linstor_db   | 7001 | DfltRscGrp        | ok    |
| test05       | 7000 | oneRessourceGroup | ok    |
+--------------+------+-------------------+-------+
volume
+------------+------------+----------------------+-------+---------+---------------+-----------+--------+----------+
| Node       | Resource   | StoragePool          | VolNr | MinorNr | DeviceName    | Allocated | InUse  |    State |
+------------+------------+----------------------+-------+---------+---------------+-----------+--------+----------+
| linstor-01 | linstor_db | storage              |     0 |    1001 | /dev/drbd1001 | 18.61 MiB | InUse  | UpToDate |
| linstor-02 | linstor_db | storage              |     0 |    1001 | /dev/drbd1001 | 18.61 MiB | Unused | UpToDate |
| linstor-03 | linstor_db | storage              |     0 |    1001 | /dev/drbd1001 | 18.61 MiB | Unused | UpToDate |
| linstor-01 | test05     | DfltDisklessStorPool |     0 |    1000 | /dev/drbd1000 |           | Unused | Diskless |
| linstor-01 | test05     | DfltDisklessStorPool |     1 |    1002 | /dev/drbd1002 |           | Unused | Diskless |
| linstor-02 | test05     | storage              |     0 |    1000 | /dev/drbd1000 |   204 KiB | Unused | UpToDate |
| linstor-02 | test05     | storage              |     1 |    1002 | /dev/drbd1002 |  3.67 MiB | Unused | UpToDate |
| linstor-03 | test05     | storage              |     0 |    1000 | /dev/drbd1000 |   204 KiB | Unused | UpToDate |
| linstor-03 | test05     | storage              |     1 |    1002 | /dev/drbd1002 |  3.67 MiB | Unused | UpToDate |
+------------+------------+----------------------+-------+---------+---------------+-----------+--------+----------+

+--------------------------------+----------------+---------------+-----+---------------+
|              IQN               |   Service IP   | Service state | LUN | LINSTOR state |
+--------------------------------+----------------+---------------+-----+---------------+
| iqn.2023-10.com.example:test05 | 10.105.0.30/24 | Stopped       |   1 | OK            |
+--------------------------------+----------------+---------------+-----+---------------+

The service is stopped, but no error seems to be reported.

root@linstor-01:~# linstor-gateway iscsi start iqn.2023-10.com.example:test05
Started target "iqn.2023-10.com.example:test05"

root@linstor-01:~# linstor-gateway iscsi list
+--------------------------------+----------------+---------------+-----+---------------+
|              IQN               |   Service IP   | Service state | LUN | LINSTOR state |
+--------------------------------+----------------+---------------+-----+---------------+
| iqn.2023-10.com.example:test05 | 10.105.0.30/24 | Stopped       |   1 | OK            |
+--------------------------------+----------------+---------------+-----+---------------+

The service is still stopped...

If I run "watch drbdadm status", I see the Primary role cycle through all
servers, each one falling back to Secondary afterwards.

(on the third node)

test05 role:Secondary
  volume:0 disk:UpToDate
  volume:1 disk:UpToDate
  linstor-01 role:Secondary
    volume:0 peer-disk:Diskless
    volume:1 peer-disk:Diskless
  linstor-02 role:Secondary
    volume:0 peer-disk:UpToDate
    volume:1 peer-disk:UpToDate
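
To make the role cycling easier to follow than in the full status output, the
role fields can be extracted on their own. This is only a sketch: it assumes
the "name ... role:X" line layout shown above, and the function name
drbd_roles is made up for illustration.

```shell
# Print "<name> <role>" for the local resource and each peer found in
# `drbdadm status` output read from stdin.
# Assumption: the name is the first field and the role appears as a
# "role:<value>" field on the same line, as in the status shown above.
drbd_roles() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i ~ /^role:/) { sub(/^role:/, "", $i); print $1, $i }
    }'
}

# Possible use on a node (not executed here):
#   while sleep 1; do date +%T; drbdadm status test05 | drbd_roles; done
```

Fed the status above, this would print one line per node, for example
"test05 Secondary" for the local resource.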

So, digging into journalctl:

Oct 27 11:04:39 linstor-03 drbd-reactor[1731492]: INFO [drbd_reactor::plugin::promoter] systemd_start: systemctl start drbd-services@test05.target
Oct 27 11:04:39 linstor-03 systemd[1]: Starting Promotion of DRBD resource test05...
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Preparing cluster-wide state change 1823090526 (1->-1 3/1)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Aborting local state change 1823090526 to yield to remote state change 1553699760.
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Aborting cluster-wide state change 1823090526 (0ms) rv = -19
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Preparing remote state change 1553699760
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Aborting remote state change 1553699760
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: Preparing remote state change 2189658367
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: Committing remote state change 2189658367 (primary_nodes=4)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-01: peer( Secondary -> Primary )
Oct 27 11:04:40 linstor-03 kernel: drbd test05/0 drbd1000 linstor-01: received new current UUID: 1EF05D749E76B63D weak_nodes=FFFFFFFFFFFFFFFC
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Preparing remote state change 1032811290
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: Committing remote state change 1032811290 (primary_nodes=5)
Oct 27 11:04:40 linstor-03 kernel: drbd test05 linstor-02: peer( Secondary -> Primary )
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Preparing cluster-wide state change 4274765809 (1->-1 3/1)
Oct 27 11:04:40 linstor-03 kernel: drbd test05: State change 4274765809: primary_nodes=7, weak_nodes=FFFFFFFFFFFFFFF8
Oct 27 11:04:40 linstor-03 kernel: drbd test05: Committing cluster-wide state change 4274765809 (0ms)
Oct 27 11:04:40 linstor-03 kernel: drbd test05: role( Secondary -> Primary )
Oct 27 11:04:40 linstor-03 systemd[1]: Finished Promotion of DRBD resource test05.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839779]: Oct 27 11:04:40 INFO: Running start for /dev/drbd/by-res/test05/0 on /srv/ha/internal/test05
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839775]: Filesystem: fs_cluster_private_test05: NOTIFY READY=1 STATUS=calling monitor every 30 seconds
Oct 27 11:04:40 linstor-03 kernel: EXT4-fs (drbd1000): recovery complete
Oct 27 11:04:40 linstor-03 kernel: EXT4-fs (drbd1000): mounted filesystem with ordered data mode. Opts: (null)
Oct 27 11:04:40 linstor-03 kernel: ext4 filesystem being mounted at /srv/ha/internal/test05 supports timestamps until 2038 (0x7fffffff)
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839857]: portblock: pblock0_test05: NOTIFY READY=1 STATUS=calling monitor every 30 seconds
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40 INFO: Adding inet address 10.105.0.30/24 with broadcast address 10.105.0.255 to device enp4s0f0
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40 INFO: Bringing device enp4s0f0 up
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839876]: Oct 27 11:04:40 INFO: /usr/lib/heartbeat/send_arp  -i 200 -r 5 -p /run/resource-agents/send_arp-10.105.0.30 enp4s0f0 10.105.0.30 auto not_used not_used
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839874]: IPaddr2: service_ip0_test05: NOTIFY READY=1 STATUS=calling monitor every 30 seconds
Oct 27 11:04:40 linstor-03 systemd[1]: Started drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Starting drbd-reactor controlled ocf.ra...
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839943]: Oct 27 11:04:40 WARNING: Configuration parameter "portals" is not supported by the iSCSI implementation and will be ignored.
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839963]: tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839943]: Oct 27 11:04:40 ERROR: tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected
Oct 27 11:04:40 linstor-03 systemd[1]: ocf.ra@target_test05.service: Main process exited, code=exited, status=1/FAILURE
Oct 27 11:04:40 linstor-03 ocf.ra.wrapper.sh[2839990]: tgtadm: failed to send request hdr to tgt daemon, Transport endpoint is not connected
Oct 27 11:04:40 linstor-03 systemd[1]: ocf.ra@target_test05.service: Failed with result 'exit-code'.
Oct 27 11:04:40 linstor-03 systemd[1]: Failed to start drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Dependency failed for drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 systemd[1]: Dependency failed for drbd-reactor controlled ocf.ra.
Oct 27 11:04:40 linstor-03 drbd-reactor[2839769]: A dependency job for drbd-services@test05.target failed. See 'journalctl -xe' for details.
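
As a reading aid for the primary_nodes values in the kernel lines above (4,
then 5, then 7): they look like a bitmask with one bit per DRBD node id. That
interpretation is an assumption, as is the mapping of bit N to node id N, and
the function name mask_to_node_ids is made up for illustration. A minimal
sketch that lists the set bits:

```shell
# List the bit positions set in a node mask such as primary_nodes=5.
# Assumption: bit N set means the node with DRBD node-id N is counted
# in the mask; this is inferred from the log, not from documentation.
mask_to_node_ids() {
    local mask=$(( $1 )) id=0 out=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            out="${out:+$out }$id"
        fi
        mask=$(( mask >> 1 ))
        id=$(( id + 1 ))
    done
    printf '%s\n' "$out"
}

mask_to_node_ids 4   # prints: 2
mask_to_node_ids 5   # prints: 0 2
mask_to_node_ids 7   # prints: 0 1 2
```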

The error occurs in the tgt start action, but I do not know how to fix it.
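
To pick the failing step out of a long journal like the one above, the output
can be filtered down to the lines that signal a failure. This is a sketch
only: the patterns are copied from the messages in this log and may miss
other failure wording, and the function name failed_steps is made up for
illustration.

```shell
# Keep only the journal lines that signal a failure, read from stdin.
# The patterns are taken from the messages seen in the log above.
failed_steps() {
    grep -E 'ERROR:|status=[0-9]+/FAILURE|Failed to start|Dependency failed'
}

# Possible use on a node (not executed here):
#   journalctl -u 'ocf.ra@target_test05.service' --since '2023-10-27 11:04' | failed_steps
```

On the log above, the first match would be the tgtadm "Transport endpoint is
not connected" error from the target resource agent.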

Launching the daemon manually with "tgtd -f" changed nothing; the device is
still not available.

For example:

root@linstor-03:~# tgtd -f
tgtd: iser_ib_init(3431) Failed to initialize RDMA; load kernel modules?
tgtd: work_timer_start(146) use timer_fd based scheduler
tgtd: bs_init(387) use signalfd notification
tgtd: device_mgmt(246) sz:31 params:path=/dev/drbd/by-res/test05/1
tgtd: bs_thread_open(409) 16

Do you have any idea how to bring this up? I have run out of ideas.

Thank you for any help you may provide.

Regards,

Nicolas.

PS: my "fix" is pushed to my fork:
https://github.com/nicolasb827/linstor-gateway/tree/target-id-parameter

_______________________________________________
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
drbd-user@lists.linbit.com
https://lists.linbit.com/mailman/listinfo/drbd-user