Mailing List Archive

[Bug] Bring up Dom0 on Arm board
Hello,
I am trying to run Xen on a Rockchip RK3588 board and have encountered some problems.
The commands I used:
load mmc 1:1 0xC400000 dom0-Image;
load mmc 1:1 0x47C00000 xen4.14.5;
load mmc 1:1 0x47E00000 rk3588-evb7-lp4-v10-linux.dtb
fdt addr 0x47E00000
fdt resize 1024
fdt set /chosen \#address-cells <0x2>
fdt set /chosen \#size-cells <0x2>
fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
fdt mknod /chosen dom0
fdt set /chosen/dom0 compatible "xen,linux-zimage" "xen,multiboot-module" "multiboot,module"
fdt set /chosen/dom0 reg <0x0 0xC400000 0x0 0x2000000>
fdt set /chosen xen,dom0-bootargs "console=hvc0 earlycon=xen earlyprintk=xen clk_ignore_unused root=/dev/mmcblk0p6 rw rootwait"
setenv fdt_high 0xffffffffffffffff
booti 0x47C00000 - 0x47E00000
1. "Device tree generation failed" error.
When I used the default DTB to boot Xen, Xen panicked.
Log:
(XEN) Unable to get irq 0 for /pcie@fe180000/legacy-interrupt-controller
(XEN) Device tree generation failed (-1).
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Could not set up DOM0 guest OS
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
The DTB node:
pcie2x1l1_intc: legacy-interrupt-controller {
interrupt-controller;
#address-cells = <0>;
#interrupt-cells = <1>;
interrupt-parent = <&gic>;
interrupts = <GIC_SPI 245 IRQ_TYPE_EDGE_RISING>;
};
I changed the interrupt type of the legacy-interrupt-controller node from IRQ_TYPE_EDGE_RISING to IRQ_TYPE_LEVEL_HIGH.
With that change Xen boots successfully, though I am not sure the modification is correct.
2. After boot, console input does not work.
I added some logging to do_trap_guest_sync and try_handle_mmio, as below:
In function do_trap_guest_sync:
static unsigned long ec = 0;
if(hsr.ec != ec)
{
gprintk(XENLOG_INFO, "do_trap_guest_sync hsr.ec=%x \n", hsr.ec);
ec = hsr.ec;
}
In function try_handle_mmio:
gprintk(XENLOG_INFO, "handler->addr: %lx\n", handler->addr);
Then every time I press Enter in the console, the console shows the log below:
(XEN) d0v0 do_trap_guest_sync hsr.ec=24
(XEN) d0v0 handler->addr: fe600000
(XEN) d0v0 handler->addr: fe600000
(XEN) d0v0 do_trap_guest_sync hsr.ec=18
Is something wrong with the GIC interrupt?
3. In Dom0, the devices mali0 and mmcblk2 are missing, and weston fails to run.
Since I can't type in the console, I used a console over ssh instead.
In /dev, I can't find mali0 or mmcblk2 (the SD card).
In U-Boot, mmcblk2 is recognized; I loaded dom0-Image, xen, and the dtb from it.
When booting without Xen, mali0 and mmcblk2 are both recognized.
Is something going wrong in Xen while the drivers initialize?
4. xl commands do not complete and seem to hang.
When I run xl list or xl create over the ssh console, the command does not respond until I press Ctrl+C.
xl info succeeds and prints some information to the console.
How can I resolve the above errors? Can anyone give me some advice?
Best regards
Cailigang
Re: [Bug] Bring up Dom0 on Arm board
(+ Bertrand and Stefano)

Hello,

On 06/01/2023 06:41, Cailigang wrote:
> I try to run Xen on a Rockchip RK3588 board and encountered some problems.
> The command I used:
> load mmc 1:1 0xC400000 dom0-Image;
> load mmc 1:1 0x47C00000 xen4.14.5;

We have made a lot of improvements since Xen 4.14. That release has also
been out of support since January 2022. It is still security supported,
but not for long (until July 2023).

Would you be able to try Xen 4.17 (this was released a month ago)?

> load mmc 1:1 0x47E00000 rk3588-evb7-lp4-v10-linux.dtb
> fdt addr 0x47E00000
> fdt resize 1024
> fdt set /chosen \#address-cells <0x2>
> fdt set /chosen \#size-cells <0x2>
> fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
> fdt mknod /chosen dom0
> fdt set /chosen/dom0 compatible "xen,linux-zimage" "xen,multiboot-module" "multiboot,module"
> fdt set /chosen/dom0 reg <0x0 0xC400000 0x0 0x2000000>
> fdt set /chosen xen,dom0-bootargs "console=hvc0 earlycon=xen earlyprintk=xen clk_ignore_unused root=/dev/mmcblk0p6 rw rootwait"
> setenv fdt_high 0xffffffffffffffff
> booti 0x47C00000 - 0x47E00000
> 1. Device tree generation failed errors.
> when I used the default dtb to run xen, Painc occured on xen.
> log:
> (XEN) Unable to get irq 0 for /pcie@fe180000/legacy-interrupt-controller
> (XEN) Device tree generation failed (-1).
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Could not set up DOM0 guest OS
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
> the dtb:
> pcie2x1l1_intc: legacy-interrupt-controller {
> interrupt-controller;
> #address-cells = <0>;
> #interrupt-cells = <1>;
> interrupt-parent = <&gic>;
> interrupts = <GIC_SPI 245 IRQ_TYPE_EDGE_RISING>;
> };
> I modified the legacy-interrupt-controller of interrupts from
> IRQ_TYPE_EDGE_RISING to IRQ_TYPE_LEVEL_HIGH.

Based on this change, I would say the call to irq_set_spi_type() (called
from platform_get_irq()) returned -1. The function validates the
type and returns an error if there is a problem.

Can you confirm whether the interrupt is shared with another device? Is
it described twice in the DT?

If the answer to either question is yes, is the type different?

You could also print the old and new type in irq_set_spi_type() to
confirm the difference.
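
To illustrate the kind of check involved, here is a simplified Python model of a per-SPI trigger-type validation. It is only a sketch, not Xen's actual code: the class name, the "first configured type wins" rule, and the idea that an earlier user already configured SPI 245 are assumptions made for the example. The two type values follow the Linux device-tree binding header.

```python
IRQ_TYPE_INVALID = None          # placeholder for "type not configured yet"
IRQ_TYPE_EDGE_RISING = 1         # values as in the Linux DT bindings header
IRQ_TYPE_LEVEL_HIGH = 4


class SpiTypeTable:
    """Toy model: remembers the trigger type first requested for each SPI."""

    def __init__(self):
        self._types = {}

    def set_type(self, spi, new_type):
        """Return 0 on success, -1 if the SPI already has a different type."""
        current = self._types.get(spi, IRQ_TYPE_INVALID)
        if current is not IRQ_TYPE_INVALID and current != new_type:
            # Printing old and new type is the debugging aid suggested above.
            print(f"SPI {spi}: type conflict, old={current} new={new_type}")
            return -1
        self._types[spi] = new_type
        return 0


table = SpiTypeTable()
first = table.set_type(245, IRQ_TYPE_LEVEL_HIGH)     # hypothetical earlier user
second = table.set_type(245, IRQ_TYPE_EDGE_RISING)   # later user, different type
```

In this model the second call is rejected, which is the pattern to look for when the same interrupt is described twice with different types.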

> And bring up Xen successed, through I not sure the modification is correct.
> 2. After boot up, I tried to input in the console but failed.
> I added some log in api do_trap_guest_sync, try_handle_mmio as below:
> In function do_trap_guest_sync:
> static unsigned long ec = 0;
> if(hsr.ec != ec)
> {
> gprintk(XENLOG_INFO, "do_trap_guest_sync hsr.ec=%x \n", hsr.ec);
> ec = hsr.ec;
> }
> In function try_handle_mmio:
> gprintk(XENLOG_INFO, "handler->addr: %lx\n", handler->addr);
> Then everytime I type enter in the console, console show the log below:
> (XEN) d0v0 do_trap_guest_sync hsr.ec=24
> (XEN) d0v0 handler->addr: fe600000
> (XEN) d0v0 handler->addr: fe600000
> (XEN) d0v0 do_trap_guest_sync hsr.ec=18
> Is that something wrong with the GIC interrupt ?
A few questions:
* What is the corresponding device in the host physical address space
for 0xfe600000?
* What is the UART on your board? Is there any specific workaround
required?

> 3. In Dom0, the dev mali0 and mmcblk2 is missing, and weston running failed.
Do you have any log in the kernel indicating why the mali and/or the mmc
driver didn't load?

Also, can you confirm that the same kernel image works without Xen?

> While I can't input in the console, I tried use console via ssh.
> In the /dev list, I can't find mali0 and mmcblk2(sdcard),
> In u-boot mode, mmcblk2 can be recognized, I loaded dom0-Image, xen, and dtb from mmcblk2.
> While booting without xen, the mali0 and mmcblk2 can be recognized,
> Is that something wrong with xen while Initialize the driver?
> 4. xl command can not executed, and seems to be suspended.

xl requires the initscript (or systemd service) to be executed. The fact
it hangs usually means this didn't happen.

Just in case, can you also check that your kernel has been built with
Xen support?
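
A quick way to gather hints from inside dom0 is to look at a few well-known files. The Python sketch below is an illustration under assumptions: it expects xenfs to be mounted on /proc/xen, the kernel to expose /sys/hypervisor, and xenstored to use the socket path /var/run/xenstored/socket (on a typical install xenstored is started by the xencommons init script; both that name and the socket path may differ on your rootfs). The function name and the `root` parameter are hypothetical.

```python
from pathlib import Path


def xen_dom0_status(root="/"):
    """Collect a few hints about Xen support from procfs/sysfs under root."""
    root = Path(root)

    def read(rel):
        # Missing or unreadable files simply count as "no information".
        try:
            return (root / rel).read_text().strip()
        except OSError:
            return None

    hv_type = read("sys/hypervisor/type")
    caps = read("proc/xen/capabilities")
    return {
        "running_on_xen": hv_type == "xen",
        "is_control_domain": caps is not None and "control_d" in caps,
        # Assumed socket location; adjust if your tools use another run dir.
        "xenstored_socket": (root / "var/run/xenstored/socket").exists(),
    }
```

Calling `xen_dom0_status()` in dom0 and printing the result shows at a glance whether the kernel sees Xen and whether xenstored appears to be running; a missing socket would be consistent with xl hanging.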

> Type the xl list or xl create command via ssh console, the command not responding until type ctrl+c.
> Type xl info command succeed, and show some info in the console.

Cheers,

--
Julien Grall
Re: [Bug] Bring up Dom0 on Arm board
Hi Julien Grall,
Thank you for your reply.
> (+ Bertrand and Stefano)
>
> Hello,
>
> On 06/01/2023 06:41, ??? wrote:
>> I try to run Xen on a Rockchip RK3588 board and encountered some problems.
>> The command I used:
>> load mmc 1:1 0xC400000 dom0-Image;
>> load mmc 1:1 0x47C00000 xen4.14.5;
>
> We have made a lot of improvement since Xen 4.14. This is also out of
> support since January 2022. It is still security supported but not for
> long (July 2023).
>
> Would you be able to try Xen 4.17 (this was released a month ago)?
I also tried Xen 4.17.0, but the xl command failed in dom0 with the errors below:
root@RK3588:~# xl list
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
libxl: error: libxl_domain.c:334:libxl_list_domain: getting domain info list: Permission denied
libxl_list_domain failed.
In the rootfs, the Xen tools version is 4.14.3.
I suspect that a version mismatch between the Xen tools and the Xen hypervisor causes this problem. Is that right?
Also, even with Xen 4.17.0, the problems I mentioned are still there:
the "Device tree generation failed" error remains, and mali0 and mmcblk2 are still missing.
>> load mmc 1:1 0x47E00000 rk3588-evb7-lp4-v10-linux.dtb
>> fdt addr 0x47E00000
>> fdt resize 1024
>> fdt set /chosen \#address-cells <0x2>
>> fdt set /chosen \#size-cells <0x2>
>> fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
>> fdt mknod /chosen dom0
>> fdt set /chosen/dom0 compatible "xen,linux-zimage" "xen,multiboot-module" "multiboot,module"
>> fdt set /chosen/dom0 reg <0x0 0xC400000 0x0 0x2000000>
>> fdt set /chosen xen,dom0-bootargs "console=hvc0 earlycon=xen earlyprintk=xen clk_ignore_unused root=/dev/mmcblk0p6 rw rootwait"
>> setenv fdt_high 0xffffffffffffffff
>> booti 0x47C00000 - 0x47E00000
>> 1. Device tree generation failed errors.
>> when I used the default dtb to run xen, Painc occured on xen.
>> log:
>> (XEN) Unable to get irq 0 for /pcie@fe180000/legacy-interrupt-controller
>> (XEN) Device tree generation failed (-1).
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 0:
>> (XEN) Could not set up DOM0 guest OS
>> (XEN) ****************************************
>> (XEN)
>> (XEN) Reboot in five seconds...
>> the dtb:
>> pcie2x1l1_intc: legacy-interrupt-controller {
>> interrupt-controller;
>> #address-cells = <0>;
>> #interrupt-cells = <1>;
>> interrupt-parent = <&gic>;
>> interrupts = <GIC_SPI 245 IRQ_TYPE_EDGE_RISING>;
>> };
>> I modified the legacy-interrupt-controller of interrupts from
>> IRQ_TYPE_EDGE_RISING to IRQ_TYPE_LEVEL_HIGH.
>
> Based on this change, I would say the call to irq_set_spi_type() (called
> from platform_get_irq()) will return -1. The function will validate the
> type and will throw an error if there is a problem.
>
> Can you confirm whether the interrupt is shared with another device? Is
> it described twice in the DT?
>
> If yes to one of the two questions. Is the type different?
>
> You could also print the old and new type in irq_set_spi_type() to
> confirm the difference.
It may be caused by interrupt-controller@fe600000:
the interrupt is first set to IRQ_TYPE_LEVEL_HIGH according to interrupt-controller@fe600000,
then irq_set_spi_type() tries to set it to IRQ_TYPE_EDGE_RISING according to
pcie2x1l1_intc: legacy-interrupt-controller, and returns -1.
The gic: interrupt-controller@fe600000 node is as below:
gic: interrupt-controller@fe600000 {
compatible = "arm,gic-v3";
#interrupt-cells = <3>;
#address-cells = <2>;
#size-cells = <2>;
ranges;
interrupt-controller;
reg = <0x0 0xfe600000 0 0x10000>, /* GICD */
<0x0 0xfe680000 0 0x100000>; /* GICR */
interrupts = <GIC_PPI 9 IRQ_TYPE_LEVEL_HIGH>;
its0: msi-controller@fe640000 {
compatible = "arm,gic-v3-its";
msi-controller;
#msi-cells = <1>;
reg = <0x0 0xfe640000 0x0 0x20000>;
};
its1: msi-controller@fe660000 {
compatible = "arm,gic-v3-its";
msi-controller;
#msi-cells = <1>;
reg = <0x0 0xfe660000 0x0 0x20000>;
};
};
>> And bring up Xen successed, through I not sure the modification is correct.
>> 2. After boot up, I tried to input in the console but failed.
>> I added some log in api do_trap_guest_sync, try_handle_mmio as below:
>> In function do_trap_guest_sync:
>> static unsigned long ec = 0;
>> if(hsr.ec != ec)
>> {
>> gprintk(XENLOG_INFO, "do_trap_guest_sync hsr.ec=%x \n", hsr.ec);
>> ec = hsr.ec;
>> }
>> In function try_handle_mmio:
>> gprintk(XENLOG_INFO, "handler->addr: %lx\n", handler->addr);
>> Then everytime I type enter in the console, console show the log below:
>> (XEN) d0v0 do_trap_guest_sync hsr.ec=24
>> (XEN) d0v0 handler->addr: fe600000
>> (XEN) d0v0 handler->addr: fe600000
>> (XEN) d0v0 do_trap_guest_sync hsr.ec=18
>> Is that something wrong with the GIC interrupt ?
> A few questions:
> * What is the corresponding device in the host physical address space
> for 0xfe600000?
> * What is the UART on your board? Is there any specific workaround
> required?
0xfe600000 is the gic: interrupt-controller@fe600000 node; its full content is above.
The UART is an 8250. I configured it in menuconfig under Debugging Options, as below:
[*] Early printk (Early printk via 8250 UART) --->
(0Xfeb50000) Early printk, physical base address of debug UART
(2) Early printk, left-shift to apply to the register offsets within the 8250 UART
I found that if I configure early printk in Xen, I no longer need
"console=hvc0 earlycon=xen earlyprintk=xen" in xen,dom0-bootargs. Is that right?
>> 3. In Dom0, the dev mali0 and mmcblk2 is missing, and weston running failed.
> Do you have any log in the kernel indicating why the mali and/or the mmc
> driver didn't load?
>
> Also, can you confirm that the same kernel image works without Xen?
When booting without Xen, the mali0 log is as below:
root@RK3588:/# dmesg | grep mali
[ 4.192093] mali fb000000.gpu: Kernel DDK version g12p0-01eac0
[ 4.192148] mali fb000000.gpu: Looking up mali-supply from device tree
[ 4.194569] mali fb000000.gpu: Looking up mem-supply from device tree
[ 4.194747] mali fb000000.gpu: Looking up mali-supply from device tree
[ 4.194792] mali fb000000.gpu: Looking up mem-supply from device tree
[ 4.195383] mali fb000000.gpu: leakage=16
[ 4.195457] mali fb000000.gpu: Looking up mali-supply from device tree
[ 4.197004] mali fb000000.gpu: pvtm=858
[ 4.197099] mali fb000000.gpu: pvtm-volt-sel=2
[ 4.198437] mali fb000000.gpu: avs=0
[ 4.201271] W : [File] : drivers/gpu/arm/bifrost/platform/rk/mali_kbase_config_rk.c; [Line] : 136; [Func] :
kbase_platform_rk_init(); power-off-delay-ms not available.
[ 4.206668] mali fb000000.gpu: GPU hardware issue table may need updating:
[ 4.206683] mali fb000000.gpu: GPU identified as 0x7 arch 10.8.6 r0p0 status 0
[ 4.206810] mali fb000000.gpu: No priority control manager is configured
[ 4.206823] mali fb000000.gpu: No memory group manager is configured
[ 4.206852] mali fb000000.gpu: Protected memory allocator not available
[ 4.208342] mali fb000000.gpu: Couldn't find power_model DT node matching 'arm,mali-simple-power-model'
[ 4.208356] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.static-coefficient = 1*[0]
[ 4.208572] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.dynamic-coefficient = 1*[0]
[ 4.208766] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.ts = 4*[0]
[ 4.208958] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.thermal-zone = ''
[ 4.212287] mali fb000000.gpu: Using configured power model mali-lodx-power-model, and fallback mali-simple-power-model
[ 4.212539] mali fb000000.gpu: l=10000 h=85000 hyst=5000 l_limit=0 h_limit=800000000 h_table=0
[ 4.214528] mali fb000000.gpu: Probed as mali0
[ 4.318492] I : [File] : drivers/gpu/arm/mali400/mali/linux/mali_kernel_linux.c; [Line] : 405; [Func] : mali_module_init();
svn_rev_string_from_arm of this mali_ko is '', rk_ko_ver is '5', built at '10:04:19', on 'Dec 12 2022'.
[ 6.959913] mali fb000000.gpu: Loading Mali firmware 0x1010000
[ 6.960491] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
[ 6.960498] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
[ 6.960503] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
When booting with Xen, the mali0 log is as below:
[ 2.969638] I : [File] : drivers/gpu/arm/mali400/mali/linux/mali_kernel_linux.c; [Line] : 405; [Func] : mali_module_init();
svn_rev_string_from_arm of this mali_ko is '', rk_ko_ver is '5', built at '14:06:00', on 'Dec 16 2022'.
When booting without Xen, the mmcblk2 log is as below:
root@RK3588:/# dmesg |grep sdmmc
root@RK3588:/# dmesg |grep mmc
[ 1.842460] Kernel command line: storagemedia=emmc androidboot.storagemedia=emmc androidboot.mode=normal
androidboot.verifiedbootstate=orange rw rootwait earlycon=uart8250,mmio32,0xfeb50000
console=ttyFIQ0 irqchip.gicv3_pseudo_nmi=0 root=PARTUUID=614e0000-0000
[ 3.981216] dwmmc_rockchip fe2c0000.mmc: IDMAC supports 32-bit address mode.
[ 3.981321] dwmmc_rockchip fe2c0000.mmc: Using internal DMA controller.
[ 3.981349] dwmmc_rockchip fe2c0000.mmc: Version ID is 270a
[ 3.981435] dwmmc_rockchip fe2c0000.mmc: DW MMC controller at irq 77,32 bit host data width,256 deep fifo
[ 3.981588] dwmmc_rockchip fe2c0000.mmc: Looking up vmmc-supply from device tree
[ 3.982932] dwmmc_rockchip fe2c0000.mmc: Looking up vqmmc-supply from device tree
[ 3.983121] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply from device tree
[ 3.983135] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply property in node /mmc@fe2e0000 failed
[ 3.983168] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply from device tree
[ 3.983177] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply property in node /mmc@fe2e0000 failed
[ 3.983294] dwmmc_rockchip fe2c0000.mmc: Failed getting OCR mask: -22
[ 3.983461] dwmmc_rockchip fe2c0000.mmc: could not set regulator OCR (-22)
[ 3.983473] dwmmc_rockchip fe2c0000.mmc: failed to enable vmmc regulator
[ 3.995539] mmc_host mmc2: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
[ 4.012246] mmc0: SDHCI controller on fe2e0000.mmc [fe2e0000.mmc] using ADMA
[ 4.043129] mmc_host mmc2: Bus speed (slot 0) = 49500000Hz (slot req 50000000Hz, actual 49500000HZ div = 0)
[ 4.043689] mmc2: new high speed SDHC card at address 0007
[ 4.044681] mmcblk2: mmc2:0007 SD8GB 7.21 GiB
[ 4.047294] mmcblk2: p1
[ 4.060614] mmc0: new HS400 Enhanced strobe MMC card at address 0001
[ 4.061406] mmcblk0: mmc0:0001 BJTD4R 29.1 GiB
[ 4.061539] mmcblk0boot0: mmc0:0001 BJTD4R partition 1 4.00 MiB
[ 4.061663] mmcblk0boot1: mmc0:0001 BJTD4R partition 2 4.00 MiB
[ 4.062273] mmcblk0rpmb: mmc0:0001 BJTD4R partition 3 4.00 MiB, chardev (236:0)
[ 4.068960] mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8
[ 5.835901] EXT4-fs (mmcblk0p6): recovery complete
[ 5.836462] EXT4-fs (mmcblk0p6): mounted filesystem with ordered data mode. Opts: (null)
[ 5.839971] storagemedia=emmc
[ 5.867859] EXT4-fs (mmcblk0p6): re-mounted. Opts: (null)
[ 6.409008] FAT-fs (mmcblk2p1): utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
[ 6.414043] FAT-fs (mmcblk2p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[ 7.039362] EXT4-fs (mmcblk0p7): mounting ext2 file system using the ext4 subsystem
[ 7.040313] EXT4-fs (mmcblk0p7): warning: mounting unchecked fs, running e2fsck is recommended
[ 7.041225] EXT4-fs (mmcblk0p7): mounted filesystem without journal. Opts: (null)
[ 7.162903] EXT4-fs (mmcblk0p8): mounting ext2 file system using the ext4 subsystem
[ 7.165824] EXT4-fs (mmcblk0p8): warning: mounting unchecked fs, running e2fsck is recommended
[ 7.172777] EXT4-fs (mmcblk0p8): mounted filesystem without journal. Opts: (null)
When booting with Xen, the mmcblk2 (sdmmc) log is as below:
root@RK3588:/sys/firmware# dmesg |grep sdmmc
[ 69.563072] rockchip-pm-domain fd8d8000.power-management:power-controller:
Looking up sdmmc-supply from device tree
[ 69.563112] rockchip-pm-domain fd8d8000.power-management:power-controller:
Looking up sdmmc-supply property in node /power-management@fd8d8000/power-controller failed
>> While I can't input in the console, I tried use console via ssh.
>> In the /dev list, I can't find mali0 and mmcblk2(sdcard),
>> In u-boot mode, mmcblk2 can be recognized, I loaded dom0-Image, xen, and dtb from mmcblk2.
>> While booting without xen, the mali0 and mmcblk2 can be recognized,
>> Is that something wrong with xen while Initialize the driver?
>> 4. xl command can not executed, and seems to be suspended.
>
> xl requires the initscript (or systemd service) to be executed. The fact
> it hangs usually means this didn't happen.
>
> Just in case, can you also check that your kernel has been build with
> Xen support?
Is the initscript xendriverdomain? I tried the following in dom0:
(the xl list command had hung)
root@RK3588:/# ./etc/init.d/xendriverdomain restart
root@RK3588:/#
root@RK3588:/# ps aux |grep xen
root 59 0.0 0.0 0 0 ? S 00:00 0:00 [xenbus]
root 60 0.0 0.0 0 0 ? S 00:00 0:00 [xenwatch]
root 165 0.0 0.0 0 0 ? D 00:00 0:00 [xenbus_probe]
root 5993 0.0 0.0 3044 380 pts/0 S+ 00:09 0:00 grep xen
root@RK3588:/# xl list
Name ID Mem VCPUs State Time(s)
I configured the kernel according to the manual:
https://wiki.xenproject.org/wiki/Mainline_Linux_Kernel_Configs#Configuring_the_Kernel_for_dom0_Support
and used kernel/arch/arm64/boot/Image as the dom0-Image.
How can I check that the kernel has been built with Xen support?
So far I have only checked the kernel by the modification time of the Image file.
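One possible way to check this, sketched in Python, is to scan the kernel configuration for Xen-related options. This is only an illustration: the list of options checked is a small assumed subset rather than a complete dom0 requirement list, the function names are hypothetical, and reading /proc/config.gz only works if the kernel was built with CONFIG_IKCONFIG_PROC (otherwise pass the text of the .config file from the build tree).

```python
import gzip

# Assumed minimal subset of options to look for; not an exhaustive list.
WANTED = ("CONFIG_XEN", "CONFIG_XEN_DOM0", "CONFIG_HVC_XEN", "CONFIG_XEN_PRIVCMD")


def parse_config(text):
    """Map option name -> value for every 'CONFIG_FOO=value' line."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def missing_xen_options(text, wanted=WANTED):
    """Return the wanted options that are neither built in (y) nor modular (m)."""
    options = parse_config(text)
    return [name for name in wanted if options.get(name) not in ("y", "m")]


def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config; needs CONFIG_IKCONFIG_PROC."""
    with gzip.open(path, "rt") as handle:
        return handle.read()
```

For example, `missing_xen_options(read_running_config())` run in dom0 returns an empty list when all of the listed options are enabled.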
>> Type the xl list or xl create command via ssh console, the command not responding until type ctrl+c.
>> Type xl info command succeed, and show some info in the console.
>
> Cheers,
>
> --
> Julien Grall
Best regards
Cailigang
Re: [Bug] Bring up Dom0 on Arm board
Hi,

On 10/01/2023 02:35, Cailigang wrote:
>> On 06/01/2023 06:41, ??? wrote:
>>> I try to run Xen on a Rockchip RK3588 board and encountered some problems.
>>> The command I used:
>>> load mmc 1:1 0xC400000 dom0-Image;
>>> load mmc 1:1 0x47C00000 xen4.14.5;
>>
>> We have made a lot of improvement since Xen 4.14. This is also out of
>> support since January 2022. It is still security supported but not for
>> long (July 2023).
>>
>> Would you be able to try Xen 4.17 (this was released a month ago)?
> I also tried the Xen4.17.0, But failed to run xl command in dom0, the error like below:
> root@RK3588:~# xl list
> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
> libxl: error: libxl_domain.c:334:libxl_list_domain: getting domain info list: Permission denied
> libxl_list_domain failed.
> In Rootfs, Xen tool version is 4.14.3,
> I suspect that Xen tool and Xen hypervisor version conflict cause this problem, is that right?

Part of the ABI used between the tools and the hypervisor is not stable.
So you will need to rebuild the tools for every new major release (for
minor releases it is usually not necessary).
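
As a small illustration of that rule, the Python sketch below compares the release series of the tools and the hypervisor. The helper names are hypothetical, and the sketch assumes version strings of the form "4.14.3", where the first two components identify the major release.

```python
def release_series(version):
    """Return the release series of a Xen version string, e.g. '4.14.3' -> (4, 14)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def tools_match_hypervisor(tools_version, hypervisor_version):
    """True when both versions belong to the same major release (e.g. 4.17.x)."""
    return release_series(tools_version) == release_series(hypervisor_version)


# Versions mentioned in this thread: 4.14.3 tools against a 4.17.0 hypervisor.
mismatch = tools_match_hypervisor("4.14.3", "4.17.0")
# Same series, different point release: usually fine without a rebuild.
same_series = tools_match_hypervisor("4.14.3", "4.14.5")
```

Here `mismatch` is False and `same_series` is True, which matches the failure seen with the 4.14.3 tools on Xen 4.17.0.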

> And although I used Xen4.17.0, The problems I mentioned are still there,
> The Device tree generation failed error, the dev mali0 and mmcblk2 still failed to run.

I will reply to this below.

>>> load mmc 1:1 0x47E00000 rk3588-evb7-lp4-v10-linux.dtb
>>> fdt addr 0x47E00000
>>> fdt resize 1024
>>> fdt set /chosen \#address-cells <0x2>
>>> fdt set /chosen \#size-cells <0x2>
>>> fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
>>> fdt mknod /chosen dom0
>>> fdt set /chosen/dom0 compatible "xen,linux-zimage" "xen,multiboot-module" "multiboot,module"
>>> fdt set /chosen/dom0 reg <0x0 0xC400000 0x0 0x2000000>
>>> fdt set /chosen xen,dom0-bootargs "console=hvc0 earlycon=xen earlyprintk=xen clk_ignore_unused root=/dev/mmcblk0p6 rw rootwait"
>>> setenv fdt_high 0xffffffffffffffff
>>> booti 0x47C00000 - 0x47E00000
>>> 1. Device tree generation failed errors.
>>> when I used the default dtb to run xen, Painc occured on xen.
>>> log:
>>> (XEN) Unable to get irq 0 for /pcie@fe180000/legacy-interrupt-controller
>>> (XEN) Device tree generation failed (-1).
>>> (XEN)
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 0:
>>> (XEN) Could not set up DOM0 guest OS
>>> (XEN) ****************************************
>>> (XEN)
>>> (XEN) Reboot in five seconds...
>>> the dtb:
>>> pcie2x1l1_intc: legacy-interrupt-controller {
>>> interrupt-controller;
>>> #address-cells = <0>;
>>> #interrupt-cells = <1>;
>>> interrupt-parent = <&gic>;
>>> interrupts = <GIC_SPI 245 IRQ_TYPE_EDGE_RISING>;
>>> };
>>> I modified the legacy-interrupt-controller of interrupts from
>>> IRQ_TYPE_EDGE_RISING to IRQ_TYPE_LEVEL_HIGH.
>>
>> Based on this change, I would say the call to irq_set_spi_type() (called
>> from platform_get_irq()) will return -1. The function will validate the
>> type and will throw an error if there is a problem.
>>
>> Can you confirm whether the interrupt is shared with another device? Is
>> it described twice in the DT?
>>
>> If yes to one of the two questions. Is the type different?
>>
>> You could also print the old and new type in irq_set_spi_type() to
>> confirm the difference.
> It may cause by the interrupt interrupt-controller@fe600000,
> set the interrupt IRQ_TYPE_LEVEL_HIGH first according to interrupt-controller@fe600000 ,
> then irq_set_spi_type() try to set the interrupt IRQ_TYPE_EDGE_RISING according to
> pcie2x1l1_intc: legacy-interrupt-controller, but return -1.
> the gic: interrupt-controller@fe600000 like below:
> gic: interrupt-controller@fe600000 {
> compatible = "arm,gic-v3";
> #interrupt-cells = <3>;
> #address-cells = <2>;
> #size-cells = <2>;
> ranges;
> interrupt-controller;
> reg = <0x0 0xfe600000 0 0x10000>, /* GICD */
> <0x0 0xfe680000 0 0x100000>; /* GICR */
> interrupts = <GIC_PPI 9 IRQ_TYPE_LEVEL_HIGH>;
> its0: msi-controller@fe640000 {
> compatible = "arm,gic-v3-its";
> msi-controller;
> #msi-cells = <1>;
> reg = <0x0 0xfe640000 0x0 0x20000>;
> };
> its1: msi-controller@fe660000 {
> compatible = "arm,gic-v3-its";
> msi-controller;
> #msi-cells = <1>;
> reg = <0x0 0xfe660000 0x0 0x20000>;
> };
> };

I am a bit confused. Reading the binding, it looks like the GIC and PCI
interrupt controller don't share an interrupt. Can you confirm the IRQ
number you saw in Xen?

>>> And bring up Xen successed, through I not sure the modification is correct.
>>> 2. After boot up, I tried to input in the console but failed.
>>> I added some log in api do_trap_guest_sync, try_handle_mmio as below:
>>> In function do_trap_guest_sync:
>>> static unsigned long ec = 0;
>>> if(hsr.ec != ec)
>>> {
>>> gprintk(XENLOG_INFO, "do_trap_guest_sync hsr.ec=%x \n", hsr.ec);
>>> ec = hsr.ec;
>>> }
>>> In function try_handle_mmio:
>>> gprintk(XENLOG_INFO, "handler->addr: %lx\n", handler->addr);
>>> Then everytime I type enter in the console, console show the log below:
>>> (XEN) d0v0 do_trap_guest_sync hsr.ec=24
>>> (XEN) d0v0 handler->addr: fe600000
>>> (XEN) d0v0 handler->addr: fe600000
>>> (XEN) d0v0 do_trap_guest_sync hsr.ec=18
>>> Is that something wrong with the GIC interrupt ?
>> A few questions:
>> * What is the corresponding device in the host physical address space
>> for 0xfe600000?
>> * What is the UART on your board? Is there any specific workaround
>> required?
> 0xfe600000 is: gic: interrupt-controller@fe600000, full content above.

Thanks. So the trap is expected because the GICD exposed to the domains
is emulated.
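
For reference, the two exception classes in the log can be decoded with a few lines of Python. This is a sketch for reading the debug output, not Xen code: the table lists only a handful of EC values from the Arm ESR_ELx encoding, the helper names are hypothetical, and the GICD range is taken from the gic node quoted above.

```python
# Exception class (EC) values from the Arm architecture's ESR_ELx encoding.
EC_NAMES = {
    0x01: "trapped WFI/WFE",
    0x16: "HVC from AArch64",
    0x18: "trapped MSR/MRS/system instruction",
    0x20: "instruction abort from a lower EL",
    0x24: "data abort from a lower EL",
}

# GICD region from the gic node quoted above: reg = <0x0 0xfe600000 0 0x10000>
GICD_BASE = 0xFE600000
GICD_SIZE = 0x10000


def describe_ec(ec):
    """Translate an EC value into a short description."""
    return EC_NAMES.get(ec, f"unknown EC 0x{ec:02x}")


def in_gicd(addr):
    """True if addr falls inside the distributor (GICD) register range."""
    return GICD_BASE <= addr < GICD_BASE + GICD_SIZE


for ec_text in ("24", "18"):          # the log prints hsr.ec with %x, so parse as hex
    print(ec_text, "->", describe_ec(int(ec_text, 16)))
print(in_gicd(0xFE600000))
```

Read this way, ec=24 is a data abort taken from the guest on an access to the distributor range, which is what an emulated GICD produces, and ec=18 is a trapped system register access.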

> The UART is 8250, I set menuconfig in Debugging Options, the config like below:
> [*] Early printk (Early printk via 8250 UART) --->
> (0Xfeb50000) Early printk, physical base address of debug UART
> (2) Early printk, left-shift to apply to the register offsets within the 8250 UART
> I found that if I config the early printk in xen, I don't need the xen,dom0-bootargs=
> "console=hvc0 earlycon=xen earlyprintk=xen" anymore, is that right?

I don't know the exact configuration of the 8250. So I can't tell
whether this is correct.

That said, as you see some output, it would indicate that the
configuration might be right.

This could indicate that Xen is still using early printk and therefore
it would not be able to read characters. From your previous email, I see
that you are requesting serial2. I am assuming this is an alias to the
same UART as the one you configured for the early printk?

Can you paste the content of the related Device-Tree node? Also, I would
suggest checking if there are any errors in the Xen logs.

>>> 3. In Dom0, the dev mali0 and mmcblk2 is missing, and weston running failed.
>> Do you have any log in the kernel indicating why the mali and/or the mmc
>> driver didn't load?
>>
>> Also, can you confirm that the same kernel image works without Xen?
> Boot without xen, the mali0 log like below:
> root@RK3588:/# dmesg | grep mali
> [ 4.192093] mali fb000000.gpu: Kernel DDK version g12p0-01eac0
> [ 4.192148] mali fb000000.gpu: Looking up mali-supply from device tree
> [ 4.194569] mali fb000000.gpu: Looking up mem-supply from device tree
> [ 4.194747] mali fb000000.gpu: Looking up mali-supply from device tree
> [ 4.194792] mali fb000000.gpu: Looking up mem-supply from device tree
> [ 4.195383] mali fb000000.gpu: leakage=16
> [ 4.195457] mali fb000000.gpu: Looking up mali-supply from device tree
> [ 4.197004] mali fb000000.gpu: pvtm=858
> [ 4.197099] mali fb000000.gpu: pvtm-volt-sel=2
> [ 4.198437] mali fb000000.gpu: avs=0
> [ 4.201271] W : [File] : drivers/gpu/arm/bifrost/platform/rk/mali_kbase_config_rk.c; [Line] : 136; [Func] :
> kbase_platform_rk_init(); power-off-delay-ms not available.
> [ 4.206668] mali fb000000.gpu: GPU hardware issue table may need updating:
> [ 4.206683] mali fb000000.gpu: GPU identified as 0x7 arch 10.8.6 r0p0 status 0
> [ 4.206810] mali fb000000.gpu: No priority control manager is configured
> [ 4.206823] mali fb000000.gpu: No memory group manager is configured
> [ 4.206852] mali fb000000.gpu: Protected memory allocator not available
> [ 4.208342] mali fb000000.gpu: Couldn't find power_model DT node matching 'arm,mali-simple-power-model'
> [ 4.208356] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.static-coefficient = 1*[0]
> [ 4.208572] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.dynamic-coefficient = 1*[0]
> [ 4.208766] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.ts = 4*[0]
> [ 4.208958] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.thermal-zone = ''
> [ 4.212287] mali fb000000.gpu: Using configured power model mali-lodx-power-model, and fallback mali-simple-power-model
> [ 4.212539] mali fb000000.gpu: l=10000 h=85000 hyst=5000 l_limit=0 h_limit=800000000 h_table=0
> [ 4.214528] mali fb000000.gpu: Probed as mali0
> [ 4.318492] I : [File] : drivers/gpu/arm/mali400/mali/linux/mali_kernel_linux.c; [Line] : 405; [Func] : mali_module_init();
> svn_rev_string_from_arm of this mali_ko is '', rk_ko_ver is '5', built at '10:04:19', on 'Dec 12 2022'.
> [ 6.959913] mali fb000000.gpu: Loading Mali firmware 0x1010000
> [ 6.960491] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
> [ 6.960498] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
> [ 6.960503] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
> Boot with xen, the mali0 log like below:
> [ 2.969638] I : [File] : drivers/gpu/arm/mali400/mali/linux/mali_kernel_linux.c; [Line] : 405; [Func] : mali_module_init();
> svn_rev_string_from_arm of this mali_ko is '', rk_ko_ver is '5', built at '14:06:00', on 'Dec 16 2022'.
So no error at all afterwards? Interestingly, this line is not shown in
your output above. So I would suggest checking the code to understand if
somehow we are using a different path.

> Boot without xen, the mmcblk2 log like below:
> root@RK3588:/# dmesg |grep sdmmc
> root@RK3588:/# dmesg |grep mmc
> [ 1.842460] Kernel command line: storagemedia=emmc androidboot.storagemedia=emmc androidboot.mode=normal
> androidboot.verifiedbootstate=orange rw rootwait earlycon=uart8250,mmio32,0xfeb50000
> console=ttyFIQ0 irqchip.gicv3_pseudo_nmi=0 root=PARTUUID=614e0000-0000
> [ 3.981216] dwmmc_rockchip fe2c0000.mmc: IDMAC supports 32-bit address mode.
> [ 3.981321] dwmmc_rockchip fe2c0000.mmc: Using internal DMA controller.
> [ 3.981349] dwmmc_rockchip fe2c0000.mmc: Version ID is 270a
> [ 3.981435] dwmmc_rockchip fe2c0000.mmc: DW MMC controller at irq 77,32 bit host data width,256 deep fifo
> [ 3.981588] dwmmc_rockchip fe2c0000.mmc: Looking up vmmc-supply from device tree
> [ 3.982932] dwmmc_rockchip fe2c0000.mmc: Looking up vqmmc-supply from device tree
> [ 3.983121] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply from device tree
> [ 3.983135] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply property in node /mmc@fe2e0000 failed
> [ 3.983168] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply from device tree
> [ 3.983177] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply property in node /mmc@fe2e0000 failed
> [ 3.983294] dwmmc_rockchip fe2c0000.mmc: Failed getting OCR mask: -22
> [ 3.983461] dwmmc_rockchip fe2c0000.mmc: could not set regulator OCR (-22)
> [ 3.983473] dwmmc_rockchip fe2c0000.mmc: failed to enable vmmc regulator
> [ 3.995539] mmc_host mmc2: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
> [ 4.012246] mmc0: SDHCI controller on fe2e0000.mmc [fe2e0000.mmc] using ADMA
> [ 4.043129] mmc_host mmc2: Bus speed (slot 0) = 49500000Hz (slot req 50000000Hz, actual 49500000HZ div = 0)
> [ 4.043689] mmc2: new high speed SDHC card at address 0007
> [ 4.044681] mmcblk2: mmc2:0007 SD8GB 7.21 GiB
> [ 4.047294] mmcblk2: p1
> [ 4.060614] mmc0: new HS400 Enhanced strobe MMC card at address 0001
> [ 4.061406] mmcblk0: mmc0:0001 BJTD4R 29.1 GiB
> [ 4.061539] mmcblk0boot0: mmc0:0001 BJTD4R partition 1 4.00 MiB
> [ 4.061663] mmcblk0boot1: mmc0:0001 BJTD4R partition 2 4.00 MiB
> [ 4.062273] mmcblk0rpmb: mmc0:0001 BJTD4R partition 3 4.00 MiB, chardev (236:0)
> [ 4.068960] mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8
> [ 5.835901] EXT4-fs (mmcblk0p6): recovery complete
> [ 5.836462] EXT4-fs (mmcblk0p6): mounted filesystem with ordered data mode. Opts: (null)
> [ 5.839971] storagemedia=emmc
> [ 5.867859] EXT4-fs (mmcblk0p6): re-mounted. Opts: (null)
> [ 6.409008] FAT-fs (mmcblk2p1): utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
> [ 6.414043] FAT-fs (mmcblk2p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
> [ 7.039362] EXT4-fs (mmcblk0p7): mounting ext2 file system using the ext4 subsystem
> [ 7.040313] EXT4-fs (mmcblk0p7): warning: mounting unchecked fs, running e2fsck is recommended
> [ 7.041225] EXT4-fs (mmcblk0p7): mounted filesystem without journal. Opts: (null)
> [ 7.162903] EXT4-fs (mmcblk0p8): mounting ext2 file system using the ext4 subsystem
> [ 7.165824] EXT4-fs (mmcblk0p8): warning: mounting unchecked fs, running e2fsck is recommended
> [ 7.172777] EXT4-fs (mmcblk0p8): mounted filesystem without journal. Opts: (null)
> Boot with xen, the mmcblk2(sdmmc) log like below:
> root@RK3588:/sys/firmware# dmesg |grep sdmmc
> [ 69.563072] rockchip-pm-domain fd8d8000.power-management:power-controller:

It looks like the command line differs between Xen and baremetal.
When running under Xen, the command line should mostly be the same
(aside from clk_* and console=hvc0). Otherwise you are not comparing like
with like, and the difference may simply be due to your command line options.
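
One way to see how far apart two kernel command lines are is to split them into options and compare the sets. The following is a minimal sketch, not part of the original exchange; `cmdline_diff` is a hypothetical helper name, and the two strings are the bare-metal and dom0 command lines quoted in this thread.

```shell
#!/bin/sh
# Print the options that appear in only one of two kernel command lines.
# "<" marks options found only in the first line, ">" only in the second.
cmdline_diff() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    # One option per line, sorted, so that comm can compare the two sets.
    printf '%s\n' "$1" | tr -s ' ' '\n' | sort -u > "$a"
    printf '%s\n' "$2" | tr -s ' ' '\n' | sort -u > "$b"
    comm -23 "$a" "$b" | sed 's/^/< /'
    comm -13 "$a" "$b" | sed 's/^/> /'
    rm -f "$a" "$b"
}

# Command lines as quoted in this thread.
baremetal="storagemedia=emmc androidboot.storagemedia=emmc androidboot.mode=normal androidboot.verifiedbootstate=orange rw rootwait earlycon=uart8250,mmio32,0xfeb50000 console=ttyFIQ0 irqchip.gicv3_pseudo_nmi=0 root=PARTUUID=614e0000-0000"
dom0="console=hvc0 earlycon=xen earlyprintk=xen clk_ignore_unused root=/dev/mmcblk0p6 rw rootwait"

cmdline_diff "$baremetal" "$dom0"
```

Any option reported beyond the console and clk_* settings is a candidate explanation for a behavioural difference that has nothing to do with Xen itself.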

> Looking up sdmmc-supply from device tree
> [ 69.563112] rockchip-pm-domain fd8d8000.power-management:power-controller:
> Looking up sdmmc-supply property in node /power-management@fd8d8000/power-controller failed

Can you check why this is failing?

>>> While I can't input in the console, I tried use console via ssh.
>>> In the /dev list, I can't find mali0 and mmcblk2(sdcard),
>>> In u-boot mode, mmcblk2 can be recognized, I loaded dom0-Image, xen, and dtb from mmcblk2.
>>> While booting without xen, the mali0 and mmcblk2 can be recognized,
>>> Is that something wrong with xen while Initialize the driver?
>>> 4. xl command can not executed, and seems to be suspended.
>>
>> xl requires the initscript (or systemd service) to be executed. The fact
>> it hangs usually means this didn't happen.
>>
>> Just in case, can you also check that your kernel has been build with
>> Xen support?
> initscript is xendriverdomain? I tried the command in dom0 like below:
> (xl list command suspended)
> root@RK3588:/# ./etc/init.d/xendriverdomain restart

You would want to use xencommons rather than xendriverdomain.

> root@RK3588:/#
> root@RK3588:/# ps aux |grep xen
> root 59 0.0 0.0 0 0 ? S 00:00 0:00 [xenbus]
> root 60 0.0 0.0 0 0 ? S 00:00 0:00 [xenwatch]
> root 165 0.0 0.0 0 0 ? D 00:00 0:00 [xenbus_probe]
> root 5993 0.0 0.0 3044 380 pts/0 S+ 00:09 0:00 grep xen
> root@RK3588:/# xl list
> Name ID Mem VCPUs State Time(s)
> I config the kernel according the manuals:
> https://wiki.xenproject.org/wiki/Mainline_Linux_Kernel_Configs#Configuring_the_Kernel_for_dom0_Support
> And used the kernel/arch/arm64/boot/Image as the dom0-Image.
> How can I check the Kernel has been build with Xen support?

You can grep for XEN in your kernel config. You should see some options enabled.

But looking at the output above, you don't have xenstored running. So
the most probable cause is that you didn't run xencommons.
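
The two checks above can be sketched as shell commands. This is only an illustration: it assumes the running kernel exposes its configuration at /proc/config.gz (otherwise point the function at the .config from the build tree), and the function names `xen_options` and `xenstored_running` are made up for this example.

```shell
#!/bin/sh
# List Xen-related options that are enabled (=y or =m) in a kernel config.
# Accepts a plain .config or a gzip-compressed file such as /proc/config.gz.
xen_options() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | grep -E '^CONFIG_XEN[A-Z0-9_]*=(y|m)'
}

# Succeed if a xenstored process shows up in the process list.
# The [x] keeps grep from matching its own command line.
xenstored_running() {
    ps aux | grep -q '[x]enstored'
}

if [ -r /proc/config.gz ]; then
    xen_options /proc/config.gz || echo "no Xen options enabled"
fi

if xenstored_running; then
    echo "xenstored is running"
else
    echo "xenstored is not running"
fi
```

In the `ps aux` output quoted above only the kernel threads [xenbus], [xenwatch] and [xenbus_probe] appear, which is why the second check would report that xenstored is not running.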

Cheers,

--
Julien Grall
Re: [Bug] Bring up Dom0 on Arm board [ In reply to ]
Hi,
> On 10/01/2023 02:35, ??? wrote:
>>> On 06/01/2023 06:41, ??? wrote:
>>>> I try to run Xen on a Rockchip RK3588 board and encountered some problems.
>>>> The command I used:
>>>> load mmc 1:1 0xC400000 dom0-Image;
>>>> load mmc 1:1 0x47C00000 xen4.14.5;
>>>
>>> We have made a lot of improvement since Xen 4.14. This is also out of
>>> support since January 2022. It is still security supported but not for
>>> long (July 2023).
>>>
>>> Would you be able to try Xen 4.17 (this was released a month ago)?
>> I also tried the Xen4.17.0, But failed to run xl command in dom0, the error like below:
>> root@RK3588:~# xl list
>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
>> libxl: error: libxl_domain.c:334:libxl_list_domain: getting domain info list: Permission denied
>> libxl_list_domain failed.
>> In Rootfs, Xen tool version is 4.14.3,
>> I suspect that Xen tool and Xen hypervisor version conflict cause this problem, is that right?
>
> Part of the ABI used between the tools and the hypervisor is not stable.
> So you will need to rebuild the tools for every new major releases (for
> minor releases it is usually not necessary).
>
>> And although I used Xen4.17.0, The problems I mentioned are still there,
>> The Device tree generation failed error, the dev mali0 and mmcblk2 still failed to run.
>
> I will reply to this below.
>
>>>> load mmc 1:1 0x47E00000 rk3588-evb7-lp4-v10-linux.dtb
>>>> fdt addr 0x47E00000
>>>> fdt resize 1024
>>>> fdt set /chosen \#address-cells <0x2>
>>>> fdt set /chosen \#size-cells <0x2>
>>>> fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
>>>> fdt mknod /chosen dom0
>>>> fdt set /chosen/dom0 compatible "xen,linux-zimage" "xen,multiboot-module" "multiboot,module"
>>>> fdt set /chosen/dom0 reg <0x0 0xC400000 0x0 0x2000000>
>>>> fdt set /chosen xen,dom0-bootargs "console=hvc0 earlycon=xen earlyprintk=xen clk_ignore_unused root=/dev/mmcblk0p6 rw rootwait"
>>>> setenv fdt_high 0xffffffffffffffff
>>>> booti 0x47C00000 - 0x47E00000
>>>> 1. Device tree generation failed errors.
>>>> when I used the default dtb to run xen, Painc occured on xen.
>>>> log:
>>>> (XEN) Unable to get irq 0 for /pcie@fe180000/legacy-interrupt-controller
>>>> (XEN) Device tree generation failed (-1).
>>>> (XEN)
>>>> (XEN) ****************************************
>>>> (XEN) Panic on CPU 0:
>>>> (XEN) Could not set up DOM0 guest OS
>>>> (XEN) ****************************************
>>>> (XEN)
>>>> (XEN) Reboot in five seconds...
>>>> the dtb:
>>>> pcie2x1l1_intc: legacy-interrupt-controller {
>>>> interrupt-controller;
>>>> #address-cells = <0>;
>>>> #interrupt-cells = <1>;
>>>> interrupt-parent = <&gic>;
>>>> interrupts = <GIC_SPI 245 IRQ_TYPE_EDGE_RISING>;
>>>> };
>>>> I modified the legacy-interrupt-controller of interrupts from
>>>> IRQ_TYPE_EDGE_RISING to IRQ_TYPE_LEVEL_HIGH.
>>>
>>> Based on this change, I would say the call to irq_set_spi_type() (called
>>> from platform_get_irq()) will return -1. The function will validate the
>>> type and will throw an error if there is a problem.
>>>
>>> Can you confirm whether the interrupt is shared with another device? Is
>>> it described twice in the DT?
>>>
>>> If yes to one of the two questions. Is the type different?
>>>
>>> You could also print the old and new type in irq_set_spi_type() to
>>> confirm the difference.
>> It may cause by the interrupt interrupt-controller@fe600000,
>> set the interrupt IRQ_TYPE_LEVEL_HIGH first according to interrupt-controller@fe600000 ,
>> then irq_set_spi_type() try to set the interrupt IRQ_TYPE_EDGE_RISING according to
>> pcie2x1l1_intc: legacy-interrupt-controller, but return -1.
>> the gic: interrupt-controller@fe600000 like below:
>> gic: interrupt-controller@fe600000 {
>> compatible = "arm,gic-v3";
>> #interrupt-cells = <3>;
>> #address-cells = <2>;
>> #size-cells = <2>;
>> ranges;
>> interrupt-controller;
>> reg = <0x0 0xfe600000 0 0x10000>, /* GICD */
>> <0x0 0xfe680000 0 0x100000>; /* GICR */
>> interrupts = <GIC_PPI 9 IRQ_TYPE_LEVEL_HIGH>;
>> its0: msi-controller@fe640000 {
>> compatible = "arm,gic-v3-its";
>> msi-controller;
>> #msi-cells = <1>;
>> reg = <0x0 0xfe640000 0x0 0x20000>;
>> };
>> its1: msi-controller@fe660000 {
>> compatible = "arm,gic-v3-its";
>> msi-controller;
>> #msi-cells = <1>;
>> reg = <0x0 0xfe660000 0x0 0x20000>;
>> };
>> };
>
> I am a bit confused. Reading the binding, it looks like the GIC and PCI
> interrupt controller don't share an interrupt. Can you confirm the IRQ
> number you saw in Xen?
My mistake, my analysis was wrong.
I added a log statement in platform_get_irq() and got the following output:
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 280, irq.type: 4
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 279, irq.type: 4
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 278, irq.type: 4
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 277, irq.type: 4
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 276, irq.type: 4
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 280, irq.type: 4
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 279, irq.type: 4
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 278, irq.type: 4
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 277, irq.type: 4
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 276, irq.type: 4
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000/legacy-interrupt-controller, irq: 277, irq.type: 1
(XEN) d[IDLE]v0 -----fullname: /pcie@fe180000/legacy-interrupt-controller, irq: 277, irq.type: 1
(XEN) Unable to get irq 0 for /pcie@fe180000/legacy-interrupt-controller
(XEN) Device tree generation failed (-1).
The Node pcie@fe180000:
pcie2x1l1: pcie@fe180000 {
    compatible = "rockchip,rk3588-pcie", "snps,dw-pcie";
    #address-cells = <3>;
    #size-cells = <2>;
    bus-range = <0x30 0x3f>;
    clocks = <&cru ACLK_PCIE_1L1_MSTR>, <&cru ACLK_PCIE_1L1_SLV>,
             <&cru ACLK_PCIE_1L1_DBI>, <&cru PCLK_PCIE_1L1>,
             <&cru CLK_PCIE_AUX3>, <&cru CLK_PCIE1L1_PIPE>;
    clock-names = "aclk_mst", "aclk_slv",
                  "aclk_dbi", "pclk",
                  "aux", "pipe";
    device_type = "pci";
    interrupts = <GIC_SPI 248 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 247 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 246 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 245 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 244 IRQ_TYPE_LEVEL_HIGH>;
    interrupt-names = "sys", "pmc", "msg", "legacy", "err";
    #interrupt-cells = <1>;
    interrupt-map-mask = <0 0 0 7>;
    interrupt-map = <0 0 0 1 &pcie2x1l1_intc 0>,
                    <0 0 0 2 &pcie2x1l1_intc 1>,
                    <0 0 0 3 &pcie2x1l1_intc 2>,
                    <0 0 0 4 &pcie2x1l1_intc 3>;
    linux,pci-domain = <3>;
    num-ib-windows = <8>;
    num-ob-windows = <8>;
    num-viewport = <4>;
    max-link-speed = <2>;
    msi-map = <0x3000 &its0 0x3000 0x1000>;
    num-lanes = <1>;
    phys = <&combphy2_psu PHY_TYPE_PCIE>;
    phy-names = "pcie-phy";
    ranges = <0x00000800 0x0 0xf3000000 0x0 0xf3000000 0x0 0x100000
              0x81000000 0x0 0xf3100000 0x0 0xf3100000 0x0 0x100000
              0x82000000 0x0 0xf3200000 0x0 0xf3200000 0x0 0xe00000
              0xc3000000 0x9 0xc0000000 0x9 0xc0000000 0x0 0x40000000>;
    reg = <0x0 0xfe180000 0x0 0x10000>,
          <0xa 0x40c00000 0x0 0x400000>;
    reg-names = "pcie-apb", "pcie-dbi";
    resets = <&cru SRST_PCIE3_POWER_UP>, <&cru SRST_P_PCIE3>;
    reset-names = "pcie", "periph";
    rockchip,pipe-grf = <&php_grf>;
    status = "disabled";
    pcie2x1l1_intc: legacy-interrupt-controller {
        interrupt-controller;
        #address-cells = <0>;
        #interrupt-cells = <1>;
        interrupt-parent = <&gic>;
        interrupts = <GIC_SPI 245 IRQ_TYPE_EDGE_RISING>;
    };
};
I did not find IRQ number 277 in node pcie@fe180000, but I did find it in node gpio@fd8a0000.
The Node gpio@fd8a0000:
gpio0: gpio@fd8a0000 {
    compatible = "rockchip,gpio-bank";
    reg = <0x0 0xfd8a0000 0x0 0x100>;
    interrupts = <GIC_SPI 277 IRQ_TYPE_LEVEL_HIGH>;
    clocks = <&cru PCLK_GPIO0>, <&cru DBCLK_GPIO0>;
    gpio-controller;
    #gpio-cells = <2>;
    gpio-ranges = <&pinctrl 0 0 32>;
    interrupt-controller;
    #interrupt-cells = <2>;
};
I'm confused, can you explain it?
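
One possible reading of the log above, offered as a sketch rather than a confirmed diagnosis: the numbers printed by the added log look like GIC interrupt IDs, not the SPI indices written in the device tree. In the GIC architecture, SPIs start at interrupt ID 32, so `GIC_SPI n` corresponds to interrupt ID n + 32. The helper name `spi_to_irq` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: convert a device-tree GIC_SPI index to a GIC interrupt ID.
# SPIs start at interrupt ID 32 in the GIC architecture.
spi_to_irq() {
    echo $(( $1 + 32 ))
}

# The five SPIs listed in pcie@fe180000, in device-tree order.
for spi in 248 247 246 245 244; do
    echo "GIC_SPI $spi -> irq $(spi_to_irq "$spi")"
done
```

Under this reading the loop reproduces the sequence 280, 279, 278, 277, 276 seen in the log, and irq 277 is GIC_SPI 245. That SPI is requested with type 4 (IRQ_TYPE_LEVEL_HIGH) by pcie@fe180000 and with type 1 (IRQ_TYPE_EDGE_RISING) by legacy-interrupt-controller, so the same interrupt is described twice with different types. The `GIC_SPI 277` in gpio@fd8a0000 would then be irq 309 and unrelated.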
>>>> And bring up Xen successed, through I not sure the modification is correct.
>>>> 2. After boot up, I tried to input in the console but failed.
>>>> I added some log in api do_trap_guest_sync, try_handle_mmio as below:
>>>> In function do_trap_guest_sync:
>>>> static unsigned long ec = 0;
>>>> if(hsr.ec != ec)
>>>> {
>>>> gprintk(XENLOG_INFO, "do_trap_guest_sync hsr.ec=%x \n", hsr.ec);
>>>> ec = hsr.ec;
>>>> }
>>>> In function try_handle_mmio:
>>>> gprintk(XENLOG_INFO, "handler->addr: %lx\n", handler->addr);
>>>> Then everytime I type enter in the console, console show the log below:
>>>> (XEN) d0v0 do_trap_guest_sync hsr.ec=24
>>>> (XEN) d0v0 handler->addr: fe600000
>>>> (XEN) d0v0 handler->addr: fe600000
>>>> (XEN) d0v0 do_trap_guest_sync hsr.ec=18
>>>> Is that something wrong with the GIC interrupt ?
>>> A few questions:
>>> * What is the corresponding device in the host physical address space
>>> for 0xfe600000?
>>> * What is the UART on your board? Is there any specific workaround
>>> required?
>> 0xfe600000 is: gic: interrupt-controller@fe600000, full content above.
>
> Thanks. So the trap is expected because the GICD exposed to the domains
> is emulated.
So this is not an error? How can I investigate why the console does not accept input?
>> The UART is 8250, I set menuconfig in Debugging Options, the config like below:
>> [*] Early printk (Early printk via 8250 UART) --->
>> (0Xfeb50000) Early printk, physical base address of debug UART
>> (2) Early printk, left-shift to apply to the register offsets within the 8250 UART
>> I found that if I config the early printk in xen, I don't need the xen,dom0-bootargs=
>> "console=hvc0 earlycon=xen earlyprintk=xen" anymore, is that right?
>
> I don't know the exact configuration of the 8250. So I can't tell
> whether this is correct.
>
> That said, as you see some ouput, it would indicate that the
> configuration might be right.
If xen,dom0-bootargs contains "console=hvc0 earlycon=xen earlyprintk=xen",
the boot log is printed twice on the console.
> This could indicate that Xen is still using early printk and therefore
> it would not be able to read character. From your previous email, I see
> that you are requesting serial2. I am assuming this is an alias to the
> same UART as the one you configure for the early printk?
>
> Can you paste the content of the related Device-Tree node? Also, I would
> suggest to check if there are any errors in the Xen logs.
serial2 is the alias of the UART. Some of the Xen logs are below:
(XEN) adding DT alias:serial2: stem=serial id=2 node=/serial@feb50000
...
(XEN) Looking for dtuart at "serial2", options ""
(XEN) Unable to initialize dtuart: -19
(XEN) Bad console= option 'dtuart'
The node /serial@feb50000:
uart2: serial@feb50000 {
    compatible = "rockchip,rk3588-uart", "snps,dw-apb-uart";
    reg = <0x0 0xfeb50000 0x0 0x100>;
    interrupts = <GIC_SPI 333 IRQ_TYPE_LEVEL_HIGH>;
    clocks = <&cru SCLK_UART2>, <&cru PCLK_UART2>;
    clock-names = "baudclk", "apb_pclk";
    reg-shift = <2>;
    reg-io-width = <4>;
    dmas = <&dmac0 10>, <&dmac0 11>;
    pinctrl-names = "default";
    pinctrl-0 = <&uart2m1_xfer>;
    status = "disabled";
};
There is an error when Xen parses its command line. Is the command line I passed incorrect?
fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
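
As a reading aid for the numeric codes in the logs: values such as -19 and -22 look like negated errno numbers. The sketch below decodes only the two codes that appear in this thread, using the standard Linux numbering (ENODEV is 19, EINVAL is 22). `errno_name` is a hypothetical helper, and it is an assumption that the dtuart path in Xen reports errors with the same numbering.

```shell
#!/bin/sh
# Hypothetical helper: map a (possibly negated) errno value to its name.
# Only the codes seen in this thread are covered.
errno_name() {
    n=${1#-}    # drop a leading minus sign, if any
    case "$n" in
        19) echo "ENODEV (No such device)" ;;
        22) echo "EINVAL (Invalid argument)" ;;
        *)  echo "unknown ($1)" ;;
    esac
}

errno_name -19
errno_name -22
```

If -19 is ENODEV here, Xen did not find a usable device behind the serial2 alias. Since the node quoted above carries status = "disabled", it may be worth checking the status of /serial@feb50000 in the DTB that is actually loaded, as a board-level .dts may override it.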
>>>> 3. In Dom0, the dev mali0 and mmcblk2 is missing, and weston running failed.
>>> Do you have any log in the kernel indicating why the mali and/or the mmc
>>> driver didn't load?
>>>
>>> Also, can you confirm that the same kernel image works without Xen?
>> Boot without xen, the mali0 log like below:
>> root@RK3588:/# dmesg | grep mali
>> [ 4.192093] mali fb000000.gpu: Kernel DDK version g12p0-01eac0
>> [ 4.192148] mali fb000000.gpu: Looking up mali-supply from device tree
>> [ 4.194569] mali fb000000.gpu: Looking up mem-supply from device tree
>> [ 4.194747] mali fb000000.gpu: Looking up mali-supply from device tree
>> [ 4.194792] mali fb000000.gpu: Looking up mem-supply from device tree
>> [ 4.195383] mali fb000000.gpu: leakage=16
>> [ 4.195457] mali fb000000.gpu: Looking up mali-supply from device tree
>> [ 4.197004] mali fb000000.gpu: pvtm=858
>> [ 4.197099] mali fb000000.gpu: pvtm-volt-sel=2
>> [ 4.198437] mali fb000000.gpu: avs=0
>> [ 4.201271] W : [File] : drivers/gpu/arm/bifrost/platform/rk/mali_kbase_config_rk.c; [Line] : 136; [Func] :
>> kbase_platform_rk_init(); power-off-delay-ms not available.
>> [ 4.206668] mali fb000000.gpu: GPU hardware issue table may need updating:
>> [ 4.206683] mali fb000000.gpu: GPU identified as 0x7 arch 10.8.6 r0p0 status 0
>> [ 4.206810] mali fb000000.gpu: No priority control manager is configured
>> [ 4.206823] mali fb000000.gpu: No memory group manager is configured
>> [ 4.206852] mali fb000000.gpu: Protected memory allocator not available
>> [ 4.208342] mali fb000000.gpu: Couldn't find power_model DT node matching 'arm,mali-simple-power-model'
>> [ 4.208356] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.static-coefficient = 1*[0]
>> [ 4.208572] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.dynamic-coefficient = 1*[0]
>> [ 4.208766] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.ts = 4*[0]
>> [ 4.208958] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.thermal-zone = ''
>> [ 4.212287] mali fb000000.gpu: Using configured power model mali-lodx-power-model, and fallback mali-simple-power-model
>> [ 4.212539] mali fb000000.gpu: l=10000 h=85000 hyst=5000 l_limit=0 h_limit=800000000 h_table=0
>> [ 4.214528] mali fb000000.gpu: Probed as mali0
>> [ 4.318492] I : [File] : drivers/gpu/arm/mali400/mali/linux/mali_kernel_linux.c; [Line] : 405; [Func] : mali_module_init();
>> svn_rev_string_from_arm of this mali_ko is '', rk_ko_ver is '5', built at '10:04:19', on 'Dec 12 2022'.
>> [ 6.959913] mali fb000000.gpu: Loading Mali firmware 0x1010000
>> [ 6.960491] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
>> [ 6.960498] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
>> [ 6.960503] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
>> Boot with xen, the mali0 log like below:
>> [ 2.969638] I : [File] : drivers/gpu/arm/mali400/mali/linux/mali_kernel_linux.c; [Line] : 405; [Func] : mali_module_init();
>> svn_rev_string_from_arm of this mali_ko is '', rk_ko_ver is '5', built at '14:06:00', on 'Dec 16 2022'.
> So no error at all afterwards? Interestingly, this line is not shown in
> your output above. So I would suggest to check the code to understand if
> somehow we are using a different path.
Yes, I did not find any log about mali0 initialization or an initialization failure; mali0 simply is not brought up when booting with Xen.
About the GPU device tree node: does Xen support the mali-bifrost GPU architecture?
The current GPU device tree node:
gpu: gpu@fb000000 {
    compatible = "arm,mali-bifrost";
    reg = <0x0 0xfb000000 0x0 0x200000>;
    interrupts = <GIC_SPI 94 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 93 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 92 IRQ_TYPE_LEVEL_HIGH>;
    interrupt-names = "GPU", "MMU", "JOB";
    clocks = <&scmi_clk SCMI_CLK_GPU>, <&cru CLK_GPU_COREGROUP>,
             <&cru CLK_GPU_STACKS>, <&cru CLK_GPU>;
    clock-names = "clk_mali", "clk_gpu_coregroup",
                  "clk_gpu_stacks", "clk_gpu";
    assigned-clocks = <&scmi_clk SCMI_CLK_GPU>;
    assigned-clock-rates = <200000000>;
    power-domains = <&power RK3588_PD_GPU>;
    operating-points-v2 = <&gpu_opp_table>;
    #cooling-cells = <2>;
    dynamic-power-coefficient = <2982>;
    upthreshold = <30>;
    downdifferential = <10>;
    status = "disabled";
};
>> Boot without xen, the mmcblk2 log like below:
>> root@RK3588:/# dmesg |grep sdmmc
>> root@RK3588:/# dmesg |grep mmc
>> [ 1.842460] Kernel command line: storagemedia=emmc androidboot.storagemedia=emmc androidboot.mode=normal
>> androidboot.verifiedbootstate=orange rw rootwait earlycon=uart8250,mmio32,0xfeb50000
>> console=ttyFIQ0 irqchip.gicv3_pseudo_nmi=0 root=PARTUUID=614e0000-0000
>> [ 3.981216] dwmmc_rockchip fe2c0000.mmc: IDMAC supports 32-bit address mode.
>> [ 3.981321] dwmmc_rockchip fe2c0000.mmc: Using internal DMA controller.
>> [ 3.981349] dwmmc_rockchip fe2c0000.mmc: Version ID is 270a
>> [ 3.981435] dwmmc_rockchip fe2c0000.mmc: DW MMC controller at irq 77,32 bit host data width,256 deep fifo
>> [ 3.981588] dwmmc_rockchip fe2c0000.mmc: Looking up vmmc-supply from device tree
>> [ 3.982932] dwmmc_rockchip fe2c0000.mmc: Looking up vqmmc-supply from device tree
>> [ 3.983121] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply from device tree
>> [ 3.983135] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply property in node /mmc@fe2e0000 failed
>> [ 3.983168] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply from device tree
>> [ 3.983177] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply property in node /mmc@fe2e0000 failed
>> [ 3.983294] dwmmc_rockchip fe2c0000.mmc: Failed getting OCR mask: -22
>> [ 3.983461] dwmmc_rockchip fe2c0000.mmc: could not set regulator OCR (-22)
>> [ 3.983473] dwmmc_rockchip fe2c0000.mmc: failed to enable vmmc regulator
>> [ 3.995539] mmc_host mmc2: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
>> [ 4.012246] mmc0: SDHCI controller on fe2e0000.mmc [fe2e0000.mmc] using ADMA
>> [ 4.043129] mmc_host mmc2: Bus speed (slot 0) = 49500000Hz (slot req 50000000Hz, actual 49500000HZ div = 0)
>> [ 4.043689] mmc2: new high speed SDHC card at address 0007
>> [ 4.044681] mmcblk2: mmc2:0007 SD8GB 7.21 GiB
>> [ 4.047294] mmcblk2: p1
>> [ 4.060614] mmc0: new HS400 Enhanced strobe MMC card at address 0001
>> [ 4.061406] mmcblk0: mmc0:0001 BJTD4R 29.1 GiB
>> [ 4.061539] mmcblk0boot0: mmc0:0001 BJTD4R partition 1 4.00 MiB
>> [ 4.061663] mmcblk0boot1: mmc0:0001 BJTD4R partition 2 4.00 MiB
>> [ 4.062273] mmcblk0rpmb: mmc0:0001 BJTD4R partition 3 4.00 MiB, chardev (236:0)
>> [ 4.068960] mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8
>> [ 5.835901] EXT4-fs (mmcblk0p6): recovery complete
>> [ 5.836462] EXT4-fs (mmcblk0p6): mounted filesystem with ordered data mode. Opts: (null)
>> [ 5.839971] storagemedia=emmc
>> [ 5.867859] EXT4-fs (mmcblk0p6): re-mounted. Opts: (null)
>> [ 6.409008] FAT-fs (mmcblk2p1): utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
>> [ 6.414043] FAT-fs (mmcblk2p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
>> [ 7.039362] EXT4-fs (mmcblk0p7): mounting ext2 file system using the ext4 subsystem
>> [ 7.040313] EXT4-fs (mmcblk0p7): warning: mounting unchecked fs, running e2fsck is recommended
>> [ 7.041225] EXT4-fs (mmcblk0p7): mounted filesystem without journal. Opts: (null)
>> [ 7.162903] EXT4-fs (mmcblk0p8): mounting ext2 file system using the ext4 subsystem
>> [ 7.165824] EXT4-fs (mmcblk0p8): warning: mounting unchecked fs, running e2fsck is recommended
>> [ 7.172777] EXT4-fs (mmcblk0p8): mounted filesystem without journal. Opts: (null)
>> Boot with xen, the mmcblk2(sdmmc) log like below:
>> root@RK3588:/sys/firmware# dmesg |grep sdmmc
>> [ 69.563072] rockchip-pm-domain fd8d8000.power-management:power-controller:
>
> It looks like the command line between Xen and baremetal is different.
> When running under Xen, the command line should mostly be the same
> (aside clk_* and console=hvc0). Otherwise you don't compare the same and
> therefore the difference may only be due to your command line options.
The original bootargs are as below:
bootargs = "storagemedia=emmc androidboot.storagemedia=emmc androidboot.mode=normal androidboot.verifiedbootstate=orange rw rootwait
earlycon=uart8250,mmio32,0xfeb50000 console=ttyFIQ0 irqchip.gicv3_pseudo_nmi=0 root=PARTUUID=614e0000-0000";
Then I changed xen,dom0-bootargs as below:
fdt set /chosen xen,dom0-bootargs "clk_ignore_unused storagemedia=emmc androidboot.storagemedia=emmc androidboot.mode=normal
androidboot.verifiedbootstate=orange rw rootwait earlycon=uart8250,mmio32,0xfeb50000 console=hvc0 irqchip.gicv3_pseudo_nmi=0 root=PARTUUID=614e0000-0000"
Then I booted with Xen and got the log below:
root@RK3588:~# dmesg |grep mmc
[ 67.921516] Kernel command line: clk_ignore_unused storagemedia=emmc androidboot.storagemedia=emmc androidboot.mode=normal androidboot.verifiedbootstate=orange rw rootwait earlycon=uart8250,mmio32,0xfeb50000 console=hvc0 irqchip.gicv3_pseudo_nmi=0 root=PARTUUID=614e0000-0000
[ 77.089809] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply from device tree
[ 77.099607] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply property in node /mmc@fe2e0000 failed
[ 77.115046] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply from device tree
[ 77.126871] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply property in node /mmc@fe2e0000 failed
[ 77.167364] mmc0: SDHCI controller on fe2e0000.mmc [fe2e0000.mmc] using ADMA
[ 77.315318] mmc0: new HS400 Enhanced strobe MMC card at address 0001
[ 77.325677] mmcblk0: mmc0:0001 BJTD4R 29.1 GiB
[ 77.334497] mmcblk0boot0: mmc0:0001 BJTD4R partition 1 4.00 MiB
[ 77.344191] mmcblk0boot1: mmc0:0001 BJTD4R partition 2 4.00 MiB
[ 77.354610] mmcblk0rpmb: mmc0:0001 BJTD4R partition 3 4.00 MiB, chardev (236:0)
[ 77.368963] mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8
[ 79.123640] rockchip-pm-domain fd8d8000.power-management:power-controller: Looking up sdmmc-supply from device tree
[ 79.141684] rockchip-pm-domain fd8d8000.power-management:power-controller: Looking up sdmmc-supply property in node /power-management@fd8d8000/power-controller failed
[ 84.550779] EXT4-fs (mmcblk0p6): recovery complete
[ 84.556109] EXT4-fs (mmcblk0p6): mounted filesystem with ordered data mode. Opts: (null)
[ 84.603570] storagemedia=emmc
[ 84.628559] EXT4-fs (mmcblk0p6): re-mounted. Opts: (null)
[ 85.748403] EXT4-fs (mmcblk0p7): mounting ext2 file system using the ext4 subsystem
[ 85.758234] EXT4-fs (mmcblk0p7): warning: mounting unchecked fs, running e2fsck is recommended
[ 85.774558] EXT4-fs (mmcblk0p7): mounted filesystem without journal. Opts: (null)
[ 85.898615] EXT4-fs (mmcblk0p8): mounting ext2 file system using the ext4 subsystem
[ 85.909913] EXT4-fs (mmcblk0p8): warning: mounting unchecked fs, running e2fsck is recommended
[ 85.928078] EXT4-fs (mmcblk0p8): mounted filesystem without journal. Opts: (null)
>> Looking up sdmmc-supply from device tree
>> [ 69.563112] rockchip-pm-domain fd8d8000.power-management:power-controller:
>> Looking up sdmmc-supply property in node /power-management@fd8d8000/power-controller failed
>
> Can you check why this is failing?
I'm trying to figure out why it is failing, but I'm confused: the sdmmc node does not have an sdmmc-supply property.
The sdmmc node:
sdmmc: mmc@fe2c0000 {
    compatible = "rockchip,rk3588-dw-mshc", "rockchip,rk3288-dw-mshc";
    reg = <0x0 0xfe2c0000 0x0 0x4000>;
    interrupts = <GIC_SPI 203 IRQ_TYPE_LEVEL_HIGH>;
    clocks = <&scmi_clk SCMI_HCLK_SD>, <&scmi_clk SCMI_CCLK_SD>,
             <&cru SCLK_SDMMC_DRV>, <&cru SCLK_SDMMC_SAMPLE>;
    clock-names = "biu", "ciu", "ciu-drive", "ciu-sample";
    fifo-depth = <0x100>;
    max-frequency = <200000000>;
    pinctrl-names = "default";
    pinctrl-0 = <&sdmmc_clk &sdmmc_cmd &sdmmc_det &sdmmc_bus4>;
    power-domains = <&power RK3588_PD_SDMMC>;
    status = "disabled";
};
power-domain@RK3588_PD_SDMMC {
    reg = <RK3588_PD_SDMMC>;
    pm_qos = <&qos_sdmmc>;
};
qos_sdmmc: qos@fdf3d800 {
    compatible = "syscon";
    reg = <0x0 0xfdf3d800 0x0 0x20>;
};
sdmmc {
    /omit-if-no-ref/
    sdmmc_bus4: sdmmc-bus4 {
        rockchip,pins =
            /* sdmmc_d0 */
            <4 RK_PD0 1 &pcfg_pull_up_drv_level_2>,
            /* sdmmc_d1 */
            <4 RK_PD1 1 &pcfg_pull_up_drv_level_2>,
            /* sdmmc_d2 */
            <4 RK_PD2 1 &pcfg_pull_up_drv_level_2>,
            /* sdmmc_d3 */
            <4 RK_PD3 1 &pcfg_pull_up_drv_level_2>;
    };
    /omit-if-no-ref/
    sdmmc_clk: sdmmc-clk {
        rockchip,pins =
            /* sdmmc_clk */
            <4 RK_PD5 1 &pcfg_pull_up_drv_level_2>;
    };
    /omit-if-no-ref/
    sdmmc_cmd: sdmmc-cmd {
        rockchip,pins =
            /* sdmmc_cmd */
            <4 RK_PD4 1 &pcfg_pull_up_drv_level_2>;
    };
    /omit-if-no-ref/
    sdmmc_det: sdmmc-det {
        rockchip,pins =
            /* sdmmc_det */
            <0 RK_PA4 1 &pcfg_pull_up>;
    };
    /omit-if-no-ref/
    sdmmc_pwren: sdmmc-pwren {
        rockchip,pins =
            /* sdmmc_pwren */
            <0 RK_PA5 2 &pcfg_pull_none>;
    };
};
>>>> While I can't input in the console, I tried use console via ssh.
>>>> In the /dev list, I can't find mali0 and mmcblk2(sdcard),
>>>> In u-boot mode, mmcblk2 can be recognized, I loaded dom0-Image, xen, and dtb from mmcblk2.
>>>> While booting without xen, the mali0 and mmcblk2 can be recognized,
>>>> Is that something wrong with xen while Initialize the driver?
>>>> 4. xl command can not executed, and seems to be suspended.
>>>
>>> xl requires the initscript (or systemd service) to be executed. The fact
>>> it hangs usually means this didn't happen.
>>>
>>> Just in case, can you also check that your kernel has been build with
>>> Xen support?
>> initscript is xendriverdomain? I tried the command in dom0 like below:
>> (xl list command suspended)
>> root@RK3588:/# ./etc/init.d/xendriverdomain restart
>
> You would want to use xencommons rather than xendriverdomain.
>
>> root@RK3588:/#
>> root@RK3588:/# ps aux |grep xen
>> root 59 0.0 0.0 0 0 ? S 00:00 0:00 [xenbus]
>> root 60 0.0 0.0 0 0 ? S 00:00 0:00 [xenwatch]
>> root 165 0.0 0.0 0 0 ? D 00:00 0:00 [xenbus_probe]
>> root 5993 0.0 0.0 3044 380 pts/0 S+ 00:09 0:00 grep xen
>> root@RK3588:/# xl list
>> Name ID Mem VCPUs State Time(s)
>> I config the kernel according the manuals:
>> https://wiki.xenproject.org/wiki/Mainline_Linux_Kernel_Configs#Configuring_the_Kernel_for_dom0_Support
>> And used the kernel/arch/arm64/boot/Image as the dom0-Image.
>> How can I check the Kernel has been build with Xen support?
>
> You can grep XEN in your kernel config. You should see some enabled.
>
> But looking at the output above, you don't have xenstored running. So
> the most probable cause if that you didn't run xencommons.
Thanks. After I ran the ./S50xencommons start command, the xl command ran successfully.
Best regards
Cailigang
Re: [Bug] Bring up Dom0 on Arm board [ In reply to ]
On 11/01/2023 06:40, ??? wrote:
> Hi,

Hi,

>> On 10/01/2023 02:35, ??? wrote:
>>>> On 06/01/2023 06:41, ??? wrote:
>>>>> I try to run Xen on a Rockchip RK3588 board and encountered some problems.
>>>>> The command I used:
>>>>> load mmc 1:1 0xC400000 dom0-Image;
>>>>> load mmc 1:1 0x47C00000 xen4.14.5;
>>>>
>>>> We have made a lot of improvement since Xen 4.14. This is also out of
>>>> support since January 2022. It is still security supported but not for
>>>> long (July 2023).
>>>>
>>>> Would you be able to try Xen 4.17 (this was released a month ago)?
>>> I also tried Xen 4.17.0, but the xl command failed in dom0 with the errors below:
>>> root@RK3588:~# xl list
>>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
>>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
>>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
>>> libxl: error: libxl_domain.c:334:libxl_list_domain: getting domain info list: Permission denied
>>> libxl_list_domain failed.
>>> In the rootfs, the Xen tools version is 4.14.3.
>>> I suspect the version mismatch between the Xen tools and the Xen hypervisor causes this problem. Is that right?
>>
>> Part of the ABI used between the tools and the hypervisor is not stable.
>> So you will need to rebuild the tools for every new major release (for
>> minor releases it is usually not necessary).
>>
>>> Even with Xen 4.17.0, the problems I mentioned are still there:
>>> the "Device tree generation failed" error persists, and mali0 and mmcblk2 still do not come up.
>>
>> I will reply to this below.
>>
>>>>> load mmc 1:1 0x47E00000 rk3588-evb7-lp4-v10-linux.dtb
>>>>> fdt addr 0x47E00000
>>>>> fdt resize 1024
>>>>> fdt set /chosen \#address-cells <0x2>
>>>>> fdt set /chosen \#size-cells <0x2>
>>>>> fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
>>>>> fdt mknod /chosen dom0
>>>>> fdt set /chosen/dom0 compatible "xen,linux-zimage" "xen,multiboot-module" "multiboot,module"
>>>>> fdt set /chosen/dom0 reg <0x0 0xC400000 0x0 0x2000000>
>>>>> fdt set /chosen xen,dom0-bootargs "console=hvc0 earlycon=xen earlyprintk=xen clk_ignore_unused root=/dev/mmcblk0p6 rw rootwait"
>>>>> setenv fdt_high 0xffffffffffffffff
>>>>> booti 0x47C00000 - 0x47E00000
>>>>> 1. "Device tree generation failed" errors.
>>>>> When I used the default dtb to run Xen, a panic occurred in Xen.
>>>>> log:
>>>>> (XEN) Unable to get irq 0 for /pcie@fe180000/legacy-interrupt-controller
>>>>> (XEN) Device tree generation failed (-1).
>>>>> (XEN)
>>>>> (XEN) ****************************************
>>>>> (XEN) Panic on CPU 0:
>>>>> (XEN) Could not set up DOM0 guest OS
>>>>> (XEN) ****************************************
>>>>> (XEN)
>>>>> (XEN) Reboot in five seconds...
>>>>> the dtb:
>>>>> pcie2x1l1_intc: legacy-interrupt-controller {
>>>>> interrupt-controller;
>>>>> #address-cells = <0>;
>>>>> #interrupt-cells = <1>;
>>>>> interrupt-parent = <&gic>;
>>>>> interrupts = <GIC_SPI 245 IRQ_TYPE_EDGE_RISING>;
>>>>> };
>>>>> I modified the interrupts property of legacy-interrupt-controller from
>>>>> IRQ_TYPE_EDGE_RISING to IRQ_TYPE_LEVEL_HIGH.
>>>>
>>>> Based on this change, I would say the call to irq_set_spi_type() (called
>>>> from platform_get_irq()) will return -1. The function will validate the
>>>> type and will throw an error if there is a problem.
>>>>
>>>> Can you confirm whether the interrupt is shared with another device? Is
>>>> it described twice in the DT?
>>>>
>>>> If yes to either of the two questions, is the type different?
>>>>
>>>> You could also print the old and new type in irq_set_spi_type() to
>>>> confirm the difference.
>>> It may be caused by interrupt-controller@fe600000: the interrupt is first set to
>>> IRQ_TYPE_LEVEL_HIGH according to interrupt-controller@fe600000,
>>> then irq_set_spi_type() tries to set it to IRQ_TYPE_EDGE_RISING according to
>>> pcie2x1l1_intc: legacy-interrupt-controller, and returns -1.
>>> The gic: interrupt-controller@fe600000 node is as below:
>>> gic: interrupt-controller@fe600000 {
>>> compatible = "arm,gic-v3";
>>> #interrupt-cells = <3>;
>>> #address-cells = <2>;
>>> #size-cells = <2>;
>>> ranges;
>>> interrupt-controller;
>>> reg = <0x0 0xfe600000 0 0x10000>, /* GICD */
>>> <0x0 0xfe680000 0 0x100000>; /* GICR */
>>> interrupts = <GIC_PPI 9 IRQ_TYPE_LEVEL_HIGH>;
>>> its0: msi-controller@fe640000 {
>>> compatible = "arm,gic-v3-its";
>>> msi-controller;
>>> #msi-cells = <1>;
>>> reg = <0x0 0xfe640000 0x0 0x20000>;
>>> };
>>> its1: msi-controller@fe660000 {
>>> compatible = "arm,gic-v3-its";
>>> msi-controller;
>>> #msi-cells = <1>;
>>> reg = <0x0 0xfe660000 0x0 0x20000>;
>>> };
>>> };
>>
>> I am a bit confused. Reading the binding, it looks like the GIC and PCI
>> interrupt controller don't share an interrupt. Can you confirm the IRQ
>> number you saw in Xen?
> My mistake, my analysis was wrong.
> I added logging in platform_get_irq() and got the output below:
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 280, irq.type: 4
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 279, irq.type: 4
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 278, irq.type: 4
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 277, irq.type: 4
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 276, irq.type: 4
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 280, irq.type: 4
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 279, irq.type: 4
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 278, irq.type: 4
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 277, irq.type: 4
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 276, irq.type: 4
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000/legacy-interrupt-controller, irq: 277, irq.type: 1
> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000/legacy-interrupt-controller, irq: 277, irq.type: 1
> (XEN) Unable to get irq 0 for /pcie@fe180000/legacy-interrupt-controller
> (XEN) Device tree generation failed (-1).
> The Node pcie@fe180000:
> pcie2x1l1: pcie@fe180000 {
> compatible = "rockchip,rk3588-pcie", "snps,dw-pcie";
> #address-cells = <3>;
> #size-cells = <2>;
> bus-range = <0x30 0x3f>;
> clocks = <&cru ACLK_PCIE_1L1_MSTR>, <&cru ACLK_PCIE_1L1_SLV>,
> <&cru ACLK_PCIE_1L1_DBI>, <&cru PCLK_PCIE_1L1>,
> <&cru CLK_PCIE_AUX3>, <&cru CLK_PCIE1L1_PIPE>;
> clock-names = "aclk_mst", "aclk_slv",
> "aclk_dbi", "pclk",
> "aux", "pipe";
> device_type = "pci";
> interrupts = <GIC_SPI 248 IRQ_TYPE_LEVEL_HIGH>,
> <GIC_SPI 247 IRQ_TYPE_LEVEL_HIGH>,
> <GIC_SPI 246 IRQ_TYPE_LEVEL_HIGH>,
> <GIC_SPI 245 IRQ_TYPE_LEVEL_HIGH>,
> <GIC_SPI 244 IRQ_TYPE_LEVEL_HIGH>;
> interrupt-names = "sys", "pmc", "msg", "legacy", "err";
> #interrupt-cells = <1>;
> interrupt-map-mask = <0 0 0 7>;
> interrupt-map = <0 0 0 1 &pcie2x1l1_intc 0>,
> <0 0 0 2 &pcie2x1l1_intc 1>,
> <0 0 0 3 &pcie2x1l1_intc 2>,
> <0 0 0 4 &pcie2x1l1_intc 3>;
> linux,pci-domain = <3>;
> num-ib-windows = <8>;
> num-ob-windows = <8>;
> num-viewport = <4>;
> max-link-speed = <2>;
> msi-map = <0x3000 &its0 0x3000 0x1000>;
> num-lanes = <1>;
> phys = <&combphy2_psu PHY_TYPE_PCIE>;
> phy-names = "pcie-phy";
> ranges = <0x00000800 0x0 0xf3000000 0x0 0xf3000000 0x0 0x100000
> 0x81000000 0x0 0xf3100000 0x0 0xf3100000 0x0 0x100000
> 0x82000000 0x0 0xf3200000 0x0 0xf3200000 0x0 0xe00000
> 0xc3000000 0x9 0xc0000000 0x9 0xc0000000 0x0 0x40000000>;
> reg = <0x0 0xfe180000 0x0 0x10000>,
> <0xa 0x40c00000 0x0 0x400000>;
> reg-names = "pcie-apb", "pcie-dbi";
> resets = <&cru SRST_PCIE3_POWER_UP>, <&cru SRST_P_PCIE3>;
> reset-names = "pcie", "periph";
> rockchip,pipe-grf = <&php_grf>;
> status = "disabled";
> pcie2x1l1_intc: legacy-interrupt-controller {
> interrupt-controller;
> #address-cells = <0>;
> #interrupt-cells = <1>;
> interrupt-parent = <&gic>;
> interrupts = <GIC_SPI 245 IRQ_TYPE_EDGE_RISING>;
> };
> };
> I did not find IRQ number 277 in node pcie@fe180000, but I did find it in node gpio@fd8a0000.
> The Node gpio@fd8a0000:
> gpio0: gpio@fd8a0000 {
> compatible = "rockchip,gpio-bank";
> reg = <0x0 0xfd8a0000 0x0 0x100>;
> interrupts = <GIC_SPI 277 IRQ_TYPE_LEVEL_HIGH>;
> clocks = <&cru PCLK_GPIO0>, <&cru DBCLK_GPIO0>;
> gpio-controller;
> #gpio-cells = <2>;
> gpio-ranges = <&pinctrl 0 0 32>;
> interrupt-controller;
> #interrupt-cells = <2>;
> };
> I'm confused, can you explain it?

The irq type is used to configure GICD_ICFGR. Here the Device-Tree seems
to provide conflicting information (one interrupt is edge and the other
level).
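
On the IRQ numbers: in the GIC architecture SPIs start at interrupt ID 32, so a <GIC_SPI n ...> specifier corresponds to IRQ n + 32. The irq: 277 in the log is therefore GIC_SPI 245 (the "legacy" interrupt of pcie@fe180000 and the interrupt of its legacy-interrupt-controller child), not gpio0, whose GIC_SPI 277 is IRQ 309. A minimal sketch of the arithmetic and of a "same IRQ, different trigger type" check follows. The function names are illustrative, and the check only models the idea; it is not the code of irq_set_spi_type().

```python
# Trigger types as numbered in the Linux dt-bindings header
# include/dt-bindings/interrupt-controller/irq.h.
IRQ_TYPE_EDGE_RISING = 1
IRQ_TYPE_LEVEL_HIGH = 4

# SPIs start at interrupt ID 32, so the number in a <GIC_SPI n type>
# specifier is offset by 32 from the IRQ number printed by Xen.
GIC_SPI_BASE = 32


def spi_to_irq(spi):
    """Convert the n of <GIC_SPI n ...> to the GIC interrupt ID."""
    return spi + GIC_SPI_BASE


def find_type_conflicts(entries):
    """entries: iterable of (node_path, spi_number, trigger_type).

    Returns (irq, first_node, first_type, node, type) for every entry whose
    trigger type differs from the one first recorded for the same IRQ.
    """
    seen = {}
    conflicts = []
    for node, spi, irq_type in entries:
        irq = spi_to_irq(spi)
        if irq not in seen:
            seen[irq] = (node, irq_type)
        elif seen[irq][1] != irq_type:
            conflicts.append((irq, seen[irq][0], seen[irq][1], node, irq_type))
    return conflicts


# Interrupt specifiers copied from the nodes quoted in this thread.
DT_INTERRUPTS = [
    ("/pcie@fe180000", 248, IRQ_TYPE_LEVEL_HIGH),
    ("/pcie@fe180000", 247, IRQ_TYPE_LEVEL_HIGH),
    ("/pcie@fe180000", 246, IRQ_TYPE_LEVEL_HIGH),
    ("/pcie@fe180000", 245, IRQ_TYPE_LEVEL_HIGH),
    ("/pcie@fe180000", 244, IRQ_TYPE_LEVEL_HIGH),
    ("/pcie@fe180000/legacy-interrupt-controller", 245, IRQ_TYPE_EDGE_RISING),
    ("/gpio@fd8a0000", 277, IRQ_TYPE_LEVEL_HIGH),
]

if __name__ == "__main__":
    for conflict in find_type_conflicts(DT_INTERRUPTS):
        print(conflict)
```

With the values from this thread, the only conflict reported is IRQ 277: level-high in pcie@fe180000 and edge-rising in its legacy-interrupt-controller child.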

Where did you take your device-tree from?

>>>>> With that change Xen booted successfully, though I am not sure the modification is correct.
>>>>> 2. After boot, I tried to type in the console, but no input was accepted.
>>>>> I added some logging to do_trap_guest_sync and try_handle_mmio, as below:
>>>>> In function do_trap_guest_sync:
>>>>> static unsigned long ec = 0;
>>>>> if(hsr.ec != ec)
>>>>> {
>>>>> gprintk(XENLOG_INFO, "do_trap_guest_sync hsr.ec=%x \n", hsr.ec);
>>>>> ec = hsr.ec;
>>>>> }
>>>>> In function try_handle_mmio:
>>>>> gprintk(XENLOG_INFO, "handler->addr: %lx\n", handler->addr);
>>>>> Then every time I press Enter in the console, it shows the log below:
>>>>> (XEN) d0v0 do_trap_guest_sync hsr.ec=24
>>>>> (XEN) d0v0 handler->addr: fe600000
>>>>> (XEN) d0v0 handler->addr: fe600000
>>>>> (XEN) d0v0 do_trap_guest_sync hsr.ec=18
>>>>> Is something wrong with the GIC interrupt?
>>>> A few questions:
>>>> * What is the corresponding device in the host physical address space
>>>> for 0xfe600000?
>>>> * What is the UART on your board? Is there any specific workaround
>>>> required?
>>> 0xfe600000 is: gic: interrupt-controller@fe600000, full content above.
>>
>> Thanks. So the trap is expected because the GICD exposed to the domains
>> is emulated.
> So this is not an error? How can I investigate why the console does not accept input?

See below.
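
As a side note on the two hsr.ec values in the log quoted above: they were printed with %x, so they are hexadecimal. In the Arm exception class encoding, 0x24 is a data abort from a lower exception level and 0x18 is a trapped MSR, MRS or system instruction in AArch64 state. A minimal sketch that decodes them and checks that the logged handler address lies in the GICD region from the gic node follows. The table holds only these two classes, and the helper names are illustrative.

```python
# Exception classes (ESR_EL2.EC encoding) for the two values seen in the log.
EC_NAMES = {
    0x18: "trapped MSR, MRS or system instruction (AArch64)",
    0x24: "data abort from a lower exception level",
}

# GICD region from the gic node in this thread:
#   reg = <0x0 0xfe600000 0 0x10000>, /* GICD */
GICD_BASE = 0xFE600000
GICD_SIZE = 0x10000


def describe_ec(ec):
    """Return a short description of an exception class value."""
    return EC_NAMES.get(ec, "not in this table (0x%x)" % ec)


def in_gicd(addr):
    """True if addr falls inside the GICD region described in the DT."""
    return GICD_BASE <= addr < GICD_BASE + GICD_SIZE


if __name__ == "__main__":
    # Values as logged: "hsr.ec=24", "hsr.ec=18", "handler->addr: fe600000".
    for raw in ("24", "18"):
        print("hsr.ec=0x%s: %s" % (raw, describe_ec(int(raw, 16))))
    print("0xfe600000 in GICD:", in_gicd(0xFE600000))
```

A data abort whose handler address is the GICD base is what a guest access to the emulated distributor looks like, which matches the explanation that the trap is expected.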

>>> The UART is an 8250. I configured the Debugging Options in menuconfig as below:
>>> [*] Early printk (Early printk via 8250 UART) --->
>>> (0Xfeb50000) Early printk, physical base address of debug UART
>>> (2) Early printk, left-shift to apply to the register offsets within the 8250 UART
>>> I found that if I configure early printk in Xen, I no longer need
>>> "console=hvc0 earlycon=xen earlyprintk=xen" in xen,dom0-bootargs. Is that right?
>>
>> I don't know the exact configuration of the 8250. So I can't tell
>> whether this is correct.
>>
>> That said, as you see some output, it would indicate that the
>> configuration might be right.
> If xen,dom0-bootargs contains "console=hvc0 earlycon=xen earlyprintk=xen",
> the boot log is printed twice in the console.

I am not sure I understand your point.

>> This could indicate that Xen is still using early printk and therefore
>> it would not be able to read character. From your previous email, I see
>> that you are requesting serial2. I am assuming this is an alias to the
>> same UART as the one you configure for the early printk?
>>
>> Can you paste the content of the related Device-Tree node? Also, I would
>> suggest to check if there are any errors in the Xen logs.
> serial2 is the alias of the UART. Some Xen logs are below:
> (XEN) adding DT alias:serial2: stem=serial id=2 node=/serial@feb50000
> ...
> (XEN) Looking for dtuart at "serial2", options ""
> (XEN) Unable to initialize dtuart: -19
> (XEN) Bad console= option 'dtuart'
> The node /serial@feb50000:
> uart2: serial@feb50000 {
> compatible = "rockchip,rk3588-uart", "snps,dw-apb-uart";
> reg = <0x0 0xfeb50000 0x0 0x100>;
> interrupts = <GIC_SPI 333 IRQ_TYPE_LEVEL_HIGH>;
> clocks = <&cru SCLK_UART2>, <&cru PCLK_UART2>;
> clock-names = "baudclk", "apb_pclk";
> reg-shift = <2>;
> reg-io-width = <4>;
> dmas = <&dmac0 10>, <&dmac0 11>;
> pinctrl-names = "default";
> pinctrl-0 = <&uart2m1_xfer>;
> status = "disabled";
> };
> There was an error when parsing the Xen command line. Is the command line I entered incorrect?
> fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"

I can't find the compatible rockchip,rk3588-uart in Linux drivers. Is
the UART meant to be similar to snps,dw-apb-uart with some quirks?
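
Separately, -19 in "Unable to initialize dtuart: -19" is ENODEV in the usual errno numbering. One possible cause worth ruling out (an assumption, not something established in this thread) is the status = "disabled" in the node pasted above, since the devicetree specification treats such a node as not operational. The pasted text may come from the SoC .dtsi and be overridden by the board file, so the property should be checked in the DTB actually passed to Xen. A minimal sketch with illustrative helper names, using only a naive text match on the node source:

```python
import errno
import re


def errno_name(code):
    """Map a (possibly negative) error code to its symbolic errno name."""
    return errno.errorcode.get(abs(code), "unknown")


def dt_status(node_source):
    """Return the first status property value found in a DTS node, or None."""
    match = re.search(r'\bstatus\s*=\s*"([^"]*)"\s*;', node_source)
    return match.group(1) if match else None


def dt_node_available(node_source):
    """A node with no status property, or "okay" (legacy "ok"), is usable."""
    status = dt_status(node_source)
    return status is None or status in ("okay", "ok")


# Abbreviated copy of the uart2 node quoted in this thread.
UART2_NODE = '''
uart2: serial@feb50000 {
    compatible = "rockchip,rk3588-uart", "snps,dw-apb-uart";
    reg = <0x0 0xfeb50000 0x0 0x100>;
    reg-shift = <2>;
    reg-io-width = <4>;
    status = "disabled";
};
'''

if __name__ == "__main__":
    print("error -19 is", errno_name(-19))
    print("uart2 status:", dt_status(UART2_NODE))
    print("uart2 available:", dt_node_available(UART2_NODE))
```

The text match does not handle nested child nodes; for a real check it would be more reliable to decompile the final DTB and read the property there.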


>>>>> 3. In Dom0, the dev mali0 and mmcblk2 is missing, and weston running failed.
>>>> Do you have any log in the kernel indicating why the mali and/or the mmc
>>>> driver didn't load?
>>>>
>>>> Also, can you confirm that the same kernel image works without Xen?
>>> Booting without Xen, the mali0 log is as below:
>>> root@RK3588:/# dmesg | grep mali
>>> [ 4.192093] mali fb000000.gpu: Kernel DDK version g12p0-01eac0
>>> [ 4.192148] mali fb000000.gpu: Looking up mali-supply from device tree
>>> [ 4.194569] mali fb000000.gpu: Looking up mem-supply from device tree
>>> [ 4.194747] mali fb000000.gpu: Looking up mali-supply from device tree
>>> [ 4.194792] mali fb000000.gpu: Looking up mem-supply from device tree
>>> [ 4.195383] mali fb000000.gpu: leakage=16
>>> [ 4.195457] mali fb000000.gpu: Looking up mali-supply from device tree
>>> [ 4.197004] mali fb000000.gpu: pvtm=858
>>> [ 4.197099] mali fb000000.gpu: pvtm-volt-sel=2
>>> [ 4.198437] mali fb000000.gpu: avs=0
>>> [ 4.201271] W : [File] : drivers/gpu/arm/bifrost/platform/rk/mali_kbase_config_rk.c; [Line] : 136; [Func] :
>>> kbase_platform_rk_init(); power-off-delay-ms not available.
>>> [ 4.206668] mali fb000000.gpu: GPU hardware issue table may need updating:
>>> [ 4.206683] mali fb000000.gpu: GPU identified as 0x7 arch 10.8.6 r0p0 status 0
>>> [ 4.206810] mali fb000000.gpu: No priority control manager is configured
>>> [ 4.206823] mali fb000000.gpu: No memory group manager is configured
>>> [ 4.206852] mali fb000000.gpu: Protected memory allocator not available
>>> [ 4.208342] mali fb000000.gpu: Couldn't find power_model DT node matching 'arm,mali-simple-power-model'
>>> [ 4.208356] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.static-coefficient = 1*[0]
>>> [ 4.208572] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.dynamic-coefficient = 1*[0]
>>> [ 4.208766] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.ts = 4*[0]
>>> [ 4.208958] mali fb000000.gpu: Error -22, no DT entry: mali-simple-power-model.thermal-zone = ''
>>> [ 4.212287] mali fb000000.gpu: Using configured power model mali-lodx-power-model, and fallback mali-simple-power-model
>>> [ 4.212539] mali fb000000.gpu: l=10000 h=85000 hyst=5000 l_limit=0 h_limit=800000000 h_table=0
>>> [ 4.214528] mali fb000000.gpu: Probed as mali0
>>> [ 4.318492] I : [File] : drivers/gpu/arm/mali400/mali/linux/mali_kernel_linux.c; [Line] : 405; [Func] : mali_module_init();
>>> svn_rev_string_from_arm of this mali_ko is '', rk_ko_ver is '5', built at '10:04:19', on 'Dec 12 2022'.
>>> [ 6.959913] mali fb000000.gpu: Loading Mali firmware 0x1010000
>>> [ 6.960491] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
>>> [ 6.960498] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
>>> [ 6.960503] mali fb000000.gpu: Protected memory allocator not found, Firmware protected mode entry will not be supported
>>> Booting with Xen, the mali0 log is as below:
>>> [ 2.969638] I : [File] : drivers/gpu/arm/mali400/mali/linux/mali_kernel_linux.c; [Line] : 405; [Func] : mali_module_init();
>>> svn_rev_string_from_arm of this mali_ko is '', rk_ko_ver is '5', built at '14:06:00', on 'Dec 16 2022'.
>> So no error at all afterwards? Interestingly, this line is not shown in
>> your output above. So I would suggest to check the code to understand if
>> somehow we are using a different path.
> Yes, I did not find any log about mali0 initializing or failing to initialize; mali0 simply does not come up when booting with Xen.
> Regarding the GPU node in the dtb, does Xen support the mali-bifrost GPU architecture?
Most of the devices are assigned to Dom0. There should be no need for
specific code in Xen for Mali.

As I said before, you want to dig into the Linux code to understand why
the driver is not initializing Mali.

> Current gpu dtb node :
> gpu: gpu@fb000000 {
> compatible = "arm,mali-bifrost";
> reg = <0x0 0xfb000000 0x0 0x200000>;
> interrupts = <GIC_SPI 94 IRQ_TYPE_LEVEL_HIGH>,
> <GIC_SPI 93 IRQ_TYPE_LEVEL_HIGH>,
> <GIC_SPI 92 IRQ_TYPE_LEVEL_HIGH>;
> interrupt-names = "GPU", "MMU", "JOB";
> clocks = <&scmi_clk SCMI_CLK_GPU>, <&cru CLK_GPU_COREGROUP>,
> <&cru CLK_GPU_STACKS>, <&cru CLK_GPU>;
> clock-names = "clk_mali", "clk_gpu_coregroup",
> "clk_gpu_stacks", "clk_gpu";
> assigned-clocks = <&scmi_clk SCMI_CLK_GPU>;
> assigned-clock-rates = <200000000>;
> power-domains = <&power RK3588_PD_GPU>;
> operating-points-v2 = <&gpu_opp_table>;
> #cooling-cells = <2>;
> dynamic-power-coefficient = <2982>;
> upthreshold = <30>;
> downdifferential = <10>;
> status = "disabled";
> };
>>> Booting without Xen, the mmcblk2 log is as below:
>>> root@RK3588:/# dmesg |grep sdmmc
>>> root@RK3588:/# dmesg |grep mmc
>>> [ 1.842460] Kernel command line: storagemedia=emmc androidboot.storagemedia=emmc androidboot.mode=normal
>>> androidboot.verifiedbootstate=orange rw rootwait earlycon=uart8250,mmio32,0xfeb50000
>>> console=ttyFIQ0 irqchip.gicv3_pseudo_nmi=0 root=PARTUUID=614e0000-0000
>>> [ 3.981216] dwmmc_rockchip fe2c0000.mmc: IDMAC supports 32-bit address mode.
>>> [ 3.981321] dwmmc_rockchip fe2c0000.mmc: Using internal DMA controller.
>>> [ 3.981349] dwmmc_rockchip fe2c0000.mmc: Version ID is 270a
>>> [ 3.981435] dwmmc_rockchip fe2c0000.mmc: DW MMC controller at irq 77,32 bit host data width,256 deep fifo
>>> [ 3.981588] dwmmc_rockchip fe2c0000.mmc: Looking up vmmc-supply from device tree
>>> [ 3.982932] dwmmc_rockchip fe2c0000.mmc: Looking up vqmmc-supply from device tree
>>> [ 3.983121] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply from device tree
>>> [ 3.983135] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply property in node /mmc@fe2e0000 failed
>>> [ 3.983168] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply from device tree
>>> [ 3.983177] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply property in node /mmc@fe2e0000 failed
>>> [ 3.983294] dwmmc_rockchip fe2c0000.mmc: Failed getting OCR mask: -22
>>> [ 3.983461] dwmmc_rockchip fe2c0000.mmc: could not set regulator OCR (-22)
>>> [ 3.983473] dwmmc_rockchip fe2c0000.mmc: failed to enable vmmc regulator
>>> [ 3.995539] mmc_host mmc2: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
>>> [ 4.012246] mmc0: SDHCI controller on fe2e0000.mmc [fe2e0000.mmc] using ADMA
>>> [ 4.043129] mmc_host mmc2: Bus speed (slot 0) = 49500000Hz (slot req 50000000Hz, actual 49500000HZ div = 0)
>>> [ 4.043689] mmc2: new high speed SDHC card at address 0007
>>> [ 4.044681] mmcblk2: mmc2:0007 SD8GB 7.21 GiB
>>> [ 4.047294] mmcblk2: p1
>>> [ 4.060614] mmc0: new HS400 Enhanced strobe MMC card at address 0001
>>> [ 4.061406] mmcblk0: mmc0:0001 BJTD4R 29.1 GiB
>>> [ 4.061539] mmcblk0boot0: mmc0:0001 BJTD4R partition 1 4.00 MiB
>>> [ 4.061663] mmcblk0boot1: mmc0:0001 BJTD4R partition 2 4.00 MiB
>>> [ 4.062273] mmcblk0rpmb: mmc0:0001 BJTD4R partition 3 4.00 MiB, chardev (236:0)
>>> [ 4.068960] mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8
>>> [ 5.835901] EXT4-fs (mmcblk0p6): recovery complete
>>> [ 5.836462] EXT4-fs (mmcblk0p6): mounted filesystem with ordered data mode. Opts: (null)
>>> [ 5.839971] storagemedia=emmc
>>> [ 5.867859] EXT4-fs (mmcblk0p6): re-mounted. Opts: (null)
>>> [ 6.409008] FAT-fs (mmcblk2p1): utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
>>> [ 6.414043] FAT-fs (mmcblk2p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
>>> [ 7.039362] EXT4-fs (mmcblk0p7): mounting ext2 file system using the ext4 subsystem
>>> [ 7.040313] EXT4-fs (mmcblk0p7): warning: mounting unchecked fs, running e2fsck is recommended
>>> [ 7.041225] EXT4-fs (mmcblk0p7): mounted filesystem without journal. Opts: (null)
>>> [ 7.162903] EXT4-fs (mmcblk0p8): mounting ext2 file system using the ext4 subsystem
>>> [ 7.165824] EXT4-fs (mmcblk0p8): warning: mounting unchecked fs, running e2fsck is recommended
>>> [ 7.172777] EXT4-fs (mmcblk0p8): mounted filesystem without journal. Opts: (null)
>>> Booting with Xen, the mmcblk2 (sdmmc) log is as below:
>>> root@RK3588:/sys/firmware# dmesg |grep sdmmc
>>> [ 69.563072] rockchip-pm-domain fd8d8000.power-management:power-controller:
>>
>> It looks like the command line differs between Xen and baremetal.
>> When running under Xen, the command line should mostly be the same
>> (aside from clk_* and console=hvc0). Otherwise you are not comparing like
>> with like, and the difference may only be due to your command line options.
> The original bootargs are as below:
> bootargs = "storagemedia=emmc androidboot.storagemedia=emmc androidboot.mode=normal androidboot.verifiedbootstate=orange rw rootwait
> earlycon=uart8250,mmio32,0xfeb50000 console=ttyFIQ0 irqchip.gicv3_pseudo_nmi=0 root=PARTUUID=614e0000-0000";
> Then I changed xen,dom0-bootargs as below:
> fdt set /chosen xen,dom0-bootargs "clk_ignore_unused storagemedia=emmc androidboot.storagemedia=emmc androidboot.mode=normal
> androidboot.verifiedbootstate=orange rw rootwait earlycon=uart8250,mmio32,0xfeb50000 console=hvc0 irqchip.gicv3_pseudo_nmi=0 root=PARTUUID=614e0000-0000"
> Then I booted with Xen and got the log below:
> root@RK3588:~# dmesg |grep mmc
> [ 67.921516] Kernel command line: clk_ignore_unused storagemedia=emmc androidboot.storagemedia=emmc androidboot.mode=normal androidboot.verifiedbootstate=orange rw rootwait earlycon=uart8250,mmio32,0xfeb50000 console=hvc0 irqchip.gicv3_pseudo_nmi=0 root=PARTUUID=614e0000-0000
> [ 77.089809] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply from device tree
> [ 77.099607] sdhci-dwcmshc fe2e0000.mmc: Looking up vmmc-supply property in node /mmc@fe2e0000 failed
> [ 77.115046] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply from device tree
> [ 77.126871] sdhci-dwcmshc fe2e0000.mmc: Looking up vqmmc-supply property in node /mmc@fe2e0000 failed
> [ 77.167364] mmc0: SDHCI controller on fe2e0000.mmc [fe2e0000.mmc] using ADMA
> [ 77.315318] mmc0: new HS400 Enhanced strobe MMC card at address 0001
> [ 77.325677] mmcblk0: mmc0:0001 BJTD4R 29.1 GiB
> [ 77.334497] mmcblk0boot0: mmc0:0001 BJTD4R partition 1 4.00 MiB
> [ 77.344191] mmcblk0boot1: mmc0:0001 BJTD4R partition 2 4.00 MiB
> [ 77.354610] mmcblk0rpmb: mmc0:0001 BJTD4R partition 3 4.00 MiB, chardev (236:0)
> [ 77.368963] mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8
> [ 79.123640] rockchip-pm-domain fd8d8000.power-management:power-controller: Looking up sdmmc-supply from device tree
> [ 79.141684] rockchip-pm-domain fd8d8000.power-management:power-controller: Looking up sdmmc-supply property in node /power-management@fd8d8000/power-controller failed
> [ 84.550779] EXT4-fs (mmcblk0p6): recovery complete
> [ 84.556109] EXT4-fs (mmcblk0p6): mounted filesystem with ordered data mode. Opts: (null)
> [ 84.603570] storagemedia=emmc
> [ 84.628559] EXT4-fs (mmcblk0p6): re-mounted. Opts: (null)
> [ 85.748403] EXT4-fs (mmcblk0p7): mounting ext2 file system using the ext4 subsystem
> [ 85.758234] EXT4-fs (mmcblk0p7): warning: mounting unchecked fs, running e2fsck is recommended
> [ 85.774558] EXT4-fs (mmcblk0p7): mounted filesystem without journal. Opts: (null)
> [ 85.898615] EXT4-fs (mmcblk0p8): mounting ext2 file system using the ext4 subsystem
> [ 85.909913] EXT4-fs (mmcblk0p8): warning: mounting unchecked fs, running e2fsck is recommended
> [ 85.928078] EXT4-fs (mmcblk0p8): mounted filesystem without journal. Opts: (null)
>>> Looking up sdmmc-supply from device tree
>>> [ 69.563112] rockchip-pm-domain fd8d8000.power-management:power-controller:
>>> Looking up sdmmc-supply property in node /power-management@fd8d8000/power-controller failed
>>
>> Can you check why this is failing?
> I'm trying to figure out why it is failing, but I'm confused: the sdmmc node has no sdmmc-supply property.
> The sdmmc node:
> sdmmc: mmc@fe2c0000 {
> compatible = "rockchip,rk3588-dw-mshc", "rockchip,rk3288-dw-mshc";
> reg = <0x0 0xfe2c0000 0x0 0x4000>;
> interrupts = <GIC_SPI 203 IRQ_TYPE_LEVEL_HIGH>;
> clocks = <&scmi_clk SCMI_HCLK_SD>, <&scmi_clk SCMI_CCLK_SD>,
> <&cru SCLK_SDMMC_DRV>, <&cru SCLK_SDMMC_SAMPLE>;
> clock-names = "biu", "ciu", "ciu-drive", "ciu-sample";
> fifo-depth = <0x100>;
> max-frequency = <200000000>;
> pinctrl-names = "default";
> pinctrl-0 = <&sdmmc_clk &sdmmc_cmd &sdmmc_det &sdmmc_bus4>;
> power-domains = <&power RK3588_PD_SDMMC>;
> status = "disabled";
> };
> power-domain@RK3588_PD_SDMMC {
> reg = <RK3588_PD_SDMMC>;
> pm_qos = <&qos_sdmmc>;
> };
> qos_sdmmc: qos@fdf3d800 {
> compatible = "syscon";
> reg = <0x0 0xfdf3d800 0x0 0x20>;
> };
> sdmmc {
> /omit-if-no-ref/
> sdmmc_bus4: sdmmc-bus4 {
> rockchip,pins =
> /* sdmmc_d0 */
> <4 RK_PD0 1 &pcfg_pull_up_drv_level_2>,
> /* sdmmc_d1 */
> <4 RK_PD1 1 &pcfg_pull_up_drv_level_2>,
> /* sdmmc_d2 */
> <4 RK_PD2 1 &pcfg_pull_up_drv_level_2>,
> /* sdmmc_d3 */
> <4 RK_PD3 1 &pcfg_pull_up_drv_level_2>;
> };
> /omit-if-no-ref/
> sdmmc_clk: sdmmc-clk {
> rockchip,pins =
> /* sdmmc_clk */
> <4 RK_PD5 1 &pcfg_pull_up_drv_level_2>;
> };
> /omit-if-no-ref/
> sdmmc_cmd: sdmmc-cmd {
> rockchip,pins =
> /* sdmmc_cmd */
> <4 RK_PD4 1 &pcfg_pull_up_drv_level_2>;
> };
> /omit-if-no-ref/
> sdmmc_det: sdmmc-det {
> rockchip,pins =
> /* sdmmc_det */
> <0 RK_PA4 1 &pcfg_pull_up>;
> };
> /omit-if-no-ref/
> sdmmc_pwren: sdmmc-pwren {
> rockchip,pins =
> /* sdmmc_pwren */
> <0 RK_PA5 2 &pcfg_pull_none>;
> };
> };

Unfortunately, I don't have any experience with this platform. So I can
only provide tips on how to debug it.

Cheers,

--
Julien Grall
Re: [Bug] Bring up Dom0 on Arm board
Hi,
>>>>>> I try to run Xen on a Rockchip RK3588 board and encountered some problems.
>>>>>> The command I used:
>>>>>> load mmc 1:1 0xC400000 dom0-Image;
>>>>>> load mmc 1:1 0x47C00000 xen4.14.5;
>>>>>
>>>>> We have made a lot of improvement since Xen 4.14. This is also out of
>>>>> support since January 2022. It is still security supported but not for
>>>>> long (July 2023).
>>>>>
>>>>> Would you be able to try Xen 4.17 (this was released a month ago)?
>>>> I also tried Xen 4.17.0, but the xl command failed in dom0 with the errors below:
>>>> root@RK3588:~# xl list
>>>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
>>>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
>>>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of cpus
>>>> libxl: error: libxl_domain.c:334:libxl_list_domain: getting domain info list: Permission denied
>>>> libxl_list_domain failed.
>>>> In the rootfs, the Xen tools version is 4.14.3.
>>>> I suspect the version mismatch between the Xen tools and the Xen hypervisor causes this problem. Is that right?
>>>
>>> Part of the ABI used between the tools and the hypervisor is not stable.
>>> So you will need to rebuild the tools for every new major release (for
>>> minor releases it is usually not necessary).
>>>
>>>> Even with Xen 4.17.0, the problems I mentioned are still there:
>>>> the "Device tree generation failed" error persists, and mali0 and mmcblk2 still do not come up.
>>>
>>> I will reply to this below.
>>>
>>>>>> load mmc 1:1 0x47E00000 rk3588-evb7-lp4-v10-linux.dtb
>>>>>> fdt addr 0x47E00000
>>>>>> fdt resize 1024
>>>>>> fdt set /chosen \#address-cells <0x2>
>>>>>> fdt set /chosen \#size-cells <0x2>
>>>>>> fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
>>>>>> fdt mknod /chosen dom0
>>>>>> fdt set /chosen/dom0 compatible "xen,linux-zimage" "xen,multiboot-module" "multiboot,module"
>>>>>> fdt set /chosen/dom0 reg <0x0 0xC400000 0x0 0x2000000>
>>>>>> fdt set /chosen xen,dom0-bootargs "console=hvc0 earlycon=xen earlyprintk=xen clk_ignore_unused root=/dev/mmcblk0p6 rw rootwait"
>>>>>> setenv fdt_high 0xffffffffffffffff
>>>>>> booti 0x47C00000 - 0x47E00000
>>>>>> 1. "Device tree generation failed" errors.
>>>>>> When I used the default dtb to run Xen, a panic occurred in Xen.
>>>>>> log:
>>>>>> (XEN) Unable to get irq 0 for /pcie@fe180000/legacy-interrupt-controller
>>>>>> (XEN) Device tree generation failed (-1).
>>>>>> (XEN)
>>>>>> (XEN) ****************************************
>>>>>> (XEN) Panic on CPU 0:
>>>>>> (XEN) Could not set up DOM0 guest OS
>>>>>> (XEN) ****************************************
>>>>>> (XEN)
>>>>>> (XEN) Reboot in five seconds...
>>>>>> the dtb:
>>>>>> pcie2x1l1_intc: legacy-interrupt-controller {
>>>>>> interrupt-controller;
>>>>>> #address-cells = <0>;
>>>>>> #interrupt-cells = <1>;
>>>>>> interrupt-parent = <&gic>;
>>>>>> interrupts = <GIC_SPI 245 IRQ_TYPE_EDGE_RISING>;
>>>>>> };
>>>>>> I modified the interrupts property of legacy-interrupt-controller from
>>>>>> IRQ_TYPE_EDGE_RISING to IRQ_TYPE_LEVEL_HIGH.
>>>>>
>>>>> Based on this change, I would say the call to irq_set_spi_type() (called
>>>>> from platform_get_irq()) will return -1. The function will validate the
>>>>> type and will throw an error if there is a problem.
>>>>>
>>>>> Can you confirm whether the interrupt is shared with another device? Is
>>>>> it described twice in the DT?
>>>>>
>>>>> If yes to either of the two questions, is the type different?
>>>>>
>>>>> You could also print the old and new type in irq_set_spi_type() to
>>>>> confirm the difference.
>>>> It may be caused by interrupt-controller@fe600000: the interrupt is first set to
>>>> IRQ_TYPE_LEVEL_HIGH according to interrupt-controller@fe600000,
>>>> then irq_set_spi_type() tries to set it to IRQ_TYPE_EDGE_RISING according to
>>>> pcie2x1l1_intc: legacy-interrupt-controller, and returns -1.
>>>> The gic: interrupt-controller@fe600000 node is as below:
>>>> gic: interrupt-controller@fe600000 {
>>>> compatible = "arm,gic-v3";
>>>> #interrupt-cells = <3>;
>>>> #address-cells = <2>;
>>>> #size-cells = <2>;
>>>> ranges;
>>>> interrupt-controller;
>>>> reg = <0x0 0xfe600000 0 0x10000>, /* GICD */
>>>> <0x0 0xfe680000 0 0x100000>; /* GICR */
>>>> interrupts = <GIC_PPI 9 IRQ_TYPE_LEVEL_HIGH>;
>>>> its0: msi-controller@fe640000 {
>>>> compatible = "arm,gic-v3-its";
>>>> msi-controller;
>>>> #msi-cells = <1>;
>>>> reg = <0x0 0xfe640000 0x0 0x20000>;
>>>> };
>>>> its1: msi-controller@fe660000 {
>>>> compatible = "arm,gic-v3-its";
>>>> msi-controller;
>>>> #msi-cells = <1>;
>>>> reg = <0x0 0xfe660000 0x0 0x20000>;
>>>> };
>>>> };
>>>
>>> I am a bit confused. Reading the binding, it looks like the GIC and PCI
>>> interrupt controller don't share an interrupt. Can you confirm the IRQ
>>> number you saw in Xen?
>> My mistake, my analysis was wrong.
>> I added logging in platform_get_irq() and got the output below:
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 280, irq.type: 4
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 279, irq.type: 4
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 278, irq.type: 4
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 277, irq.type: 4
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 276, irq.type: 4
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 280, irq.type: 4
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 279, irq.type: 4
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 278, irq.type: 4
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 277, irq.type: 4
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000, irq: 276, irq.type: 4
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000/legacy-interrupt-controller, irq: 277, irq.type: 1
>> (XEN) d[IDLE]v0 -----fullname: /pcie@fe180000/legacy-interrupt-controller, irq: 277, irq.type: 1
>> (XEN) Unable to get irq 0 for /pcie@fe180000/legacy-interrupt-controller
>> (XEN) Device tree generation failed (-1).
>> The Node pcie@fe180000:
>> pcie2x1l1: pcie@fe180000 {
>> compatible = "rockchip,rk3588-pcie", "snps,dw-pcie";
>> #address-cells = <3>;
>> #size-cells = <2>;
>> bus-range = <0x30 0x3f>;
>> clocks = <&cru ACLK_PCIE_1L1_MSTR>, <&cru ACLK_PCIE_1L1_SLV>,
>> <&cru ACLK_PCIE_1L1_DBI>, <&cru PCLK_PCIE_1L1>,
>> <&cru CLK_PCIE_AUX3>, <&cru CLK_PCIE1L1_PIPE>;
>> clock-names = "aclk_mst", "aclk_slv",
>> "aclk_dbi", "pclk",
>> "aux", "pipe";
>> device_type = "pci";
>> interrupts = <GIC_SPI 248 IRQ_TYPE_LEVEL_HIGH>,
>> <GIC_SPI 247 IRQ_TYPE_LEVEL_HIGH>,
>> <GIC_SPI 246 IRQ_TYPE_LEVEL_HIGH>,
>> <GIC_SPI 245 IRQ_TYPE_LEVEL_HIGH>,
>> <GIC_SPI 244 IRQ_TYPE_LEVEL_HIGH>;
>> interrupt-names = "sys", "pmc", "msg", "legacy", "err";
>> #interrupt-cells = <1>;
>> interrupt-map-mask = <0 0 0 7>;
>> interrupt-map = <0 0 0 1 &pcie2x1l1_intc 0>,
>> <0 0 0 2 &pcie2x1l1_intc 1>,
>> <0 0 0 3 &pcie2x1l1_intc 2>,
>> <0 0 0 4 &pcie2x1l1_intc 3>;
>> linux,pci-domain = <3>;
>> num-ib-windows = <8>;
>> num-ob-windows = <8>;
>> num-viewport = <4>;
>> max-link-speed = <2>;
>> msi-map = <0x3000 &its0 0x3000 0x1000>;
>> num-lanes = <1>;
>> phys = <&combphy2_psu PHY_TYPE_PCIE>;
>> phy-names = "pcie-phy";
>> ranges = <0x00000800 0x0 0xf3000000 0x0 0xf3000000 0x0 0x100000
>> 0x81000000 0x0 0xf3100000 0x0 0xf3100000 0x0 0x100000
>> 0x82000000 0x0 0xf3200000 0x0 0xf3200000 0x0 0xe00000
>> 0xc3000000 0x9 0xc0000000 0x9 0xc0000000 0x0 0x40000000>;
>> reg = <0x0 0xfe180000 0x0 0x10000>,
>> <0xa 0x40c00000 0x0 0x400000>;
>> reg-names = "pcie-apb", "pcie-dbi";
>> resets = <&cru SRST_PCIE3_POWER_UP>, <&cru SRST_P_PCIE3>;
>> reset-names = "pcie", "periph";
>> rockchip,pipe-grf = <&php_grf>;
>> status = "disabled";
>> pcie2x1l1_intc: legacy-interrupt-controller {
>> interrupt-controller;
>> #address-cells = <0>;
>> #interrupt-cells = <1>;
>> interrupt-parent = <&gic>;
>> interrupts = <GIC_SPI 245 IRQ_TYPE_EDGE_RISING>;
>> };
>> };
>> I did not find IRQ number equal 277 in node pcie@fe180000, but found in node gpio@fd8a0000.
>> The Node gpio@fd8a0000:
>> gpio0: gpio@fd8a0000 {
>> compatible = "rockchip,gpio-bank";
>> reg = <0x0 0xfd8a0000 0x0 0x100>;
>> interrupts = <GIC_SPI 277 IRQ_TYPE_LEVEL_HIGH>;
>> clocks = <&cru PCLK_GPIO0>, <&cru DBCLK_GPIO0>;
>> gpio-controller;
>> #gpio-cells = <2>;
>> gpio-ranges = <&pinctrl 0 0 32>;
>> interrupt-controller;
>> #interrupt-cells = <2>;
>> };
>> I'm confused, can you explain it?
>
> The irq type is used to configure GICD_ICFGR. Here the Device-Tree seems
> to provide conflicting information (one interrupt is edge and the other
> level).
> Where did you take your device-tree from?
I took the device tree from the official Rockchip source code.
>>>>>> And bring up Xen successed, through I not sure the modification is correct.
>>>>>> 2. After boot up, I tried to input in the console but failed.
>>>>>> I added some log in api do_trap_guest_sync, try_handle_mmio as below:
>>>>>> In function do_trap_guest_sync:
>>>>>> static unsigned long ec = 0;
>>>>>> if(hsr.ec != ec)
>>>>>> {
>>>>>> gprintk(XENLOG_INFO, "do_trap_guest_sync hsr.ec=%x \n", hsr.ec);
>>>>>> ec = hsr.ec;
>>>>>> }
>>>>>> In function try_handle_mmio:
>>>>>> gprintk(XENLOG_INFO, "handler->addr: %lx\n", handler->addr);
>>>>>> Then everytime I type enter in the console, console show the log below:
>>>>>> (XEN) d0v0 do_trap_guest_sync hsr.ec=24
>>>>>> (XEN) d0v0 handler->addr: fe600000
>>>>>> (XEN) d0v0 handler->addr: fe600000
>>>>>> (XEN) d0v0 do_trap_guest_sync hsr.ec=18
>>>>>> Is that something wrong with the GIC interrupt ?
>>>>> A few questions:
>>>>> * What is the corresponding device in the host physical address space
>>>>> for 0xfe600000?
>>>>> * What is the UART on your board? Is there any specific workaround
>>>>> required?
>>>> 0xfe600000 is: gic: interrupt-controller@fe600000, full content above.
>>>
>>> Thanks. So the trap is expected because the GICD exposed to the domains
>>> is emulated.
>> So this is not an error? How can I investigate the console can't input problem?
>
> See below.
>
>>>> The UART is 8250, I set menuconfig in Debugging Options, the config like below:
>>>> [*] Early printk (Early printk via 8250 UART) --->
>>>> (0Xfeb50000) Early printk, physical base address of debug UART
>>>> (2) Early printk, left-shift to apply to the register offsets within the 8250 UART
>>>> I found that if I config the early printk in xen, I don't need the xen,dom0-bootargs=
>>>> "console=hvc0 earlycon=xen earlyprintk=xen" anymore, is that right?
>>>
>>> I don't know the exact configuration of the 8250. So I can't tell
>>> whether this is correct.
>>>
>>> That said, as you see some ouput, it would indicate that the
>>> configuration might be right.
>> If xen,dom0-bootargs contains the "console=hvc0 earlycon=xen earlyprintk=xen"
>> The boot up log would print twitce in console.
>
> I am not sure I understand your point.
The menuconfig setting follows:
http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/arm/early-printk.txt;hb=HEAD
With this config set, and with xen,dom0-bootargs="console=hvc0 earlycon=xen earlyprintk=xen" also set,
the Dom0 kernel log is printed twice when Dom0 boots under Xen, as shown below:
[ 60.783809] Booting Linux on physical CPU 0x0000000000 [0x412fd050]
...
[ 60.783809] Booting Linux on physical CPU 0x0000000000 [0x412fd050]
...
>>> This could indicate that Xen is still using early printk and therefore
>>> it would not be able to read character. From your previous email, I see
>>> that you are requesting serial2. I am assuming this is an alias to the
>>> same UART as the one you configure for the early printk?
>>>
>>> Can you paste the content of the related Device-Tree node? Also, I would
>>> suggest to check if there are any errors in the Xen logs.
>> Serial2 is the alias of the UART, some Xen logs as below:
>> (XEN) adding DT alias:serial2: stem=serial id=2 node=/serial@feb50000
>> ...
>> (XEN) Looking for dtuart at "serial2", options ""
>> (XEN) Unable to initialize dtuart: -19
>> (XEN) Bad console= option 'dtuart'
>> The node /serial@feb50000:
>> uart2: serial@feb50000 {
>> compatible = "rockchip,rk3588-uart", "snps,dw-apb-uart";
>> reg = <0x0 0xfeb50000 0x0 0x100>;
>> interrupts = <GIC_SPI 333 IRQ_TYPE_LEVEL_HIGH>;
>> clocks = <&cru SCLK_UART2>, <&cru PCLK_UART2>;
>> clock-names = "baudclk", "apb_pclk";
>> reg-shift = <2>;
>> reg-io-width = <4>;
>> dmas = <&dmac0 10>, <&dmac0 11>;
>> pinctrl-names = "default";
>> pinctrl-0 = <&uart2m1_xfer>;
>> status = "disabled";
>> };
>> There did an error when parse the xen command line, the command line I input is not correct ?
>> fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
>
> I can't find the compatible rockchip,rk3588-uart in Linux drivers. Is
> the UART meant to be similar to snps,dw-apb-uart with some quirks?
uart2 has the compatible "snps,dw-apb-uart"; isn't that handled by a standard UART driver?
The UART node:
uart2: serial@feb50000 {
    compatible = "rockchip,rk3588-uart", "snps,dw-apb-uart";
    reg = <0x0 0xfeb50000 0x0 0x100>;
    interrupts = <GIC_SPI 333 IRQ_TYPE_LEVEL_HIGH>;
    clocks = <&cru SCLK_UART2>, <&cru PCLK_UART2>;
    clock-names = "baudclk", "apb_pclk";
    reg-shift = <2>;
    reg-io-width = <4>;
    dmas = <&dmac0 10>, <&dmac0 11>;
    pinctrl-names = "default";
    pinctrl-0 = <&uart2m1_xfer>;
    status = "disabled";
};
Do you have any suggestions on how to investigate why the console does not accept input?
As for the Mali and sdmmc initialization problems, I will read the Linux code to understand why
the drivers are not initializing.
Thanks for your answer.
Best regards
Cailigang
Re: [Bug] Bring up Dom0 on Arm board
Hi,

On 12/01/2023 02:11, ??? wrote:
>>> I did not find IRQ number equal 277 in node pcie@fe180000, but found in node gpio@fd8a0000.
>>> The Node gpio@fd8a0000:
>>> gpio0: gpio@fd8a0000 {
>>> compatible = "rockchip,gpio-bank";
>>> reg = <0x0 0xfd8a0000 0x0 0x100>;
>>> interrupts = <GIC_SPI 277 IRQ_TYPE_LEVEL_HIGH>;
>>> clocks = <&cru PCLK_GPIO0>, <&cru DBCLK_GPIO0>;
>>> gpio-controller;
>>> #gpio-cells = <2>;
>>> gpio-ranges = <&pinctrl 0 0 32>;
>>> interrupt-controller;
>>> #interrupt-cells = <2>;
>>> };
>>> I'm confused, can you explain it?
>>
>> The irq type is used to configure GICD_ICFGR. Here the Device-Tree seems
>> to provide conflicting information (one interrupt is edge and the other
>> level).
>> Where did you take your device-tree from?
> I take the device-tree from rockchip offical source code.

OK. I would suggest speaking with them, then.
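
For what it is worth, the numbers in the log line up once the SPI offset is taken into account: a "GIC_SPI n" specifier corresponds to INTID n + 32, so irq 277 in the Xen log is GIC_SPI 245 (the "legacy" interrupt of /pcie@fe180000), not the GIC_SPI 277 of gpio@fd8a0000. The sketch below is plain C, not Xen code. It replays that arithmetic and flags the entry that reuses an interrupt with a different trigger type. The struct, the function names and the table are illustrative assumptions; only the specifier values come from the nodes quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Trigger-type cell values used by device-tree interrupt specifiers. */
#define IRQ_TYPE_EDGE_RISING 1u
#define IRQ_TYPE_LEVEL_HIGH  4u

/* Shared Peripheral Interrupts start at INTID 32 on the GIC. */
#define GIC_SPI_BASE 32u

/* Illustrative model of one interrupt specifier (not a Xen structure). */
struct dt_irq_spec {
    const char *node;   /* node that owns the specifier */
    unsigned int spi;   /* number written after GIC_SPI */
    unsigned int type;  /* trigger-type cell */
};

/* Specifier values copied from the nodes quoted in this thread. */
const struct dt_irq_spec rk3588_specs[] = {
    { "/pcie@fe180000 (legacy)", 245, IRQ_TYPE_LEVEL_HIGH },
    { "/gpio@fd8a0000", 277, IRQ_TYPE_LEVEL_HIGH },
    { "/pcie@fe180000/legacy-interrupt-controller", 245, IRQ_TYPE_EDGE_RISING },
};
const size_t rk3588_specs_count = sizeof(rk3588_specs) / sizeof(rk3588_specs[0]);

/* Convert a GIC_SPI number into the INTID printed in the Xen log. */
unsigned int spi_to_intid(unsigned int spi)
{
    return spi + GIC_SPI_BASE;
}

/*
 * Return the index of the first entry that reuses an earlier SPI with a
 * different trigger type, or -1 when the table is consistent.
 */
int find_type_conflict(const struct dt_irq_spec *specs, size_t n)
{
    assert(specs != NULL);
    for (size_t i = 1; i < n; i++)
        for (size_t j = 0; j < i; j++)
            if (specs[i].spi == specs[j].spi && specs[i].type != specs[j].type)
                return (int)i;
    return -1;
}
```

Calling find_type_conflict(rk3588_specs, rk3588_specs_count) points at the legacy-interrupt-controller entry, which matches the node named in the "Unable to get irq" message.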

>>>>>>> And bring up Xen successed, through I not sure the modification is correct.
>>>>>>> 2. After boot up, I tried to input in the console but failed.
>>>>>>> I added some log in api do_trap_guest_sync, try_handle_mmio as below:
>>>>>>> In function do_trap_guest_sync:
>>>>>>> static unsigned long ec = 0;
>>>>>>> if(hsr.ec != ec)
>>>>>>> {
>>>>>>> gprintk(XENLOG_INFO, "do_trap_guest_sync hsr.ec=%x \n", hsr.ec);
>>>>>>> ec = hsr.ec;
>>>>>>> }
>>>>>>> In function try_handle_mmio:
>>>>>>> gprintk(XENLOG_INFO, "handler->addr: %lx\n", handler->addr);
>>>>>>> Then everytime I type enter in the console, console show the log below:
>>>>>>> (XEN) d0v0 do_trap_guest_sync hsr.ec=24
>>>>>>> (XEN) d0v0 handler->addr: fe600000
>>>>>>> (XEN) d0v0 handler->addr: fe600000
>>>>>>> (XEN) d0v0 do_trap_guest_sync hsr.ec=18
>>>>>>> Is that something wrong with the GIC interrupt ?
>>>>>> A few questions:
>>>>>> * What is the corresponding device in the host physical address space
>>>>>> for 0xfe600000?
>>>>>> * What is the UART on your board? Is there any specific workaround
>>>>>> required?
>>>>> 0xfe600000 is: gic: interrupt-controller@fe600000, full content above.
>>>>
>>>> Thanks. So the trap is expected because the GICD exposed to the domains
>>>> is emulated.
>>> So this is not an error? How can I investigate the console can't input problem?
>>
>> See below.
>>
>>>>> The UART is 8250, I set menuconfig in Debugging Options, the config like below:
>>>>> [*] Early printk (Early printk via 8250 UART) --->
>>>>> (0Xfeb50000) Early printk, physical base address of debug UART
>>>>> (2) Early printk, left-shift to apply to the register offsets within the 8250 UART
>>>>> I found that if I config the early printk in xen, I don't need the xen,dom0-bootargs=
>>>>> "console=hvc0 earlycon=xen earlyprintk=xen" anymore, is that right?
>>>>
>>>> I don't know the exact configuration of the 8250. So I can't tell
>>>> whether this is correct.
>>>>
>>>> That said, as you see some ouput, it would indicate that the
>>>> configuration might be right.
>>> If xen,dom0-bootargs contains the "console=hvc0 earlycon=xen earlyprintk=xen"
>>> The boot up log would print twitce in console.
>>
>> I am not sure I understand your point.
> The menuconfig setting is according to :
> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/arm/early-printk.txt;hb=HEAD
> After set this config and also set xen,dom0-bootargs="console=hvc0 earlycon=xen earlyprintk=xen",
> Boot Dom0 with xen, the Dom0 kernel log print twice, like below:
> [ 60.783809] Booting Linux on physical CPU 0x0000000000 [0x412fd050]
> ...
> [ 60.783809] Booting Linux on physical CPU 0x0000000000 [0x412fd050]
> ...

IIRC, this is expected as you specify a boot console and a runtime console.

>>>> This could indicate that Xen is still using early printk and therefore
>>>> it would not be able to read character. From your previous email, I see
>>>> that you are requesting serial2. I am assuming this is an alias to the
>>>> same UART as the one you configure for the early printk?
>>>>
>>>> Can you paste the content of the related Device-Tree node? Also, I would
>>>> suggest to check if there are any errors in the Xen logs.
>>> Serial2 is the alias of the UART, some Xen logs as below:
>>> (XEN) adding DT alias:serial2: stem=serial id=2 node=/serial@feb50000
>>> ...
>>> (XEN) Looking for dtuart at "serial2", options ""
>>> (XEN) Unable to initialize dtuart: -19
>>> (XEN) Bad console= option 'dtuart'
>>> The node /serial@feb50000:
>>> uart2: serial@feb50000 {
>>> compatible = "rockchip,rk3588-uart", "snps,dw-apb-uart";
>>> reg = <0x0 0xfeb50000 0x0 0x100>;
>>> interrupts = <GIC_SPI 333 IRQ_TYPE_LEVEL_HIGH>;
>>> clocks = <&cru SCLK_UART2>, <&cru PCLK_UART2>;
>>> clock-names = "baudclk", "apb_pclk";
>>> reg-shift = <2>;
>>> reg-io-width = <4>;
>>> dmas = <&dmac0 10>, <&dmac0 11>;
>>> pinctrl-names = "default";
>>> pinctrl-0 = <&uart2m1_xfer>;
>>> status = "disabled";
>>> };
>>> There did an error when parse the xen command line, the command line I input is not correct ?
>>> fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
>>
>> I can't find the compatible rockchip,rk3588-uart in Linux drivers. Is
>> the UART meant to be similar to snps,dw-apb-uart with some quirks?
> The uart2 with compatible "snps,dw-apb-uart", is not a standard UART driver?

snps,dw-apb-uart is based on the 8250 but there are some quirks. We do
have support in Xen, but the fact that you have an extra compatible makes me
wonder whether there is extra setup required for your UART.

> The UART node:
> uart2: serial@feb50000 {
> compatible = "rockchip,rk3588-uart", "snps,dw-apb-uart";
> reg = <0x0 0xfeb50000 0x0 0x100>;
> interrupts = <GIC_SPI 333 IRQ_TYPE_LEVEL_HIGH>;
> clocks = <&cru SCLK_UART2>, <&cru PCLK_UART2>;
> clock-names = "baudclk", "apb_pclk";
> reg-shift = <2>;
> reg-io-width = <4>;
> dmas = <&dmac0 10>, <&dmac0 11>;
> pinctrl-names = "default";
> pinctrl-0 = <&uart2m1_xfer>;
> status = "disabled";
> };
> Do you have any suggestions on how to investigate the console can't input problem?

I would suggest looking at the Linux driver and checking whether it
differs from the Xen one.
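
Before comparing drivers, two cheap things can be ruled out. The -19 in "Unable to initialize dtuart: -19" is -ENODEV, and the node as pasted carries status = "disabled". The sketch below is a simplified model in plain C, not Xen's actual probing code. It shows how a device-tree consumer typically decides whether a driver may bind: the node must be available, and one of its compatible strings must appear in the driver's match list. The struct, the function names and the driver match list are assumptions made for illustration. Whether this is what happens on this board is also an assumption, since the pasted node may come from the SoC .dtsi with the board .dts overriding the status.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for a device-tree node (not Xen's dt_device_node). */
struct dt_node_model {
    const char *status;         /* NULL when the property is absent */
    const char *compatible[4];  /* most specific first, NULL-terminated */
};

/* A node counts as available when status is absent, "okay" or "ok". */
int node_is_available(const struct dt_node_model *n)
{
    assert(n != NULL);
    return n->status == NULL ||
           strcmp(n->status, "okay") == 0 ||
           strcmp(n->status, "ok") == 0;
}

/*
 * Return 0 when any compatible string of the node appears in the driver's
 * NULL-terminated match list. Return -ENODEV when the node is unavailable
 * or when nothing matches.
 */
int try_bind(const struct dt_node_model *n, const char *const *drv_compat)
{
    if (!node_is_available(n))
        return -ENODEV;
    for (size_t i = 0; i < 4 && n->compatible[i] != NULL; i++)
        for (size_t j = 0; drv_compat[j] != NULL; j++)
            if (strcmp(n->compatible[i], drv_compat[j]) == 0)
                return 0;
    return -ENODEV;
}

/* The uart2 node as pasted, and the same node with the status overridden. */
const struct dt_node_model uart2_as_pasted = {
    "disabled", { "rockchip,rk3588-uart", "snps,dw-apb-uart", NULL }
};
const struct dt_node_model uart2_enabled = {
    "okay", { "rockchip,rk3588-uart", "snps,dw-apb-uart", NULL }
};

/* Hypothetical driver match list containing only the generic fallback. */
const char *const generic_uart_compat[] = { "snps,dw-apb-uart", NULL };
```

In this model the pasted node fails with -ENODEV purely because of its status, while the enabled variant binds through the generic "snps,dw-apb-uart" fallback even though the more specific compatible is unknown to the driver.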

Cheers,

--
Julien Grall
Re: [Bug] Bring up Dom0 on Arm board
Hi,
>>>>> This could indicate that Xen is still using early printk and therefore
>>>>> it would not be able to read character. From your previous email, I see
>>>>> that you are requesting serial2. I am assuming this is an alias to the
>>>>> same UART as the one you configure for the early printk?
>>>>>
>>>>> Can you paste the content of the related Device-Tree node? Also, I would
>>>>> suggest to check if there are any errors in the Xen logs.
>>>> Serial2 is the alias of the UART, some Xen logs as below:
>>>> (XEN) adding DT alias:serial2: stem=serial id=2 node=/serial@feb50000
>>>> ...
>>>> (XEN) Looking for dtuart at "serial2", options ""
>>>> (XEN) Unable to initialize dtuart: -19
>>>> (XEN) Bad console= option 'dtuart'
>>>> The node /serial@feb50000:
>>>> uart2: serial@feb50000 {
>>>> compatible = "rockchip,rk3588-uart", "snps,dw-apb-uart";
>>>> reg = <0x0 0xfeb50000 0x0 0x100>;
>>>> interrupts = <GIC_SPI 333 IRQ_TYPE_LEVEL_HIGH>;
>>>> clocks = <&cru SCLK_UART2>, <&cru PCLK_UART2>;
>>>> clock-names = "baudclk", "apb_pclk";
>>>> reg-shift = <2>;
>>>> reg-io-width = <4>;
>>>> dmas = <&dmac0 10>, <&dmac0 11>;
>>>> pinctrl-names = "default";
>>>> pinctrl-0 = <&uart2m1_xfer>;
>>>> status = "disabled";
>>>> };
>>>> There did an error when parse the xen command line, the command line I input is not correct ?
>>>> fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
>>>
>>> I can't find the compatible rockchip,rk3588-uart in Linux drivers. Is
>>> the UART meant to be similar to snps,dw-apb-uart with some quirks?
>> The uart2 with compatible "snps,dw-apb-uart", is not a standard UART driver?
>
> snps,dw-apb-uart is based on the 8250 but there are some quirks. We do
> have support in Xen, but the fact you have an extra compatible makes me
> wonder whether there are extra setup for your UART.
>
>> The UART node:
>> uart2: serial@feb50000 {
>> compatible = "rockchip,rk3588-uart", "snps,dw-apb-uart";
>> reg = <0x0 0xfeb50000 0x0 0x100>;
>> interrupts = <GIC_SPI 333 IRQ_TYPE_LEVEL_HIGH>;
>> clocks = <&cru SCLK_UART2>, <&cru PCLK_UART2>;
>> clock-names = "baudclk", "apb_pclk";
>> reg-shift = <2>;
>> reg-io-width = <4>;
>> dmas = <&dmac0 10>, <&dmac0 11>;
>> pinctrl-names = "default";
>> pinctrl-0 = <&uart2m1_xfer>;
>> status = "disabled";
>> };
>> Do you have any suggestions on how to investigate the console can't input problem?
>
> I would suggest to look at the Linux driver and check if there are any
> different with the Xen one.
I see, I will check the UART driver.
Regarding the Mali and sdmmc driver problem, I compared the boot logs with and without Xen
and found the following error:
[ 65.517345] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled.
(XEN) d0v2 Unhandled SMC/HVC: 0x82000010
[ 66.559382] arm-scmi firmware:scmi: unable to communicate with SCMI
[ 66.559516] arm-scmi: probe of firmware:scmi failed with error -95
It seems the SCMI driver probe failed.
So I ran an experiment: I disabled the SCMI driver, rebuilt the Linux kernel,
and booted normally without Xen. This reproduced the problem: Mali and sdmmc did not come up.
So SCMI is very likely the cause.
I read the Linux code and located where error -95 comes from.
It seems the SCMI probe fails because the SMCCC call is reported as not supported; the code is below:
static int smc_send_message(struct scmi_chan_info *cinfo,
                            struct scmi_xfer *xfer)
{
    struct scmi_smc *scmi_info = cinfo->transport_info;
    struct arm_smccc_res res;

    mutex_lock(&scmi_info->shmem_lock);
    shmem_tx_prepare(scmi_info->shmem, xfer);
    if (scmi_info->irq)
        reinit_completion(&scmi_info->tx_complete);
    arm_smccc_1_1_invoke(scmi_info->func_id, 0, 0, 0, 0, 0, 0, 0, &res);
    if (scmi_info->irq)
        wait_for_completion(&scmi_info->tx_complete);
    scmi_rx_callback(scmi_info->cinfo, shmem_read_header(scmi_info->shmem));
    mutex_unlock(&scmi_info->shmem_lock);
    /* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
    if (res.a0)
        return -EOPNOTSUPP;
    return 0;
}
#define EOPNOTSUPP 95 /* Operation not supported on transport endpoint */
I also checked where Xen prints "Unhandled SMC/HVC"
and found that the message comes from an unhandled SMCCC call in vsmccc_handle_call().
Could Xen not handling this SMCCC call be what makes the SCMI driver probe fail?
Best regards
Cailigang
Re: [Bug] Bring up Dom0 on Arm board
Hi,

On 13/01/2023 01:24, ??? wrote:
>>>>>> This could indicate that Xen is still using early printk and therefore
>>>>>> it would not be able to read character. From your previous email, I see
>>>>>> that you are requesting serial2. I am assuming this is an alias to the
>>>>>> same UART as the one you configure for the early printk?
>>>>>>
>>>>>> Can you paste the content of the related Device-Tree node? Also, I would
>>>>>> suggest to check if there are any errors in the Xen logs.
>>>>> Serial2 is the alias of the UART, some Xen logs as below:
>>>>> (XEN) adding DT alias:serial2: stem=serial id=2 node=/serial@feb50000
>>>>> ...
>>>>> (XEN) Looking for dtuart at "serial2", options ""
>>>>> (XEN) Unable to initialize dtuart: -19
>>>>> (XEN) Bad console= option 'dtuart'
>>>>> The node /serial@feb50000:
>>>>> uart2: serial@feb50000 {
>>>>> compatible = "rockchip,rk3588-uart", "snps,dw-apb-uart";
>>>>> reg = <0x0 0xfeb50000 0x0 0x100>;
>>>>> interrupts = <GIC_SPI 333 IRQ_TYPE_LEVEL_HIGH>;
>>>>> clocks = <&cru SCLK_UART2>, <&cru PCLK_UART2>;
>>>>> clock-names = "baudclk", "apb_pclk";
>>>>> reg-shift = <2>;
>>>>> reg-io-width = <4>;
>>>>> dmas = <&dmac0 10>, <&dmac0 11>;
>>>>> pinctrl-names = "default";
>>>>> pinctrl-0 = <&uart2m1_xfer>;
>>>>> status = "disabled";
>>>>> };
>>>>> There did an error when parse the xen command line, the command line I input is not correct ?
>>>>> fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial2 dom0_mem=4G dom0_max_vcpus=4 vwfi=native sched=null"
>>>>
>>>> I can't find the compatible rockchip,rk3588-uart in Linux drivers. Is
>>>> the UART meant to be similar to snps,dw-apb-uart with some quirks?
>>> The uart2 with compatible "snps,dw-apb-uart", is not a standard UART driver?
>>
>> snps,dw-apb-uart is based on the 8250 but there are some quirks. We do
>> have support in Xen, but the fact you have an extra compatible makes me
>> wonder whether there are extra setup for your UART.
>>
>>> The UART node:
>>> uart2: serial@feb50000 {
>>> compatible = "rockchip,rk3588-uart", "snps,dw-apb-uart";
>>> reg = <0x0 0xfeb50000 0x0 0x100>;
>>> interrupts = <GIC_SPI 333 IRQ_TYPE_LEVEL_HIGH>;
>>> clocks = <&cru SCLK_UART2>, <&cru PCLK_UART2>;
>>> clock-names = "baudclk", "apb_pclk";
>>> reg-shift = <2>;
>>> reg-io-width = <4>;
>>> dmas = <&dmac0 10>, <&dmac0 11>;
>>> pinctrl-names = "default";
>>> pinctrl-0 = <&uart2m1_xfer>;
>>> status = "disabled";
>>> };
>>> Do you have any suggestions on how to investigate the console can't input problem?
>>
>> I would suggest to look at the Linux driver and check if there are any
>> different with the Xen one.
> I see, I will check UART driver.
> About the mali and sdmmc drivers problem, I compare the log between boot with xen and boot without xen.
> And found an error log as below:
> [ 65.517345] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled.
> (XEN) d0v2 Unhandled SMC/HVC: 0x82000010
> [ 66.559382] arm-scmi firmware:scmi: unable to communicate with SCMI
> [ 66.559516] arm-scmi: probe of firmware:scmi failed with error -95
> It seems SCMI driver probe failed.
> So I did an experiment, disable SCMI driver and rebuild the Linux kernel,
> boot up in normal way without xen, and reproduces the problem that mali and sdmmc did not bring up.
> It looks like a high probability SCMI cause the problem.
> I read the Linux code and targeting located the error -95,
> It seems SCMI probe failed cause by SMCCC not supported, code as below:
> static int smc_send_message(struct scmi_chan_info *cinfo,
> struct scmi_xfer *xfer)
> {
> struct scmi_smc *scmi_info = cinfo->transport_info;
> struct arm_smccc_res res;
> mutex_lock(&scmi_info->shmem_lock);
> shmem_tx_prepare(scmi_info->shmem, xfer);
> if (scmi_info->irq)
> reinit_completion(&scmi_info->tx_complete);
> arm_smccc_1_1_invoke(scmi_info->func_id, 0, 0, 0, 0, 0, 0, 0, &res);
> if (scmi_info->irq)
> wait_for_completion(&scmi_info->tx_complete);
> scmi_rx_callback(scmi_info->cinfo, shmem_read_header(scmi_info->shmem));
> mutex_unlock(&scmi_info->shmem_lock);
> /* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
> if (res.a0)
> return -EOPNOTSUPP;
> return 0;
> }
> #define EOPNOTSUPP 95 /* Operation not supported on transport endpoint */
> I also check the code where Unhandled SMC/HVC print in xen,
> and found the log cause by unhandled SMCCC call in function vsmccc_handle_call().
> Could it be xen unhandle SMCCC call cause SCMI driver probe failed ?

Yes. The domain would need to talk to the host SCMI. This is not yet
supported because Xen doesn't provide a mediator (this is necessary to
ensure the safety of the call).

If you are *only* looking to use the Mali driver in dom0, you could
add some code in Xen to simply forward the request to the host
and check if it helps you.
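
To see what kind of call 0x82000010 is, the function ID can be split into the fields defined by the SMC Calling Convention: bit 31 selects fast versus yielding calls, bit 30 selects the 64-bit convention, bits 29:24 give the owning entity and bits 15:0 the function number. The sketch below is plain C, not Xen code, and its struct and function names are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an SMCCC function identifier. */
struct smccc_fid {
    unsigned int fast;      /* bit 31: 1 = fast call, 0 = yielding call */
    unsigned int smc64;     /* bit 30: 1 = SMC64/HVC64, 0 = SMC32/HVC32 */
    unsigned int owner;     /* bits 29:24: owning entity number */
    unsigned int function;  /* bits 15:0: function number */
};

/* Split a 32-bit function identifier into its fields. */
struct smccc_fid smccc_decode(uint32_t fid)
{
    struct smccc_fid d;

    d.fast = (fid >> 31) & 0x1u;
    d.smc64 = (fid >> 30) & 0x1u;
    d.owner = (fid >> 24) & 0x3fu;
    d.function = fid & 0xffffu;
    return d;
}

/* Map an owning entity number to the service range it belongs to. */
const char *smccc_owner_name(unsigned int owner)
{
    assert(owner <= 0x3fu);
    switch (owner) {
    case 0: return "Arm Architecture";
    case 1: return "CPU Service";
    case 2: return "SiP Service";
    case 3: return "OEM Service";
    case 4: return "Standard Secure Service";
    case 5: return "Standard Hypervisor Service";
    case 6: return "Vendor Specific Hypervisor Service";
    default:
        return owner >= 48 ? "Trusted Application / Trusted OS" : "reserved";
    }
}
```

For 0x82000010 this gives a fast SMC32 call owned by the SiP (silicon provider) range with function number 0x10, i.e. a vendor firmware call rather than one of the standard services.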

Cheers,

--
Julien Grall
Re: [Bug] Bring up Dom0 on Arm board
Hi,
>> About the mali and sdmmc drivers problem, I compare the log between boot with xen and boot without xen.
>> And found an error log as below:
>> [ 65.517345] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled.
>> (XEN) d0v2 Unhandled SMC/HVC: 0x82000010
>> [ 66.559382] arm-scmi firmware:scmi: unable to communicate with SCMI
>> [ 66.559516] arm-scmi: probe of firmware:scmi failed with error -95
>> It seems SCMI driver probe failed.
>> So I did an experiment, disable SCMI driver and rebuild the Linux kernel,
>> boot up in normal way without xen, and reproduces the problem that mali and sdmmc did not bring up.
>> It looks like a high probability SCMI cause the problem.
>> I read the Linux code and targeting located the error -95,
>> It seems SCMI probe failed cause by SMCCC not supported, code as below:
>> static int smc_send_message(struct scmi_chan_info *cinfo,
>> struct scmi_xfer *xfer)
>> {
>> struct scmi_smc *scmi_info = cinfo->transport_info;
>> struct arm_smccc_res res;
>> mutex_lock(&scmi_info->shmem_lock);
>> shmem_tx_prepare(scmi_info->shmem, xfer);
>> if (scmi_info->irq)
>> reinit_completion(&scmi_info->tx_complete);
>> arm_smccc_1_1_invoke(scmi_info->func_id, 0, 0, 0, 0, 0, 0, 0, &res);
>> if (scmi_info->irq)
>> wait_for_completion(&scmi_info->tx_complete);
>> scmi_rx_callback(scmi_info->cinfo, shmem_read_header(scmi_info->shmem));
>> mutex_unlock(&scmi_info->shmem_lock);
>> /* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
>> if (res.a0)
>> return -EOPNOTSUPP;
>> return 0;
>> }
>> #define EOPNOTSUPP 95 /* Operation not supported on transport endpoint */
>> I also check the code where Unhandled SMC/HVC print in xen,
>> and found the log cause by unhandled SMCCC call in function vsmccc_handle_call().
>> Could it be xen unhandle SMCCC call cause SCMI driver probe failed ?
>
> Yes. The domain would need to talk to the host SCMI. This is not yet
> supported because Xen doesn't provide a mediator (this is necessary to
> ensure the safety of the call).
>
> If you are *only* looking to use the Mali driver in dom0. So you could
> add some code in Xen to forward simply forward the request to the host
> and check if it helps you.
How can I pass the request to the host? I'm not familiar with the Xen code; is there any
reference code in Xen?
Best regards
Cailigang
Re: [Bug] Bring up Dom0 on Arm board
On 16/01/2023 05:59, ??? wrote:
> Hi,

Hi,

>>> About the mali and sdmmc drivers problem, I compare the log between boot with xen and boot without xen.
>>> And found an error log as below:
>>> [ 65.517345] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled.
>>> (XEN) d0v2 Unhandled SMC/HVC: 0x82000010
>>> [ 66.559382] arm-scmi firmware:scmi: unable to communicate with SCMI
>>> [ 66.559516] arm-scmi: probe of firmware:scmi failed with error -95
>>> It seems SCMI driver probe failed.
>>> So I did an experiment, disable SCMI driver and rebuild the Linux kernel,
>>> boot up in normal way without xen, and reproduces the problem that mali and sdmmc did not bring up.
>>> It looks like a high probability SCMI cause the problem.
>>> I read the Linux code and targeting located the error -95,
>>> It seems SCMI probe failed cause by SMCCC not supported, code as below:
>>> static int smc_send_message(struct scmi_chan_info *cinfo,
>>>   struct scmi_xfer *xfer)
>>> {
>>>   struct scmi_smc *scmi_info = cinfo->transport_info;
>>>   struct arm_smccc_res res;
>>>   mutex_lock(&scmi_info->shmem_lock);
>>>   shmem_tx_prepare(scmi_info->shmem, xfer);
>>>   if (scmi_info->irq)
>>>   reinit_completion(&scmi_info->tx_complete);
>>>   arm_smccc_1_1_invoke(scmi_info->func_id, 0, 0, 0, 0, 0, 0, 0, &res);
>>>   if (scmi_info->irq)
>>>   wait_for_completion(&scmi_info->tx_complete);
>>>   scmi_rx_callback(scmi_info->cinfo, shmem_read_header(scmi_info->shmem));
>>>   mutex_unlock(&scmi_info->shmem_lock);
>>>   /* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
>>>   if (res.a0)
>>>   return -EOPNOTSUPP;
>>>   return 0;
>>> }
>>> #define EOPNOTSUPP 95 /* Operation not supported on transport endpoint */
>>> I also check the code where Unhandled SMC/HVC print in xen,
>>> and found the log cause by unhandled SMCCC call in function vsmccc_handle_call().
>>> Could it be xen unhandle SMCCC call cause SCMI driver probe failed ?
>>
>> Yes. The domain would need to talk to the host SCMI. This is not yet
>> supported because Xen doesn't provide a mediator (this is necessary to
>> ensure the safety of the call).
>>
>> If you are *only* looking to use the Mali driver in dom0. So you could
>> add some code in Xen to forward simply forward the request to the host
>> and check if it helps you.
>
> How can I pass the requset to the host? I'm not familiar with xen code,
> is there any
> reference code in xen?

There are a couple of solutions:
1) You ask the hypervisor to avoid trapping SMC. This would also
need some changes in Linux to force the Mali driver to use SMC rather
than HVC. There is a patch on xen-devel to avoid trapping (see [1]).
2) Add an allow list of the SMCCC operations. There are some examples
of how to "emulate" an SMC call in Xen (see vsmccc_handle_call()).
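
A minimal sketch of the allow-list idea in option 2, written as plain C so it can be exercised outside Xen. Everything here is hypothetical: the names handle_guest_smc, fid_is_allowed and the forwarding hook are made up for illustration and are not Xen APIs, and the stub stands in for the real call into firmware. It only checks the function ID; it does not vet the SCMI message in shared memory, which is what a proper mediator would have to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SMCCC return value for an unknown function ID. */
#define SMCCC_NOT_SUPPORTED (-1L)

/* Hypothetical hook that would issue the real SMC to firmware. */
typedef long (*smc_forward_fn)(uint32_t fid);

/* Example allow list; 0x82000010 is the ID reported in the Xen log. */
const uint32_t allowed_fids[] = { 0x82000010u };

/* Return true when the function ID appears in the allow list. */
bool fid_is_allowed(uint32_t fid)
{
    for (size_t i = 0; i < sizeof(allowed_fids) / sizeof(allowed_fids[0]); i++)
        if (allowed_fids[i] == fid)
            return true;
    return false;
}

/* Forward allowed calls, reject everything else as not supported. */
long handle_guest_smc(uint32_t fid, smc_forward_fn forward)
{
    assert(forward != NULL);
    if (!fid_is_allowed(fid))
        return SMCCC_NOT_SUPPORTED;
    return forward(fid);
}

/* Stand-in for firmware: counts calls and reports success. */
unsigned int stub_forward_calls;

long stub_forward(uint32_t fid)
{
    (void)fid;
    stub_forward_calls++;
    return 0;
}
```

With the stub, handle_guest_smc(0x82000010u, stub_forward) reaches the hook, while any other ID is rejected without touching it.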

Cheers,

[1]
https://lore.kernel.org/xen-devel/alpine.DEB.2.21.2106241749310.24906@sstabellini-ThinkPad-T480s/

>
> Best regards
> Cailigang

--
Julien Grall
Re: [Bug] Bring up Dom0 on Arm board
Hi,
>>>> About the mali and sdmmc drivers problem, I compare the log between boot with xen and boot without xen.
>>>> And found an error log as below:
>>>> [ 65.517345] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled.
>>>> (XEN) d0v2 Unhandled SMC/HVC: 0x82000010
>>>> [ 66.559382] arm-scmi firmware:scmi: unable to communicate with SCMI
>>>> [ 66.559516] arm-scmi: probe of firmware:scmi failed with error -95
>>>> It seems SCMI driver probe failed.
>>>> So I did an experiment, disable SCMI driver and rebuild the Linux kernel,
>>>> boot up in normal way without xen, and reproduces the problem that mali and sdmmc did not bring up.
>>>> It looks like a high probability SCMI cause the problem.
>>>> I read the Linux code and targeting located the error -95,
>>>> It seems SCMI probe failed cause by SMCCC not supported, code as below:
>>>> static int smc_send_message(struct scmi_chan_info *cinfo,
>>>> struct scmi_xfer *xfer)
>>>> {
>>>> struct scmi_smc *scmi_info = cinfo->transport_info;
>>>> struct arm_smccc_res res;
>>>> mutex_lock(&scmi_info->shmem_lock);
>>>> shmem_tx_prepare(scmi_info->shmem, xfer);
>>>> if (scmi_info->irq)
>>>> reinit_completion(&scmi_info->tx_complete);
>>>> arm_smccc_1_1_invoke(scmi_info->func_id, 0, 0, 0, 0, 0, 0, 0, &res);
>>>> if (scmi_info->irq)
>>>> wait_for_completion(&scmi_info->tx_complete);
>>>> scmi_rx_callback(scmi_info->cinfo, shmem_read_header(scmi_info->shmem));
>>>> mutex_unlock(&scmi_info->shmem_lock);
>>>> /* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
>>>> if (res.a0)
>>>> return -EOPNOTSUPP;
>>>> return 0;
>>>> }
>>>> #define EOPNOTSUPP 95 /* Operation not supported on transport endpoint */
>>>> I also check the code where Unhandled SMC/HVC print in xen,
>>>> and found the log cause by unhandled SMCCC call in function vsmccc_handle_call().
>>>> Could it be xen unhandle SMCCC call cause SCMI driver probe failed ?
>>>
>>> Yes. The domain would need to talk to the host SCMI. This is not yet
>>> supported because Xen doesn't provide a mediator (this is necessary to
>>> ensure the safety of the call).
>>>
>>> If you are *only* looking to use the Mali driver in dom0. So you could
>>> add some code in Xen to forward simply forward the request to the host
>>> and check if it helps you.
>>
>> How can I pass the requset to the host? I'm not familiar with xen code,
>> is there any
>> reference code in xen?
>
> There are a couple of solution:
> 1) You request the hypervisor to avoid trapping SVC. This would also
> need some changes in Linux to force the Mali driver to use SVC rather
> than HVC. There is a patch on xen-devel, to avoid trapping (see [1]).
> 2) Add an allow list of the SMCCC operations. There are some examples
> how to "emulate" SMC call in Xen (see vsmccc_handle_call()).
>
> Cheers,
>
> [1]
> https://lore.kernel.org/xen-devel/alpine.DEB.2.21.2106241749310.24906@sstabellini-ThinkPad-T480s/
I tried both solutions, but neither worked.
First way:
I applied the code from the patch and added forward_smc=true to the Xen command line.
After booting, the 'Unhandled SMC/HVC ...' message is still printed, and Xen
also reports an exception. The log is below:
(XEN) traps.c:1987:d0v3 HSR=0x92000007 pc=0xffffffc0106a3be8 gva=0xffffffc0127ad0a0 gpa=0x000000001000a0
[ 5.966596] Unhandled fault at 0xffffffc0127ad0a0
[ 5.966619] Mem abort info:
[ 5.966633] ESR = 0x96000000
[ 5.966649] EC = 0x25: DABT (current EL), IL = 32 bits
[ 5.966666] SET = 0, FnV = 0
[ 5.966680] EA = 0, S1PTW = 0
[ 5.966694] Data abort info:
[ 5.966708] ISV = 0, ISS = 0x00000000
[ 5.966722] CM = 0, WnR = 0
[ 5.966738] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000194b6000
It seems forward_smc=true did not take effect and instead caused an exception.
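
For reference, the HSR and ESR values in this log can be decoded by hand: the exception class (EC) is bits 31:26, IL is bit 25, the syndrome (ISS) is bits 24:0, and for data aborts the fault status code (DFSC) is the low six bits of the ISS. The sketch below is plain C with illustrative function names, not code from Xen or Linux, and its EC table lists only the classes that appear in this thread plus HVC and SMC.

```c
#include <assert.h>
#include <stdint.h>

/* Exception class, bits 31:26. */
unsigned int esr_ec(uint32_t esr)
{
    return (esr >> 26) & 0x3fu;
}

/* Instruction length bit, bit 25. */
unsigned int esr_il(uint32_t esr)
{
    return (esr >> 25) & 0x1u;
}

/* Instruction specific syndrome, bits 24:0. */
uint32_t esr_iss(uint32_t esr)
{
    return esr & 0x1ffffffu;
}

/* Data fault status code, bits 5:0 of the ISS (data aborts only). */
unsigned int esr_dfsc(uint32_t esr)
{
    return esr & 0x3fu;
}

/* Names for the exception classes relevant to this thread. */
const char *esr_ec_name(unsigned int ec)
{
    assert(ec <= 0x3fu);
    switch (ec) {
    case 0x16: return "HVC from AArch64";
    case 0x17: return "SMC from AArch64";
    case 0x18: return "trapped MSR/MRS/system instruction";
    case 0x24: return "data abort from a lower exception level";
    case 0x25: return "data abort from the current exception level";
    default:   return "other";
    }
}
```

HSR=0x92000007 decodes to EC 0x24 with DFSC 0x07, a level 3 translation fault on a guest data access. That would be consistent with gpa 0x1000a0 not being mapped for dom0, although this is an inference from the numbers and not something confirmed in the thread. The same EC values, 0x24 and 0x18, are the ones printed by the earlier do_trap_guest_sync debug output.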
Second way:
I used zynqmp_eemi() in xilinx-zynqmp-eemi.c to handle the SMC call.
According to the Xen log, the SMC call seems to succeed.
But after the SMC call runs, the kernel reports an exception. The log is below:
[ 8.771446] rockchip-pm-domain fd8d8000.power-management:power-controller:
Looking up pcie-supply from device tree
[ 8.771485] rockchip-pm-domain fd8d8000.power-management:power-controller:
Looking up pcie-supply property in node /power-management@fd8d8000/power-controller failed
[ 66.037851] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 66.037874] rcu: 3-...0: (0 ticks this GP) idle=196/1/0x4000000000000000 softirq=30/30 fqs=6000
[ 66.037892] (detected by 5, t=18002 jiffies, g=-1147, q=81)
[ 66.037905] Task dump for CPU 3:
[ 66.037916] task:swapper/0 state:R running task stack: 0 pid: 1 ppid: 0 flags:0x0000002a
[ 66.037939] Call trace:
[ 66.037952] __switch_to+0x130/0x1a0
[ 66.037968] 0xffffffc0112bc173
It seems I need to modify zynqmp_eemi() to suit the kernel,
but what should the modifications be based on?
Best regards
Cailigang
Re: [Bug] Bring up Dom0 on Arm board [ In reply to ]
On 18/01/2023 09:04, ??? wrote:
>>>>> About the mali and sdmmc drivers problem, I compare the log between boot with xen and boot without xen.
>>>>> And found an error log as below:
>>>>> [ 65.517345] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled.
>>>>> (XEN) d0v2 Unhandled SMC/HVC: 0x82000010
>>>>> [ 66.559382] arm-scmi firmware:scmi: unable to communicate with SCMI
>>>>> [ 66.559516] arm-scmi: probe of firmware:scmi failed with error -95
>>>>> It seems SCMI driver probe failed.
>>>>> So I did an experiment, disable SCMI driver and rebuild the Linux kernel,
>>>>> boot up in normal way without xen, and reproduces the problem that mali and sdmmc did not bring up.
>>>>> It looks like a high probability SCMI cause the problem.
>>>>> I read the Linux code and targeting located the error -95,
>>>>> It seems SCMI probe failed cause by SMCCC not supported, code as below:
>>>>> static int smc_send_message(struct scmi_chan_info *cinfo,
>>>>> struct scmi_xfer *xfer)
>>>>> {
>>>>> struct scmi_smc *scmi_info = cinfo->transport_info;
>>>>> struct arm_smccc_res res;
>>>>> mutex_lock(&scmi_info->shmem_lock);
>>>>> shmem_tx_prepare(scmi_info->shmem, xfer);
>>>>> if (scmi_info->irq)
>>>>> reinit_completion(&scmi_info->tx_complete);
>>>>> arm_smccc_1_1_invoke(scmi_info->func_id, 0, 0, 0, 0, 0, 0, 0, &res);
>>>>> if (scmi_info->irq)
>>>>> wait_for_completion(&scmi_info->tx_complete);
>>>>> scmi_rx_callback(scmi_info->cinfo, shmem_read_header(scmi_info->shmem));
>>>>> mutex_unlock(&scmi_info->shmem_lock);
>>>>> /* Only SMCCC_RET_NOT_SUPPORTED is valid error code */
>>>>> if (res.a0)
>>>>> return -EOPNOTSUPP;
>>>>> return 0;
>>>>> }
>>>>> #define EOPNOTSUPP 95 /* Operation not supported on transport endpoint */
>>>>> I also check the code where Unhandled SMC/HVC print in xen,
>>>>> and found the log cause by unhandled SMCCC call in function vsmccc_handle_call().
>>>>> Could it be xen unhandle SMCCC call cause SCMI driver probe failed ?
>>>>
>>>> Yes. The domain would need to talk to the host SCMI. This is not yet
>>>> supported because Xen doesn't provide a mediator (this is necessary to
>>>> ensure the safety of the call).
>>>>
>>>> If you are *only* looking to use the Mali driver in dom0, you could
>>>> add some code in Xen to simply forward the request to the host
>>>> and check if it helps you.
>>>
>>> How can I pass the request to the host? I'm not familiar with the Xen code.
>>> Is there any reference code in Xen?
>>
>> There are a couple of solutions:
>> 1) You request the hypervisor to avoid trapping SMC. This would also
>> need some changes in Linux to force the Mali driver to use SMC rather
>> than HVC. There is a patch on xen-devel to avoid trapping (see [1]).
>> 2) Add an allow list of the SMCCC operations. There are some examples
>> of how to "emulate" an SMC call in Xen (see vsmccc_handle_call()).
>>
>> Cheers,
>>
>> [1]
>> https://lore.kernel.org/xen-devel/alpine.DEB.2.21.2106241749310.24906@sstabellini-ThinkPad-T480s/
> I tried both solutions but it didn't work.
> First way:
> I add the code as the patch, add forward_smc=true to the Xen command line.
> Then boot, but still print the log 'Unhandled SMC/HVC ...',

Can you confirm whether you asked the Mali driver to use SMC calls?

> meanwhile xen
> throw an exception, log as below:
> (XEN) traps.c:1987:d0v3 HSR=0x92000007 pc=0xffffffc0106a3be8 gva=0xffffffc0127ad0a0 gpa=0x000000001000a0
> [ 5.966596] Unhandled fault at 0xffffffc0127ad0a0
> [ 5.966619] Mem abort info:
> [ 5.966633] ESR = 0x96000000
> [ 5.966649] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 5.966666] SET = 0, FnV = 0
> [ 5.966680] EA = 0, S1PTW = 0
> [ 5.966694] Data abort info:
> [ 5.966708] ISV = 0, ISS = 0x00000000
> [ 5.966722] CM = 0, WnR = 0
> [ 5.966738] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000194b6000
> It seems forward_smc=true did not work, but cause an exception.

This indicates that dom0 is trying to access a region that is
not mapped.

Can you check what the address 0x1000a0 is used for in the host layout?
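
For reference, the HSR value in the Xen log line above can be unpacked by
hand. The sketch below is not Xen code; the struct and function names are
made up for illustration, and only the bit positions come from the Arm
syndrome register layout (EC in bits [31:26], IL in bit 25, ISV in bit 24,
WnR in bit 6, DFSC in bits [5:0]).

```c
#include <stdint.h>

/* Selected fields of an AArch64 HSR/ESR_EL2 value for a data abort.
 * Names are illustrative, not taken from the Xen sources. */
struct hsr_dabt {
    unsigned ec;    /* exception class, bits [31:26] */
    unsigned il;    /* instruction length bit, bit 25 */
    unsigned isv;   /* instruction syndrome valid, bit 24 */
    unsigned wnr;   /* write-not-read, bit 6 */
    unsigned dfsc;  /* data fault status code, bits [5:0] */
};

struct hsr_dabt decode_hsr(uint32_t hsr)
{
    struct hsr_dabt d;

    d.ec   = (hsr >> 26) & 0x3f;
    d.il   = (hsr >> 25) & 0x1;
    d.isv  = (hsr >> 24) & 0x1;
    d.wnr  = (hsr >> 6) & 0x1;
    d.dfsc = hsr & 0x3f;
    return d;
}

/* Translation faults are encoded as DFSC 0b0001LL, where LL is the level. */
int dfsc_is_translation_fault(unsigned dfsc)
{
    return (dfsc & 0x3c) == 0x04;
}

unsigned dfsc_level(unsigned dfsc)
{
    return dfsc & 0x3;
}
```

For HSR=0x92000007 this gives EC 0x24 (data abort from a lower exception
level), ISV 0, WnR 0 (a read) and DFSC 0x07, i.e. a level 3 translation
fault, which matches a read of a guest physical address that has no
stage-2 mapping.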

> Second way:
> I used zynqmp_eemi() in xilinx-zynqmp-eemi.c to handle smc call.
> The smc call seems succeed according to xen log.
> But after run smc call, kernel throw an exception, log as below:
> [ 8.771446] rockchip-pm-domain fd8d8000.power-management:power-controller:
> Looking up pcie-supply from device tree
> [ 8.771485] rockchip-pm-domain fd8d8000.power-management:power-controller:
> Looking up pcie-supply property in node /power-management@fd8d8000/power-controller failed
> [ 66.037851] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 66.037874] rcu: 3-...0: (0 ticks this GP) idle=196/1/0x4000000000000000 softirq=30/30 fqs=6000
> [ 66.037892] (detected by 5, t=18002 jiffies, g=-1147, q=81)
> [ 66.037905] Task dump for CPU 3:
> [ 66.037916] task:swapper/0 state:R running task stack: 0 pid: 1 ppid: 0 flags:0x0000002a
> [ 66.037939] Call trace:
> [ 66.037952] __switch_to+0x130/0x1a0
> [ 66.037968] 0xffffffc0112bc173
> It seems I need to modify zynqmp_eemi() code to adapt the kerenl,
> But based on what to modify zynqmp_eemi()?

I am not sure I understand what changes you made in Xen. Can you
post the diff?

Cheers,

--
Julien Grall
Re: [Bug] Bring up Dom0 on Arm board [ In reply to ]
On Wed, 18 Jan 2023, ??? wrote:
> I tried both solutions but it didn't work.
>
> First way:
> I add the code as the patch, add forward_smc=true to the Xen command line.
> Then boot, but still print the  log 'Unhandled SMC/HVC ...', meanwhile xen
> throw an exception, log as below:
>
> (XEN) traps.c:1987:d0v3 HSR=0x92000007 pc=0xffffffc0106a3be8 gva=0xffffffc0127ad0a0 gpa=0x000000001000a0
> [    5.966596] Unhandled fault at 0xffffffc0127ad0a0
> [    5.966619] Mem abort info:
> [    5.966633]   ESR = 0x96000000
> [    5.966649]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    5.966666]   SET = 0, FnV = 0
> [    5.966680]   EA = 0, S1PTW = 0
> [    5.966694] Data abort info:
> [    5.966708]   ISV = 0, ISS = 0x00000000
> [    5.966722]   CM = 0, WnR = 0
> [    5.966738] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000194b6000
>
> It seems forward_smc=true did not work, but cause an exception.

This is not "Unhandled SMC/HVC". The guest is trying to access address
0x1000a0 which doesn't seem to be valid in the guest?

Do you know where 0x1000a0 is coming from? Could it be that one of the
SMC calls returns 0x1000a0 to the guest and the guest tries to access
it, but actually 0x1000a0 is not present in any valid device tree ranges
so it is not accessible from the guest?

If 0x1000a0 is a "special" address returned by the firmware, it needs to
belong to a range described in one of the device tree nodes for it to be
accessible by dom0.
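
A rough way to run this check offline is to collect the reg ranges from
the device tree and test the faulting address against them. The sketch
below is only an illustration: the struct, the function name and any
ranges passed to it are hypothetical, and the real list has to come from
the board's dtb.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One (base, size) pair as found in a device tree "reg" property. */
struct mem_range {
    uint64_t base;
    uint64_t size;
};

/* Return true if addr lies inside any of the n ranges. */
bool addr_in_ranges(uint64_t addr, const struct mem_range *ranges, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* addr - base < size avoids overflow in base + size */
        if (addr >= ranges[i].base && addr - ranges[i].base < ranges[i].size)
            return true;
    }
    return false;
}
```

If addr_in_ranges(0x1000a0, ...) is false for every range described in the
dtb, then under this reasoning dom0 would have no mapping for that address.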


> Second way:
> I used zynqmp_eemi() in xilinx-zynqmp-eemi.c to handle smc call.
> The smc call seems succeed according to xen log.
> But after run smc call, kernel throw an exception, log as below:
>
> [    8.771446] rockchip-pm-domain fd8d8000.power-management:power-controller: 
> Looking up pcie-supply from device tree
> [    8.771485] rockchip-pm-domain fd8d8000.power-management:power-controller: 
> Looking up pcie-supply property in node /power-management@fd8d8000/power-controller failed
> [   66.037851] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [   66.037874] rcu:     3-...0: (0 ticks this GP) idle=196/1/0x4000000000000000 softirq=30/30 fqs=6000 
> [   66.037892]  (detected by 5, t=18002 jiffies, g=-1147, q=81)
> [   66.037905] Task dump for CPU 3:
> [   66.037916] task:swapper/0       state:R  running task     stack:    0 pid:    1 ppid:     0 flags:0x0000002a
> [   66.037939] Call trace:
> [   66.037952]  __switch_to+0x130/0x1a0
> [   66.037968]  0xffffffc0112bc173
>
> It seems I need to modify zynqmp_eemi() code to adapt the kerenl, 
> But based on what to modify zynqmp_eemi()?

In theory, there is no reason why forward_smc=true would not work but
xilinx-zynqmp-eemi.c would work. But anyway, I am appending the changes
to make sure xilinx-zynqmp-eemi.c forwards everything to the firmware.

It looks like the kernel is throwing an exception because it is blocked
for too long. Maybe the SMC call didn't actually succeed after all.


diff --git a/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c b/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c
index 2053ed7ac5..20aa6afb47 100644
--- a/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c
+++ b/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c
@@ -57,6 +57,8 @@ bool zynqmp_eemi(struct cpu_user_regs *regs)
     unsigned int pm_fn = fid & 0xFFFF;
     enum pm_ret_status ret;
 
+    goto forward_to_fw;
+
     switch ( fid )
     {
     /* Mandatory SMC32 functions. */
Re: [Bug] Bring up Dom0 on Arm board [ In reply to ]
Hi,
I was on vacation, sorry for the late reply.
>> I tried both solutions but it didn't work.
>> First way:
>> I add the code as the patch, add forward_smc=true to the Xen command line.
>> Then boot, but still print the log 'Unhandled SMC/HVC ...',
Julien Grall wrote:
> Can you confirm whether you ask the mali driver to use SMC call?
I added a log to print the hsr.ec value in do_trap_guest_sync();
it shows that the calls are HSR_EC_HVC64 and HSR_EC_SMC64.
But HCR_TSC seems to cover only the HSR_EC_SMC64 calls.
>> meanwhile xen
>> throw an exception, log as below:
>> (XEN) traps.c:1987:d0v3 HSR=0x92000007 pc=0xffffffc0106a3be8 gva=0xffffffc0127ad0a0 gpa=0x000000001000a0
>> [ 5.966596] Unhandled fault at 0xffffffc0127ad0a0
>> [ 5.966619] Mem abort info:
>> [ 5.966633] ESR = 0x96000000
>> [ 5.966649] EC = 0x25: DABT (current EL), IL = 32 bits
>> [ 5.966666] SET = 0, FnV = 0
>> [ 5.966680] EA = 0, S1PTW = 0
>> [ 5.966694] Data abort info:
>> [ 5.966708] ISV = 0, ISS = 0x00000000
>> [ 5.966722] CM = 0, WnR = 0
>> [ 5.966738] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000194b6000
>> It seems forward_smc=true did not work, but cause an exception.
Julien Grall wrote:
> This is indicating that the dom0 is trying to access a region that is
> not mapped.
> Can you check what the address 0x1000a0 is used for the host layout?
Stefano Stabellini wrote:
> This is not "Unhandled SMC/HVC". The guest is trying to access address
> 0x1000a0 which doesn't seem to be valid in the guest?
>
> Do you know where 0x1000a0 is coming from? Could it be that one of the
> SMC calls returns 0x1000a0 to the guest and the guest tries to access
> it, but actually 0x1000a0 is not present in any valid device tree ranges
> so it is not accessible from the guest?
> If 0x1000a0 is a "special" address returned by the firmware, it needs to
> belong to a range described in one of the device tree nodes for it to be
> accessible by dom0.
I searched for 0x1000a0 in the kernel code and in the dtb files, but found nothing.
So far I haven't figured out where 0x1000a0 comes from.
>> Second way:
>> I used zynqmp_eemi() in xilinx-zynqmp-eemi.c to handle smc call.
>> The smc call seems succeed according to xen log.
>> But after run smc call, kernel throw an exception, log as below:
>> [ 8.771446] rockchip-pm-domain fd8d8000.power-management:power-controller:
>> Looking up pcie-supply from device tree
>> [ 8.771485] rockchip-pm-domain fd8d8000.power-management:power-controller:
>> Looking up pcie-supply property in node /power-management@fd8d8000/power-controller failed
>> [ 66.037851] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>> [ 66.037874] rcu: 3-...0: (0 ticks this GP) idle=196/1/0x4000000000000000 softirq=30/30 fqs=6000
>> [ 66.037892] (detected by 5, t=18002 jiffies, g=-1147, q=81)
>> [ 66.037905] Task dump for CPU 3:
>> [ 66.037916] task:swapper/0 state:R running task stack: 0 pid: 1 ppid: 0 flags:0x0000002a
>> [ 66.037939] Call trace:
>> [ 66.037952] __switch_to+0x130/0x1a0
>> [ 66.037968] 0xffffffc0112bc173
>> It seems I need to modify zynqmp_eemi() code to adapt the kerenl,
>> But based on what to modify zynqmp_eemi()?
Julien Grall wrote:
> I am not sure I understand what are the changes you made in Xen. Can you
> post the diff?
The changes I made in Xen are below:
diff --git a/xen/arch/arm/vsmc.c b/xen/arch/arm/vsmc.c
index b633ff2fe8..83bb063016 100644
--- a/xen/arch/arm/vsmc.c
+++ b/xen/arch/arm/vsmc.c
@@ -27,6 +27,7 @@
 #include <asm/traps.h>
 #include <asm/vpsci.h>
 #include <asm/platform.h>
+#include <asm/platforms/xilinx-zynqmp-eemi.h>
 
 /* Number of functions currently supported by Hypervisor Service. */
 #define XEN_SMCCC_FUNCTION_COUNT 3
@@ -280,7 +281,8 @@ static bool vsmccc_handle_call(struct cpu_user_regs *regs)
             handled = handle_sssc(regs);
             break;
         case ARM_SMCCC_OWNER_SIP:
-            handled = platform_smc(regs);
+            // handled = platform_smc(regs);
+            handled = zynqmp_eemi(regs);
             break;
         case ARM_SMCCC_OWNER_TRUSTED_APP ... ARM_SMCCC_OWNER_TRUSTED_APP_END:
         case ARM_SMCCC_OWNER_TRUSTED_OS ... ARM_SMCCC_OWNER_TRUSTED_OS_END:

diff --git a/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h b/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h
index cf25a9014d..55e20e99ca 100644
--- a/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h
+++ b/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h
@@ -18,7 +18,7 @@
 #include <asm/smccc.h>
 
 #define EEMI_FID(fid) ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
-                                         ARM_SMCCC_CONV_64,   \
+                                         ARM_SMCCC_CONV_32,   \
                                          ARM_SMCCC_OWNER_SIP, \
                                          fid)
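
The CONV_64 to CONV_32 change above can be checked against the function ID
from the log. The sketch below is standalone C, not Xen code; the struct
and function names are made up for illustration, and only the bit layout
comes from the SMC Calling Convention (bit 31 fast call, bit 30 64-bit
convention, bits [29:24] owner with 2 meaning SiP, bits [15:0] function
number).

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an SMCCC function ID. Names are illustrative only. */
struct smccc_fid {
    bool fast;       /* bit 31: fast call (1) or yielding call (0) */
    bool conv64;     /* bit 30: SMC64/HVC64 (1) or SMC32/HVC32 (0) */
    unsigned owner;  /* bits [29:24]: owning entity, 2 = SiP service */
    unsigned func;   /* bits [15:0]: function number */
};

struct smccc_fid smccc_decode(uint32_t fid)
{
    struct smccc_fid f;

    f.fast   = (fid >> 31) & 0x1;
    f.conv64 = (fid >> 30) & 0x1;
    f.owner  = (fid >> 24) & 0x3f;
    f.func   = fid & 0xffff;
    return f;
}

/* Build a function ID from its fields, mirroring the layout above. */
uint32_t smccc_encode(bool fast, bool conv64, unsigned owner, unsigned func)
{
    return ((uint32_t)fast << 31) | ((uint32_t)conv64 << 30) |
           ((uint32_t)(owner & 0x3f) << 24) | (uint32_t)(func & 0xffff);
}
```

Decoding 0x82000010 gives a fast call, 32-bit convention, owner 2 (SiP),
function 0x10. Encoding the same function number with the 64-bit
convention gives 0xC2000010, so an ID built with the 64-bit convention
would not compare equal to the value the kernel actually sends.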
Stefano Stabellini wrote:
> In theory, there is no reason why forward_smc=true would not work but
> xilinx-zynqmp-eemi.c would work. But anyway, I am appending the changes
> to make sure xilinx-zynqmp-eemi.c forwards everything to the firmware.
>
> It looks like the kernel is throwing an exception because it is blocked
> for too long. Maybe the SMC call didn't actually succeed after all.
>
> diff --git a/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c b/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c
> index 2053ed7ac5..20aa6afb47 100644
> --- a/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c
> +++ b/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c
> @@ -57,6 +57,8 @@ bool zynqmp_eemi(struct cpu_user_regs *regs)
> unsigned int pm_fn = fid & 0xFFFF;
> enum pm_ret_status ret;
>
> + goto forward_to_fw;
> +
> switch ( fid )
> {
> /* Mandatory SMC32 functions. */
I tried this change in Xen, but the kernel still reports an error, log as below:
[ 8.985102] rockchip-pm-domain fd8d8000.power-management:power-controller: Looking up pcie-supply from device tree
[ 8.985185] rockchip-pm-domain fd8d8000.power-management:power-controller: Looking up pcie-supply property in node /power-management@fd8d8000/power-controller failed
[ 65.911230] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 65.911274] rcu: 0-...0: (1 ticks this GP) idle=1ba/1/0x4000000000000000 softirq=29/29 fqs=6001
[ 65.911311] (detected by 2, t=18003 jiffies, g=-1147, q=82)
[ 65.911337] Task dump for CPU 0:
[ 65.911360] task:swapper/0 state:R running task stack: 0 pid: 1 ppid: 0 flags:0x0000002a
[ 65.911406] Call trace:
[ 65.911429] __switch_to+0x130/0x1a0
[ 65.911453] die+0x58/0x20c
[ 65.911473] arm64_notify_die+0x88/0x8c
[ 65.911497] do_mem_abort+0xa0/0xb8
[ 65.911520] el1_abort+0x3c/0x5c
[ 65.911542] el1_sync_handler+0x94/0xd0
[ 65.911563] el1_sync+0x88/0x140
[ 65.911587] copy_from_kernel_nofault+0xc8/0x128
[ 65.911612] aarch64_insn_read+0x34/0x6c
[ 65.911635] show_data.constprop.0+0xac/0x128
[ 65.911660] show_regs+0xd8/0x118
[ 65.911681] die+0xcc/0x20c
[ 65.911700] arm64_notify_die+0x88/0x8c
[ 65.911722] do_mem_abort+0xa0/0xb8
[ 65.911743] el1_abort+0x3c/0x5c
[ 65.911764] el1_sync_handler+0x94/0xd0
[ 65.911785] el1_sync+0x88/0x140
[ 65.911808] rockchip_gem_get_ddr_info+0x28/0x40
[ 65.911833] rockchip_drm_init+0x110/0x114
[ 65.911855] do_one_initcall+0xa0/0x1e8
[ 65.911878] kernel_init_freeable+0x2a4/0x2ac
[ 65.911902] kernel_init+0x20/0x11c
[ 65.911924] ret_from_fork+0x10/0x30
I continued with the second way and found that, during initialization,
the SMC/HVC calls arrive through different trap cases: funcid 0x82000010 is handled in the HSR_EC_HVC64 case,
and funcid 0x82000009 is handled in the HSR_EC_SMC64 case.
Log as below:
(XEN) d0v2 do_trap_hvc_smccc trap....
(XEN) d0v2 Unhandled SMC/HVC: 0x82000010
(XEN) d0v3 do_trap_smc trap....
(XEN) d0v3 Unhandled SMC/HVC: 0x82000009
So I changed the code in vsmc.c as below
(the "goto forward_to_fw" change remains):
diff --git a/xen/arch/arm/vsmc.c b/xen/arch/arm/vsmc.c
index b633ff2fe8..09636aab17 100644
--- a/xen/arch/arm/vsmc.c
+++ b/xen/arch/arm/vsmc.c
@@ -27,6 +27,7 @@
#include <asm/traps.h>
#include <asm/vpsci.h>
#include <asm/platform.h>
+#include <asm/platforms/xilinx-zynqmp-eemi.h>
/* Number of functions currently supported by Hypervisor Service. */
#define XEN_SMCCC_FUNCTION_COUNT 3
@@ -280,7 +281,12 @@ static bool vsmccc_handle_call(struct cpu_user_regs *regs)
handled = handle_sssc(regs);
break;
case ARM_SMCCC_OWNER_SIP:
- handled = platform_smc(regs);
+ if(hsr.ec == HSR_EC_HVC64)
+ {
+ handled = zynqmp_eemi(regs);
+ }
+ else
+ handled = platform_smc(regs);
break;
case ARM_SMCCC_OWNER_TRUSTED_APP ... ARM_SMCCC_OWNER_TRUSTED_APP_END:
case ARM_SMCCC_OWNER_TRUSTED_OS ... ARM_SMCCC_OWNER_TRUSTED_OS_END:
After this change, the mali and sdmmc drivers come up successfully.
But I'm confused: is this a Xen problem, or does the kernel issue SMC/HVC calls in a quirky way?
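
The change above forwards every SiP call that arrives via HVC. A narrower
variant, along the lines of the allow list suggested earlier in the thread,
could look roughly like the sketch below. This is standalone C and not Xen
code: the names are hypothetical, the list holds only the two function IDs
seen in the logs, and fake_firmware() is a stand-in for the real firmware
call (in Xen that step would be whatever the forward_to_fw path in
zynqmp_eemi() already does).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Function IDs reported as "Unhandled SMC/HVC" in this thread. */
static const uint32_t sip_allow_list[] = { 0x82000009u, 0x82000010u };

/* Hypothetical callback type standing in for the real firmware call. */
typedef uint64_t (*fw_call_fn)(uint32_t fid, const uint64_t args[7]);

bool sip_call_allowed(uint32_t fid)
{
    size_t n = sizeof(sip_allow_list) / sizeof(sip_allow_list[0]);

    for (size_t i = 0; i < n; i++)
        if (sip_allow_list[i] == fid)
            return true;
    return false;
}

/* Forward the call only if it is on the allow list.
 * Returns false when the call is rejected, so the caller can treat it
 * as unhandled. */
bool handle_sip_call(uint32_t fid, const uint64_t args[7],
                     fw_call_fn fw, uint64_t *ret)
{
    if (!sip_call_allowed(fid))
        return false;
    *ret = fw(fid, args);
    return true;
}

/* Test stand-in for firmware: echoes the function number. */
uint64_t fake_firmware(uint32_t fid, const uint64_t args[7])
{
    (void)args;
    return fid & 0xffff;
}
```

With this shape, 0x82000010 and 0x82000009 are forwarded regardless of
whether they trapped as HVC or SMC, and any other SiP function ID is still
reported as unhandled.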
Best regards
Cailigang
Re: [Bug] Bring up Dom0 on Arm board [ In reply to ]
On Sat, 28 Jan 2023, ??? wrote:
> Hi, 
> I was on vacation, sorry for the late reply.
>
> >> I tried both solutions but it didn't work.
> >> First way:
> >> I add the code as the patch, add forward_smc=true to the Xen command line.
> >> Then boot, but still print the log 'Unhandled SMC/HVC ...',
>
> Julien Grall wrote:
> > Can you confirm whether you ask the mali driver to use SMC call?
>
> I add log to print hsr.ec value in do_trap_guest_sync(), 
> and shows call are HSR_EC_HVC64 and HSR_EC_SMC64.
> But HCR_TSC seems only handle HSR_EC_SMC64 call.
>
> >> meanwhile xen
> >> throw an exception, log as below:
> >> (XEN) traps.c:1987:d0v3 HSR=0x92000007 pc=0xffffffc0106a3be8 gva=0xffffffc0127ad0a0 gpa=0x000000001000a0
> >> [ 5.966596] Unhandled fault at 0xffffffc0127ad0a0
> >> [ 5.966619] Mem abort info:
> >> [ 5.966633] ESR = 0x96000000
> >> [ 5.966649] EC = 0x25: DABT (current EL), IL = 32 bits
> >> [ 5.966666] SET = 0, FnV = 0
> >> [ 5.966680] EA = 0, S1PTW = 0
> >> [ 5.966694] Data abort info:
> >> [ 5.966708] ISV = 0, ISS = 0x00000000
> >> [ 5.966722] CM = 0, WnR = 0
> >> [ 5.966738] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000194b6000
> >> It seems forward_smc=true did not work, but cause an exception.
>
> Julien Grall wrote:
> > This is indicating that the dom0 is trying to access a region that is 
> > not mapped.
>
> > Can you check what the address 0x1000a0 is used for the host layout?
>
> Stefano Stabellini wrote:
> > This is not "Unhandled SMC/HVC". The guest is trying to access address
> > 0x1000a0 which doesn't seem to be valid in the guest?
> > 
> > Do you know where 0x1000a0 is coming from? Could it be that one of the
> > SMC calls returns 0x1000a0 to the guest and the guest tries to access
> > it, but actually 0x1000a0 is not present in any valid device tree ranges
> > so it is not accessible from the guest?
>
> > If 0x1000a0 is a "special" address returned by the firmware, it needs to
> > belong to a range described in one of the device tree nodes for it to be
> > accessible by dom0.
>
> I searched 0x1000a0 in kernel code and in dtb files, but found nothing.
> So far I haven't figured out where 0x1000a0 is coming from.
>
> >> Second way:
> >> I used zynqmp_eemi() in xilinx-zynqmp-eemi.c to handle smc call.
> >> The smc call seems succeed according to xen log.
> >> But after run smc call, kernel throw an exception, log as below:
> >> [ 8.771446] rockchip-pm-domain fd8d8000.power-management:power-controller:
> >> Looking up pcie-supply from device tree
> >> [ 8.771485] rockchip-pm-domain fd8d8000.power-management:power-controller:
> >> Looking up pcie-supply property in node /power-management@fd8d8000/power-controller failed
> >> [ 66.037851] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> >> [ 66.037874] rcu: 3-...0: (0 ticks this GP) idle=196/1/0x4000000000000000 softirq=30/30 fqs=6000
> >> [ 66.037892] (detected by 5, t=18002 jiffies, g=-1147, q=81)
> >> [ 66.037905] Task dump for CPU 3:
> >> [ 66.037916] task:swapper/0 state:R running task stack: 0 pid: 1 ppid: 0 flags:0x0000002a
> >> [ 66.037939] Call trace:
> >> [ 66.037952] __switch_to+0x130/0x1a0
> >> [ 66.037968] 0xffffffc0112bc173
> >> It seems I need to modify zynqmp_eemi() code to adapt the kerenl,
> >> But based on what to modify zynqmp_eemi()?
>
> Julien Grall wrote:
> > I am not sure I understand what are the changes you made in Xen. Can you 
> > post the diff?
>
> The changes made in xen as below:
> diff --git a/xen/arch/arm/vsmc.c b/xen/arch/arm/vsmc.c
> index b633ff2fe8..83bb063016 100644
> --- a/xen/arch/arm/vsmc.c
> +++ b/xen/arch/arm/vsmc.c
> @@ -27,6 +27,7 @@
>  #include <asm/traps.h>
>  #include <asm/vpsci.h>
>  #include <asm/platform.h>
> +#include <asm/platforms/xilinx-zynqmp-eemi.h>
>  
>  /* Number of functions currently supported by Hypervisor Service. */
>  #define XEN_SMCCC_FUNCTION_COUNT 3
> @@ -280,7 +281,8 @@ static bool vsmccc_handle_call(struct cpu_user_regs *regs)
>              handled = handle_sssc(regs);
>              break;
>          case ARM_SMCCC_OWNER_SIP:
> -            handled = platform_smc(regs);
> +            // handled = platform_smc(regs);
> +            handled = zynqmp_eemi(regs);
>              break;
>          case ARM_SMCCC_OWNER_TRUSTED_APP ... ARM_SMCCC_OWNER_TRUSTED_APP_END:
>          case ARM_SMCCC_OWNER_TRUSTED_OS ... ARM_SMCCC_OWNER_TRUSTED_OS_END:
>
> diff --git a/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h b/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h
> index cf25a9014d..55e20e99ca 100644
> --- a/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h
> +++ b/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h
> @@ -18,7 +18,7 @@
>  #include <asm/smccc.h>
>  
>  #define EEMI_FID(fid) ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
> -                                         ARM_SMCCC_CONV_64,   \
> +                                         ARM_SMCCC_CONV_32,   \
>                                           ARM_SMCCC_OWNER_SIP, \
>                                           fid)
>
>
> Stefano Stabellini wrote:
> > In theory, there is no reason why forward_smc=true would not work but
> > xilinx-zynqmp-eemi.c would work. But anyway, I am appending the changes
> > to make sure xilinx-zynqmp-eemi.c forwards everything to the firmware.
> > 
> > It looks like the kernel is throwing an exception because it is blocked
> > for too long. Maybe the SMC call didn't actually succeed after all.
> > 
> > diff --git a/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c b/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c
> > index 2053ed7ac5..20aa6afb47 100644
> > --- a/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c
> > +++ b/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c
> > @@ -57,6 +57,8 @@ bool zynqmp_eemi(struct cpu_user_regs *regs)
> >      unsigned int pm_fn = fid & 0xFFFF;
> >      enum pm_ret_status ret;
> >  
> > +    goto forward_to_fw;
> > +
> >      switch ( fid )
> >      {
> >      /* Mandatory SMC32 functions. */
>
> I tried this change in Xen, but the kernel still reports an error. The log is below:
>
> [    8.985102] rockchip-pm-domain fd8d8000.power-management:power-controller: Looking up pcie-supply from device tree
> [    8.985185] rockchip-pm-domain fd8d8000.power-management:power-controller: Looking up pcie-supply property in node /power-management@fd8d8000/power-controller failed
> [   65.911230] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [   65.911274] rcu:     0-...0: (1 ticks this GP) idle=1ba/1/0x4000000000000000 softirq=29/29 fqs=6001 
> [   65.911311]  (detected by 2, t=18003 jiffies, g=-1147, q=82)
> [   65.911337] Task dump for CPU 0:
> [   65.911360] task:swapper/0       state:R  running task     stack:    0 pid:    1 ppid:     0 flags:0x0000002a
> [   65.911406] Call trace:
> [   65.911429]  __switch_to+0x130/0x1a0
> [   65.911453]  die+0x58/0x20c
> [   65.911473]  arm64_notify_die+0x88/0x8c
> [   65.911497]  do_mem_abort+0xa0/0xb8
> [   65.911520]  el1_abort+0x3c/0x5c
> [   65.911542]  el1_sync_handler+0x94/0xd0
> [   65.911563]  el1_sync+0x88/0x140
> [   65.911587]  copy_from_kernel_nofault+0xc8/0x128
> [   65.911612]  aarch64_insn_read+0x34/0x6c
> [   65.911635]  show_data.constprop.0+0xac/0x128
> [   65.911660]  show_regs+0xd8/0x118
> [   65.911681]  die+0xcc/0x20c
> [   65.911700]  arm64_notify_die+0x88/0x8c
> [   65.911722]  do_mem_abort+0xa0/0xb8
> [   65.911743]  el1_abort+0x3c/0x5c
> [   65.911764]  el1_sync_handler+0x94/0xd0
> [   65.911785]  el1_sync+0x88/0x140
> [   65.911808]  rockchip_gem_get_ddr_info+0x28/0x40
> [   65.911833]  rockchip_drm_init+0x110/0x114
> [   65.911855]  do_one_initcall+0xa0/0x1e8
> [   65.911878]  kernel_init_freeable+0x2a4/0x2ac
> [   65.911902]  kernel_init+0x20/0x11c
> [   65.911924]  ret_from_fork+0x10/0x30
>
> I then tried the second approach and found that, during initialization,
> the SMC/HVC calls arrive through different trap paths: funcid 0x82000010 is handled in the HSR_EC_HVC64 case,
> and funcid 0x82000009 is handled in the HSR_EC_SMC64 case.
>
> The log is below:
> (XEN) d0v2 do_trap_hvc_smccc trap....
> (XEN) d0v2 Unhandled SMC/HVC: 0x82000010
>
> (XEN) d0v3 do_trap_smc trap....
> (XEN) d0v3 Unhandled SMC/HVC: 0x82000009
>
> So I changed the code in vsmc.c as below
> (the "goto forward_to_fw" change remains in place):
>
> diff --git a/xen/arch/arm/vsmc.c b/xen/arch/arm/vsmc.c
> index b633ff2fe8..09636aab17 100644
> --- a/xen/arch/arm/vsmc.c
> +++ b/xen/arch/arm/vsmc.c
> @@ -27,6 +27,7 @@
>  #include <asm/traps.h>
>  #include <asm/vpsci.h>
>  #include <asm/platform.h>
> +#include <asm/platforms/xilinx-zynqmp-eemi.h>
>  
>  /* Number of functions currently supported by Hypervisor Service. */
>  #define XEN_SMCCC_FUNCTION_COUNT 3
> @@ -280,7 +281,12 @@ static bool vsmccc_handle_call(struct cpu_user_regs *regs)
>              handled = handle_sssc(regs);
>              break;
>          case ARM_SMCCC_OWNER_SIP:
> -            handled = platform_smc(regs);
> +            if(hsr.ec == HSR_EC_HVC64)
> +            {
> +                handled = zynqmp_eemi(regs);
> +            }
> +            else
> +                handled = platform_smc(regs);
>              break;
>          case ARM_SMCCC_OWNER_TRUSTED_APP ... ARM_SMCCC_OWNER_TRUSTED_APP_END:
>          case ARM_SMCCC_OWNER_TRUSTED_OS ... ARM_SMCCC_OWNER_TRUSTED_OS_END:
>
> After this change, the mali and sdmmc drivers come up successfully.
> But I'm confused: is this a Xen problem, or does the kernel use SMC/HVC calls in a quirky way?

I think we have two problems here.

One problem, which is a known problem, is that sometimes the kernel can
make firmware calls (as SMC calls) to initialize a device driver. This
is what the "forward_smc" suggestion was meant to solve.

"forward_smc" is an effective workaround, but the proper solution would
be to write a platform driver like zynqmp_eemi that checks which SMC
calls need to be forwarded and forwards only those.
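
To make the platform-driver idea more concrete, here is a minimal standalone sketch of the filtering step (plain C, not Xen code). It assumes the standard SMCCC function-ID layout: bit 31 marks a fast call, bit 30 selects the SMC64 convention, bits 29:24 hold the owner and bits 15:0 the function number. The macro names, the allowlist and sip_call_allowed() are hypothetical. The two IDs are simply the ones reported as unhandled in the log quoted earlier; which RK3588 calls really need forwarding still has to be determined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SMCCC function-ID fields (hypothetical macro names, standard layout). */
#define SMCCC_FAST_CALL(fid)  (((fid) >> 31) & 0x1u)   /* 1 = fast call      */
#define SMCCC_IS_64(fid)      (((fid) >> 30) & 0x1u)   /* 1 = SMC64, 0 = 32  */
#define SMCCC_OWNER(fid)      (((fid) >> 24) & 0x3fu)  /* owning entity      */
#define SMCCC_FUNC_NUM(fid)   ((fid) & 0xffffu)        /* function number    */

#define SMCCC_OWNER_SIP 2u  /* Silicon Partner (SiP) service calls */

/* Illustrative allowlist: the two IDs seen as "Unhandled SMC/HVC" in the log. */
static const uint32_t sip_allowlist[] = { 0x82000009u, 0x82000010u };

/* Return true only for SiP calls that are explicitly allowlisted. */
bool sip_call_allowed(uint32_t fid)
{
    size_t i;

    if (SMCCC_OWNER(fid) != SMCCC_OWNER_SIP)
        return false;

    for (i = 0; i < sizeof(sip_allowlist) / sizeof(sip_allowlist[0]); i++)
        if (sip_allowlist[i] == fid)
            return true;

    return false;
}
```

Both IDs decode as fast SMC32 calls owned by SiP (owner 2, function numbers 0x09 and 0x10). A real platform driver would call a check like this from its smc hook and forward to firmware only when it returns true.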

The second problem is that the kernel is making firmware calls using HVC
as transport instead of SMC. This is uncommon. That is the reason why
"forward_smc" didn't work: "forward_smc" only forwards SMC calls, not HVC
calls.
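
As a rough illustration of the transport distinction, the sketch below classifies a trap by its exception class. It is standalone C, not Xen code, and assumes the architectural EC encodings (0x12 HVC from AArch32, 0x16 HVC from AArch64, 0x13 SMC from AArch32, 0x17 SMC from AArch64), which correspond to the HSR_EC_HVC64 and HSR_EC_SMC64 cases mentioned in the thread. The enum and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical classification of the conduit a guest used for a call. */
enum call_transport { TRANSPORT_OTHER, TRANSPORT_HVC, TRANSPORT_SMC };

/* The exception class lives in bits [31:26] of the syndrome register. */
unsigned int syndrome_ec(uint32_t hsr)
{
    return (hsr >> 26) & 0x3fu;
}

/* Map an exception class to the transport it represents. */
enum call_transport classify_ec(unsigned int ec)
{
    switch (ec) {
    case 0x12: /* HVC from AArch32 */
    case 0x16: /* HVC from AArch64 */
        return TRANSPORT_HVC;
    case 0x13: /* SMC from AArch32 */
    case 0x17: /* SMC from AArch64 */
        return TRANSPORT_SMC;
    default:
        return TRANSPORT_OTHER;
    }
}
```

An option that keys only on the SMC result would never see the HVC-transported calls, which is the behaviour described above.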

But if you had a platform driver like zynqmp_eemi, in theory the
platform_smc() call in vsmccc_handle_call should have worked correctly
for both SMC and HVC coming from Linux.

Just to see if we understood the problem correctly, the appended patch
alone (no need for other changes) should work, if you pass
forward_firmware=true to the Xen command line.


diff --git a/xen/arch/arm/vsmc.c b/xen/arch/arm/vsmc.c
index 7335276f3f..1d11634bff 100644
--- a/xen/arch/arm/vsmc.c
+++ b/xen/arch/arm/vsmc.c
@@ -8,6 +8,7 @@
 
 
 #include <xen/lib.h>
+#include <xen/param.h>
 #include <xen/types.h>
 #include <public/arch-arm/smccc.h>
 #include <asm/cpuerrata.h>
@@ -26,6 +27,9 @@
 /* Number of functions currently supported by Standard Service Service Calls. */
 #define SSSC_SMCCC_FUNCTION_COUNT (3 + VPSCI_NR_FUNCS)
 
+static bool __read_mostly forward_fw = false;
+boolean_param("forward_firmware", forward_fw);
+
 static bool fill_uid(struct cpu_user_regs *regs, xen_uuid_t uuid)
 {
     int n;
@@ -224,6 +228,27 @@ static bool vsmccc_handle_call(struct cpu_user_regs *regs)
     const union hsr hsr = { .bits = regs->hsr };
     uint32_t funcid = get_user_reg(regs, 0);
 
+    if ( forward_fw )
+    {
+        struct arm_smccc_res res;
+
+        arm_smccc_1_1_smc(get_user_reg(regs, 0),
+                          get_user_reg(regs, 1),
+                          get_user_reg(regs, 2),
+                          get_user_reg(regs, 3),
+                          get_user_reg(regs, 4),
+                          get_user_reg(regs, 5),
+                          get_user_reg(regs, 6),
+                          get_user_reg(regs, 7),
+                          &res);
+
+        set_user_reg(regs, 0, res.a0);
+        set_user_reg(regs, 1, res.a1);
+        set_user_reg(regs, 2, res.a2);
+        set_user_reg(regs, 3, res.a3);
+        return true;
+    }
+
     /*
      * Check immediate value for HVC32, HVC64 and SMC64.
      * It is not so easy to check immediate value for SMC32,
Re: [Bug] Bring up Dom0 on Arm board [ In reply to ]
Hi,
I tried the change above, but the kernel still faults; the log is below.
The change forwards all HVC and SMC calls to firmware,
but what I had tried successfully was forwarding all HVC calls and not handling the SMC calls.
I have no idea how to find out where 0x1000a0 comes from. Do you have any suggestions?
(XEN) d0v0 hsr.ec 0x17
(XEN) d0v0 do_trap_smc trap....
(XEN) traps.c:1987:d0v0 HSR=0x92000007 pc=0xffffffc01069b830 gva=0xffffffc0126e30a0 gpa=0x000000001000a0
[ 6.027388] Unhandled fault at 0xffffffc0126e30a0
[ 6.027415] Mem abort info:
[ 6.027490] ESR = 0x96000000
[ 6.027514] EC = 0x25: DABT (current EL), IL = 32 bits
[ 6.027537] SET = 0, FnV = 0
[ 6.027554] EA = 0, S1PTW = 0
[ 6.027571] Data abort info:
[ 6.027589] ISV = 0, ISS = 0x00000000
[ 6.027607] CM = 0, WnR = 0
[ 6.027627] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000019484000
[ 6.027652] [ffffffc0126e30a0] pgd=000000013ffff003, p4d=000000013ffff003, pud=000000013ffff003, pmd=0000000101d86003, pte=0068000000100717
[ 6.027714] Internal error: ttbr address size fault: 96000000 [#1] SMP
[ 6.027740] Modules linked in:
[ 6.027766] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.66 #8
[ 6.027790] Hardware name: Rockchip RK3588 EVB7 LP4 V10 Board (DT)
[ 6.027819] pstate: 60c00005 (nZCv daif +PAN +UAO -TCO BTYPE=--)
[ 6.027850] pc : rockchip_gem_get_ddr_info+0x28/0x40
[ 6.027884] lr : rockchip_gem_get_ddr_info+0x1c/0x40
[ 6.027905] sp : ffffffc011e7bd60
[ 6.027923] x29: ffffffc011e7bd60 x28: 0000000000000000
[ 6.027952] x27: ffffffc011490438 x26: ffffffc0114f1070
[ 6.027981] x25: 0000000000000006 x24: ffffffc0115bf598
[ 6.028010] x23: 0000000000000000 x22: ffffffc011d56a40
[ 6.028039] x21: ffffffc011e34000 x20: 0000000000000000
[ 6.028067] x19: ffffffc011e34c30 x18: 0000000000000000
[ 6.028095] x17: 000000000000000e x16: 0000000000000007
[ 6.028124] x15: 000000000000000a x14: 0000000000000662
[ 6.028152] x13: ffffffffffffffff x12: ffffffffffffffff
[ 6.028181] x11: 0000000000000000 x10: ffffff8102730a1c
[ 6.028209] x9 : ffffffc010b26fe0 x8 : ffffffc011e7bd38
[ 6.028237] x7 : 0000000000000000 x6 : 0000000000000000
[ 6.028266] x5 : 0000000000000000 x4 : 0000000000000000
[ 6.028297] x3 : 0000000000000000 x2 : ffffffc011c99980
[ 6.028326] x1 : ffffffc011c99000 x0 : ffffffc0126e3000
[ 6.028355] Call trace:
[ 6.028377] rockchip_gem_get_ddr_info+0x28/0x40
[ 6.028404] rockchip_drm_init+0x110/0x114
[ 6.028427] do_one_initcall+0xa0/0x1e8
[ 6.028450] kernel_init_freeable+0x2a4/0x2ac
[ 6.028475] kernel_init+0x20/0x11c
[ 6.028496] ret_from_fork+0x10/0x30
[ 6.028516]
[ 6.028516] PC: 0xffffffc01069b730:
[ 6.028536] b730 aa0003f3 f9400400 f9402c14 f9412660 97eb302d f9412a60 97f7be94 f9412a60
[ 6.028598] b750 97ec26a5 f9401280 d2800003 f9407662 f940c661 97f81021 a94153f3 a8c27bfd
[ 6.028657] b770 d50323bf d65f03c0 aa1e03e9 d503201f d503233f a9bd7bfd d2804c02 910003fd
[ 6.028716] b790 f90013f5 aa0003f5 d0006b20 a90153f3 2a0103f4 f945f800 5281b801 97ec22a5
[ 6.028775] b7b0 b4000280 aa0003f3 51000682 32002c42 aa0003e1 11000442 aa1503e0 97ff3263
[ 6.028835] b7d0 f9400a60 52819a41 72a00201 f9401000 f9401800 b9003001 aa1303e0 a94153f3
[ 6.028894] b7f0 f94013f5 a8c37bfd d50323bf d65f03c0 92800173 17fffff9 d503245f aa1e03e9
[ 6.028952] b810 d503201f d503233f a9bf7bfd 910003fd 941230e8 b40000c0 d000afe1 91260022
[ 6.029011] b830 29540003 b9098020 b9000443 a8c17bfd d50323bf d65f03c0 d503245f aa1e03e9
[ 6.029070] b850 d503201f d503233f a9be7bfd aa0103e2 910003fd a90153f3 aa0103f4 aa0003f3
[ 6.029128] b870 f9407401 97ff34bd 35000080 aa1403e1 aa1303e0 97fffda7 a94153f3 a8c27bfd
[ 6.029187] b890 d50323bf d65f03c0 d503245f aa1e03e9 d503201f d503233f a9be7bfd 910003fd
[ 6.029255] b8b0 f9000bf3 aa0103f3 97ff34ec 350000a0 f9405660 f9004e7f aa1303e1 97fffd95
[ 6.029314] b8d0 f9400bf3 a8c27bfd d50323bf d65f03c0 d503245f aa1e03e9 d503201f d503233f
[ 6.029372] b8f0 a9bc7bfd 910003fd a90153f3 2a0303f4 a9025bf5 12001c55 a90363f7 97ffff9b
Re: [Bug] Bring up Dom0 on Arm board [ In reply to ]
On Tue, 31 Jan 2023, ??? wrote:
> Hi,
>
> I tried the change above, but the kernel still faults; the log is below.
> The change forwards all HVC and SMC calls to firmware,
> but what I had tried successfully was forwarding all HVC calls and not handling the SMC calls.

I think you should try to find out which HVCs/SMCs need to be
forwarded and which ones need to be handled as usual.

Once you know that, you could write a platform driver like
xen/arch/arm/platforms/xilinx-zynqmp.c that handles things
appropriately.


> I have no idea how to find out where 0x1000a0 comes from. Do you have any suggestions?

I would look at device tree to see within which range 0x1000a0 falls.


Re: [Bug] Bring up Dom0 on Arm board [ In reply to ]
Hi,
>> I tried the change above, but the kernel still faults; the log is below.
>> The change forwards all HVC and SMC calls to firmware,
>> but what I had tried successfully was forwarding all HVC calls and not handling the SMC calls.
>
> I think you should try to find out which HVCs/SMCs need to be
> forwarded and which ones need to be handled as usual.
>
> Once you know that, you could write a platform driver like
> xen/arch/arm/platforms/xilinx-zynqmp.c that handles things
> appropriately.
>
>
> > I have no idea how to find out where 0x1000a0 comes from. Do you have any suggestions?
>
> I would look at device tree to see within which range 0x1000a0 falls.
What I'm confused about is: what is the difference between HVC/SMC calls
that Xen forwards to firmware and HVC/SMC calls that the kernel makes directly to firmware?
If they are the same, there should be no fault.
I searched the dtb and found a device that may be related to 0x1000a0. Its content is below:
sram@10f000 {
    compatible = "mmio-sram";
    reg = <0x00 0x10f000 0x00 0x100>;
    #address-cells = <0x01>;
    #size-cells = <0x01>;
    ranges = <0x00 0x00 0x10f000 0x100>;

    sram@0 {
        compatible = "arm,scmi-shmem";
        reg = <0x00 0x100>;
        phandle = <0x38>;
    };
};
Best regards
Cailigang
Re: [Bug] Bring up Dom0 on Arm board [ In reply to ]
On Wed, 1 Feb 2023, ??? wrote:
> >> I tried the change above, but the kernel still faults; the log is below.
> >> The change forwards all HVC and SMC calls to firmware,
> >> but what I had tried successfully was forwarding all HVC calls and not handling the SMC calls.
> > 
> > I think you should try to find out which HVCs/SMCs need to be
> > forwarded and which ones need to be handled as usual.
> > 
> > Once you know that, you could write a platform driver like
> > xen/arch/arm/platforms/xilinx-zynqmp.c that handles things
> > appropriately.
> > 
> > 
> > > I have no idea how to find out where 0x1000a0 comes from. Do you have any suggestions?
> > 
> > I would look at device tree to see within which range 0x1000a0 falls.
>
> What I'm confused about is: what is the difference between HVC/SMC calls
> that Xen forwards to firmware and HVC/SMC calls that the kernel makes directly to firmware?
> If they are the same, there should be no fault.

In theory, with the last patch I sent you, there should be no difference
between the HVC/SMC calls made directly to firmware and the ones forwarded to
firmware by Xen. One thing you could try is to print out all the SMC/HVC
calls that are made on native and compare them against the ones made on
Xen to see if they match.

One difference, however, is memory mapping. If something like address
0x1000a0 is not mapped in the guest, then a failure is expected.

> I searched the dtb and found a device that may be related to 0x1000a0. Its content is below:
>
> sram@10f000 {
>     compatible = "mmio-sram";
>     reg = <0x00 0x10f000 0x00 0x100>;
>     #address-cells = <0x01>;
>     #size-cells = <0x01>;
>     ranges = <0x00 0x00 0x10f000 0x100>;
>
>     sram@0 {
>         compatible = "arm,scmi-shmem";
>         reg = <0x00 0x100>;
>         phandle = <0x38>;
>     };
> };

What are the #address-cells and #size-cells of the parent node?

If they are #address-cells = 2 and #size-cells = 2, then this region is
0x10f000-0x10f0ff (base 0x10f000, size 0x100), which does not cover 0x1000a0.
We need the region that contains 0x1000a0, which lies below it.

It is possible that such a region exists but is not described in device tree. In
that case it is normal to get an error when trying to access it.
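
The range check can be done mechanically. The following is a minimal standalone sketch (hypothetical helper names, not libfdt or Xen code) that decodes a reg property into base and size and tests whether an address falls inside. It assumes the parent node uses #address-cells = 2 and #size-cells = 2, which has not been confirmed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* A decoded (base, size) pair from a device tree "reg" property. */
struct dt_range {
    uint64_t base;
    uint64_t size;
};

/* Combine 1 or 2 consecutive 32-bit cells into one value, most significant cell first. */
uint64_t dt_read_cells(const uint32_t *cells, unsigned int count)
{
    uint64_t value = 0;

    while (count--)
        value = (value << 32) | *cells++;

    return value;
}

/* Decode the first (address, size) entry of a reg property. */
struct dt_range dt_parse_reg(const uint32_t *reg,
                             unsigned int addr_cells,
                             unsigned int size_cells)
{
    struct dt_range r;

    r.base = dt_read_cells(reg, addr_cells);
    r.size = dt_read_cells(reg + addr_cells, size_cells);
    return r;
}

/* True if addr lies in [base, base + size). */
bool dt_range_contains(struct dt_range r, uint64_t addr)
{
    return addr >= r.base && addr - r.base < r.size;
}
```

Feeding it the cells { 0x00, 0x10f000, 0x00, 0x100 } from the sram node yields base 0x10f000 and size 0x100, and dt_range_contains() returns false for 0x1000a0, so under that assumption the node does not describe the faulting address.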