Mailing List Archive

Xen & I/O in clusters - problems!
Hi, we are benchmarking Xen in a cluster and got some bad results. We
might be doing something wrong, and wonder if anyone has had similar
problems.

When we benchmark throughput from native Linux to native Linux (two
physical nodes in the cluster) we get 786.034 MByte/s.
When we benchmark from a virtual domain (running on Xen on a physical
node) to another virtual domain (on another physical node) we get
56.480 MByte/s (1:16).

The difference is huge, and we wonder if the bottleneck could be the
fact that we are using software routing (We use this in order to route
from the physical node to the virtual OSs), or if this is just a
downside of Xen?

I would guess it IS the SW routing, so are there any good alternatives
for making virtual domains communicate on a cluster without SW routing?

cheers,
Rune J.A
Re: Xen & I/O in clusters - problems! [ In reply to ]
> Hi, we are benchmarking Xen in a cluster and got some bad results. We
> might be doing something wrong, and wonder if anyone has had similar
> problems.
>
> When we benchmark throughput from native Linux to native Linux (two
> physical nodes in the cluster) we get 786.034 MByte/s.
> When we benchmark from a virtual domain (running on Xen on a physical
> node) to another virtual domain (on another physical node) we get
> 56.480 MByte/s (1:16).

(Presumably you mean MBits rather than Mbytes)

The numbers you're getting are terrible compared to what we see.
Running between virtual domains on a cluster we measure
throughput as high as 897Mb/s (same as Linux native).

Our results were recorded with dual 2.4GHz Xeons with tg3 NICs
and a 128KB socket buffer, measured using ttcp. With the virtual
domain running on the other physical CPU from domain 0 we get
897Mb/s. We get similar results running the virtual domain on the
other hyperthread of the same physical CPU. We observe a
performance reduction if we run the virtual domain on the same
(logical) CPU as domain 0, down to 660Mb/s [843Mb/s on a dual
3GHz machine, so we appear to be CPU limited in this case].
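
In case it helps you reproduce the numbers, a ttcp run with a 128KB
socket buffer looks something like this (option letters vary between
ttcp variants, and -b for the socket buffer size in particular may
differ or be absent in yours, so adjust to taste):

    # receiver:
    ttcp -r -s -b 131072
    # sender, 2 GB in 8 KB writes:
    ttcp -t -s -b 131072 -l 8192 -n 262144 <receiver-host>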

> The difference is huge, and we wonder if the bottleneck could be the
> fact that we are using software routing (We use this in order to route
> from the physical node to the virtual OSs), or if this is just a
> downside of Xen?

Our results were recorded using the dom0 linux bridge code rather
than using routing.

One thing to check is that you don't have
CONFIG_IP_NF_CONNTRACK set to 'y' -- this slays performance.
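
A quick way to check is to grep the .config your xen0 kernel was
built from, e.g.:

    grep CONFIG_IP_NF_CONNTRACK .config

(run from the xen0 kernel build directory; you want it to come back
as "is not set" or '=m' rather than '=y').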

Also, if you're running multiple domains on the same CPU you may
be running into CPU scheduling issues. Some tweaks to scheduler
parameters may fix this.

> I would guess it IS the SW routing, so is there any good alternatives
> to make virtual domains communicate on a cluster without sw routing?

The Xen 2.0 architecture is not as slick as the
monolithic-hypervisor approach of Xen 1.2, but we get better
hardware support and a lot more flexibility. However, we do burn
more CPU to achieve the same IO rate. We just have to wait for
Moore's law to catch up ;-)

Ian


Re: Xen & I/O in clusters - problems! [ In reply to ]
> When we benchmark from a virtual domain (running on Xen on a physical
> node) to another virtual domain (on another physical node) we get
> 56.480 MByte/s (1:16)

Ouch. How are you benchmarking this? (what tool, what parameters, etc.).
It'll help me reproduce this on our test systems. Then we'll know if it's
your config or if there's something to track down.

We did see some weird performance for small packets at one stage and I'm not
sure if that was ever resolved. If it's the same problem, I can do a binary
chop search of changesets in order to locate it.

Cheers,
Mark


Re: Xen & I/O in clusters - problems! [ In reply to ]
Here are some details on our horrible benchmark results:


System setup:

Ethernet controller:
Intel Corp. 82547GI Gigabit Ethernet Controller

Benchmark:
PARKBENCH low-level pingpong benchmark comms1_mpi

MPI library:
Scali MPI (http://www.scali.com/)

Xen distribution:
2.0 stable, downloaded with bitkeeper

Xen configuration:
default, i.e. we just leave config-2.6.8.1-xen0 the way it is

CPU:
Single Intel(R) Pentium(R) 4 CPU 3.40GHz

Memory:
1 GB


Bandwidth measurements:

Between two nodes running plain redhat EL3 with kernel 2.4.21-15.EL:

786.034 MByte/s

Between two nodes each running only xen domain 0:

56.480 MByte/s

A graph of the measurements (x = message length, y = time):

http://www.idi.ntnu.no/~havarbj/tmp/plot.png


Cheers,
Havard


Re: Xen & I/O in clusters - problems! [ In reply to ]
Hi, we benchmark with Scali MPI. We are in the first stage, with a simple
ping program which first sends 1 byte to measure the latency; then we
tested with packet sizes of 10^5, 10^7 and 10^8. We also get the same
results when sending from domain0 to domain0 in the cluster. We are now
testing whether the routing table is the bottleneck; I will let you know
the results. Thank you :)

Cheers,
Rune J.A

On Oct 15, 2004, at 1:24 AM, Mark A. Williamson wrote:

>> When we benchmark from a virtual domain (running on Xen on a physical
>> node) to an another virtual domain (on another physical node) we get
>> 56.480 MByte/s (1:16)
>
> Ouch. How are you benchmarking this? (what tool, what parameters,
> etc.).
> It'll help me reproduce this on our test systems. Then we'll know if
> it's
> your config or if there's something to track down.
>
> We did see some weird performance for small packets at one stage and
> I'm not
> sure if that was ever resolved. If it's the same problem, I can do a
> binary
> chop search of changesets in order to locate it.
>
> Cheers,
> Mark



Re: Xen & I/O in clusters - problems! [ In reply to ]
> Between two nodes running plain redhat EL3 with kernel 2.4.21-15.EL:
>
> 786.034 MByte/s
>
> Between two nodes each running only xen domain 0:
>
> 56.480 MByte/s

That's surprising - I'd have expected any performance problems to involve
unpriv domains somehow. We've never had any performance problems when just
running domain 0, even when the code was still under development...

It'd be interesting to see your config file (I know it's just the default but
it'd be interesting for comparison as there's no obvious reason for your
problems).

Cheers,
Mark


Re: Re: Xen & I/O in clusters - problems! [ In reply to ]
On Fri, Oct 15, 2004 at 02:02:31PM +0000, Mark A. Williamson wrote:
> > Between two nodes running plain redhat EL3 with kernel 2.4.21-15.EL:
> >
> > 786.034 MByte/s
> >
> > Between two nodes each running only xen domain 0:
> >
> > 56.480 MByte/s
>
> That's surprising - I'd have expected any performance problems to involve
> unpriv domains somehow. We've never had any performance problems when just
> running domain 0, even when the code was still under development...
>

What's possibly even more funny is that when I do the same benchmark localhost <-> localhost, ie. through the loopback interface, on domain 0, the bandwidth is halved. CPU use is ~100% during both these benchmarks on domain 0 (50% per process in the last benchmark). This indicates to me that the bandwidth depends on CPU resources. Some heavy processing is happening somewhere.

I suspect this might have something to do with the MPI library scaMPI, which is supposed to be more closely linked with the lower layers of the OSI protocol stack or something. I will investigate it further.

Another thing that might be worth noting is that we're using the 2.6.8.1 kernel on domain 0 as opposed to a 2.4 kernel. I don't think that would make any difference, though (except for the fact that modules aren't loaded, but I don't think we need any of them anyway).

> It'd be interesting to see your config file (I know it's just the default but
> it'd be interesting for comparison as there's no obvious reason for your
> problems).
>

Attached

Cheers,
Havard
Re: Xen & I/O in clusters - problems! [ In reply to ]
Håvard Bjerke <Havard.Bjerke <at> idi.ntnu.no> writes:

>
> On Fri, Oct 15, 2004 at 02:02:31PM +0000, Mark A. Williamson wrote:
> > > Between two nodes running plain redhat EL3 with kernel 2.4.21-15.EL:
> > >
> > > 786.034 MByte/s
> > >
> > > Between two nodes each running only xen domain 0:
> > >
> > > 56.480 MByte/s
> >
> > That's surprising - I'd have expected any performance problems to involve
> > unpriv domains somehow. We've never had any performance problems when just
> > running domain 0, even when the code was still under development...
> >
>
> What's possibly even more funny is that when I do the same benchmark
> localhost <-> localhost, ie. through the loopback interface, on domain 0,
> the bandwidth is halved. CPU use is ~100% during both these benchmarks on
> domain 0 (50% per process in the last benchmark). This indicates to me
> that the bandwidth depends on CPU resources. Some heavy processing is
> happening somewhere.
>
> I suspect this might have something to do with the MPI library scaMPI,
> which is supposed to be more closely linked with the lower layers of the
> OSI protocol stack or something. I will investigate it further.

Around 2000 the driver used a mix of polling and interrupts to get the
latency down. If I remember correctly, it polled for about half the time an
interrupt would take.

To let the adapters manipulate memory directly, the driver also has to
allocate memory in physically contiguous blocks. Exactly how reading and
writing of these areas is done through the driver I do not know, but this
should not cause any performance hit unless Xen has issues with MMU
manipulation.

You could send Scali an email.


--
John Enok



Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
Hi, we have some additional information about our problem benchmarking
Xen in clusters:

Between two Xen dom0 domains (between two physical computers in the
cluster) we got these strange results:
(We use ttcp with socket buffer sizes from 10^4 to 10^6)

Kernel 2.4:
Xen Dom0 -> Xen Dom0: ca. 65 000 KB/s

Kernel 2.6:
Xen Dom0 -> Xen Dom0: ca. 80 000 KB/s (?)

Native Linux:
Native Linux -> Native Linux: ca. 114 000 KB/s

What is new and strange is that Xen Dom0 uses about 60% of the CPU when
transferring or receiving, while native Linux only uses 6-7%(!) It seems
like we have a problem with DMA here(?). We use Xen 2.0 and Gigabit
Ethernet.

I tried 'mv /lib/tls /lib/tls.disabled' on each node without success

Below is the dmesg for the Xen node:

Linux version 2.6.8.1-xen0 (root@comp-pvfs-0-17.local) (gcc version
3.2.3 20030502 (Red Hat Linux 3.2.3-34)) #1 Tue Oct 12 14:10:47 GMT
2004
BIOS-provided physical RAM map:
Xen: 0000000000000000 - 0000000008000000 (usable)
128MB LOWMEM available.
On node 0 totalpages: 32768
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 28672 pages, LIFO batch:7
HighMem zone: 0 pages, LIFO batch:1
DMI not present.
Built 1 zonelists
Kernel command line: root=/dev/sda1 ro console=tty0
Initializing CPU#0
PID hash table entries: 1024 (order 10: 8192 bytes)
Xen reported: 3400.171 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 125260k/131072k available (2636k kernel code, 5624k reserved,
834k data, 396k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor
mode... Ok.
Calibrating delay loop... 6789.52 BogoMIPS
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000
CPU: After vendor identify, caps: bfebfbff 00000000 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: After all inits, caps: beebcbe1 00000000 00000000 00000080
CPU: Intel(R) Pentium(R) 4 CPU 3.40GHz stepping 09
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... disabled
NET: Registered protocol family 16
PCI: Using configuration type Xen
SCSI subsystem initialized
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI: Probing PCI hardware (bus 01)
PCI: Probing PCI hardware (bus 02)
PCI: Probing PCI hardware (bus 03)
PCI: Probing PCI hardware
Initializing Cryptographic API
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
loop: loaded (max 8 devices)
Using anticipatory io scheduler
nbd: registered device at major 43
Intel(R) PRO/1000 Network Driver - version 5.2.52-k4
Copyright (c) 1999-2004 Intel Corporation.
PCI: Obtained IRQ 18 for device 0000:01:01.0
PCI: Setting latency timer of device 0000:01:01.0 to 64
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
PCI: Obtained IRQ 21 for device 0000:03:02.0
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
pcnet32.c:v1.30i 06.28.2004 tsbogend@alpha.franken.de
e100: Intel(R) PRO/100 Network Driver, 3.0.18
e100: Copyright(c) 1999-2004 Intel Corporation
Xen virtual console successfully installed as ttyS
Event-channel device installed.
Initialising Xen netif backend
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with
idebus=xx
hda: SAMSUNG CD-ROM SN-124, ATAPI CD/DVD-ROM drive
ide1: I/O resource 0x170-0x177 not free.
ide1: ports already in use, skipping probe
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: ATAPI 24X CD-ROM drive, 128kB Cache
Uniform CD-ROM driver Revision: 3.20
PCI: Obtained IRQ 24 for device 0000:02:01.0
PCI: Obtained IRQ 25 for device 0000:02:01.1
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec 3960D Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

(scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
Vendor: SEAGATE Model: ST336607LW Rev: DS09
Type: Direct-Access ANSI SCSI revision: 03
scsi0:A:0:0: Tagged Queuing enabled. Depth 32
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec 3960D Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs

Red Hat/Adaptec aacraid driver (1.1.2-lk2 Oct 12 2004)
3ware Storage Controller device driver for Linux v1.26.00.039.
3w-xxxx: No cards found.
libata version 1.02 loaded.
ata_piix version 1.02
ata_piix: combined mode detected
PCI: Obtained IRQ 17 for device 0000:00:1f.2
ata: 0x1f0 IDE port busy
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ata1: SATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xFEA8 irq 15
ata1: SATA port has no device.
scsi2 : ata_piix
SCSI device sda: 71132959 512-byte hdwr sectors (36420 MB)
SCSI device sda: drive cache: write through
sda: sda1 sda2 sda3 sda4 < sda5 >
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
mice: PS/2 mouse device common for all mice
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: automatically using best checksumming function: pIII_sse
pIII_sse : 440.400 MB/sec
raid5: using function: pIII_sse (440.400 MB/sec)
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
device-mapper: 4.1.0-ioctl (2003-12-10) initialised: dm@uk.sistina.com
NET: Registered protocol family 2
IP: routing cache hash table of 1024 buckets, 8Kbytes
TCP: Hash tables configured (established 8192 bind 16384)
NET: Registered protocol family 1
NET: Registered protocol family 17
Bridge firewalling registered
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: sda1: orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 4718
EXT3-fs: sda1: 1 orphan inode deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 396k freed

***************************************************************
***************************************************************
** WARNING: Currently emulating unsupported memory accesses **
** in /lib/tls libraries. The emulation is very **
** slow, and may not work correctly with all **
** programs (e.g., some may 'Segmentation fault'). **
** TO ENSURE FULL PERFORMANCE AND CORRECT FUNCTION, **
** YOU MUST EXECUTE THE FOLLOWING AS ROOT: **
** mv /lib/tls /lib/tls.disabled **
***************************************************************
***************************************************************

Pausing... 5Pausing... 4Pausing...
3Pausing... 2Pausing...
1Continuing...

EXT3 FS on sda1, internal journal
Adding 1020116k swap on /dev/sda3. Priority:-1 extents:1
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda5, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
process `syslogd' is using obsolete setsockopt SO_BSDCOMPAT
process `snmpd' is using obsolete setsockopt SO_BSDCOMPAT

Cheers,
Rune




On Oct 15, 2004, at 1:24 AM, Mark A. Williamson wrote:

>> When we benchmark from a virtual domain (running on Xen on a physical
>> node) to an another virtual domain (on another physical node) we get
>> 56.480 MByte/s (1:16)
>
> Ouch. How are you benchmarking this? (what tool, what parameters,
> etc.).
> It'll help me reproduce this on our test systems. Then we'll know if
> it's
> your config or if there's something to track down.
>
> We did see some weird performance for small packets at one stage and
> I'm not
> sure if that was ever resolved. If it's the same problem, I can do a
> binary
> chop search of changesets in order to locate it.
>
> Cheers,
> Mark



Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
> Between two Xen dom0 domains (between two physical computers in the
> cluster) we got these strange results:
> (We use ttcp with socket buffer sizes from 10^4 to 10^6)
>
> Kernel 2.6:
> Xen Dom0 -> Xen Dom0: ca. 80 000 KB/s (?)
>
> Native Linux:
> Native Linux -> Native Linux: ca. 114 000 KB/s
>
> What is new and strange is that Xen Dom0 uses about 60% of the CPU when
> transferring or receiving, while native Linux only uses 6-7%(!) It seems
> like we have a problem with DMA here(?). We use Xen 2.0 and Gigabit
> Ethernet.

dom0 to dom0 performance really shouldn't be any different from
native. It certainly isn't on any of our machines.

The only thing I can think of is that something stupid might be
happening with interrupts on your machines. Can you compare the
rate that the relevant interrupts are going up in
/proc/interrupts between xenLinux and native.

There's no interrupt sharing or anything daft like that going on?
Are you using NAPI on the native e1000 linux driver?


Ian


Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
We tried to compile xen0 with CONFIG_E1000_NAPI = y and got the same
results between two xen dom0 nodes. I am not sure if these interrupt
counts tell us anything:

Native Linux:


CPU0
0: 87290435 XT-PIC timer
1: 2 XT-PIC keyboard
2: 0 XT-PIC cascade
3: 7668994 XT-PIC eth0
7: 0 XT-PIC ehci-hcd
8: 1 XT-PIC rtc
10: 3 XT-PIC usb-uhci
11: 332088 XT-PIC aic7xxx, aic7xxx, usb-uhci
14: 1 XT-PIC ide0
15: 0 XT-PIC libata
NMI: 0
ERR: 0


Xen Dom0:

CPU0
1: 2 Phys-irq keyboard
14: 3 Phys-irq ide0
18: 954304 Phys-irq eth0
24: 7313 Phys-irq aic7xxx
25: 30 Phys-irq aic7xxx
128: 1 Dynamic-irq misdirect
129: 0 Dynamic-irq ctrl-if
130: 241914 Dynamic-irq timer
131: 0 Dynamic-irq timer_dbg, net-be-dbg
132: 0 Dynamic-irq console
NMI: 0
ERR: 0

If you can see anything which is not normal behavior for Xen, please
tell us :)

Cheers,
Rune


On Oct 21, 2004, at 10:24 PM, Ian Pratt wrote:

>
>> Between two Xen dom0 domains (between two physical computers in the
>> cluster) we got these strange results:
>> (We use ttcp socketbuffsize, 10^4 -> 10^6)
>>
>> Kernel 2.6:
>> Xen Dom0 -> Xen Dom0: ca. 80 000 KB/s (?)
>>
>> Native Linux:
>> Native Linux -> Native Linux: ca. 114 000 KB/s
>>
>> What is new and strage is that Xen Dom0 use about 60% of the CPU when
>> transfering or receiving, while
>> Native Linux ony use 6-7%(!) It seems like we have a problem with the
>> DMA here(?). We use Xen 2.0,
>> Gigabit ethernet.
>
> dom0 to dom0 performance really shouldn't be any difference from
> native. It certainly isn't on any of our machines.
>
> The only thing I can think of is that something stupid might be
> happening with interrupts on your machines. Can you compare the
> rate that the relevant interrupts are going up in
> /proc/interrupts between xenLinux and native.
>
> There's no interrupt sharing or anything daft like that going on?
> Are you using NAPI on the native e1000 linux driver?
>
>
> Ian



Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
>
> We tried to compile xen0 with CONFIG_E1000_NAPI = y and got the same
> results between two xen dom0 nodes. I am not sure if these interrupt
> counts tell us anything:

It's the different rate at which the eth0 interrupt counts go up
during your bandwidth tests that is interesting. e.g. poll it
once a second during the test.
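
Something as simple as the following is enough (assuming your eth0
line in /proc/interrupts looks like the ones you posted, with the
count in the second column):

    while sleep 1; do
        awk '/eth0/ { print $2 }' /proc/interrupts
    done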

It looks like your native Linux is using legacy PIC mode
rather than the ioapic. Have you tried an SMP native
kernel to see if it gets the same interrupt layout as Xen?

Ian

> Native Linux:
>
>
> CPU0
> 0: 87290435 XT-PIC timer
> 1: 2 XT-PIC keyboard
> 2: 0 XT-PIC cascade
> 3: 7668994 XT-PIC eth0
> 7: 0 XT-PIC ehci-hcd
> 8: 1 XT-PIC rtc
> 10: 3 XT-PIC usb-uhci
> 11: 332088 XT-PIC aic7xxx, aic7xxx, usb-uhci
> 14: 1 XT-PIC ide0
> 15: 0 XT-PIC libata
> NMI: 0
> ERR: 0
>
>
> Xen Dom0:
>
> CPU0
> 1: 2 Phys-irq keyboard
> 14: 3 Phys-irq ide0
> 18: 954304 Phys-irq eth0
> 24: 7313 Phys-irq aic7xxx
> 25: 30 Phys-irq aic7xxx
> 128: 1 Dynamic-irq misdirect
> 129: 0 Dynamic-irq ctrl-if
> 130: 241914 Dynamic-irq timer
> 131: 0 Dynamic-irq timer_dbg, net-be-dbg
> 132: 0 Dynamic-irq console
> NMI: 0
> ERR: 0
>
> If you can see anything which is not normal behavior of xen please
> tell us :)
>
> Cheers,
> Rune


Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
On Fri, Oct 22, 2004 at 06:47:35PM +0100, Ian Pratt wrote:
> >
> > We tried to compile xen0 with CONFIG_E1000_NAPI = y and got the samre
> > results between
> > two xen dom0 nodes. I am not sure if these interrupts tells anything:
>
> It's the different rate at which the eth0 interrupt counts go up
> during your bandwidth tests that is interesting. e.g. poll it
> once a second during the test.
>

We tried sending 1 MB and measured:
non-SMP native Linux:
~ 130k interrupts
114 kB/s
native Linux with compiled-in SMP support, single CPU:
~ 140k interrupts
114 kB/s
Xen0:
~ 180k interrupts
80 kB/s

> It looks like your native Linux is not using legacy PIC mode
> rather than using the ioapic. Have you tried an SMP native
> kernel to see if it gets the same interrupt layout as Xen?
>

Layout in native linux w/o SMP:

CPU0
0: 130710161 XT-PIC timer
1: 2 XT-PIC keyboard
2: 0 XT-PIC cascade
3: 12872557 XT-PIC eth0
7: 0 XT-PIC ehci-hcd
8: 1 XT-PIC rtc
10: 3 XT-PIC usb-uhci
11: 445263 XT-PIC aic7xxx, aic7xxx, usb-uhci
14: 1 XT-PIC ide0
15: 0 XT-PIC libata
NMI: 0
ERR: 0


Layout in native linux compiled with SMP:

CPU0
0: 121362 IO-APIC-edge timer
1: 2 IO-APIC-edge keyboard
2: 0 XT-PIC cascade
14: 5 IO-APIC-edge ide0
16: 0 IO-APIC-level usb-uhci
18: 177583 IO-APIC-level eth0
19: 0 IO-APIC-level usb-uhci
24: 7587 IO-APIC-level aic7xxx
25: 30 IO-APIC-level aic7xxx
NMI: 0
LOC: 121307
ERR: 0
MIS: 0


Layout in Xen0:

CPU0
1: 2 Phys-irq keyboard
14: 3 Phys-irq ide0
18: 267274 Phys-irq eth0
24: 15910 Phys-irq aic7xxx
25: 30 Phys-irq aic7xxx
128: 1 Dynamic-irq misdirect
129: 0 Dynamic-irq ctrl-if
130: 771917 Dynamic-irq timer
131: 0 Dynamic-irq timer_dbg, net-be-dbg
132: 0 Dynamic-irq console
NMI: 0
ERR: 0


Questions:
Do you think interrupt sharing is the problem?
Or the use of IO-APIC?
Is this an inherent problem with Xen?
Is it possible to change the interrupt scheme in Xen in order to achieve the same performance as in native linux?


Cheers,
Håvard


Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
> On Fri, Oct 22, 2004 at 06:47:35PM +0100, Ian Pratt wrote:
> > >
> > > We tried to compile xen0 with CONFIG_E1000_NAPI = y and got the samre
> > > results between
> > > two xen dom0 nodes. I am not sure if these interrupts tells anything:
> >
> > It's the different rate at which the eth0 interrupt counts go up
> > during your bandwidth tests that is interesting. e.g. poll it
> > once a second during the test.
> >
>
> We tried sending 1 MB and measured:

1MB isn't really very much with a 128KB socket buffer. Do you get
the same results with larger transfers?

> non-SMP native Linux:
> ~ 130k interrupts
> 114 kB/s
> native Linux with compiled-in SMP support, single CPU:
> ~ 140k interrupts
> 114 kB/s
> Xen0:
> ~ 180k interrupts
> 80 kB/s

It's pretty odd that Xen's taking more interrupts. Are you using
the same native kernel version as you are for Xen?

Also, what happens if you boot xen with 'nosmp' on the Xen
command line.

We've got Xen tcp performance results from a bunch of machines,
and dom0 to dom0 performance has always been almost identical to
native for 1500 byte MTU packets.


Ian



Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
Just a thought: you are using the BVT scheduler, right? We haven't tested
performance with the other schedulers recently but we know something goes
wrong for IO intensive domains on Atropos.

Mark

On Wednesday 27 Oct 2004 19:26, Ian Pratt wrote:
> > On Fri, Oct 22, 2004 at 06:47:35PM +0100, Ian Pratt wrote:
> > > > We tried to compile xen0 with CONFIG_E1000_NAPI = y and got the samre
> > > > results between
> > > > two xen dom0 nodes. I am not sure if these interrupts tells anything:
> > >
> > > It's the different rate at which the eth0 interrupt counts go up
> > > during your bandwidth tests that is interesting. e.g. poll it
> > > once a second during the test.
> >
> > We tried sending 1 MB and measured:
>
> 1MB isn't really very much with a 128KB socket buffer. Do you get
> the same results with larger transfers?
>
> > non-SMP native Linux:
> > ~ 130k interrupts
> > 114 kB/s
> > native Linux with compiled-in SMP support, single CPU:
> > ~ 140k interrupts
> > 114 kB/s
> > Xen0:
> > ~ 180k interrupts
> > 80 kB/s
>
> It's pretty odd that Xen's taking more interrupts. Are you using
> the same native kernel version as you are for Xen?
>
> Also, what happens if you boot xen with 'nosmp' on the Xen
> command line.
>
> We've got Xen tcp performance results from a bunch of machines,
> and dom0 to dom0 performance has always been almost identical to
> native for 1500 byte MTU packets.
>
>
> Ian
>


Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
On Wed, Oct 27, 2004 at 07:26:10PM +0100, Ian Pratt wrote:
> > On Fri, Oct 22, 2004 at 06:47:35PM +0100, Ian Pratt wrote:
> > > >
> > > > We tried to compile xen0 with CONFIG_E1000_NAPI = y and got the samre
> > > > results between
> > > > two xen dom0 nodes. I am not sure if these interrupts tells anything:
> > >
> > > It's the different rate at which the eth0 interrupt counts go up
> > > during your bandwidth tests that is interesting. e.g. poll it
> > > once a second during the test.
> > >
> >
> > We tried sending 1 MB and measured:
>
> 1MB isn't really very much with a 128KB socket buffer. Do you get
> the same results with larger transfers?
>

With 10 MB transfers the results are roughly the same:

native Linux with compiled-in SMP support, single CPU:
~ 1365k interrupts
114 kB/s
Xen0 with "nosmp":
~ 1676 k interrupts
76 kB/s

> > non-SMP native Linux:
> > ~ 130k interrupts
> > 114 kB/s
> > native Linux with compiled-in SMP support, single CPU:
> > ~ 140k interrupts
> > 114 kB/s
> > Xen0:
> > ~ 180k interrupts
> > 80 kB/s
>
> It's pretty odd that Xen's taking more interrupts. Are you using
> the same native kernel version as you are for Xen?
>

Yes, both are 2.4.27

> Also, what happens if you boot xen with 'nosmp' on the Xen
> command line.
>

There seems to be no change in behaviour. The interrupt layout and count remains roughly the same.

Do you have any more tips? :)


Håvard


Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
We're getting these results in domain 0, not in a VM. As I understand it, BVT or Atropos scheduling does not apply in domain 0?

Håvard

On Wed, Oct 27, 2004 at 07:40:27PM +0100, Mark A. Williamson wrote:
> Just a thought: you are using the BVT scheduler, right? We haven't tested
> performance with the other schedulers recently but we know something goes
> wrong for IO intensive domains on Atropos.
>
> Mark
>
> On Wednesday 27 Oct 2004 19:26, Ian Pratt wrote:
> > > On Fri, Oct 22, 2004 at 06:47:35PM +0100, Ian Pratt wrote:
> > > > > We tried to compile xen0 with CONFIG_E1000_NAPI = y and got the samre
> > > > > results between
> > > > > two xen dom0 nodes. I am not sure if these interrupts tells anything:
> > > >
> > > > It's the different rate at which the eth0 interrupt counts go up
> > > > during your bandwidth tests that is interesting. e.g. poll it
> > > > once a second during the test.
> > >
> > > We tried sending 1 MB and measured:
> >
> > 1MB isn't really very much with a 128KB socket buffer. Do you get
> > the same results with larger transfers?
> >
> > > non-SMP native Linux:
> > > ~ 130k interrupts
> > > 114 kB/s
> > > native Linux with compiled-in SMP support, single CPU:
> > > ~ 140k interrupts
> > > 114 kB/s
> > > Xen0:
> > > ~ 180k interrupts
> > > 80 kB/s
> >
> > It's pretty odd that Xen's taking more interrupts. Are you using
> > the same native kernel version as you are for Xen?
> >
> > Also, what happens if you boot xen with 'nosmp' on the Xen
> > command line.
> >
> > We've got Xen tcp performance results from a bunch of machines,
> > and dom0 to dom0 performance has always been almost identical to
> > native for 1500 byte MTU packets.
> >
> >
> > Ian
> >


Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
> Yes, both are 2.4.27
>
> > Also, what happens if you boot xen with 'nosmp' on the Xen
> > command line.
> >
>
> There seems to be no change in behaviour. The interrupt layout and count remains roughly the same.
>
> Do you have any more tips? :)

Please can you try Xen/linux 2.6.9. Also, please can you remind
me of the spec of your machines.

Ian
Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
Domain 0 is (from the POV of Xen) basically Just Another VM and is scheduled
pre-emptively. Some capability flags give the Dom0 VM the privileges it
needs in order to control devices, screen, Xen management functions etc. If
you use the buggy Atropos then you'll lose performance even with just one
domain.
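
If you want to be sure which scheduler you're on, it can be selected on
the Xen command line at boot -- I believe the option is 'sched=', e.g.:

    kernel /boot/xen.gz sched=bvt

(path and the rest of the boot entry being whatever your setup uses).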

HTH,
Mark

On Thursday 28 October 2004 11:51, Håvard Bjerke wrote:
> We're getting these results in domain 0, not in a VM. As I understand, BVT
> or apropos scheduling do not apply in domain 0?
>
> Håvard
>
> On Wed, Oct 27, 2004 at 07:40:27PM +0100, Mark A. Williamson wrote:
> > Just a thought: you are using the BVT scheduler, right? We haven't
> > tested performance with the other schedulers recently but we know
> > something goes wrong for IO intensive domains on Atropos.
> >
> > Mark
> >
> > On Wednesday 27 Oct 2004 19:26, Ian Pratt wrote:
> > > > On Fri, Oct 22, 2004 at 06:47:35PM +0100, Ian Pratt wrote:
> > > > > > We tried to compile xen0 with CONFIG_E1000_NAPI = y and got the
> > > > > > samre results between
> > > > > > two xen dom0 nodes. I am not sure if these interrupts tells
> > > > > > anything:
> > > > >
> > > > > It's the different rate at which the eth0 interrupt counts go up
> > > > > during your bandwidth tests that is interesting. e.g. poll it
> > > > > once a second during the test.
> > > >
> > > > We tried sending 1 MB and measured:
> > >
> > > 1MB isn't really very much with a 128KB socket buffer. Do you get
> > > the same results with larger transfers?
> > >
> > > > non-SMP native Linux:
> > > > ~ 130k interrupts
> > > > 114 kB/s
> > > > native Linux with compiled-in SMP support, single CPU:
> > > > ~ 140k interrupts
> > > > 114 kB/s
> > > > Xen0:
> > > > ~ 180k interrupts
> > > > 80 kB/s
> > >
> > > It's pretty odd that Xen's taking more interrupts. Are you using
> > > the same native kernel version as you are for Xen?
> > >
> > > Also, what happens if you boot xen with 'nosmp' on the Xen
> > > command line.
> > >
> > > We've got Xen tcp performance results from a bunch of machines,
> > > and dom0 to dom0 performance has always been almost identical to
> > > native for 1500 byte MTU packets.
> > >
> > >
> > > Ian
> > >


Re: Xen & I/O in clusters - problems! New Information [ In reply to ]
On Thu, Oct 28, 2004 at 01:44:44PM +0100, Ian Pratt wrote:
>
> Please can you try Xen/linux 2.6.9. Also, please can you remind
> me of the spec of your machines.
>

Specs:
Ethernet controller: Intel Corp. 82547GI Gigabit Ethernet Controller
CPU: Intel(R) Pentium(R) 4 CPU 3.40GHz single cpu
RAM: 1 GB


The previous results from 2.4.27 are (I wrote kB/s earlier but that was wrong):
Native:
130k interrupts
114 MB/s
Xen0:
180k interrupts
80 MB/s

New results with 2.6.9:
Native:
bandwidth: 2048000000 bytes in 17.42 real seconds = 114794.42 KB/sec
CPU: 0.0user 1.4sys 0:17real 8%
interrupts: 135k
Xen0:
bandwidth: 2048000000 bytes in 21.77 real seconds = 91885.76 KB/sec
CPU: 0.0user 16.2sys 0:21real 74%
interrupts: 107k

I also tried sending localhost-localhost in 2.4.27, and interestingly native performed roughly 7:1 better than xen0:
Native:
bandwidth: 2048000000 bytes in 2.70 real seconds = 741704.96 KB/sec
CPU: 0.0user 1.6sys 0:02real 62%
Xen0:
bandwidth: 2048000000 bytes in 17.12 real seconds = 116838.03 KB/sec
CPU: 5.9user 0.0sys 0:17real 34%

We've also recently benchmarked in an SMP cluster and achieved satisfactory results, i.e. around 114 MB/s with both native and Xen0. But we're still wondering why we're not achieving full speed in a single-CPU cluster.

Håvard


Re: Xen & I/O in clusters - Single Vs. Dual CPU issue [ In reply to ]
Hi, we have validated the single-CPU results in another single-CPU
cluster, and we still get a performance loss of about 30%
(ca. 85 000 KB/s between two dom0 nodes, and NOT 114 000 KB/s).
Specs:
Ethernet controller: Intel Corp. 82547GI Gigabit Ethernet Controller
CPU: Intel(R) Pentium(R) 4 CPU 3.40GHz single CPU
RAM: 1 GB

But in a dual-CPU cluster (Intel Xeon CPU 2.40 GHz, Ethernet
controller: Intel Corp. 82547GI Gigabit Ethernet Controller) we get
only a 1% performance loss
(from 114 000 KB/s -> 110 000 KB/s).

Both the single CPU clusters and the dual cluster are Xen 2.0 beta
(2.4.26 kernel) with Red Hat Ent. 3

It seems to me that there is an issue with Xen and Gigabit Ethernet
controllers, where Xen is optimized for dual CPUs(?). As mentioned
before, we see more interrupts on eth0 with Xen dom0 than with native
Linux (we use BVT, not Atropos).

Are there any Xen developers/testers who have tested Xen 2.0 on a
single-CPU cluster and can confirm (or not) these results?

Cheers,
Rune



Re: Xen & I/O in clusters - Single Vs. Dual CPU issue [ In reply to ]
> But in a dual-CPU cluster (Intel Xeon CPU 2.40 GHz, Ethernet
> controller: Intel Corp. 82547GI Gigabit Ethernet Controller) we get
> only a 1% performance loss
> (from 114 000 KB/s -> 110 000 KB/s).

Good, that's the result we expect.

> Hi, we have validated the single-CPU results in another single-CPU
> cluster, and we still get a performance loss of about 30%
> (ca. 85 000 KB/s between two dom0 nodes, and NOT 114 000 KB/s).
> Specs:
> Ethernet controller: Intel Corp. 82547GI Gigabit Ethernet Controller
> CPU: Intel(R) Pentium(R) 4 CPU 3.40GHz single CPU
> RAM: 1 GB

Hmm, have your systems got an IOAPIC, or is Xen using the legacy
PIC code? The latter probably hasn't been thoroughly performance
tested...
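
An easy way to check is to look for APIC messages in the boot output
and at how the interrupts are labelled, e.g.:

    dmesg | grep -i apic
    grep -i 'IO-APIC' /proc/interrupts

(on native Linux, IO-APIC-routed interrupts show up as IO-APIC-edge /
IO-APIC-level lines; if everything is XT-PIC you're on the legacy PIC).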

Ian



Re: Xen & I/O in clusters - Single Vs. Dual CPU issue [ In reply to ]
On Fri, Oct 29, 2004 at 06:24:35PM +0100, Ian Pratt wrote:
>
> > But, in a dual CPU cluster, Intel Xenon CPU 2.40 GHz Ethernet
> > controller: Intel Corp. 82547GI Gigabit Ethernet Controller, we get
> > only 1%
> > (from 114 000 KB/s -> 110 000 KB/s) performance loss.)
>
> Good, that's the result we expect.
>
> > Hi, we have validated the single CPU results in another single CPU
> > cluster, and we still get an performance loss about 30%
> > (ca. 85 000 KB/s between two dom0 nodes (and NOT 114 000 KB/s),
> > Specs:
> > Ethernet controller: Intel Corp. 82547GI Gigabit Ethernet Controller
> > CPU: Intel(R) Pentium(R) 4 CPU 3.40GHz single CPU
> > RAM: 1 GB)
>
> Hmm, have your systems got an IOAPIC, or is Xen using the legacy
> PIC code? The latter probably hasn't been thoroughly performance
> tested...
>

The native systems can have either XT-PIC or IOAPIC, and it seems that both have equal performance. In Xen0, however, by looking through dmesg, there doesn't seem to be any IOAPIC. Is it possible to enable (or disable) IOAPIC in Xen0?

Håvard


Re: Xen & I/O in clusters - Single Vs. Dual CPU issue [ In reply to ]
> The native systems can have either XT-PIC or IOAPIC, and it seems that both have equal performance. In Xen0, however, by looking through dmesg, there doesn't seem to be any IOAPIC. Is it possible to enable (or disable) IOAPIC in Xen0?

If you boot with 'ignorebiostables' on the Xen command line Xen
will ignore the IOAPIC and use the PIC.

I could believe there might be a lurking performance problem with
PIC support as it probably hasn't had the same level of
performance testing as IOAPIC.
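
For reference, the option just goes on the Xen (xen.gz) line of your
boot entry; with GRUB that would look something like this (paths and
kernel version being whatever you normally boot):

    title Xen 2.0 (PIC mode)
        kernel /boot/xen.gz ignorebiostables
        module /boot/vmlinuz-2.6.9-xen0 root=/dev/sda1 ro console=tty0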

Ian
Re: Xen & I/O in clusters - Single Vs. Dual CPU issue [ In reply to ]
On Fri, Oct 29, 2004 at 08:46:46PM +0100, Ian Pratt wrote:
> > The native systems can have either XT-PIC or IOAPIC, and it seems that both have equal performance. In Xen0, however, by looking through dmesg, there doesn't seem to be any IOAPIC. Is it possible to enable (or disable) IOAPIC in Xen0?
>
> If you boot with 'ignorebiostables' on the Xen command line Xen
> will ignore the IOAPIC and use the PIC.
>
> I could believe there might be a lurking performance problem with
> PIC support as it probably hasn't had the same level of
> performance testing as IOAPIC.
>

'ignorebiostables' did the trick. Thanks for the help :)

Now I'm curious about live migration. Is it possible to live migrate an MPI application running on a set of nodes to another set of nodes? Has anyone tried that?

Håvard


