Mailing List Archive

Re: [PATCH 0/9] RFC: NVME VFIO mediated device [ In reply to ]
On Wed, 2019-03-20 at 08:08 -0700, Bart Van Assche wrote:
> On Tue, 2019-03-19 at 16:41 +0200, Maxim Levitsky wrote:
> > * Polling kernel thread is used. The polling is stopped after a
> > predefined timeout (1/2 sec by default).
> > Support for a fully interrupt-driven mode is planned, and it shows promising
> > results.
>
> Which cgroup will the CPU cycles used for polling be attributed to? Can the
> polling code be moved into user space such that it becomes easy to identify
> which process needs most CPU cycles for polling and such that the polling
> CPU cycles are attributed to the proper cgroup?


Currently there is a single IO thread per virtual controller instance.
I would prefer to keep the whole driver in the kernel, but I think I can make it
cgroup aware, in a similar way to how this is done in vhost-net and vhost-scsi.
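
For illustration, a minimal sketch of what that could look like, modelled on the
way vhost attaches its worker kthread to the owner's cgroups. The nvme_mdev_vctrl
and mdev_io_thread_fn names are placeholders, and the direct
cgroup_attach_task_all() call is a simplification of what vhost actually does
(vhost performs the attach from inside the worker via a queued work item):

/* Sketch only: create the per-controller IO thread from the context of the
 * process that opened the mediated device, then attach it to all cgroups of
 * that process, the same way vhost-net/vhost-scsi do for their workers.
 */
#include <linux/kthread.h>
#include <linux/cgroup.h>
#include <linux/sched.h>
#include <linux/err.h>

struct nvme_mdev_vctrl;                 /* hypothetical per-controller state */
int mdev_io_thread_fn(void *data);      /* hypothetical polling loop */

static int mdev_start_io_thread(struct nvme_mdev_vctrl *vctrl)
{
        struct task_struct *iothread;
        int ret;

        /* 'current' is the owner, e.g. the QEMU process doing the ioctl */
        iothread = kthread_create(mdev_io_thread_fn, vctrl,
                                  "nvme-mdev-io/%d", current->pid);
        if (IS_ERR(iothread))
                return PTR_ERR(iothread);

        /* Charge the polling CPU time to the owner's cgroups. */
        ret = cgroup_attach_task_all(current, iothread);
        if (ret) {
                kthread_stop(iothread);
                return ret;
        }

        wake_up_process(iothread);
        return 0;
}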

Best regards,
Maxim Levitsky


> Thanks,
>
> Bart.
Re: your mail [ In reply to ]
On Wed, Mar 20, 2019 at 06:30:29PM +0200, Maxim Levitsky wrote:
> Or instead I can use the block backend,
> (but note that currently the block back-end doesn't support polling, which is
> critical for performance).

Oh, I think you can do polling through there. For reference, fs/io_uring.c
has a pretty good implementation that aligns with how you could use it.
Re: your mail [ In reply to ]
On Wed, 2019-03-20 at 11:03 -0600, Keith Busch wrote:
> On Wed, Mar 20, 2019 at 06:30:29PM +0200, Maxim Levitsky wrote:
> > Or instead I can use the block backend,
> > (but note that currently the block back-end doesn't support polling, which is
> > critical for performance).
>
> Oh, I think you can do polling through there. For reference, fs/io_uring.c
> has a pretty good implementation that aligns with how you could use it.


That is exactly my thought. Polling recently got a lot of improvements in the
block layer, which might make this feasible.
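
For reference, this is roughly the pattern fs/io_uring.c uses on current (~5.1)
block-layer APIs: mark the bio REQ_HIPRI and spin on blk_poll() with the cookie
returned by submit_bio(). A minimal, illustrative sketch only; the
polled_read_page() helper and its completion flag are made up for the example:

/* Sketch only (kernel ~5.0/5.1 era block APIs): submit one polled read to a
 * block device and spin on blk_poll() until it completes.
 */
#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/blk_types.h>

struct polled_io {
        bool done;
        blk_status_t status;
};

static void polled_read_end_io(struct bio *bio)
{
        struct polled_io *pio = bio->bi_private;

        pio->status = bio->bi_status;
        /* paired with smp_load_acquire() in the polling loop */
        smp_store_release(&pio->done, true);
        bio_put(bio);
}

static int polled_read_page(struct block_device *bdev, sector_t sector,
                            struct page *page)
{
        struct request_queue *q = bdev_get_queue(bdev);
        struct polled_io pio = { .done = false };
        struct bio *bio;
        blk_qc_t cookie;

        bio = bio_alloc(GFP_KERNEL, 1);
        bio_set_dev(bio, bdev);
        bio->bi_iter.bi_sector = sector;
        bio->bi_opf = REQ_OP_READ | REQ_HIPRI;  /* ask for a polled completion */
        bio->bi_end_io = polled_read_end_io;
        bio->bi_private = &pio;
        bio_add_page(bio, page, PAGE_SIZE, 0);

        cookie = submit_bio(bio);

        /* Spin on the completion queue instead of waiting for an interrupt. */
        while (!smp_load_acquire(&pio.done))
                blk_poll(q, cookie, true);

        return blk_status_to_errno(pio.status);
}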

I will give it a try.

Best regards,
Maxim Levitsky
[no subject] [ In reply to ]
subscribe linux-kernel
[no subject] [ In reply to ]
Date: Thu, 21 Mar 2019 10:14:34 +0800
Subject: [PATCH] MIPS: ralink: fix cpu clock of mt7621 and add dt clk devices

For a long time the mt7621 has used a fixed cpu clock, which causes a problem
if the cpu frequency is not 880MHz.

This patch fixes the cpu clock calculation and adds the cpu/bus clkdev
which will be used in dts.

Signed-off-by: Weijie Gao <hackpascal@gmail.com>

Ported from OpenWrt:
c7ca224299 ramips: fix cpu clock of mt7621 and add dt clk devices

Signed-off-by: Chuanhong Guo <gch981213@gmail.com>
---
arch/mips/include/asm/mach-ralink/mt7621.h | 20 ++++
arch/mips/ralink/mt7621.c | 102 ++++++++++++++-------
arch/mips/ralink/timer-gic.c | 4 +-
3 files changed, 93 insertions(+), 33 deletions(-)

diff --git a/arch/mips/include/asm/mach-ralink/mt7621.h b/arch/mips/include/asm/mach-ralink/mt7621.h
index a672e06fa5fd..b8a8834164c8 100644
--- a/arch/mips/include/asm/mach-ralink/mt7621.h
+++ b/arch/mips/include/asm/mach-ralink/mt7621.h
@@ -19,6 +19,10 @@
#define SYSC_REG_CHIP_REV 0x0c
#define SYSC_REG_SYSTEM_CONFIG0 0x10
#define SYSC_REG_SYSTEM_CONFIG1 0x14
+#define SYSC_REG_CLKCFG0 0x2c
+#define SYSC_REG_CUR_CLK_STS 0x44
+
+#define MEMC_REG_CPU_PLL 0x648

#define CHIP_REV_PKG_MASK 0x1
#define CHIP_REV_PKG_SHIFT 16
@@ -26,6 +30,22 @@
#define CHIP_REV_VER_SHIFT 8
#define CHIP_REV_ECO_MASK 0xf

+#define XTAL_MODE_SEL_MASK 0x7
+#define XTAL_MODE_SEL_SHIFT 6
+
+#define CPU_CLK_SEL_MASK 0x3
+#define CPU_CLK_SEL_SHIFT 30
+
+#define CUR_CPU_FDIV_MASK 0x1f
+#define CUR_CPU_FDIV_SHIFT 8
+#define CUR_CPU_FFRAC_MASK 0x1f
+#define CUR_CPU_FFRAC_SHIFT 0
+
+#define CPU_PLL_PREDIV_MASK 0x3
+#define CPU_PLL_PREDIV_SHIFT 12
+#define CPU_PLL_FBDIV_MASK 0x7f
+#define CPU_PLL_FBDIV_SHIFT 4
+
#define MT7621_DRAM_BASE 0x0
#define MT7621_DDR2_SIZE_MIN 32
#define MT7621_DDR2_SIZE_MAX 256
diff --git a/arch/mips/ralink/mt7621.c b/arch/mips/ralink/mt7621.c
index d2718de60b9b..0b2845a4a036 100644
--- a/arch/mips/ralink/mt7621.c
+++ b/arch/mips/ralink/mt7621.c
@@ -9,22 +9,22 @@

#include <linux/kernel.h>
#include <linux/init.h>
+#include <linux/clk.h>
+#include <linux/clkdev.h>
+#include <linux/clk-provider.h>
+#include <dt-bindings/clock/mt7621-clk.h>

#include <asm/mipsregs.h>
#include <asm/smp-ops.h>
#include <asm/mips-cps.h>
#include <asm/mach-ralink/ralink_regs.h>
#include <asm/mach-ralink/mt7621.h>
+#include <asm/time.h>

#include <pinmux.h>

#include "common.h"

-#define SYSC_REG_SYSCFG 0x10
-#define SYSC_REG_CPLL_CLKCFG0 0x2c
-#define SYSC_REG_CUR_CLK_STS 0x44
-#define CPU_CLK_SEL (BIT(30) | BIT(31))
-
#define MT7621_GPIO_MODE_UART1 1
#define MT7621_GPIO_MODE_I2C 2
#define MT7621_GPIO_MODE_UART3_MASK 0x3
@@ -110,49 +110,90 @@ static struct rt2880_pmx_group mt7621_pinmux_data[] = {
{ 0 }
};

+static struct clk *clks[MT7621_CLK_MAX];
+static struct clk_onecell_data clk_data = {
+ .clks = clks,
+ .clk_num = ARRAY_SIZE(clks),
+};
+
phys_addr_t mips_cpc_default_phys_base(void)
{
panic("Cannot detect cpc address");
}

+static struct clk *__init mt7621_add_sys_clkdev(
+ const char *id, unsigned long rate)
+{
+ struct clk *clk;
+ int err;
+
+ clk = clk_register_fixed_rate(NULL, id, NULL, 0, rate);
+ if (IS_ERR(clk))
+ panic("failed to allocate %s clock structure", id);
+
+ err = clk_register_clkdev(clk, id, NULL);
+ if (err)
+ panic("unable to register %s clock device", id);
+
+ return clk;
+}
+
void __init ralink_clk_init(void)
{
- int cpu_fdiv = 0;
- int cpu_ffrac = 0;
- int fbdiv = 0;
- u32 clk_sts, syscfg;
- u8 clk_sel = 0, xtal_mode;
- u32 cpu_clk;
+ u32 syscfg, xtal_sel, clkcfg, clk_sel, curclk, ffiv, ffrac;
+ u32 pll, prediv, fbdiv;
+ u32 xtal_clk, cpu_clk, bus_clk;
+
+ const static u32 prediv_tbl[] = {0, 1, 2, 2};
+
+ syscfg = rt_sysc_r32(SYSC_REG_SYSTEM_CONFIG0);
+ xtal_sel = (syscfg >> XTAL_MODE_SEL_SHIFT) & XTAL_MODE_SEL_MASK;

- if ((rt_sysc_r32(SYSC_REG_CPLL_CLKCFG0) & CPU_CLK_SEL) != 0)
- clk_sel = 1;
+ clkcfg = rt_sysc_r32(SYSC_REG_CLKCFG0);
+ clk_sel = (clkcfg >> CPU_CLK_SEL_SHIFT) & CPU_CLK_SEL_MASK;
+
+ curclk = rt_sysc_r32(SYSC_REG_CUR_CLK_STS);
+ ffiv = (curclk >> CUR_CPU_FDIV_SHIFT) & CUR_CPU_FDIV_MASK;
+ ffrac = (curclk >> CUR_CPU_FFRAC_SHIFT) & CUR_CPU_FFRAC_MASK;
+
+ if (xtal_sel <= 2)
+ xtal_clk = 20 * 1000 * 1000;
+ else if (xtal_sel <= 5)
+ xtal_clk = 40 * 1000 * 1000;
+ else
+ xtal_clk = 25 * 1000 * 1000;

switch (clk_sel) {
case 0:
- clk_sts = rt_sysc_r32(SYSC_REG_CUR_CLK_STS);
- cpu_fdiv = ((clk_sts >> 8) & 0x1F);
- cpu_ffrac = (clk_sts & 0x1F);
- cpu_clk = (500 * cpu_ffrac / cpu_fdiv) * 1000 * 1000;
+ cpu_clk = 500 * 1000 * 1000;
break;
-
case 1:
- fbdiv = ((rt_sysc_r32(0x648) >> 4) & 0x7F) + 1;
- syscfg = rt_sysc_r32(SYSC_REG_SYSCFG);
- xtal_mode = (syscfg >> 6) & 0x7;
- if (xtal_mode >= 6) {
- /* 25Mhz Xtal */
- cpu_clk = 25 * fbdiv * 1000 * 1000;
- } else if (xtal_mode >= 3) {
- /* 40Mhz Xtal */
- cpu_clk = 40 * fbdiv * 1000 * 1000;
- } else {
- /* 20Mhz Xtal */
- cpu_clk = 20 * fbdiv * 1000 * 1000;
- }
+ pll = rt_memc_r32(MEMC_REG_CPU_PLL);
+ fbdiv = (pll >> CPU_PLL_FBDIV_SHIFT) & CPU_PLL_FBDIV_MASK;
+ prediv = (pll >> CPU_PLL_PREDIV_SHIFT) & CPU_PLL_PREDIV_MASK;
+ cpu_clk = ((fbdiv + 1) * xtal_clk) >> prediv_tbl[prediv];
break;
+ default:
+ cpu_clk = xtal_clk;
}
+
+ cpu_clk = cpu_clk / ffiv * ffrac;
+ bus_clk = cpu_clk / 4;
+
+ clks[MT7621_CLK_CPU] = mt7621_add_sys_clkdev("cpu", cpu_clk);
+ clks[MT7621_CLK_BUS] = mt7621_add_sys_clkdev("bus", bus_clk);
+
+ pr_info("CPU Clock: %dMHz\n", cpu_clk / 1000000);
+ mips_hpt_frequency = cpu_clk / 2;
}

+static void __init mt7621_clocks_init_dt(struct device_node *np)
+{
+ of_clk_add_provider(np, of_clk_src_onecell_get, &clk_data);
+}
+
+CLK_OF_DECLARE(mt7621_clk, "mediatek,mt7621-pll", mt7621_clocks_init_dt);
+
void __init ralink_of_remap(void)
{
rt_sysc_membase = plat_of_remap_node("mtk,mt7621-sysc");
diff --git a/arch/mips/ralink/timer-gic.c b/arch/mips/ralink/timer-gic.c
index b5f07d21fcf2..4fed842ae8eb 100644
--- a/arch/mips/ralink/timer-gic.c
+++ b/arch/mips/ralink/timer-gic.c
@@ -11,14 +11,14 @@

#include <linux/of.h>
#include <linux/clk-provider.h>
-#include <linux/clocksource.h>
+#include <asm/time.h>

#include "common.h"

void __init plat_time_init(void)
{
ralink_of_remap();
-
+ ralink_clk_init();
of_clk_init(NULL);
timer_probe();
}
--
2.21.0
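
As a standalone illustration of the calculation the patch implements in
ralink_clk_init(), here is the same arithmetic in plain C with made-up register
field values (a 40 MHz crystal, FBDIV = 21, PREDIV = 0, unity FDIV/FFRAC ratio);
only the formula mirrors the code above, the values are hypothetical:

/* Worked example of the mt7621 CPU/bus clock derivation, made-up inputs. */
#include <stdio.h>

int main(void)
{
        unsigned int xtal_clk = 40 * 1000 * 1000;  /* XTAL_MODE_SEL -> 40 MHz  */
        unsigned int fbdiv = 21;                   /* CPU_PLL_FBDIV field      */
        unsigned int prediv = 0;                   /* CPU_PLL_PREDIV field     */
        unsigned int ffiv = 1, ffrac = 1;          /* CUR_CLK_STS div/fraction */
        static const unsigned int prediv_tbl[] = { 0, 1, 2, 2 };

        /* clk_sel == 1 path: (21 + 1) * 40 MHz >> 0 = 880 MHz */
        unsigned int cpu_clk = ((fbdiv + 1) * xtal_clk) >> prediv_tbl[prediv];

        cpu_clk = cpu_clk / ffiv * ffrac;
        printf("CPU: %u MHz, bus: %u MHz\n",
               cpu_clk / 1000000, cpu_clk / 4 / 1000000);  /* 880 MHz, 220 MHz */
        return 0;
}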
Re: your mail [ In reply to ]
On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
> Date: Tue, 19 Mar 2019 14:45:45 +0200
> Subject: [PATCH 0/9] RFC: NVME VFIO mediated device
>
> Hi everyone!
>
> In this patch series, I would like to introduce my take on the problem of
> virtualizing storage as fast as possible, with an emphasis on low latency.
>
> In this patch series I implemented a kernel VFIO-based mediated device that
> allows the user to pass through a partition and/or a whole namespace to a guest.
>
> The idea behind this driver is based on the paper you can find at
> https://www.usenix.org/conference/atc18/presentation/peng.
>
> Note, though, that I started the development prior to reading this paper,
> and independently of it.
>
> In addition, the implementation is not based on the code used in the paper,
> as I was not able to obtain its source at the time.
>
> ***Key points about the implementation:***
>
> * Polling kernel thread is used. The polling is stopped after a
> predefined timeout (1/2 sec by default).
> Support for a fully interrupt-driven mode is planned, and it shows promising results.
>
> * Guest sees a standard NVMe device - this allows running the guest with
> unmodified drivers, for example Windows guests.
>
> * The NVMe device is shared between host and guest.
> That means that even a single namespace can be split between host
> and guest based on different partitions.
>
> * Simple configuration
>
> *** Performance ***
>
> Performance was tested on an Intel DC P3700 with a Xeon E5-2620 v2,
> and both latency and throughput are very similar to SPDK.
>
> Soon I will test this on a better server and nvme device and provide
> more formal performance numbers.
>
> Latency numbers:
> ~80ms - spdk with fio plugin on the host.
> ~84ms - nvme driver on the host
> ~87ms - mdev-nvme + nvme driver in the guest

You mentioned the spdk numbers are with vhost-user-nvme. Have you
measured SPDK's vhost-user-blk?

Stefan
Re: your mail [ In reply to ]
On Thu, 2019-03-21 at 16:13 +0000, Stefan Hajnoczi wrote:
> On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
> > Date: Tue, 19 Mar 2019 14:45:45 +0200
> > Subject: [PATCH 0/9] RFC: NVME VFIO mediated device
> >
> > Hi everyone!
> >
> > In this patch series, I would like to introduce my take on the problem of
> > virtualizing storage as fast as possible, with an emphasis on low latency.
> >
> > In this patch series I implemented a kernel VFIO-based mediated device that
> > allows the user to pass through a partition and/or a whole namespace to a
> > guest.
> >
> > The idea behind this driver is based on the paper you can find at
> > https://www.usenix.org/conference/atc18/presentation/peng.
> >
> > Note, though, that I started the development prior to reading this paper,
> > and independently of it.
> >
> > In addition, the implementation is not based on the code used in the paper,
> > as I was not able to obtain its source at the time.
> >
> > ***Key points about the implementation:***
> >
> > * Polling kernel thread is used. The polling is stopped after a
> > predefined timeout (1/2 sec by default).
> > Support for a fully interrupt-driven mode is planned, and it shows promising
> > results.
> >
> > * Guest sees a standard NVMe device - this allows running the guest with
> > unmodified drivers, for example Windows guests.
> >
> > * The NVMe device is shared between host and guest.
> > That means that even a single namespace can be split between host
> > and guest based on different partitions.
> >
> > * Simple configuration
> >
> > *** Performance ***
> >
> > Performance was tested on an Intel DC P3700 with a Xeon E5-2620 v2,
> > and both latency and throughput are very similar to SPDK.
> >
> > Soon I will test this on a better server and nvme device and provide
> > more formal performance numbers.
> >
> > Latency numbers:
> > ~80ms - spdk with fio plugin on the host.
> > ~84ms - nvme driver on the host
> > ~87ms - mdev-nvme + nvme driver in the guest
>
> You mentioned the spdk numbers are with vhost-user-nvme. Have you
> measured SPDK's vhost-user-blk?

I have done a lot of measurements of vhost-user-blk vs vhost-user-nvme.
vhost-user-nvme was always a bit faster, but only a bit.
Thus I don't think it makes sense to benchmark against vhost-user-blk.

Best regards,
Maxim Levitsky
[no subject] [ In reply to ]
From bb04b0ca982b7042902fffe1377e0e38e83b402b Mon Sep 17 00:00:00 2001
From: Will Cunningham <wjcunningham7@gmail.com>
Date: Sat, 23 Mar 2019 12:54:34 -0400
Subject: [PATCH] Staging: emxx_udc: emxx_udc: Fixed a coding style error

Removed unnecessary parentheses.

Signed-off-by: Will Cunningham <wjcunningham7@gmail.com>
---
drivers/staging/emxx_udc/emxx_udc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/emxx_udc/emxx_udc.c b/drivers/staging/emxx_udc/emxx_udc.c
index a913d40f0801..80a906742cdc 100644
--- a/drivers/staging/emxx_udc/emxx_udc.c
+++ b/drivers/staging/emxx_udc/emxx_udc.c
@@ -136,7 +136,7 @@ static void _nbu2ss_ep0_complete(struct usb_ep *_ep, struct usb_request *_req)
struct usb_ctrlrequest *p_ctrl;
struct nbu2ss_udc *udc;

- if ((!_ep) || (!_req))
+ if (!_ep || !_req)
return;

udc = (struct nbu2ss_udc *)_req->context;
--
2.19.2
Re: your mail [ In reply to ]
On Thu, Mar 21, 2019 at 07:07:38PM +0200, Maxim Levitsky wrote:
> On Thu, 2019-03-21 at 16:13 +0000, Stefan Hajnoczi wrote:
> > On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
> > > Date: Tue, 19 Mar 2019 14:45:45 +0200
> > > Subject: [PATCH 0/9] RFC: NVME VFIO mediated device
> > >
> > > Hi everyone!
> > >
> > > In this patch series, I would like to introduce my take on the problem of
> > > virtualizing storage as fast as possible, with an emphasis on low latency.
> > >
> > > In this patch series I implemented a kernel VFIO-based mediated device that
> > > allows the user to pass through a partition and/or a whole namespace to a
> > > guest.
> > >
> > > The idea behind this driver is based on the paper you can find at
> > > https://www.usenix.org/conference/atc18/presentation/peng.
> > >
> > > Note, though, that I started the development prior to reading this paper,
> > > and independently of it.
> > >
> > > In addition, the implementation is not based on the code used in the paper,
> > > as I was not able to obtain its source at the time.
> > >
> > > ***Key points about the implementation:***
> > >
> > > * Polling kernel thread is used. The polling is stopped after a
> > > predefined timeout (1/2 sec by default).
> > > Support for a fully interrupt-driven mode is planned, and it shows promising
> > > results.
> > >
> > > * Guest sees a standard NVMe device - this allows running the guest with
> > > unmodified drivers, for example Windows guests.
> > >
> > > * The NVMe device is shared between host and guest.
> > > That means that even a single namespace can be split between host
> > > and guest based on different partitions.
> > >
> > > * Simple configuration
> > >
> > > *** Performance ***
> > >
> > > Performance was tested on an Intel DC P3700 with a Xeon E5-2620 v2,
> > > and both latency and throughput are very similar to SPDK.
> > >
> > > Soon I will test this on a better server and nvme device and provide
> > > more formal performance numbers.
> > >
> > > Latency numbers:
> > > ~80ms - spdk with fio plugin on the host.
> > > ~84ms - nvme driver on the host
> > > ~87ms - mdev-nvme + nvme driver in the guest
> >
> > You mentioned the spdk numbers are with vhost-user-nvme. Have you
> > measured SPDK's vhost-user-blk?
>
> I have done a lot of measurements of vhost-user-blk vs vhost-user-nvme.
> vhost-user-nvme was always a bit faster, but only a bit.
> Thus I don't think it makes sense to benchmark against vhost-user-blk.

It's interesting because mdev-nvme is closest to the hardware while
vhost-user-blk is closest to software. Doing things at the NVMe level
isn't buying much performance because it's still going through a
software path comparable to vhost-user-blk.

From what you say it sounds like there isn't much to optimize away :(.

Stefan
Re: your mail [ In reply to ]
On Sat, Mar 23, 2019 at 01:17:38PM -0400, William J. Cunningham wrote:
> From bb04b0ca982b7042902fffe1377e0e38e83b402b Mon Sep 17 00:00:00 2001
> From: Will Cunningham <wjcunningham7@gmail.com>
> Date: Sat, 23 Mar 2019 12:54:34 -0400
> Subject: [PATCH] Staging: emxx_udc: emxx_udc: Fixed a coding style error
>
> Removed unnecessary parentheses.
>
> Signed-off-by: Will Cunningham <wjcunningham7@gmail.com>

Please fix up the headers and resend.

regards,
dan carpenter
[no subject] [ In reply to ]
From: Jojo Zeng <>
To: linux-kernel@vger.kernel.org
Cc: Jojo Zeng <jojo_zeng@126.com>
Subject: [PATCH] modified rtc.c
Date: Sun, 31 Mar 2019 18:20:44 +0800
Message-Id: <1554027644-13091-2-git-send-email-jojo_zeng@126.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1554027644-13091-1-git-send-email-jojo_zeng@126.com>
References: <1554027644-13091-1-git-send-email-jojo_zeng@126.com>

From: Jojo Zeng <jojo_zeng@126.com>

modified rtc.c

Signed-off-by: Jojo Zeng <jojo_zeng@126.com>
---
drivers/char/rtc.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/char/rtc.c b/drivers/char/rtc.c
index c862d0b..0199b26 100644
--- a/drivers/char/rtc.c
+++ b/drivers/char/rtc.c
@@ -84,6 +84,8 @@

#include <asm/current.h>

+#define jojo
+
#ifdef CONFIG_X86
#include <asm/hpet.h>
#endif
--
2.7.4
[no subject] [ In reply to ]
From: Jojo Zeng <>
To: linux-kernel@vger.kernel.org
Cc: Jojo Zeng <jojo_zeng@126.com>
Subject: [PATCH] *** modified rtc.c ***
Date: Sun, 31 Mar 2019 18:20:43 +0800
Message-Id: <1554027644-13091-1-git-send-email-jojo_zeng@126.com>
X-Mailer: git-send-email 2.7.4

From: Jojo Zeng <jojo_zeng@126.com>

Sorry, this is a test. I am a beginner on LKML and I want to submit a patch; this is a test.

Jojo Zeng (1):
modified rtc.c

drivers/char/rtc.c | 2 ++
1 file changed, 2 insertions(+)

--
2.7.4
[no subject] [ In reply to ]
I am in the military unit here in Afghanistan, we have some amount of funds that we want to move out of the country. My partners and I need a good partner someone we can trust. It is risk free and legal. Reply to this email: hornbeckmajordennis638@gmail.com

Regards,
Major Dennis Hornbeck.
[no subject] [ In reply to ]
Bcc:
Subject: Re: [PATCH v2 00/24] Include linux ACPI docs into Sphinx TOC tree
Reply-To:
In-Reply-To: <20190403133613.13f3fd75@lwn.net>

+ Bjorn

On Wed, Apr 03, 2019 at 01:36:13PM -0600, Jonathan Corbet wrote:
> On Tue, 2 Apr 2019 10:25:23 +0200
> "Rafael J. Wysocki" <rafael@kernel.org> wrote:
>
> > There are ACPI-related documents currently in Documentation/acpi/ that
> > don't clearly fall under either driver-api or admin-guide. For
> > example, some of them describe various aspects of the ACPI support
> > subsystem operation and some document expectations with respect to the
> > ACPI tables provided by the firmware etc.
> >
> > Where would you recommend to put them after converting to .rst?
>
> OK, I've done some pondering on this. Maybe what we need is a new
> top-level "hardware guide" book meant to hold information on how the
> hardware works and what the kernel's expectations are. Architecture
> information could maybe go there too. Would this make sense?
>
> If so, I could see a division like this:
>
> Hardware guide
> acpi-lid
> aml-debugger (or maybe driver api?)
> debug (ditto)
> DSD-properties-rules
> gpio-properties
> i2c-muxes
>
> Admin guide
> cppc_sysfs
> initrd_table_override
>
> Driver-API
> enumeration
> scan_handlers
>
> other:
> dsdt-override: find another home for those five lines
>
Then, should we create dedicated sub-directories for these new chapters and
move the documents to the corresponding ones? Right now we have 'admin-guide'
and all admin-guide docs live under it; otherwise we will end up with references
across different folders. For example, 'admin-guide/index.rst' would contain:
...
../acpi/osi
...
which does not look good.

> ...and so on. I've probably gotten at least one of those wrong, but that's
> the idea.
>
> Of course, then it would be nice to better integrate those documents so
> that they fit into a single coherent whole...a guy can dream...:)
>
I am not an administrator, so I don't know how administrators use the kernel
documentation. But as a kernel developer, I prefer all related documents to be
integrated into one chapter, because I would probably miss some useful sections
if the documents were distributed across several chapters. This is also usually
how a book is written (one chapter focuses on one topic and has sub-sections
such as overview, background knowledge, implementation details, and so on),
but a book mostly targets hypothetical readers...

> Thanks,
>
> jon

--
Cheers,
Changbin Du
[no subject] [ In reply to ]
Hello, I want to set up a social project in town to help the less
privileged and physically challenged. I will give you details about
this project and the reason behind it upon your response.
[no subject] [ In reply to ]
unsubscribe linux-kernel
[no subject] [ In reply to ]
Hello, kindly please confirm it is you.

Regard,
Rufus
Re: your mail [ In reply to ]
On Tue, 2019-03-19 at 09:22 -0600, Keith Busch wrote:
> On Tue, Mar 19, 2019 at 04:41:07PM +0200, Maxim Levitsky wrote:
> > -> Share the NVMe device between host and guest.
> > Even in fully virtualized configurations,
> > some partitions of nvme device could be used by guests as block
> > devices
> > while others passed through with nvme-mdev to achieve balance between
> > all features of full IO stack emulation and performance.
> >
> > -> NVME-MDEV is a bit faster due to the fact that in-kernel driver
> > can send interrupts to the guest directly without a context
> > switch that can be expensive due to meltdown mitigation.
> >
> > -> Is able to utilize interrupts to get reasonable performance.
> > This is only implemented
> > as a proof of concept and not included in the patches,
> > but interrupt driven mode shows reasonable performance
> >
> > -> This is a framework that later can be used to support NVMe devices
> > with more of the IO virtualization built-in
> > (IOMMU with PASID support coupled with device that supports it)
>
> Would be very interested to see the PASID support. You wouldn't even
> need to mediate the IO doorbells or translations if assigning entire
> namespaces, and should be much faster than the shadow doorbells.
>
> I think you should send 6/9 "nvme/pci: init shadow doorbell after each
> reset" separately for immediate inclusion.
>
> I like the idea in principle, but it will take me a little time to get
> through reviewing your implementation. I would have guessed we could
> have leveraged something from the existing nvme/target for the mediating
> controller register access and admin commands. Maybe even start with
> implementing an nvme passthrough namespace target type (we currently
> have block and file).


Hi!

Sorry to bother you, but any update?

I was somewhat sick for the last week, but I am now finally back in shape to
continue working on this and my other tasks.

I am now studying the NVMe target code and io_uring to evaluate the difficulty
of using something similar to talk to the block device instead of, or in
addition to, the direct connection I implemented.
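
For reference, the minimal plumbing for the block-device route would look
roughly like this. Sketch only: the mdev_ns names are placeholders, error
handling is trimmed, and it simply mirrors how an nvmet block-backed namespace
resolves its device with blkdev_get_by_path():

/* Sketch only: resolve a host block device (whole namespace or partition)
 * and keep a reference for the mediated controller.
 */
#include <linux/fs.h>
#include <linux/blkdev.h>
#include <linux/err.h>

struct mdev_ns {                        /* hypothetical per-namespace state */
        struct block_device *bdev;
        sector_t nr_sectors;
};

static int mdev_ns_open(struct mdev_ns *ns, const char *path)
{
        /* e.g. path = "/dev/nvme0n1p2" to hand a partition to the guest */
        ns->bdev = blkdev_get_by_path(path, FMODE_READ | FMODE_WRITE, ns);
        if (IS_ERR(ns->bdev))
                return PTR_ERR(ns->bdev);

        ns->nr_sectors = i_size_read(ns->bdev->bd_inode) >> SECTOR_SHIFT;
        return 0;
}

static void mdev_ns_close(struct mdev_ns *ns)
{
        blkdev_put(ns->bdev, FMODE_READ | FMODE_WRITE);
}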

I would be glad to hear more feedback on this project.

I will also soon post the few fixes separately as you suggested.

Best regards,
Maxim Levitsky
[no subject] [ In reply to ]
Date: Tue, 9 Apr 2019 20:23:16 +1000
Subject: [PATCH] kernel/sched: run nohz idle load balancer on HK_FLAG_MISC
CPUs

The nohz idle balancer runs on the lowest idle CPU. This can
interfere with isolated CPUs, so confine it to HK_FLAG_MISC
housekeeping CPUs.

HK_FLAG_SCHED is not used for this because it is not set anywhere
at the moment. This could be folded into HK_FLAG_SCHED once that
option is fixed.

The problem was observed with increased jitter on an application
running on CPU0, caused by nohz idle load balancing being run on
CPU1 (an SMT sibling).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
kernel/sched/fair.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fdab7eb6f351..d29ca323214d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9522,22 +9522,26 @@ static inline int on_null_domain(struct rq *rq)
* - When one of the busy CPUs notice that there may be an idle rebalancing
* needed, they will kick the idle load balancer, which then does idle
* load balancing for all the idle CPUs.
+ * - HK_FLAG_MISC CPUs are used for this task, because HK_FLAG_SCHED not set
+ * anywhere yet.
*/

static inline int find_new_ilb(void)
{
- int ilb = cpumask_first(nohz.idle_cpus_mask);
+ int ilb;

- if (ilb < nr_cpu_ids && idle_cpu(ilb))
- return ilb;
+ for_each_cpu_and(ilb, nohz.idle_cpus_mask,
+ housekeeping_cpumask(HK_FLAG_MISC)) {
+ if (idle_cpu(ilb))
+ return ilb;
+ }

return nr_cpu_ids;
}

/*
- * Kick a CPU to do the nohz balancing, if it is time for it. We pick the
- * nohz_load_balancer CPU (if there is one) otherwise fallback to any idle
- * CPU (if there is one).
+ * Kick a CPU to do the nohz balancing, if it is time for it. We pick any
+ * idle CPU in the HK_FLAG_MISC housekeeping set (if there is one).
*/
static void kick_ilb(unsigned int flags)
{
--
2.20.1
Re: your mail [ In reply to ]
Was this supposed to be patch 6/5 of your previous series?

On Thu, Apr 11, 2019 at 04:05:36PM +1000, Nicholas Piggin wrote:
> Date: Tue, 9 Apr 2019 20:23:16 +1000
> Subject: [PATCH] kernel/sched: run nohz idle load balancer on HK_FLAG_MISC
> CPUs
>
> The nohz idle balancer runs on the lowest idle CPU. This can
> interfere with isolated CPUs, so confine it to HK_FLAG_MISC
> housekeeping CPUs.
>
> HK_FLAG_SCHED is not used for this because it is not set anywhere
> at the moment. This could be folded into HK_FLAG_SCHED once that
> option is fixed.
>
> The problem was observed with increased jitter on an application
> running on CPU0, caused by nohz idle load balancing being run on
> CPU1 (an SMT sibling).
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> kernel/sched/fair.c | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index fdab7eb6f351..d29ca323214d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9522,22 +9522,26 @@ static inline int on_null_domain(struct rq *rq)
> * - When one of the busy CPUs notice that there may be an idle rebalancing
> * needed, they will kick the idle load balancer, which then does idle
> * load balancing for all the idle CPUs.
> + * - HK_FLAG_MISC CPUs are used for this task, because HK_FLAG_SCHED not set
> + * anywhere yet.
> */
>
> static inline int find_new_ilb(void)
> {
> - int ilb = cpumask_first(nohz.idle_cpus_mask);
> + int ilb;
>
> - if (ilb < nr_cpu_ids && idle_cpu(ilb))
> - return ilb;
> + for_each_cpu_and(ilb, nohz.idle_cpus_mask,
> + housekeeping_cpumask(HK_FLAG_MISC)) {
> + if (idle_cpu(ilb))
> + return ilb;
> + }
>
> return nr_cpu_ids;
> }
>
> /*
> - * Kick a CPU to do the nohz balancing, if it is time for it. We pick the
> - * nohz_load_balancer CPU (if there is one) otherwise fallback to any idle
> - * CPU (if there is one).
> + * Kick a CPU to do the nohz balancing, if it is time for it. We pick any
> + * idle CPU in the HK_FLAG_MISC housekeeping set (if there is one).
> */
> static void kick_ilb(unsigned int flags)
> {
> --
> 2.20.1
>
Re: your mail [ In reply to ]
Peter Zijlstra's on April 11, 2019 8:53 pm:
> Was this supposed to be patch 6/5 of your previous series?

Dang, I screwed up the headers? Thanks for the ping, I will resend.

It is standalone. It seems more suited to the scheduler tree than the
timers one, but your call.

It is generally of more use when CPU0 is _not_ a housekeeping one,
and that's where I've done most testing, but I don't see any hard
dependency.

Thanks,
Nick

>
> On Thu, Apr 11, 2019 at 04:05:36PM +1000, Nicholas Piggin wrote:
>> Date: Tue, 9 Apr 2019 20:23:16 +1000
>> Subject: [PATCH] kernel/sched: run nohz idle load balancer on HK_FLAG_MISC
>> CPUs
>>
>> The nohz idle balancer runs on the lowest idle CPU. This can
>> interfere with isolated CPUs, so confine it to HK_FLAG_MISC
>> housekeeping CPUs.
>>
>> HK_FLAG_SCHED is not used for this because it is not set anywhere
>> at the moment. This could be folded into HK_FLAG_SCHED once that
>> option is fixed.
>>
>> The problem was observed with increased jitter on an application
>> running on CPU0, caused by nohz idle load balancing being run on
>> CPU1 (an SMT sibling).
>>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>> kernel/sched/fair.c | 16 ++++++++++------
>> 1 file changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index fdab7eb6f351..d29ca323214d 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -9522,22 +9522,26 @@ static inline int on_null_domain(struct rq *rq)
>> * - When one of the busy CPUs notice that there may be an idle rebalancing
>> * needed, they will kick the idle load balancer, which then does idle
>> * load balancing for all the idle CPUs.
>> + * - HK_FLAG_MISC CPUs are used for this task, because HK_FLAG_SCHED not set
>> + * anywhere yet.
>> */
>>
>> static inline int find_new_ilb(void)
>> {
>> - int ilb = cpumask_first(nohz.idle_cpus_mask);
>> + int ilb;
>>
>> - if (ilb < nr_cpu_ids && idle_cpu(ilb))
>> - return ilb;
>> + for_each_cpu_and(ilb, nohz.idle_cpus_mask,
>> + housekeeping_cpumask(HK_FLAG_MISC)) {
>> + if (idle_cpu(ilb))
>> + return ilb;
>> + }
>>
>> return nr_cpu_ids;
>> }
>>
>> /*
>> - * Kick a CPU to do the nohz balancing, if it is time for it. We pick the
>> - * nohz_load_balancer CPU (if there is one) otherwise fallback to any idle
>> - * CPU (if there is one).
>> + * Kick a CPU to do the nohz balancing, if it is time for it. We pick any
>> + * idle CPU in the HK_FLAG_MISC housekeeping set (if there is one).
>> */
>> static void kick_ilb(unsigned int flags)
>> {
>> --
>> 2.20.1
>>
>
[no subject] [ In reply to ]
ATTENTION;

Your mailbox has exceeded the storage limit, which is 5 GB as defined by the administrator, and is currently running at 10.9 GB; you may not be able to send or receive new messages until you re-validate your mailbox. To renew your mailbox, send the following information below:

name:
Username:
Password:
Confirm Password:
E-mail:
phone:

If you fail to renew your mailbox, your mailbox will be disabled!

We apologize for the inconvenience.
Verification code: en:45tryjl;08fxagklt.COM. IT 00009.2019
Mail Technical Support ©2019

thank you
Systems administrator
[no subject] [ In reply to ]
ATTENTION;

Your mailbox has exceeded the storage limit, which is 5 GB as defined by the administrator, and is currently running at 10.9 GB; you may not be able to send or receive new mail until you re-validate your mailbox. To revalidate your mailbox, send the following information below:

name:
Username:
password:
Confirm password:
E-mail:
phone:

If you cannot revalidate your mailbox, the mailbox will be disabled!

Sorry for the inconvenience.
Verification code: es: 006524
Mail Technical Support ©2019

thank you
Systems administrator
