On Mon, Aug 01, 2011 at 05:31:06PM +0900, Isaku Yamahata wrote:
> [Added mst to Cc.]
>
> In order to use multi PCI domain, several areas need to be addressed
> in addition to this patch. For example, bios, acpi dsdt.
For x86, yes. For powerpc, which is what I'm working on, no.
> Do you have any plan for addressing those area?
No. AFAICT this won't make anything less working than it is now, and
is sufficient to be useful for the pseries machine.
> What's your motivation for multi pci domain?
Multiple PCI host bridges are typical on IBM pSeries (powerpc)
machines.

On Mon, Aug 01, 2011 at 01:10:38PM +0300, Michael S. Tsirkin wrote:
> On Mon, Aug 01, 2011 at 04:51:02PM +1000, David Gibson wrote:
> > qemu already almost supports PCI domains; that is, several entirely
> > independent PCI host bridges on the same machine. However, a bug in
> > pci_bus_new_inplace() means that every host bridge gets assigned domain
> > number zero and so can't be properly distinguished. This patch fixes the
> > bug, giving each new host bridge a new domain number.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>
> OK, but I'd like to see the whole picture.
> How does the guest detect multiple domains,
> and how does it access them?
For the pseries machine, which is what I'm concerned with, each host
bridge is advertised through the device tree passed to the guest.
That gives the necessary handles and addresses for accessing config
space and memory and IO windows for each host bridge.

On Mon, Aug 01, 2011 at 11:33:37PM +1000, David Gibson wrote:
> For the pseries machine, which is what I'm concerned with, each host
> bridge is advertised through the device tree passed to the guest.
Could you explain please?
What generates the device tree and passes it to the guest?
> That gives the necessary handles and addresses for accessing config
> space and memory and IO windows for each host bridge.
I see. I think maybe a global counter in the common code
is not exactly the best solution in the general case.
> --
> David Gibson                   | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
>                                | _way_ _around_!
> http://www.ozlabs.org/~dgibson

On Mon, Aug 01, 2011 at 05:03:18PM +0300, Michael S. Tsirkin wrote:
> Could you explain please?
> What generates the device tree and passes it to the guest?
In the case of the pseries machine, it is generated from hw/spapr.c
and loaded into memory for use by the firmware and/or the kernel.
> > That gives the necessary handles and addresses for accessing config
> > space and memory and IO windows for each host bridge.
>
> I see. I think maybe a global counter in the common code
> is not exactly the best solution in the general case.
Well, which general case do you have in mind? Since by definition,
PCI domains are entirely independent from each other, domain numbers
are essentially arbitrary as long as they're unique - simply a
convention which makes it easier to describe which host bridge devices
belong on. I don't see an obvious approach which is better than a
global counter, or at least not one that doesn't involve a significant
rewrite of the PCI subsystem.
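
[Illustration only, not taken from the patch: a minimal sketch of the
global-counter idea described above, assuming each host bridge simply
takes the next unused number when it is created. HostBridge,
next_domain and host_bridge_init() are hypothetical names and are not
qemu's actual structures or functions.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI host bridge; not qemu's PCIBus. */
typedef struct HostBridge {
    const char *name;
    int domain;
} HostBridge;

/* Next unused domain number; the first host bridge gets domain 0. */
static int next_domain;

/* Give a newly created host bridge the next unused domain number. */
static void host_bridge_init(HostBridge *hb, const char *name)
{
    assert(hb != NULL);
    hb->name = name;
    hb->domain = next_domain++;
}
```

The numbers come out unique, but they depend purely on creation order,
which is the property being debated in this thread.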

On Tue, Aug 02, 2011 at 12:15:22AM +1000, David Gibson wrote:
> Well, which general case do you have in mind? Since by definition,
> PCI domains are entirely independent from each other, domain numbers
> are essentially arbitrary as long as they're unique - simply a
> convention which makes it easier to describe which host bridge devices
> belong on. I don't see an obvious approach which is better than a
> global counter, or at least not one that doesn't involve a significant
> rewrite of the PCI subsystem.
OK, let's make sure I understand. On your system
'domain numbers' are completely invisible to the
guest, right? You only need them to address
devices on qemu monitor ...
For that, I'm trying to move away from using
a domain number.
Would it be possible to simply give bus an id,
and use bus=<id> instead?
BTW, how does a linux guest number domains?
Would it make sense to match that?
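
[Illustration only: a minimal sketch of what addressing a bus by a
string id rather than by a domain number could look like. Bus and
find_bus_by_id() are hypothetical names used for this example; they
are not qemu's existing lookup code.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bus record that carries only a user-visible id. */
typedef struct Bus {
    const char *id;
} Bus;

/* Return the bus whose id matches exactly, or NULL if there is none. */
static Bus *find_bus_by_id(Bus *buses, size_t n, const char *id)
{
    assert(id != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(buses[i].id, id) == 0) {
            return &buses[i];
        }
    }
    return NULL;
}
```

With a lookup of this shape, the id is whatever name the bus was given
at creation, so no numbering scheme is needed to address it.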

On Wed, Aug 03, 2011 at 04:28:33PM +0300, Michael S. Tsirkin wrote:
> OK, let's make sure I understand. On your system 'domain numbers'
> are completely invisible to the guest, right? You only need them to
> address devices on qemu monitor ...
Well.. the qemu domain number is not officially visible to the guest.
However the handles that are visible to the guest will need to be
derived from some sort of unique domain number.
> For that, I'm trying to move away from using a domain number. Would
> it be possible to simply give bus an id, and use bus=<id> instead?
It might be. In this case we should remove the domain numbers (as
used by pci_find_domain()) from qemu entirely, since they are broken
as they stand without this patch.
> BTW, how does a linux guest number domains?
> Would it make sense to match that?
I'll look into it. It would be nice to have them match, obviously, but
I'm not sure if there will be a way to do this that's both reasonable
and robust. I suspect they will match already though not in a
terribly robust way, at least for the pseries machine, because qemu
will create the host bridge nodes in the same order as domain number,
and I suspect Linux will just allocate domain numbers sequentially in
that same order.

On Thu, Aug 04, 2011 at 07:00:38PM +1000, David Gibson wrote:
> Well.. the qemu domain number is not officially visible to the guest.
> However the handles that are visible to the guest will need to be
> derived from some sort of unique domain number.
Interesting. How does it work with your patch?
> > For that, I'm trying to move away from using a domain number. Would
> > it be possible to simply give bus an id, and use bus=<id> instead?
>
> It might be. In this case we should remove the domain numbers (as
> used by pci_find_domain()) from qemu entirely,
Or at least, move them to acpi-specific code.
> since they are broken
I agree, they are broken.
> as they stand without this patch.
>
> > BTW, how does a linux guest number domains?
> > Would it make sense to match that?
>
> I'll look into it. It would be nice to have them match, obviously, but
> I'm not sure if there will be a way to do this that's both reasonable
> and robust. I suspect they will match already though not in a
> terribly robust way, at least for the pseries machine, because qemu
> will create the host bridge nodes in the same order as domain number,
> and I suspect Linux will just allocate domain numbers sequentially in
> that same order.
If the order of things in the tree matters for some guests, we should
give users a way to control that order, or at least make
the order robust.

On Thu, Aug 04, 2011 at 07:00:38PM +1000, David Gibson wrote:
> I'll look into it. It would be nice to have them match, obviously, but
> I'm not sure if there will be a way to do this that's both reasonable
> and robust. I suspect they will match already though not in a
> terribly robust way, at least for the pseries machine, because qemu
> will create the host bridge nodes in the same order as domain number,
> and I suspect Linux will just allocate domain numbers sequentially in
> that same order.
OK, so what's the plan at the moment?
How about we pass domain number from callers,
and make sure buses are enumerated in this order?
This will make sure linux enumerates them in
the same order.
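
[Illustration only: a minimal sketch of the caller-supplied variant
suggested above, assuming the host bridges are kept in a list sorted
by domain number so that walking the list visits them in that order.
HostBridge, host_bridges and host_bridge_register() are hypothetical
names, not existing qemu code. Whether a guest then numbers its
domains in the same order is outside what this sketch can show.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host bridge record, linked into a sorted list. */
typedef struct HostBridge {
    int domain;
    struct HostBridge *next;
} HostBridge;

/* Head of the list, kept sorted by ascending domain number. */
static HostBridge *host_bridges;

/*
 * Register a host bridge under a caller-chosen domain number.
 * Returns false, leaving the list unchanged, if the number is taken.
 */
static bool host_bridge_register(HostBridge *hb, int domain)
{
    HostBridge **p = &host_bridges;

    assert(hb != NULL);
    /* Skip every entry with a smaller domain number. */
    while (*p != NULL && (*p)->domain < domain) {
        p = &(*p)->next;
    }
    if (*p != NULL && (*p)->domain == domain) {
        return false; /* duplicate domain number */
    }
    hb->domain = domain;
    hb->next = *p;
    *p = hb;
    return true;
}
```

Because the caller picks the number, a duplicate has to be rejected
explicitly, which the global counter never needed to do.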

On Wed, Aug 10, 2011 at 11:34:23AM +0300, Michael S. Tsirkin wrote:
> OK, so what's the plan at the moment?
Well, you tell me...
> How about we pass domain number from callers,
From callers of what exactly?
> and make sure buses are enumerated in this order?
> This will make sure linux enumerates them in
> the same order.
I don't think we can do that in general. After all, the enumeration
order of domains is essentially a guest internal matter, which we can
only guess at.