Home » Mailing lists » Devel » Re: [patch 2/6] [Network namespace] Network device sharing by view
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4035] |
Tue, 27 June 2006 09:38  |
Andrey Savochkin
Messages: 47 Registered: December 2005
|
Member |
|
|
On Tue, Jun 27, 2006 at 11:34:36AM +0200, Daniel Lezcano wrote:
> Andrey Savochkin wrote:
> > Daniel,
> >
> > On Mon, Jun 26, 2006 at 05:49:41PM +0200, Daniel Lezcano wrote:
> >
> >>>Then you lose the ability for each namespace to have its own routing entries.
> >>>Which implies that you'll have difficulties with devices that should exist
> >>>and be visible in one namespace only (like tunnels), as they require IP
> >>>addresses and route.
> >>
> >>I mean instead of having the route tables private to the namespace, the
> >>routes have the information to which namespace they are associated.
> >
> >
> > I think I understand what you're talking about: you want to make routing
> > responsible for determining destination namespace ID in addition to route
> > type (local, unicast etc), nexthop information, and so on. Right?
>
> Yes.
>
> >
> > My point is that if you make namespace tagging at routing time, and
> > your packets are being routed only once, you lose the ability
> > to have separate routing tables in each namespace.
>
> Right. What is the advantage of having separate the routing tables ?
Routing is everything.
For example, I want namespaces to have their private tunnel devices.
It means that namespaces should be allowed have private routes of local type,
private default routes, and so on...
Andrey
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4039 is a reply to message #4035] |
Tue, 27 June 2006 11:21   |
Daniel Lezcano
Messages: 417 Registered: June 2006
|
Senior Member |
|
|
>>>My point is that if you make namespace tagging at routing time, and
>>>your packets are being routed only once, you lose the ability
>>>to have separate routing tables in each namespace.
>>
>>Right. What is the advantage of having separate the routing tables ?
>
>
> Routing is everything.
> For example, I want namespaces to have their private tunnel devices.
> It means that namespaces should be allowed have private routes of local type,
> private default routes, and so on...
>
Ok, we are talking about the same things. We do it only in a different way:
* separate routing table :
namespace
|
\--- route_tables
|
\---routes
* tagged routing table :
route_tables
|
\---routes
|
\---namespace
When using routes private to the namespace, globally the logic of the ip
stack is not changed, it manipulates only differents variables. It is
more clean than tagging the route for the reasons mentioned by Eric.
When using route tagging, the logic is changed because when doing lookup
on the routes table which is global, the namespace is used to match the
route and make it visible.
I use the second method, because I think it is more effecient and reduce
the overhead. But the isolation is minimalist and only aims to avoid the
application using ressources outside of the container (aka namespace)
without taking care of the system. For example, I didn't take care of
network devices, because as far as see I can't imagine an administrator
wanting to change the network device name while there are hundred of
containers running. Concerning tunnel devices for example, they should
be created inside the container.
I think, private network ressources method is more elegant and involves
more network ressources, but there is probably a significant overhead
and some difficulties to have __lightweight__ container (aka application
container), make nfs working well, etc... I did some tests with tbench
and the loopback with the private namespace and there is roughly an
overhead of 4 % without the isolation since with the tagging method
there is 1 % with the isolation.
The network namespace aims the isolation for now, but the container
based on the namespaces will probably need checkpoint/restart and
migration ability. The migration is needed not only for servers but for
HPC jobs too.
So I don't know what level of isolation/virtualization is really needed
by users, what should be acceptable (strong isolation and overhead /
weak isolation and efficiency). I don't know if people wanting strong
isolation will not prefer Xen (cleary with much more overhead than your
patches ;) )
Regards
-- Daniel
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4040 is a reply to message #4039] |
Tue, 27 June 2006 11:52   |
ebiederm
Messages: 1354 Registered: February 2006
|
Senior Member |
|
|
Daniel Lezcano <dlezcano@fr.ibm.com> writes:
>>>>My point is that if you make namespace tagging at routing time, and
>>>>your packets are being routed only once, you lose the ability
>>>>to have separate routing tables in each namespace.
>>>
>>>Right. What is the advantage of having separate the routing tables ?
>> Routing is everything.
>> For example, I want namespaces to have their private tunnel devices.
>> It means that namespaces should be allowed have private routes of local type,
>> private default routes, and so on...
>>
>
> Ok, we are talking about the same things. We do it only in a different way:
>
> * separate routing table :
> namespace
> |
> \--- route_tables
> |
> \---routes
>
> * tagged routing table :
> route_tables
> |
> \---routes
> |
> \---namespace
There is a third possibility, that falls in between these two if local
communication is really the bottle neck.
We have the dst cache for caching routes and cache multiple transformations
that happen on a packet.
With a little extra knowledge it is possible to have the separate
routing tables but have special logic that recognizes the local tunnel
device that connects namespaces and have it look into the next
namespaces routes, and build up a complete stack of dst entries of
where the packet needs to go.
I keep forgetting about that possibility. But as long as everything
is done at the routing layer that should work.
> I use the second method, because I think it is more effecient and reduce the
> overhead. But the isolation is minimalist and only aims to avoid the application
> using ressources outside of the container (aka namespace) without taking care of
> the system. For example, I didn't take care of network devices, because as far
> as see I can't imagine an administrator wanting to change the network device
> name while there are hundred of containers running. Concerning tunnel devices
> for example, they should be created inside the container.
Inside the containers I want all network devices named eth0!
> I think, private network ressources method is more elegant and involves more
> network ressources, but there is probably a significant overhead and some
> difficulties to have __lightweight__ container (aka application container), make
> nfs working well, etc... I did some tests with tbench and the loopback with the
> private namespace and there is roughly an overhead of 4 % without the isolation
> since with the tagging method there is 1 % with the isolation.
The overhead went down?
> The network namespace aims the isolation for now, but the container based on the
> namespaces will probably need checkpoint/restart and migration ability. The
> migration is needed not only for servers but for HPC jobs too.
Yes.
> So I don't know what level of isolation/virtualization is really needed by
> users, what should be acceptable (strong isolation and overhead / weak isolation
> and efficiency). I don't know if people wanting strong isolation will not prefer
> Xen (cleary with much more overhead than your patches ;) )
We need a clean abstraction that optimizes well.
However local communication between containers is not what we should
benchmark. That can always be improved later. So long as the
performance is reasonable. What needs to be benchmarked is the
overhead of namespaces when connected to physical networking devices
and on their own local loopback, and comparing that to a kernel
without namespace support.
If we don't hurt that core case we have an implementation we can
merge. There are a lot of optimization opportunities for local
communications and we can do that after we have a correct and accepted
implementation. Anything else is optimizing too soon, and will
just be muddying the waters.
Eric
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4041 is a reply to message #4039] |
Tue, 27 June 2006 11:55   |
Andrey Savochkin
Messages: 47 Registered: December 2005
|
Member |
|
|
Daniel,
On Tue, Jun 27, 2006 at 01:21:02PM +0200, Daniel Lezcano wrote:
> >>>My point is that if you make namespace tagging at routing time, and
> >>>your packets are being routed only once, you lose the ability
> >>>to have separate routing tables in each namespace.
> >>
> >>Right. What is the advantage of having separate the routing tables ?
> >
> >
> > Routing is everything.
> > For example, I want namespaces to have their private tunnel devices.
> > It means that namespaces should be allowed have private routes of local type,
> > private default routes, and so on...
> >
>
> Ok, we are talking about the same things. We do it only in a different way:
We are not talking about the same things.
It isn't a technical thing whether route lookup is performed before or after
namespace change.
It is a fundamental question determining functionality of network namespaces.
We are talking about the capabilities namespaces provide.
Your proposal essentially denies namespaces to have their own tunnel or other
devices. There is no point in having a device inside a namespace if the
namespace owner can't route all or some specific outgoing packets through
that device. You don't allow system administrators to completely delegate
management of network configuration to namespace owners.
Andrey
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4047 is a reply to message #4040] |
Tue, 27 June 2006 16:02   |
Herbert Poetzl
Messages: 239 Registered: February 2006
|
Senior Member |
|
|
On Tue, Jun 27, 2006 at 05:52:52AM -0600, Eric W. Biederman wrote:
> Daniel Lezcano <dlezcano@fr.ibm.com> writes:
>
> >>>>My point is that if you make namespace tagging at routing time,
> >>>>and your packets are being routed only once, you lose the ability
> >>>>to have separate routing tables in each namespace.
> >>>
> >>>Right. What is the advantage of having separate the routing tables ?
> >> Routing is everything. For example, I want namespaces to have their
> >> private tunnel devices. It means that namespaces should be allowed
> >> have private routes of local type, private default routes, and so
> >> on...
> >>
> >
> > Ok, we are talking about the same things. We do it only in a different way:
> >
> > * separate routing table :
> > namespace
> > |
> > \--- route_tables
> > |
> > \---routes
> >
> > * tagged routing table :
> > route_tables
> > |
> > \---routes
> > |
> > \---namespace
>
> There is a third possibility, that falls in between these two if local
> communication is really the bottle neck.
>
> We have the dst cache for caching routes and cache multiple
> transformations that happen on a packet.
>
> With a little extra knowledge it is possible to have the separate
> routing tables but have special logic that recognizes the local
> tunnel device that connects namespaces and have it look into the next
> namespaces routes, and build up a complete stack of dst entries of
> where the packet needs to go.
>
> I keep forgetting about that possibility. But as long as everything is
> done at the routing layer that should work.
>
> > I use the second method, because I think it is more effecient and
> > reduce the overhead. But the isolation is minimalist and only aims
> > to avoid the application using ressources outside of the container
> > (aka namespace) without taking care of the system. For example, I
> > didn't take care of network devices, because as far as see I can't
> > imagine an administrator wanting to change the network device name
> > while there are hundred of containers running. Concerning tunnel
> > devices for example, they should be created inside the container.
>
> Inside the containers I want all network devices named eth0!
huh? even if there are two of them? also tun?
I think you meant, you want to be able to have eth0 in
_more_ than one guest where eth0 in a guest can also
be/use/relate to eth1 on the host, right?
> > I think, private network ressources method is more elegant
> > and involves more network ressources, but there is probably a
> > significant overhead and some difficulties to have __lightweight__
> > container (aka application container), make nfs working well,
> > etc... I did some tests with tbench and the loopback with the
> > private namespace and there is roughly an overhead of 4 % without
> > the isolation since with the tagging method there is 1 % with the
> > isolation.
>
> The overhead went down?
yes, this might actually happen, because the guest
has only to look at a certain subset of entries
but this needs a lot more testing, especially with
a lot of guests
> > The network namespace aims the isolation for now, but the container
> > based on the namespaces will probably need checkpoint/restart and
> > migration ability. The migration is needed not only for servers but
> > for HPC jobs too.
>
> Yes.
>
> > So I don't know what level of isolation/virtualization is really
> > needed by users, what should be acceptable (strong isolation and
> > overhead / weak isolation and efficiency). I don't know if people
> > wanting strong isolation will not prefer Xen (cleary with much more
> > overhead than your patches ;) )
well, Xen claims something below 2% IIRC, and would
be clearly the better choice if you want strict
separation with the complete functionality, especially
with hardware support
> We need a clean abstraction that optimizes well.
>
> However local communication between containers is not what we
> should benchmark. That can always be improved later. So long as
> the performance is reasonable. What needs to be benchmarked is the
> overhead of namespaces when connected to physical networking devices
> and on their own local loopback, and comparing that to a kernel
> without namespace support.
well, for me (obviously advocating the lightweight case)
it seems improtant that the following conditions are met:
- loopback traffic inside a guest is insignificantly
slower than on a normal system
- loopback traffic on the host is insignificantly
slower than on a normal system
- inter guest traffic is faster than on-wire traffic,
and should be withing a small tolerance of the
loopback case (as it really isn't different)
- network (on-wire) traffic should be as fast as without
the namespace (i.e. within 1% or so, better not really
measurable)
- all this should be true in a setup with a significant
number of guests, when only one guest is active, but
all other guests are ready/configured
- all this should scale well with a few hundred guests
> If we don't hurt that core case we have an implementation we can
> merge. There are a lot of optimization opportunities for local
> communications and we can do that after we have a correct and accepted
> implementation. Anything else is optimizing too soon, and will
> just be muddying the waters.
what I fear is that once something is in, the kernel will
just become slower (as it already did in some areas) and
nobody will care/be-able to fix that later on ...
best,
Herbert
> Eric
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4054 is a reply to message #4047] |
Tue, 27 June 2006 16:47   |
ebiederm
Messages: 1354 Registered: February 2006
|
Senior Member |
|
|
Herbert Poetzl <herbert@13thfloor.at> writes:
> On Tue, Jun 27, 2006 at 05:52:52AM -0600, Eric W. Biederman wrote:
>>
>> Inside the containers I want all network devices named eth0!
>
> huh? even if there are two of them? also tun?
>
> I think you meant, you want to be able to have eth0 in
> _more_ than one guest where eth0 in a guest can also
> be/use/relate to eth1 on the host, right?
Right I want to have an eth0 in each guest where eth0 is
it's own network device and need have no relationship to
eth0 on the host.
>> We need a clean abstraction that optimizes well.
>>
>> However local communication between containers is not what we
>> should benchmark. That can always be improved later. So long as
>> the performance is reasonable. What needs to be benchmarked is the
>> overhead of namespaces when connected to physical networking devices
>> and on their own local loopback, and comparing that to a kernel
>> without namespace support.
>
> well, for me (obviously advocating the lightweight case)
> it seems improtant that the following conditions are met:
>
> - loopback traffic inside a guest is insignificantly
> slower than on a normal system
>
> - loopback traffic on the host is insignificantly
> slower than on a normal system
>
> - inter guest traffic is faster than on-wire traffic,
> and should be withing a small tolerance of the
> loopback case (as it really isn't different)
>
> - network (on-wire) traffic should be as fast as without
> the namespace (i.e. within 1% or so, better not really
> measurable)
>
> - all this should be true in a setup with a significant
> number of guests, when only one guest is active, but
> all other guests are ready/configured
>
> - all this should scale well with a few hundred guests
Ultimately I agree. However. Only host performance should be
a merge blocker. Allowing us to go back and reclaim the few
percentage points we lost later.
>> If we don't hurt that core case we have an implementation we can
>> merge. There are a lot of optimization opportunities for local
>> communications and we can do that after we have a correct and accepted
>> implementation. Anything else is optimizing too soon, and will
>> just be muddying the waters.
>
> what I fear is that once something is in, the kernel will
> just become slower (as it already did in some areas) and
> nobody will care/be-able to fix that later on ...
If nobody cares it doesn't matter.
If no one can fix it that is a problem. Which is why we need
high standards and clean code, not early optimizations.
But on that front each step of the way must be justified on
it's own merits. Not because it will give us some holy grail.
The way to keep the inter guest performance from degrading is
to measure it an complain. But the linux network stack is too
big to get in one pass.
Eric
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4055 is a reply to message #4047] |
Tue, 27 June 2006 16:49   |
Alexey Kuznetsov
Messages: 18 Registered: February 2006
|
Junior Member |
|
|
On Tue, Jun 27, 2006 at 06:02:42PM +0200, Herbert Poetzl wrote:
> - loopback traffic inside a guest is insignificantly
> slower than on a normal system
>
> - loopback traffic on the host is insignificantly
> slower than on a normal system
>
> - inter guest traffic is faster than on-wire traffic,
> and should be withing a small tolerance of the
> loopback case (as it really isn't different)
I do not follow what are you people arguing about?
Intra-guest, guest-guest and host-guest paths have _no_ differences
from host-host loopback. Only the device is different:
* virtual loopback for intra-guest
* virtual interface for guest-guest and host-guest
But the work is exactly the same, only the place where packets
looped back is different. How could this be issue to break a lance over? :-)
Alexey
PS. The only thing, which I can imagine is "optimized" out ip_route_input()
in the case of loopback. But this optimization was an obvious design mistake
(mine, sorry) and apparently will die together with removal of current
deficiences of routing cache. Actually, it is one of deficiences.
|
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4063 is a reply to message #4057] |
Tue, 27 June 2006 22:52   |
Herbert Poetzl
Messages: 239 Registered: February 2006
|
Senior Member |
|
|
On Tue, Jun 27, 2006 at 10:19:23AM -0700, Ben Greear wrote:
> Eric W. Biederman wrote:
> >Herbert Poetzl <herbert@13thfloor.at> writes:
> >
> >
> >>On Tue, Jun 27, 2006 at 05:52:52AM -0600, Eric W. Biederman wrote:
> >>
> >>>Inside the containers I want all network devices named eth0!
> >>
> >>huh? even if there are two of them? also tun?
> >>
> >>I think you meant, you want to be able to have eth0 in
> >>_more_ than one guest where eth0 in a guest can also
> >>be/use/relate to eth1 on the host, right?
> >
> >
> >Right I want to have an eth0 in each guest where eth0 is
> >it's own network device and need have no relationship to
> >eth0 on the host.
>
> How does that help anything? Do you envision programs
> that make special decisions on whether the interface is
> called eth0 v/s eth151?
well, those poor folks who do not have ethernet
devices for networking :)
seriously, what I think Eric meant was that it
might be nice (especially for migration purposes)
to keep the device namespace completely virtualized
and not just isolated ...
I'm fine with that, as long as it does not add
overhead or complicate handling, and as far as I
can tell, it should not do that ...
best,
Herbert
> Ben
>
>
> --
> Ben Greear <greearb@candelatech.com>
> Candela Technologies Inc http://www.candelatech.com
|
|
|
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4088 is a reply to message #4067] |
Wed, 28 June 2006 13:36   |
Herbert Poetzl
Messages: 239 Registered: February 2006
|
Senior Member |
|
|
On Tue, Jun 27, 2006 at 09:38:14PM -0600, Eric W. Biederman wrote:
> Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> writes:
>
> > Hello!
> >
> >> It may look weird, but do application really *need* to see eth0 rather
> >> than eth858354?
> >
> > Applications do not care, humans do. :-)
> >
> > What's about applications they just need to see exactly the same
> > device after migration. Not only name, but f.e. also its ifindex.
> > If you do not create a separate namespace for netdevices, you will
> > inevitably end up with some strange hack sort of VPIDs to translate
> > (or to partition) ifindices or to tell that "ping -I eth858354 xxx"
> > is too coimplicated application to survive migration.
>
>
> Actually there are applications with peculiar licensing practices that
> do look at devices like eth0 to verify you have the appropriate mac, and
> do really weird things if you don't have an eth0.
>
> Plus there are other cases where it can be simpler to hard code things
> if it is allowable. (The human factor) Otherwise your configuration
> must be done through hotplug scripts.
>
> But yes there are misguided applications that care.
last time I pointed to such 'misguided' apps which
made assumptions that are not necessarily true
inside a virtual environment (e.g. pstree, initpid)
the general? position was that those apps should
be fixed instead adding a 'workaround'
note: personally I'm absolutely not against virtualizing
the device names so that each guest can have a separate
name space for devices, but there should be a way to
'see' _and_ 'identify' the interfaces from outside
(i.e. host or spectator context)
best,
Herbert
> Eric
|
|
|
|
|
|
|
strict isolation of net interfaces [message #4147 is a reply to message #4146] |
Thu, 29 June 2006 22:14   |
Cedric Le Goater
Messages: 443 Registered: February 2006
|
Senior Member |
|
|
Sam Vilain wrote:
> jamal wrote:
>>> note: personally I'm absolutely not against virtualizing
>>> the device names so that each guest can have a separate
>>> name space for devices, but there should be a way to
>>> 'see' _and_ 'identify' the interfaces from outside
>>> (i.e. host or spectator context)
>>>
>>>
>> Makes sense for the host side to have naming convention tied
>> to the guest. Example as a prefix: guest0-eth0. Would it not
>> be interesting to have the host also manage these interfaces
>> via standard tools like ip or ifconfig etc? i.e if i admin up
>> guest0-eth0, then the user in guest0 will see its eth0 going
>> up.
>
> That particular convention only works if you have network namespaces and
> UTS namespaces tightly bound. We plan to have them separate - so for
> that to work, each network namespace could have an arbitrary "prefix"
> that determines what the interface name will look like from the outside
> when combined. We'd have to be careful about length limits.
>
> And guest0-eth0 doesn't necessarily make sense; it's not really an
> ethernet interface, more like a tun or something.
>
> So, an equally good convention might be to use sequential prefixes on
> the host, like "tun", "dummy", or a new prefix - then a property of that
> is what the name of the interface is perceived to be to those who are in
> the corresponding network namespace.
>
> Then the pragmatic question becomes how to correlate what you see from
> `ip addr list' to guests.
we could work on virtualizing the net interfaces in the host, map them to
eth0 or something in the guest and let the guest handle upper network layers ?
lo0 would just be exposed relying on skbuff tagging to discriminate traffic
between guests.
host | guest 0 | guest 1 | guest2
----------------------+-----------+-----------+------------- -
| | | |
|-> l0 <-------+-> lo0 ... | lo0 | lo0
| | | |
|-> bar0 <--------+-> eth0 | |
| | | |
|-> foo0 <--------+-----------+-----------+-> eth0
| | | |
`-> foo0:1 <-------+-----------+-> eth0 |
| | |
is that clear ? stupid ? reinventing the wheel ?
thanks,
C.
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4148 is a reply to message #4146] |
Fri, 30 June 2006 00:15   |
jamal
Messages: 12 Registered: June 2006
|
Junior Member |
|
|
On Fri, 2006-30-06 at 09:07 +1200, Sam Vilain wrote:
> jamal wrote:
> > Makes sense for the host side to have naming convention tied
> > to the guest. Example as a prefix: guest0-eth0. Would it not
> > be interesting to have the host also manage these interfaces
> > via standard tools like ip or ifconfig etc? i.e if i admin up
> > guest0-eth0, then the user in guest0 will see its eth0 going
> > up.
>
> That particular convention only works if you have network namespaces and
> UTS namespaces tightly bound.
that would be one approach. Another less sophisticated approach is to
have no binding whatsoever, rather some translation table to map two
unrelated devices.
> We plan to have them separate - so for
> that to work, each network namespace could have an arbitrary "prefix"
> that determines what the interface name will look like from the outside
> when combined. We'd have to be careful about length limits.
>
> And guest0-eth0 doesn't necessarily make sense; it's not really an
> ethernet interface, more like a tun or something.
>
it wouldnt quiet fit as a tun device. More like a mirror side of the
guest eth0 created on the host side
i.e a sort of passthrough device with one side visible on the host (send
from guest0-eth0 is received on eth0 in the guest and vice-versa).
Note this is radically different from what i have heard Andrey and co
talk about and i dont wanna disturb any shit because there seems to be
some agreement. But if you address me i respond because it is very
interesting a topic;->
> So, an equally good convention might be to use sequential prefixes on
> the host, like "tun", "dummy", or a new prefix - then a property of that
> is what the name of the interface is perceived to be to those who are in
> the corresponding network namespace.
>
> Then the pragmatic question becomes how to correlate what you see from
> `ip addr list' to guests.
on the host ip addr and the one seen on the guest side are the same.
Except one is seen (on the host) on guest0-eth0 and another is seen
on eth0 (on guest).
Anyways, ignore what i am saying if it is disrupting the discussion.
cheers,
jamal
|
|
|
|
Re: strict isolation of net interfaces [message #4152 is a reply to message #4147] |
Fri, 30 June 2006 02:39   |
serue
Messages: 750 Registered: February 2006
|
Senior Member |
|
|
Quoting Cedric Le Goater (clg@fr.ibm.com):
> Sam Vilain wrote:
> > jamal wrote:
> >>> note: personally I'm absolutely not against virtualizing
> >>> the device names so that each guest can have a separate
> >>> name space for devices, but there should be a way to
> >>> 'see' _and_ 'identify' the interfaces from outside
> >>> (i.e. host or spectator context)
> >>>
> >>>
> >> Makes sense for the host side to have naming convention tied
> >> to the guest. Example as a prefix: guest0-eth0. Would it not
> >> be interesting to have the host also manage these interfaces
> >> via standard tools like ip or ifconfig etc? i.e if i admin up
> >> guest0-eth0, then the user in guest0 will see its eth0 going
> >> up.
> >
> > That particular convention only works if you have network namespaces and
> > UTS namespaces tightly bound. We plan to have them separate - so for
> > that to work, each network namespace could have an arbitrary "prefix"
> > that determines what the interface name will look like from the outside
> > when combined. We'd have to be careful about length limits.
> >
> > And guest0-eth0 doesn't necessarily make sense; it's not really an
> > ethernet interface, more like a tun or something.
> >
> > So, an equally good convention might be to use sequential prefixes on
> > the host, like "tun", "dummy", or a new prefix - then a property of that
> > is what the name of the interface is perceived to be to those who are in
> > the corresponding network namespace.
> >
> > Then the pragmatic question becomes how to correlate what you see from
> > `ip addr list' to guests.
>
>
> we could work on virtualizing the net interfaces in the host, map them to
> eth0 or something in the guest and let the guest handle upper network layers ?
>
> lo0 would just be exposed relying on skbuff tagging to discriminate traffic
> between guests.
This seems to me the preferable way. We create a full virtual net
device for each new container, and fully virtualize the device
namespace.
> host | guest 0 | guest 1 | guest2
> ----------------------+-----------+-----------+------------- -
> | | | |
> |-> l0 <-------+-> lo0 ... | lo0 | lo0
> | | | |
> |-> bar0 <--------+-> eth0 | |
> | | | |
> |-> foo0 <--------+-----------+-----------+-> eth0
> | | | |
> `-> foo0:1 <-------+-----------+-> eth0 |
> | | |
>
>
> is that clear ? stupid ? reinventing the wheel ?
The last one in your diagram confuses me - why foo0:1? I would
have thought it'd be
host | guest 0 | guest 1 | guest2
----------------------+-----------+-----------+------------- -
| | | |
|-> l0 <-------+-> lo0 ... | lo0 | lo0
| | | |
|-> eth0 | | |
| | | |
|-> veth0 <--------+-> eth0 | |
| | | |
|-> veth1 <--------+-----------+-----------+-> eth0
| | | |
|-> veth2 <-------+-----------+-> eth0 |
I think we should avoid using device aliases, as trying to do
something like giving eth0:1 to guest1 and eth0:2 to guest2
while hiding eth0:1 from guest2 requires some uglier code (as
I recall) than working with full devices. In other words,
if a namespace can see eth0, and eth0:2 exists, it should always
see eth0:2.
So conceptually using a full virtual net device per container
certainly seems cleaner to me, and it seems like it should be
simpler by way of statistics gathering etc, but are there actually
any real gains? Or is the support for multiple IPs per device
actually enough?
Herbert, is this basically how ngnet is supposed to work?
-serge
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4153 is a reply to message #4148] |
Fri, 30 June 2006 03:35   |
Herbert Poetzl
Messages: 239 Registered: February 2006
|
Senior Member |
|
|
On Thu, Jun 29, 2006 at 08:15:52PM -0400, jamal wrote:
> On Fri, 2006-30-06 at 09:07 +1200, Sam Vilain wrote:
> > jamal wrote:
>
> > > Makes sense for the host side to have naming convention tied
> > > to the guest. Example as a prefix: guest0-eth0. Would it not
> > > be interesting to have the host also manage these interfaces
> > > via standard tools like ip or ifconfig etc? i.e if i admin up
> > > guest0-eth0, then the user in guest0 will see its eth0 going
> > > up.
> >
> > That particular convention only works if you have network namespaces
> > and UTS namespaces tightly bound.
>
> that would be one approach. Another less sophisticated approach is to
> have no binding whatsoever, rather some translation table to map two
> unrelated devices.
>
> > We plan to have them separate - so for
> > that to work, each network namespace could have an arbitrary
> > "prefix" that determines what the interface name will look like from
> > the outside when combined. We'd have to be careful about length
> > limits.
> >
> > And guest0-eth0 doesn't necessarily make sense; it's not really an
> > ethernet interface, more like a tun or something.
>
> it wouldnt quiet fit as a tun device. More like a mirror side of the
> guest eth0 created on the host side
> i.e a sort of passthrough device with one side visible on the host (send
> from guest0-eth0 is received on eth0 in the guest and vice-versa).
>
> Note this is radically different from what i have heard Andrey and co
> talk about and i dont wanna disturb any shit because there seems to be
> some agreement. But if you address me i respond because it is very
> interesting a topic;->
thing is, we have several things we should care about
and some of them 'look' or 'sound' similar, although
they are not really ... I'll try to clarify
first, we want to have 'per guest' interfaces, which
do not clash with any interfaces on the host or in
other guests
then, we want to 'connect' them, implicitely or
explicetly with 'other' interfaces or devices inside
other guests or on the host, here we have the following
cases (some are a little special):
- lo interface, guest and host private (by default)
- tap/tun interfaces, again host/guest private
- tun like interfaces between host and guests
- tun like interfaces between guests
- 'normal' interfaces mapped into guests
on the traffic side we have the following cases:
- local traffic on the host
- local traffic on the guest
- local traffic between host and guest
- local traffic between guests
- routed traffic from guest via host
- bridged traffic from guest via host
special cases here would be tun/tap traffic inside
a guest, but that can be considered local too
> > So, an equally good convention might be to use sequential prefixes
> > on the host, like "tun", "dummy", or a new prefix - then a property
> > of that is what the name of the interface is perceived to be to
> > those who are in the corresponding network namespace.
> >
> > Then the pragmatic question becomes how to correlate what you see
> > from `ip addr list' to guests.
>
> on the host ip addr and the one seen on the guest side are the same.
> Except one is seen (on the host) on guest0-eth0 and another is seen
> on eth0 (on guest).
this depends on the way the interfaces are handled
and how they actually work, means:
if the interfaces _solely_ work via routing or
bridging, then the 'host' end has to exist and be
visible similar to 'normal' interfaces
if the traffic is (magically) mapped from guest
interfaces to real (outside) host interfaces, we
might want the same view as the guest has
(i.e. basically a 'copy' which is not real)
> Anyways, ignore what i am saying if it is disrupting the discussion.
IMHO input is always welcome .. helps the folks to
do better thinking :)
> cheers,
> jamal
>
>
>
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4155 is a reply to message #4148] |
Fri, 30 June 2006 07:45   |
Andrey Savochkin
Messages: 47 Registered: December 2005
|
Member |
|
|
Hi Jamal,
On Thu, Jun 29, 2006 at 08:15:52PM -0400, jamal wrote:
> On Fri, 2006-30-06 at 09:07 +1200, Sam Vilain wrote:
[snip]
> > We plan to have them separate - so for
> > that to work, each network namespace could have an arbitrary "prefix"
> > that determines what the interface name will look like from the outside
> > when combined. We'd have to be careful about length limits.
> >
> > And guest0-eth0 doesn't necessarily make sense; it's not really an
> > ethernet interface, more like a tun or something.
> >
>
> it wouldnt quiet fit as a tun device. More like a mirror side of the
> guest eth0 created on the host side
> i.e a sort of passthrough device with one side visible on the host (send
> from guest0-eth0 is received on eth0 in the guest and vice-versa).
>
> Note this is radically different from what i have heard Andrey and co
> talk about and i dont wanna disturb any shit because there seems to be
> some agreement. But if you address me i respond because it is very
> interesting a topic;->
I do not have anything against guest-eth0 - eth0 pairs _if_ they are set up
by the host administrators explicitly for some purpose.
For example, if these guest-eth0 and eth0 devices stay as pure virtual ones,
i.e. they don't have any physical NIC, host administrator may route traffic
to guestXX-eth0 interfaces to pass it to the guests.
However, I oppose the idea of automatic mirroring of _all_ devices appearing
inside some namespaces ("guests") to another namespace (the "host").
This clearly goes against the concept of namespaces as independent realms,
and creates a lot of problems with applications running in the host, hotplug
scripts and so on.
>
> > So, an equally good convention might be to use sequential prefixes on
> > the host, like "tun", "dummy", or a new prefix - then a property of that
> > is what the name of the interface is perceived to be to those who are in
> > the corresponding network namespace.
> >
> > Then the pragmatic question becomes how to correlate what you see from
> > `ip addr list' to guests.
>
> on the host ip addr and the one seen on the guest side are the same.
> Except one is seen (on the host) on guest0-eth0 and another is seen
> on eth0 (on guest).
Then what to do if the host system has 10.0.0.1 as a private address on eth3,
and then interfaces guest1-tun0 and guest2-tun0 both get address 10.0.0.1
when each guest has added 10.0.0.1 to their tun0 device?
Regards,
Andrey
|
|
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4169 is a reply to message #4155] |
Fri, 30 June 2006 13:50   |
jamal
Messages: 12 Registered: June 2006
|
Junior Member |
|
|
Hi Andrey,
BTW - I was just looking at openvz, very impressive. To the other folks,
I am not putting down any of your approaches - just havent
had time to study them. Andrey, this is the same thing you guys have
been working on for a few years now, you just changed the name, correct?
Ok, since you guys are encouraging me to speak, here goes ;->
Hopefully this addresses the other email from Herbert et al.
On Fri, 2006-30-06 at 11:45 +0400, Andrey Savochkin wrote:
> Hi Jamal,
>
> On Thu, Jun 29, 2006 at 08:15:52PM -0400, jamal wrote:
> > On Fri, 2006-30-06 at 09:07 +1200, Sam Vilain wrote:
> [snip]
>
> I do not have anything against guest-eth0 - eth0 pairs _if_ they are set up
> by the host administrators explicitly for some purpose.
> For example, if these guest-eth0 and eth0 devices stay as pure virtual ones,
> i.e. they don't have any physical NIC, host administrator may route traffic
> to guestXX-eth0 interfaces to pass it to the guests.
>
Well there will be purely virtual of course. Something along the lines
for openvz:
// create the guest
[host-node]# vzctl create 101 --ostemplate fedora-core-5-minimal
// create guest101::eth0, seems to only create config to boot up with
[host-node]# vzctl create 101 --netdev eth0
// bootup guest101
[host-node]# vzctl start 101
As soon as bootup of guest101 happens, creating guest101::eth0 should activate
creation of the host side netdevice. This could be triggered for example by
the netlink event message seen on host whic- which is a result of creating guest101::eth0
Which means control sits purely in user space.
at that point if i do ifconfig on host i see g101-eth0
on guest101 i see just name eth0.
My earlier suggestion was that instead of:
host-node]# vzctl set 101 --ipadd 10.1.2.3
you do:
host-node]# ip addr add g101-eth0 10.1.2.3/32
you should still use vzctl to save config for next bootup
> However, I oppose the idea of automatic mirroring of _all_ devices appearing
> inside some namespaces ("guests") to another namespace (the "host").
> This clearly goes against the concept of namespaces as independent realms,
> and creates a lot of problems with applications running in the host, hotplug
> scripts and so on.
>
I was thinking that the host side is the master i.e you can peek at
namespaces in the guest from the host.
Also note that having the pass through device allows for guests to be
connected via standard linux schemes in the host side (bridge, point
routes, tc redirect etc); so you dont need a speacial device to hook
them together.
> > > Then the pragmatic question becomes how to correlate what you see from
> > > `ip addr list' to guests.
> >
> > on the host ip addr and the one seen on the guest side are the same.
> > Except one is seen (on the host) on guest0-eth0 and another is seen
> > on eth0 (on guest).
>
> Then what to do if the host system has 10.0.0.1 as a private address on eth3,
> and then interfaces guest1-tun0 and guest2-tun0 both get address 10.0.0.1
> when each guest has added 10.0.0.1 to their tun0 device?
>
Yes, that would be a conflict that needs to be resolved. If you look at
ip addresses as also belonging to namespaces, then it should work, no?
i am assuming a tag at the ifa table level.
cheers,
jamal
|
|
|
|
Re: [patch 2/6] [Network namespace] Network device sharing by view [message #4172 is a reply to message #4169] |
Fri, 30 June 2006 15:01   |
Andrey Savochkin
Messages: 47 Registered: December 2005
|
Member |
|
|
Jamal,
On Fri, Jun 30, 2006 at 09:50:52AM -0400, jamal wrote:
>
> BTW - I was just looking at openvz, very impressive. To the other folks,
Thanks!
> I am not putting down any of your approaches - just havent
> had time to study them. Andrey, this is the same thing you guys have
> been working on for a few years now, you just changed the name, correct?
The relations are more complicated than just the change of name,
but yes, OpenVZ represents the result of our work for a few years.
>
> Ok, since you guys are encouraging me to speak, here goes ;->
> Hopefully this addresses the other email from Herbert et al.
>
[snip]
> // create the guest
> [host-node]# vzctl create 101 --ostemplate fedora-core-5-minimal
> // create guest101::eth0, seems to only create config to boot up with
> [host-node]# vzctl create 101 --netdev eth0
> // bootup guest101
> [host-node]# vzctl start 101
>
> As soon as bootup of guest101 happens, creating guest101::eth0 should activate
> creation of the host side netdevice. This could be triggered for example by
> the netlink event message seen on host whic- which is a result of creating guest101::eth0
> Which means control sits purely in user space.
I'd like to clarify you idea: whether this host-side device is a real
device capable of receiving and transmitting packets (by moving them between
namespaces), or it's a fake device creating only a view of other namespace's
devices?
[snip]
> > However, I oppose the idea of automatic mirroring of _all_ devices appearing
> > inside some namespaces ("guests") to another namespace (the "host").
> > This clearly goes against the concept of namespaces as independent realms,
> > and creates a lot of problems with applications running in the host, hotplug
> > scripts and so on.
> >
>
> I was thinking that the host side is the master i.e you can peek at
> namespaces in the guest from the host.
"Host(master)-guest" relations is a valid and useful scheme.
However, I'm thinking about broader application of network namespaces,
when they can form an arbitrary tree and may not be in "host-guest" relations.
> Also note that having the pass through device allows for guests to be
> connected via standard linux schemes in the host side (bridge, point
> routes, tc redirect etc); so you dont need a speacial device to hook
> them together.
What do you mean under pass through device?
Do you mean using guest1-tun0 as a backdoor to talk to the guest?
>
> > > > Then the pragmatic question becomes how to correlate what you see from
> > > > `ip addr list' to guests.
> > >
> > > on the host ip addr and the one seen on the guest side are the same.
> > > Except one is seen (on the host) on guest0-eth0 and another is seen
> > > on eth0 (on guest).
> >
> > Then what to do if the host system has 10.0.0.1 as a private address on eth3,
> > and then interfaces guest1-tun0 and guest2-tun0 both get address 10.0.0.1
> > when each guest has added 10.0.0.1 to their tun0 device?
> >
>
> Yes, that would be a conflict that needs to be resolved. If you look at
> ip addresses as also belonging to namespaces, then it should work, no?
> i am assuming a tag at the ifa table level.
I'm not sure, it's complicated.
You wouldn't want automatic local routes to be added for IP addresses on
the host-side interfaces, right?
Do you expect these IP addresses to act as local addresses in other places,
like answering to arp requests about these IP on all physical devices?
But anyway, you'll have conflicts on the application level.
Many programs like ntpd, bind, and others fetch the device list using the
same ioctls as ifconfig, and make (un)intelligent decisions basing on what
they see.
Mirroring may have some advantages if I am both host and guest administrator.
But if I create a namespace for my friend Joe to play with IPv6 and sit
tunnels, why should I face inconveniences because of what he does there?
Best regards
Andrey
|
|
|
Re: strict isolation of net interfaces [message #4173 is a reply to message #4171] |
Fri, 30 June 2006 15:22   |
Daniel Lezcano
Messages: 417 Registered: June 2006
|
Senior Member |
|
|
Eric W. Biederman wrote:
> Daniel Lezcano <dlezcano@fr.ibm.com> writes:
>
>
>>Serge E. Hallyn wrote:
>>
>>>Quoting Cedric Le Goater (clg@fr.ibm.com):
>>>
>>>
>>>>we could work on virtualizing the net interfaces in the host, map them to
>>>>eth0 or something in the guest and let the guest handle upper network layers ?
>>>>
>>>>lo0 would just be exposed relying on skbuff tagging to discriminate traffic
>>>>between guests.
>>>
>>>This seems to me the preferable way. We create a full virtual net
>>>device for each new container, and fully virtualize the device
>>>namespace.
>>
>>I have a few questions about all the network isolation stuff:
>
It seems these questions are not important.
>
> So far I have seen two viable possibilities on the table,
> neither of them involve multiple names for a network device.
>
> layer 3 (filtering the allowed ip addresses at bind time roughly the current vserver).
> - implementable as a security hook.
> - Benefit no measurable performance impact.
> - Downside not many things we can do.
What things ? Can you develop please ? Can you give some examples ?
>
> layer 2 (What appears to applications a separate instance of the network stack).
> - Implementable as a namespace.
what about accessing a NFS mounted outside the container ?
> - Each network namespace would have dedicated network devices.
> - Benefit extremely flexible.
For what ? For who ? Do you have examples ?
> - Downside since at least the slow path must examine the packet
> it has the possibility of slowing down the networking stack.
What is/are the slow path(s) you identified ?
> For me the important characteristics.
> - Allows for application migration, when we take our ip address with us.
> In particular it allows for importation of addresses assignments
> mad on other machines.
Ok for the two methods no ?
> - No measurable impact on the existing networking when the code
> is compiled in.
You contradict ...
> - Clean predictable semantics.
What that means ? Can you explain, please ?
> This whole debate on network devices show up in multiple network namespaces
> is just silly.
The debate is not on the network device show up. The debate is can we
have a network isolation ___usable for everybody___ not only for the
beauty of having namespaces and for a system container like.
I am not against the network device virtualization or against the
namespaces. I am just asking if the namespace is the solution for all
the network isolation. Should we nest layer 2 and layer 3 vitualization
into namespaces or separate them in order to have the flexibility to
choose isolation/performance.
> The only reason for wanting that appears to be better management.
> We have deeper issues like can we do a reasonable implementation without a
> network device showing up in multiple namespaces.
Again, I am not against having the network device virtualization. It is
a good idea.
> I think the reason the debate exists at all is that it is a very approachable
> topic, as opposed to the fundamentals here.
>
> If we can get layer 2 level isolation working without measurable overhead
> with one namespace per device it may be worth revisiting things. Until
> then it is a side issue at best.
I agree, so where are the answers of the questions I asked in my
previous email ? You said you did some implementation of network
isolation with and without namespaces, so you should be able to answer...
-- Daniel
|
|
|
|
|
Re: strict isolation of net interfaces [message #4179 is a reply to message #4173] |
Fri, 30 June 2006 17:58   |
ebiederm
Messages: 1354 Registered: February 2006
|
Senior Member |
|
|
Daniel Lezcano <dlezcano@fr.ibm.com> writes:
> Eric W. Biederman wrote:
>> Daniel Lezcano <dlezcano@fr.ibm.com> writes:
>>
>>>Serge E. Hallyn wrote:
>>>
>>>>Quoting Cedric Le Goater (clg@fr.ibm.com):
>>>>
>>>>
>>>>>we could work on virtualizing the net interfaces in the host, map them to
>>>>>eth0 or something in the guest and let the guest handle upper network layers
> ?
>>>>>
>>>>>lo0 would just be exposed relying on skbuff tagging to discriminate traffic
>>>>>between guests.
>>>>
>>>>This seems to me the preferable way. We create a full virtual net
>>>>device for each new container, and fully virtualize the device
>>>>namespace.
>>>
>>>I have a few questions about all the network isolation stuff:
>>
>
> It seems these questions are not important.
I'm just trying to get us back to a productive topic.
>> So far I have seen two viable possibilities on the table,
>> neither of them involve multiple names for a network device.
>> layer 3 (filtering the allowed ip addresses at bind time roughly the current
>> vserver).
>> - implementable as a security hook.
>> - Benefit no measurable performance impact.
>> - Downside not many things we can do.
>
> What things ? Can you develop please ? Can you give some examples ?
DHCP, tcpdump,.. Probably a bad way of phrasing it. But there
is a lot more that we can do using a pure layer 2 approach.
>> layer 2 (What appears to applications a separate instance of the network
>> stack).
>> - Implementable as a namespace.
>
> what about accessing a NFS mounted outside the container ?
As I replied earlier it isn't a problem. If you get to it through the
filesystem namespace it uses the network namespace it was mounted with
for it's connection.
>> - Each network namespace would have dedicated network devices.
>> - Benefit extremely flexible.
>
> For what ? For who ? Do you have examples ?
See above.
>> - Downside since at least the slow path must examine the packet
>> it has the possibility of slowing down the networking stack.
>
> What is/are the slow path(s) you identified ?
Grr. I put that badly. Basically at least on the slow path you need to
look at a per network namespace data structure. The extra pointer
indirection could slow things down. The point is that we may be
able to have a fast path that is exactly the same as the rest
of the network stack.
If the obvious approach does not work my gut the feeling the
network stack fast path will give us an implementation without overhead.
>> For me the important characteristics.
>> - Allows for application migration, when we take our ip address with us.
>> In particular it allows for importation of addresses assignments
>> mad on other machines.
>
> Ok for the two methods no ?
So far.
>> - No measurable impact on the existing networking when the code
>> is compiled in.
>
> You contradict ...
How so? As far as I can tell this is a basic requirement to get
merged.
>> - Clean predictable semantics.
>
> What that means ? Can you explain, please ?
>> This whole debate on network devices show up in multiple network namespaces
>> is just silly.
>
> The debate is not on the network device show up. The debate is can we have a
> network isolation ___usable for everybody___ not only for the beauty of having
> namespaces and for a system container like.
This subthread talking about devices showing up in multiple namespaces seemed
very much exactly on how network devices show up.
> I am not against the network device virtualization or against the namespaces. I
> am just asking if the namespace is the solution for all the network
> isolation. Should we nest layer 2 and layer 3 vitualization into namespaces or
> separate them in order to have the flexibility to choose isolation/performance.
I believe I addressed Herbert Poetzl's concerns earlier. To me the question
is can we implement an acceptable layer 2 solution, that distrubutions and
other people who do not need isolation would have no problem compiling in
by default.
The joy of namespaces is that if you don't want it you don't have to use it.
Layer 2 can do everything and is likely usable by everyone iff the performance
is acceptable.
>> The only reason for wanting that appears to be better management.
>> We have deeper issues like can we do a reasonable implementation without a
>> network device showing up in multiple namespaces.
>
> Again, I am not against having the network device virtualization. It is a good
> idea.
>
>> I think the reason the debate exists at all is that it is a very approachable
>> topic, as opposed to the fundamentals here.
>> If we can get layer 2 level isolation working without measurable overhead
>> with one namespace per device it may be worth revisiting things. Until
>> then it is a side issue at best.
>
> I agree, so where are the answers of the questions I asked in my previous email
> ? You said you did some implementation of network isolation with and without
> namespaces, so you should be able to answer...
Sorry. More than anything those questions looked retorical and aimed
at disarming some of the silliness. I will go back and try and
answer those. Fundamentally when we have one namespace that includes
network devices, network sockets, and all of the data structures necessary
to use them (routing tables and the like) and we have a tunnel device
that can connect namespaces the answers are trivial and I though obvious.
Eric
|
|
|
|
|
|
|
Re: strict isolation of net interfaces [message #4234 is a reply to message #4157] |
Mon, 03 July 2006 13:36   |
Herbert Poetzl
Messages: 239 Registered: February 2006
|
Senior Member |
|
|
On Fri, Jun 30, 2006 at 10:56:13AM +0200, Cedric Le Goater wrote:
> Serge E. Hallyn wrote:
> >
> > The last one in your diagram confuses me - why foo0:1? I would
> > have thought it'd be
>
> just thinking aloud. I thought that any kind/type of interface could be
> mapped from host to guest.
>
> > host | guest 0 | guest 1 | guest2
> > ----------------------+-----------+-----------+------------- -
> > | | | |
> > |-> l0 <-------+-> lo0 ... | lo0 | lo0
> > | | | |
> > |-> eth0 | | |
> > | | | |
> > |-> veth0 <--------+-> eth0 | |
> > | | | |
> > |-> veth1 <--------+-----------+-----------+-> eth0
> > | | | |
> > |-> veth2 <-------+-----------+-> eth0 |
> >
> > I think we should avoid using device aliases, as trying to do
> > something like giving eth0:1 to guest1 and eth0:2 to guest2
> > while hiding eth0:1 from guest2 requires some uglier code (as
> > I recall) than working with full devices. In other words,
> > if a namespace can see eth0, and eth0:2 exists, it should always
> > see eth0:2.
> >
> > So conceptually using a full virtual net device per container
> > certainly seems cleaner to me, and it seems like it should be
> > simpler by way of statistics gathering etc, but are there actually
> > any real gains? Or is the support for multiple IPs per device
> > actually enough?
> >
> > Herbert, is this basically how ngnet is supposed to work?
hard to tell, we have at least three ngnet prototypes
and basically all variants are covered there, from
separate interfaces which map to real ones to perfect
isolation of addresses assigned to global interfaces
IMHO the 'virtual' interface per guest is fine, as
the overhead and consumed resources are non critical
and it will definitely simplify handling for the
guest side
I'd really appreciate if we could find a solution which
allows both, isolation and virtualization, and if the
bridge scenario is as fast as a direct mapping, I'm
perfectly fine with a big bridge + ebtables to handle
security issues
best,
Herbert
|
|
|
Re: strict isolation of net interfaces [message #4235 is a reply to message #4151] |
Mon, 03 July 2006 14:53   |
Andrey Savochkin
Messages: 47 Registered: December 2005
|
Member |
|
|
Sam, Serge, Cedric,
On Fri, Jun 30, 2006 at 02:49:05PM +1200, Sam Vilain wrote:
> Serge E. Hallyn wrote:
> > The last one in your diagram confuses me - why foo0:1? I would
> > have thought it'd be
> >
> > host | guest 0 | guest 1 | guest2
> > ----------------------+-----------+-----------+------------- -
> > | | | |
> > |-> l0 <-------+-> lo0 ... | lo0 | lo0
> > | | | |
> > |-> eth0 | | |
> > | | | |
> > |-> veth0 <--------+-> eth0 | |
> > | | | |
> > |-> veth1 <--------+-----------+-----------+-> eth0
> > | | | |
> > |-> veth2 <-------+-----------+-> eth0 |
> >
> > [...]
> >
> > So conceptually using a full virtual net device per container
> > certainly seems cleaner to me, and it seems like it should be
> > simpler by way of statistics gathering etc, but are there actually
> > any real gains? Or is the support for multiple IPs per device
> > actually enough?
> >
>
> Why special case loopback?
>
> Why not:
>
> host | guest 0 | guest 1 | guest2
> ----------------------+-----------+-----------+------------- -
> | | | |
> |-> lo | | |
> | | | |
> |-> vlo0 <---------+-> lo | |
> | | | |
> |-> vlo1 <---------+-----------+-----------+-> lo
> | | | |
> |-> vlo2 <--------+-----------+-> lo |
> | | | |
> |-> eth0 | | |
> | | | |
> |-> veth0 <--------+-> eth0 | |
> | | | |
> |-> veth1 <--------+-----------+-----------+-> eth0
> | | | |
> |-> veth2 <-------+-----------+-> eth0 |
I still can't completely understand your direction of thoughts.
Could you elaborate on IP address assignment in your diagram, please? For
example, guest0 wants 127.0.0.1 and 192.168.0.1 addresses on its lo
interface, and 10.1.1.1 on its eth0 interface.
Does this diagram assume any local IP addresses on v* interfaces in the
"host"?
And the second question.
Are vlo0, veth0, etc. devices supposed to have hard_xmit routines?
Best regards
Andrey
|
|
|
|
|
Goto Forum:
Current Time: Fri Oct 24 09:32:10 GMT 2025
Total time taken to generate the page: 0.12932 seconds
|