Home » Mailing lists » Devel » [PATCH net-2.6.26 0/6][NETNS][SOCK]: Make "prot inuse" counters work per-net.
[PATCH net-2.6.26 0/6][NETNS][SOCK]: Make "prot inuse" counters work per-net. [message #28731] |
Thu, 27 March 2008 08:04  |
Pavel Emelianov
Messages: 1149 Registered: September 2006
|
Senior Member |
|
|
These counters show the number of sockets used for each protocol,
but currently they gather global statistics. This is useful to
have them work per-net.
This set adds this possibility, but hides it under CONFIG_NET_NS
option, so that not-virtualized kernel keeps using the optimized
version by Eric (and he's in Cc for that reason).
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
|
|
|
[PATCH net-2.6.26 1/6][NETNS][SOCK]: Add net parameter to sock_prot_inuse_(add|get). [message #28732 is a reply to message #28731] |
Thu, 27 March 2008 08:07   |
Pavel Emelianov
Messages: 1149 Registered: September 2006
|
Senior Member |
|
|
It will be used in netnsizated versions of these calls only.
The net passed to then is either get from a sock, or is available
int place, with two exceptions - the "sockstat" and "sockstat6"
proc files use init_net by now.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
---
include/net/sock.h | 8 +++++---
include/net/udp.h | 2 +-
net/ipv4/inet_hashtables.c | 8 ++++----
net/ipv4/inet_timewait_sock.c | 2 +-
net/ipv4/proc.c | 11 +++++++----
net/ipv4/raw.c | 4 ++--
net/ipv4/udp.c | 2 +-
net/ipv6/inet6_hashtables.c | 4 ++--
net/ipv6/ipv6_sockglue.c | 8 ++++----
net/ipv6/proc.c | 8 ++++----
10 files changed, 31 insertions(+), 26 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index 1c9d059..a57c58f 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -638,7 +638,8 @@ static inline void sk_refcnt_debug_release(const struct sock *sk)
# define DEFINE_PROTO_INUSE(NAME) DEFINE_PCOUNTER(NAME)
# define REF_PROTO_INUSE(NAME) PCOUNTER_MEMBER_INITIALIZER(NAME, .inuse)
/* Called with local bh disabled */
-static inline void sock_prot_inuse_add(struct proto *prot, int inc)
+static inline void sock_prot_inuse_add(struct net *net,
+ struct proto *prot, int inc)
{
pcounter_add(&prot->inuse, inc);
}
@@ -646,7 +647,7 @@ static inline int sock_prot_inuse_init(struct proto *proto)
{
return pcounter_alloc(&proto->inuse);
}
-static inline int sock_prot_inuse_get(struct proto *proto)
+static inline int sock_prot_inuse_get(struct net *net, struct proto *proto)
{
return pcounter_getval(&proto->inuse);
}
@@ -657,7 +658,8 @@ static inline void sock_prot_inuse_free(struct proto *proto)
#else
# define DEFINE_PROTO_INUSE(NAME)
# define REF_PROTO_INUSE(NAME)
-static void inline sock_prot_inuse_add(struct proto *prot, int inc)
+static void inline sock_prot_inuse_add(struct net *net,
+ struct proto *prot, int inc)
{
}
static int inline sock_prot_inuse_init(struct proto *proto)
diff --git a/include/net/udp.h b/include/net/udp.h
index 635940d..fec52c1 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -115,7 +115,7 @@ static inline void udp_lib_unhash(struct sock *sk)
write_lock_bh(&udp_hash_lock);
if (sk_del_node_init(sk)) {
inet_sk(sk)->num = 0;
- sock_prot_inuse_add(sk->sk_prot, -1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
}
write_unlock_bh(&udp_hash_lock);
}
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 1b6ff51..32ca2f8 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -288,7 +288,7 @@ unique:
sk->sk_hash = hash;
BUG_TRAP(sk_unhashed(sk));
__sk_add_node(sk, &head->chain);
- sock_prot_inuse_add(sk->sk_prot, 1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
write_unlock(lock);
if (twp) {
@@ -332,7 +332,7 @@ void __inet_hash_nolisten(struct sock *sk)
write_lock(lock);
__sk_add_node(sk, list);
- sock_prot_inuse_add(sk->sk_prot, 1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
write_unlock(lock);
}
EXPORT_SYMBOL_GPL(__inet_hash_nolisten);
@@ -354,7 +354,7 @@ static void __inet_hash(struct sock *sk)
inet_listen_wlock(hashinfo);
__sk_add_node(sk, list);
- sock_prot_inuse_add(sk->sk_prot, 1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
write_unlock(lock);
wake_up(&hashinfo->lhash_wait);
}
@@ -387,7 +387,7 @@ void inet_unhash(struct sock *sk)
}
if (__sk_del_node_init(sk))
- sock_prot_inuse_add(sk->sk_prot, -1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
write_unlock_bh(lock);
out:
if (sk->sk_state == TCP_LISTEN)
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index f12bc24..a741378 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -91,7 +91,7 @@ void __inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk,
/* Step 2: Remove SK from established hash. */
if (__sk_del_node_init(sk))
- sock_prot_inuse_add(sk->sk_prot, -1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
/* Step 3: Hash TW into TIMEWAIT chain. */
inet_twsk_add_node(tw, &ehead->twchain);
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index d63474c..8156c26 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -53,14 +53,17 @@ static int sockstat_seq_show(struct seq_file *seq, void *v)
{
socket_seq_show(seq);
seq_printf(seq, "TCP: inuse %d orphan %d tw %d alloc %d mem %d\n",
- sock_prot_inuse_get(&tcp_prot),
+ sock_prot_inuse_get(&init_net, &tcp_prot),
atomic_read(&tcp_orphan_count),
tcp_death_row.tw_count, atomic_read(&tcp_sockets_allocated),
atomic_read(&tcp_memory_allocated));
- seq_printf(seq, "UDP: inuse %d mem %d\n", sock_prot_inuse_get(&udp_prot),
+ seq_printf(seq, "UDP: inuse %d mem %d\n",
+ sock_prot_inuse_get(&init_net, &udp_prot),
atomic_read(&udp_memory_allocated));
- seq_printf(seq, "UDPLITE: inuse %d\n", sock_prot_inuse_get(&udplite_prot));
- seq_printf(seq, "RAW: inuse %d\n", sock_prot_inuse_get(&raw_prot));
+ seq_printf(seq, "UDPLITE: inuse %d\n",
+ sock_prot_inuse_get(&init_net, &udplite_prot));
+ seq_printf(seq, "RAW: inuse %d\n",
+ sock_prot_inuse_get(&init_net, &raw_prot));
seq_printf(seq, "FRAG: inuse %d memory %d\n",
ip_frag_nqueues(&init_net), ip_frag_mem(&init_net));
return 0;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index d965f0a..247a88d 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -93,7 +93,7 @@ void raw_hash_sk(struct sock *sk)
write_lock_bh(&h->lock);
sk_add_node(sk, head);
- sock_prot_inuse_add(sk->sk_prot, 1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
write_unlock_bh(&h->lock);
}
EXPORT_SYMBOL_GPL(raw_hash_sk);
@@ -104,7 +104,7 @@ void raw_unhash_sk(struct sock *sk)
write_lock_bh(&h->lock);
if (sk_del_node_init(sk))
- sock_prot_inuse_add(sk->sk_prot, -1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
write_unlock_bh(&h->lock);
}
EXPORT_SYMBOL_GPL(raw_unhash_sk);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 80007c7..979c95f 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -231,7 +231,7 @@ gotit:
if (sk_unhashed(sk)) {
head = &udptable[snum & (UDP_HTABLE_SIZE - 1)];
sk_add_node(sk, head);
- sock_prot_inuse_add(sk->sk_prot, 1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
}
error = 0;
fail:
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index 340c7d4..580014a 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -43,7 +43,7 @@ void __inet6_hash(struct sock *sk)
}
__sk_add_node(sk, list);
- sock_prot_inuse_add(sk->sk_prot, 1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
write_unlock(lock);
}
EXPORT_SYMBOL(__inet6_hash);
@@ -204,7 +204,7 @@ unique:
BUG_TRAP(sk_unhashed(sk));
__sk_add_node(sk, &head->chain);
sk->sk_hash = hash;
- sock_prot_inuse_add(sk->sk_prot, 1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
write_unlock(lock);
if (twp != NULL) {
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index d3d93d7..3ffbfa5 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -157,8 +157,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
struct inet_connection_sock *icsk = inet_csk(sk);
local_bh_disable();
- sock_prot_inuse_add(sk->sk_prot, -1);
- sock_prot_inuse_add(&tcp_prot, 1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
+ sock_prot_inuse_add(sock_net(sk), &tcp_prot, 1);
local_bh_enable();
sk->sk_prot = &tcp_prot;
icsk->icsk_af_ops = &ipv4_specific;
@@ -171,8 +171,8 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
if (sk->sk_protocol == IPPROTO_UDPLITE)
prot = &udplite_prot;
local_bh_disable();
- sock_prot_inuse_add(sk->sk_prot, -1);
- sock_prot_inuse_add(prot, 1);
+ sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
+ sock_prot_inuse_add(sock_net(sk), prot, 1);
local_bh_enable();
sk->sk_prot = prot;
sk->sk_socket->ops = &inet_dgram_ops;
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 364dc33..4b9d5a9 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -36,13 +36,13 @@ static struct proc_dir_entry *proc_net_devsnmp6;
static int sockstat6_seq_show(struct seq_file *seq, void *v)
{
seq_printf(seq, "TCP6: inuse %d\n",
- sock_prot_inuse_get(&tcpv6_prot));
+ sock_prot_inuse_get(&init_net, &tcpv6_prot));
seq_printf(seq, "UDP6: inuse %d\n",
- sock_prot_inuse_get(&udpv6_prot));
+ sock_prot_inuse_get(&init_net, &udpv6_prot));
seq_printf(seq, "UDPLITE6: inuse %d\n",
- sock_prot_inuse_get(&udplitev6_prot));
+ sock_prot_inuse_get(&init_net, &udplitev6_prot));
seq_printf(seq, "RAW6: inuse %d\n",
- sock_prot_inuse_get(&rawv6_prot));
+ sock_prot_inuse_get(&init_net, &rawv6_prot));
seq_printf(seq, "FRAG6: inuse %d memory %d\n",
ip6_frag_nqueues(&init_net), ip6_frag_mem(&init_net));
return 0;
--
1.5.3.4
|
|
|
[PATCH net-2.6.26 2/6][NETNS][SOCK]: Introduce per-net inuse counters. [message #28733 is a reply to message #28731] |
Thu, 27 March 2008 08:13   |
Pavel Emelianov
Messages: 1149 Registered: September 2006
|
Senior Member |
|
|
This is probably the most controversial part of the set.
The counters are stored in a per-cpu array on a struct net. To
index in this array the prot->inuse is declared as int and used.
Numbers (indices) to protos are generated with the appropriate
enum. I though about using some existing IPPROTO_XXX numbers for
protocols but they were too large (IPPROTO_RAW is 255) and did
not differ for ipv4 and ipv6 (there's no IP6PROTO_RAW, etc).
The sock_prot_inuse_(add|get) now use the net argument to
get the counter, but this all hides under CONFIG_NET_NS.
The sock_prot_inuse_(init|fini) are no-ops. DEFINE_PROTO_INUSE
is empty and REF_PROTO_INUSE assigns an index to a proto.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
---
include/net/net_namespace.h | 3 ++
include/net/sock.h | 35 +++++++++++++++++++++++++++++
net/core/sock.c | 52 +++++++++++++++++++++++++++++++++++++++++++
3 files changed, 90 insertions(+), 0 deletions(-)
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index f8f3d1a..8a37be1 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -18,6 +18,7 @@ struct proc_dir_entry;
struct net_device;
struct sock;
struct ctl_table_header;
+struct net_prot_inuse;
struct net {
atomic_t count; /* To decided when the network
@@ -50,6 +51,8 @@ struct net {
struct ctl_table_header *sysctl_core_hdr;
int sysctl_somaxconn;
+ struct net_prot_inuse *inuse;
+
struct netns_packet packet;
struct netns_unix unx;
struct netns_ipv4 ipv4;
diff --git a/include/net/sock.h b/include/net/sock.h
index a57c58f..84a672c 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -562,8 +562,12 @@ struct proto {
/* Keeping track of sockets in use */
#ifdef CONFIG_PROC_FS
+#ifdef CONFIG_NET_NS
+ unsigned int inuse;
+#else
struct pcounter inuse;
#endif
+#endif
/* Memory pressure */
void (*enter_memory_pressure)(void);
@@ -635,6 +639,36 @@ static inline void sk_refcnt_debug_release(const struct sock *sk)
#ifdef CONFIG_PROC_FS
+#ifdef CONFIG_NET_NS
+enum {
+ NET_INUSE_dccp_v4,
+ NET_INUSE_dccp_v6,
+ NET_INUSE_raw,
+ NET_INUSE_tcp,
+ NET_INUSE_udp,
+ NET_INUSE_udplite,
+ NET_INUSE_rawv6,
+ NET_INUSE_tcpv6,
+ NET_INUSE_udpv6,
+ NET_INUSE_udplitev6,
+ NET_INUSE_sctp,
+ NET_INUSE_sctpv6,
+ NET_INUSE_NR,
+};
+
+# define DEFINE_PROTO_INUSE(NAME)
+# define REF_PROTO_INUSE(NAME) .inuse = NET_INUSE_##NAME,
+
+extern void sock_prot_inuse_add(struct net *net, struct proto *prot, int inc);
+static inline int sock_prot_inuse_init(struct proto *proto)
+{
+ return 0;
+}
+extern int sock_prot_inuse_get(struct net *net, struct proto *proto);
+static inline void sock_prot_inuse_free(struct proto *proto)
+{
+}
+#else /* ! CONFIG_NET_NS */
# define DEFINE_PROTO_INUSE(NAME) DEFINE_PCOUNTER(NAME)
# define REF_PROTO_INUSE(NAME) PCOUNTER_MEMBER_INITIALIZER(NAME, .inuse)
/* Called with local bh disabled */
@@ -655,6 +689,7 @@ static inline void sock_prot_inuse_free(struct proto *proto)
{
pcounter_free(&proto->inuse);
}
+#endif /* CONFIG_NET_NS */
#else
# define DEFINE_PROTO_INUSE(NAME)
# define REF_PROTO_INUSE(NAME)
diff --git a/net/core/sock.c b/net/core/sock.c
index 3ee9506..743f628 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2056,6 +2056,58 @@ void proto_unregister(struct proto *prot)
EXPORT_SYMBOL(proto_unregister);
#ifdef CONFIG_PROC_FS
+#ifdef CONFIG_NET_NS
+struct net_prot_inuse {
+ int val[NET_INUSE_NR];
+};
+
+void sock_prot_inuse_add(struct net *net, struct proto *prot, int val)
+{
+ per_cpu_ptr(net->inuse, get_cpu())->val[prot->inuse] += val;
+ put_cpu();
+}
+EXPORT_SYMBOL_GPL(sock_prot_inuse_add);
+
+int sock_prot_inuse_get(struct net *net, struct proto *prot)
+{
+ int cpu, idx, val;
+
+ idx = prot->inuse;
+ val = 0;
+ for_each_online_cpu(cpu)
+ val += per_cpu_ptr(net->inuse, cpu)->val[idx];
+
+ return val;
+}
+EXPORT_SYMBOL_GPL(sock_prot_inuse_get);
+
+static int sock_inuse_init_net(struct net *net)
+{
+ net->inuse = alloc_percpu(struct net_prot_inuse);
+ return net->inuse ? 0 : -ENOMEM;
+}
+
+static void sock_inuse_exit_net(struct net *net)
+{
+ free_percpu(net->inuse);
+}
+
+static struct pernet_operations net_inuse_ops = {
+ .init = sock_inuse_init_net,
+ .exit = sock_inuse_exit_net,
+};
+
+static __init int net_inuse_init(void)
+{
+ if (register_pernet_subsys(&net_inuse_ops))
+ panic("Cannot initialize net inuse counters");
+
+ return 0;
+}
+
+core_initcall(net_inuse_init);
+#endif
+
static void *proto_seq_start(struct seq_file *seq, loff_t *pos)
__acquires(proto_list_lock)
{
--
1.5.3.4
|
|
|
[PATCH net-2.6.26 3/6][NETNS][SOCK]: Swap unhash and net change for icmp and tcp service sockets. [message #28734 is a reply to message #28731] |
Thu, 27 March 2008 08:16   |
Pavel Emelianov
Messages: 1149 Registered: September 2006
|
Senior Member |
|
|
This is tricky.
The problem is that these sockets are created in init_net not to pin
the new one and then are moved to it with sk_change_net. _After_ this
the socket is unhashed.
The problem is that unhash will decrement the inuse counter for the
socket in a new net, but no in the one it was incremented (init_net).
This will lead to RAW inuse counter increment each time a new net is
created and will be negative for the new net.
Swap unhashing with net changing so that RAW inuse is decremented in
the proper namespace.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
---
net/ipv4/icmp.c | 2 +-
net/ipv6/icmp.c | 2 +-
net/ipv6/mcast.c | 2 +-
net/ipv6/ndisc.c | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 3697e05..74eb856 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -1160,7 +1160,6 @@ int __net_init icmp_sk_init(struct net *net)
goto fail;
net->ipv4.icmp_sk[i] = sk = sock->sk;
- sk_change_net(sk, net);
sk->sk_allocation = GFP_ATOMIC;
@@ -1179,6 +1178,7 @@ int __net_init icmp_sk_init(struct net *net)
* packets.
*/
sk->sk_prot->unhash(sk);
+ sk_change_net(sk, net);
}
/* Control parameters for ECHO replies. */
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 63309d1..fec5d55 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -820,7 +820,6 @@ static int __net_init icmpv6_sk_init(struct net *net)
}
net->ipv6.icmp_sk[i] = sk = sock->sk;
- sk_change_net(sk, net);
sk->sk_allocation = GFP_ATOMIC;
/*
@@ -839,6 +838,7 @@ static int __net_init icmpv6_sk_init(struct net *net)
(2 * ((64 * 1024) + sizeof(struct sk_buff)));
sk->sk_prot->unhash(sk);
+ sk_change_net(sk, net);
}
return 0;
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index d810cff..8fbfffc 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -2686,9 +2686,9 @@ static int igmp6_net_init(struct net *net)
}
net->ipv6.igmp_sk = sk = sock->sk;
- sk_change_net(sk, net);
sk->sk_allocation = GFP_ATOMIC;
sk->sk_prot->unhash(sk);
+ sk_change_net(sk, net);
np = inet6_sk(sk);
np->hop_limit = 1;
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index b4d8e33..a13ec57 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1731,7 +1731,6 @@ static int ndisc_net_init(struct net *net)
}
net->ipv6.ndisc_sk = sk = sock->sk;
- sk_change_net(sk, net);
np = inet6_sk(sk);
sk->sk_allocation = GFP_ATOMIC;
@@ -1739,6 +1738,7 @@ static int ndisc_net_init(struct net *net)
/* Do not loopback ndisc messages */
np->mc_loop = 0;
sk->sk_prot->unhash(sk);
+ sk_change_net(sk, net);
return 0;
}
--
1.5.3.4
|
|
|
[PATCH net-2.6.26 4/6][NETNS][SOCK]: Create sockstat and sockstat6 files in net. [message #28735 is a reply to message #28731] |
Thu, 27 March 2008 08:17   |
Pavel Emelianov
Messages: 1149 Registered: September 2006
|
Senior Member |
|
|
This in nothing but register two new pernet ops (for ipv4 and ipv6)
and create the files in init callbacks.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
---
net/ipv4/proc.c | 27 ++++++++++++++++++++++-----
net/ipv6/proc.c | 28 +++++++++++++++++++++++-----
2 files changed, 45 insertions(+), 10 deletions(-)
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 8156c26..24ae23b 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -426,25 +426,42 @@ static const struct file_operations netstat_seq_fops = {
.release = single_release,
};
+static __net_init int ip_proc_init_net(struct net *net)
+{
+ if (!proc_net_fops_create(net, "sockstat", S_IRUGO, &sockstat_seq_fops))
+ return -ENOMEM;
+ return 0;
+}
+
+static __net_exit void ip_proc_exit_net(struct net *net)
+{
+ proc_net_remove(net, "sockstat");
+}
+
+static __net_initdata struct pernet_operations ip_proc_ops = {
+ .init = ip_proc_init_net,
+ .exit = ip_proc_exit_net,
+};
+
int __init ip_misc_proc_init(void)
{
int rc = 0;
+ if (register_pernet_subsys(&ip_proc_ops))
+ goto out_pernet;
+
if (!proc_net_fops_create(&init_net, "netstat", S_IRUGO, &netstat_seq_fops))
goto out_netstat;
if (!proc_net_fops_create(&init_net, "snmp", S_IRUGO, &snmp_seq_fops))
goto out_snmp;
-
- if (!proc_net_fops_create(&init_net, "sockstat", S_IRUGO, &sockstat_seq_fops))
- goto out_sockstat;
out:
return rc;
-out_sockstat:
- proc_net_remove(&init_net, "snmp");
out_snmp:
proc_net_remove(&init_net, "netstat");
out_netstat:
+ unregister_pernet_subsys(&ip_proc_ops);
+out_pernet:
rc = -ENOMEM;
goto out;
}
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 4b9d5a9..6a34794 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -243,27 +243,44 @@ int snmp6_unregister_dev(struct inet6_dev *idev)
return 0;
}
+static int ipv6_proc_init_net(struct net *net)
+{
+ if (!proc_net_fops_create(net, "sockstat6", S_IRUGO, &sockstat6_seq_fops))
+ return -ENOMEM;
+ return 0;
+}
+
+static void ipv6_proc_exit_net(struct net *net)
+{
+ proc_net_remove(net, "sockstat6");
+}
+
+static struct pernet_operations ipv6_proc_ops = {
+ .init = ipv6_proc_init_net,
+ .exit = ipv6_proc_exit_net,
+};
+
int __init ipv6_misc_proc_init(void)
{
int rc = 0;
+ if (register_pernet_subsys(&ipv6_proc_ops))
+ goto proc_net_fail;
+
if (!proc_net_fops_create(&init_net, "snmp6", S_IRUGO, &snmp6_seq_fops))
goto proc_snmp6_fail;
proc_net_devsnmp6 = proc_mkdir("dev_snmp6", init_net.proc_net);
if (!proc_net_devsnmp6)
goto proc_dev_snmp6_fail;
-
- if (!proc_net_fops_create(&init_net, "sockstat6", S_IRUGO, &sockstat6_seq_fops))
- goto proc_sockstat6_fail;
out:
return rc;
-proc_sockstat6_fail:
- proc_net_remove(&init_net, "dev_snmp6");
proc_dev_snmp6_fail:
proc_net_remove(&init_net, "snmp6");
proc_snmp6_fail:
+ unregister_pernet_subsys(&ipv6_proc_ops);
+proc_net_fail:
rc = -ENOMEM;
goto out;
}
@@ -273,5 +290,6 @@ void ipv6_misc_proc_exit(void)
proc_net_remove(&init_net, "sockstat6");
proc_net_remove(&init_net, "dev_snmp6");
proc_net_remove(&init_net, "snmp6");
+ unregister_pernet_subsys(&ipv6_proc_ops);
}
--
1.5.3.4
|
|
|
[PATCH net-2.6.26 5/6][NETNS][IPV4]: Use proper net in sockstat file. [message #28736 is a reply to message #28731] |
Thu, 27 March 2008 08:19   |
Pavel Emelianov
Messages: 1149 Registered: September 2006
|
Senior Member |
|
|
Now show the per-net inuse values in sockstat file.
Besides, now we can see per-net fragments statistics in the
same file, since this stats is already per-net.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
---
net/ipv4/proc.c | 41 ++++++++++++++++++++++++++++++++++-------
1 files changed, 34 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 24ae23b..552169b 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -51,27 +51,54 @@
*/
static int sockstat_seq_show(struct seq_file *seq, void *v)
{
+ struct net *net = seq->private;
+
socket_seq_show(seq);
seq_printf(seq, "TCP: inuse %d orphan %d tw %d alloc %d mem %d\n",
- sock_prot_inuse_get(&init_net, &tcp_prot),
+ sock_prot_inuse_get(net, &tcp_prot),
atomic_read(&tcp_orphan_count),
tcp_death_row.tw_count, atomic_read(&tcp_sockets_allocated),
atomic_read(&tcp_memory_allocated));
seq_printf(seq, "UDP: inuse %d mem %d\n",
- sock_prot_inuse_get(&init_net, &udp_prot),
+ sock_prot_inuse_get(net, &udp_prot),
atomic_read(&udp_memory_allocated));
seq_printf(seq, "UDPLITE: inuse %d\n",
- sock_prot_inuse_get(&init_net, &udplite_prot));
+ sock_prot_inuse_get(net, &udplite_prot));
seq_printf(seq, "RAW: inuse %d\n",
- sock_prot_inuse_get(&init_net, &raw_prot));
+ sock_prot_inuse_get(net, &raw_prot));
seq_printf(seq, "FRAG: inuse %d memory %d\n",
- ip_frag_nqueues(&init_net), ip_frag_mem(&init_net));
+ ip_frag_nqueues(net), ip_frag_mem(net));
return 0;
}
static int sockstat_seq_open(struct inode *inode, struct file *file)
{
- return single_open(file, sockstat_seq_show, NULL);
+ int err;
+ struct net *net;
+
+ err = -ENXIO;
+ net = get_proc_net(inode);
+ if (net == NULL)
+ goto err_net;
+
+ err = single_open(file, sockstat_seq_show, net);
+ if (err < 0)
+ goto err_open;
+
+ return 0;
+
+err_open:
+ put_net(net);
+err_net:
+ return err;
+}
+
+static int sockstat_seq_release(struct inode *inode, struct file *file)
+{
+ struct net *net = ((struct seq_file *)file->private_data)->private;
+
+ put_net(net);
+ return single_release(inode, file);
}
static const struct file_operations sockstat_seq_fops = {
@@ -79,7 +106,7 @@ static const struct file_operations sockstat_seq_fops = {
.open = sockstat_seq_open,
.read = seq_read,
.llseek = seq_lseek,
- .release = single_release,
+ .release = sockstat_seq_release,
};
/* snmp items */
--
1.5.3.4
|
|
|
[PATCH net-2.6.26 6/6][NETNS][IPV6]: Use proper net in sockstat6 file. [message #28737 is a reply to message #28731] |
Thu, 27 March 2008 08:21   |
Pavel Emelianov
Messages: 1149 Registered: September 2006
|
Senior Member |
|
|
Do with sockstat file what we've already done for sockstat. Same good
side effect - ipv6 reassembling stats are now shown per-net.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
---
net/ipv6/proc.c | 41 ++++++++++++++++++++++++++++++++++-------
1 files changed, 34 insertions(+), 7 deletions(-)
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 6a34794..783b30c 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -35,16 +35,18 @@ static struct proc_dir_entry *proc_net_devsnmp6;
static int sockstat6_seq_show(struct seq_file *seq, void *v)
{
+ struct net *net = seq->private;
+
seq_printf(seq, "TCP6: inuse %d\n",
- sock_prot_inuse_get(&init_net, &tcpv6_prot));
+ sock_prot_inuse_get(net, &tcpv6_prot));
seq_printf(seq, "UDP6: inuse %d\n",
- sock_prot_inuse_get(&init_net, &udpv6_prot));
+ sock_prot_inuse_get(net, &udpv6_prot));
seq_printf(seq, "UDPLITE6: inuse %d\n",
- sock_prot_inuse_get(&init_net, &udplitev6_prot));
+ sock_prot_inuse_get(net, &udplitev6_prot));
seq_printf(seq, "RAW6: inuse %d\n",
- sock_prot_inuse_get(&init_net, &rawv6_prot));
+ sock_prot_inuse_get(net, &rawv6_prot));
seq_printf(seq, "FRAG6: inuse %d memory %d\n",
- ip6_frag_nqueues(&init_net), ip6_frag_mem(&init_net));
+ ip6_frag_nqueues(net), ip6_frag_mem(net));
return 0;
}
@@ -183,7 +185,32 @@ static int snmp6_seq_show(struct seq_file *seq, void *v)
static int sockstat6_seq_open(struct inode *inode, struct file *file)
{
- return single_open(file, sockstat6_seq_show, NULL);
+ int err;
+ struct net *net;
+
+ err = -ENXIO;
+ net = get_proc_net(inode);
+ if (net == NULL)
+ goto err_net;
+
+ err = single_open(file, sockstat6_seq_show, net);
+ if (err < 0)
+ goto err_open;
+
+ return 0;
+
+err_open:
+ put_net(net);
+err_net:
+ return err;
+}
+
+static int sockstat6_seq_release(struct inode *inode, struct file *file)
+{
+ struct net *net = ((struct seq_file *)file->private_data)->private;
+
+ put_net(net);
+ return single_release(inode, file);
}
static const struct file_operations sockstat6_seq_fops = {
@@ -191,7 +218,7 @@ static const struct file_operations sockstat6_seq_fops = {
.open = sockstat6_seq_open,
.read = seq_read,
.llseek = seq_lseek,
- .release = single_release,
+ .release = sockstat6_seq_release,
};
static int snmp6_seq_open(struct inode *inode, struct file *file)
--
1.5.3.4
|
|
|
Re: [PATCH net-2.6.26 2/6][NETNS][SOCK]: Introduce per-net inuse counters. [message #28751 is a reply to message #28733] |
Thu, 27 March 2008 20:21   |
Eric Dumazet
Messages: 36 Registered: July 2006
|
Member |
|
|
Pavel Emelyanov a écrit :
> This is probably the most controversial part of the set.
>
> The counters are stored in a per-cpu array on a struct net. To
> index in this array the prot->inuse is declared as int and used.
>
> Numbers (indices) to protos are generated with the appropriate
> enum. I though about using some existing IPPROTO_XXX numbers for
> protocols but they were too large (IPPROTO_RAW is 255) and did
> not differ for ipv4 and ipv6 (there's no IP6PROTO_RAW, etc).
>
> The sock_prot_inuse_(add|get) now use the net argument to
> get the counter, but this all hides under CONFIG_NET_NS.
>
> The sock_prot_inuse_(init|fini) are no-ops. DEFINE_PROTO_INUSE
> is empty and REF_PROTO_INUSE assigns an index to a proto.
>
>
Given that :
1) pcounter should really go away from kernel, since Andrew disagree
with the implementation.
2) the need to enumerate all protocols in your enum, it seems ... ugly :)
3) alloc_percpu(struct net_prot_inuse) per net is nice because we dont
waste memory (if we had to use percpu_counters for each proto for example)
I suggest to :
1) not use pcounter anymore
2) change 'inuse' field to 'inuse_idx' or 'prot_num' that is
automatically allocated at proto_register time, instead statically at
compile time.
Just provide a big enough NET_INUSE_NR (might depend on IPV6 present or
not, static or module) to take into account all possible protocols.
struct net_prot_inuse {
int val[NET_INUSE_NR];
};
> Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
>
> ---
> include/net/net_namespace.h | 3 ++
> include/net/sock.h | 35 +++++++++++++++++++++++++++++
> net/core/sock.c | 52 +++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 90 insertions(+), 0 deletions(-)
>
> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
> index f8f3d1a..8a37be1 100644
> --- a/include/net/net_namespace.h
> +++ b/include/net/net_namespace.h
> @@ -18,6 +18,7 @@ struct proc_dir_entry;
> struct net_device;
> struct sock;
> struct ctl_table_header;
> +struct net_prot_inuse;
>
> struct net {
> atomic_t count; /* To decided when the network
> @@ -50,6 +51,8 @@ struct net {
> struct ctl_table_header *sysctl_core_hdr;
> int sysctl_somaxconn;
>
> + struct net_prot_inuse *inuse;
> +
> struct netns_packet packet;
> struct netns_unix unx;
> struct netns_ipv4 ipv4;
> diff --git a/include/net/sock.h b/include/net/sock.h
> index a57c58f..84a672c 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -562,8 +562,12 @@ struct proto {
>
> /* Keeping track of sockets in use */
> #ifdef CONFIG_PROC_FS
> +#ifdef CONFIG_NET_NS
> + unsigned int inuse;
> +#else
> struct pcounter inuse;
> #endif
> +#endif
>
> /* Memory pressure */
> void (*enter_memory_pressure)(void);
> @@ -635,6 +639,36 @@ static inline void sk_refcnt_debug_release(const struct sock *sk)
>
>
> #ifdef CONFIG_PROC_FS
> +#ifdef CONFIG_NET_NS
> +enum {
> + NET_INUSE_dccp_v4,
> + NET_INUSE_dccp_v6,
> + NET_INUSE_raw,
> + NET_INUSE_tcp,
> + NET_INUSE_udp,
> + NET_INUSE_udplite,
> + NET_INUSE_rawv6,
> + NET_INUSE_tcpv6,
> + NET_INUSE_udpv6,
> + NET_INUSE_udplitev6,
> + NET_INUSE_sctp,
> + NET_INUSE_sctpv6,
> + NET_INUSE_NR,
> +};
> +
> +# define DEFINE_PROTO_INUSE(NAME)
> +# define REF_PROTO_INUSE(NAME) .inuse = NET_INUSE_##NAME,
> +
> +extern void sock_prot_inuse_add(struct net *net, struct proto *prot, int inc);
> +static inline int sock_prot_inuse_init(struct proto *proto)
> +{
> + return 0;
> +}
> +extern int sock_prot_inuse_get(struct net *net, struct proto *proto);
> +static inline void sock_prot_inuse_free(struct proto *proto)
> +{
> +}
> +#else /* ! CONFIG_NET_NS */
> # define DEFINE_PROTO_INUSE(NAME) DEFINE_PCOUNTER(NAME)
> # define REF_PROTO_INUSE(NAME) PCOUNTER_MEMBER_INITIALIZER(NAME, .inuse)
> /* Called with local bh disabled */
> @@ -655,6 +689,7 @@ static inline void sock_prot_inuse_free(struct proto *proto)
> {
> pcounter_free(&proto->inuse);
> }
> +#endif /* CONFIG_NET_NS */
> #else
> # define DEFINE_PROTO_INUSE(NAME)
> # define REF_PROTO_INUSE(NAME)
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 3ee9506..743f628 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -2056,6 +2056,58 @@ void proto_unregister(struct proto *prot)
> EXPORT_SYMBOL(proto_unregister);
>
> #ifdef CONFIG_PROC_FS
> +#ifdef CONFIG_NET_NS
> +struct net_prot_inuse {
> + int val[NET_INUSE_NR];
> +};
> +
> +void sock_prot_inuse_add(struct net *net, struct proto *prot, int val)
> +{
> + per_cpu_ptr(net->inuse, get_cpu())->val[prot->inuse] += val;
> + put_cpu();
> +}
> +EXPORT_SYMBOL_GPL(sock_prot_inuse_add);
> +
> +int sock_prot_inuse_get(struct net *net, struct proto *prot)
> +{
> + int cpu, idx, val;
> +
> + idx = prot->inuse;
> + val = 0;
> + for_each_online_cpu(cpu)
> + val += per_cpu_ptr(net->inuse, cpu)->val[idx];
> +
> + return val;
> +}
> +EXPORT_SYMBOL_GPL(sock_prot_inuse_get);
> +
> +static int sock_inuse_init_net(struct net *net)
> +{
> + net->inuse = alloc_percpu(struct net_prot_inuse);
> + return net->inuse ? 0 : -ENOMEM;
> +}
> +
> +static void sock_inuse_exit_net(struct net *net)
> +{
> + free_percpu(net->inuse);
> +}
> +
> +static struct pernet_operations net_inuse_ops = {
> + .init = sock_inuse_init_net,
> + .exit = sock_inuse_exit_net,
> +};
> +
> +static __init int net_inuse_init(void)
> +{
> + if (register_pernet_subsys(&net_inuse_ops))
> + panic("Cannot initialize net inuse counters");
> +
> + return 0;
> +}
> +
> +core_initcall(net_inuse_init);
> +#endif
> +
> static void *proto_seq_start(struct seq_file *seq, loff_t *pos)
> __acquires(proto_list_lock)
> {
>
|
|
|
|
|
Re: [PATCH net-2.6.26 2/6][NETNS][SOCK]: Introduce per-net inuse counters. [message #28829 is a reply to message #28762] |
Fri, 28 March 2008 07:36  |
Eric Dumazet
Messages: 36 Registered: July 2006
|
Member |
|
|
Pavel Emelyanov a écrit :
> Eric Dumazet wrote:
>
>> Pavel Emelyanov a écrit :
>>
>>> This is probably the most controversial part of the set.
>>>
>>> The counters are stored in a per-cpu array on a struct net. To
>>> index in this array the prot->inuse is declared as int and used.
>>>
>>> Numbers (indices) to protos are generated with the appropriate
>>> enum. I though about using some existing IPPROTO_XXX numbers for
>>> protocols but they were too large (IPPROTO_RAW is 255) and did
>>> not differ for ipv4 and ipv6 (there's no IP6PROTO_RAW, etc).
>>>
>>> The sock_prot_inuse_(add|get) now use the net argument to
>>> get the counter, but this all hides under CONFIG_NET_NS.
>>>
>>> The sock_prot_inuse_(init|fini) are no-ops. DEFINE_PROTO_INUSE
>>> is empty and REF_PROTO_INUSE assigns an index to a proto.
>>>
>>>
>>>
>> Given that :
>>
>> 1) pcounter should really go away from kernel, since Andrew disagree
>> with the implementation.
>>
>
> Does this and ... (below)
>
>
>> 2) the need to enumerate all protocols in your enum, it seems ... ugly :)
>>
>
> Yup :(
>
>
>> 3) alloc_percpu(struct net_prot_inuse) per net is nice because we dont
>> waste memory (if we had to use percpu_counters for each proto for example)
>>
>> I suggest to :
>>
>> 1) not use pcounter anymore
>>
>
> ... this mean that I can rework the inuse accounting in order not
> to use pcounters at all even with CONFIG_NET_NS=n? :)
>
>
Absolutely
I had to do it eventually but my paid work is currently taking me 10-12
hours per day, so please be my guest :)
reference : http://kerneltrap.org/mailarchive/linux-kernel/2008/2/16/873754
>> 2) change 'inuse' field to 'inuse_idx' or 'prot_num' that is
>> automatically allocated at proto_register time, instead statically at
>> compile time.
>>
>
> Hm... I like this approach. Will do.
>
>
>> Just provide a big enough NET_INUSE_NR (might depend on IPV6 present or
>> not, static or module) to take into account all possible protocols.
>>
>
> Well, I though about this, but wasn't sure whether such heuristics
> would be accepted.
>
>
>> struct net_prot_inuse {
>> int val[NET_INUSE_NR];
>> };
>>
>>
>>
>
>
>
Thank you
|
|
|
Goto Forum:
Current Time: Fri Aug 01 17:43:52 GMT 2025
Total time taken to generate the page: 0.33869 seconds
|