Services built on top of the core networking functionality are described here. They include, for example, the IEEE 802.11 subsystem and the IPsec subsystem. Additionally, networking pseudo-devices and their operation are described.
The net80211 layer provides functionality required by wireless cards. It is located under sys/net80211. The code is meant to be shared between FreeBSD and NetBSD; NetBSD-specific bits should therefore be kept in the source file ieee80211_netbsd.c wherever possible (likewise, there is ieee80211_freebsd.c in FreeBSD).
The ieee80211 interfaces are documented in section 9 of the NetBSD manual pages. This document does not attempt to duplicate information already available there.
The responsibilities of the net80211 layer are the following:
MAC address based access control
crypto
input and output frame handling
node management
radiotap framework for bpf/tcpdump
rate adaptation
supplementary routines such as kernel diagnostic output, conversion functions and resource management
The ieee80211 layer positions itself logically between the device driver and the Ethernet module, although for transmission it is called indirectly by the device driver instead of control passing straight through it. For input, the ieee80211 layer receives packets from the device driver, strips any information useful only to wireless devices and, in the case of a data payload, hands the resulting Ethernet frame up to ether_input.
An ieee80211 device is described to the ieee80211 layer with a struct ieee80211com, declared in sys/net80211/ieee80211_var.h. The device driver uses it to register a device with the ieee80211 layer by calling ieee80211_ifattach. Fields to be filled out by the caller include the underlying struct ifnet pointer, function callbacks and device capability flags. If a device is detached, the ieee80211 layer can be notified with the call ieee80211_ifdetach.
A node represents another entity in the wireless network. It is usually a base station when operating in BSS mode, but can also represent entities in an ad-hoc network. A node is described by struct ieee80211_node, declared in sys/net80211/ieee80211_node.h. Examples of fields contained in the structure include the node unicast encryption key, current transmit power, the negotiated rate set and various statistics.
A list of all the nodes seen by a certain device is kept in the struct ieee80211com instance in the field ic_sta and can be manipulated with the helper functions provided in sys/net80211/ieee80211_node.c. The functions include, for example, methods to scan for nodes, iterate through the node list and functionality for maintaining the network structure.
Crypto support enables the encryption and decryption of the network frames. It provides a framework for multiple encryption methods such as WEP and null crypto. Crypto keys are mostly managed through the ioctl interface and inside the ieee80211 layer, and the only time drivers need to worry about them is in the send routine when they must test for an encapsulation requirement and call ieee80211_crypto_encap if necessary.
IPsec is a collection of security-related protocols: Authentication Header (AH) and Encapsulating Security Payload (ESP). This section, however, is TODO.
A networking pseudo-device does not have a physical hardware component backing the device. Pseudo-devices can be roughly divided into two categories: pseudo-devices which behave like an interface and devices which are controlled through a device node. An interface built on top of a pseudo-device behaves exactly like an interface that is backed by hardware.
Since there is no backing hardware involved, most of these interfaces can be created and destroyed dynamically at runtime. Interfaces that can dynamically allocate and deallocate themselves are known as cloning interfaces. They are created and destroyed with the ifconfig tool by using ifconfig create and ifconfig destroy, respectively. Additionally, the interface names available for cloning can be requested with ifconfig -C.
The list of networking pseudo-devices with short descriptions is presented below.
Table 5.1. Available networking pseudo-devices
name | description |
---|---|
bpfilter | Berkeley Packet Filter, bpf(4). Can be used to capture network traffic matching certain patterns. |
loop | The loopback network device, lo(4). All output is directed as input for the loopback interface. |
npf | NetBSD Packet Filter, npf(7). Used to filter IP traffic. |
ppp | Point-to-Point Protocol, ppp(4). This interface allows the creation of point-to-point network links. |
pppoe | Point-to-Point Protocol Over Ethernet, pppoe(4). Encapsulates PPP inside Ethernet frames. |
sl | Serial Line IP, sl(4). Used to transport IP over a serial connection. |
strip | STarmode Radio IP, strip(4). Similar to SLIP, except uses the STRIP protocol instead of SLIP. |
tap | A virtual ethernet device, tap(4). The tap[n] interface is attached to /dev/tap[n]. I/O on the device node maps to Ethernet traffic on the interface and vice versa. |
tun | Tunneling network device, tun(4). Similar to tap, except that the packets handled are network layer packets instead of Ethernet frames. |
gre | Encapsulating network device, gre(4). The gre device can encapsulate datagrams into IP packets in multiple formats, e.g. IP protocols 47 and 55, and tunnels the packets over an IP network. |
gif | The generic tunneling interface, gif(4). gif can encapsulate and tunnel IPv{4,6} over IPv{4,6}. |
faith | IPv6-to-IPv4 TCP relay interface, faith(4). Can be used, in conjunction with faithd, to relay TCPv6 traffic to IPv4 addresses. |
stf | Six To Four tunneling interface, stf(4). Tunnels IPv6 over IPv4, RFC 3056. |
vlan | IEEE 802.1Q Virtual LAN, vlan(4). Supports Virtual LAN interfaces which can be attached to physical interfaces and then used to send VLAN-tagged traffic. |
bridge | Bridging device, bridge(4). The bridging device is used to attach IEEE 802 networks together on the link layer. |
In the distant past, the number of pseudo-device instances for each device type was hardcoded into the kernel configuration and fixed at compile time. While this was the fastest method to implement, it was wasteful because it allocated resources based on a compile-time decision, and limiting because using the n+1'th device required recompiling the kernel. Cloning devices allow resources to be allocated and released dynamically at runtime.
This discussion will concentrate on cloning interfaces, i.e. cloners which are created using ifconfig.
Most of the work behind cloning an interface is handled by common code in net/if.c, in the if_clone family of functions. Cloning interfaces are registered and deregistered using if_clone_attach and if_clone_detach, respectively. These calls are usually made in the pseudo-device attach and detach routines. Both functions take as an argument a pointer to the if_clone structure, which identifies the cloning interface.
struct if_clone is initialized by using the IF_CLONE_INITIALIZER macro:
struct if_clone cloner = IF_CLONE_INITIALIZER("clonername", cloner_create, cloner_destroy);
The parameters cloner_create and cloner_destroy are pointers to functions to be called from the common code. Create is responsible for allocating resources and attaching the interface to the rest of the framework, while destroy is responsible for the opposite. Of course, destroy must preserve normal system semantics and not remove resources which are still in use. This is relevant, for example, with the tun device, where users open /dev/tun[n] when using tun interface n.
A create or destroy operation coming from userspace passes through sys_ioctl and soo_ioctl before landing at ifioctl. The correct struct if_clone is found by searching the names of the attached cloners, i.e. doing string comparison. After this the function pointers in the structure are used.
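The lookup described above can be modelled in userland as follows. This is only a sketch under stated assumptions: struct cloner, cloner_attach, cloner_lookup, clone_create and the "demo" cloner are hypothetical stand-ins invented for illustration, not the kernel's struct if_clone or the code in net/if.c. It assumes an interface name consists of a cloner name followed by a decimal unit number.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userland stand-in for the kernel's struct if_clone. */
struct cloner {
	const char *name;           /* cloner name, e.g. "demo" */
	int (*create)(int unit);    /* allocate and attach the given unit */
	int (*destroy)(int unit);   /* detach and release the given unit */
};

#define MAX_CLONERS 8
static struct cloner *cloners[MAX_CLONERS];
static size_t ncloners;

/* Register a cloner; returns 0 on success, -1 if the table is full. */
static int
cloner_attach(struct cloner *c)
{
	assert(c != NULL && c->name != NULL);
	if (ncloners == MAX_CLONERS)
		return -1;
	cloners[ncloners++] = c;
	return 0;
}

/*
 * Split a name such as "demo3" into the prefix "demo" and the unit 3,
 * then find the registered cloner whose name equals the prefix.
 * Returns NULL if the name is malformed or no cloner matches.
 */
static struct cloner *
cloner_lookup(const char *ifname, int *unit)
{
	size_t plen = 0, i;
	const char *p;
	int u = 0;

	assert(ifname != NULL && unit != NULL);
	while (ifname[plen] != '\0' &&
	    (ifname[plen] < '0' || ifname[plen] > '9'))
		plen++;
	if (plen == 0 || ifname[plen] == '\0')
		return NULL;        /* need both a name and a unit number */
	for (p = ifname + plen; *p != '\0'; p++) {
		if (*p < '0' || *p > '9' || u > 100000)
			return NULL;
		u = u * 10 + (*p - '0');
	}
	for (i = 0; i < ncloners; i++) {
		if (strlen(cloners[i]->name) == plen &&
		    strncmp(cloners[i]->name, ifname, plen) == 0) {
			*unit = u;
			return cloners[i];
		}
	}
	return NULL;
}

/* Dispatch a create request through the matching cloner, if any. */
static int
clone_create(const char *ifname)
{
	int unit;
	struct cloner *c = cloner_lookup(ifname, &unit);

	return c == NULL ? -1 : c->create(unit);
}

/* Demo cloner that only records which unit was last requested. */
static int last_created = -1;
static int demo_create(int unit) { last_created = unit; return 0; }
static int demo_destroy(int unit) { (void)unit; return 0; }
static struct cloner demo_cloner = { "demo", demo_create, demo_destroy };
```

After cloner_attach(&demo_cloner), a call such as clone_create("demo3") reaches demo_create with unit 3, while a name with no registered prefix is rejected.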
The operation and attachment to the rest of the operating system kernel for a pseudo device depend greatly on the device's functionality. The following are some examples:
The vlan interface is always attached to a parent device when operational. It sends packets by vlan encapsulating them and enqueueing them onto the parent device packet send queue. It receives packets from ether_input, removes the vlan encapsulation and passes them back to the parent interface's if_input routine.
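To make "vlan encapsulation" concrete, the following sketch inserts and removes an IEEE 802.1Q tag on a flat byte buffer. It illustrates the wire format only; the kernel operates on mbufs, and the function names vlan_encap and vlan_decap are hypothetical. The priority and DEI bits are assumed to be zero on transmit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN  6
#define VLAN_TAG_LEN    4       /* TPID (2 bytes) + TCI (2 bytes) */
#define ETHERTYPE_VLAN  0x8100  /* IEEE 802.1Q tag protocol identifier */

/*
 * Copy the frame "in" to "out" with an 802.1Q tag inserted after the
 * destination and source addresses. Returns the new length, or 0 if
 * the input is too short, the VLAN ID is out of range, or "out" is
 * too small.
 */
static size_t
vlan_encap(const uint8_t *in, size_t len, uint16_t vid,
    uint8_t *out, size_t outsz)
{
	const size_t addrs = 2 * ETHER_ADDR_LEN;

	assert(in != NULL && out != NULL);
	if (len < addrs + 2 || vid > 0x0fff || outsz < len + VLAN_TAG_LEN)
		return 0;
	memcpy(out, in, addrs);
	out[addrs + 0] = ETHERTYPE_VLAN >> 8;
	out[addrs + 1] = ETHERTYPE_VLAN & 0xff;
	out[addrs + 2] = (uint8_t)(vid >> 8);   /* priority, DEI left zero */
	out[addrs + 3] = (uint8_t)(vid & 0xff);
	/* The original EtherType and payload follow the tag unchanged. */
	memcpy(out + addrs + VLAN_TAG_LEN, in + addrs, len - addrs);
	return len + VLAN_TAG_LEN;
}

/*
 * Reverse of vlan_encap: verify the TPID, extract the 12-bit VLAN ID
 * and copy the frame without the tag. Returns the new length, or 0 if
 * the frame is not tagged or a buffer is too small.
 */
static size_t
vlan_decap(const uint8_t *in, size_t len, uint16_t *vid,
    uint8_t *out, size_t outsz)
{
	const size_t addrs = 2 * ETHER_ADDR_LEN;

	assert(in != NULL && vid != NULL && out != NULL);
	if (len < addrs + VLAN_TAG_LEN + 2 || outsz < len - VLAN_TAG_LEN)
		return 0;
	if (in[addrs] != (ETHERTYPE_VLAN >> 8) ||
	    in[addrs + 1] != (ETHERTYPE_VLAN & 0xff))
		return 0;
	*vid = (uint16_t)(((in[addrs + 2] & 0x0f) << 8) | in[addrs + 3]);
	memcpy(out, in, addrs);
	memcpy(out + addrs, in + addrs + VLAN_TAG_LEN,
	    len - addrs - VLAN_TAG_LEN);
	return len - VLAN_TAG_LEN;
}
```

Encapsulating a frame and then decapsulating the result yields the original frame together with the VLAN ID that was used.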
The gif interface registers itself as an interface to the networking stack by using if_attach and gif_output as the output routine. This output routine is then called from the network layer output routine, ip_output, and the output routine eventually calls ip_output again once the packet has been encapsulated. Input is handled by in_gif_input, which is called via the struct protosw input routine from the encapsulated packet input function.
The tap interface registers itself as a network interface using if_attach. When a packet is sent through it, it notifies the device node listener or, if the corresponding device node is not open, simply drops the packet. When a packet is written to the device node, it gets injected into the networking stack through the if_input routine in the associated struct ifnet.
In principle there are two pseudo-devices involved with packet filtering: npf is involved in filtering network traffic, and bpf is an interface to capture and access raw network traffic. Both will be discussed briefly from the point of view of their attachment to the rest of the kernel; the packet inspection and modification engines they implement are beyond the scope of this document.
npf is implemented using pfil hooks, while bpf is implemented as a tap in all the network drivers.
pfil is a purely in-kernel interface to support packet filtering hooks. Packet filters can register hooks which should be called when packet processing takes place; in essence, pfil is a list of callbacks for certain events. In addition to being able to register a filter for incoming and outgoing packets, pfil provides support for interface attach/detach and address change notifications. pfil is described on the pfil(9) manual page and is used by NPF to hook into the packet stream for implementing firewalls and NAT.
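The "list of callbacks" idea can be sketched in a few lines of userland C. This does not reproduce the pfil(9) API: the types and functions below (hook_head, hook_add, hook_run and the two example hooks) are hypothetical, and the convention that a nonzero return value means "drop" is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum hook_dir { HOOK_IN, HOOK_OUT };

/*
 * Hypothetical hook signature: a hook may modify the packet in place
 * and returns 0 to let it continue or nonzero to drop it.
 */
typedef int (*hook_fn)(uint8_t *pkt, size_t len, enum hook_dir dir);

#define MAX_HOOKS 4
struct hook_head {
	hook_fn hooks[MAX_HOOKS];
	size_t nhooks;
};

/* Append a hook to the list; returns -1 if the list is full. */
static int
hook_add(struct hook_head *h, hook_fn fn)
{
	assert(h != NULL && fn != NULL);
	if (h->nhooks == MAX_HOOKS)
		return -1;
	h->hooks[h->nhooks++] = fn;
	return 0;
}

/*
 * Run every registered hook in order. The first hook that returns
 * nonzero stops processing and the packet is considered dropped.
 */
static int
hook_run(struct hook_head *h, uint8_t *pkt, size_t len, enum hook_dir dir)
{
	size_t i;
	int rv;

	assert(h != NULL && pkt != NULL);
	for (i = 0; i < h->nhooks; i++) {
		rv = h->hooks[i](pkt, len, dir);
		if (rv != 0)
			return rv;
	}
	return 0;
}

/* Example filter: drop inbound packets whose first byte is 0xff. */
static int
drop_inbound_ff(uint8_t *pkt, size_t len, enum hook_dir dir)
{
	return (dir == HOOK_IN && len > 0 && pkt[0] == 0xff) ? 1 : 0;
}

/* Example rewriter: change a leading 0x01 byte to 0x02, never drop. */
static int
rewrite_first_byte(uint8_t *pkt, size_t len, enum hook_dir dir)
{
	(void)dir;
	if (len > 0 && pkt[0] == 0x01)
		pkt[0] = 0x02;
	return 0;
}
```

With both example hooks registered, an inbound packet starting with 0x01 passes and is rewritten, whereas an inbound packet starting with 0xff is dropped before the second hook runs.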
NPF, found in sys/net/npf, is a multiplatform packet filtering device useful for creating a firewall and Network Address Translation (NAT). Operation is controlled through device nodes with userland tools. NPF is documented on several manual pages: npf(7), npf.conf(5) and npfctl(8) are good starting points.
The Berkeley Packet Filter (bpf) (sys/net/bpf.c) provides link layer access to data available on the network through interfaces attached to the system. bpf is used by opening a device node, /dev/bpf, and issuing ioctls to control the operation of the device. A popular example of a tool using bpf is tcpdump.
The device /dev/bpf is a cloning device, meaning it can be opened multiple times. It is in principle similar to a cloning interface, except bpf provides no network interface, only a method to open the same device multiple times.
To capture network traffic, a bpf device must be attached to an interface. The traffic on this interface is then passed to bpf for evaluation. To attach an interface to an open bpf device, the ioctl BIOCSETIF is used. The interface is identified by passing a struct ifreq, which contains the interface name in ASCII encoding. This is used to find the interface in the kernel tables. bpf registers itself in the interface's struct ifnet field if_bpf to inform the system that it is interested in traffic on this particular interface.
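From the userland side, the attach step might look like the following sketch. It assumes a BSD system that provides /dev/bpf and <net/bpf.h>; the helper names ifreq_set_name and bpf_open_on are invented for illustration, and on systems where BIOCSETIF is not available the second helper simply fails.

```c
/* Only needed on glibc so that struct ifreq is visible in strict modes. */
#define _DEFAULT_SOURCE 1

#include <assert.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <net/if.h>
#if defined(__NetBSD__) || defined(__FreeBSD__) || defined(__OpenBSD__)
#include <net/bpf.h>    /* BIOCSETIF */
#endif

/*
 * Hypothetical helper: store an interface name such as "wm0" in a
 * zeroed struct ifreq. Returns 0 on success, -1 if the name does not
 * fit including its terminating NUL.
 */
static int
ifreq_set_name(struct ifreq *ifr, const char *ifname)
{
	size_t n;

	assert(ifr != NULL && ifname != NULL);
	n = strlen(ifname);
	if (n >= sizeof(ifr->ifr_name))
		return -1;
	memset(ifr, 0, sizeof(*ifr));
	memcpy(ifr->ifr_name, ifname, n + 1);
	return 0;
}

/*
 * Hypothetical helper: open /dev/bpf and bind the descriptor to the
 * named interface with BIOCSETIF. Returns the descriptor, or -1 on
 * any failure (including platforms without BIOCSETIF).
 */
static int
bpf_open_on(const char *ifname)
{
	struct ifreq ifr;

	if (ifreq_set_name(&ifr, ifname) == -1)
		return -1;
#ifdef BIOCSETIF
	{
		int fd = open("/dev/bpf", O_RDWR);

		if (fd == -1)
			return -1;
		if (ioctl(fd, BIOCSETIF, &ifr) == -1) {
			close(fd);
			return -1;
		}
		return fd;
	}
#else
	return -1;
#endif
}
```

A caller would use the returned descriptor for further bpf ioctls and for reading captured packets; opening the device typically requires sufficient privileges.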
The listener can also pass a set of filtering rules to capture
only certain packets, for example ones matching a given host and port
combination.
bpf captures packets by supplying a tapping interface, the bpf_tap functions, to link layer drivers and relying on the drivers to always pass packets to it. Drivers honor this request and commonly have code which, along both the input and output paths, does:
#if NBPFILTER > 0
	if (ifp->if_bpf)
		bpf_mtap(ifp->if_bpf, m0);
#endif
This passes the mbuf to bpf for inspection. bpf inspects the data and decides if anyone listening to this particular interface is interested in it. The filter inspecting the data is highly optimized to minimize time spent inspecting each packet. If the filter matches, the packet is copied to await being read from the device.
The bpf tapping feature looks very much like the interfaces provided by pfil, so a valid question is whether both are necessary. Even though they provide similar services, their functionality is disjoint. The bpf mtap wants to access packets right off the wire without any alteration and possibly copy them for further use. Callers linking into pfil want to modify and possibly drop packets.