Using the packet library

Includes

To use the library, you need to include the appropriate header files. This will probably happen
automatically when including the specific protocol headers. If needed, you may include the
library header explicitly:
#include "Packets.hh"

\warning Never include any other Packets library header directly, only include \c
Packets.hh or one (or several) protocol headers from the protocol bundles.

Almost every use of the packet library starts with some concrete packet typedef. Some fundamental
packet types are provided by \ref protocolbundle_default.

Creating a new packet

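Building on those packet types, this example will build a complex packet: an Ethernet packet
containing an IPv4 UDP packet. We begin by building the raw packet skeleton:
senf::EthernetPacket eth (senf::EthernetPacket::create());
senf::IPv4Packet ip (senf::IPv4Packet::createAfter(eth));
senf::UDPPacket udp (senf::UDPPacket::createAfter(ip));
senf::DataPacket payload (senf::DataPacket::createAfter(udp, std::string("Hello, world!")));
These commands create what is called an interpreter chain. This chain consists of four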
interpreters. All interpreters reference the same data storage. This data storage is a random
access sequence which contains the data bytes of the packet.

\note The data structures allocated are automatically managed using reference counting. In this
    example we have four packet references each referencing the same underlying data
    structure. This data structure will be freed when the last reference to it goes out of
    scope.
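
The sharing described in the note can be modelled in a few lines of standard C++. This is only
a sketch of the idea, not the library's implementation: \c Storage, \c Interpreter, \c makeChain
and \c referenceCount are hypothetical names, and \c std::shared_ptr stands in for the library's
own reference counting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the shared packet storage.
typedef std::shared_ptr<std::vector<std::uint8_t> > Storage;

// Hypothetical stand-in for one interpreter: a view into the shared storage.
struct Interpreter {
    Storage data;        // shared by every interpreter in the chain
    std::size_t begin;   // offset of this interpreter's first byte
};

// Build four views (Ethernet, IPv4, UDP, payload) over one single buffer.
inline std::vector<Interpreter> makeChain(std::string const & payload)
{
    // 14 + 20 + 8 zeroed header bytes, followed by the payload bytes
    Storage storage = std::make_shared<std::vector<std::uint8_t> >(std::size_t(14 + 20 + 8));
    storage->insert(storage->end(), payload.begin(), payload.end());
    return { {storage, 0}, {storage, 14}, {storage, 34}, {storage, 42} };
}

// Number of references which currently keep the storage alive.
inline long referenceCount(std::vector<Interpreter> const & chain)
{
    return chain.empty() ? 0 : chain.front().data.use_count();
}
```

In this model, a chain built with <tt>makeChain("Hello, world!")</tt> holds four references to a
single buffer. Dropping one interpreter lowers the count by one, and the buffer is released
when the last reference is gone.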

The packet created above already has the correct UDP payload (the string "Hello, world!"), but
all protocol fields are still empty. We need to set those protocol fields:
udp->source() = 2000u;
udp->destination() = 2001u;
ip->ttl() = 255u;
ip->source() = senf::INet4Address::from_string("192.168.0.1");
ip->destination() = senf::INet4Address::from_string("192.168.0.2");
eth->source() = senf::MACAddress::from_string("00:11:22:33:44:55");
eth->destination() = senf::MACAddress::from_string("00:11:22:33:44:66");
eth.finalizeAll();
As seen above, packet fields are accessed using the <tt>-></tt> operator whereas other packet
facilities (like \c finalizeAll()) are directly accessed using the member operator. The field
values are simply set using appropriately named accessors. As a last step, the \c finalizeAll()
call will update all calculated fields (fields like next-protocol, header or payload length,
checksums etc.). Now the packet is ready. We may now send it out using a packet socket:
senf::PacketSocketHandle sock;
sock.bind( senf::LLSocketAddress("eth0"));
sock.write(eth.data());

Reading and parsing packets

The chain navigation functions are also used to parse a packet. Let's read an Ethernet packet
from a packet socket handle:
senf::PacketSocketHandle sock;
sock.bind( senf::LLSocketAddress("eth0"));
senf::EthernetPacket packet (senf::EthernetPacket::create(senf::noinit));
sock.read(packet.data(),0u);
This first creates an uninitialized Ethernet packet and then reads into this packet. We can now
parse this packet. Let's find out whether this is a UDP packet destined to port 2001:
try {
    senf::UDPPacket udp (packet.find<senf::UDPPacket>());
    if (udp->destination() == 2001u) {
        // Voila ...
    }
}
catch (senf::TruncatedPacketException &) {
    std::cerr << "Ooops !! Broken packet received\n";
}
catch (senf::InvalidPacketChainException &) {
    std::cerr << "Not a udp packet\n";
}
TruncatedPacketException is thrown by <tt>udp->destination()</tt> if that field cannot be
accessed (that is it would be beyond the data read which means we have read a truncated
packet). More generally, whenever a field cannot be accessed because it would be out of bounds
of the data read, this exception is generated.
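
The bounds check behind this behavior can be sketched without the library. The following is
an illustration under stated assumptions, not the library's code: \c TruncatedPacket is a local,
hypothetical exception type standing in for TruncatedPacketException, and \c readUInt16 and \c
tryReadPort are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical local exception type (stand-in for TruncatedPacketException).
struct TruncatedPacket : std::runtime_error {
    TruncatedPacket() : std::runtime_error("truncated packet") {}
};

// Read a 16 bit network byte order field at 'offset', refusing to read
// beyond the bytes which were actually received.
inline std::uint16_t readUInt16(std::vector<std::uint8_t> const & data, std::size_t offset)
{
    if (data.size() < 2 || offset > data.size() - 2)
        throw TruncatedPacket();
    return static_cast<std::uint16_t>((data[offset] << 8) | data[offset + 1]);
}

// Returns true if the field could be read, false for a truncated packet.
inline bool tryReadPort(std::vector<std::uint8_t> const & data, std::size_t offset,
                        std::uint16_t & port)
{
    try {
        port = readUInt16(data, offset);
        return true;
    }
    catch (TruncatedPacket const &) {
        return false;
    }
}
```

The check is made at the time of access, not when the data is read, which is why the exception
in the example above is raised by the field accessor.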

The raw data container

Every packet is based internally on a raw data container holding the packet data. This container
is accessed via the senf::Packet::data() member.

This container is a random access container. It can be used like an ordinary STL container and
supports all the standard container members.
Packet p = ...;
// Insert 5 0x01 bytes
p.data().insert(p.data().begin()+5, 5, 0x01);
// Insert data from another container
p.data().insert(p.data().end(), other.begin(), other.end());
// Erase a single byte
p.data().erase(p.data().begin()+5);
// XOR byte 5 with 0xAA
p.data()[5] ^= 0xAA;
A packet consists of a list of interpreters (packet headers or protocols) which all reference
the same data container at different byte ranges. Each packet consists of the protocol header \e
plus the packet's payload. This means that the data container ranges of successive packets from
a single interpreter chain are nested.

Example: The packet created above (the Ethernet-IP-UDP packet with payload "Hello, world!") has
4 Interpreters: Ethernet, IPv4, UDP and the UDP payload data. The nested data containers lead to
the following structure
// The ethernet header has a size of 14 bytes
eth.data().begin() + 14 == ip.data().begin()
eth.data().end() == ip.data().end()
// The IP header has a size of 20 bytes and therefore
ip.data().begin() + 20 == udp.data().begin()
ip.data().end() == udp.data().end()
// The UDP header has a size of 8 bytes and thus
udp.data().begin() + 8 == payload.data().begin()
udp.data().end() == payload.data().end()
This nesting will (and must) always hold: The data range of a subsequent packet will always be
within the range of its preceding packet.
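
The nesting invariant can be written down as a small model. This is a sketch only: \c Range, \c
nestedRanges and \c isNested are hypothetical names, and the ranges are plain offsets into the
container instead of iterators.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a byte range [begin, end) inside the shared container.
struct Range {
    std::size_t begin;
    std::size_t end;
};

// Compute the nested ranges of a chain from the header size of each
// interpreter (in chain order) and the total size of the container.
inline std::vector<Range> nestedRanges(std::vector<std::size_t> const & headerSizes,
                                       std::size_t totalSize)
{
    std::vector<Range> ranges;
    std::size_t begin = 0;
    for (std::size_t headerSize : headerSizes) {
        ranges.push_back(Range{begin, totalSize});  // every range ends at the same place
        begin += headerSize;                        // the next one starts after this header
    }
    return ranges;
}

// Check the invariant: each range lies within the range of its predecessor.
inline bool isNested(std::vector<Range> const & ranges)
{
    for (std::size_t i = 1; i < ranges.size(); ++i)
        if (ranges[i].begin < ranges[i-1].begin || ranges[i].end > ranges[i-1].end)
            return false;
    return true;
}
```

For the example packet, header sizes of 14, 20, 8 and 0 (the payload has no header of its own)
over 55 bytes yield the ranges [0,55), [14,55), [34,55) and [42,55).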

\warning It is forbidden to change the data of a subsequent packet interpreter from the
    preceding packet even if the data container includes this data. If you do so, you may
    corrupt the data structure (especially when changing its size).

Every operation on a packet is considered to be \e within this packet and \e outside any
following packet. When inserting or erasing data, the data ranges are all adjusted
accordingly. So the following are \e not the same even though \c eth.data().end(), \c
ip.data().end() and \c udp.data().end() are identical.
eth.data().insert(eth.data().end(), 5, 0x01);
assert( eth.data().end() == ip.data().end() + 5
&& ip.data().end() == udp.data().end() );
// Or alternatively: (You could even use eth.data().end() here ... it's the same)
ip.data().insert(ip.data().end(), 5, 0x01);
assert( eth.data().end() == ip.data().end()
&& ip.data().end() == udp.data().end() + 5 );
\warning When accessing the packet data via the container interface, you may easily build
    invalid packets since the packet will not be validated against its protocol.
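
The range adjustment shown in the two assertions can be modelled with plain offsets. This is a
sketch under the assumption that an insertion at the end of an interpreter extends that
interpreter and all preceding ones. \c Range and \c insertAtEnd are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a byte range [begin, end) inside the shared container.
struct Range {
    std::size_t begin;
    std::size_t end;
};

// Insert 'count' bytes at the end of interpreter 'index'. The new bytes
// belong to that interpreter and to all preceding interpreters, but not to
// any following one.
inline void insertAtEnd(std::vector<Range> & chain, std::size_t index, std::size_t count)
{
    for (std::size_t i = 0; i <= index && i < chain.size(); ++i)
        chain[i].end += count;
}
```

With a chain of Ethernet, IP and UDP ranges, inserting at index 0 moves only the Ethernet end,
whereas inserting at index 1 moves the Ethernet and IP ends together. This mirrors the two
cases above.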

Field access

When working with concrete protocols, the packet library provides direct access to all the
protocol information.
udp->source() = 2000u;
udp->destination() = 2001u;
ip->ttl() = 255u;
ip->source() = senf::INet4Address::from_string("192.168.0.1");
ip->destination() = senf::INet4Address::from_string("192.168.0.2");
eth->source() = senf::MACAddress::from_string("00:11:22:33:44:55");
eth->destination() = senf::MACAddress::from_string("00:11:22:33:44:66");
The protocol field members above do \e not return references; they return parser instances.
Protocol fields are accessed via parsers. A parser is a very lightweight class which points into
the raw packet data and converts between raw data bytes and its interpreted value: For example
a senf::UInt16Parser accesses 2 bytes (in network byte order) and converts them to or from a 16
bit integer. There are a few properties of parsers which need to be understood:

\li Parsers are created only temporarily when needed. They are created when accessing a protocol
    field and are returned by value.

\li A parser never contains a value itself; it just references a packet's data container.

\li Parsers can be built using other parsers and may have members which return further parsers.

The top-level interface to a packet's protocol fields is provided by a protocol parser. This
protocol parser is a composite parser which has members to access the protocol fields (compare
with the example code above). Some protocol fields may be more complex than a simple value. In
this case, those accessors may return other composite parsers or collection parsers. Ultimately,
a value parser will be returned.
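
To make the parser properties concrete, here is a miniature 16 bit value parser written
against a plain \c std::vector. It is a sketch of the concept and not senf::UInt16Parser
itself: the class name \c UInt16FieldSketch and its constructor are assumptions made for this
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical miniature of a 16 bit value parser. It stores no value of its
// own, only a pointer to the container and an offset into it.
class UInt16FieldSketch {
public:
    typedef std::uint16_t value_type;

    UInt16FieldSketch(std::vector<std::uint8_t> & data, std::size_t offset)
        : data_(&data), offset_(offset) {}

    // Read: convert two bytes in network byte order into an integer.
    value_type value() const {
        return static_cast<value_type>(((*data_)[offset_] << 8) | (*data_)[offset_ + 1]);
    }

    // Write: convert the integer back into two bytes in network byte order.
    void value(value_type v) {
        (*data_)[offset_]     = static_cast<std::uint8_t>(v >> 8);
        (*data_)[offset_ + 1] = static_cast<std::uint8_t>(v & 0xFF);
    }

    // Convenience members so the sketch can be used like the value itself.
    operator value_type() const { return value(); }
    UInt16FieldSketch & operator=(value_type v) { value(v); return *this; }

private:
    std::vector<std::uint8_t> * data_;
    std::size_t offset_;
};
```

Because the object holds only a pointer and an offset, copying it is cheap, and every copy
reads and writes the same bytes of the container.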

The simple value parsers which return plain values (integer numbers, network addresses etc) can
be used like those values and can also be assigned corresponding values. More complex parsers
don't allow simple assignment. However, they can always be copied from another parser <em>of the
same type</em> using the generalized parser assignment. This type of assignment also works for
simple parsers and is then identical to a normal assignment.
// Copy the complete udp parser from udp packet 2 to packet 1
udp1.parser() << udp2.parser();
Additionally, the parsers have a parser-specific API which allows manipulating or querying the
value.

This is a very abstract description of the parser structure. For a more concrete description, we
need to differentiate between the different parser types.

Simple fields (Value parsers)

We have already seen value parsers: These are the lowest level building blocks which parse
numbers, addresses etc. They return some type of value and can be assigned such a value. More
formally, they have a \c value_type typedef member which gives the type of value they accept and
they have an overloaded \c value() member which is used to read or set the value. Some parsers
have additional functionality: The numeric parsers, for example, provide conversion and
arithmetic operators so they can be used like a numeric value.

If you have a value parser \c valueParser with type \c ValueParser, the following will always be
valid:
// You can read the value and assign it to a variable of the corresponding value_type
ValueParser::value_type v (valueParser.value());
// You can assign that value to the parser
valueParser.value(v);
// The assignment can also be done using the generic parser assignment
valueParser << v;

Composite and protocol parsers

A composite parser is a parser which just combines several other parsers into a structure: For
example, the senf::EthernetPacketParser has members \c destination(), \c source() and \c
type_length(). Those members return parsers again (in this case value parsers) to access the
protocol fields.

Composite parsers can be nested: a composite parser may be returned by another composite
parser. The protocol parser is a composite parser which defines the fields for a specific
protocol header like Ethernet.
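
A composite parser can be pictured as a class which knows only the offset of each field and
hands out a fresh sub-parser on every call. The following sketch uses the standard Ethernet
header layout (6 byte destination, 6 byte source, 2 byte type/length). All class names are
hypothetical and the code is not taken from senf::EthernetPacketParser.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Hypothetical value parser for a 6 byte MAC address.
struct MacFieldSketch {
    Bytes * data;
    std::size_t offset;
    std::array<std::uint8_t, 6> value() const {
        std::array<std::uint8_t, 6> mac;
        for (std::size_t i = 0; i < 6; ++i) mac[i] = (*data)[offset + i];
        return mac;
    }
    void value(std::array<std::uint8_t, 6> const & mac) {
        for (std::size_t i = 0; i < 6; ++i) (*data)[offset + i] = mac[i];
    }
};

// Hypothetical value parser for a 16 bit integer in network byte order.
struct UInt16FieldSketch {
    Bytes * data;
    std::size_t offset;
    std::uint16_t value() const {
        return static_cast<std::uint16_t>(((*data)[offset] << 8) | (*data)[offset + 1]);
    }
    void value(std::uint16_t v) {
        (*data)[offset]     = static_cast<std::uint8_t>(v >> 8);
        (*data)[offset + 1] = static_cast<std::uint8_t>(v & 0xFF);
    }
};

// Hypothetical composite: returns a new sub-parser by value for each field.
struct EthernetHeaderSketch {
    Bytes * data;
    std::size_t offset;
    MacFieldSketch destination() const { return MacFieldSketch{data, offset}; }
    MacFieldSketch source() const { return MacFieldSketch{data, offset + 6}; }
    UInt16FieldSketch type_length() const { return UInt16FieldSketch{data, offset + 12}; }
};
```

Nesting follows naturally: a member of a composite may itself return another composite which
adds its own field offsets to the offset it was given.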

Collection parsers

Besides simple composites, the packet library has support for more complex collections.

\li The senf::ArrayParser repeats an arbitrary parser a fixed number of times.
\li senf::VectorParser and senf::ListParser are two different types of lists with a variable
    number of elements.
\li The senf::VariantParser is a discriminated union: It will select one of several parsers
    depending on the value of a discriminant.

Vector and List Parsers

Remember that a parser does \e not contain any data: It only points into the raw data
container. This is also true for the collection parsers. VectorParser and ListParser provide an
interface which looks like an STL container to access a sequence of elements.

We will use an \c MLDv2QueryPacket as an example (see <a
href="http://tools.ietf.org/html/rfc3810#section-5">RFC 3810</a>). Here is an excerpt of the
relevant fields:

<table class="fields">
<tr><td>nrOfSources</td><td>Integer</td><td>Number of multicast sources in this packet</td></tr>
<tr><td>sources</td><td>Vector of IPv6 Addresses</td><td>Multicast sources</td></tr>
</table>

To demonstrate nested collections, we use the \c MLDv2ReportPacket as an example. The relevant
fields of this packet are:

<table class="fields">
<tr><td>nrOfRecords</td><td>Integer</td><td>Number of multicast address records</td></tr>
<tr><td>records</td><td>List of Records</td><td>List of multicast groups and sources</td></tr>
</table>

Each Record is a composite with the following relevant fields:

<table class="fields">
<tr><td>nrOfSources</td><td>Integer</td><td>Number of sources in this record</td></tr>
<tr><td>sources</td><td>Vector of IPv6 Addresses</td><td>Multicast sources</td></tr>
</table>

The first example will iterate over the sources in a \c MLDv2QueryPacket:
MLDv2QueryPacket mld = ...;
// Instantiate a collection wrapper for the source list
MLDv2QueryPacket::Parser::sources_t::container sources (mld->sources());
// Iterate over all the addresses in that list
for (MLDv2QueryPacket::Parser::sources_t::container::iterator i (sources.begin());
     i != sources.end(); ++i)
    std::cout << *i << std::endl;
Besides other fields, the MLDv2Query consists of a list of source addresses. The \c sources()
member returns a VectorParser for these addresses. The collection parsers can only be accessed
completely using a container wrapper. The container wrapper type is available as the \c
container member of the collection parser, here it is \c
MLDv2QueryPacket::Parser::sources_t::container.
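
Why a wrapper object is needed becomes clearer with a small model: a vector inside a packet is
a count field followed by the elements, so changing the number of elements means moving bytes
\e and updating the count. The sketch below assumes a 16 bit count directly followed by
fixed-size elements. \c CountedVectorSketch is a hypothetical class, not the interface of
senf::VectorParser.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Hypothetical wrapper over the layout: 16 bit count (network byte order),
// then 'count' elements of elementSize bytes each.
class CountedVectorSketch {
public:
    CountedVectorSketch(Bytes & data, std::size_t countOffset, std::size_t elementSize)
        : data_(&data), countOffset_(countOffset), elementSize_(elementSize) {}

    // Number of elements, read from the count field.
    std::size_t size() const {
        return (std::size_t((*data_)[countOffset_]) << 8) | (*data_)[countOffset_ + 1];
    }

    // Grow or shrink the element area and keep the count field in sync.
    void resize(std::size_t n) {
        std::size_t const old = size();
        Bytes::iterator const first =
            data_->begin() + static_cast<std::ptrdiff_t>(countOffset_ + 2);
        if (n > old)
            data_->insert(first + static_cast<std::ptrdiff_t>(old * elementSize_),
                          (n - old) * elementSize_, std::uint8_t(0));
        else
            data_->erase(first + static_cast<std::ptrdiff_t>(n * elementSize_),
                         first + static_cast<std::ptrdiff_t>(old * elementSize_));
        (*data_)[countOffset_]     = static_cast<std::uint8_t>((n >> 8) & 0xFF);
        (*data_)[countOffset_ + 1] = static_cast<std::uint8_t>(n & 0xFF);
    }

private:
    Bytes * data_;
    std::size_t countOffset_;
    std::size_t elementSize_;
};
```

Bytes which follow the vector in the container are shifted by \c resize() but otherwise left
untouched.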

Using this wrapper, we can not only read the data, we can also manipulate the source list. Here
we copy a list of addresses from an \c std::vector into the packet:
std::vector<senf::INet6Address> addrs (...);
sources.resize(addrs.size());
std::copy(addrs.begin(), addrs.end(), sources.begin());
Collection parsers may be nested. To access a nested collection parser, a container wrapper must
be allocated for each level. An MLD Report (which is a composite parser) includes a list of
multicast address records called \c records(). Each record is again a composite which contains a
list of sources called \c sources():
MLDv2ReportPacket report = ...;
// Instantiate a collection wrapper for the list of records:
MLDv2ReportPacket::Parser::records_t::container_type records (report->records());
// Iterate over the multicast address records
for (MLDv2ReportPacket::Parser::records_t::container_type::iterator i (records.begin());
     i != records.end(); ++i) {
    // Allocate a collection wrapper for the multicast address record
    typedef MLDv2ReportPacket::Parser::records_t::value_type::sources_t Sources;
    Sources::container_type sources (i->sources());
    // Iterate over the sources in this record
    for (Sources::container_type::iterator j (sources.begin());
         j != sources.end(); ++j)
        std::cout << *j << std::endl;
}
In this example we also see how to find the type of a parser or container wrapper:
\li Composite parsers have typedefs for each of their fields with a \c _t postfix.
\li The vector or list parsers have a \c value_type typedef which gives the type of the
    element.

By traversing this hierarchical structure, the types of all the fields can be found.

The container wrapper is only temporary (even though it has a longer lifetime than a
parser). Any change made to the packet other than via the collection wrapper has the potential
to invalidate the wrapper if it changes the packet's size.

\see
    senf::VectorParser / senf::VectorParser_Container Interface of the vector parser \n
    senf::ListParser / senf::ListParser_Container Interface of the list parser

The Variant Parser

The senf::VariantParser is a discriminated union of parsers. It is also used for optional fields
(using senf::VoidPacketParser as one possible variant which is a parser parsing nothing).  A
senf::VariantParser is not really a collection in the strict sense: It only ever contains one
element, the \e type of which is determined by the discriminant.

For example, we look at the DTCP HELLO packet as defined in the UDLR protocol (see <a
href="http://tools.ietf.org/html/rfc3077">RFC 3077</a>):
DTCPHelloPacket hello (...);
if (hello->ipVersion() == 4) {
    typedef DTCPHelloPacket::Parser::v4fbipList_t FBIPList;
    FBIPList::container fbips (hello->v4fbipList());
    for (FBIPList::container::iterator i (fbips.begin()); i != fbips.end(); ++i)
        std::cout << *i << std::endl;
}
else { // if (hello->ipVersion() == 6)
    typedef DTCPHelloPacket::Parser::v6fbipList_t FBIPList;
    FBIPList::container fbips (hello->v6fbipList());
    for (FBIPList::container::iterator i (fbips.begin()); i != fbips.end(); ++i)
        std::cout << *i << std::endl;
}
This packet has a field \c ipVersion() which has a value of 4 or 6. Depending on the version,
the packet contains a list of IPv4 or IPv6 addresses. Only one of the fields \c v4fbipList() and
\c v6fbipList() is available at a time. Which one is available is determined by the value of \c
ipVersion(). Trying to access the wrong one results in undefined behavior.

Here we have used the variant's discriminant (the \c ipVersion() field) to select which field to
parse. More generically, every variant field should have a corresponding member to test for its
existence:
if (hello->has_v4fbipList()) {
    ...
}
else { // if (hello->has_v6fbipList())
    ...
}
A variant can have more than 2 possible types and you can be sure that exactly one type will be
accessible at any time.

It is not possible to change a variant by simply changing the discriminant:
// INVALID CODE:
hello->ipVersion() = 6;

Instead, for each variant field there is a special member which switches the variant to that
type. After switching the type, the field will be in its initialized (that is mostly zero) state.

std::vector<senf::INet6Address> addrs (...);
// Initialize the IPv6 list
hello->init_v6fbipList();
// Copy values into that list
DTCPHelloPacket::Parser::v6fbipList_t::container fbips (hello->v6fbipList());
fbips.resize(addrs.size());
std::copy(addrs.begin(), addrs.end(), fbips.begin());
\note Here we have documented the default variant interface as it is preferred. It is possible
    to define variants in a different way giving other names to the special members (\c has_\e
    name or \c init_\e name etc.). This must be documented with the composite or protocol parser
    which defines the variant.
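
The reason why the discriminant must not be changed on its own can be seen in a small model:
the bytes following the discriminant are laid out for one specific variant, so switching has to
replace them as well. The sketch below assumes a one byte discriminant and a one byte element
count. \c AddressListVariantSketch and its members are hypothetical and do not reproduce the
DTCP packet format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Hypothetical layout: byte 0 is the discriminant (4 or 6), byte 1 is the
// number of addresses, followed by the addresses (4 or 16 bytes each).
class AddressListVariantSketch {
public:
    explicit AddressListVariantSketch(Bytes & data) : data_(&data) {}

    bool has_v4() const { return (*data_)[0] == 4; }
    bool has_v6() const { return (*data_)[0] == 6; }

    std::size_t elementSize() const { return has_v4() ? 4 : 16; }
    std::size_t count() const { return (*data_)[1]; }

    // Switching the variant discards the old body and leaves the new
    // variant in its initialized (empty) state.
    void init_v4() { reset(4); }
    void init_v6() { reset(6); }

private:
    void reset(std::uint8_t version) {
        data_->resize(2);        // drop the addresses of the old variant
        (*data_)[0] = version;   // set the discriminant
        (*data_)[1] = 0;         // the new list starts out empty
    }

    Bytes * data_;
};
```

Writing only the discriminant byte in this model would leave four bytes of IPv4 address data
behind which would then be misread as part of a 16 byte IPv6 address.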

Annotations

Sometimes we need to store additional data with a packet: data which is not part of the packet
itself but gives us some information about the packet, such as a timestamp, the interface the
packet was received on or other processing related information.

This type of information can be stored using the annotation interface. The following example
will read packet data and will store the read timestamp as a packet annotation.
struct Timestamp {
    senf::ClockService::clock_t value;
};
std::ostream & operator<<(std::ostream & os, Timestamp const & tstamp) {
    os << tstamp.value; return os;
}
sock.read(packet.data(), 0u);
packet.annotation<Timestamp>().value = senf::ClockService::now();
In the same way, the annotation can be used later:
// maxAge is a placeholder for whatever age limit the application chooses
if (senf::ClockService::now() - packet.annotation<Timestamp>().value > maxAge) {
    // this packet is too old
    // ...
}
It is very important to define a specific structure (or class or enum) type for each type of
annotation. \e Never directly store a fundamental type as an annotation: The name of the type is
used to look up the annotation, so you can store only one annotation for each built-in type. \c
typedef does not help since \c typedef does not introduce new type names, it only defines an
alias.
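
The effect of looking up annotations by type can be demonstrated with a small registry keyed
on \c std::type_index. This only illustrates the type identity argument. The library's actual
lookup mechanism may differ, and \c SlotRegistrySketch as well as the four example types are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical registry: one slot number per distinct C++ type.
class SlotRegistrySketch {
public:
    // Return the slot of the given type, allocating a new one on first use.
    template <class Annotation>
    std::size_t slot() {
        std::type_index key(typeid(Annotation));
        std::map<std::type_index, std::size_t>::iterator i = slots_.find(key);
        if (i == slots_.end())
            i = slots_.insert(std::make_pair(key, slots_.size())).first;
        return i->second;
    }

    std::size_t size() const { return slots_.size(); }

private:
    std::map<std::type_index, std::size_t> slots_;
};

typedef unsigned long ArrivalTime;   // alias only: still 'unsigned long'
typedef unsigned long QueueLength;   // alias only: still 'unsigned long'

struct ArrivalTimeAnnotation { unsigned long value; };  // distinct type
struct QueueLengthAnnotation { unsigned long value; };  // distinct type
```

In this model both typedefs resolve to the same slot and would overwrite each other, whereas
the two structures get one slot each.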

The annotation type must support the output \c operator<< for description purposes
(e.g. for the \ref senf::Packet::dump() "Packet::dump()" member).

Of course, the annotation structure can be arbitrary. However, there is one very important
caveat: If the annotation is not a POD type, it needs to inherit from senf::ComplexAnnotation.
A type is POD if it is really just a bunch of bytes: no virtual members, no user-defined
constructor or destructor, no base classes, and all its members must be POD too. So the
following annotation is complex since \c std::string is not POD:
struct ReadInfo : senf::ComplexAnnotation
{
    std::string interface;
    senf::ClockService::clock_t timestamp;
};
// ...
packet.annotation<ReadInfo>().interface = "eth0";
packet.annotation<ReadInfo>().timestamp = senf::ClockService::now();
// Or store a reference to the annotation for easier access
ReadInfo & info (packet.annotation<ReadInfo>());
if (info.interface == "eth0") {
    // ...
}
Conceptually, all annotations always exist in every packet; there is no way to query whether a
packet holds a specific annotation.

You should use annotations economically: Every annotation type used in your program will
allocate an annotation slot in \e all packet data structures. So don't use hundreds of different
annotation types if this is not really necessary: Reuse annotation types where possible or
aggregate data into larger annotation structures. The best solution is to use annotations only
for a small amount of packet-specific information. If you really need to manage a train-load of
data together with the packet, consider some other way (e.g. place the packet into another class
which holds that data).

\see senf::Packet::annotation() \n
    senf::dumpPacketAnnotationRegistry() for annotation debugging and optimization