
\documentclass{article}

\usepackage{tabularx, tikz, amsmath, gensymb, titlepic, graphicx}
\usepackage[margin=1in]{geometry}

\usetikzlibrary{shapes,arrows,fit}
\tikzstyle{block} = [draw, rectangle, text width=2cm, text centered, minimum height=1.2cm, node distance=3cm]

\begin{document}

\title{CATS - Communication And Tracking System \\ \large Protocol Standard}
\author{scd31.com}
\titlepic{\includegraphics{logo/logo.png}}
\maketitle

\newpage

\begin{abstract}
  APRS, while an excellent tool, is over 20 years old, and is beginning to show its age. The protocol itself is overly complex and not particularly efficient. The modulation on which it rests - AFSK over FM - is also sub-optimal. The purpose of CATS is to design a modernized solution that overcomes these challenges by utilizing the latest technologies. CATS aims to deliver enhanced efficiency, simplicity in implementation, and improved performance compared to APRS.
\end{abstract}

\newpage

\tableofcontents

\newpage

\section{Introduction}

\subsection{Features of CATS}

CATS has many exciting features over legacy APRS:

\begin{itemize}
\item 2-FSK used instead of FM-AFSK for a 12dB coding gain
\item Forward Error Correction (LDPC), further improving the coding gain
\item Bit-rate increased from 1200 bits/s to 9600 bits/s
\item Maximum packet size 8191 bytes
\item Data whitening used to prevent receiver desynchronization
\item 70cm band is used instead of 2m by default
  \begin{itemize}
  \item Since 2m is more commonly used for voice, CATS can be added to an existing setup while 2m repeaters remain usable. The two bands can run in full duplex, so CATS transmissions do not cause loss of reception on 2m.
  \end{itemize}
\item FELINET, which is the CATS equivalent of APRS-IS, will push and pull messages from APRS-IS to maintain some compatibility, at least initially. The eventual goal is to drop this link.
\end{itemize}

\subsection{The Pipeline}

CATS packets are created from raw information using the following flow. For reception, the pipeline is reversed. Note that all multi-byte fields are encoded little-endian unless otherwise indicated. \\

\begin{tikzpicture}

    \node [block, name=text1] {Raw data};
    \node [block, right of=text1] (text2) {Whiskers};
    \node [block, right of=text2] (text3) {CRC};
    \node [block, right of=text3] (text4) {LDPC};
    \node [block, below of=text4] (text5) {Whitener};
    \node [block, below of=text3] (text6) {Interleaver};
    \node [block, below of=text2] (text7) {Header};
    \node [block, below of=text1] (text8) {RF};

    \draw [->] (text1) -- (text2);
    \draw [->] (text2) -- (text3);
    \draw [->] (text3) -- (text4);
    \draw [->] (text4) -- (text5);
    \draw [->] (text5) -- (text6);
    \draw [->] (text6) -- (text7);
    \draw [->] (text7) -- (text8);

\end{tikzpicture}

\subsection{Invalid CATS Packets}

The APRS standard is plagued by the existence of invalid packets. A good APRS parser not only handles valid packets - it also handles slightly malformed ones. Different parsers may handle malformed packets differently, since by definition, they are not in a unified standard. This complicates new implementations and leads to fragmentation of the ecosystem.

CATS aims to avoid this problem. A packet that does not conform rigidly to the standard must have its contents discarded. Even if only one whisker is malformed, all whiskers must be ignored. It is considered an implementation bug if this does not occur. Note that unknown whisker types are not invalid. This allows for future whisker types to be added without breaking compatibility with existing implementations.

\section{Whiskers}

\subsection{Overview}

One main feature of CATS is that packets are constructed from Whiskers. Each Whisker represents one possible attribute of data. One CATS packet can have up to 255 Whiskers, so long as the 8191-byte limit is respected. There are many types of Whiskers:

\begin{table}[!ht]
\setlength\extrarowheight{2pt}
\begin{tabularx}{\textwidth}{|X|X|}
  \hline
  \textbf{Whisker Type} & \textbf{Notes} \\
  \hline
  Identification & Contains the source callsign \\
  \hline
  Timestamp & \\
  \hline
  GPS & \\
  \hline
  Comment & \\
  \hline
  Route & The path that the packet took between the source station and the current station \\
  \hline
  Destination & Who the CATS packet is destined for \\
  \hline
  Arbitrary & Arbitrary array of bytes, intentionally not understood by CATS \\
  \hline
\end{tabularx}
\end{table}

\subsection {Structure}

Each Whisker has the same general structure.

\begin{tabular}{|l|l|l|}
  \hline
  \textbf{Byte offset} & \textbf{Length} & \textbf{Data} \\
  \hline
  0 & 1 & Whisker type \\
  \hline
  1 & 1 & Whisker length in bytes $N$ \\
  \hline
  2 & $N$ & Whisker data \\
  \hline
\end{tabular} \\

If there are multiple Whiskers in a single packet, their bytes are concatenated together. Since the Whisker length is encoded as a single byte, the actual data (not including the type or length bytes) is at most 255 bytes per Whisker. Also, because the length is encoded in the same place in each Whisker, a decoder does not need to support every type. If it sees a type it does not recognize, it can skip the next $N$ bytes and continue on to the next Whisker.
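
The skipping behaviour described above can be sketched in Python. This is only an illustration: the function name \texttt{split\_whiskers} is hypothetical and not part of the standard, the input is assumed to be the concatenated Whisker bytes with the CRC already removed, and a truncated Whisker is assumed to make the whole packet invalid.

\begin{verbatim}
```python
def split_whiskers(data: bytes) -> list[tuple[int, bytes]]:
    """Split concatenated Whisker bytes into (type, payload) pairs."""
    whiskers = []
    pos = 0
    while pos < len(data):
        if pos + 2 > len(data):
            raise ValueError("truncated Whisker header")
        whisker_type = data[pos]
        length = data[pos + 1]          # N, the payload length
        end = pos + 2 + length
        if end > len(data):
            raise ValueError("Whisker runs past the end of the data")
        whiskers.append((whisker_type, data[pos + 2:end]))
        pos = end                       # works for unknown types too
    return whiskers
```
\end{verbatim}

A caller can then ignore any pair whose type it does not recognize, without needing to understand the payload.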

\section{CRC}

\subsection{Overview}

A CRC checksum is used to detect when a CATS packet is corrupt. A corrupt packet should be discarded.

\subsection{Structure}

The CRC algorithm used is 16-bit IBM SDLC.

\begin{tabular}{|l|l|l|}
  \hline
  \textbf{Byte offset} & \textbf{Length} & \textbf{Data} \\
  \hline
  0 & $N$ & Whisker data \\
  \hline
  $N$ & 2 & CRC checksum \\
  \hline
\end{tabular}
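
As an illustration, the checksum can be computed bit by bit in Python. The parameters used here are the ones commonly catalogued for CRC-16/IBM-SDLC (also listed as CRC-16/X-25): polynomial 0x1021 processed in reflected form, initial value 0xFFFF, and a final XOR with 0xFFFF. The function names are hypothetical, and storing the checksum little-endian is an assumption based on the general rule for multi-byte fields.

\begin{verbatim}
```python
def crc16_sdlc(data: bytes) -> int:
    """Bitwise CRC-16/IBM-SDLC over data."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0x8408   # 0x1021 bit-reversed
            else:
                crc >>= 1
    return crc ^ 0xFFFF


def append_crc(whisker_bytes: bytes) -> bytes:
    """Append the checksum, little-endian (assumed byte order)."""
    crc = crc16_sdlc(whisker_bytes)
    return whisker_bytes + crc.to_bytes(2, "little")
```
\end{verbatim}

For the standard test input \texttt{123456789} (ASCII), this parameter set yields the check value 0x906E.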

\section{LDPC}

\subsection{Overview}

In APRS, a single flipped bit results in a ruined packet. In CATS, Forward Error Correction (FEC) is used to make the protocol tolerant of bit flips. LDPC has been chosen, as it's a mature standard which operates close to the Shannon Limit.

\subsection{Algorithm}

Like APRS, CATS packets are variable-length. This is advantageous - packets that contain small amounts of data don't need to be padded. This increases efficiency. However, LDPC encoding requires data to be of a constant width. The solution is to break our data into chunks, so that we can LDPC-encode each chunk individually.

The structure of the data chunks isn't encoded in the packet. Instead, the data is broken into chunks deterministically. The same algorithm can be used by the receiver to break the chunks apart, decode them, and combine the resulting data together. The valid LDPC codes are TC128, TC256, TC512, TM2048, and TM8192, all with $r=1/2$. These codes are specified in detail in CCSDS document 231.1-O-1. The algorithm works as follows:

\begin{enumerate}
\item Pick the largest code with an input size $M \le N$
\item Take the first $M$ bytes of data and pass them through the LDPC encoder
\item Set $N \leftarrow N - M$
\item Go to step 1, unless $N < 8$
\item Pad the remaining data with 0xAA to make it 8 bytes long
\item Run the padded data through TC128 encoder
\end{enumerate}

Note that on the final step, the 0xAA padding doesn't end up in the final packet. Only the original data and the parity data are included. \\

For example, imagine our data after adding the CRC checksum is 2349 bytes long. The algorithm would work as follows:
\begin{enumerate}
\item Encode the first 512 bytes with TM8192 (1837 bytes remaining)
\item Encode the next 512 bytes with TM8192 (1325 bytes remaining)
\item Encode the next 512 bytes with TM8192 (813 bytes remaining)
\item Encode the next 512 bytes with TM8192 (301 bytes remaining)
\item Encode the next 128 bytes with TM2048 (173 bytes remaining)
\item Encode the next 128 bytes with TM2048 (45 bytes remaining)
\item Encode the next 32 bytes with TC512 (13 bytes remaining)
\item Encode the next 8 bytes with TC128 (5 bytes remaining)
\item Pad the remaining 5 bytes out to 8 bytes with 0xAA 0xAA 0xAA, then encode with TC128
\end{enumerate}

Note that the final packet length will be
\begin{align*}
  L &= 2 + N + (512 \times 4) + (128 \times 2) + 32 + (8 \times 2)\\
  &= 2 + 2349 + 2048 + 256 + 32 + 16\\
  &= 4703
\end{align*}
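
The chunking can be sketched in Python to reproduce the worked example. The names \texttt{CODES}, \texttt{plan\_chunks} and \texttt{encoded\_length} are hypothetical. The input sizes follow from the rate of $1/2$ (for instance, TC256 carries 16 data bytes). Following the worked example, the sketch keeps picking codes while at least 8 bytes remain. It also assumes that no padded chunk is emitted when nothing remains, which the text does not specify.

\begin{verbatim}
```python
# (code name, data bytes per block) for the rate-1/2 codes,
# ordered from largest to smallest.
CODES = [("TM8192", 512), ("TM2048", 128), ("TC512", 32),
         ("TC256", 16), ("TC128", 8)]
SIZES = dict(CODES)


def plan_chunks(n: int) -> list[tuple[str, int]]:
    """Return (code name, data bytes consumed) for n input bytes."""
    plan = []
    while n >= 8:
        name, size = next((c, s) for c, s in CODES if s <= n)
        plan.append((name, size))
        n -= size
    if n > 0:
        # The remainder is padded with 0xAA up to 8 bytes for TC128.
        plan.append(("TC128", n))
    return plan


def encoded_length(n: int) -> int:
    """2-byte length field + original data + parity bytes."""
    parity = sum(SIZES[name] for name, _ in plan_chunks(n))
    return 2 + n + parity
```
\end{verbatim}

With these assumptions, \texttt{plan\_chunks(2349)} lists the same nine steps as the example, and \texttt{encoded\_length(2349)} gives 4703.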

\subsection{Structure}

\begin{tabularx}{\textwidth}{|l|l|X|}
  \hline
  \textbf{Byte offset} & \textbf{Length} & \textbf{Data} \\
  \hline
  0 & 2 & Number of bytes after adding CRC parity (pre-LDPC encoding) \\
  \hline
  2 & A & First data chunk \\
  \hline
  2 + A & B & LDPC parity for first chunk \\
  \hline
  2 + A + B & C & Second chunk \\
  \hline
  2 + A + B + C & D & LDPC parity for second chunk \\
  \hline
  ... & ... & ... \\
  \hline
\end{tabularx}

\section{Whitener}

\subsection{Overview}

Data whitening makes the transmitted data look more like white noise. Its most important benefit is that it eliminates long runs of identical bits, which could otherwise cause the receiver to lose synchronization with the transmitter.

In CATS, each 16-byte block of data is XORed with the 16-byte hex string ``e9cf 6720 191a 07dc c072 7997 51f7 dd93'' to whiten the data.
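
A minimal Python sketch of the whitening step follows. The names \texttt{WHITENING\_KEY} and \texttt{whiten} are hypothetical. The sketch assumes that a final block shorter than 16 bytes is XORed with the leading bytes of the key, which the text does not state explicitly.

\begin{verbatim}
```python
WHITENING_KEY = bytes.fromhex("e9cf6720191a07dcc072799751f7dd93")


def whiten(data: bytes) -> bytes:
    """XOR data with the 16-byte key, repeating the key as needed."""
    return bytes(b ^ WHITENING_KEY[i % 16] for i, b in enumerate(data))
```
\end{verbatim}

Because XOR is its own inverse, applying \texttt{whiten} a second time restores the original bytes, so the receiver can use the same function.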

\section{Interleaver}

\subsection{Overview}

LDPC makes our CATS packet robust against single bit flips. However, it is not particularly robust against burst errors. This is because a burst will flip several bits in only one (or possibly two) of our LDPC chunks, while the rest remain untouched. If too many bits flip in a single chunk, it will not be possible to recover. Instead, it is better to split this burst across as many chunks as possible, to lower the likelihood that a chunk will be unrecoverable. The interleaver is responsible for reordering the CATS bits so that the packet is robust against burst errors.

In CATS, a 32-bit block interleaver is used. For a bit string input $b_0 b_1 b_2 b_3 \ldots b_n$, the output from the interleaver is $b_0 b_{32} b_{64} \ldots b_1 b_{33} b_{65} \ldots b_2 b_{34} b_{66} \ldots b_n$. The receiver is responsible for de-interleaving the bit stream to get the original data.
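
The reordering can be sketched in Python on a list of bits. The function names are hypothetical. The sketch assumes that indices past the end of the input are simply skipped when the length is not a multiple of 32, and it leaves the packing of bits into bytes to the caller, since neither detail is specified here.

\begin{verbatim}
```python
def interleave_order(n_bits: int, width: int = 32) -> list[int]:
    """Input bit indices in the order the interleaver emits them."""
    return [i for start in range(width)
            for i in range(start, n_bits, width)]


def interleave(bits: list[int]) -> list[int]:
    return [bits[i] for i in interleave_order(len(bits))]


def deinterleave(bits: list[int]) -> list[int]:
    """Undo interleave() by writing each bit back to its source index."""
    out = [0] * len(bits)
    for pos, i in enumerate(interleave_order(len(bits))):
        out[i] = bits[pos]
    return out
```
\end{verbatim}

For a 96-bit input, \texttt{interleave\_order(96)} begins 0, 32, 64, 1, 33, 65, 2, 34, 66, matching the sequence given above.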

\section{Header}

\subsection{Overview}

Before transmission, a header must be affixed to the packet. This is used so that the receiver can detect the packet and synchronize with the transmitter's phase. It's also used so that the receiver knows how many bytes to listen for before attempting to decode.

\subsection{Structure}

\begin{tabular}{|l|l|l|}
  \hline
  \textbf{Byte offset} & \textbf{Length} & \textbf{Data} \\
  \hline
  0 & 4 & Preamble (0x55 0x55 0x55 0x55) \\
  \hline
  4 & 4 & Sync word (0xAB 0xCD 0xEF 0x12) \\
  \hline
  8 & 2 & Data length in bytes, $L$ \\
  \hline
  10 & $L$ & LDPC-encoded data \\
  \hline
\end{tabular}

Note that $L \le 8191$.
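
As a sketch, the header can be prepended in Python as follows. The names are hypothetical, and the length field is written little-endian on the assumption that the general rule for multi-byte fields applies here.

\begin{verbatim}
```python
import struct

PREAMBLE = bytes([0x55, 0x55, 0x55, 0x55])
SYNC_WORD = bytes([0xAB, 0xCD, 0xEF, 0x12])
MAX_LENGTH = 8191


def add_header(payload: bytes) -> bytes:
    """Prefix payload with the preamble, sync word and length L."""
    if len(payload) > MAX_LENGTH:
        raise ValueError("payload exceeds 8191 bytes")
    length = struct.pack("<H", len(payload))  # assumed little-endian
    return PREAMBLE + SYNC_WORD + length + payload
```
\end{verbatim}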

\section{RF}

\subsection{Overview}

After affixing the header, the packet is ready to be transmitted as RF. The following modulation parameters must be used:

\begin{itemize}
\item Modulation Scheme: 2-FSK
\item Bit rate: 9600 bits/s
\item Deviation: 4.8 kHz
\item Frequency: 433.600 MHz
\end{itemize}

\section{Whisker Types}

\subsection{Identification}

As an example of CATS' extreme flexibility, even the packet's identification exists in a whisker. When using CATS as intended, this whisker will likely always be included. There are some situations where it may not be, however. For example, to save data it could be omitted in some packets, as long as it is included often enough for legal station identification. It could also be omitted if CATS is being used outside of amateur frequencies. \textbf{A valid CATS packet contains a maximum of one identification whisker.}

A callsign is an arbitrary string of UTF-8 up to 254 bytes long. In most cases, it should be much shorter. One additional byte at the end holds the SSID. Since the SSID is a single byte, up to 256 different stations can share a single callsign.

\subsubsection{Structure}

\begin{tabularx}{\textwidth}{|l|l|l|X|}
  \hline
  \textbf{Byte offset} & \textbf{Length} & \textbf{Value} & \textbf{Description} \\
  \hline
  0 & 1 & 0x00 & Whisker Type \\
  \hline
  1 & 1 & $N$ & Whisker Length in bytes \\
  \hline
  2 & $N - 1$ & & Callsign data \\
  \hline
  $N + 1$ & 1 & & SSID \\
  \hline
\end{tabularx}
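
For illustration, an identification whisker could be built in Python like this. The function name \texttt{encode\_identification} is hypothetical.

\begin{verbatim}
```python
def encode_identification(callsign: str, ssid: int) -> bytes:
    """Build an identification whisker: type, length, callsign, SSID."""
    call = callsign.encode("utf-8")
    if len(call) > 254:
        raise ValueError("callsign longer than 254 bytes")
    if not 0 <= ssid <= 255:
        raise ValueError("SSID must fit in one byte")
    body = call + bytes([ssid])           # N = len(callsign) + 1
    return bytes([0x00, len(body)]) + body
```
\end{verbatim}

For callsign VE1ABC with SSID 0, the result is the nine bytes 00 07 56 45 31 41 42 43 00.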

\subsection{Timestamp}

The timestamp whisker records the time at which a CATS packet was generated. It is encoded as a Unix timestamp, that is, the number of seconds since January 1, 1970, in the UTC timezone.

\subsubsection{Structure}

\begin{tabular}{|l|l|l|l|}
  \hline
  \textbf{Byte offset} & \textbf{Length} & \textbf{Value} & \textbf{Description} \\
  \hline
  0 & 1 & 0x01 & Whisker Type \\
  \hline
  1 & 1 & 5 & Whisker Length in bytes \\
  \hline
  2 & 5 & & Current time (Unix timestamp) \\
  \hline
\end{tabular}
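
A small Python sketch of the encoding follows. The function names are hypothetical. The 5-byte value is written little-endian, following the general rule for multi-byte fields.

\begin{verbatim}
```python
def encode_timestamp(unix_seconds: int) -> bytes:
    """Build a timestamp whisker from seconds since the Unix epoch."""
    # to_bytes raises OverflowError if the value is negative or
    # does not fit in 5 bytes.
    return bytes([0x01, 5]) + unix_seconds.to_bytes(5, "little")


def decode_timestamp(whisker: bytes) -> int:
    if len(whisker) != 7 or whisker[0] != 0x01 or whisker[1] != 5:
        raise ValueError("not a valid timestamp whisker")
    return int.from_bytes(whisker[2:], "little")
```
\end{verbatim}

Five bytes hold values up to $2^{40} - 1$ seconds, which is far beyond the range of a 32-bit Unix timestamp.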

\subsection{GPS}

The GPS whisker is used to encode latitude, longitude, altitude, maximum location error, heading, and speed. The latitude, longitude, and altitude define a point in 3D space. The maximum error is specified in meters and defines a sphere around that point; the actual location is somewhere inside this sphere. If the latitude, longitude, and altitude are exact, the error should be 0. \textbf{A valid CATS packet contains a maximum of one GPS whisker.}

\subsubsection{Structure}

{\def\arraystretch{1.3}
\begin{tabularx}{\textwidth}{|l|l|l|X|}
  \hline
  \textbf{Byte offset} & \textbf{Length} & \textbf{Value} & \textbf{Description} \\
  \hline
  0 & 1 & 0x02 & Whisker Type \\
  \hline
  1 & 1 & 14 & Whisker Length in bytes \\
  \hline
  2 & 4 & & Latitude (signed integer, ${2^{31} \over 90 \degree}x$) \\
  \hline
  6 & 4 & & Longitude (signed integer, ${2^{31} \over 180 \degree}x$) \\
  \hline
  10 & 2 & & Altitude (16-bit float, meters) \\
  \hline
  12 & 1 & & Maximum location error (unsigned integer, meters) \\
  \hline
  13 & 1 & & Heading (unsigned integer, radians$* {128\over{\pi}}$). Clockwise relative to north \\
  \hline
  14 & 2 & & Speed (16-bit float, meters per second) \\
  \hline
\end{tabularx}
}
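
The scaling in the table can be sketched with Python's \texttt{struct} module. The function name \texttt{encode\_gps} is hypothetical. The sketch assumes that ``16-bit float'' means IEEE 754 half precision (the \texttt{e} format character). It clamps the scaled latitude and longitude to $2^{31} - 1$, because $+90\degree$ and $+180\degree$ would otherwise overflow a signed 32-bit integer. The standard does not state how that edge case is handled.

\begin{verbatim}
```python
import math
import struct


def encode_gps(lat_deg, lon_deg, alt_m, max_error_m,
               heading_rad, speed_mps):
    """Build a GPS whisker (2 header bytes + 14 data bytes)."""
    lat = min(round(lat_deg * 2**31 / 90), 2**31 - 1)
    lon = min(round(lon_deg * 2**31 / 180), 2**31 - 1)
    # A full turn (2*pi) wraps back to 0.
    heading = round(heading_rad * 128 / math.pi) % 256
    # "<" selects little-endian with no padding between fields.
    return struct.pack("<BBiieBBe", 0x02, 14, lat, lon, alt_m,
                       max_error_m, heading, speed_mps)
```
\end{verbatim}

For example, a latitude of $45\degree$ scales to $2^{30}$, and a heading of $\pi/2$ radians (due east) scales to 64.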

\subsection{Comment}

The comment whisker specifies a textual comment on the CATS packet. The content is an arbitrary byte string. \textbf{A valid CATS packet may contain one or more comment whiskers.} In this case, their content should be concatenated together. This allows for comments longer than 255 bytes. After concatenation, the result must be valid UTF-8. Note that an individual comment whisker may or may not be valid UTF-8.

\subsubsection{Structure}

\begin{tabular}{|l|l|l|l|}
  \hline
  \textbf{Byte offset} & \textbf{Length} & \textbf{Value} & \textbf{Description} \\
  \hline
  0 & 1 & 0x03 & Whisker Type \\
  \hline
  1 & 1 & $N$ & Whisker Length in bytes \\
  \hline
  2 & $N$ & & Comment content \\
  \hline
\end{tabular}
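
As an illustration, a long comment can be split across several whiskers and joined again in Python. The function names are hypothetical. The split is made on byte boundaries, so a multi-byte character may straddle two whiskers, which is why an individual whisker need not be valid UTF-8.

\begin{verbatim}
```python
def encode_comment(text: str) -> bytes:
    """Encode text as one or more comment whiskers of <= 255 bytes."""
    raw = text.encode("utf-8")
    out = bytearray()
    for start in range(0, len(raw), 255):
        chunk = raw[start:start + 255]
        out += bytes([0x03, len(chunk)]) + chunk
    return bytes(out)


def join_comments(payloads: list[bytes]) -> str:
    """Concatenate comment payloads, then validate as UTF-8."""
    # decode() raises UnicodeDecodeError if the result is invalid.
    return b"".join(payloads).decode("utf-8")
```
\end{verbatim}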

\subsection{Route}

The route of a CATS packet is an ordered list of stations which have digipeated the packet. Each station consists of a callsign of arbitrary length, as well as a 1-byte SSID. The callsign must be valid UTF-8. This is identical to what is allowed in an identification whisker. \textbf{A valid CATS packet contains a maximum of one route whisker.}

Each callsign is represented as a sequence of bytes, which must be valid UTF-8. The last byte of the callsign is followed by a single 0xFF byte, which acts as a delimiter. The next byte is the SSID for that callsign. Following that is the beginning of the next callsign. To represent a hop over the Internet, the byte 0xFE is used. This byte is placed where a callsign would otherwise begin, and the byte after it starts the callsign of the following hop. Multiple 0xFE bytes may be used in succession to represent multiple hops over the Internet.

Instead of the 0xFF delimiter, 0xFD may be used. This signifies that the previous callsign is not part of the route yet. The first callsign in the route with a 0xFD delimiter must be the next digipeater. No other digipeaters should repeat the packet. A route may consist of a mixture of 0xFF and 0xFD delimiters, as long as there are no 0xFF delimiters after the first 0xFD delimiter.

A callsign with a 0xFF delimiter is known as a ``past'' hop. A callsign with a 0xFD delimiter is known as a ``future'' hop.

The route whisker also contains a maximum number of hops. This does not include Internet links. The maximum is not decremented as the packet travels. Instead, the number of callsigns in the route whisker is compared to the maximum number of hops to determine if a packet should be digipeated.

When digipeating, the callsign of the digipeater should be added to the route. The CATS packet should not be modified beyond this.

\subsubsection{Structure}

\begin{tabular}{|l|l|l|l|}
  \hline
  \textbf{Byte offset} & \textbf{Length} & \textbf{Value} & \textbf{Description} \\
  \hline
  0 & 1 & 0x04 & Whisker Type \\
  \hline
  1 & 1 & $N$ & Whisker Length in bytes \\
  \hline
  2 & 1 & & Maximum allowable digipeats \\
  \hline
  3 & $N - 1$ & & Callsigns and SSIDs \\
  \hline
\end{tabular}

\subsubsection{Examples}

\begin{tabularx}{\textwidth}{|X|l|X|l|}
  \hline
  \textbf{Route} & \textbf{Max \# hops} & \textbf{Whisker Data (Hex)} & \textbf{Should digipeat?} \\
  \hline
  VE1ABC\~{}0 \break VE2DEF\~{}234 \break VE3XYZ\~{}14 & 4 & 04 19 04 56 45 31 41 42 43 FF 00 56 45 32 44 45 46 FF EA 56 45 33 58 59 5A FF 0E & Yes  \\
  \hline
  VE1ABC\~{}0 \break [FELINET] \break VE2DEF\~{}234 \break [FELINET] \break VE3XYZ\~{}14 & 3 & 04 1B 03 56 45 31 41 42 43 FF 00 FE 56 45 32 44 45 46 FF EA FE 56 45 33 58 59 5A FF 0E & Yes \\
  \hline
  VE1ABC\~{}0 \break [FELINET] \break [FELINET] & 0 & 04 0B 00 56 45 31 41 42 43 FF 00 FE FE & No \\
  \hline
\end{tabularx}
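
The delimiter rules can be sketched as a small Python parser. The name \texttt{parse\_route} and the tuple layout it returns are hypothetical. The input is the whisker data only, starting at the maximum-digipeats byte, without the type and length bytes. The sketch does not decide whether a packet should be digipeated.

\begin{verbatim}
```python
INTERNET, PAST, FUTURE = 0xFE, 0xFF, 0xFD


def parse_route(payload: bytes):
    """Parse route whisker data: a max-hops byte, then the hop list."""
    if not payload:
        raise ValueError("route data is empty")
    max_hops, hops, pos, seen_future = payload[0], [], 1, False
    while pos < len(payload):
        if payload[pos] == INTERNET:
            hops.append(("internet",))
            pos += 1
            continue
        end = pos
        while end < len(payload) and payload[end] not in (PAST, FUTURE):
            end += 1
        if end + 1 >= len(payload):
            raise ValueError("missing delimiter or SSID")
        if payload[end] == FUTURE:
            seen_future = True
        elif seen_future:
            raise ValueError("past hop after a future hop")
        kind = "future" if payload[end] == FUTURE else "past"
        # decode() raises if the callsign is not valid UTF-8.
        callsign = payload[pos:end].decode("utf-8")
        hops.append((kind, callsign, payload[end + 1]))
        pos = end + 2
    return max_hops, hops
```
\end{verbatim}

The bytes 0xFD, 0xFE and 0xFF never occur in valid UTF-8, so a delimiter cannot be confused with part of a callsign.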

\subsection{Destination}

CATS packets can optionally have one or more destinations. This can be useful for e.g. sending a message to another amateur radio operator, or for communicating with a service. The destination consists of a UTF-8 callsign and an SSID byte. \textbf{A valid CATS packet may contain zero or more destination whiskers.}

The destination whisker also allows requesting an acknowledgement from the station, to confirm that the packet was received successfully. This is specified in the acknowledgement byte. If the byte is 0, no acknowledgement is requested. Otherwise, an acknowledgement ID is specified in the 7 least significant bits. The MSB must be cleared. To acknowledge, a CATS packet must be crafted with a destination of the original station's callsign. The acknowledgement byte must have the same acknowledgement ID, but with the MSB set. The MSB is how acknowledgement requests are differentiated from acknowledgement responses.

\subsubsection{Structure}

\begin{tabular}{|l|l|l|l|}
  \hline
  \textbf{Byte offset} & \textbf{Length} & \textbf{Value} & \textbf{Description} \\
  \hline
  0 & 1 & 0x05 & Whisker Type \\
  \hline
  1 & 1 & $N$ & Whisker Length in bytes \\
  \hline
  2 & 1 & & Acknowledgement byte \\
  \hline
  3 & $N - 2$ & & UTF-8 callsign \\
  \hline
  $N + 1$ & 1 & & SSID \\
  \hline
\end{tabular}
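
The acknowledgement byte and the whisker layout can be sketched in Python. All function names are hypothetical. The standard does not say how a byte of 0x80 (MSB set with an ID of zero) should be treated, so the sketch simply reports it as a response with ID 0.

\begin{verbatim}
```python
def ack_response(request_byte: int) -> int:
    """Return the byte that acknowledges the given request byte."""
    if request_byte == 0 or request_byte & 0x80:
        raise ValueError("not an acknowledgement request")
    return request_byte | 0x80          # same ID, MSB set


def describe_ack(byte: int) -> tuple[str, int]:
    """Classify an acknowledgement byte as (kind, ID)."""
    if byte == 0:
        return ("none", 0)
    kind = "response" if byte & 0x80 else "request"
    return (kind, byte & 0x7F)


def encode_destination(callsign: str, ssid: int,
                       ack_byte: int = 0) -> bytes:
    """Build a destination whisker: ack byte, callsign, SSID."""
    body = bytes([ack_byte]) + callsign.encode("utf-8") + bytes([ssid])
    if len(body) > 255:
        raise ValueError("destination whisker too long")
    return bytes([0x05, len(body)]) + body
```
\end{verbatim}

For example, a request with ID 5 is the byte 0x05, and the matching response is 0x85.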

\subsection{Arbitrary}

The arbitrary whisker is a special type of whisker. Its content is intentionally not understood by the CATS standard. Instead, any content can be encoded. Most stations will ignore this content. This is also useful if the same packet is to be sent multiple times. To prevent de-duplication by receiving nodes, a different arbitrary whisker can be affixed to each one. \textbf{A valid CATS packet may contain zero or more arbitrary whiskers.}

\begin{tabular}{|l|l|l|l|}
  \hline
  \textbf{Byte offset} & \textbf{Length} & \textbf{Value} & \textbf{Description} \\
  \hline
  0 & 1 & 0x06 & Whisker Type \\
  \hline
  1 & 1 & $N$ & Whisker Length in bytes \\
  \hline
  2 & $N$ & & Data \\
  \hline
\end{tabular}

\section{FELINET}

\subsection{Introduction}

FELINET is the Internet counterpart of the CATS standard. Internet gates are responsible for sending received packets to FELINET. Packets received over FELINET can be gated back to RF. Although not part of the standard, gateways exist to move packets between FELINET and APRS-IS. Not all CATS packets can be represented in APRS, so it is preferable to use FELINET when possible.

FELINET is composed of one or more servers. Each server has a list of zero or more upstream servers, to which it forwards received CATS packets. Upstream servers push packets to the downstream servers over the same link. Internet gates, or IGates, connect to servers in the same fashion. From the perspective of an upstream server, there is no difference between an IGate and a downstream server.

A FELINET server must discard any received packets that are bitwise identical to any packets it has seen in the previous 10 seconds. This prevents unintentional loops from repeating packets indefinitely and bogging down the network. To prepare raw data for FELINET, it must go through the Whiskers and CRC section of The Pipeline. In other words, CATS packets are gated to the FELINET network before the LDPC section of The Pipeline.
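
One possible way to implement the 10-second duplicate rule is sketched below in Python. The class name \texttt{DuplicateFilter} and its interface are hypothetical. The sketch assumes that a discarded duplicate does not restart the 10-second window, which the text leaves open. The clock is injectable so that the behaviour can be tested without waiting.

\begin{verbatim}
```python
import time


class DuplicateFilter:
    """Remember recently seen packets and reject exact repeats."""

    def __init__(self, window_seconds=10.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self.seen = {}                  # packet bytes -> time accepted

    def accept(self, packet: bytes) -> bool:
        """Return True if the packet should be forwarded."""
        now = self.clock()
        # Forget packets that fell out of the window.
        self.seen = {p: t for p, t in self.seen.items()
                     if now - t < self.window}
        if packet in self.seen:
            return False
        self.seen[packet] = now
        return True
```
\end{verbatim}

A production server would probably store a hash of each packet rather than the packet itself, but the comparison must still be equivalent to a bitwise match.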


\end{document}