May 15, 2019

TCP socket send buffer deep dive




A typical TCP socket send buffer is composed of three parts: unacked-bytes, unsent-bytes, and free-buffer.
                               +----------------+
                               |                |
                               |                |
                               |  FREE BUFFER   |
                               |                |
                               |                |
                               +----------------+
                               |                |
                               |  UNSENT BYTES  |
                               |                |
                               +----------------+
                               |                |
                               |  UNACKED BYTES |
                               |                |
                               +----------------+

Total send buffer size

Total send buffer size = unacked-bytes + unsent-bytes + free-buffer.  
It can be obtained using the SO_SNDBUF socket option. The OS may resize the buffer dynamically as it sees fit. This works on both Linux and macOS.
        int sndbufsiz;
        socklen_t slen = sizeof(sndbufsiz);
        int err = getsockopt(sd, SOL_SOCKET, SO_SNDBUF, &sndbufsiz, &slen);

Total in-flight bytes

Total in-flight bytes = unacked-bytes + unsent-bytes. 
It can be obtained using the SO_NWRITE socket option on macOS, and the SIOCOUTQ ioctl on Linux.
int get_socket_used(int sd){
    int err;
    int used;
#ifdef __APPLE__
    socklen_t slen = sizeof(used);
    err = getsockopt(sd, SOL_SOCKET, SO_NWRITE, &used, &slen);
    if(err < 0) {
        perror("getsockopt SO_NWRITE");
        exit(1);
    }
#else
    err = ioctl(sd, SIOCOUTQ, &used);
    if(err < 0) {
        perror("ioctl SIOCOUTQ");
        exit(1);
    }
#endif
    return used;
}
On macOS, it seems that this can also be obtained using the TCP_INFO struct, but it is a private API.
u_int32_t       tcpi_snd_sbbytes;       /* bytes in snd buffer including data inflight */
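
The third region, free-buffer, has no query of its own in the calls above, but a rough figure follows from total = unacked-bytes + unsent-bytes + free-buffer. Below is a minimal sketch of that subtraction. The function name get_socket_free_estimate is made up for this example. Treat the result as an estimate only: on Linux the value reported by SO_SNDBUF also covers kernel bookkeeping overhead, so not all of it is usable for payload.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#ifndef __APPLE__
#include <linux/sockios.h>   /* SIOCOUTQ */
#endif

/* Hypothetical helper: estimate of the free send-buffer space in bytes,
 * or -1 on error. Computed as SO_SNDBUF minus the in-flight bytes. */
int get_socket_free_estimate(int sd)
{
    int total = 0;
    int used = 0;
    socklen_t slen = sizeof(total);

    if (getsockopt(sd, SOL_SOCKET, SO_SNDBUF, &total, &slen) < 0)
        return -1;
#ifdef __APPLE__
    slen = sizeof(used);
    if (getsockopt(sd, SOL_SOCKET, SO_NWRITE, &used, &slen) < 0)
        return -1;
#else
    if (ioctl(sd, SIOCOUTQ, &used) < 0)
        return -1;
#endif
    /* The two values are read at different times, so clamp at zero. */
    return total > used ? total - used : 0;
}
```

On a freshly created socket nothing is queued, so the estimate equals the SO_SNDBUF value.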

unacked-bytes

On Linux, unacked data can be obtained from the TCP_INFO structure, but the result (tcpi_unacked) is the number of segments, not bytes. On macOS, TCP_INFO seems to contain this information in bytes (private API).
int get_socket_unacked(int sd){
    struct tcp_info tcp_info;
    socklen_t tcp_info_length = sizeof(tcp_info);
    if ( getsockopt(sd, IPPROTO_TCP, TCP_INFO, (void *)&tcp_info, &tcp_info_length ) == 0 ) {
        return tcp_info.tcpi_unacked;
    }
    return 0;
}

//For macOS, use TCP_INFO
    u_int64_t       tcpi_txunacked __attribute__((aligned(8)));    /* current number of bytes not acknowledged */
macOS tcp_info definition

unsent-bytes (not including un-acked bytes)

On Linux, unsent-bytes can be obtained from the tcpi_notsent_bytes field of the TCP_INFO structure. Note that this requires kernel 4.6 or newer, which for Android means Android 8 or newer. On Linux, it can also be obtained using the SIOCOUTQNSD ioctl. It’s not clear how to do this on macOS.
//defined in /usr/include/linux/sockios.h 
int get_socket_unsent(int sd){
    int err;
    int unsent;
    err = ioctl(sd, SIOCOUTQNSD, &unsent);
    if(err < 0) {
        perror("ioctl SIOCOUTQNSD");
        exit(1);
    }
    return unsent;
}

//OR 

int get_socket_unsent(int sd){
    struct tcp_info tcp_info;
    socklen_t tcp_info_length = sizeof(tcp_info);
    if ( getsockopt(sd, IPPROTO_TCP, TCP_INFO, (void *)&tcp_info, &tcp_info_length ) == 0 ) {
        return tcp_info.tcpi_notsent_bytes;
    }
    return 0;
}

Stackoverflow discussion on getting unsent bytes

epoll and kevent

epoll on Linux and kevent on macOS are triggered by unsent-bytes when the TCP_NOTSENT_LOWAT socket option is set. On macOS, kevent() doesn’t report the socket as writable until the unsent TCP data drops below the specified threshold (typically 8 kilobytes).
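
For reference, setting the threshold is a single setsockopt call at the IPPROTO_TCP level. Below is a minimal sketch. The helper name and the 16384-byte value used in the note after it are arbitrary choices for illustration, not recommendations.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_NOTSENT_LOWAT */
#include <sys/socket.h>

/* Hypothetical helper: set the unsent-bytes low-water mark.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int set_notsent_lowat(int sd, int bytes)
{
    return setsockopt(sd, IPPROTO_TCP, TCP_NOTSENT_LOWAT,
                      &bytes, sizeof(bytes));
}
```

Calling set_notsent_lowat(sd, 16384) before registering the socket with epoll or kevent should make write readiness follow the unsent-bytes threshold described above.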

May 9, 2019

Enable user-id based packet routing on Mac OS


If you would like all socket (TCP/UDP) traffic from processes run by a particular user on macOS to be routed differently, you can do that with a pf rule.

1. Add the user to your Mac if not already done. In this example, I will add a user named "test1"
2. run the command:
        sudo vi /private/etc/pf.conf
    and add the following line before the line 'anchor "com.apple/*"'
         pass out quick on en0 route-to { utun4 192.168.15.2 } user test1

   Note:
   a) change en0 to your default network interface name on Mac
   b) change utun4 to the network interface you would like these packets to be routed to

3. restart pf by doing:
    sudo pfctl -d; sudo pfctl -e -f /etc/pf.conf

Now traffic from all processes run by user test1 should be routed to the new interface as specified.

February 26, 2019

keep ssh running in background for tunneling

1. write the following script to a file named "tunnel.sh" and make it executable (make sure the user has public-key authentication enabled on the remote host):

while true; do ssh -t -n -R 127.0.0.1:2233:127.0.0.1:22 user@remote.host.com "while true; do ps -ef; sleep 1; done" ; sleep 1; done

2.  Run the above script in a detached screen session:
    screen -S tunnel -d -m /path/to/tunnel.sh


 That's all. This creates a background screen session, which runs the tunnel.sh script, which loops an ssh command to keep it up and running.

September 28, 2018

July 28, 2018

How to use Netlink to get ipv6 neighbors

1. open netlink socket
2. send request to get neighbors
3. parse the response


The response is in the following format
[ netlink msg ] ... [ netlink msg ]


Use the macros defined in "man 3 netlink" to iterate through the messages. Each netlink message starts with a header, "struct nlmsghdr", defined in "man 7 netlink".

The netlink message for getting ipv6 neighbors is defined in "man 7 rtnetlink"

For an ipv6 neighbors message, each netlink msg block has the following format:
  netlink-msg-header (struct nlmsghdr) | neighbor-discovery-msg-header (struct ndmsg) | route-attribute (struct rtattr) | more route-attributes...


Each route-attribute has the following format:
  header (struct rtattr) | data


The types of rtattr are defined in /usr/include/linux/neighbour.h. The common ones are:
  NDA_DST: data is IP address
  NDA_LLADDR: data is MAC address
  NDA_CACHEINFO: data is struct nda_cacheinfo
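
The three steps above (open the socket, send the request, walk the response) can be sketched roughly as follows. This is a minimal Linux-only example under a few assumptions: the function name count_ipv6_neighbors is made up, the 16 KB receive buffer is an arbitrary size, and error handling is reduced to returning -1. It only counts the entries and prints the interface index and state. Attribute parsing is left out.

```c
#include <linux/neighbour.h>
#include <linux/rtnetlink.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of IPv6 neighbor entries
 * reported by the kernel, or -1 on error. */
int count_ipv6_neighbors(void)
{
    /* Step 1: open a routing netlink socket. */
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;

    /* Step 2: ask the kernel to dump the IPv6 neighbor table. */
    struct {
        struct nlmsghdr nlh;
        struct ndmsg ndm;
    } req;
    memset(&req, 0, sizeof(req));
    req.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ndmsg));
    req.nlh.nlmsg_type = RTM_GETNEIGH;
    req.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req.ndm.ndm_family = AF_INET6;
    if (send(fd, &req, req.nlh.nlmsg_len, 0) < 0) {
        close(fd);
        return -1;
    }

    /* Step 3: read replies until NLMSG_DONE, walking each message. */
    struct nlmsghdr buf[16384 / sizeof(struct nlmsghdr)];
    int count = 0;
    int done = 0;
    while (!done) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n <= 0) {
            count = -1;
            break;
        }
        int len = (int)n;
        for (struct nlmsghdr *nh = buf; NLMSG_OK(nh, len);
             nh = NLMSG_NEXT(nh, len)) {
            if (nh->nlmsg_type == NLMSG_DONE) {
                done = 1;
                break;
            }
            if (nh->nlmsg_type == NLMSG_ERROR) {
                count = -1;
                done = 1;
                break;
            }
            if (nh->nlmsg_type == RTM_NEWNEIGH) {
                struct ndmsg *ndm = NLMSG_DATA(nh);
                printf("neighbor on ifindex %d, state 0x%02x\n",
                       ndm->ndm_ifindex, (unsigned)ndm->ndm_state);
                count++;
            }
        }
    }
    close(fd);
    return count;
}
```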



Example NetLink payload parsing (not showing nlmsghdr):

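0a 00 00 00  : ndm_family (0x0a = AF_INET6) + 3 bytes of padding
02 00 00 00  : ndm_ifindex
04 00  : ndm_state (0x04 = NUD_STALE)
00 : ndm_flags
01 : ndm_type


14 00 : rtattr len (20)
01 00 : type (NDA_DST)
fe 80 00 00 00 00 00 00 b6 ef fa ff fe d0 fe 76 : IPv6 address


0a 00 : rtattr len (10)
02 00 : type (NDA_LLADDR)
b4 ef fa d0 fe 76 : MAC address


00 00 : padding to a 4-byte boundary

08 00 : rtattr len (8)
04 00 : type (NDA_PROBES)
00 00 00 00 : probe count


14 00 : rtattr len (20)
03 00 : type (NDA_CACHEINFO)
6b a8 e8 01 fb 90 e8 01 : struct nda_cacheinfo: ndm_confirmed, ndm_used
6b b9 69 00 00 00 00 00 : struct nda_cacheinfo: ndm_updated, ndm_refcnt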
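
The same walk over the payload can be done in code with the RTA_* macros from "man 3 rtnetlink". Below is a minimal sketch that pulls the IP and MAC address out of a payload like the one above. struct neigh_entry and parse_neigh_payload are names made up for this example. The sketch assumes the buffer starts at the struct ndmsg (right after the nlmsghdr) and is 4-byte aligned, as it is inside a real netlink message.

```c
#include <arpa/inet.h>
#include <linux/neighbour.h>
#include <linux/rtnetlink.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical result holder, not part of any kernel API. */
struct neigh_entry {
    int ifindex;
    unsigned state;
    char ip[INET6_ADDRSTRLEN];
    char mac[18];               /* "xx:xx:xx:xx:xx:xx" plus NUL */
};

/* Parse one neighbor payload: struct ndmsg followed by rtattrs.
 * Returns 0 on success, -1 if the buffer is too short. */
int parse_neigh_payload(const void *payload, int len, struct neigh_entry *out)
{
    const struct ndmsg *ndm = payload;
    if (len < (int)sizeof(*ndm))
        return -1;

    memset(out, 0, sizeof(*out));
    out->ifindex = ndm->ndm_ifindex;
    out->state = ndm->ndm_state;

    /* The attributes start right after the (aligned) ndmsg header. */
    int off = NLMSG_ALIGN(sizeof(*ndm));
    struct rtattr *rta = (struct rtattr *)((char *)payload + off);
    int attrlen = len - off;

    /* RTA_NEXT skips the padding that follows an unaligned attribute. */
    for (; RTA_OK(rta, attrlen); rta = RTA_NEXT(rta, attrlen)) {
        const unsigned char *d = RTA_DATA(rta);
        if (rta->rta_type == NDA_DST && ndm->ndm_family == AF_INET6 &&
            RTA_PAYLOAD(rta) == 16) {
            inet_ntop(AF_INET6, d, out->ip, sizeof(out->ip));
        } else if (rta->rta_type == NDA_LLADDR && RTA_PAYLOAD(rta) == 6) {
            snprintf(out->mac, sizeof(out->mac),
                     "%02x:%02x:%02x:%02x:%02x:%02x",
                     d[0], d[1], d[2], d[3], d[4], d[5]);
        }
        /* Other attribute types are ignored here. */
    }
    return 0;
}
```

In a real receive loop, the payload pointer would come from NLMSG_DATA(nh) and the length from NLMSG_PAYLOAD(nh, 0) for each RTM_NEWNEIGH message.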

March 31, 2018

CAVEAT: golang uint to string conversion

In golang, you can convert a "uint8" value to "string".

Be aware that
  if the uint8 x <= 127, string(x) is the ASCII char of the value x,
  if the uint8 x >= 128, string(x) is 2 bytes long, because the value is treated as a Unicode code point and encoded as UTF-8