Ake Koomsin

Digging Around FreeBSD Socket API TCP Send Path

When we want to send data across the network, we use a socket together with one of the following system calls:

  • write()/writev()
  • send()/sendto()/sendmsg()

To use write(), writev() and send(), the socket must be connected. On the other hand, sendto() and sendmsg() can be used on both connected and unconnected sockets.
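
To make this concrete, here is a minimal user-space sketch (error handling omitted; the address and port below are placeholders) showing both styles: write() and send() on a connected TCP socket, and sendto() on an unconnected UDP socket.

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    struct sockaddr_in dst;
    const char msg[] = "hello";

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(12345);
    inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr);

    /* Connected TCP socket: write() and send() both work. */
    int tcp = socket(AF_INET, SOCK_STREAM, 0);
    connect(tcp, (struct sockaddr *)&dst, sizeof(dst));
    write(tcp, msg, sizeof(msg));
    send(tcp, msg, sizeof(msg), 0);
    close(tcp);

    /* Unconnected UDP socket: sendto() carries the destination itself. */
    int udp = socket(AF_INET, SOCK_DGRAM, 0);
    sendto(udp, msg, sizeof(msg), 0, (struct sockaddr *)&dst, sizeof(dst));
    close(udp);

    return (0);
}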

We can normally find the implementation of a system call in the kernel by adding the ‘sys_’ prefix to its name. send() is the exception: in FreeBSD libc, send() is just a wrapper around sendto() with some default parameters.
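
Here is a hedged sketch of that idea (not the verbatim libc source): a connected socket already knows where its data goes, so send() can simply forward to sendto() with a NULL destination address.

#include <sys/types.h>
#include <sys/socket.h>

/*
 * Sketch of send() as a sendto() wrapper.  Named my_send here to avoid
 * shadowing the real libc symbol; the real wrapper lives in libc.
 */
ssize_t
my_send(int s, const void *msg, size_t len, int flags)
{
    return (sendto(s, msg, len, flags, NULL, 0));
}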

Let’s take a look at sys_write().

In sys/kern/sys_generic.c
int
sys_write(td, uap)
    struct thread *td;
  struct write_args *uap;
{
  struct uio auio;
  struct iovec aiov;
  int error;

  if (uap->nbyte > IOSIZE_MAX)
      return (EINVAL);
  aiov.iov_base = (void *)(uintptr_t)uap->buf;
  aiov.iov_len = uap->nbyte;
  auio.uio_iov = &aiov;
  auio.uio_iovcnt = 1;
  auio.uio_resid = uap->nbyte;
  auio.uio_segflg = UIO_USERSPACE;
  error = kern_writev(td, uap->fd, &auio);
  return(error);
}

write() is just a special form of writev(): it eventually calls kern_writev(). Taking a look at kern_writev() reveals an interesting code pattern.

In sys/kern/sys_generic.c
int
kern_writev(struct thread *td, int fd, struct uio *auio)
{
    struct file *fp;
    cap_rights_t rights;
    int error;

    error = fget_write(td, fd, cap_rights_init(&rights, CAP_WRITE), &fp);
    if (error)
        return (error);
    error = dofilewrite(td, fd, fp, auio, (off_t)-1, 0);
    fdrop(fp, td);
    return (error);
}

fget_write() is called to verify that we have permission to perform the write operation and to get a pointer to the file associated with the file descriptor. After this function call, the reference count of the file is increased. That is why, at the end, the fdrop() macro is called to decrease the reference count and perform some cleanup if necessary.

The write operation happens when dofilewrite() is called.

In sys/kern/sys_generic.c
static int
dofilewrite(td, fd, fp, auio, offset, flags)
    struct thread *td;
    int fd;
    struct file *fp;
    struct uio *auio;
    off_t offset;
    int flags;
{
      ...
    if ((error = fo_write(fp, auio, td->td_ucred, flags, td))) {
        ...
    }
      ...
}

dofilewrite() performs some verification and dispatches the write operation to an appropriate function by calling fo_write(). The actual write function depends on the type of the file.

In sys/sys/file.h
static __inline int
fo_write(struct file *fp, struct uio *uio, struct ucred *active_cred,
    int flags, struct thread *td)
{

    return ((*fp->f_ops->fo_write)(fp, uio, active_cred, flags, td));
}

For sockets, the file operations are defined as follows.

In sys/kern/sys_socket.c
struct fileops    socketops = {
    .fo_read = soo_read,
    .fo_write = soo_write,
    .fo_truncate = soo_truncate,
    .fo_ioctl = soo_ioctl,
    .fo_poll = soo_poll,
    .fo_kqfilter = soo_kqfilter,
    .fo_stat = soo_stat,
    .fo_close = soo_close,
    .fo_chmod = invfo_chmod,
    .fo_chown = invfo_chown,
    .fo_sendfile = invfo_sendfile,
    .fo_flags = DFLAG_PASSABLE
};

That means the actual write function is soo_write().

In sys/kern/sys_socket.c
int
soo_write(struct file *fp, struct uio *uio, struct ucred *active_cred,
    int flags, struct thread *td)
{
    struct socket *so = fp->f_data;
    int error;

#ifdef MAC
    error = mac_socket_check_send(active_cred, so);
    if (error)
        return (error);
#endif
    error = sosend(so, 0, uio, 0, 0, 0, uio->uio_td);
    if (error == EPIPE && (so->so_options & SO_NOSIGPIPE) == 0) {
        PROC_LOCK(uio->uio_td->td_proc);
        tdsignal(uio->uio_td, SIGPIPE);
        PROC_UNLOCK(uio->uio_td->td_proc);
    }
    return (error);
}

Again, soo_write() performs some necessary verification and calls sosend().

Before we go further, let’s take a look at sys_sendto() to see how it differs from the normal write() system call.

In sys/kern/uipc_syscalls.c
int
sys_sendto(td, uap)
    struct thread *td;
    struct sendto_args /* {
        int s;
        caddr_t buf;
        size_t  len;
        int flags;
        caddr_t to;
        int tolen;
    } */ *uap;
{
    struct msghdr msg;
    struct iovec aiov;

    msg.msg_name = uap->to;
    msg.msg_namelen = uap->tolen;
    msg.msg_iov = &aiov;
    msg.msg_iovlen = 1;
    msg.msg_control = 0;
#ifdef COMPAT_OLDSOCK
    msg.msg_flags = 0;
#endif
    aiov.iov_base = uap->buf;
    aiov.iov_len = uap->len;
    return (sendit(td, uap->s, &msg, uap->flags));
}

sys_sendmsg() is similar to sys_sendto(); they just handle the arguments differently. They both call sendit() at the end. sendit() performs some checks and calls kern_sendit(), which eventually calls sosend().

In sys/kern/uipc_syscalls.c
static int
sendit(td, s, mp, flags)
    struct thread *td;
    int s;
    struct msghdr *mp;
    int flags;
{
    ...
    error = kern_sendit(td, s, mp, flags, control, UIO_USERSPACE);
    ...
}

int
kern_sendit(td, s, mp, flags, control, segflg)
    struct thread *td;
    int s;
    struct msghdr *mp;
    int flags;
    struct mbuf *control;
    enum uio_seg segflg;
{
    ...
    error = sosend(so, mp->msg_name, &auio, 0, control, flags, td);
    ...
}

sosend() is basically a wrapper around the function pointed to by pru_sosend.

In sys/kern/uipc_socket.c
int
sosend(struct socket *so, struct sockaddr *addr, struct uio *uio,
    struct mbuf *top, struct mbuf *control, int flags, struct thread *td)
{
    int error;

    CURVNET_SET(so->so_vnet);
    error = so->so_proto->pr_usrreqs->pru_sosend(so, addr, uio, top,
        control, flags, td);
    CURVNET_RESTORE();
    return (error);
}

At this point, the actual function depends on the protocol. In the case of TCP, pru_sosend points to sosend_generic() (this is the default value; UDP has its own sosend_dgram()).
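
For reference, this wiring lives in the protocol’s pr_usrreqs table. The abridged sketch below is from memory, not a verbatim copy of sys/netinet/tcp_usrreq.c (most members are omitted); it shows the two entries we care about here.

/* Abridged sketch of the TCP user-request table; most members omitted. */
struct pr_usrreqs tcp_usrreqs = {
    /* ... */
    .pru_send =   tcp_usr_send,
    .pru_sosend = sosend_generic,
    /* ... */
};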

In sys/kern/uipc_socket.c
int
sosend_generic(struct socket *so, struct sockaddr *addr, struct uio *uio,
    struct mbuf *top, struct mbuf *control, int flags, struct thread *td)
{
    long space;
    ssize_t resid;
    int clen = 0, error, dontroute;
    int atomic = sosendallatonce(so) || top;

    ...
            /*
             * XXX all the SBS_CANTSENDMORE checks previously
             * done could be out of date.  We could have recieved
             * a reset packet in an interrupt or maybe we slept
             * while doing page faults in uiomove() etc.  We
             * could probably recheck again inside the locking
             * protection here, but there are probably other
             * places that this also happens.  We must rethink
             * this.
             */
            VNET_SO_ASSERT(so);
            error = (*so->so_proto->pr_usrreqs->pru_send)(so,
                (flags & MSG_OOB) ? PRUS_OOB :
     ...
}

At some point in sosend_generic(), the pru_send function pointer is called. For TCP, this points to tcp_usr_send().

In sys/netinet/tcp_usrreq.c
static int
tcp_usr_send(struct socket *so, int flags, struct mbuf *m,
    struct sockaddr *nam, struct mbuf *control, struct thread *td)
{
    ...
        tp->snd_up = tp->snd_una + so->so_snd.sb_cc;
        tp->t_flags |= TF_FORCEDATA;
        error = tcp_output(tp);
        tp->t_flags &= ~TF_FORCEDATA;
    }
    ...
}

The data is passed to tcp_output(), which figures out what should be sent and hands it to the lower layer via ip_output().

In sys/netinet/tcp_output.c
int
tcp_output(struct tcpcb *tp)
{
    ...
        TCP_PROBE5(send, NULL, tp, ip, tp, th);

        error = ip_output(m, tp->t_inpcb->inp_options, &ro,
            ((so->so_options & SO_DONTROUTE) ? IP_ROUTETOIF : 0), 0,
            tp->t_inpcb);
    ...
}

Eventually, the data in the mbuf is passed to the device driver through the ifp->if_output function pointer.

In sys/netinet/ip_output.c
int
ip_output(struct mbuf *m, struct mbuf *opt, struct route *ro, int flags,
    struct ip_moptions *imo, struct inpcb *inp)
{
    ...
        /*
         * Reset layer specific mbuf flags
         * to avoid confusing lower layers.
         */
        m_clrprotoflags(m);
        IP_PROBE(send, NULL, NULL, ip, ifp, ip, NULL);
        error = (*ifp->if_output)(ifp, m,
            (const struct sockaddr *)gw, ro);
        goto done;
    ...
}

In conclusion, sending data across the network is not a trivial task. Data from a user application passes through a series of layers, each with its own responsibility: write()/sendto() → kern_writev()/kern_sendit() → sosend() → sosend_generic() → tcp_usr_send() → tcp_output() → ip_output() → ifp->if_output.

Thought on Learning Kanji With Remembering the Kanji

First of all, I would like to apologize that I haven’t updated this blog for a very long time. After I graduated in March, I took a little break and started a full-time job in April. I worked on a lot of interesting projects, but I cannot share what I worked on because I feel it is a company secret. In May, I got a notification of acceptance from the University of Tsukuba as a scholarship research student. Therefore, I quit working in June to prepare for studying Japanese.

Everyone who studies the Japanese language knows that one of its most troublesome aspects, apart from formality, is Kanji. Kanji is derived from Chinese characters. For more information about the history of Kanji, please visit this blog; the author knows how to make it fun. There are more than 3000 Kanji characters as far as I know. Fortunately, perhaps, only about 2200 are considered “Frequently Used Kanji” (Jouyou Kanji, 常用漢字). The Kanji you meet in daily life is fewer than that, but it is still a lot. It is no wonder that Japanese students spend 12 years in school learning all the necessary Kanji.

Things you will encounter when you learn Kanji

To me, there are 3 things that you will deal with when you learn Kanji:

Reading – Most of the time, there are 2 ways of reading a Kanji: Kun-yomi (訓読み) and On-yomi (音読み). Kun-yomi is the Japanese style of reading while On-yomi is the Chinese style. Therefore, in most cases, there will be at least 2 sounds that you have to know. The sound may also change according to the surrounding characters, and some characters share the same reading.

Meaning – Each character has at least one meaning of its own. Well, you have to remember them; there is no way around it. Some characters are synonymous, which can be a little confusing.

Writing – This is the most discouraging part. Each Kanji character has its own shape with a different number of strokes. It is very common to forget how to write a character; even native speakers sometimes have this problem.

As you can see, Kanji alone involves a bunch of things to learn. It is considered difficult in the sense that there is a lot of it and you have to be consistent in learning and reviewing. Self-discipline is hard, you know. This brings us to the question: how should we learn it?

Traditional way of learning Kanji for foreigners

If you go to a Japanese language school, the order in which Kanji are taught is based on the Japanese-Language Proficiency Test (JLPT). In other words, Kanji are taught based on frequency and the complexity of the meaning. You will probably be taught how to read each Kanji, including some of its compounds, and be assigned to write each one hundreds of times in the belief that you will then remember how to write it.

In addition, some Kanji characters may be taught as pictographs, transforming a picture into a character. Unfortunately, not all Kanji can be learned this way.

This approach is good in the sense that you learn what is necessary for real life early. In my opinion, it suits those who are busy. Teaching based on frequency and meaning complexity lets you access Japanese texts faster.

However, as the number of Kanji grows, it is undoubtedly easy to forget the meanings and, especially, the writing. Those who have studied Japanese will be familiar with this experience; it has hindered many Japanese learners greatly. Reading, however, is less of a problem. As mentioned earlier, the reading of a Kanji often depends on the surrounding characters, and learners usually get used to it eventually.

I want to point out that we are unlike native speakers, who have seen these Kanji characters since birth and use them constantly. The way language schools teach is almost identical to how native Japanese students learn in their schools; 12 years, remember!

This probably means that this approach is not efficient for us, foreigners.

Alternative approach

Fortunately, another way to learn Kanji exists: the “mnemonic approach”. The fundamental idea is to break a Kanji character into smaller parts called “primitives” or “radicals”, depending on which term you prefer (I will go with radical). Those radicals are named and are used to construct a “story”, with your imagination, to help you remember the Kanji.

I decided to go with this approach using one of the most controversial books, “Remembering the Kanji 1”.

Remembering the Kanji (RTK)

“Remembering the Kanji” is a series of 3 books for studying Kanji written by James W. Heisig. The first book is the most popular. In its 6th edition, it covers 2200 Kanji characters along with the learning approach. The second book teaches you how to read those Kanji. The last book covers an additional 965 Kanji characters for advanced learners. This blog post is about “Remembering the Kanji 1” (RTK1).

In the introduction of RTK1, Heisig explains his motivation and why the mnemonic approach is better; you should not skip this part. Each Kanji is assigned a unique keyword without any pronunciation (you will learn that in RTK2). The author believes that separating writing and meaning from reading is more efficient and makes Kanji easier to study. Heisig organizes the book into 3 parts: stories, plots and elements. The first part gives you radicals and shows you how to remember the writing and meaning with them; note that a Kanji character itself can also serve as a radical. The purpose of the first part is to train the reader to create and appreciate stories. The second part gives you only a plot, leaving the full story as an exercise. The last part gives you only the elements of each Kanji; it is the reader’s job to fill in the story.

Thought on Remembering the Kanji 1

TL;DR It really works if you understand the point of the book.

Well, “Remembering the Kanji 1” can be an awesome book or an overpriced, worthless one, depending on whether you understand its point.

The keywords and the names of radicals can be very difficult, especially for a non-native English speaker like me. During my study, I always needed an English dictionary. It is utterly mysterious how Heisig came up with some of those keywords, and a few keywords are incorrect. If you judge the book by these, it is probably worthless to you. However, these are not the main point the book tries to teach you.

The point of the book is simply to make you get used to those Kanji and to establish a foundation for learning new Kanji in the future. The implication is that learning Kanji is an individual matter; it is very important to understand this. It means you can disagree with him. You can modify the keywords to suit yourself, perhaps in your own native language. You can group some radicals and name the group if you think it makes your life easier. This is what makes the book awesome.

The mnemonic method works for me. The story guides you in how to write a character as well as its meaning. There is one thing I should emphasize: Heisig wants you to use “imaginative memory”. You are to create stories that “impress” yourself. An impressive story lets you remember and recall things much more easily because it comes out of your own imagination. However, you still need consistent review. Over time, the story will eventually fade away, but you will still know how to write those Kanji.

Regarding reviewing, Heisig suggests reviewing from the keywords, because then you practice both writing and recognizing the Kanji at the same time. To me this is only partially true. I sometimes have trouble recalling the meaning because of keywords I am not familiar with. It may also be my fault for not knowing enough English vocabulary. However, I tend to be able to recall them when I encounter them again.

It is not perfect but it works pretty well.

Studying with Remembering the Kanji

It is dangerous to travel alone. Going through RTK1 on your own is pretty tedious. Here are the things I recommend:

  • Anki: Anki is an SRS (spaced repetition software) flashcard application. This is the killer application for language learners. It will assist you in studying and reviewing.

  • RTK1 6th edition pre-storied deck: the stories come from Reviewing the Kanji, a community site for RTK. You will import this deck into Anki.

Using the pre-storied deck doesn’t mean you are simply going to memorize those stories. Many of the pre-made stories in the deck are good enough in my opinion. What you have to do is read the stories, appreciate them, and embrace them with vivid imagination. Don’t forget you can disagree with those pre-made stories; if you don’t like one, just create your own. As mentioned earlier, it is an individual business.

Well, what kind of story should you make? I would say any kind, as long as you love it. Your stories may be based on movies, novels, comics, religion, politics, your own experiences, your friends’ stories, etc. The key is to create stories that make you enjoy learning.

You are going to study every day. I recommend not studying more than 20 Kanji a day. The more you try to study each day, the longer you will spend on reviewing. Moreover, studying too fast may reduce your ability to recall; this is from my own experience. At this rate, you should be able to finish RTK1 in about three and a half months. However, I recommend you find your own “Minimum Effective Dose (MED)”: find a study rate at which you feel comfortable. Studying more isn’t always better.

My preferred Anki setting for RTK1 is not to introduce new cards automatically. I found that reviewing first and then studying new cards separately works better for me. With this setting, when I don’t feel ready to study new Kanji, I can skip the new cards.

The reason I recommend Anki is that it schedules reviews for you according to your performance. In my experience it works flawlessly. However, you need to be honest during review. If you completely forget a Kanji, it is good to re-study it. During review, it is also good to write the Kanji characters on paper. I know that nowadays we have computers, but you should be prepared for situations where one is not available.

Final thought

One thing that you must understand about learning Kanji is that it is going to take a long time. Keep in mind that your Kanji journey does not end when you finish the book. There is a lot left to learn; RTK1 is just the beginning. The key to success is to be consistent in your study and stay motivated.

The next question is what to do after finishing RTK1. For me, I don’t plan to continue with RTK2 and RTK3 yet. I think the best way to learn a language is to get used to it, so it is better to continue with vocabulary, learning how to pronounce and use words; Kanji will serve as another alphabet for them. After that, the more you immerse yourself in Japanese, the better you will be.

It takes time. Don’t worry if you feel exhausted but never give up.

Building Facebook Home With Quartz Composer →

David O Brien has produced a tutorial on using Quartz Composer to emulate Facebook Home. It is worth watching. Quartz Composer, to me, is another hidden gem that Apple provides for developers.

Offscreen Rendering and Multisampling With OpenGL

It has been a while since my last post. I was busy with my senior project, “Accelerating Map Rendering with GPU”. In this project, my friend and I modified Mapnik, an open-source map renderer, to utilize Nvidia’s Path Rendering, an OpenGL extension provided by Nvidia for vector graphics rendering. Nvidia claims the extension aims to reduce the overhead of drawing vector graphics with traditional APIs. In the end, we were able to make map production 30-60% faster. Our implementation can be found at https://github.com/ake-koomsin/mapnik_nvpr

There are two important things we had to achieve in order to use the Path Rendering extension for map rendering: offscreen rendering and multisampling. Offscreen rendering is important because we don’t want the renderer to show intermediate results. Multisampling is for producing a high-quality map.

I think offscreen rendering combined with multisampling is an important piece of knowledge, and it is hard to find a complete reference covering both. Therefore, I think it is worth writing about.

Offscreen Rendering

Offscreen rendering is a technique commonly found in game development. Sometimes you want to generate a texture at runtime.

To set up offscreen rendering, you have to create your own “Framebuffer Object” (FBO). OpenGL already has a default FBO: results stored in the default FBO are shown on the screen, while results stored in our own FBO are not. The code below demonstrates how to set up our own FBO.

Setting our own FBO for offscreen rendering
GLuint fbo, colorBuffer, depthBuffer;

// Create and bind the FBO
glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);

// Create color render buffer
glGenRenderbuffers(1, &colorBuffer);
glBindRenderbuffer(GL_RENDERBUFFER, colorBuffer);
glRenderbufferStorage(GL_RENDERBUFFER, GL_RGBA8, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, colorBuffer);

// Create depth render buffer (This is optional)
glGenRenderbuffers(1, &depthBuffer);
glBindRenderbuffer(GL_RENDERBUFFER, depthBuffer);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH24_STENCIL8, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, depthBuffer);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_STENCIL_ATTACHMENT, GL_RENDERBUFFER, depthBuffer);

// Attach a texture (assuming aTexture has already been created); note that
// this replaces the color renderbuffer bound to GL_COLOR_ATTACHMENT0 above
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, aTexture, 0);

// Draw something
drawGraphic();

It is straightforward. First, we create an FBO. After that, we create render buffers and a texture and attach them to the FBO.
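
One check worth adding after all attachments are in place (it is not in my original snippet): ask OpenGL whether the FBO is actually complete before drawing into it.

Checking FBO completeness
// Verify the FBO is usable; drawing into an incomplete FBO is an error.
// (The printf below assumes <stdio.h> is included.)
GLenum status = glCheckFramebufferStatus(GL_FRAMEBUFFER);
if (status != GL_FRAMEBUFFER_COMPLETE) {
    // Fix the attachments or fall back to the default FBO.
    printf("FBO is incomplete: 0x%x\n", status);
}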

Multisampling

By default, OpenGL does not perform antialiasing. As a result, the output contains stair-like artifacts that degrade visual quality. We can enable multisampling with the code below.

Enabling multisampling
glEnable(GL_MULTISAMPLE);

Combining Offscreen Rendering and Multisampling Together

It turns out that to combine them, we need additional setup: multisample renderbuffer storage and a multisample texture. The code below demonstrates how.

Setting up FBO with Multisampling
// Bind a multisample texture (assuming aMultisampleTexture was created with
// glGenTextures) and allocate 16-sample storage for it
glBindTexture(GL_TEXTURE_2D_MULTISAMPLE, aMultisampleTexture);
glTexImage2DMultisample(GL_TEXTURE_2D_MULTISAMPLE, 16, GL_RGBA8, width, height, GL_TRUE);

GLuint fbo, colorBuffer, depthBuffer;

// Create and bind the FBO
glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);

// Create color render buffer
glGenRenderbuffers(1, &colorBuffer);
glBindRenderbuffer(GL_RENDERBUFFER, colorBuffer);
glRenderbufferStorageMultisample(GL_RENDERBUFFER, 16, GL_RGBA8, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, colorBuffer);

// Create depth render buffer (This is optional)
glGenRenderbuffers(1, &depthBuffer);
glBindRenderbuffer(GL_RENDERBUFFER, depthBuffer);
glRenderbufferStorageMultisample(GL_RENDERBUFFER, 16, GL_DEPTH24_STENCIL8, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, depthBuffer);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_STENCIL_ATTACHMENT, GL_RENDERBUFFER, depthBuffer);

// Attach the multisample texture as the color attachment; its sample count
// must match the multisampled renderbuffers for the FBO to be complete
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D_MULTISAMPLE, aMultisampleTexture, 0);

// Enable multisampling
glEnable(GL_MULTISAMPLE);

// Draw something
drawGraphic();

Retrieving the result

After offscreen rendering, you may want to display the result on the screen. When you use a multisample FBO, you cannot use the result stored in the texture directly. You have to do “blitting”, which transfers (and resolves) the result from one FBO to another. The code below shows how.

Blitting
glBindFramebuffer(GL_READ_FRAMEBUFFER, multisampledFBO);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, normalFBO); // Normal FBO can be the default FBO too.
glBlitFramebuffer(0, 0, width, height, 0, 0, width, height, GL_COLOR_BUFFER_BIT, GL_NEAREST);
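
If, instead of showing the result, you want to read the resolved pixels back (for example, to write a map tile out as an image file), a sketch like the following works after the blit. Here pixels is assumed to be a client buffer of width * height * 4 bytes.

Reading back the resolved pixels
// Read the resolved (non-multisampled) pixels back into client memory.
// If normalFBO is the default framebuffer (0), use GL_BACK instead of
// GL_COLOR_ATTACHMENT0 in glReadBuffer().
glBindFramebuffer(GL_READ_FRAMEBUFFER, normalFBO);
glReadBuffer(GL_COLOR_ATTACHMENT0);
glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);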

I hope these snippets are a useful reference.

My First Kernel Extension for Logitech Presenter Device

I do a lot of presentations, particularly for class projects. To make my presentations look more professional, I decided to buy a Logitech Professional Presenter R800. It is very nice. I recommend getting one if you have to give a lot of presentations.

However, what bothers me is that not all buttons work with Apple Keynote, particularly the Play button and the Blank button. Well, the device is designed for Microsoft PowerPoint on Windows.

The questions are: why doesn’t it work, and how can I make it work with Keynote?

How does the device work?

Actually, the device is merely a wireless keyboard with only 4 keys: Page Up, Page Down, F5 and . (dot). Page Up and Page Down move slides back and forth, F5 starts a presentation, and the dot key blanks the screen.

PowerPoint responds to all of those keys but Keynote doesn’t. Keynote uses, by default, CMD + ALT + P to start playing a slideshow and B to blank the slide. In Keynote, the dot key terminates the slideshow, which may or may not be what you want.

What normal people would do

The simple and sane way is to modify the shortcut. You can go to System Preferences > Keyboard > Keyboard Shortcuts > Application Shortcuts, add the Keynote application, and override Play Slideshow to use F5. However, this approach doesn’t work if you want to remap the Blank key.

You can actually ignore the Blank key if you are happy for it to stop the slideshow, but I want it to blank the screen. To me, stopping the slideshow means the presentation has ended, and at that point I will be at the computer to stop it manually.

What I actually do

To satisfy my geek spirit, I wrote my own kernel extension for the device.

The first question is: where should I start? The answer is ioreg. I know that the device is a keyboard, and a keyboard is a kind of Human Interface Device (HID). The Apple keyboard is also a keyboard, so I should take a look at the driver used for the Apple keyboard.

Before I go further, I should mention a little bit about the IOKit framework. IOKit is a framework that Apple provides to developers for writing device drivers in an object-oriented way with C++.

The following snippet shows partial output from ioreg:

ioreg output
...
| |   |   +-o Apple Internal Keyboard / Trackpad@4600000  <class IOUSBDevice, id 0x100000292, registered, matched, active, busy 0 (1173 ms), r$
    | |   |     +-o IOUSBCompositeDriver  <class IOUSBCompositeDriver, id 0x100000295, !registered, !matched, active, busy 0, retain 4>
    | |   |     +-o Apple Internal Keyboard@0  <class IOUSBInterface, id 0x100000296, registered, matched, active, busy 0 (213 ms), retain 9>
    | |   |     | +-o AppleUSBTCKeyboard  <class AppleUSBTCKeyboard, id 0x10000029a, registered, matched, active, busy 0 (33 ms), retain 12>
    | |   |     |   +-o IOHIDInterface  <class IOHIDInterface, id 0x10000029f, registered, matched, active, busy 0 (31 ms), retain 7>
    | |   |     |   | +-o AppleEmbeddedKeyboard  <class AppleEmbeddedKeyboard, id 0x1000002a0, registered, matched, active, busy 0 (0 ms), retain $
    | |   |     |   |   +-o IOHIDKeyboard  <class IOHIDKeyboard, id 0x1000002a2, registered, matched, active, busy 0 (0 ms), retain 8>
    | |   |     |   |   | +-o IOHIDSystem  <class IOHIDSystem, id 0x1000002ac, registered, matched, active, busy 0 (0 ms), retain 20>
    | |   |     |   |   |   +-o IOHIDStackShotUserClient  <class IOHIDStackShotUserClient, id 0x10000035d, !registered, !matched, active, busy 0, $
    | |   |     |   |   |   +-o IOHIDUserClient  <class IOHIDUserClient, id 0x100000371, !registered, !matched, active, busy 0, retain 5>
    | |   |     |   |   |   +-o IOHIDParamUserClient  <class IOHIDParamUserClient, id 0x100000385, !registered, !matched, active, busy 0, retain 5$
    | |   |     |   |   |   +-o IOHIDEventSystemUserClient  <class IOHIDEventSystemUserClient, id 0x1000003c1, !registered, !matched, active, busy$
    | |   |     |   |   |   +-o IOHIDEventSystemUserClient  <class IOHIDEventSystemUserClient, id 0x1000003c9, !registered, !matched, active, busy$
    | |   |     |   |   +-o IOHIDConsumer  <class IOHIDConsumer, id 0x1000002a3, registered, matched, active, busy 0 (0 ms), retain 8>
    | |   |     |   |   | +-o IOHIDSystem  <class IOHIDSystem, id 0x1000002ac, registered, matched, active, busy 0 (0 ms), retain 20>
    | |   |     |   |   |   +-o IOHIDStackShotUserClient  <class IOHIDStackShotUserClient, id 0x10000035d, !registered, !matched, active, busy 0, $
    | |   |     |   |   |   +-o IOHIDUserClient  <class IOHIDUserClient, id 0x100000371, !registered, !matched, active, busy 0, retain 5>
    | |   |     |   |   |   +-o IOHIDParamUserClient  <class IOHIDParamUserClient, id 0x100000385, !registered, !matched, active, busy 0, retain 5$
    | |   |     |   |   |   +-o IOHIDEventSystemUserClient  <class IOHIDEventSystemUserClient, id 0x1000003c1, !registered, !matched, active, busy$
    | |   |     |   |   |   +-o IOHIDEventSystemUserClient  <class IOHIDEventSystemUserClient, id 0x1000003c9, !registered, !matched, active, busy$
    | |   |     |   |   +-o IOHIDSystem  <class IOHIDSystem, id 0x1000002ac, registered, matched, active, busy 0 (0 ms), retain 19>
    | |   |     |   |     +-o IOHIDStackShotUserClient  <class IOHIDStackShotUserClient, id 0x10000035d, !registered, !matched, active, busy 0, re$
    | |   |     |   |     +-o IOHIDUserClient  <class IOHIDUserClient, id 0x100000371, !registered, !matched, active, busy 0, retain 5>
    | |   |     |   |     +-o IOHIDParamUserClient  <class IOHIDParamUserClient, id 0x100000385, !registered, !matched, active, busy 0, retain 5>
    | |   |     |   |     +-o IOHIDEventSystemUserClient  <class IOHIDEventSystemUserClient, id 0x1000003c1, !registered, !matched, active, busy 0$
    | |   |     |   |     +-o IOHIDEventSystemUserClient  <class IOHIDEventSystemUserClient, id 0x1000003c9, !registered, !matched, active, busy 0$
    | |   |     |   +-o IOHIDLibUserClient  <class IOHIDLibUserClient, id 0x1000003c7, !registered, !matched, active, busy 0, retain 6>
    | |   |     |   +-o IOHIDLibUserClient  <class IOHIDLibUserClient, id 0x10000044b, !registered, !matched, active, busy 0, retain 6>
...

What does this information tell us? It shows the hierarchy of drivers, which implies the chain of command. For example, IOHIDInterface is said to be a provider of AppleEmbeddedKeyboard; you can think of it as a data provider.

Classes beginning with IO belong to the system. As you can see, there are 2 custom drivers, AppleUSBTCKeyboard and AppleEmbeddedKeyboard. I investigated further with the ioreg -c command and found that AppleUSBTCKeyboard is a subclass of IOUSBHIDDriver and AppleEmbeddedKeyboard is a subclass of IOHIDEventDriver.

My initial assumption about the driver was that it should do something about event dispatching. I started by looking at IOUSBHIDDriver.h and IOHIDEventDriver.h and found nothing that seemed to match my assumption. Was my assumption wrong?

I googled ‘AppleEmbeddedKeyboard’ and found the source code in Apple’s open source repository. Taking a look at AppleEmbeddedKeyboard.cpp, I found this method override interesting:

dispatchKeyboardEvent method
void AppleEmbeddedKeyboard::dispatchKeyboardEvent(
                                AbsoluteTime                timeStamp,
                                UInt32                      usagePage,
                                UInt32                      usage,
                                UInt32                      value,
                                IOOptionBits                options)
{
    ...
}

It turns out that dispatchKeyboardEvent() is declared in IOHIDEventService.h. So I decided to write my kernel extension based on IOHIDEventDriver. With Apple’s documentation and suggestions from Pavel Prokofiev, who wrote macosx-nosleep-extension, I was able to complete it (it is not that hard, but it takes time to understand how things fit together).

What my code does is intercept the key and replace it if necessary.
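
The remapping idea, sketched in plain C below, is not the actual kext source (that is C++ on top of IOHIDEventDriver) but it shows the kind of substitution I mean: when the presenter’s blank key arrives as the ‘.’ usage code, hand Keynote the ‘b’ it expects instead. The usage values follow the standard USB HID keyboard usage table.

#include <stdint.h>

/* USB HID keyboard usage codes (usage page 0x07). */
#define HID_USAGE_KEYBOARD_PERIOD 0x37  /* '.'  -> blank in PowerPoint */
#define HID_USAGE_KEYBOARD_B      0x05  /* 'b'  -> blank in Keynote    */

/* Substitute the usage code before handing the event to the next layer. */
static uint32_t
remap_usage(uint32_t usage)
{
    if (usage == HID_USAGE_KEYBOARD_PERIOD)
        return HID_USAGE_KEYBOARD_B;
    return usage;
}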

You can see my source code here https://github.com/ake-koomsin/LogitechWirelessPresenterKext.