Building ChikkaDB - Part 3: Decoding Wire Messages

Written by Kishore Athrasseri

Published on Sun Nov 16 2025

This post is Part 3 of the journey in building ChikkaDB, a mongod-compatible server with an SQLite backend. The full code for this exploration can be found here. Read Part 2

If you have read the previous exploration, you know that we have a functional proxy server that intercepts messages sent between mongod and mongosh. (If you haven’t, I would encourage you to go back and read that one!) The data is being captured, but apart from the header, the messages are yet to be decoded.

We will rename index.ts to proxy.ts, and create a new wire-lib.ts module where we will write the message decoding logic. We will move the incomplete decodeMessage function to this new module.

Let us first get the legacy opcodes OP_QUERY and OP_REPLY out of the way. In all likelihood, they are used only for the initial handshake. Once that is done, we can focus on OP_MSG, which is presumably the opcode that carries most of the actual communication.

OP_QUERY has the following structure.

struct OP_QUERY {
  MsgHeader header;
  int32     flags;
  cstring   fullCollectionName;
  int32     numberToSkip;
  int32     numberToReturn;
  document  query;
 [document  returnFieldsSelector;] // optional
}

document refers to a BSON-encoded document buffer. cstring is a null-terminated string - we will need to write a utility function to read that.

And this is the layout of OP_REPLY.

struct OP_REPLY {
  MsgHeader header;
  int32     responseFlags;
  int64     cursorID;
  int32     startingFrom;
  int32     numberReturned;
  document* documents;
}

Let us define TypeScript types to represent these messages after decoding. Defining them as a discriminated union will come in handy when we need to switch between different cases and call different decoder functions. It’s one of my favourite features of TypeScript - it is so helpful when you have a polymorphic collection of items, each of which can be one of a set of types.

// wire-lib.ts
export type WireMessage = {
  header: MessageHeader;
  payload: 
    | OpQueryPayload
    | OpReplyPayload;
    // todo: OP_MSG
};

type OpQueryPayload = {
  _type: 'OP_QUERY';
  flags: number;
  fullCollectionName: string;
  numberToSkip: number;
  numberToReturn: number;
  query: Record<string, any>;
  returnFieldsSelector?: Record<string, any>;
};

type OpReplyPayload = {
  _type: 'OP_REPLY';
  responseFlags: number;
  cursorID: bigint;
  startingFrom: number;
  numberReturned: number;
  documents: Record<string, any>[];
};
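
To make the benefit of the discriminated union concrete, here is a minimal sketch using trimmed-down versions of the payload types. The describePayload function is a hypothetical helper written only for this illustration and is not part of wire-lib.ts. Switching on _type lets the compiler narrow the payload in each case, and the never assignment in the default branch is one common way to get a compile-time reminder when a new payload type is added without a matching case.

```typescript
// Trimmed-down payload types; only the fields needed for the illustration.
type OpQueryPayload = { _type: 'OP_QUERY'; fullCollectionName: string };
type OpReplyPayload = { _type: 'OP_REPLY'; numberReturned: number };
type Payload = OpQueryPayload | OpReplyPayload;

// Hypothetical helper: switching on `_type` narrows `payload` in each case.
function describePayload(payload: Payload): string {
  switch (payload._type) {
    case 'OP_QUERY':
      // Here the compiler knows payload is an OpQueryPayload
      return `query on ${payload.fullCollectionName}`;
    case 'OP_REPLY':
      // Here the compiler knows payload is an OpReplyPayload
      return `reply with ${payload.numberReturned} document(s)`;
    default: {
      // Exhaustiveness check: this assignment stops type-checking if a new
      // member is added to Payload without a matching case above.
      const unreachable: never = payload;
      throw new Error(`Unhandled payload: ${JSON.stringify(unreachable)}`);
    }
  }
}

console.log(describePayload({ _type: 'OP_QUERY', fullCollectionName: 'admin.$cmd' }));
console.log(describePayload({ _type: 'OP_REPLY', numberReturned: 1 }));
```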

It’s time to return to the partially implemented decodeMessage function that already decodes the message header, and add the logic for decoding the payloads. Based on the opcode in the header, we can call the appropriate functions for decoding the payload.

export function decodeMessage(buf: Buffer): WireMessage {
  const messageLength = buf.readInt32LE(0);
  const requestID = buf.readInt32LE(4);
  const responseTo = buf.readInt32LE(8);
  const opCode = buf.readInt32LE(12);

  let payload: WireMessage['payload'];

  switch(opCode) {
    case 2004: 
      payload = decodeOpQueryPayload(buf.subarray(16));
      break;
    case 1:
      payload = decodeOpReplyPayload(buf.subarray(16));
      break;
    // todo: OP_MSG
    default:
      throw new Error('Unknown opcode');
  }

  return {
    header: { messageLength, requestID, responseTo, opCode },
    payload
  };
}

I’m going to skip error-handling logic for now, and just let the code throw and crash if the decoding fails. I’ll return to error-handling once the happy path is in place.

Now on to writing the decoders.

function decodeOpQueryPayload(buf: Buffer): OpQueryPayload {
  let pointer = 0;

  const flags = buf.readInt32LE(pointer);
  pointer += 4;

  const { s: fullCollectionName, len } = readNullTerminatedString(buf, pointer);
  pointer += len;

  const numberToSkip = buf.readInt32LE(pointer);
  pointer += 4;

  const numberToReturn = buf.readInt32LE(pointer);
  pointer += 4;

  const { documents: [query, returnFieldsSelector] } = readBSONDocuments(buf, pointer);

  return {
    _type: 'OP_QUERY',
    flags,
    fullCollectionName,
    numberToSkip,
    numberToReturn,
    query,
    returnFieldsSelector,
  };
}

function decodeOpReplyPayload(buf: Buffer): OpReplyPayload {
  let pointer = 0;
  const responseFlags = buf.readInt32LE(pointer);
  pointer += 4;

  const cursorID = buf.readBigInt64LE(pointer);
  pointer += 8;

  const startingFrom = buf.readInt32LE(pointer);
  pointer += 4;

  const numberReturned = buf.readInt32LE(pointer);
  pointer += 4;

  const { documents } = readBSONDocuments(buf, pointer);

  return {
    _type: 'OP_REPLY',
    responseFlags,
    cursorID,
    startingFrom,
    numberReturned,
    documents,
  };
}

There are a couple of utility functions that need to be implemented - readNullTerminatedString and readBSONDocuments.

The former is fairly simple. Just read the buffer till there is a null byte (0x00), and decode it. I’m going to assume UTF-8 encoding.

function readNullTerminatedString(buf: Buffer, offset: number): {
  s: string;
  len: number; // including the null termination byte
} {
  const nullIndex = buf.indexOf(0, offset);
  const s = buf.toString('utf-8', offset, nullIndex);
  
  return {
    s,
    len: (nullIndex - offset) + 1
  };
}
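
A quick way to sanity-check this kind of helper is to run it against a hand-built buffer. The sketch below uses a standalone variant named readCString (a hypothetical name) with the same logic, plus a guard for a missing terminator, which is my assumption about what a reasonable failure mode would be. The sample buffer imitates the start of an OP_QUERY payload: 4 bytes of flags, the collection name, then 4 more bytes.

```typescript
import { Buffer } from 'node:buffer';

// Same idea as readNullTerminatedString, with an added guard for a missing
// null byte (an assumption for this sketch, not part of the original code).
function readCString(buf: Buffer, offset: number): { s: string; len: number } {
  const nullIndex = buf.indexOf(0, offset);
  if (nullIndex === -1) {
    throw new Error(`No null terminator found after offset ${offset}`);
  }
  return {
    s: buf.toString('utf-8', offset, nullIndex),
    len: nullIndex - offset + 1, // includes the null byte
  };
}

// 4 bytes of flags, then the collection name, then 4 bytes of numberToSkip
const sample = Buffer.concat([
  Buffer.alloc(4),
  Buffer.from('admin.$cmd\0', 'utf-8'),
  Buffer.alloc(4),
]);

const { s, len } = readCString(sample, 4);
console.log(s, len); // admin.$cmd 11
```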

A BSON document starts with a 4-byte size field that holds the total size of the document, including the size field itself. We can try reading as many documents as there are in the buffer, and return them as an array. The function will also return the total number of bytes read, so the invoking function knows where the pointer is after reading the documents.

function readBSONDocuments(buf: Buffer, offset: number): {
  documents: BSON.Document[];
  len: number; // bytes read
} {
  let documents: BSON.Document[] = [];
  let pointer = offset;

  while (pointer < buf.length) {
    if (buf.length - pointer < 4) break;
    const size = buf.readInt32LE(pointer);
    if (buf.length - pointer < size) break;
    const docBuf = buf.subarray(pointer, pointer + size);
    try {
      const doc = BSON.deserialize(docBuf);
      documents.push(doc);
    } catch {
      throw new Error('Invalid BSON at offset ' + pointer);
    }
    pointer += size;
  }

  return {
    documents,
    len: pointer - offset,
  }
}
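
The size prefix is easiest to understand with raw bytes. The sketch below walks a buffer containing two hand-encoded documents and only splits them at their boundaries, without deserializing, so it does not need a BSON library. The splitBSONDocuments name is hypothetical, and the minimum-size check is an extra guard I have assumed for the illustration.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical helper: returns the raw bytes of each document, no deserializing.
function splitBSONDocuments(buf: Buffer): Buffer[] {
  const docs: Buffer[] = [];
  let pointer = 0;

  while (buf.length - pointer >= 4) {
    const size = buf.readInt32LE(pointer);
    // The smallest valid document is 5 bytes: the size field plus a trailing 0x00.
    if (size < 5 || pointer + size > buf.length) break;
    docs.push(buf.subarray(pointer, pointer + size));
    pointer += size;
  }

  return docs;
}

// {} encodes as a size of 5 followed by the terminating null byte
const emptyDoc = Buffer.from([0x05, 0x00, 0x00, 0x00, 0x00]);

// { a: 1 } with an int32 value, 12 bytes in total
const docA = Buffer.from([
  0x0c, 0x00, 0x00, 0x00, // size = 12, counting these 4 bytes too
  0x10,                   // element type: int32
  0x61, 0x00,             // key "a" as a cstring
  0x01, 0x00, 0x00, 0x00, // value 1
  0x00,                   // document terminator
]);

const parts = splitBSONDocuments(Buffer.concat([emptyDoc, docA]));
console.log(parts.map((d) => d.length)); // [ 5, 12 ]
```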

Decoding OP_MSG messages is a bit more involved, because of the sections field. Unlike the legacy opcodes, OP_MSG is designed to be versatile, so the message format reflects that complexity.

The OP_MSG layout is as follows:

OP_MSG {
  MsgHeader header;
  uint32 flagBits;
  Sections[] sections;
  optional<uint32> checksum;
}

We will ignore the flagBits and the optional checksum at this point, and focus on the sections. This is the field that holds the main contents of the message.
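
Even though the decoder ignores them for now, it helps to know what the flag bits say, because the checksum flag determines where the sections end. The sketch below is not part of wire-lib.ts. The inspectOpMsgFlags helper and the constant names are hypothetical, while the bit positions (0 for checksumPresent, 1 for moreToCome, 16 for exhaustAllowed) are the ones listed in the Wire Protocol documentation.

```typescript
import { Buffer } from 'node:buffer';

// Flag bit positions as listed in the MongoDB Wire Protocol documentation
const CHECKSUM_PRESENT = 1 << 0;
const MORE_TO_COME = 1 << 1;
const EXHAUST_ALLOWED = 1 << 16;

// Hypothetical helper: `payload` is the message minus its 16-byte header
function inspectOpMsgFlags(payload: Buffer) {
  const flagBits = payload.readUInt32LE(0);
  const checksumPresent = (flagBits & CHECKSUM_PRESENT) !== 0;

  return {
    checksumPresent,
    moreToCome: (flagBits & MORE_TO_COME) !== 0,
    exhaustAllowed: (flagBits & EXHAUST_ALLOWED) !== 0,
    // Sections begin right after flagBits and stop before the checksum, if any
    sectionsStart: 4,
    sectionsEnd: checksumPresent ? payload.length - 4 : payload.length,
  };
}

// flagBits = 1 (checksumPresent), one kind-0 section holding {}, 4 checksum bytes
const payload = Buffer.from([
  0x01, 0x00, 0x00, 0x00,       // flagBits
  0x00,                         // section kind 0
  0x05, 0x00, 0x00, 0x00, 0x00, // empty BSON document
  0xde, 0xad, 0xbe, 0xef,       // placeholder bytes, not a real CRC-32C
]);

const flagInfo = inspectOpMsgFlags(payload);
console.log(flagInfo);
```

A decoder that loops until the end of the buffer would try to read those last 4 bytes as a section, so bounding the loop with something like sectionsEnd is one way to stay safe once checksums show up.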

A section can be of different kinds. Each section starts with a single byte that signifies the section’s kind, followed by the section payload. Section-kind 0 is the message body. This is the standard type of section in requests and replies, and has exactly one BSON document as the section payload.

Kind0_Section {
  sectionKind uint8;
  doc         document;
}

Though the Wire Protocol documentation doesn’t mention it, the MongoDB Specifications project states that an OP_MSG must have exactly one kind-0 section.

Section-kind 1 is a container for a subsequent set of documents that may need to be attached to the message. The Specification project page suggests that it is used for commands like bulk writes, where an array of documents needs to be sent over the wire. An OP_MSG can have zero or more kind-1 sections.

Kind1_Section {
  sectionKind                 uint8;
  size                        uint32;
  documentSequenceIdentifier  cstring;
  docs                        documents;
}
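
To get a feel for this layout, it can help to build a kind-1 section by hand. The sketch below assumes, following the OP_MSG specification, that size counts the size field itself, the identifier and the documents, but not the leading kind byte. The encodeKind1Section function is a hypothetical helper for illustration, and it takes documents that are already serialized.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical helper: `docs` are already-serialized BSON documents.
function encodeKind1Section(identifier: string, docs: Buffer[]): Buffer {
  const id = Buffer.from(identifier + '\0', 'utf-8'); // cstring
  const body = Buffer.concat(docs);

  const size = Buffer.alloc(4);
  // size covers itself, the identifier and the documents, but not the kind byte
  size.writeInt32LE(4 + id.length + body.length, 0);

  return Buffer.concat([Buffer.from([1]), size, id, body]);
}

// {} encodes as 05 00 00 00 00
const emptyDoc = Buffer.from([0x05, 0x00, 0x00, 0x00, 0x00]);
const section = encodeKind1Section('documents', [emptyDoc, emptyDoc]);

// 1 (kind) + 4 (size) + 10 (identifier with null byte) + 10 (two documents)
console.log(section.length);         // 25
console.log(section.readInt32LE(1)); // 24
```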

The Wire Protocol documentation also mentions a section-kind 2, but doesn’t give any further details, saying it is meant for internal use. We will ignore that.

Let’s define the types and write the logic to decode OP_MSG messages.

type OpMsgPayload = {
  _type: 'OP_MSG';
  flagBits: number;
  sections: OpMsgPayloadSection[];
  checksum?: number;
};

type OpMsgPayloadSection = {
  sectionKind: 0;
  document: BSON.Document;
} | {
  sectionKind: 1;
  size: number;
  documentSequenceIdentifier: string;
  documents: BSON.Document[];
};

function decodeOpMsgPayload(buf: Buffer): OpMsgPayload {
  let pointer = 0;
  const flagBits = buf.readUInt32LE(pointer);
  pointer += 4;

  const sections = decodeOpMsgPayloadSections(buf, pointer);

  // Skip the optional checksum

  return {
    _type: 'OP_MSG',
    flagBits,
    sections,
  }
}

And here is the function for decoding the payload sections.

function decodeOpMsgPayloadSections(buf: Buffer, offset: number): OpMsgPayloadSection[] {
  const sections: OpMsgPayloadSection[] = [];
  let pointer = offset;

  while (pointer < buf.length) {
    const sectionKind = buf.readUint8(pointer);
    pointer += 1;

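    switch(sectionKind) {
      case 0: {
        // A BSON document starts with its own size, so peek at it to find the section end
        const size = buf.readInt32LE(pointer);
        const { documents } = readBSONDocuments(buf.subarray(pointer, pointer + size), 0);
        // TODO: Assert that exactly one document exists

        const section: OpMsgPayloadSection = {
          sectionKind: 0,
          document: documents[0]!,
        };

        sections.push(section);
        pointer += size;
        break;
      }

      case 1: {
        const size = buf.readInt32LE(pointer);
        // size covers the size field itself, the identifier and the documents
        const sectionEnd = pointer + size;
        pointer += 4;

        // pointer is already an absolute position in buf
        const { s: documentSequenceIdentifier, len } = readNullTerminatedString(buf, pointer);
        pointer += len;

        // Only read documents up to the end of this section
        const { documents } = readBSONDocuments(buf.subarray(pointer, sectionEnd), 0);

        const section: OpMsgPayloadSection = {
          sectionKind: 1,
          size,
          documentSequenceIdentifier,
          documents,
        };

        sections.push(section);
        pointer = sectionEnd;
        break;
      }

      default:
        // Also reached if an unread checksum is mistaken for a section
        throw new Error('Unknown section kind ' + sectionKind);
    }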
  }

  return sections;
}

With that, we have enough logic implemented to decode the OP_MSG messages. All that remains to be done is to go back and invoke this decoder in the switch-case for opcode 2013 (OP_MSG).

export type WireMessage = {
  header: MessageHeader;
  payload: 
    | OpQueryPayload
    | OpReplyPayload
    | OpMsgPayload; // newly added
};

// in function decodeMessage
...
  switch(opCode) {
    case 2004: 
      payload = decodeOpQueryPayload(buf.subarray(16));
      break;
    case 1:
      payload = decodeOpReplyPayload(buf.subarray(16));
      break;
    case 2013: // newly added
      payload = decodeOpMsgPayload(buf.subarray(16));
      break;
    default:
      throw new Error('Unknown opcode');
  }
...
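
With three opcodes in play, the bare numbers in the switch start to get harder to read. One possible refinement, sketched below, is to give them names. The OpCode object and the opCodeName helper are hypothetical and not part of the code above. The numeric values are the ones from the Wire Protocol documentation.

```typescript
// Opcode values from the MongoDB Wire Protocol documentation
const OpCode = {
  OP_REPLY: 1,
  OP_QUERY: 2004,
  OP_MSG: 2013,
} as const;

type OpCodeName = keyof typeof OpCode;

// Hypothetical helper: reverse lookup, handy for log output
function opCodeName(opCode: number): OpCodeName | undefined {
  return (Object.keys(OpCode) as OpCodeName[]).find((name) => OpCode[name] === opCode);
}

console.log(opCodeName(2013)); // OP_MSG
console.log(opCodeName(42));   // undefined
```

The switch cases could then read case OpCode.OP_MSG: instead of case 2013:, and the log lines could print the opcode name next to the number.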

Now let’s run the proxy and see the details of the messages. I’m going to print message.payload using console.dir so that the large nested payloads are printed in their entirety. The message log starts looking like this.
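
For reference, this is roughly what the depth setting changes. Node's default inspection depth is 2, so deeper levels collapse to [Object]. Passing depth: null removes the limit. The nested object below is made up for the illustration. In the proxy, the argument would be message.payload.

```typescript
import { inspect } from 'node:util';

// Made-up object that is nested deeper than the default inspection depth
const nested = { a: { b: { c: { d: { e: 'deep' } } } } };

// With the default depth, the innermost levels are collapsed to [Object]
const collapsed = inspect(nested);
console.log(collapsed);

// depth: null prints every level, however deeply nested
const full = inspect(nested, { depth: null });
console.log(full);

// console.dir accepts the same option
console.dir(nested, { depth: null });
```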

client connected from port: 59130
created proxy connection to mongod server
C -> S message { messageLength: 355, requestID: 2, responseTo: 0, opCode: 2004 }
{
  _type: 'OP_QUERY',
  flags: 0,
  fullCollectionName: 'admin.$cmd',
  numberToSkip: 0,
  numberToReturn: -1,
  query: {
    ismaster: 1,
    helloOk: true,
    client: {
      application: { name: 'mongosh 2.5.8' },
      driver: { name: 'nodejs|mongosh', version: '6.19.0|2.5.8' },
      platform: 'Node.js v20.19.5, LE',
      os: {
        name: 'linux',
        architecture: 'x64',
        version: '3.10.0-327.22.2.el7.x86_64',
        type: 'Linux'
      }
    },
    compression: [ 'none' ]
  },
  returnFieldsSelector: undefined
}
S -> C message { messageLength: 329, requestID: 73, responseTo: 2, opCode: 1 }
{
  _type: 'OP_REPLY',
  responseFlags: 8,
  cursorID: 0n,
  startingFrom: 0,
  numberReturned: 1,
  documents: [
    {
      helloOk: true,
      ismaster: true,
      topologyVersion: {
        processId: ObjectId {
          buffer: Buffer(12) [Uint8Array] [
            105,  25,  92,  85, 209,
             76, 144, 123, 116, 178,
             97,  66
          ]
        },
        counter: 0
      },
      maxBsonObjectSize: 16777216,
      maxMessageSizeBytes: 48000000,
      maxWriteBatchSize: 100000,
      localTime: 2025-11-16T05:16:49.730Z,
      logicalSessionTimeoutMinutes: 30,
      connectionId: 10,
      minWireVersion: 0,
      maxWireVersion: 21,
      readOnly: false,
      ok: 1
    }
  ]
}
C -> S message { messageLength: 92, requestID: 3, responseTo: 0, opCode: 2013 }
{
  _type: 'OP_MSG',
  flagBits: 0,
  sections: [
    {
      sectionKind: 0,
      document: {
        buildInfo: 1,
        lsid: {
          id: UUID {
            sub_type: 4,
            buffer: Buffer(16) [Uint8Array] [
              158, 146, 162, 70, 223, 187,
               67,  39, 133, 60,  70, 223,
               10,  60, 230, 14
            ],
            position: 16
          }
        },
        '$db': 'admin'
      }
    }
  ]
}

This is so awesome! Now we know exactly what content is being sent in the messages.

Conclusion

We have all the tools in place to fully understand the sequence of messages exchanged between the MongoDB client and server. That should enable us to start building a custom server that can respond in the exact manner that the client expects, and effectively substitute for mongod.
