Enhancing Signal’s End-to-End Encryption Algorithm Eliminating Man-in-the-Middle and other…

In this article, we briefly describe the shortcomings of the end-to-end encryption algorithm from Open Whisper Systems (Signal) and explain why we decided to write an enhanced version that eliminates man-in-the-middle attacks and other limitations.

Apr 21, 2022

If you have used Signal or WhatsApp, you may have seen these verification screens. Have you ever wondered what your options are if the numbers do not match (verification failed)?

image

Surprised? You have no option but to accept the fact that someone is watching your conversation. Neither Signal nor WhatsApp provides any corrective option and, unfortunately, you can’t do anything about it after comparing those 60-digit fingerprints. Whether the reasons are technical or not, it was a concern we had when adding end-to-end encryption to the mesibo APIs.

mesibo is a platform for developers to add real-time messaging, group messaging, calls, and group calls to their apps and websites. mesibo does not require phone numbers and allows users to download the entire platform, and hence it has become one of the most trusted and popular platforms for developers to add real-time communication to their apps. Some mesibo customers, such as financial institutions and healthcare providers, have more sensitive data than others. These customers use mesibo for more than casual chats, and therefore their end-to-end encryption requirements are more stringent, especially assured protection against man-in-the-middle attacks and the ability to recover from one, unlike the Signal/WhatsApp example above.

In the following sections, we describe some of our end-to-end encryption requirements and how we addressed them, which makes the mesibo implementation the strongest in the industry. Since our enhancements are based on Signal’s algorithm, we hope Signal and WhatsApp eventually adopt these changes to fix the issue outlined above.

This article only describes the enhancement. If you are not familiar with the original algorithm, please refer to the Signal documentation before reading further.

Why Have We Not Used Signal?

End-to-end encryption (E2EE) is not a new technology. The Off-the-Record Messaging (OTR) protocol by Ian Goldberg and Nikita Borisov in 2004 was possibly the first protocol widely implemented by instant messengers. However, E2EE became a more popular term after Signal improved on OTR and other protocols and was then adopted by WhatsApp.

Hence, when we decided to implement E2EE in mesibo, the first choice was to use Signal’s implementation. It addresses some of the limitations of previous protocols, provides good forward secrecy and deniability, and is well implemented and tested. The Signal source code is available on GitHub, which makes it quick to implement.

However, it is not without shortcomings, and some design choices were deal-breakers for us: in particular, the use of centralized servers as the public-key repository and the lack of man-in-the-middle (MITM) protection, and, to a lesser extent, the arguable choice of the AES-CBC cipher. The implementation is also very inefficient for group messaging. Hence, we decided to write our own implementation based on the Signal Protocol.

We eliminated these weaknesses by making the mesibo end-to-end encryption protocol peer-to-peer and eliminating the server entirely. Not only have we used better AEAD ciphers like AES-GCM and ChaCha20-Poly1305, but we have also made the protocol capable of using multiple ciphers simultaneously, which makes eavesdropping many times more difficult. We also implemented three different schemes to eliminate MITM and other attacks such as unknown key-share (UKS). In addition, we enhanced the X3DH and Double Ratchet algorithms used in the Signal Protocol, as outlined below.

We briefly describe our enhancements below. A more technical paper will follow soon.

Serverless, Peer-to-peer Protocol

A server in the path is a possible vulnerability, even if it only hosts a public-key repository. As described in the Signal documentation and the WhatsApp white paper, both store public keys on their respective servers.

It is often assumed that a public key is public and there is no harm in storing it on a server. However, Alice giving her public key to Bob has a completely different trust model compared to Eve giving Alice’s public key to Bob. This issue has been studied thoroughly, and there is ample literature on how this server-based approach makes MITM and UKS attacks possible under special circumstances. Hence, the first step was to remove the server from the entire encryption and key management process.

The mesibo E2EE protocol is entirely peer-to-peer (P2P). The server has absolutely no role; it does not even know that encryption exists or which type is in use. The only downside of the P2P approach is that the communication is unencrypted until both peers have exchanged keys. However, this happens only once, during initialization, so the impact is negligible compared to the overall gain.

While the P2P approach is not sufficient to avoid MITM all by itself, it serves as a base for the MITM prevention methods described later.

X3DH-P2P — Enhanced X3DH Key Agreement Protocol

There are significant enhancements to the X3DH key agreement protocol used by Signal. Please refer to the Signal documentation if you are not familiar with it.

As described in the previous section, there is no central server acting as a public-key repository. Hence, there are no key bundles stored on the server. This also eliminates the concept of the pre-key, which we felt was vulnerable anyway.

The entire key agreement protocol is now P2P. The keys are exchanged using Diffie-Hellman key exchange (DH), even during the bootstrap.

mesibo uses four types of elliptic curve public keys. The first two are the same as described in the Signal Protocol:

  • Identity Key (IK)
  • Ephemeral Key (EK)
  • Signed Ephemeral Keys (SEK). You may think of it as a one-time pre-key in Signal protocol but they are not the same. It’s not one-time but continuously exchanged using DH at a slower pace compared to ephemeral keys, called the signing frequency. The signed ephemeral keys are signed by the identity key. Hence, any malicious attempt to provide Alice with a forged Signed key will fail.
  • [OPTIONAL] Out-of-Band Signed Ephemeral Keys (OSEK) — explained later.

The reason for using signed ephemeral keys is to provide authentication to the KDF chains without incurring per-message signature overhead.

Since there are multiple ephemeral keys, the DH is multi-level too: ephemeral DH at the lowest level, then signed ephemeral, and then out-of-band signed ephemeral, if used. Reaching a threshold at a lower level of DH triggers the key exchange at the next level up.

The outputs of all the DH levels become inputs to the KDF chains, along with the identity keys. Although the ephemeral keys themselves are not signed, the DH output of the signed keys is also a KDF input. In this way, the protocol continues to provide strong forward secrecy without incurring signature overhead on every message.
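The threshold mechanism can be pictured as a cascade of counters. The following is a minimal sketch, not mesibo’s implementation: the class name, the method name, and the threshold values (4 ephemeral exchanges per signed exchange, 3 signed exchanges per out-of-band exchange) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RatchetLevels:
    """Counts exchanges per level; the thresholds are illustrative values only."""

    # Hypothetical thresholds: EK exchanges per SEK exchange, SEK exchanges per OSEK exchange.
    thresholds: tuple = (4, 3)
    counts: list = field(default_factory=lambda: [0, 0])

    def on_ephemeral_exchange(self):
        """Record one ephemeral DH exchange and return every level that must re-key."""
        triggered = ["EK"]
        for level, name in enumerate(["SEK", "OSEK"]):
            self.counts[level] += 1
            if self.counts[level] < self.thresholds[level]:
                break  # threshold not reached, upper levels stay untouched
            self.counts[level] = 0
            triggered.append(name)  # threshold reached, trigger the upper level
        return triggered


levels = RatchetLevels()
for n in range(1, 13):
    print(n, levels.on_ephemeral_exchange())
```

With these assumed thresholds, every fourth ephemeral exchange also triggers a signed ephemeral exchange, and every third signed exchange also triggers an out-of-band exchange.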

Message Key = KDF(IDENT, DH(SEK), DH(EK), OTHER DATA)

where DH(K) represents the output of a DH calculation.
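To make the formula concrete, here is a minimal sketch of a KDF that mixes an identity value with several DH outputs. The article does not name the KDF, so HKDF-SHA256 (RFC 5869) is used here as an assumption; the function names, the length-prefixed encoding, the zero salt, and the `sketch-message-key` label are likewise hypothetical, and the DH outputs are placeholder bytes rather than real key agreements.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it to `length` bytes."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def message_key(ident: bytes, dh_outputs: list, other: bytes = b"") -> bytes:
    """Message Key = KDF(IDENT, DH(...), ..., OTHER DATA), as a sketch.

    Each input is length-prefixed so that different splits of the same bytes
    cannot produce the same KDF input.
    """
    ikm = b"".join(len(x).to_bytes(2, "big") + x for x in [ident, *dh_outputs])
    return hkdf_sha256(ikm, salt=b"\x00" * 32, info=b"sketch-message-key" + other)


# Placeholder values standing in for real identity keys and DH results.
ident = b"alice-ik|bob-ik"
dh_sek = bytes.fromhex("aa" * 32)
dh_ek = bytes.fromhex("bb" * 32)
print(message_key(ident, [dh_sek, dh_ek]).hex())
```

Because every DH output feeds the same derivation, a change at any level (for example a new signed ephemeral exchange) yields an unrelated message key.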

Out-of-Band Signed Ephemeral Keys: while signed ephemeral and ephemeral keys are generally exchanged over the same communication channel, out-of-band keys are exchanged over a different communication channel, if available.

For example, suppose Alice and Bob are communicating using the mesibo end-to-end encryption protocol and out-of-band is enabled. At some point, mesibo will ask Alice to send a key to Bob out-of-band. Alice now uses a different channel, say uploading to a secure web server, and then confirms to mesibo that the key has been sent out-of-band. On getting the confirmation, mesibo informs Bob to fetch the out-of-band key and perform DH, and vice versa. The output of this out-of-band DH also becomes an input to the KDF chains.

Message Key = KDF(IDENT, DH(OSEK), DH(SEK), DH(EK), OTHER DATA)

If someone wants to intercept the communication now, they need to intercept both the in-band and out-of-band channels. Since mesibo can be configured to use multiple out-of-band channels, interception can be made next to impossible.

The Double Ratchet Algorithm

The double ratchet algorithm remains the same, except that the KDF chains receive extra inputs, as explained above (out-of-band DH) and in the MITM section below.

However, the double ratchet algorithm is not well suited for group messaging, which we plan to improve. The Ratchet Tree from the IETF Messaging Layer Security draft could be a candidate.

Better Ciphers and Multi-Cipher mode

Another reservation against using the Signal implementation was its choice of the AES-CBC + HMAC-SHA2 cipher (which is surprising). AES-CBC is known to be vulnerable to various timing and padding attacks. While it is difficult to exploit those attacks in E2EE mode, where the encryption key changes for every message, better authenticated ciphers are available, and hence there is no valid reason not to use them. By default, mesibo uses CTR-based AEAD (authenticated encryption with associated data) ciphers: AES-GCM and ChaCha20-Poly1305.

mesibo also uses a multi-cipher, multi-key mode and implements cipher negotiation across devices. The multi-cipher, multi-key mode allows mesibo to change the cipher on the fly, which makes it harder to guess the cipher in use, and the multiple keys make it harder still. mesibo provides an API where app developers can select multiple ciphers and preferred ciphers.
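Cipher negotiation of this kind is often a simple intersection of preference lists. The sketch below assumes that rule; it is not the mesibo API, and the function name, the cipher identifiers, and the rule that the initiator’s order decides are hypothetical.

```python
def negotiate_ciphers(initiator_prefs, responder_supported):
    """Return the ciphers both peers support, ordered by the initiator's preference.

    Hypothetical rule: the initiator's order wins, so both peers compute the
    same list from the same two inputs.
    """
    supported = set(responder_supported)
    common = [c for c in initiator_prefs if c in supported]
    if not common:
        raise ValueError("no cipher in common")
    return common


# Illustrative identifiers only.
alice_prefs = ["chacha20-poly1305", "aes-256-gcm", "aes-128-gcm"]
bob_supported = ["aes-128-gcm", "chacha20-poly1305"]
print(negotiate_ciphers(alice_prefs, bob_supported))
```

The first entry of the result would be the preferred cipher, and the remaining entries are candidates a multi-cipher mode could switch between.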

Eliminating Man-in-the-middle

Most cryptography algorithms wouldn’t even exist if there were no man-in-the-middle, and it is a well-researched topic. There are three ways mesibo deals with the MITM issue:

Using Shared Secret (Password) in KDF Input

Alice and Bob agree on a shared secret out-of-band. This shared secret is fed into the KDF on both sides. Due to the hashing nature of the KDF, even a one-character password changes the entire output. Note that the shared secret is not used directly; it is only added to the KDF inputs, and the rest of the algorithm remains the same. The message keys still change for every message, but they are now also altered by the shared secret.

Most organizations and their customers already have some form of password or PIN, which makes this method the easiest to implement, and it can remove any MITM threat entirely as long as the shared secret remains confidential.

Recall the Signal/WhatsApp problem described at the beginning: if something goes wrong, simply setting a new password changes the per-message key generation and instantly eliminates any possible MITM attack.
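The effect of mixing a password into the KDF can be sketched in a few lines. This is an illustration only: HMAC-SHA256 over length-prefixed inputs stands in for the real KDF, and the function names, the `sketch-kdf` label, and the placeholder DH output are assumptions.

```python
import hashlib
import hmac


def kdf(*inputs: bytes) -> bytes:
    """Stand-in KDF: HMAC-SHA256 over length-prefixed inputs (illustrative only)."""
    data = b"".join(len(x).to_bytes(4, "big") + x for x in inputs)
    return hmac.new(b"sketch-kdf", data, hashlib.sha256).digest()


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length keys differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


dh_output = bytes.fromhex("11" * 32)   # placeholder for the DH results Eve may have influenced

alice_key = kdf(dh_output, b"a")       # Alice and Bob both mix in the one-character password "a"
bob_key = kdf(dh_output, b"a")
eve_key = kdf(dh_output, b"")          # Eve knows the DH output but not the password
new_key = kdf(dh_output, b"b")         # after changing the password

print(alice_key == bob_key, alice_key == eve_key)
print(bit_difference(alice_key, new_key), "of 256 bits changed")
```

Alice and Bob derive the same key, while anyone who lacks the password derives a different one, and changing the password moves both peers to a new, unrelated key.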

Below is a screenshot from the mesibo demo app implementing the password. The mesibo open-source demo apps can be downloaded from the Apple App Store or the Google Play Store.

image

Out-of-Band Identity Key Exchange

Unlike Signal, mesibo does not store identity public keys on the server. Instead, they are exchanged between both parties during the early key negotiation and then also become part of the KDF input. If the identity public key changes, the KDF inputs change, and so do the output message keys. This is the normal mode of operation (in-band).

In addition to in-band key exchange, mesibo also allows parties to export certificates, or use their own certificates, and then exchange them out-of-band. This completely removes mesibo from any key exchange. Any attempt to eavesdrop will fail without access to the right identity key.

image

Out-of-Band DH

As described in the X3DH-P2P section above.

Fingerprint Verification

As shown at the beginning of the article, Signal and WhatsApp use 60-digit fingerprints for each user. If Alice and Bob need to verify each other, they need to compare a whopping 120 digits: Alice first compares the 60 digits of Bob’s fingerprint, and vice versa. It is clumsy, gives no extra protection, and is often discouraging.

Instead, mesibo creates a unique fingerprint for each pair of users, derived from the identity keys of both users. Hence, the fingerprint is identical for Alice and Bob, and it differs from the fingerprint for Alice and Grace. Whether Alice views Bob’s fingerprint or Bob views Alice’s, they see the same value. Similarly, Alice and Grace share a fingerprint that is identical between the two of them but different from the Alice and Bob fingerprint. This approach makes it easy for users to compare fingerprints, and if something is not right, there is an option to reset the entire communication (unlike Signal/WhatsApp, which do not provide any such option so far).
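One way to obtain a fingerprint that is identical on both sides is to sort the two identity keys before hashing them. The sketch below assumes that approach; the actual mesibo derivation is not described here, and the function name, the use of SHA-256, the `sketch-fingerprint` label, and the 30-digit grouped format are hypothetical.

```python
import hashlib


def pair_fingerprint(ik_a: bytes, ik_b: bytes, digits: int = 30) -> str:
    """Derive one fingerprint for a pair of users from both identity public keys.

    Sorting the keys first makes the result independent of who computes it.
    """
    first, second = sorted((ik_a, ik_b))
    digest = hashlib.sha256(
        b"sketch-fingerprint" + len(first).to_bytes(2, "big") + first + second
    ).digest()
    number = int.from_bytes(digest, "big") % 10**digits
    text = f"{number:0{digits}d}"
    # Group the digits in blocks of five for easier reading aloud.
    return " ".join(text[i:i + 5] for i in range(0, digits, 5))


# Placeholder identity public keys.
alice_ik = bytes.fromhex("01" * 32)
bob_ik = bytes.fromhex("02" * 32)
grace_ik = bytes.fromhex("03" * 32)

print(pair_fingerprint(alice_ik, bob_ik))
print(pair_fingerprint(bob_ik, alice_ik))    # same value as the line above
print(pair_fingerprint(alice_ik, grace_ik))  # different pair, different value
```

Alice and Bob therefore compare a single value instead of two, and a change in either identity key changes the fingerprint for that pair only.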

image

Mesibo API Documentation, and Tutorials

Feedback

We will be happy to hear from you and discuss: support@mesibo.com