
Architecture
Æthernet is built on three base architectural principles:
No-handshake protocol. Once a client is registered, a request to a server is performed with a single message containing everything required for encryption, client authentication and the request data itself.
Multiple simultaneous end-points. A client can make requests to any of the servers in the client’s cloud at any time, including simultaneous requests.
Thick client. A thick client greatly improves connectivity: the client can make decisions about connectivity problems that are known only on the client’s side.
Message delivery latency
A typical message broker provides minimum message delivery latency when the Internet connection is stable. High latency, or even the inability to connect, appears when something goes wrong:
Client-side connection with a high rate of packet loss. The connection to the broker may not be established because losing any packet requires the connection process to start from the beginning. If many handshakes are required for the connection then it is almost impossible to reach the end of the process.
Æthernet uses a “no handshakes” / “no round-trips” approach.
If a cloud server replies slowly or is not accessible then the client closes the connection and tries another end-point. This involves seconds of delay.
Æthernet is designed to accept multiple simultaneous duplicated requests. This allows the client to issue a duplicate request to another server if the first server delays its reply by just several milliseconds on top of the ping time.
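The duplicated-request idea can be illustrated with a minimal client-side sketch. It is not the Æthernet client library: the names hedged_request, send and hedge_delay are hypothetical, and the transport is abstracted as any coroutine that sends one request to one server and returns the reply.

```python
import asyncio


async def hedged_request(servers, send, hedge_delay):
    """Return the first reply, duplicating the request to the next server
    whenever no reply has arrived within hedge_delay seconds.

    servers     -- ordered list of server addresses (closest first)
    send        -- hypothetical coroutine function: send(server) -> reply
    hedge_delay -- seconds to wait before also asking the next server
    """
    if not servers:
        raise ValueError("the client's cloud is empty")
    tasks = []
    try:
        for server in servers:
            tasks.append(asyncio.ensure_future(send(server)))
            done, _ = await asyncio.wait(
                tasks, timeout=hedge_delay,
                return_when=asyncio.FIRST_COMPLETED)
            if done:
                # A failed request re-raises its exception here.
                return done.pop().result()
        # Every server has been asked; wait for whichever answers first.
        done, _ = await asyncio.wait(
            tasks, return_when=asyncio.FIRST_COMPLETED)
        return done.pop().result()
    finally:
        for task in tasks:
            task.cancel()  # no-op for requests that already finished
```

A client would typically set hedge_delay to the measured ping time plus a few milliseconds, so that a healthy closest server is almost always the only one contacted.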
Identity
All Æthernet clients are identified by a unique 16-byte identifier (UID) assigned to them during client registration or self-provisioning. The UID is a permanent, non-public, non-secret number with which all requests to Aether are authenticated. The application server uses the UID as an address for sending messages to the client and for managing it. A new client registers under a parent client (aka application server), which is also addressed by UID.
Client-server architecture
Æthernet is a cloud of independent servers (not connected to each other) distributed across the globe. Servers cache data from the central database. Every client has a personal cloud, which is a subset of the whole Æthernet cloud. A client can make requests to the servers of this cloud at any rate, one by one or simultaneously (including multiple simultaneous requests to one server), depending on the use case. Æthernet constructs the cloud to fit the client's needs: minimizing latency both when the connection is healthy and when something goes wrong. A cloud can be changed at any time. The first entry in the list describing the cloud is the closest server, on which the client lands to get minimum message delivery latency.
Application server
An application server is a regular Æthernet client controlled by a developer. It manages child clients: allocates quotas, allows them to receive messages from other clients, and blocks or deletes clients. In contrast, other back-ends use a separate REST/gRPC API to manage clients. Every Æthernet client is managed by its parent. The application server is notified when a new child client is registered or when the allocated quota is almost reached. The application server implements just the business logic of the application, with no worries about infrastructure (scalability, balancing, accessibility, attack resilience, using public IP addresses etc.) - Æthernet takes care of everything.
Æthernet is a tree structure with the root element - Æther. It manages all child clients at any depth in the hierarchy. Dev is a helper application we host that allocates quotas so that developers can try the service. For example, when a user runs the chat example, a new client is created and registered. The chat application server (which we host) is notified of the client creation and allocates a quota that allows the client to send and receive messages. The chat application server also tells Æthernet that the client is allowed to receive messages from all other clients and that all other clients can receive messages from the new one. When the client reaches the quota limit, the chat server deletes the client.
Users is the namespace for the web-site users’ accounts.
Anonymous is where all self-provisioned applications are hosted.
A more complex hierarchy with Mobile, IoT and Infra nodes is shown above as an example:
Admin is a node that is associated with the account on Æthernet.io
Mobile is the application server that manages the smartphone application instances
IoT is a separate node that manages all IoT devices
Infra is a node for service functionality like storing data into the back-end database etc.
A client life-cycle
A client is the addressable entity of Æthernet. A client always has a parent and can have multiple children.
Two separate clouds exist:
Registration cloud is resolved through register.aethernet.io and provides a list of servers that support only the registration API, which a new client relies on to become permanently registered
Working cloud accepts requests from registered clients only. IP addresses of the servers’ end-points are passed during the registration process or can be obtained from cloud.Æthernet.io
Registering new client
A client formally becomes a client only after it is registered in Æthernet’s central database. At the end of registration, the client receives:
A permanent unique identifier (UID). It is non-public and non-secret (known only to the clients that need to interconnect with the client)
A permanent master key that is stored in the central database only and is used for deriving keys for all servers in the Æthernet cloud.
A client’s cloud - a list of servers chosen by Æthernet to minimize response latency and maximize connectivity of the client.
The absolute minimum that represents a client is the uid and the master key. Now the client is able to send requests to Æthernet servers, such as pull messages, send a message, request on-line status, request remaining quotas etc.
The registration process is often called “provisioning”, and “just-in-time provisioning” for self-registration.
Working cloud
A client makes connections to the client’s cloud and makes requests. A connection can be long-living or can be gracefully closed by the client or Æthernet server at any time. Pulling messages is performed as a heart-beat request. When connected, a client can receive an order from a server with a message: new message, result of the server’s request execution, a child client is registered, changing the cloud etc.
A client’s cloud can be re-configured by Æthernet at any time but only when the client is online. The purposes of re-configuring the cloud can be: load balancing, client moved closer to another Æthernet server, a server malfunctioning or is shutting down intentionally, a new server is launching.
Any interaction with the Æthernet servers is performed by sending messages, which are spawned by Actions - asynchronous operations of the Aether API.
Action
An action is an object that represents an asynchronous process that sends data to and receives data from the Æthernet cloud via messages.
A newly taken action is stored into the client’s action list. The action executes periodically. The action can perform a request to Æthernet by sending a message to servers of the registration and working clouds. The action is notified when a reply/request from the server is received. A request may contain several messages.
The action inspects messages and performs some operations:
changes aether / client / cloud etc. states, for example, adding new servers into the cloud
releases itself or other actions
creates other actions
Message
A message is a set of parameters for a remote function executed asynchronously: a client pushes a message to the Æthernet cloud. The server can push a message or multiple messages to the client as a reply.
A message contains authentication information and encrypted data and requires no roundtrips.
Infrastructure
A centralized database contains information about Æthernet clients. When a new client is registered then a new entry in the database is created. The entry contains some information about the client and the most important information is:
UID - the unique identifier of the client. It is a non-secret but non-public 16-byte value. Æthernet attaches no meaning to whether the value is a UUID
Masterkey - a key for the specific symmetric algorithm the client supports
Crypto Algorithms supported by a client
These values are permanently stored and never changed.
Æthernet uses libsodium, libhydrogen or a mix of them for cryptographic functions, and bcrypt for the client registration process. Future releases of Æthernet will support ascon with just a 12-byte nonce and an 8-byte tag for AEAD.
The Æthernet cloud is frequently reconfigured by removing / adding servers in various data centers around the globe. The Æthernet cloud is balanced to achieve the lowest message delivery delays, maximum fault tolerance, adequate resilience against attacks and the lowest maintenance price.
Registration and working clouds run on different hardware servers, which helps keep existing (already registered) clients running when the registration cloud is down for some reason.
Registration
A new client can be created by an application. The result of the request is {uid, masterkey}. Additionally, the application can retrieve the client’s cloud for the specific region where the client is going to be used, to be sure that the client lands on the closest data centers.
Another way to create a client is self-provisioning, where a registering client goes through a registration procedure. The process is unattended by the application and relies fully on the Æthernet infrastructure. Dedicated registration servers are used within the registration cloud.
The registration cloud is maximally distributed to be accessible from any client’s location. The servers’ addresses are resolved through registration.aethernet.io. A client can access any IP address from the resolved list or just take the first IP address (round-robin). A registration server supports only registration requests.
The registration process (referred to as self-provisioning) consists of 3 steps. All requests are designed for maximum performance, so all of them are stateless: a server keeps no data across the sequence of requests, which increases resistance to denial of service.
Requesting public key
The verified public key for asymmetric encryption is obtained at this step. The key is used later to encrypt sensitive data and transfer it to the registration server.
Æthernet doesn’t use certificates for setting up session keys, so key binding should be used to prevent “man-in-the-middle” attacks. Compromise of a bound public key is a catastrophic incident that requires all clients to be updated with the new public key, which is painful or impossible in some cases (for IoT devices, for example). To minimize the probability of such an incident Æthernet uses a global public key for signatures, and this is the key that should be bound. When a new registration server joins the registration cloud, it retrieves some data specific to this particular server:
secret key for asymmetric encryption
public key for asymmetric encryption. This key is transmitted to the registering clients
public key signature generated from global Æthernet PK and SK
Compromising a server doesn’t affect other registration servers because each server has its own keys.
The database contains a pool of [Sk, Pk, Pk_sign] values. When a new registration server is created, one value is transmitted to the server and removed from the database. The database doesn’t keep the global secret key for signing; instead, an external tool is used to refill the pool of signed public keys.
Each time the tool is used, the private signing key is typed in and the output keys are stored on a flash drive and then transmitted to the database. The tool runs on dedicated local hardware with no internet connection. The signing master key is not stored electronically.
The registering client requests a public key for the specific encryption and signature method types that it supports. The request is just 5 bytes and the reply is 99 bytes. The request is extremely robust and can even be implemented within the network interface alone. Amplification attacks can be an issue when UDP is used (not implemented yet).
Requesting proof-of-work parameters
An application account is charged when a new client is registered under the application. To prevent malicious draining of the application’s account and denial of service (from Æthernet and / or user’s application server) with too many dummy clients, a “proof-of-work” from the registering client is required.
The request for these parameters contains:
Parent uid. The uid of the application under which the client wants to be registered
Method type of “proof-of-work” that the client supports. Currently only “bcrypt” hashing is supported.
Secret key encryption method type that the client supports to accept the reply from the server
Secret key for the symmetric encryption that the server should encrypt reply with
Public key method type and signature method type that the client supports. The server replies with the public key for asymmetric encryption and the signature of this key, the same as in the previous request, and the client encrypts its next message with this key. The protocol is stateless: the server doesn’t remember the parameters of previous requests and simply duplicates them here. Making the protocol stateful would open the service to denial of service attacks because of the memory needed to track state: no uid is associated with the registering client at this stage, so any “session id” would have to be accepted, increasing memory usage dramatically.
A “man-in-the-middle” can’t tamper with the request to the server because the verified server public key was acquired at the previous step. The reply to this request can’t be tampered with because it must be encrypted with the client’s secret key.
Register a client with “proof-of-work”
Æthernet uses «proof-of-work» to prevent Sybil attacks, in which a denial of service is caused by too many registrations or registered entities. Another side effect of self-provisioning is that the application account is charged the price of every newly registered client.
An application (a regular Æthernet client that is designated to have other clients registered as children), if allowed to have child clients, can set the maximum desired rate at which new clients can be registered. The rate is measured in registrations per second in total over all registration servers. If the actual rate exceeds the maximum allowed then Æthernet increases the work factor of the proof-of-work algorithm so every new client spends more time computing the proof. Conversely, if the actual rate is dropping then Æthernet decreases the work factor, but not below the minimum work factor value that is also set up for the application. Decreasing the minimum work factor too much allows an attacker to periodically «steal» cheap registrations. The central database service accumulates the registration rate over all registration servers and sets the current value for every application.
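The adjustment rule described above can be sketched as a small function. This is only an illustration under stated assumptions: the function name, the step of one unit per adjustment and the upper bound of 31 (the largest bcrypt cost) are not taken from Æthernet, and "the rate is dropping" is simplified to "the rate is below the maximum".

```python
def adjust_work_factor(current, actual_rate, max_rate, min_factor,
                       max_factor=31):
    """Return the next proof-of-work work factor for one application.

    current     -- work factor currently in force
    actual_rate -- registrations per second summed over all servers
    max_rate    -- maximum desired rate set by the application
    min_factor  -- minimum work factor set by the application
    max_factor  -- assumed hard ceiling (31 is the largest bcrypt cost)
    """
    if actual_rate > max_rate:
        # Too many registrations: make every new proof more expensive.
        return min(current + 1, max_factor)
    if actual_rate < max_rate:
        # Load is below the limit: relax, but never below the floor.
        return max(current - 1, min_factor)
    return current
```

Because each bcrypt cost step doubles the hashing time, even a one-unit step changes the expected registration cost by a factor of two, which is why the minimum work factor matters.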
The reply to the previous request (requesting proof-of-work parameters) is specific to bcrypt hashing:
Salt - the bcrypt-specific salt that contains the work factor and the salt itself. The salt is individual for every new registration request. The server doesn’t keep the salt to check it later on the next request because the protocol is stateless and no association between the registering client and the salt is made. The salt is sent to the registration server at the next step to let the server verify the work done by the client. It is impossible to use another salt generated on the client side because the «password suffix» field is generated based on the salt returned by the server.
Password suffix - a string value (bcrypt is specified for text passwords only) that is concatenated with the password before hashing. The suffix is a “black box” value that the server generates and uses to verify that the actual work has been done to prove the registration. Here is how the server implements this black box to generate password suffixes (it can be changed at any time without notice).
Libsodium is used for random number generation, cryptographic hashing and encryption with symmetric keys. Client library requests registration of a client and passes “as-is” bcrypt salt and password suffix so the server can verify the request:
Decodes the suffix from base64 into binary and decrypts the result with this server’s secret key
The request contains just 12 bytes of nonce; the last 12 bytes are just 0. It is acceptable to use only 12 bytes of nonce with the last 12 bytes hardcoded because no secret data, for which nonce collisions would be critical, is transferred. The nonce is used only for the proof-of-work verification process.
bcrypt allows a maximum password length of 72 characters. Using 24 nonce bytes with the cipher text gives a base64-encoded string of 74 bytes, while using 12 nonce bytes gives a suffix of 56 bytes. The password is a uint32 value stored as a string with a maximum length of 10 characters. The total maximum length is 56 + 10 = 66 bytes, which is less than 72.
each server uses a random secret key - reusing the proof on another server fails
tampering with the nonce or the cipher text fails
Checking the integrity of the encrypted data is a lightweight operation - the CPU is not a bottleneck
checks the decrypted time point
if the time point is older than 10 seconds (no guarantee) then the request fails
if the time point is within 10 seconds then the server computes the hash of the request and searches for it in the table that keeps the hashes of all recent requests
if the value is not found then the request is ignored
if the value is found then the value is removed from the table to avoid replay attacks
the server automatically removes hashes older than 10 seconds. The amount of data used to store hashes is very small. The hash function that is used is not cryptographically safe and is very robust
The server-side time point is very important to check because an attacker can use the current work factor (which is typically low) and pre-compute proofs with a very power-efficient but slow algorithm. To make this pre-computation useless the server sends a salt that is signed by the password suffix.
the server computes the hash of the salt and the parent uid to be sure that:
the bcrypt salt contains the correct work factor and has not been substituted by the client
the parent uid is the one for which the proof-of-work has been computed. This prevents using a fake application with very relaxed proof-of-work parameters.
A timing attack can’t be performed against the registration server because the server just ignores incorrect requests.
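The table of recent request hashes used in the checks above can be sketched as follows. This is a minimal illustration, not the server's code: the class and method names are hypothetical, the 10-second lifetime comes from the text, and the sketch assumes that a hash is recorded when the proof-of-work parameters are issued and consumed when the registration request arrives.

```python
import time


class RecentRequestTable:
    """Short-lived, single-use table of request hashes (replay guard)."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the sketch is testable
        self._issued = {}        # request hash -> time it was recorded

    def remember(self, request_hash):
        """Record a hash for a request that may be redeemed later."""
        self._purge()
        self._issued[request_hash] = self._clock()

    def consume(self, request_hash):
        """Return True once per recorded, unexpired hash.

        The entry is removed on success, so replaying the same
        request a second time returns False.
        """
        self._purge()
        return self._issued.pop(request_hash, None) is not None

    def _purge(self):
        """Drop every entry older than the configured lifetime."""
        deadline = self._clock() - self._ttl
        expired = [h for h, t in self._issued.items() if t < deadline]
        for h in expired:
            del self._issued[h]
```

Since entries live for only a few seconds, the table stays small even under load, which matches the statement that the amount of stored data is very small.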
“Proof-of-work” is provided as a 4-byte number. The server converts the number to a string and concatenates it with the password suffix. The resulting string, together with the bcrypt salt, is passed into the bcrypt hash generation function. The result is passed through the Crc32 function to get a 4-byte integer. The integer must be less than or equal to the maximum hash value. This value is adjusted to support the desired rate of client registrations per second.
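The search and the check described above can be sketched with the Python standard library. One substitution is made loudly: the standard library has no bcrypt, so PBKDF2-HMAC-SHA256 with 2**work_factor iterations stands in for the bcrypt call. All function names are hypothetical, and the real salt and suffix formats are not reproduced here.

```python
import hashlib
import zlib


def slow_hash(password: bytes, salt: bytes, work_factor: int) -> bytes:
    # Stand-in for bcrypt (NOT what Æthernet uses): PBKDF2 with
    # 2**work_factor iterations mimics an exponential work factor.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 2 ** work_factor)


def proof_value(number: int, suffix: str, salt: bytes,
                work_factor: int) -> int:
    # The number is converted to a string and concatenated with the suffix.
    password = (str(number) + suffix).encode("ascii")
    # CRC32 reduces the slow hash to an unsigned 4-byte integer.
    return zlib.crc32(slow_hash(password, salt, work_factor))


def verify(number: int, suffix: str, salt: bytes, work_factor: int,
           max_hash_value: int) -> bool:
    """Server side: a single slow hash decides whether the proof is valid."""
    return proof_value(number, suffix, salt, work_factor) <= max_hash_value


def solve(suffix: str, salt: bytes, work_factor: int,
          max_hash_value: int) -> int:
    """Client side: try 4-byte numbers until one passes the check."""
    for number in range(2 ** 32):
        if verify(number, suffix, salt, work_factor, max_hash_value):
            return number
    raise RuntimeError("no 4-byte number satisfies the target")
```

With a maximum hash value of 2**32 / N the client needs about N attempts on average, while the server always verifies with one hash, so lowering the maximum hash value slows registrations without raising the verification cost.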
The average time a client needs to find the correct numeric value for the proof-of-work doesn’t mean that a particular client spends exactly this time. The normal distribution law is applied here. To normalize the computation time a proof-of-work pool is implemented. The pool size is a server-defined parameter that a client must follow.
The server can reply with new PoW parameters if they have changed since the last request. This prevents the delayed computation attack, in which an attacker creates a lot of registration requests with a low PoW factor; the attack is otherwise possible because the auto-adjustment of the factor is based on the number of successful registrations. If the auto-adjustment were driven by requests for the PoW parameters instead, it would be possible to raise the PoW factor of a victim application to a level at which legitimate clients could not register.
At the end the client specifies the method type of symmetric cryptography that it supports and the secret key for the method type. The key is encrypted with the Æthernet service database public key for asymmetric encryption that is transferred with the signature. The signature is verified by the client to be sure that it is safe to encrypt and transfer the master key to the database so even if the registration server is compromised nothing from the client side is compromised.
The reply to the proof-of-work is:
UID
ephemeral UID
working cloud
Proof-of-work hashing algorithms
The hashing algorithm is a key component of the proof-of-work solution that should:
not be complex, to fit into tiny MCUs
consume small amounts of RAM, to fit MCUs
be fast enough for a single computation that a registration server can verify the proof quickly
not be accelerated with GPUs and ASICs
be proven not to be acceleratable by another algorithm
be cryptographically resistant
Crc / murmur etc. are very fast but are non-cryptographic. It is possible to compute the number back from the hash value.
SHA-like hashes are accelerated very well by GPUs and ASICs
Homomorphic encryption. We examined this scheme but recent reports claimed acceleration by a factor of 75, and by a total factor of 100M in 3 years. The GPU acceleration factor is also 50.
Password hashing algorithms vs bcrypt:
Argon2 - the best known
complex and heavyweight in terms of code size. It is not even implemented in the STM32 crypto library. Bcrypt is implemented.
Memory usage is tens or hundreds of MB of RAM even for just a 100 ms iteration and grows for longer iterations
Argon2 is weaker than bcrypt for iteration times of less than 1 second. An iteration of 1 second on a desktop is too long for MCUs and for server-side verification
scrypt
requires a huge amount of RAM to prevent GPU and ASIC acceleration
bcrypt
has a variety of work factors to choose from for the optimal balance between the single iteration time and crypto-resilience
4 KB of RAM usage
stable code originating from Unix
CPU vs GPU vs ASIC
example: CPU - 50K#/s, GPU - 75K#/s, ASIC - 1M#/sec
Accelerated on GPU but the price per iteration is similar to CPU
Accelerated on ASIC but the price per iteration is almost similar to CPU
Balloon hashing is not proven yet but requires less memory than Argon2
Bandwidth
The registration process requires 3 requests/replies to perform. The amount of data transferred (TCP payload), in bytes, for libsodium:
Request | Send, bytes | Receive, bytes |
---|---|---|
Request keys | 5 | 99 |
Request PoW params | 105 | 216 |
Register | 288 | 140 |
Total: 853 bytes. For comparison, a non-cached TLS session requires about 6.5 KB of data to be transferred.
Working cloud
The working cloud accepts only authenticated requests from already registered clients. Each server has a number that is incremented for every new server. If a physical server is shut down and launched again then it appears in the cloud with a new number.
Before a server is included in the cloud, it first caches some central database fields related to the most active clients. A client’s masterkey is stored only in the central database and is never transferred outside. Instead, a particular Æthernet cloud server’s key for symmetric encryption is derived from the masterkey: key = KDF(masterkey, server_id | key_index). Initially, key_index = 0. If a particular server is compromised then the server is shut down immediately and no other servers are affected by the incident.
A client can renew the key (key_index is incremented) on a particular server at any time, which provides “forward secrecy”.
Rx (receive) and Tx (transmit) secret keys are generated from a single master key to enhance security in case a server or a client is compromised.
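The derivation key = KDF(masterkey, server_id | key_index) can be sketched with keyed BLAKE2b from the Python standard library. The concrete KDF, the 32-bit little-endian field encoding and the "rx" / "tx" labels are assumptions made for the example; the text does not specify them.

```python
import hashlib
import struct


def derive_server_key(masterkey: bytes, server_id: int,
                      key_index: int = 0, direction: bytes = b"tx") -> bytes:
    """Derive a 32-byte per-server key from the client's master key.

    masterkey -- client master key (at most 64 bytes for keyed BLAKE2b)
    server_id -- number of the working-cloud server
    key_index -- incremented every time the client renews the key
    direction -- b"rx" or b"tx", giving separate keys per direction
    """
    # server_id | key_index, packed as two unsigned 32-bit integers
    # (the field widths are an assumption of this sketch).
    context = struct.pack("<II", server_id, key_index)
    return hashlib.blake2b(context, key=masterkey, digest_size=32,
                           person=direction).digest()
```

Because every server receives only its own derived keys, a compromised server reveals neither the master key nor the keys used with other servers, and incrementing key_index yields a fresh, unrelated key.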
Protocol
Æthernet implements a multilevel encapsulating binary protocol with versioning. The basic block of the protocol is the Message, whose data contains the parameters of a function executed remotely on the server, or which the server pushes to a client. The protocol uses a request-response scheme or just sends a message with no reply from the other side. If a connection is established then the server can also send a message to the client - a “push message”.
The protocol can use datagrams or streams as the underlying Internet transport, that is, either connectionless or connection-oriented protocols. A stream is divided into messages with the size specified. A message can encapsulate other messages' data.
Message
A message always starts with a variable-size identifier. The IDs are divided into namespaces. The root namespace is the one in which the remote side starts deserializing a message. Once the message is deserialized completely (including all encapsulated messages), deserialization of the next message starts. Messages are grouped into a packet.
Æthernet limits a single packet to 1500 bytes, which suits the most common internet routing without fragmentation. A user can set a lower limit in the C++ library in order to use other hardware media for transferring information.
An instance of a particular message class is created when the message ID is taken from the stream. The message’s deserialization procedure then initializes its data members. If the message contains encapsulated messages, the deserialization process continues with that data. The message can transform the data, for example, decrypt it. The most important part here is that the root messages' IDs are separated from the other ID namespaces. For example, #3 in the root namespace means a libsodium encrypted message but #3 in the following namespace can be the NumMessages message.
A message can encapsulate multiple separated messages. Example structure:
0: LibsodiumEncrypted: nonce
1: Response: id
1: SetCloud: servers list
Debug information: timings
1: Compressed - a proxy message that redirects to another namespace
2: Zlib
1: Message: sender_id, text
1: Message: sender_id, text
1: Response: id
1: Online: last_timepoint_seen
The number is the namespace in which the message ID is encoded. The root namespace (0) redirects the creation of the next messages to the subsequent namespace. Compressed messages redirect to the selection of the compression algorithm, and the Zlib message returns the stream decoder back to namespace 1, where the Message entries come.
Message IDs namespaces
MessageID uses a variable-size integer to encode the message index, which allows more than 256 messages in a single namespace. That limit is difficult to reach, however, because any number of namespaces can exist to extend the number of supported messages. RequestID is another opportunity to reduce the number of IDs used.
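A variable-size integer can be encoded in several ways, and the text does not say which one Æthernet uses. The sketch below assumes the common unsigned LEB128 scheme (7 payload bits per byte, high bit set when more bytes follow) purely to illustrate how small IDs stay at one byte while larger ones remain representable.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128 (assumed format)."""
    if value < 0:
        raise ValueError("message IDs are non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F          # lowest 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0):
    """Decode one varint; return (value, offset of the next unread byte)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7
```

For example, ID 3 occupies the single byte 0x03, while ID 300 occupies the two bytes 0xAC 0x02.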
Repeat message
Repeat messages don't use encryption. A Repeat message is just a 4-byte hash code and is extremely lightweight in terms of traffic and computational power.
If one side is going to send exactly the same message to the other side then it uses the Repeat message, which just says that the last message must be repeated. It uses Hash(Kdf(Meta::id, tx_key | Hash(Message))) with the Meta::id incremented for each message. The Hash(Message) is used to avoid repeated values when the Meta::id wraps around. The other side performs the same hash computation and, if it matches, repeats the previous message and replies either with a Repeat message if nothing has changed or with a regular message. If the hash code doesn’t match then the other side drops the recorded message and does not reply, because the 4-byte code could have been sent by a man-in-the-middle. The message is dropped because an attacker can send a lot of 4-byte values trying to repeat the message, so a single incorrect value disables the Repeat message. The sender, having received no reply, then sends the regular encrypted and authenticated message.
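The 4-byte code Hash(Kdf(Meta::id, tx_key | Hash(Message))) can be sketched as below. The primitives are stand-ins chosen from the Python standard library (SHA-256 for Hash, HMAC-SHA256 for Kdf), the 64-bit little-endian encoding of Meta::id is an assumption, and the function names are hypothetical.

```python
import hashlib
import hmac
import struct


def repeat_code(meta_id: int, tx_key: bytes, message: bytes) -> bytes:
    """Compute the 4-byte Repeat code for the last sent message."""
    message_hash = hashlib.sha256(message).digest()          # Hash(Message)
    derived = hmac.new(tx_key + message_hash,                # tx_key | Hash
                       struct.pack("<Q", meta_id),           # Meta::id
                       hashlib.sha256).digest()              # Kdf(...)
    return hashlib.sha256(derived).digest()[:4]              # outer Hash


def matches(code: bytes, meta_id: int, tx_key: bytes,
            recorded_message: bytes) -> bool:
    """Receiver side: recompute the code for the recorded message.

    On a mismatch the caller is expected to drop the recorded message
    and stay silent, as described in the text.
    """
    expected = repeat_code(meta_id, tx_key, recorded_message)
    return hmac.compare_digest(code, expected)
```

Binding the code to both Meta::id and the message hash means a captured code cannot be replayed for a later message counter or for different message content.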
Versioning
Versioning of the protocol is implemented by adding new messages with new message IDs. Altering the data of existing messages is not allowed.