@yz-social/kdht
v0.1.27
KDHT
A Kademlia Distributed Hash Table. See the original paper and Wikipedia.
Our system works in browsers (and in NodeJS), using WebRTC data channels to pass information from one node to another. (WebRTC is the only peer-to-peer messaging mechanism built into every browser.) This is different from the original Kademlia, in which information was passed over connectionless UDP: a node kept a small set of data about the other nodes that it knew of, such as IP address and port number, and held no long-lived limited resources such as sockets.
For testing and development, our system also allows the data channel connection to be simulated, so that large numbers of nodes can run directly in the same JavaScript process, invoking RPCs on each other directly rather than over the wire.
KDHT is used in the YZ p2p network.
Basic API
import { WebContact } from '@yz-social/kdht';
class WebContact
static method create({name = uuidv4(), info = true, debug = false} = {})
Promises a WebContact instance with the name property set. It is recommended for both security and robustness that you let the name be generated by the default means each session. For example, other nodes in the network may not wish to interact with a node name that has just recently disconnected. The info and debug parameters can be set to log information about what the instance is doing.
property name
The name of the node within the DHT network.
property attachment
A promise that resolves to the WebContact instance once it is connected to the network. This can be used to, e.g., change the user interface when the node is connected and ready.
property detachment
A promise that resolves once the instance has disconnected from the network, either by an explicit disconnect() from the application (resolving truthy), or from network conditions such as the other side closing (resolving falsy). This can be used to, e.g., change the user interface or create a new WebContact when the ability to interact with the network goes away.
method connect(baseURL = new URL('/kdht', globalThis.location).href)
Connects through the portal at the specified baseURL. Returns a promise that resolves when connected and attachment has resolved. The portal is used only for initial WebRTC signaling to connect to the first other node, after which subsequent connections are made through the network itself. Note that the default is meaningless in NodeJS applications, where baseURL must be supplied explicitly.
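Assuming the API behaves as described above, joining the network might look like the following sketch. The `joinNetwork` helper and its parameters are illustrative, not part of the package; the contact class is injected so the snippet stands alone (in an application it would be `WebContact` imported from `@yz-social/kdht`).

```javascript
// Illustrative sketch, not package API: create a contact and join
// through a portal. ContactClass would be WebContact in practice.
async function joinNetwork(ContactClass, baseURL = 'http://localhost:3000/kdht') {
  const contact = await ContactClass.create({ info: true });
  await contact.connect(baseURL);   // the portal is used only for the first connection
  await contact.attachment;         // resolves once attached to the network
  contact.detachment.then(explicit =>
    console.log(explicit ? 'disconnected cleanly' : 'connection lost'));
  return contact;
}
```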
method disconnect()
Disconnects from all other nodes. Returns a promise that resolves when disconnected and detachement has resolved.
method replicateStorage()
Use this if your application is not disconnecting, but may be shut down by the operating system, such as when a browser document's visibility changes to hidden.
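For example, in a browser one might trigger this from the Page Visibility API. A minimal sketch (the `doc` parameter is injected only so the snippet is self-contained; in a page it would be the global `document`):

```javascript
// Sketch: ask the node to replicate its stored data when the page is
// hidden, since the browser/OS may shut the page down afterwards.
function persistOnHide(contact, doc = globalThis.document) {
  doc.addEventListener('visibilitychange', () => {
    if (doc.visibilityState === 'hidden') contact.replicateStorage();
  });
}
```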
method subscribe({eventName, key, handler, expiration = 60 * 60e3, autoRenewal = false})
Arranges with the network to call handler(item, key) when someone on the network publishes to the key. The subscription lasts only for expiration milliseconds (clamped to one hour), after which the subscription is removed from the network and the instance, and must be renewed if needed. The autoRenewal parameter can be specified truthy to automatically renew the subscription before expiration, for as long as the instance is attached to the network.
The subscription can be explicitly cancelled by supplying a null handler.
key is a 128-bit BigInt that defaults to a hash of eventName.toString(). There are no special semantics for eventName, so if an application needs, e.g., distinct portions of a complex eventName to map to different portions of the DHT address space, the application will need to construct an explicit key.
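To illustrate what "a 128-bit BigInt hash of the event name" means, here is a toy stand-in using FNV-1a. The hash function KDHT actually uses is not specified here; this is only to show the shape of a default key.

```javascript
// Toy illustration of deriving a 128-bit BigInt key from an event name.
// FNV-1a is a stand-in; it is NOT necessarily the hash KDHT uses.
const FNV_OFFSET = 0x6c62272e07bb014262b821756295c58dn; // 128-bit FNV offset basis
const FNV_PRIME = 0x0000000001000000000000000000013bn;  // 128-bit FNV prime
const MASK_128 = (1n << 128n) - 1n;

function eventKey(eventName) {
  let hash = FNV_OFFSET;
  for (const byte of new TextEncoder().encode(eventName.toString())) {
    hash = ((hash ^ BigInt(byte)) * FNV_PRIME) & MASK_128; // keep to 128 bits
  }
  return hash;
}
```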
method publish({eventName, key, payload, subject = payload.toString(), issuedTime = Date.now(), expiration = 10 * 60e3, immediate = false})
Arranges with the network for handler({payload, subject, issuedTime}, key) to be called in every subscribed DHT node. key is as for subscribe.
If immediate is truthy and the Contact has subscribed to the specified key, then the handler is called immediately and synchronously within the publish(), and not called again when the information gets reflected through the network. Otherwise, the handler is not called until the publication "comes back".
Each published datum is individually expired at the issuedTime + expiration, and removed from the network. Data may be republished by publishing to the same key and subject, or cancelled by doing the same with a null payload. Subscribed nodes do have the handler called with the updated or explicitly cancelled payloads, but not for expired data. A newly subscribed node receives previously published data that has not expired or been cancelled.
The payload may also be omitted, which does not change any existing payload data, but extends the expiration time based on the other parameters, with the limitation that the new expiration is never earlier than has been the case before.
The subject property is used to allow multiple publications to the same key. The network fires the handler for published data when it receives data whose issuedTime is strictly newer than any data that it already has for that subject + key. For example, to have multiple nodes publish, update, or cancel their own data to the same key, and to have every node receive the latest such value from all, then the application can have each node publish with, e.g., their contact.name as the subject.
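The freshness rule above can be modeled with a small sketch. This is an illustrative model, not the actual KDHT code: for a given key + subject, incoming data wins only when its issuedTime is strictly newer than what is already stored, and a null payload records a cancellation (which, being newer, still fires handlers).

```javascript
// Illustrative model of the publish freshness rule (not KDHT internals).
class PublicationStore {
  constructor() { this.items = new Map(); }
  accept({ key, subject, payload, issuedTime }) {
    const id = `${key}:${subject}`;
    const existing = this.items.get(id);
    if (existing && issuedTime <= existing.issuedTime) return false; // stale: ignored
    this.items.set(id, { payload, issuedTime }); // payload === null marks cancellation
    return true; // fresh: subscribed handlers would be called
  }
}
```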
Implementation
Classes
A Node is an actor in the DHT, and it has a key - a BigInt of Node.keySize bits. The key is computed as a hash of a name string, which is typically a GUID, but in any case unique among the nodes of the network to which the node is connected.
- A typical client will have one `Node` instance through which it interacts with one DHT.
- A server or simulation might have many `Node` instances that each interact with the same DHT.
- In any case, there is one `Node` class for all uses. For ease of development, it is broken into a linear chain of small classes that live in the `nodes` directory of the repo.
A Node has a Contact object to represent itself to another Node. A Node maintains KBuckets, which each have a list of Contacts to other Nodes.
A Contact is the means through which a Node interacts with another Node instance:
- When sending an RPC request, the `Contact` will "serialize" the sender `Node`'s contact.
- When receiving an RPC request, the receiver "deserializes" that string (maybe using a cache) to produce the `Contact` instance to be noted in the receiver's `KBuckets`.
- In classic UDP Kademlia, a `Contact` would serialize as `{key, ip, port}`.
- Our `WebContact` subclass wraps around a WebRTC connection.
- Our `SimulatedConnectionContact` subclass directly contains the "far" `Node` object itself.
- In both cases, a `Contact` has two `Node` objects: the "host" that owns this `Contact` (in the node's `KBuckets`), and a (possibly empty) representation of the far `Node` (to maintain, e.g., its `key` and `name`). A `Contact` also has customizable methods to `connect` and to `sendRPC`.
While a Node maintains several Contacts in its KBuckets, these are organized based on the distance from the Contact's key to the Node's key. However, each network-probing operation requires the ephemeral creation of Contact information that is based on the distance to the target key being probed for. For this purpose, we wrap the Contacts in a Helper object that caches the distance to the target.
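In Kademlia, distance is the XOR of two keys, and a Contact belongs in the KBucket indexed by the highest bit at which its key differs from ours. A short illustrative sketch (not the package's internal code):

```javascript
// Kademlia XOR distance between two BigInt keys.
const distance = (a, b) => a ^ b;

// KBucket index: position of the highest differing bit, or -1 for
// identical keys. Illustrative sketch, not KDHT internals.
function bucketIndex(selfKey, otherKey, keySize = 128) {
  const d = distance(selfKey, otherKey);
  for (let i = BigInt(keySize) - 1n; i >= 0n; i--) {
    if ((d >> i) & 1n) return Number(i);
  }
  return -1;
}
```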
Connecting
Nodes learn about other node names through those with which they are already connected. In Kademlia, a node will, from time to time, directly connect to these newly discovered nodes.
Browsers do not allow pages to listen for incoming connections. Instead, two WebRTC peers must exchange "signal" information that allows them to simultaneously find each other on the Internet. When a node A needs to directly connect to another node B, A sends signals to B, which responds with its own signals back to A. This happens a couple of times until agreement is reached on how to directly connect. Obviously, these signal messages cannot go directly between two unconnected peers, but must be carried through some intermediary. Once connected to any node in the network, the signals can pass as network messages through that already-connected node.
To form that initial connection to another node, a new unconnected node must go through a "portal" server. This consists of an ordinary Web server that handles a GET request by answering with the name of one of the nodes that are run by that server and with which the server can directly communicate. The joining node then makes a POST to that same server, specifying its own name and that of the portal's node, as well as the signals. The server passes those signals to the specified portal node, and responds with that portal node's answering signals. As long as the joining node remains connected to any other node in the network, it will connect to further nodes by passing signals through the network itself, rather than the portal web server, even when connecting to a node that happens to be on the portal server.
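The GET-then-POST exchange above might be sketched as follows. The request and response shapes here are assumptions for illustration, not the actual wire format; `fetchImpl` is injected so the sketch stands alone.

```javascript
// Hedged sketch of the portal exchange: GET a portal node's name, then
// POST our name and signals and receive the answering signals.
// Body/response shapes are assumed for illustration.
async function exchangeSignals(baseURL, myName, mySignals, fetchImpl = fetch) {
  // GET: the portal answers with the name of one of its own nodes.
  const portalNodeName = await (await fetchImpl(baseURL)).text();
  // POST: our name, the target portal node, and our signals.
  const response = await fetchImpl(baseURL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ name: myName, to: portalNodeName, signals: mySignals }),
  });
  return response.json(); // the portal node's answering signals
}
```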
When a node initiates a WebRTC connection to another node, the second node might not have a Contact for the incoming node, and the receiving node should not make one until the connection is established. Further, the appropriate KBucket might already be full, and yet we still need to keep this new Contact. For these reasons, a Node maintains a list of Contacts for which a connection has been or is being made, called looseContacts. (This is somewhat similar to the "replacement cache" used in some DHT implementations, but we do not yet use it for that specific purpose.)
The number of open WebRTC connections is capped to a platform-specific number based on experimental findings. When that number has been exceeded in a node, the node finds a connection to drop from among its looseContacts and its KBuckets.
In the original Kademlia, a ping RPC was sent to nodes during KBucket maintenance to see if they were still alive. In our case, we have long-lived connections, and we learn when they close. Further, browsers usually give us a chance to "clean up" when the page is closed, so we have the opportunity to send an appropriate message when leaving. Thus we don't have to wait for WebRTC to time out. When we receive such a closure, we delete the Contact connection, and we check for this rather than sending a ping RPC.
Scripts
The NodeJS script portal.js runs a little ExpressJS Web server and associated portal nodes. (See Connecting, above.) Each portal node runs in its own NodeJS sub-process (using NodeJS's cluster mechanism). By default, it runs one portal node for all but one CPU core, leaving one core for the Web server process. The portal nodes are started one at a time, with each after the first connecting to one of the previous portal nodes through the script's own Web server at http://localhost:3000/kdht.
- If an `externalBaseURL` is specified, the first portal node will connect to the compatible portal server running at the specified URL, forming one big network. Otherwise, the portal nodes and any other nodes that connect to them will form a distinct network. `npm start` runs the script to connect with ki1r0y.com/kdht, while `npm run withoutExternal` runs separately.
- The ExpressJS routing parts of `portal.js` are also available separately as `router.js`, so that they can be used as middleware within existing ExpressJS applications.
- A very basic Web page is also served at `http://localhost:3000/node.html` that joins the same network.
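Mounting the routing parts inside an existing ExpressJS app might look like the following sketch. The mount path and the `mountKdht` helper are assumptions for illustration; the router is injected so the snippet stands alone (in practice it would come from `router.js`).

```javascript
// Hypothetical sketch: mount the KDHT routes as middleware in an
// existing ExpressJS app. The '/kdht' path is an assumption.
function mountKdht(app, kdhtRouter, path = '/kdht') {
  app.use(path, kdhtRouter);
  return app;
}
```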
Once one instance of portal.js is running, the script bots.js can be run any number of times (subject to what the computer can handle). Each invocation launches more nodes that connect through the local portal server. These bots can be stable, or they can be told to periodically disconnect and reconnect for testing. The default number of bots is half the number of logical cores on the computer. The script can be run with npm run bots and npm run thrashbots.
Testing
We use the https://jasmine.github.io/ test framework:
- `npx jasmine` runs all the tests that use only simulated network connections. In particular, `npx jasmine spec/dhtSimulationsSpec.js` runs several customizable variations of network behavior.
- `npx jasmine spec/testWebrtc.js` runs the portals and bots scripts, creates a node that interacts with them, and shuts down the portals and bots.
- `npm test` runs all of these.
Future Work
We currently follow the original Kademlia, in which each node probes the network by iteratively connecting with nodes successively closer to a target key, and asking each for better information before contacting the next. Subsequent work (e.g., R5N) uses recursive routing, where one node asks a connected contact for information, who asks another, etc., with the response and information about the network coming back to the callers. This indirect connection mechanism is more private, like onion routing, and we would like to move to it. We already have a weak form of this for passing signals to a target with which we want to connect.
