IPFS Kubo 0.39.0: Optimistic Provide for 10x Faster Content Publishing

IPFS Kubo v0.39.0 introduces Optimistic Provide, a new mechanism that reduces content publishing latency from an average of 15 seconds to under 1 second. It replaces the rigid termination conditions of the traditional Distributed Hash Table (DHT) walk with statistical heuristics, so content becomes discoverable in near real time without compromising record availability.

The Bottleneck in Traditional IPFS Publishing

Content publishing in the Amino DHT used by IPFS involves a two-phase process: a DHT Walk to find the 20 peers closest to the data's Content Identifier (CID) using the XOR distance metric, followed by a Follow-Up phase where the provider record is pushed to those 20 peers.
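The target of the walk, the 20 peers closest to a key under XOR distance, can be illustrated with a short Python sketch. It assumes, for illustration only, that content and peer names are mapped into a shared 256-bit keyspace with SHA-256 and that the full peer list is known locally (a real walk discovers peers iteratively). The helper names `key_of` and `closest_peers` are hypothetical and are not Kubo APIs.

```python
import hashlib

K = 20  # replication target: number of closest peers that receive the record

def key_of(data: bytes) -> int:
    """Map arbitrary bytes into a 256-bit keyspace (SHA-256 is an assumption here)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: bitwise XOR of two keys, read as an integer."""
    return a ^ b

def closest_peers(target: int, peer_keys: list[int], k: int = K) -> list[int]:
    """Return the k peer keys with the smallest XOR distance to the target."""
    return sorted(peer_keys, key=lambda p: xor_distance(p, target))[:k]

# Hypothetical network of 1000 peers and one piece of content.
peers = [key_of(f"peer-{i}".encode()) for i in range(1000)]
target = key_of(b"example content")
closest = closest_peers(target, peers)
```

In the real protocol no node holds the full peer list, which is why the walk has to query peers step by step and why its termination rule matters so much for latency.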

Historically, the DHT Walk was the primary performance bottleneck. The traditional algorithm terminated only after the three closest discovered peers had responded. In a permissionless network with high peer churn, those specific peers are often unreachable, forcing the initiator to backtrack and query more distant nodes. This rigid requirement led to median latencies of ~20 seconds, with some operations taking over two minutes.

How Optimistic Provide Works

Optimistic Provide optimizes the publishing process through three primary technical mechanisms: network size estimation, predictive termination, and early return.

1. Network Size Estimation

To avoid the overhead of crawling the entire network, Kubo now estimates the global network size by piggybacking on the existing routing table refresh mechanism.

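  • Mechanism: During a refresh, a node performs lookups for random keys. Assuming peer IDs are uniformly distributed, the normalized distances from each key to its nearest peers are order statistics of a uniform sample and follow a Beta distribution. In a network of N peers the i-th closest peer sits at an expected normalized distance of i / (N + 1), so the observed distances reveal N.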
  • Bias Correction: To prevent local density bias (where a peer in a dense region overestimates the network size), the system exponentially downweights data points from non-full buckets in the routing table.
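The estimation idea can be sketched in Python as a small simulation. This is a minimal illustration under stated assumptions, not Kubo's estimator: it uses a 64-bit ID space, assumes every lookup returns the true 20 closest peers, fits a least-squares line through the origin to the mean distance of the i-th closest peer, and omits the bucket-based bias correction entirely. The function name `estimate_network_size` is hypothetical.

```python
import heapq
import random

ID_BITS = 64           # shortened ID space for the simulation (assumption)
SPACE = 2 ** ID_BITS
K = 20                 # number of closest peers observed per lookup

def estimate_network_size(peers, lookup_keys, k=K):
    """Estimate N from the mean normalized distance of the i-th closest peer.

    For N uniformly distributed IDs, E[d_i] = i / (N + 1), so a line through
    the origin fitted to the points (i, mean d_i) has slope 1 / (N + 1).
    """
    sums = [0.0] * k
    for key in lookup_keys:
        # XOR distances of the k closest peers, in ascending order.
        nearest = heapq.nsmallest(k, (p ^ key for p in peers))
        for i, dist in enumerate(nearest):
            sums[i] += dist / SPACE
    mean_d = [s / len(lookup_keys) for s in sums]
    # Least-squares slope through the origin: sum(i * d_i) / sum(i^2).
    num = sum((i + 1) * d for i, d in enumerate(mean_d))
    den = sum((i + 1) ** 2 for i in range(k))
    return den / num - 1.0

rng = random.Random(42)
true_size = 10_000
peers = [rng.getrandbits(ID_BITS) for _ in range(true_size)]
keys = [rng.getrandbits(ID_BITS) for _ in range(50)]  # random refresh lookups
estimate = estimate_network_size(peers, keys)
print(f"true size: {true_size}, estimate: {estimate:.0f}")
```

Averaging over many random lookups is what keeps the estimate stable: a single lookup gives a noisy reading, while the distances from dozens of lookups together pin down the network size much more tightly.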

2. Predictive Termination

Using the network size estimate, the initiator makes probabilistic decisions during the DHT walk instead of waiting for rigid confirmations.

  • Per-Peer Storage: If the initiator encounters a peer that is statistically likely (90% confidence) to be among the 20 closest peers network-wide, it stores the record with that peer immediately.
  • Walk Termination: Once the initiator is 90% confident that its current set of 20 closest peers constitutes the global closest set, it terminates the walk immediately, bypassing the need to wait for the three closest peers to respond.
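One way to picture these decisions is the following Python sketch. The statistical model is an assumption made for illustration, since the text above does not specify the exact test: the number of peers closer to the key than a given peer is modelled as Poisson with mean (network size × normalized distance), and the walk-termination rule simply applies the per-peer test to the farthest member of the current closest set. All function names are hypothetical.

```python
import math

K = 20             # size of the closest-peer set
CONFIDENCE = 0.90  # confidence threshold named in the text

def poisson_cdf(n: int, lam: float) -> float:
    """P(X <= n) for X ~ Poisson(lam)."""
    term = math.exp(-lam)
    total = term
    for i in range(1, n + 1):
        term *= lam / i
        total += term
    return total

def prob_among_k_closest(norm_distance: float, network_size: float, k: int = K) -> float:
    """Probability that fewer than k peers are closer to the key than this peer.

    Assumption: the count of closer peers is Poisson(network_size * norm_distance).
    """
    return poisson_cdf(k - 1, network_size * norm_distance)

def should_store_immediately(norm_distance: float, network_size: float) -> bool:
    """Per-peer decision: store now if the peer is likely in the global top K."""
    return prob_among_k_closest(norm_distance, network_size) >= CONFIDENCE

def should_terminate_walk(norm_distances: list[float], network_size: float) -> bool:
    """Walk decision (assumed rule): stop once the farthest of the K closest
    known peers passes the per-peer test."""
    if len(norm_distances) < K:
        return False
    farthest = sorted(norm_distances)[K - 1]
    return should_store_immediately(farthest, network_size)
```

With an estimated network of 10,000 peers, a peer at normalized distance 0.0005 is expected to have about 5 peers closer to the key and passes the test, while a peer at 0.003 is expected to have about 30 closer peers and does not.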

3. Early Return

To prevent unresponsive peers in the follow-up phase from stalling the user experience, Kubo implements an early return threshold.

  • Threshold: Control is returned to the user as soon as 15 of the 20 target peers confirm storage.
  • Asynchronous Completion: The remaining 5 requests continue in the background. Research indicates that reducing the replication factor from 20 to 15 has a negligible impact on record availability.
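The early-return pattern can be sketched with Python's asyncio. This is an illustration of the control flow only, not Kubo's implementation: `store` stands in for a hypothetical coroutine that sends the provider record to one peer and returns True on confirmation, and `provide_with_early_return` is a hypothetical name.

```python
import asyncio

EARLY_RETURN_THRESHOLD = 15  # confirmations needed before returning to the caller

async def provide_with_early_return(peers, store, threshold=EARLY_RETURN_THRESHOLD):
    """Send the record to every peer, return once `threshold` peers confirm.

    Returns the confirmation count and the set of still-running tasks, which
    keep executing in the background on the same event loop.
    """
    pending = {asyncio.create_task(store(peer)) for peer in peers}
    confirmed = 0
    while pending and confirmed < threshold:
        done, pending = await asyncio.wait(pending, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            # A failed or rejected store does not count as a confirmation.
            if task.exception() is None and task.result():
                confirmed += 1
    return confirmed, pending
```

The caller regains control after the fifteenth confirmation, while slow or unresponsive peers only delay the background tasks. A long-running node would let those tasks finish; a short script has to await or cancel them before its event loop shuts down.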

Performance Results and Availability

After Kubo v0.39.0 was deployed, average upload latency dropped from ~15 seconds to approximately 0.7 seconds.

Impact on Record Availability

Despite the speed-up, record availability remains intact. The peers selected via the optimistic approach are statistically close enough to the target key to ensure retrievability. Furthermore, Kubo's Reprovide Sweep subsequently performs a precise PUT operation in the background to correct any initial placement imprecision, ensuring long-term accuracy without affecting the user's initial experience.

Limitations and Future Improvements

Two primary limitations currently affect the optimization:

  1. Undialable Peers: Approximately 50% of peer identities in the Amino DHT advertise private IP addresses and therefore cannot be dialed from the public internet. These peers still inflate the network size estimate, which makes the termination threshold more conservative and reduces potential performance gains.
  2. Cold Start: Nodes require a partial routing table refresh upon startup before they can generate a network size estimate.

Proposed improvements include filtering out private IP addresses from the estimation, integrating the Reprovide Sweep to provide a direct peer count, and persisting the network size estimate to disk to eliminate the cold start delay.
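The first proposal, excluding peers that only advertise private addresses, could look roughly like the Python sketch below. It is an assumption-laden illustration: addresses are treated as simple multiaddr-style strings split on "/", whereas a real implementation would use a proper multiaddr parser, and the function names are hypothetical.

```python
import ipaddress

def advertised_ips(multiaddrs):
    """Extract IP addresses from strings such as /ip4/1.2.3.4/tcp/4001."""
    ips = []
    for addr in multiaddrs:
        parts = addr.split("/")
        # Pair each protocol name with the value that follows it.
        for proto, value in zip(parts, parts[1:]):
            if proto in ("ip4", "ip6"):
                ips.append(ipaddress.ip_address(value))
    return ips

def is_publicly_dialable(multiaddrs):
    """True if the peer advertises at least one non-private IP address."""
    return any(not ip.is_private for ip in advertised_ips(multiaddrs))

# Hypothetical peers observed during a routing table refresh.
observed = {
    "peer-a": ["/ip4/192.168.1.5/tcp/4001"],
    "peer-b": ["/ip4/10.0.0.7/tcp/4001", "/ip4/8.8.8.8/tcp/4001"],
    "peer-c": ["/ip6/::1/tcp/4001"],
}
countable = [peer for peer, addrs in observed.items() if is_publicly_dialable(addrs)]
```

Only peers in `countable` would then contribute to the network size estimate, so undialable identities no longer push the estimate upward.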

Summary of Technical Gains

Optimistic Provide delivers a speed-up of over one order of magnitude, enabling sub-second PUT operations for 90% of requests in North America and Europe while reducing network overhead by over 40%. For developers and users, this means content is discoverable via HTTP Gateways and other Kubo nodes within one second of being pushed to the network.
