Product & Technology

Discussing Serverless AI-ready Data Cloud through Native Distributed Caching Acceleration

baize

Jan 3, 2025

In the contemporary landscape of data management, major manufacturers increasingly favor cost-effective object storage as their primary solution for persistent storage. This preference is driven by the numerous advantages that object storage offers, including low cost, high density, and enhanced serviceability. As enterprises grapple with the exponential growth of data, object storage has emerged as a compelling choice. However, despite its strengths, object storage is not without limitations, particularly concerning Input/Output Operations Per Second (IOPS) and bandwidth. Object storage systems are inherently designed to handle massive volumes of data, making them ideal for scenarios characterized by extensive storage needs. Nevertheless, this design can result in significant drawbacks regarding IO latency, especially in cross-regional storage and large-scale data retrieval operations.

As organizations increasingly adopt a model that separates storage from computation, the limitations of object storage IO speed have become a bottleneck for high-performance computing applications. Addressing the challenge of improving IO speed in object storage is now a critical priority for the industry.

In response to these challenges, DataCloud has emerged as a solution provider focused on delivering cost-effective computing services aimed at accelerating global digitalization. A key innovation from AI-ready Data Cloud is the development of the Near Data Processing (NDP) service component. This advanced component is designed to mitigate data access costs while significantly enhancing access performance and throughput. By integrating distributed cache acceleration services and enabling near-data computing offload, NDP provides a robust framework for achieving high-efficiency computing.

The Serverless AI-ready Data Cloud, championed by Bai Ze and Shang Cheng from the AI-ready Data Cloud team at DataCloud Technology, offers compelling advantages through its native distributed cache acceleration services. By optimizing data access patterns and reducing latency, this solution empowers enterprises to harness the full potential of their data while overcoming the inherent limitations of traditional object storage systems.

Overview of Native Distributed Caching Acceleration Services

NDP serves as a pivotal component in modern data management, providing a standard POSIX interface that abstracts the complexities of underlying storage systems. This abstraction allows upper-layer applications to operate seamlessly, fostering a unified data management platform. By employing varied internal logic tailored to specific access schemes, NDP minimizes the necessity for distinct logic development in upper applications, thereby enhancing service diversity and flexibility.
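
To make the abstraction concrete, here is a minimal sketch of how an upper-layer application might read data through such a POSIX-style interface. The mount point and paths are hypothetical (the post does not specify them); the point is that the application only sees ordinary file operations while NDP decides how the data is actually served.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class PosixStyleRead {
    public static void main(String[] args) throws Exception {
        // Hypothetical mount point: NDP (or a FUSE-style layer in front of it)
        // exposes object-storage data under an ordinary directory, so the
        // application uses plain file APIs and never talks to S3/OSS directly.
        Path table = Path.of("/mnt/ndp/warehouse/orders/part-00000.parquet");

        // Standard POSIX-style read; caching and backend selection happen below this call.
        byte[] bytes = Files.readAllBytes(table);
        System.out.printf("read %d bytes via the unified interface%n", bytes.length);

        // Listing works the same way: the directory tree is the abstraction,
        // regardless of which object store holds the underlying objects.
        try (var entries = Files.list(table.getParent())) {
            List<Path> parts = entries.toList();
            System.out.println("visible data files: " + parts.size());
        }
    }
}
```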

As its name implies, NDP is closely tied to data processing, leveraging multi-level caching to reduce data-retrieval latency during computation. Balancing service performance against cost in this way is essential for achieving superior data processing outcomes. The architecture of the native distributed caching acceleration service is divided into two fundamental components: the local-cache, situated close to the data processing units, and the remote-cache, which is distributed and scalable.

Both local-cache and remote-cache are further subdivided into two smaller cache types: mem-cache and disk-cache. This strategic segmentation optimizes the utilization of memory and local disk resources, ensuring efficient intra-cluster data circulation within AI-ready Data Cloud clusters. As a result, the costs associated with data access are significantly reduced.
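
A rough sketch of how a read might traverse these tiers is shown below. The tier interface, names, and back-fill behavior are assumptions made for illustration, not NDP's actual code; they only express the lookup order the two paragraphs above describe.

```java
import java.util.Optional;

/** Illustrative read path through the cache tiers (names are assumptions). */
public class TieredReadPath {
    interface Tier { Optional<byte[]> get(String key); void put(String key, byte[] value); }

    private final Tier localMemCache;   // in-process memory pages
    private final Tier localDiskCache;  // idle local-disk capacity
    private final Tier remoteCache;     // shared cache cluster near the compute nodes
    private final Tier objectStore;     // S3/OSS/... as the source of truth

    TieredReadPath(Tier mem, Tier disk, Tier remote, Tier store) {
        this.localMemCache = mem; this.localDiskCache = disk;
        this.remoteCache = remote; this.objectStore = store;
    }

    byte[] read(String key) {
        // Consult the cheapest/fastest tier first and back-fill on the way out,
        // so later reads of the same page stop at an earlier tier.
        Optional<byte[]> hit = localMemCache.get(key);
        if (hit.isPresent()) return hit.get();

        hit = localDiskCache.get(key);
        if (hit.isEmpty()) hit = remoteCache.get(key);
        if (hit.isEmpty()) hit = objectStore.get(key);   // last resort: object storage IO

        byte[] value = hit.orElseThrow();
        localDiskCache.put(key, value);
        localMemCache.put(key, value);
        return value;
    }
}
```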

NDP local-cache

High-performance computing typically requires substantial CPU resources, and across the various instance specifications there are often idle local-disk and memory (MEM) resources. Although the cost of these resources is relatively small, their utilization value should not be underestimated. Because the IO latency of local disks is far lower than the network overhead of accessing object storage, using a small share of CPU resources to manage local-disk data as a caching layer can effectively reduce the number of object storage IO requests.

The local-cache makes maximum use of local memory and local disks, caching as much of the data that has been written or read as possible. Subsequent retrievals can then be served directly from the local cache, bypassing object storage altogether. The local-cache's management strategies ensure that the IO performance of a cache hit is comparable to that of using the caching medium directly, significantly improving data-retrieval efficiency during computation.

Furthermore, NDP is agnostic to upper-layer data structures: it pools all data and manages it uniformly. This permits caching at the granularity of data pages, refining cache granularity and effectively reducing eviction rates. In summary, NDP's architecture and its dual-layer caching strategy are instrumental in optimizing data access, ultimately leading to improved computational efficiency and reduced operational costs.
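
The post does not spell out the eviction policy, but a page-granular cache of the kind described could look roughly like the following sketch, which assumes fixed-size pages and a simple LRU policy (both assumptions) to show why caching pages rather than whole objects reduces eviction pressure.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Page-granular LRU cache sketch (page size and LRU policy are assumptions). */
public class PageCache {
    static final int PAGE_SIZE = 4 << 20;           // cache 4 MiB pages, not whole objects
    private final long capacityBytes;
    private long usedBytes = 0;

    /** Key = objectKey + pageIndex, so a hot page can stay cached even if the rest of the object is evicted. */
    private final LinkedHashMap<String, byte[]> pages =
            new LinkedHashMap<>(1024, 0.75f, /*accessOrder=*/true);

    PageCache(long capacityBytes) { this.capacityBytes = capacityBytes; }

    synchronized byte[] get(String objectKey, long pageIndex) {
        return pages.get(objectKey + "#" + pageIndex);
    }

    synchronized void put(String objectKey, long pageIndex, byte[] page) {
        String key = objectKey + "#" + pageIndex;
        byte[] prev = pages.put(key, page);
        usedBytes += page.length - (prev == null ? 0 : prev.length);

        // Evict least-recently-used pages until the cache fits its budget again.
        var it = pages.entrySet().iterator();
        while (usedBytes > capacityBytes && it.hasNext()) {
            Map.Entry<String, byte[]> oldest = it.next();
            if (oldest.getKey().equals(key)) continue;   // never evict the page we just inserted
            usedBytes -= oldest.getValue().length;
            it.remove();
        }
    }
}
```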

NDP remote-cache

In the realm of large-scale data computations, the limitations of local disk resources often become apparent, particularly when the required cache volume exceeds available capacity. This challenge is exacerbated in distributed environments, where overlapping content in local caches across multiple compute nodes diminishes overall cache utilization within the cluster. The rise of serverless technology poses further complications: on startup, local caches must reacquire data from object storage, hindering the efficiency of caching mechanisms.

To mitigate these issues, NDP provides a remote-cache: an independent caching cluster situated in close proximity to compute resources. This placement significantly reduces cross-regional network latency and the read/write delays associated with object storage, thereby enhancing IO performance. The caching cluster is designed with a low CPU-to-local-disk ratio, ensuring a cost-effective solution that maximizes resource utilization.

The local-cache serves as a client interacting with the remote-cache, creating a unified usage pattern for upper-layer applications. This abstraction shields applications from the underlying complexities of the caching infrastructure while facilitating rapid cache exchanges. Such a design not only streamlines operations but also improves responsiveness in data retrieval.

The manager of the remote-cache is the core of the caching cluster's topology management. It provides clients with real-time cluster topology information through a primary-backup model, ensuring continuous data accessibility, and it retains essential metadata that can be leveraged for value-added features, enhancing the overall utility of the caching system.

The daemon of the remote-cache serves as the data node, responsible for storing all cached data slices and associated metadata. By design, the node that hosts the first data slice of an object also manages the corresponding metadata, distributing the metadata load and preventing centralized access bottlenecks. This architecture promotes the efficiency and scalability crucial for modern data-intensive applications.
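
As a hypothetical illustration of the "metadata lives with the first slice" rule, the sketch below places each object's metadata on whichever daemon holds slice 0. The slice size, class names, and the placeholder placement function are assumptions; the actual slice-to-node mapping is the consistent hashing described next.

```java
import java.util.List;

/** Illustrative placement rule: the daemon that stores slice 0 of an object
 *  also stores that object's metadata (names and slice size are assumptions). */
public class SlicePlacement {
    static final long SLICE_SIZE = 64L << 20;   // assume 64 MiB cache slices

    private final List<String> daemons;         // topology as reported by the manager

    SlicePlacement(List<String> daemons) { this.daemons = daemons; }

    /** Pick the daemon for a given slice (placeholder for the real consistent hashing). */
    String daemonForSlice(String objectKey, long sliceIndex) {
        int bucket = Math.floorMod((objectKey + "#" + sliceIndex).hashCode(), daemons.size());
        return daemons.get(bucket);
    }

    /** Metadata is co-located with slice 0, so no single node owns all metadata. */
    String daemonForMetadata(String objectKey) {
        return daemonForSlice(objectKey, 0);
    }

    public static void main(String[] args) {
        var placement = new SlicePlacement(List.of("cache-0", "cache-1", "cache-2"));
        System.out.println("metadata node: " + placement.daemonForMetadata("warehouse/orders/part-0"));
        System.out.println("slice 3 node : " + placement.daemonForSlice("warehouse/orders/part-0", 3));
    }
}
```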

The remote-cache employs a consistent hashing algorithm for cache placement, which brings clear benefits to data management and retrieval. At the core of its design, consistent hashing addresses three pivotal aspects: data uniformity, scalability, and reliability. The algorithm significantly mitigates hash collisions, ensuring that distributed data is spread evenly across service nodes. The result is effective load balancing, which optimally distributes access load and maintains stable performance. Scalability is notably enhanced as well: when nodes are added or removed, the algorithm minimizes cache invalidation, facilitating a seamless scaling process. This adaptability also makes it possible to deploy the remote-cache on low-frequency bid (spot) instances, which are constrained by resource availability but still benefit from efficient data caching.
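
A minimal consistent-hash ring with virtual nodes, of the general kind such a remote-cache could use, is sketched below. The hash function, virtual-node count, and class names are assumptions rather than NDP's implementation; the behavior to note is that removing a node only remaps that node's share of keys.

```java
import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Minimal consistent-hash ring with virtual nodes; a sketch of the idea only. */
public class ConsistentHashRing {
    private static final int VIRTUAL_NODES = 128;   // spread each daemon around the ring
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) ring.put(hash(node + "#" + i), node);
    }

    void removeNode(String node) {
        // Only keys that mapped to this node move elsewhere; all other cache entries stay valid.
        for (int i = 0; i < VIRTUAL_NODES; i++) ring.remove(hash(node + "#" + i));
    }

    String nodeFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private static long hash(String s) {
        CRC32 crc = new CRC32();                     // any well-mixed hash works for the sketch
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        var ring = new ConsistentHashRing();
        ring.addNode("cache-0"); ring.addNode("cache-1"); ring.addNode("cache-2");
        System.out.println("key maps to " + ring.nodeFor("warehouse/orders/part-0#slice-7"));
        ring.removeNode("cache-1");                  // scaling in only remaps cache-1's share
        System.out.println("after scale-in: " + ring.nodeFor("warehouse/orders/part-0#slice-7"));
    }
}
```

Spreading each daemon across many virtual points on the ring is what gives the even key distribution and load balancing described above.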

The remote-cache implements a PIN caching capability aimed specifically at managing hot table data. This feature ensures that frequently accessed data remains prioritized and resistant to eviction, thus maintaining optimal performance during peak read and write operations. Once a predetermined Time To Live (TTL) expires, the data is unpinned and moved into an eviction phase, thereby supporting a long-term caching strategy for hot tables and enabling dynamic resource allocation.
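
The exact PIN mechanics are not described in the post; the sketch below assumes a registry of pinned key prefixes with per-pin TTLs, which the eviction path consults before discarding an entry.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of PIN semantics for hot-table data: pinned keys are skipped by eviction
 *  until their TTL lapses (names and the prefix-based matching are assumptions). */
public class PinRegistry {
    private final Map<String, Instant> pinnedUntil = new ConcurrentHashMap<>();

    /** Pin a hot table's cache entries for the given TTL, e.g. the duration of a nightly job. */
    void pin(String tablePrefix, Duration ttl) {
        pinnedUntil.put(tablePrefix, Instant.now().plus(ttl));
    }

    /** Eviction asks this before discarding an entry; expired pins are cleaned up lazily. */
    boolean isEvictable(String key) {
        for (Map.Entry<String, Instant> pin : pinnedUntil.entrySet()) {
            if (key.startsWith(pin.getKey())) {
                if (Instant.now().isBefore(pin.getValue())) return false;  // still pinned
                pinnedUntil.remove(pin.getKey(), pin.getValue());          // TTL expired: unpin
            }
        }
        return true;   // unpinned entries follow the normal eviction policy
    }
}
```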

Furthermore, NDP optimizes data retrieval from object storage by handling large IO operations efficiently: it splits them into multiple smaller tasks and issues them concurrently, reducing waiting time when reading from object storage. This performance boost is crucial in environments where timely data access is critical.
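
Below is a sketch of the splitting idea, under assumed chunk and thread-pool sizes and with a hypothetical readRange hook standing in for the object-store range request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch of splitting one large object read into concurrent range reads
 *  (chunk size, pool size, and the RangeReader hook are assumptions). */
public class ParallelRangeRead {
    interface RangeReader { byte[] readRange(String key, long offset, long length) throws Exception; }

    static byte[] read(RangeReader store, String key, long objectSize) throws Exception {
        final long chunk = 8L << 20;                         // 8 MiB sub-requests
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
            List<Future<byte[]>> parts = new ArrayList<>();
            for (long off = 0; off < objectSize; off += chunk) {
                final long offset = off;
                final long length = Math.min(chunk, objectSize - off);
                parts.add(pool.submit((Callable<byte[]>) () -> store.readRange(key, offset, length)));
            }
            // Reassemble in order; the sub-requests themselves ran concurrently.
            // For simplicity the sketch assumes the object fits into a single array.
            byte[] result = new byte[(int) objectSize];
            int pos = 0;
            for (Future<byte[]> part : parts) {
                byte[] bytes = part.get();
                System.arraycopy(bytes, 0, result, pos, bytes.length);
                pos += bytes.length;
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```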

Compared to the community version of Alluxio, NDP's remote-cache offers distinct advantages by avoiding the IOPS limitations that stem from a centralized metadata node. It delivers high throughput and low latency, coupled with horizontal scalability, making it a superior choice for contemporary data management challenges. The ability to compress the processing latency of individual requests to the microsecond range, along with reducing network IO latency to the millisecond range, underscores the system's efficiency. These features culminate in full utilization of local bandwidth, further enhancing overall system performance.

The Value Brought by Native Distributed Cache Acceleration Services

In terms of functionality, NDP currently provides stable IO support for object services across multiple platforms such as AWS S3, KS3, US3, COS, OSS, TOS, and GCS. By standardizing error handling and unifying disparate IO processes, it bridges the gaps inherent in multi-platform interactions and significantly alleviates the burden of redundant development for upper-layer applications. This layered approach not only simplifies integration but also enhances the overall management of data access within an AI-ready Data Cloud framework.

The current iteration of NDP supports programming interfaces in C, C++, and Java, with an expansion to Python on the horizon. This is designed to accommodate a broader array of IO components, fostering connectivity and versatility. As demand for agile data solutions increases, the ability to interact across multiple languages is vital for integrating diverse applications, enabling developers to leverage NDP's resources with greater ease.
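
One way to picture the unification layer is a single interface plus a normalized error model that every platform adapter maps onto; the names below are hypothetical and meant only to illustrate the idea, not NDP's actual API.

```java
/** Hypothetical unified object-store interface of the kind NDP layers over vendor SDKs;
 *  the names and the error model here are assumptions for illustration only. */
public interface ObjectStoreClient {
    byte[] get(String bucket, String key) throws ObjectStoreException;
    void put(String bucket, String key, byte[] data) throws ObjectStoreException;

    /** One error type for upper layers, regardless of which vendor raised it. */
    class ObjectStoreException extends Exception {
        public enum Kind { NOT_FOUND, THROTTLED, UNAUTHORIZED, TRANSIENT, OTHER }
        public final Kind kind;
        public ObjectStoreException(Kind kind, String message, Throwable cause) {
            super(message, cause);
            this.kind = kind;
        }
    }
}
```

Each supported platform (S3, KS3, US3, COS, OSS, TOS, GCS) would then get an adapter that translates its SDK's errors into the shared Kind values, so upper-layer applications write a single retry and error-handling path.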

In terms of performance, according to internal E2E testing of the AI-ready Data Cloud product, when there is no cache the first read through the local-cache shows no difference in performance compared to accessing object storage directly; on subsequent reads, hitting the local-cache yields roughly a 30-fold improvement in compute IO performance over direct object storage access. Reading from the remote-cache incurs about a 15% performance degradation compared to the local-cache when the cache is warm, but it offers significantly larger capacity and a much higher hit rate. Additionally, NDP delivers linear scalability while maintaining millisecond latency during both scale-out and scale-in. This user-transparent scalability ensures that businesses can adapt dynamically to fluctuating data demands without compromising performance or incurring excessive costs.

By addressing the challenges of object storage IO performance, the AI-ready Data Cloud product provides users with a low-cost, high-speed response experience. It offers efficient, low-cost solutions for workloads such as search, recommendation, advertising, and large models, making it an excellent choice for those seeking ultimate cost-effectiveness and computational efficiency.