D3N RGW Data Cache
Datacenter-Data-Delivery Network (D3N) uses high-speed storage such as NVMe flash or DRAM to cache datasets on the access side. Such caching allows big data jobs to use the compute and fast storage resources available on each Rados Gateway node at the edge.
Many datacenters include low-cost, centralized storage repositories, called data lakes, to store and share terabyte and petabyte-scale datasets. By necessity, most distributed big-data analytic clusters such as Hadoop and Spark must depend on accessing a centrally located data lake that is relatively far away. Even with a well-designed datacenter network, cluster-to-data lake bandwidth is typically much less than the bandwidth of solid-state storage located at an edge node.
Architecture
D3N improves the performance of big-data jobs by speeding up repeated reads of datasets from the data lake. Cache servers are located in the datacenter on the access side of potential network and storage bottlenecks. D3N's two-layer logical cache forms a traditional caching hierarchy *, in which caches nearer the client have the lowest access latency and overhead, while caches at higher levels of the hierarchy are slower (requiring multiple hops to access). The layer 1 cache server nearest to the client handles object requests by breaking them into blocks, returning any blocks that are cached locally, and forwarding missed requests to the block's home location (as determined by consistent hashing) in the next layer. Cache misses are forwarded to successive logical caching layers until a miss at the top layer is resolved by a request to the data lake (RADOS).
* Currently only the layer 1 cache has been upstreamed.
See MOC D3N (Datacenter-scale Data Delivery Network) and Red Hat Research D3N Cache for Data Centers.
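The request flow described above (split an object request into blocks, serve local hits, forward misses to a home location chosen by consistent hashing) can be sketched in a few lines of Python. This is an illustration only, not the RGW implementation: the 4 MiB block size, the key format, the server names, and the `HashRing` class are all assumptions made for the example.

```python
import bisect
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # assumed block size, for illustration only


def split_into_blocks(object_size, block_size=BLOCK_SIZE):
    """Return (offset, length) pairs covering an object of object_size bytes."""
    return [(off, min(block_size, object_size - off))
            for off in range(0, object_size, block_size)]


def _hash(key):
    # Stable 64-bit hash; Python's built-in hash() is randomized per process.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, servers, vnodes=64):
        self._ring = sorted((_hash(f"{s}#{i}"), s)
                            for s in servers for i in range(vnodes))
        self._points = [p for p, _ in self._ring]

    def home(self, block_key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(block_key)) % len(self._ring)
        return self._ring[idx][1]


def route_request(bucket, obj, object_size, local_cache, ring):
    """Classify each block as a local hit or a forward to its home server."""
    plan = []
    for off, _length in split_into_blocks(object_size):
        key = f"{bucket}/{obj}@{off}"  # hypothetical block key format
        if key in local_cache:
            plan.append((key, "local-hit"))
        else:
            plan.append((key, "forward:" + ring.home(key)))
    return plan
```

Consistent hashing is used here because adding or removing a cache server remaps only a fraction of the block keys, so most cached blocks keep their home location.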
Implementation
The D3N cache supports both the S3 and Swift object storage interfaces.
D3N currently caches only tail objects, because they are immutable; by default these are the parts of objects that are larger than 4MB. (The NGINX RGW Data cache and CDN supports caching of all object sizes.)
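As a rough way to estimate how much of a dataset can benefit from the cache, the following sketch computes the tail portion of an object. It assumes that the first 4 MiB of each object is not part of the tail and that "4MB" means 4 MiB; both are assumptions made for this example, and the function name is hypothetical.

```python
HEAD_SIZE = 4 * 1024 * 1024  # assumed default: first 4 MiB is not in the tail


def cacheable_tail_bytes(object_size, head_size=HEAD_SIZE):
    """Bytes of an object that fall in the tail and are candidates for caching."""
    return max(0, object_size - head_size)
```

Under these assumptions, a workload made up mostly of objects at or below 4 MiB would see little benefit from D3N.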
Requirements
An SSD (/dev/nvme, /dev/pmem, /dev/shm) or similar block storage device, formatted and mounted, to be used as the cache backing store. Filesystems other than XFS have not been tested. Depending on device performance, multiple RGWs may share a single device, but each requires a discrete directory on the device filesystem.
Limitations
D3N will not cache objects compressed by Rados Gateway Compression (OSD level compression is supported).
D3N will not cache objects encrypted by Rados Gateway Encryption.
D3N will be disabled if the rgw_max_chunk_size config variable value differs from the rgw_obj_stripe_size config variable value.
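The last limitation can be checked before enabling the cache. The sketch below parses ceph.conf-style text with Python's `configparser` and compares the two options within one section. It is a simplification: it assumes both values are written as plain byte integers, it does not merge values inherited from other sections or from the monitor configuration database, and it returns `None` when either option is absent rather than guessing a default. The function name is hypothetical.

```python
import configparser


def d3n_chunk_stripe_check(conf_text, section):
    """Return True/False if both sizes are set in `section`, else None."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # Treat "rgw max chunk size" and "rgw_max_chunk_size" as the same key.
    parser.optionxform = lambda k: k.strip().lower().replace(" ", "_")
    parser.read_string(conf_text)
    if not parser.has_section(section):
        return None
    chunk = parser.get(section, "rgw_max_chunk_size", fallback=None)
    stripe = parser.get(section, "rgw_obj_stripe_size", fallback=None)
    if chunk is None or stripe is None:
        return None  # unknown: the effective value comes from elsewhere
    return int(chunk) == int(stripe)
```

A result of `False` means D3N would be disabled for that Gateway according to the limitation above.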
D3N Environment Setup
Running
To enable D3N on existing RGWs, the following configuration entries are required
in each Rados Gateway's ceph.conf client section, for example for [client.rgw.8000]:
[client.rgw.8000]
rgw_d3n_l1_local_datacache_enabled = true
rgw_d3n_l1_datacache_persistent_path = "/mnt/nvme0/rgw_datacache/client.rgw.8000/"
rgw_d3n_l1_datacache_size = 10737418240
The above example assumes that the cache backing-store solid state device is mounted at /mnt/nvme0 and has 10 GiB (10737418240 bytes) of free space available for the cache.
The persistent path directory has to be created before starting the Gateway, for example:
mkdir -p /mnt/nvme0/rgw_datacache/client.rgw.8000/
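A deployment script might combine the directory creation with a free-space check against the configured cache size. The following is a minimal sketch using only the Python standard library; the function name is hypothetical and the check is a convenience, not something the Gateway requires.

```python
import os
import shutil


def prepare_cache_dir(path, cache_size_bytes):
    """Create the cache directory and report whether enough space is free."""
    os.makedirs(path, exist_ok=True)  # same effect as `mkdir -p`
    free = shutil.disk_usage(path).free
    return free >= cache_size_bytes
```

For the example configuration above, this would be called as `prepare_cache_dir("/mnt/nvme0/rgw_datacache/client.rgw.8000/", 10737418240)` before starting the Gateway.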
If another Gateway is co-located on the same machine, configure its persistent path to a discrete directory. For example, in the case of [client.rgw.8001], configure
rgw_d3n_l1_datacache_persistent_path = "/mnt/nvme0/rgw_datacache/client.rgw.8001/"
in the [client.rgw.8001] ceph.conf client section.
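When several Gateways share one device, generating the client sections from a template helps keep every persistent path distinct. The sketch below derives the directory from the instance name, following the path layout used in the examples above; the function name and the default arguments are assumptions for illustration.

```python
def d3n_stanzas(instances, mount="/mnt/nvme0", size_bytes=10737418240):
    """Build one D3N client section per Gateway, each with its own directory."""
    blocks = []
    for name in instances:
        path = f"{mount}/rgw_datacache/{name}/"
        blocks.append(
            f"[{name}]\n"
            "rgw_d3n_l1_local_datacache_enabled = true\n"
            f'rgw_d3n_l1_datacache_persistent_path = "{path}"\n'
            f"rgw_d3n_l1_datacache_size = {size_bytes}\n"
        )
    return "\n".join(blocks)
```

Note that the cache size applies per Gateway, so the device needs enough free space for the sum of all configured sizes.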
When multiple Gateways are co-located, consider assigning clients with different workloads to each Gateway without a balancer, in order to avoid caching the same data in more than one Gateway's cache.
NOTE: each time the Rados Gateway is restarted the content of the cache directory is purged.
Logs
D3N related log lines in radosgw.*.log contain the string d3n (case insensitive). Low-level D3N logs can be enabled by the debug_rgw_datacache subsystem (up to debug_rgw_datacache=30).
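Because the match is case insensitive, a small filter is enough to pull the D3N lines out of a Gateway log. The sketch below works on any iterable of lines, such as an open file object; the function name is hypothetical.

```python
import re

D3N_PATTERN = re.compile(r"d3n", re.IGNORECASE)


def d3n_lines(lines):
    """Return only the lines that mention d3n, in any letter case."""
    return [line for line in lines if D3N_PATTERN.search(line)]
```

For a log file, this could be used as `d3n_lines(open(path))`, where `path` points at one of the radosgw.*.log files.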
CONFIG REFERENCE
The following D3N related settings can be added to the Ceph configuration file (usually ceph.conf) under the [client.rgw.{instance-name}] section.
- rgw_d3n_l1_local_datacache_enabled
Enable datacenter-scale dataset delivery local cache
- type
bool
- default
false
- rgw_d3n_l1_datacache_persistent_path
path of the directory that stores the local cache object data
- type
str
- default
/tmp/rgw_datacache/
- rgw_d3n_l1_datacache_size
datacache maximum size on disk in bytes
- type
size
- default
1Gi
- rgw_d3n_l1_eviction_policy
select the d3n cache eviction policy
- type
str
- default
lru
- valid choices
lru
random
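To show how the size limit and the lru eviction policy interact, here is a minimal, generic LRU cache with a byte budget. It is a textbook sketch built on `collections.OrderedDict`, not the RGW code; the class and method names are hypothetical.

```python
from collections import OrderedDict


class LruBlockCache:
    """Byte-budgeted LRU cache that tracks sizes only (no payloads)."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self._blocks = OrderedDict()  # key -> size, least recently used first

    def get(self, key):
        """Return True on a hit and mark the block as most recently used."""
        if key not in self._blocks:
            return False
        self._blocks.move_to_end(key)
        return True

    def put(self, key, size):
        """Insert a block, evicting least recently used blocks as needed."""
        evicted = []
        if size > self.max_bytes:
            return evicted  # larger than the whole cache: do not store it
        if key in self._blocks:
            self.used -= self._blocks.pop(key)
        while self.used + size > self.max_bytes:
            old_key, old_size = self._blocks.popitem(last=False)
            self.used -= old_size
            evicted.append(old_key)
        self._blocks[key] = size
        self.used += size
        return evicted
```

A random policy would differ only in the choice of victim inside the `while` loop, picking an arbitrary key instead of the least recently used one.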