Caching of blob data
====================

(This test is adapted from ZEO/tests/zeo_blob_cache.test in ZODB3.)

We support two modes for providing clients access to blob data:

shared
    Blob data are shared via a network file system.  The client shares
    a common blob directory with the server.

non-shared
    Blob data are loaded from the database and cached locally.  A
    maximum size for the cached blob data can be set, and data are
    removed when that size is exceeded.  (A configuration sketch
    follows this list.)

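
This test exercises the non-shared mode.  As a rough illustration of
how such a client might be configured, here is a ZConfig sketch
(option names follow RelStorage's documented schema; the paths and
values are invented for illustration)::

    %import relstorage

    <zodb main>
      <relstorage>
        # non-shared: keep a local, size-limited blob cache
        shared-blob-dir false
        blob-dir /var/cache/blobs
        blob-cache-size 100mb
        blob-cache-size-check 10
        <postgresql>
          dsn dbname='zodb'
        </postgresql>
      </relstorage>
    </zodb>
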

In this test, we'll demonstrate that blob data are removed from a blob
cache when the amount of data stored exceeds a given limit.

Let's start by setting up some data:

>>> blob_storage = create_storage(blob_dir='blobs',
...     blob_cache_size=3000, blob_cache_size_check=10)
>>> from ZODB.DB import DB
>>> db = DB(blob_storage)

Here, we passed a blob_cache_size parameter, which specifies a target
blob cache size.  This is not a hard limit, but rather a target; it
defaults to no limit.  We also passed a blob_cache_size_check option,
which specifies the number of bytes, as a percent of the target, that
can be written or downloaded from the server before the cache size is
checked.  It defaults to 100; we passed 10, to check after writing 10%
of the target size.

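
With these settings, and assuming the check threshold is simply the
target size times the check percentage, the cache size is examined
after roughly every 300 bytes of blob traffic::

    3000 * 10 // 100    # = 300 bytes written or downloaded between checks
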
.. We're going to wait for any threads we started to finish, so...

>>> import threading
>>> old_threads = list(threading.enumerate())

We want to check for name collisions in the blob cache dir.  We'll try
to provoke name collisions by reducing the number of cache directory
subdirectories.

>>> import relstorage.blobhelper
>>> orig_blob_cache_layout_size = relstorage.blobhelper.BlobCacheLayout.size
>>> relstorage.blobhelper.BlobCacheLayout.size = 11

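
Shrinking the layout matters because a cache layout of this kind
spreads blob files over a fixed number of subdirectories; with only 11
buckets, many distinct OIDs are forced to share a directory.  A toy
illustration of the pigeonhole effect (hypothetical ``bucket`` helper,
not RelStorage's actual hashing)::

    def bucket(oid, size=11):
        # Any stable mapping into `size` buckets works for the
        # illustration; 100 OIDs over 11 buckets guarantees sharing.
        return oid % size

    assert len({bucket(oid) for oid in range(100)}) == 11
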
Now, let's write some data:

>>> import ZODB.blob, transaction, time
>>> conn = db.open()
>>> for i in range(1, 101):
...     conn.root()[i] = ZODB.blob.Blob()
...     with conn.root()[i].open('w') as f: _ = f.write((str(i)*100).encode('latin-1'))
>>> transaction.commit()

We've committed roughly 19,000 bytes of data (see the arithmetic
below), but our target size is 3000.  We expect to have not much more
than the target size in the blob cache directory.

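
To make the arithmetic explicit: str(i) is one to three digits, so
each blob holds 100 to 300 bytes::

    sum(len(str(i)) * 100 for i in range(1, 101))    # = 19200 bytes
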
>>> import os
>>> def cache_size(d):
...     # Sum the sizes of all *.blob files under directory d.
...     size = 0
...     for base, dirs, files in os.walk(d):
...         for f in files:
...             if f.endswith('.blob'):
...                 try:
...                     size += os.stat(os.path.join(base, f)).st_size
...                 except OSError:
...                     # The cache cleaner may delete files while we
...                     # walk; only re-raise if the file still exists.
...                     if os.path.exists(os.path.join(base, f)):
...                         raise
...     return size

>>> cache_size('blobs') > 2000
True

>>> def check():
...     return cache_size('blobs') < 5000
>>> def onfail():
...     return cache_size('blobs')

>>> from ZEO.tests.forker import wait_until
>>> wait_until("size is reduced", check, 99, onfail)

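
wait_until polls check until it returns a true value and gives up
after 99 seconds, reporting onfail()'s result (here, the offending
cache size).  In case the helper is unfamiliar, a loop of roughly the
same shape (a sketch, not ZEO's implementation)::

    import time

    def wait_until_sketch(label, func, timeout, onfail=None):
        deadline = time.time() + timeout
        while not func():
            if time.time() > deadline:
                detail = onfail() if onfail is not None else None
                raise AssertionError('timed out: %s (%r)' % (label, detail))
            time.sleep(0.01)
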
If we read all of the blobs, data will be downloaded again, as
necessary, but the cache size will remain not much bigger than the
target:

>>> for i in range(1, 101):
...     with conn.root()[i].open() as f: data = f.read()
...     if data != (str(i)*100).encode('ascii'):
...         print('bad data', str(i), data)

>>> wait_until("size is reduced", check, 99, onfail)

>>> for i in range(1, 101):
...     with conn.root()[i].open() as f: data = f.read()
...     if data != (str(i)*100).encode('ascii'):
...         print('bad data', str(i), data)

Opening a blob in 'c' mode reads the committed blob data directly:

>>> for i in range(1, 101):
...     with conn.root()[i].open('c') as f: data = f.read()
...     if data != (str(i)*100).encode('ascii'):
...         print('bad data', str(i), data)

>>> wait_until("size is reduced", check, 99, onfail)

Now let's see if we can stress things a bit.  We'll create many
clients and get them to pound on the blobs all at once to see if we
can provoke problems:

>>> import threading, random
>>> def run():
...     conn = db.open()
...     for i in range(300):
...         time.sleep(0)
...         i = random.randint(1, 100)
...         with conn.root()[i].open() as f: data = f.read()
...         if data != (str(i)*100).encode('ascii'):
...             print('bad data', str(i), data)
...         i = random.randint(1, 100)
...         with conn.root()[i].open('c') as f: data = f.read()
...         if data != (str(i)*100).encode('ascii'):
...             print('bad data', str(i), data)
...     conn.close()

>>> threads = [threading.Thread(target=run) for i in range(10)]
>>> for thread in threads:
...     thread.daemon = True
>>> for thread in threads:
...     thread.start()
>>> for thread in threads:
...     thread.join(99)
...     if thread.is_alive():
...         print("Can't join thread.")

>>> wait_until("size is reduced", check, 99, onfail)

.. cleanup

>>> for thread in threading.enumerate():
...     if thread not in old_threads:
...         thread.join(33)

>>> db.close()
>>> relstorage.blobhelper.BlobCacheLayout.size = orig_blob_cache_layout_size