Requirements for Storage in Curator File Service

Updated on December 10th, 2022

Because restoring material from deep storage requires an on-disk staging area, only direct-access archive tiers can be used where the file storage type is set to S3:

  • The S3 lifecycle policy must be modified to archive to an accessible storage tier. Any accessible tier will work. Please note that S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive are not accessible tiers. S3 Intelligent-Tiering can be used, but it is important not to select the archive option when choosing this tier.
  • Any existing historical files in Glacier (Flexible Retrieval or Deep Archive) will not be accessible to Curator File Server (CFS). Currently, these files must be manually restored to an accessible storage tier (such as Glacier Instant Retrieval).
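The tier rules above can be sketched as a small check. This is an illustrative helper, not part of Curator: the function name is hypothetical, and it assumes its input is shaped like an S3 HeadObject response, using the API storage class names (where "GLACIER" denotes S3 Glacier Flexible Retrieval) and the ArchiveStatus field that S3 reports for Intelligent-Tiering objects in an archive tier.

```python
# S3 storage classes that need a restore request before an object can be read.
# "GLACIER" is the API name for S3 Glacier Flexible Retrieval.
RESTORE_REQUIRED_CLASSES = {"GLACIER", "DEEP_ARCHIVE"}

# Intelligent-Tiering archive tiers, as reported in the ArchiveStatus field
# of an S3 HeadObject response.
ARCHIVED_INTELLIGENT_TIERS = {"ARCHIVE_ACCESS", "DEEP_ARCHIVE_ACCESS"}


def is_directly_accessible(head: dict) -> bool:
    """Return True if the object can be read without a restore step.

    `head` is assumed to look like an S3 HeadObject response. S3 omits
    StorageClass for objects in the STANDARD class, so a missing key is
    treated as STANDARD. Simplification: a temporarily restored copy of a
    Glacier object is still reported as inaccessible here.
    """
    storage_class = head.get("StorageClass", "STANDARD")
    if storage_class in RESTORE_REQUIRED_CLASSES:
        return False
    if head.get("ArchiveStatus") in ARCHIVED_INTELLIGENT_TIERS:
        return False
    return True
```

A check like this could be run over existing objects to find historical files that still need a manual restore to an accessible tier.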

Different types of deployments and their requirements

In on-premises-only deployments, Curator File Server is deployed on-premises and tracks on-premises “Online” metadata (such as HiResPath, IsilonHiResPath, etc.). Content purged to Archive is not available until it has been restored. There is no automatic restore from Archive in Curator File Service V1.

In cloud-only deployments, File Server can use S3, tracking S3 metadata (such as S3HiResPath). To use this file storage type, you must ensure that your cloud compute resource has access to the S3 bucket and that the bucket follows the rules above.

In hybrid deployments, the location of Curator File Service and, in some circumstances, the choice of storage are dictated by the fact that Curator File Service requires access to every storage location that File Server uploads assets to or delivers them from. It is assumed that compute resources running in the cloud have no access to ground station (on-premises) storage. CFS can only be deployed at one location, so that location must have access to all storage volumes, including those at other locations, that CFS uploads to or delivers from.

Where a hybrid deployment has multiple ground stations with on-premises storage, and no single on-premises resource can access all asset locations, S3 is the only available storage option. This is the case where on-premises storage is available only to its own ground station, as is usual, so that some files are stored in locations that CFS cannot access. CFS can be deployed on any ground station. All files must be copied to S3 as part of the ingest process. CFS will track S3 metadata (for example, S3HiResPath).

For a single ground station hybrid deployment, with both on-premises “Online” storage and S3, Curator File Service must be deployed on-premises. CFS will track both S3 metadata (for example, S3HiResPath) and on-premises “Online” metadata (such as HiResPath, IsilonHiResPath, etc.).
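The deployment rules described in this section can be summarised as a small decision function. This is only a sketch of the rules as written in this article: the names CfsPlacement and plan_cfs, the deployment labels, and the single_site_reaches_all_storage flag are all hypothetical and do not correspond to any Curator configuration setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CfsPlacement:
    location: str            # where CFS should run
    tracked_metadata: tuple  # kinds of metadata CFS tracks


def plan_cfs(deployment: str, single_site_reaches_all_storage: bool = True) -> CfsPlacement:
    """Map a deployment type to a CFS location and the metadata it tracks.

    `deployment` is one of "on-premises", "cloud" or "hybrid" (hypothetical
    labels). The flag only matters for hybrid deployments: it is False when
    no single on-premises resource can reach every asset location, as with
    multiple ground stations that each have private storage.
    """
    if deployment == "on-premises":
        return CfsPlacement("on-premises", ("online",))
    if deployment == "cloud":
        return CfsPlacement("cloud", ("s3",))
    if deployment == "hybrid":
        if not single_site_reaches_all_storage:
            # S3 is the only storage every location can share.
            return CfsPlacement("any ground station", ("s3",))
        # Single ground station with both "Online" storage and S3.
        return CfsPlacement("on-premises", ("online", "s3"))
    raise ValueError(f"unknown deployment type: {deployment!r}")
```

For example, plan_cfs("hybrid", single_site_reaches_all_storage=False) returns a placement on any ground station that tracks S3 metadata only.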

Currently, with CFS V1, additional configuration is required to allow S3 access from the on-premises server. The bucket must be whitelisted for the organisation's public IP address using a bucket policy. Note that list permissions are not provided on the bucket, so you will still need the object key (location) to download an asset from S3. To maintain uninterrupted access, confirm that all files have been successfully copied to S3 before purging them from the on-premises server.
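As an illustration of this kind of bucket policy, the sketch below builds a policy document that allows object access only from a given public IP range and grants no list permission. The function name, the bucket name and the IP address are placeholders, and the set of actions is an assumption: this article does not state exactly which S3 actions CFS requires, so s3:GetObject and s3:PutObject are used here to cover delivery and upload.

```python
import json


def build_ip_allow_policy(bucket: str, public_cidr: str) -> dict:
    """Build an S3 bucket policy that allows object reads and writes only
    from `public_cidr`. s3:ListBucket is deliberately not granted, so a
    caller must already know the object key.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowObjectAccessFromOrgIp",
                "Effect": "Allow",
                "Principal": "*",
                # Assumed action set; adjust to what your deployment needs.
                "Action": ["s3:GetObject", "s3:PutObject"],
                # Object-level resource only (note the trailing /*).
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {"IpAddress": {"aws:SourceIp": public_cidr}},
            }
        ],
    }


# Placeholder bucket name and documentation-range IP address.
policy = build_ip_allow_policy("example-curator-bucket", "203.0.113.10/32")
print(json.dumps(policy, indent=2))
```

The printed JSON can be reviewed and then applied as the bucket policy through whichever tooling your organisation uses for S3 administration.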

