Cloudsoda is a solution for moving content between locations, much like our internal Xfer device; however, this device uses Cloudsoda.io's bespoke software. In a typical configuration we use it to move content up to cloud storage (an s3 bucket) and to restore content from there. Future editions of the Cloudsoda release may include delete functionality and file metadata extraction.
The solution is deployed by the Partner, and we have created an API device integration that can be set up in Device Director (DD).
The core requirements will be:
- The SoDA device set up in DD
- A profile for the SoDA device set up in DD
- A mediastore, set up in Process Engine, for jobs to use the device and profile
- Cloudsoda host and root locations to be set up (this is typically done by the Partner directly on their own hardware, but may require some help from IPV in designing/establishing the physical storage locations at a network level for config)
- s3 buckets and policies to be configured as a destination/source
Device Setup
A device will need to be created that can make API calls to the Cloudsoda host and pass job information between it and Device Director. Jobs are typically either sent up to cloud storage or restored from it.
Resolution:
You will need to create a SoDA device in Device Director. Navigate to the Devices tab, and select Create Device:
In the next section, from the Type drop-down, choose SoDA, and give the device a name and the host address. This can either be the FQDN or the IP. Then select Continue:
UPDATE (NOV 2021):
Currently it is necessary to input a host address with the following syntax:
Host Address: https://ipv-nova-1-api.cloudsoda.io
Admin Client Address: https://ipv-nova-1.cloudsoda.io/
As you can see above, the host address is derived by taking the admin website address and adding "-api" to the host name. With this, Curator should be able to access SoDA. Without it, you will likely get an error message.
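The derivation described above can be sketched in a few lines of Python. This is only an illustration of the naming rule, not part of Device Director; the function name api_host_from_admin is hypothetical, and it assumes the "-api" suffix always goes on the first label of the host name, as in the example addresses.

```python
from urllib.parse import urlsplit, urlunsplit


def api_host_from_admin(admin_url: str) -> str:
    """Derive the API host address by adding "-api" to the first host label."""
    parts = urlsplit(admin_url)
    host = parts.hostname or ""
    if not host:
        raise ValueError(f"no host name found in {admin_url!r}")
    first_label, dot, rest = host.partition(".")
    # Assumption: an address that already carries the suffix is left alone.
    if not first_label.endswith("-api"):
        first_label += "-api"
    new_host = first_label + dot + rest
    if parts.port is not None:
        new_host = f"{new_host}:{parts.port}"
    # Only the scheme and host are kept; path, query and fragment are dropped.
    return urlunsplit((parts.scheme, new_host, "", "", ""))


print(api_host_from_admin("https://ipv-nova-1.cloudsoda.io/"))
# prints https://ipv-nova-1-api.cloudsoda.io
```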
In the next section, configure the device, setting the Maximum Devices to the number of concurrent jobs you want the device to run at any one time (much like a licensed channel limit for XCode):
It is recommended to check the job limitations of the Cloudsoda host. Though the number is technically unlimited, we recommend starting with 5 and increasing as necessary, balancing the volume of jobs initiated by users via plugin or automatically against the hardware capabilities of the host. Configuring the limits appropriately in this section helps to avoid bottlenecks whereby not enough jobs can be run in Device Director.
NOTE: It is probably a good idea at this point to set up two devices here (depending on requirements): one for uploading and another for restoring, so that the type of job and device count are sensible, i.e., 5 for an upload device, 5 for a restore device.
Once you are done, click Create and wait for Device Director to establish connectivity to the Cloudsoda host.
Profile Setup
Another requirement of the setup is to create a profile that can be used in Device Director to facilitate the job submission via the API integration. For a typical setup, we would configure two profiles: one for uploading and another for restoring. All of the configuration options will be outlined here, but these should be configured uniquely to suit your organisation.
Resolution:
In Device Director, create a SoDA profile. To do this, navigate to Device Director and select the Profiles tab, then select Create Profile:
In the next page, from the Type drop-down, select SoDA and then click Continue:
The next section has a number of options which will be explained here. Bear in mind that these should reflect your organisation's specific requirements. Multiple variations on the details given here may be required.
First, give the profile a distinct name. The name you choose will be down to the requirements of the Cloudsoda integration:
The next option is to configure the type of transfer that will be done using this profile. This can be either Copy or Move. Select the desired option from the drop-down menu:
The next option is to set a transfer limit on the jobs being run. Typically this is left blank so as not to bottleneck speeds of the jobs, but you do have the option to set a limit if required:
The next option is to do with conflict handling on the destination location for the job using this profile. The options are Rename, Overwrite and Skip. Each has its place in setups. The default is Rename.
The logic behind each option is straightforward:
- Rename will create a new file if one with the same name already exists at the destination location, adding a (1) at the end of the new file's name.
- Overwrite will replace a file at the destination that has the same name.
- Skip will leave an existing file with the same name at the destination untouched: no new file is created and the existing file is not overwritten.
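The three conflict options can be illustrated with a small Python sketch. It is not Cloudsoda's implementation; resolve_conflict is a hypothetical helper, and the exact placement of the "(1)" (here, before the file extension with a leading space) is an assumption.

```python
import os
from typing import Optional, Set


def resolve_conflict(name: str, existing: Set[str], mode: str = "Rename") -> Optional[str]:
    """Return the file name to write at the destination, or None to write nothing."""
    if name not in existing:
        return name  # no conflict, so the mode is irrelevant
    if mode == "Overwrite":
        return name  # the existing file is replaced
    if mode == "Skip":
        return None  # the existing file is left untouched
    if mode == "Rename":
        stem, ext = os.path.splitext(name)
        # Assumption: the "(1)" goes before the extension.
        return f"{stem} (1){ext}"
    raise ValueError(f"unknown conflict mode: {mode!r}")


destination = {"clip.mxf"}
for mode in ("Rename", "Overwrite", "Skip"):
    print(mode, "->", resolve_conflict("clip.mxf", destination, mode))
```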
Finally, you must select the retrieval type you want to associate with the profile. This is only used when the profile is used for a restore task, not an upload. There are four options - No AWS, Bulk, Expedited and Standard:
Again, the logic behind each of these is pretty basic:
- No AWS will indicate a job that isn't typically going to integrate with an s3 bucket requirement, so would be a straight move or copy much like Xfer.
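- Bulk will again leverage the s3 integration. The name matches the AWS Bulk retrieval tier for glacier-type buckets, which is the slowest and lowest-cost restore option and is suited to restoring large numbers of files.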
- Expedited will again leverage s3 integration, and prioritise the restore job from a glacier type bucket (at an additional cost).
- Standard will do the same but on a less urgent timescale, thus not incurring the additional expedited cost.
No AWS is the default, but we would use Standard unless the client/partner requires anything else.
Once the options are configured, select Create:
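As a summary of the profile options covered in this section, the sketch below models a profile as a small Python data class with the allowed values and defaults named above. It is illustrative only and does not reflect how Device Director stores profiles. The class name SodaProfile, the field names and the profile name SODA-Copy-Restore are hypothetical; the unit of the transfer limit is not stated in this guide, so it is left as a plain optional number.

```python
from dataclasses import dataclass
from typing import Optional

TRANSFER_TYPES = ("Copy", "Move")
CONFLICT_MODES = ("Rename", "Overwrite", "Skip")
RETRIEVAL_TYPES = ("No AWS", "Bulk", "Expedited", "Standard")


@dataclass(frozen=True)
class SodaProfile:
    name: str
    transfer_type: str
    transfer_limit: Optional[int] = None  # blank (None) means no limit
    conflict_handling: str = "Rename"     # default per this guide
    retrieval_type: str = "No AWS"        # default per this guide

    def __post_init__(self) -> None:
        if not self.name.strip():
            raise ValueError("profile name must not be empty")
        if self.transfer_type not in TRANSFER_TYPES:
            raise ValueError(f"transfer type must be one of {TRANSFER_TYPES}")
        if self.conflict_handling not in CONFLICT_MODES:
            raise ValueError(f"conflict handling must be one of {CONFLICT_MODES}")
        if self.retrieval_type not in RETRIEVAL_TYPES:
            raise ValueError(f"retrieval type must be one of {RETRIEVAL_TYPES}")


# One profile for uploading and one for restoring, as suggested above.
upload = SodaProfile(name="SODA-Copy", transfer_type="Copy")
restore = SodaProfile(name="SODA-Copy-Restore", transfer_type="Copy",
                      retrieval_type="Standard")
print(upload)
print(restore)
```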
Basic MediaStore Setup
The core and most flexible part of the Cloudsoda deployment requirement is configuring mediastores to send or restore content via the API integration with the device configured earlier in the guide.
Resolution:
Mediastore to copy HiRes up to s3 bucket via Cloudsoda:
This is the most basic of mediastores for Cloudsoda integration.
Keys:
DryRun: False - Dry Run is not used when copying up to archive, since there is no real cost in doing so. The key is present, but it is only relevant to the opposite action of restoring from a glacier s3 bucket, so it is not used for jobs copying/moving up to s3 buckets.
FilePathMetadataKey: SoDAAWSPath - This is the metadata name that we want to associate with the file path that will be saved against our asset. This can be anything you like, but must have been created in Curator System Administrator prior to being used:
FolderPattern: {asset.name} - This is the pattern we will use for the folder structure in the s3 bucket when copying the HiRes file to the s3 bucket.
OutputProfile: SODA-Copy - This is the Device Director Profile we use to pass settings to the SoDA device used in Device Director. In this case the profile is set to do a copy function when working with the HiRes sent to the device.
Path: s3://somes3bucketname/ - This is the name of the s3 bucket. It is needed as the path key, and it must match the path configured for the storage node in the Cloudsoda interface. Note the "/" at the end of this path: there is logic to add or strip it where necessary, but if the SodaRoot key has a "/" at the end, then this key should have one too.
PathMetadataKey: SoDAAWSPath - Much like the previous FilePathMetadataKey key, this is the same metadata name that we will use for the value of the path key so we keep a record and track it against the asset.
SodaRoot: soda://somesodanodename/ - In the Cloudsoda interface, each storage node created for an s3 bucket (normal, glacier, etc.) is given a name. That name is prefixed with soda:// so that it works with the API when jobs are sent to Cloudsoda. The Path key holds the path used in this storage node, and it must match in both the mediastore and the Cloudsoda storage node. As with the Path key, the "/" should be present here if it is present there, although the workflows have logic to strip or add it where required.
Source: HI-RES - The source store for our source file to be used in the job sent to the SoDA device in Device Director. Typically this will be the default HI-RES mediastore, but could be any store that has a path key to a file that we want to use, i.e. variations on the HI-RES mediastore.
StoreType: Archive - This determines the behavior of the workflows used when sending a file. In this case we want the archive logic, which sends a single-file job to a destination and sets the path keys.
Workflow: Spawn - SoDA Transfer - A dedicated workflow used for transferring files using the SoDA device. This is used for both restore and send/archive. At the time of document creation this is at version: 1.0.15
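For reference, the keys above can be written out as a simple mapping, together with a check for the trailing-separator rule that applies to the Path and SodaRoot keys. This is a sketch only: the dictionary is just a convenient way to show the key/value pairs and is not how Process Engine stores a mediastore, and trailing_separators_match is a hypothetical helper.

```python
# Key/value pairs from the upload mediastore described above.
UPLOAD_STORE = {
    "DryRun": "False",
    "FilePathMetadataKey": "SoDAAWSPath",
    "FolderPattern": "{asset.name}",
    "OutputProfile": "SODA-Copy",
    "Path": "s3://somes3bucketname/",
    "PathMetadataKey": "SoDAAWSPath",
    "SodaRoot": "soda://somesodanodename/",
    "Source": "HI-RES",
    "StoreType": "Archive",
    "Workflow": "Spawn - SoDA Transfer",
}

SEPARATORS = ("/", "\\")


def trailing_separators_match(keys: dict) -> bool:
    """True when Path and SodaRoot either both end with a separator or both do not."""
    path_has = keys["Path"].endswith(SEPARATORS)
    root_has = keys["SodaRoot"].endswith(SEPARATORS)
    return path_has == root_has


print(trailing_separators_match(UPLOAD_STORE))
```

Both "/" and "\" are accepted as separators so that the same check can be applied to a store whose Path key is a UNC path.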
Mediastore to restore HiRes files from an s3 bucket location:
The restore integration of the workflow leverages the approval functionality, which lets you force conditions for files to be restored, such as the duration of the file being longer than 15 minutes (the default approval requirement). In the Cloudsoda integration, it is also useful to dry run a restore job to get information about the cost it would incur from an s3 glacier bucket, and other things such as the time the restore job may take to run. These are passed as dry run variables and saved as metadata against the approval asset/collection of assets for admins to approve in Clip Select.
The keys are outlined here. These explain the function of approval and how it has been integrated into Curator, leveraging the Cloudsoda API and what that means for the admin and end user.
Keys:
ApprovalRequired: True - This will trigger the dry run of a job, allowing you to use the next key to determine requirements for approval. The dry run submits the job to Cloudsoda as a test run only. When this completes, it returns some useful metadata, such as the duration and cost of the job you would run in full upon approval.
ApprovalRequiredFrom: APPROVAL-BY-ADMIN - The mediastore name used to add approval logic to the approval job (see next mediastore breakdown).
ApprovalStore: SODA-SMB-RESTORE - The name of the store used to set approval conditions and keys, which incidentally is the same mediastore this key is being placed into. Doing this means that you do not have to create additional mediastores, and allows the workflow to be used for both the dry run and settings of approval directly.
NOTE: You could change this to another approval store, but in the Cloudsoda case, it is cleaner for us to use the same mediastore and keep the keys together.
ApprovalWorkflow: Spawn - SoDA Transfer - The same workflow as the previous mediastore. This workflow can be used for both send and restore transfers. In this case, the workflow is also used to process the approval functionality for the integration into Curator.
CumulativeMetadataFields: SoDATransferDuration,double;SoDATimeSpan,timespan;SoDATransferCost,double - This key sets the metadata that we glean from the dry run job against the parent asset and approval asset. The fields are TransferDuration, TimeSpan and TransferCost. You can then use these in the metadata view of the approval asset in Clip Select. When you create a restore job of one or more assets, these can be added to an approval collection asset, which also sets the total duration and cost for all the assets per job submission:
DeviceGroup: SoDA-Restore - It is not a requirement to set up a separate group here, but it helps if you have configured only one SoDA device in Device Director and it is used for both types of transfer. If many jobs are run, the available job channels on the device (typically set to 5) will quickly run out. By creating a second, cloned device with the same details but a different name, and placing it in a separate group so that it is only used for restore tasks, you can split the load of send and restore jobs between devices. Of 5 channels set, only two job batches are run at the same time, so it is useful to split tasks between send and restore:
This is an example showing the two devices, identical but split by name, allowing you to split them into different device groups
Device groups are set to default (used for the send task), and in this case, restore:
HiResAccessible: True - A key used to tell the workflow you have access to the hi-res file, so it does not skip trying to use it. It is useful not to skip trying to use the hi-res file, as this is what you are trying to restore in this instance.
OutputProfile: SODA-Copy - The Device Director profile you want to use for this transfer when sending the job to Cloudsoda. As the file is being restored, it should not be moved out of the s3 bucket. It should use Cloudsoda to copy across from the s3 bucket to your destination.
Path: \\someuncpathshare\ - This is the same UNC path that the Cloudsoda storage node uses for the SMB location. This MUST match the path in the Cloudsoda node store. Note that the path has a "\" on the end of the value; make sure the SodaRoot key also has a trailing separator if one is present here.
Side Note: You can see here that this is a UNC path. The SoDA device MUST be able to access it. Since hybrid scenarios also need services to access the hi-res files after restores, there may be mapping issues between this UNC path value and the mapped drive values used for Process Engine setup and access in the hybrid cloud/ground setup. If this is the case, be sure to check your PE Basic Configuration Settings and your mediastore Path keys so that they resolve to the same UNC path. More detail on this can be found in super advanced hybrid setup training.
PathMetadataKey: SoDASMBPath - Much like previous path keys, this will be the metadata name that the Path key value is saved against for the asset.
SodaRoot: soda://somesodanodename/ - The name of the SoDA root must match that of the Cloudsoda node in the SoDA device. Also within this node, the path must match the other key Path. Again note that the "/" at the end must match the path in having a "/" or not. They also must match the value in the Cloudsoda node.
Source: SODA-AWS - This is the source store you are transferring from, which is the same mediastore used to send the hi-res asset to the s3 bucket used earlier. This will pass the correct Cloudsoda path values to the job for the source of your asset in the s3 glacier bucket.
StoreType: Vault - This sets how the workflow treats the type of mediastore and job being done. In this case Vault has been selected. This is a flexible store type allowing more keys, path info and folder structure to be set.
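To make the CumulativeMetadataFields key in the list above more concrete, the sketch below parses a "name,type" list separated by semicolons and totals the dry run values across the assets of one restore request. This is an assumption about how the accumulation behaves, not the workflow's actual code: the helper names, the sample dry run values and the use of timedelta for the timespan type are all hypothetical.

```python
from datetime import timedelta
from typing import Dict, Iterable, Mapping

FIELDS = "SoDATransferDuration,double;SoDATimeSpan,timespan;SoDATransferCost,double"

# Starting total for each field type named in the key.
ZERO_BY_TYPE = {"double": 0.0, "timespan": timedelta(0)}


def parse_fields(spec: str) -> Dict[str, str]:
    """Split "name,type;name,type" into a {name: type} mapping."""
    fields = {}
    for entry in spec.split(";"):
        if not entry.strip():
            continue
        name, _, type_name = entry.partition(",")
        type_name = type_name.strip()
        if type_name not in ZERO_BY_TYPE:
            raise ValueError(f"unknown field type {type_name!r} for {name.strip()!r}")
        fields[name.strip()] = type_name
    return fields


def accumulate(spec: str, per_asset: Iterable[Mapping[str, object]]) -> Dict[str, object]:
    """Total each listed field over all assets in one restore request."""
    fields = parse_fields(spec)
    totals = {name: ZERO_BY_TYPE[type_name] for name, type_name in fields.items()}
    for values in per_asset:
        for name in fields:
            totals[name] += values[name]
    return totals


# Hypothetical dry run results for two assets in one approval collection.
dry_runs = [
    {"SoDATransferDuration": 12.5, "SoDATimeSpan": timedelta(minutes=5), "SoDATransferCost": 1.25},
    {"SoDATransferDuration": 7.5, "SoDATimeSpan": timedelta(minutes=10), "SoDATransferCost": 0.75},
]
print(accumulate(FIELDS, dry_runs))
```

A misspelled type in the key value is rejected by parse_fields, which is a cheap way to catch typos before a job is submitted.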
Additional Mediastore for Cloudsoda:
Mediastore used to configure Approval conditions and settings:
This is a breakdown of the Approval mediastore, and what the Keys are used for.
ApprovalReason: Some free text - This key is used to pass as a message for needing approval to the end user that requests an asset that will need approval.
ApprovalStore: SODA-SMB-RESTORE - The mediastore to return to once approval conditions are set.
ApprovalWorkflow: Spawn - SoDA Transfer - All Cloudsoda-related approval tasks have been combined into the one workflow, so you only need to configure future updates in one workflow while using SoDA and its integration into Curator.
CumulativeMetadataFields: SoDATransferDuration,double;SoDATimeSpan,timespan;SoDATransferCost,double - Much like the previous key used in the previous mediastore for restore, this is the value you want to glean from the dry run, that is passed to be set as metadata to the approval and parent assets.
CuratorFolder: Approval - The virtual folder name in the folder structure in Curator (via Clip Select etc.) that the approval asset will be put into. In the case of the SoDA approval assets, one or more assets will be put into a single collection per restore request.
Email: [email@example.com] - The email address used to email an admin or distribution group that an approval asset is ready for approval or rejection.
MaximumDuration: [a number could be set here] - Although not used in this example, if the duration of the asset is larger than the number specified here (in mins), the asset will trigger an approval requirement (if lower, approval is not required).
MetadataView: ApprovalSoDA - A view that can be configured in Curator System Administrator, where the metadata names in this view are set against the approval asset created.
NameofApprover: Admins - The name passed in email to the end user, describing who will be approving or rejecting the asset.
StoreType: Approval - the store type that dictates the behaviour of the workflow that uses functions related to approval to set metadata and paths and keys related to the process.
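The duration condition described for the maximum duration key above amounts to a simple comparison, sketched here in Python. The function name needs_duration_approval is hypothetical, and the guide does not say what happens when the duration exactly equals the limit, so treating that case as "no approval needed" is an assumption.

```python
from typing import Optional


def needs_duration_approval(duration_minutes: float,
                            maximum_duration: Optional[float]) -> bool:
    """True when the asset is longer than the configured limit (in minutes)."""
    if maximum_duration is None:
        # Key not set, as in the example store: duration never triggers approval.
        return False
    # Assumption: a duration exactly equal to the limit does not need approval.
    return duration_minutes > maximum_duration


print(needs_duration_approval(22, 15))    # longer than the limit
print(needs_duration_approval(10, 15))    # shorter than the limit
print(needs_duration_approval(22, None))  # no limit configured
```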
Mediastore for copying Proxy files to a s3 bucket via Cloudsoda:
Below is an example mediastore that has slight differences to the similar store that sends Hi-Res single files.
DryRun: False - When sending/archiving the files to a destination, approval is not required, so no dry run task needs to be set.
FolderPattern: {ProxySubFolder} - This adds the default year/month/day/proxyname/ folder pattern to the s3 bucket location, much like the local proxy folder pattern on online/ground storage.
HiResAccessible: False - Since a proxy is being transferred and you DON'T want to transfer or rely on whether the hi-res file is reachable, this should be set to false to ignore the hi-res access and state.
MediaStoreTemplate: Proxy - This is NOT relevant to other stores, but in this mediastore, THIS MUST BE SET. The workflow depends on this key being set to treat the transfer job as multi file proxy content (HLS m3u8 etc) and not single file transfer.
OutputProfile: SODA-Copy - The Device Director profile that will be used to set conditions to send to the SoDA device in Device Director for the transfer job.
Path: s3://somebucketnameins3/ - A bucket path, that is also set in the Cloudsoda interface's storage node for the proxy s3 bucket.
PathMetadataKey: SoDAAWSProxyPath - The metadata name that will be used to store the path key against the Curator asset for the proxy path.
SodaRoot: soda://somenodename/ - The name of the Cloudsoda interface's storage node for the proxy s3 bucket connection. Note that the "/" is on the end here, so it should be on the end of the Path key as well.
Source: PROXY-V3-SODA - a custom copy of the PROXY-V3 mediastore that has some tweaked values to source the proxy with some details for the Cloudsoda job (i.e., the Cloudsoda SMB UNC path to source the proxy from for the soda API):
The store here is a clone of the PROXY-V3 mediastore but has a UNC Path key that the Cloudsoda interface can use and a SodaRoot key as well. This allows this store to be used as the SMB/ground hop direct to the proxy location on local/ground storage and all its default values to be integrated into the Core mediastore.
Note: In a typical setup, a two-step move can be done to transfer files to Cloudsoda via a few mediastores, i.e., Proxy-V3 -> a Cloudsoda ground mediastore -> Cloudsoda s3 bucket mediastore. This is required so Cloudsoda can see both the destination and the source locations. Values can be entered into the Core mediastores, but cloning them in this way allows different UNC paths to be set for the Path key, in case there are issues with UNC paths for the PE basic settings or hybrid/cloud locations, where the same location is referenced by mapped drives or by UNC paths for services and clients.
In this case a clone store should be created, with the extra keys that could go in a separate store inputted directly into it.
StoreType: Archive - This is to determine the behavior of the workflows used when sending a file. In this case the archive logic should be used to send a single file job to a destination and set path keys.
TxDateMetadataKey: ArchivedToAWSDate - A rarely used key. As the transfer job is processed, a unique date time value can be applied to a metadata name and saved to the asset on the final update of the job. In this case it will be set to a name of ArchivedToAWSDate so the asset can track when the Proxy was sent to the S3 bucket.
Workflow: Spawn - SoDA Transfer - The workflow leveraged to do anything SoDA-related. This has some specific logic for proxy transfer as well as the approval and restore logic.
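The year/month/day/proxyname/ folder pattern mentioned for the FolderPattern key in this store can be pictured with the sketch below. It only illustrates the shape of the resulting destination path. Which date the workflow uses, whether month and day are zero-padded, and the helper name proxy_destination are all assumptions.

```python
from datetime import date


def proxy_destination(path_key: str, folder_date: date, proxy_name: str) -> str:
    """Join the store's Path key with a year/month/day/proxyname/ sub-folder."""
    base = path_key if path_key.endswith("/") else path_key + "/"
    # Assumption: month and day are zero-padded to two digits.
    return (f"{base}{folder_date.year:04d}/{folder_date.month:02d}/"
            f"{folder_date.day:02d}/{proxy_name}/")


print(proxy_destination("s3://somebucketnameins3/", date(2021, 11, 5), "myproxy"))
# prints s3://somebucketnameins3/2021/11/05/myproxy/
```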
Resolution Conclusion:
Much like Xfer, Cloudsoda is a flexible tool that can be used to do moves or copies. Mediastores act as the configuration side, setting the specifics for jobs while leveraging copy or move profiles alongside the restore modes, such as Standard or Expedited (at additional cost from s3/glacier storage).
System Architecture Do's & Don'ts
Since Cloudsoda will most likely be deployed into cloud/hybrid environments, there are some things that need to be taken into account when planning the setup of the system so that Cloudsoda and its host will work smoothly with the integration of Curator.
Mapped Drives, UNC paths, IPs - what to use?
Often the SoDA device may be mapped to root locations based on UNC paths that the Cloudsoda interface uses because it is a Linux-based device. End users may not have the same access to this location, in which case path swaps may be required. End users may have mapped drives that resolve to the same location, but PE in the cloud may have an IP for a host rather than a UNC path or mapped drive.
Configuration of all of these options MUST be taken into account and planned correctly in advance of setup.
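When planning path swaps, it can help to write the intended mappings down and test them against sample paths before configuring anything. The Python sketch below shows one way to do that. It is a planning aid only and not how Process Engine performs path swaps; the drive letter M:, the sample file path and the helper name swap_path are hypothetical.

```python
from typing import Iterable, Tuple


def swap_path(path: str, swaps: Iterable[Tuple[str, str]]) -> str:
    """Replace the first matching prefix; Windows paths are compared case-insensitively."""
    lowered = path.lower()
    for source_prefix, target_prefix in swaps:
        if lowered.startswith(source_prefix.lower()):
            return target_prefix + path[len(source_prefix):]
    return path  # no swap applies


# Hypothetical mapping: an end user's mapped drive to the UNC share used by Cloudsoda.
SWAPS = [("M:\\", "\\\\someuncpathshare\\")]

print(swap_path("M:\\restores\\clip.mxf", SWAPS))
print(swap_path("m:\\restores\\clip.mxf", SWAPS))
print(swap_path("D:\\other\\clip.mxf", SWAPS))
```

Keeping each prefix terminated with a separator avoids accidentally matching a longer share or folder name that merely starts with the same characters.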