

BHLe Cluster Storage Configuration


[supplementary image to come]

Physical Structure


100TB SCSI/SATA SAN storage overall, across 4 storage subsystems

10TB reserved for the VMware ESX datastore and backup node
45TB currently configured for GPFS shared storage with HSM (hierarchical storage management)
Remainder currently unassigned but ready for GPFS provisioning as required.

GPFS is controlled by two physical blades: gpf-ares and gpf-hades; these blades function only to provide GPFS filesystems and act as a redundant pair.
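
To confirm both blades are active in the cluster, the standard GPFS administration commands can be used (a minimal check, assuming the usual GPFS command location of /usr/lpp/mmfs/bin):

# show the cluster definition, including gpf-ares and gpf-hades
/usr/lpp/mmfs/bin/mmlscluster
# show the GPFS daemon state on all nodes (both blades should report "active")
/usr/lpp/mmfs/bin/mmgetstate -a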

GPFS Filesystem

The GPFS filesystems are mounted on all GPFS nodes as /gpfs/[filesystem].
Currently a single GPFS shared filesystem is configured: bhlfsa (mounted as /gpfs/bhlfsa on the GPFS nodes).

A pair of virtual machines (cld-demeter and cld-hestia) participate as GPFS clients and have direct access to this filesystem. These nodes are configured to act as NFS servers to make the storage more generally available.
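
To verify that the filesystem is mounted on the expected nodes, mmlsmount can be run from any node in the GPFS cluster (a minimal check; command path as above):

# list the nodes that currently have bhlfsa mounted
/usr/lpp/mmfs/bin/mmlsmount bhlfsa -L
# report capacity and usage as seen through the mount point
df -h /gpfs/bhlfsa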

NFS Shared GPFS


As GPFS has specific requirements for the host operating system, sharing beyond the GPFS cluster is achieved via NFS to make the filesystem accessible to other hosts.

NFS Serving

A pair of servers is configured to serve NFS in a clustered configuration using CTDB. The live configuration requires settings specific to the NHM hosting institution on the SMB side of CTDB, so those settings are not documented here. Essentially this method uses GPFS storage to hold the NFS state information, controlled via the ctdb service, which enables failover/IP takeover when required. Details are available on request.

(See ctdb.samba.org for CTDB information.)
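
The health of the clustered pair and the current location of the takeover IP addresses can be checked with the ctdb command-line tool on either NFS server (a minimal check):

# overall state of the CTDB nodes (both should report OK)
ctdb status
# show which node currently holds each public/takeover IP
ctdb ip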

The hosts present the shared drives under the cluster identity: dfs-ctdb

The GPFS filesystem is mounted via /etc/fstab on the NFS servers:

/dev/bhlfsa /gpfs/bhlfsa gpfs rw,mtime,atime,dev=bhlfsa,autostart 0 0
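
If the filesystem is not mounted after a reboot, it can be brought up manually on that node using the GPFS-native mount command (a minimal sketch; mmmount acts on the local node by default):

# mount the bhlfsa GPFS filesystem on this node
/usr/lpp/mmfs/bin/mmmount bhlfsa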

NFS sharing on those hosts is configured in /etc/exports:

/gpfs/bhlfsa/nfs 157.140.72.0/24(rw,sync,fsid=101)
/gpfs/bhlfsa/nfs/ingest 157.140.72.0/24(rw,sync,fsid=102)
/gpfs/bhlfsa/nfs/preingest 157.140.72.0/24(rw,sync,fsid=103)
/gpfs/bhlfsa/nfs/proto 157.140.72.0/24(rw,sync,fsid=104)
/gpfs/bhlfsa/nfs/upload 157.140.72.0/24(rw,sync,fsid=105)
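
After changing /etc/exports, the export table can be reloaded and verified with the standard NFS utilities on each server (a minimal sketch):

# re-read /etc/exports and re-export all entries without restarting NFS
exportfs -ra
# list what is currently exported under the cluster identity, as a client would see it
showmount -e dfs-ctdb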


The /gpfs/bhlfsa/nfs share is currently available; however, for separation of function across multiple processing machines, it is not writeable at the top level, only via the subdirectories. It may be better/more secure going forward to use only the more specific shares. The architecture of those components is still underway, so the top-level share will not be removed until/unless it is fully feasible to do so.

NFS Clients

Nodes which are not GPFS clients can access this filesystem by adding the following to /etc/fstab:

dfs-ctdb:/gpfs/bhlfsa/nfs /mnt/bhl-nfs nfs rw,hard,intr,rsize=32768,wsize=32768 0 0

(use a mapping for one of the more specific shares where feasible)
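
For example, a host that only needs the ingest area could map just that subdirectory (a sketch; the local mount point /mnt/bhl-ingest is illustrative rather than an agreed convention):

# /etc/fstab entry for the ingest share only
dfs-ctdb:/gpfs/bhlfsa/nfs/ingest /mnt/bhl-ingest nfs rw,hard,intr,rsize=32768,wsize=32768 0 0

# create the mount point and mount it
mkdir -p /mnt/bhl-ingest
mount /mnt/bhl-ingest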

HSM

The third component of the storage is Tivoli Storage Manager HSM; this component allows migration of data from production disk to tape while maintaining transparent access to that data. HSM applies at the GPFS filesystem level and is currently configured for /gpfs/bhlfsa. The tape layer is served by an array of LTO4 drives in the NHM tape library.

HSM applies migration rules to shift unused data from disk to tape when bhlfsa usage hits an 80% threshold.
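
The space management configuration for the filesystem, and the migration state of individual files, can be checked with the TSM space management client tools on the GPFS nodes (a minimal check, assuming the standard dsm* commands are installed):

# show the HSM settings for the filesystem, including migration thresholds
dsmmigfs query -Detail /gpfs/bhlfsa
# list files with their state: resident (r), premigrated (p) or migrated (m)
dsmls /gpfs/bhlfsa/nfs/ingest/*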