Networked Storage Glossary
General Networked Storage Systems
Appliances - A storage platform designed to perform a specific task, such as NAS, routers, virtualization, etc.
Auto Loaders - A single tape drive with robotics and multiple tape cartridges.
Libraries - A large-scale tape device with robotics that can house multiple tape drives and a large number of tape cartridges.
Modular - Building block system where controller/host connection function is physically separate from storage, and storage and/or controller function may be added independently of each other.
Monolithic - Single box system containing all control and storage in a single system, where the storage is integrated with the other components and cannot be segregated.
NAS/File - A storage system that connects to a network (typically Ethernet) and presents storage as network volume shares (CIFS) or NFS-mounted devices.
Purpose-built - Storage device designed for a specific purpose or application that includes specialized hardware/software to provide that function.
Solid State - RAM (memory)-based disk device.
Utility - Storage capable of providing quality of service, carrier class availability and serviceability, on-line expansion or reallocation of resources, typically used as a back-end to an IT profit center.
Virtual Tape - A system that presents itself as a tape device (drive, autoloader, or library) but actually contains disk drives and memory, and possibly tape devices as well. It is used to improve performance and, in some cases, to connect to multiple backup software environments concurrently.
RAID Storage Systems
RAID (Redundant Array of Independent Disks) - A disk subsystem that is used to increase performance or provide fault tolerance. RAID can also be set up to provide both functions at the same time. RAID is a set of two or more ordinary hard disks and a specialized disk controller that contains the RAID functionality. Developed initially for servers and stand-alone disk storage systems, RAID is increasingly popular in desktop PCs. RAID can also be implemented via software only, but with less performance, especially when rebuilding data after a failure.
RAID improves performance by disk striping, which interleaves bytes or groups of bytes across multiple drives, so more than one disk is reading and writing simultaneously. Fault tolerance is achieved by mirroring or parity. Mirroring is 100% duplication of the data on two drives (RAID 1). Parity (RAID 3 and 5) is calculated from the data on the other drives and stored separately: in the simplest case, a bit from drive 1 is XOR'd with a bit from drive 2, and the result bit is stored on drive 3. If any one of the three drives fails, its bits can be recomputed by XORing the corresponding bits on the other two. A failed drive can be hot swapped with a new one, and the RAID controller automatically rebuilds the lost data. In addition, RAID systems may be built using a spare drive (hot spare) ready and waiting to be the replacement for a drive that fails.
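The XOR parity scheme described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how any particular RAID controller is implemented; the function names xor_parity and rebuild are hypothetical, and each "drive" is modeled as a short byte string.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized data blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("need at least one block, all of the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity):
    """Recover the single missing block from the survivors plus the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Hypothetical two-data-drive example: drive 3 holds the parity.
drive1 = b"\x0f\xa5"
drive2 = b"\xf0\x5a"
drive3 = xor_parity([drive1, drive2])

# Simulate losing drive 2 and rebuilding it from drive 1 and the parity.
recovered = rebuild([drive1], drive3)
print(recovered == drive2)  # True
```

The same two functions work for any number of data drives, because XOR is associative: the parity of N blocks XOR'd with any N-1 of them yields the remaining one.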
RAID systems come in all sizes from desktop units to floor-standing models. Any desktop PC can be turned into a RAID system by adding a RAID controller board and the appropriate number of IDE or SCSI disks. Stand-alone units may also include large amounts of cache as well as redundant power supplies.
In the late 1980s, the "I" in RAID stood for "inexpensive," in contrast to the large computer disks of the time, or SLEDs (Single Large Expensive Disks). As hard disks became cheaper, the RAID Advisory Board changed the name to mean "independent." See SAN.
RAID LEVEL 0
Level 0 is disk striping only, which interleaves data across multiple disks for better performance. It does not provide safeguards against failure.
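As a rough illustration of striping, the sketch below maps a logical block number to a disk and an offset on that disk using simple round-robin placement. The function name locate and the one-block stripe unit are assumptions made for the example; real arrays typically use larger, configurable stripe units.

```python
def locate(logical_block, num_disks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    if num_disks < 1 or logical_block < 0:
        raise ValueError("num_disks must be >= 1 and logical_block >= 0")
    # Consecutive logical blocks land on consecutive disks.
    return logical_block % num_disks, logical_block // num_disks


# Six consecutive blocks spread over a hypothetical three-disk array.
for block in range(6):
    disk, offset = locate(block, 3)
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Because neighboring blocks sit on different disks, a large sequential transfer keeps all the disks busy at once, which is where the performance gain comes from.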
RAID LEVEL 1
Uses disk mirroring, which provides 100% duplication of data. Offers high reliability, but doubles storage cost.
RAID LEVEL 2
Bits (rather than bytes or groups of bytes) are interleaved across multiple disks. The Connection Machine used this technique, but this is a rare method.
RAID LEVEL 3
Data are striped across three or more drives. Used to achieve the highest data transfer, because all drives operate in parallel. Parity bits are stored on a separate, dedicated drive.
RAID LEVEL 4
Similar to Level 3, but manages disks independently rather than in unison. Not often used.
RAID LEVEL 5
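Most widely used. Data are striped across three or more drives for performance, and parity bits are used for fault tolerance. The parity for each stripe is computed from the data on the other drives, and the parity blocks are distributed across all the drives rather than kept on one dedicated drive.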
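RAID 5 implementations commonly rotate the parity block from drive to drive, stripe by stripe, so that no single drive carries all the parity traffic. The sketch below shows one possible rotation; the exact layout varies between implementations, and the function name stripe_layout is hypothetical.

```python
def stripe_layout(stripe, num_disks):
    """Return a list marking each disk as data ("D") or parity ("P") for a stripe."""
    if num_disks < 3:
        raise ValueError("this sketch assumes at least three disks")
    # Assumed rotation: parity starts on the last disk and moves left each stripe.
    parity_disk = (num_disks - 1 - stripe) % num_disks
    return ["P" if disk == parity_disk else "D" for disk in range(num_disks)]


# Print the first four stripes of a hypothetical four-disk array.
for stripe in range(4):
    print(stripe, " ".join(stripe_layout(stripe, 4)))
```

Each stripe still has exactly one parity block, so the array gives up one drive's worth of capacity regardless of how many drives it contains.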
RAID LEVEL 6
Highest reliability, but not widely used. Similar to RAID 5, but does two different parity computations or the same computation on overlapping subsets of the data.
RAID LEVEL 10
A combination of RAID 1 and RAID 0. RAID 0 is used for performance, and RAID 1 is used for fault tolerance.
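The storage cost of the more common levels can be compared with a little arithmetic. The sketch below assumes equally sized drives and the textbook capacity formulas for each level; the function name usable_capacity and the minimum drive counts it enforces are assumptions for illustration, and actual products may differ.

```python
def usable_capacity(level, num_disks, disk_size):
    """Return usable capacity, in the same units as disk_size, for a RAID level."""
    if level == 0 and num_disks >= 2:
        return num_disks * disk_size            # striping only, no redundancy
    if level == 1 and num_disks >= 2:
        return disk_size                        # every drive holds the same data
    if level == 5 and num_disks >= 3:
        return (num_disks - 1) * disk_size      # one drive's worth of parity
    if level == 6 and num_disks >= 4:
        return (num_disks - 2) * disk_size      # two drives' worth of parity
    if level == 10 and num_disks >= 4 and num_disks % 2 == 0:
        return (num_disks // 2) * disk_size     # striped set of mirrored pairs
    raise ValueError("unsupported level or drive count for this sketch")


# Four hypothetical 500 GB drives under each level.
for level in (0, 5, 6, 10):
    print(f"RAID {level}: {usable_capacity(level, 4, 500)} GB usable")
```

With four drives, levels 6 and 10 yield the same usable capacity, but they get there by different means: RAID 6 spends two drives on parity, while RAID 10 spends two on mirror copies.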
Archive - Software or appliance technologies designed to migrate fixed-content data from primary storage to secondary storage, online or offline, or to transportable media, for long-term retention and retrieval.
ARM: (Automated Resource Management) - Storage software solutions which combine elements of SNM, SRM and Policy Management in order to provide user-defined SLA enforcement and auto-provisioning of capacity.
Backup - Software or appliance technologies designed to copy data from primary storage to secondary storage, in order to enable restoration of primary data in the event of a system failure.
Cluster - Technology used to maximize the uptime of a system. High-availability (HA) technology includes redundant hardware and intelligence that allow hardware and applications to “fail over,” restart, and maintain proper business functions.
Data Management - A category of storage management software designed to migrate data from primary storage to secondary storage in order to protect primary data sets or to create a replica copy of primary data.
File System - A system that organizes files and metadata in a hierarchical structure and is responsible for managing the physical placement of blocks that make up the files.
HSM/ADM: (Hierarchical Storage Management/Automated Data Management) - Solutions that migrate data within the storage environment according to user-defined policies.
Policy Management - Enables users to set specific policies about how storage is implemented and used. Solutions containing policy management will automatically enforce those policies and take specific actions based on user-defined conditions.
Replication - Software or an appliance used to make an exact copy of data from one location to a secondary location(s).
Security - Hardware and/or software solutions that are designed specifically to protect storage assets and resident data from unauthorized access.
SRM: (Storage Resource Management) - Storage software solutions that discover, assess, and report the usage patterns of storage systems.
SNM: (Storage Network Management) - Storage software solutions that discover, monitor, manage, and display the physical elements that make up the storage network infrastructure.
Virtualization - Storage software solutions that abstract the physical and logical storage assets from the host systems.
Appliances - An integrated system made up of purpose-built software and hardware designed for a specific task (or set of tasks) and sold as a turnkey solution. The concept is to simplify installation and management, and deliver high performance.
Directors - High-end versions of storage switches (FC, FICON, ESCON) that have 32 or more ports and are designed for high availability. These usually have redundant switch cores and are based upon a bladed architecture.
Intelligent Switching Platforms - Switches which provide a platform for intelligent storage services to reside on, which makes the storage intelligence, such as replication, volume management, virtualization, and file serving, part of the switching infrastructure. These switches are often multiprotocol (FC, IP, IB).
Routers/Gateways - A router is a device or, in some cases, software in a computer, that determines the next network point to which a packet should be forwarded toward its destination. A gateway is a network point that acts as an entrance to another network.
Switches - Devices that reside in the network and direct traffic based on source and destination addresses. Most common examples are Fibre Channel or Ethernet switches.
WAN/SAN Extension - Gateways that connect geographically distributed SANs via existing WAN infrastructure. These solutions may use IP, DWDM, or SONET transport technologies and support protocols that carry FC blocks via those WAN networks (FCIP, iFCP, etc.).
Assessment & Design - A methodical process conducted by companies that identifies key storage usage patterns.
Implementation & Deployment - Installation, setup, migration, and process change services of storage technologies.
Solution Integration - The planning and staging step of all technology elements prior to production deployment.
Testing - "Proof of concept" testing prior to design selection and/or deployment.
Training - The staff education and competency development step that typically follows deployment and includes knowledge transfer.
Array Controllers - The intelligent component of a disk subsystem where control software executes.
ATA: (Advanced Technology Attachment) - The specification for IDE drives. The specification was published in 1994 as ANSI standard X3.221-1994, titled AT Attachment Interface for Disk Drives.
Chip - A set of microminiaturized, electronic circuits that are designed for use as processors and memory in computers and countless consumer and industrial products.
DLT: (Digital Linear Tape) - A family of tape device and media technologies developed by Quantum Corporation.
ESCON: (Enterprise System Connection) - A 200Mb/sec serial I/O bus used on IBM Corporation’s Enterprise System 9000 data center computers. Similar to Fibre Channel in many respects, ESCON is based on redundant switches to which computers and storage subsystems connect using serial optical connections.
FC: (Fibre Channel) - A high-speed transport technology used to build storage area networks (SANs). Although Fibre Channel can be used as a general-purpose network carrying ATM, IP and other protocols, it has been primarily used for transporting SCSI traffic from servers to disk arrays. The Fibre Channel Protocol (FCP) serializes SCSI commands into Fibre Channel frames. IP, however, is used for in-band SNMP network management. Fibre Channel not only supports singlemode and multimode fiber connections, but coaxial cable and twisted pair as well.
Fibre Channel can be configured point-to-point, via a switched topology or in an arbitrated loop (FC-AL) with or without a hub, which can connect up to 127 nodes. It supports transmission rates up to 2.12 Gbps in each direction, and 4.25 Gbps is expected. Fibre Channel shares its physical layer technology with Gigabit Ethernet and uses IBM's 8B/10B encoding method, where each byte is transmitted as 10 bits. Fibre Channel provides both connection-oriented and connectionless services.
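Because 8B/10B encoding sends 10 bits on the wire for every 8 bits of data, only 80% of the line rate is available for data. The short calculation below works this out, treating the 2.12 Gbps figure quoted above as the line rate, which is an assumption made for the example; the function name payload_rate_gbps is hypothetical, and framing overhead is ignored.

```python
def payload_rate_gbps(line_rate_gbps, data_bits=8, coded_bits=10):
    """Return the data rate left after line-coding overhead (8B/10B by default)."""
    if coded_bits < data_bits or data_bits <= 0:
        raise ValueError("coded_bits must be >= data_bits > 0")
    return line_rate_gbps * data_bits / coded_bits


line_rate = 2.12                                  # Gbps, assumed to be the line rate
data_rate = payload_rate_gbps(line_rate)          # Gbps of actual data
megabytes_per_second = data_rate * 1000 / 8       # 1 Gbps = 1000 Mbps, 8 bits per byte

print(f"{data_rate:.3f} Gbps of data, about {megabytes_per_second:.0f} MB/s")
```

Under these assumptions the link carries roughly 1.7 Gbps of data, or a little over 200 MB/s, in each direction.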
FCIP: (Fibre Channel over Internet Protocol) - An Internet Protocol (IP)-based storage networking technology developed by the Internet Engineering Task Force (IETF). FCIP mechanisms enable the transmission of Fibre Channel (FC) information by tunneling data between storage area network (SAN) facilities over IP networks.
FICON: (Fibre Connectivity) - IBM Corporation’s implementation of ESCON over Fibre Channel.
GigE: (Gigabit Ethernet) - A version of Ethernet that runs at 1 Gbps. Its twisted-pair copper variant is known as 1000Base-T.
HBA: (Host Bus Adapter) - Any hardware bridge between a storage interconnect and a system’s I/O bus.
HCA/TCA: (Host Channel Adapter/Target Channel Adapter) - Host Channel Adapters (HCA) and Target Channel Adapters (TCA) enable servers and I/O devices, respectively, to connect to the InfiniBand fabric. These adapters are built from specialized chips that process the InfiniBand link protocol at wire speed without incurring any host overhead.
InfiniBand - A switched I/O architecture that delivers a high performance, low latency interconnect for the data center.
IP: (Internet Protocol) - A protocol that provides connectionless best-effort delivery of datagrams across heterogeneous physical networks.
IP Storage (Internet Protocol storage) - Using IP and Gigabit Ethernet to build storage area networks (SANs). Traditional SANs were developed using the Fibre Channel transport, because it provided gigabit speeds compared to 10 and 100 Mbps Ethernet used to build messaging networks at that time. Fibre Channel equipment has been costly, and interoperability between different vendors' switches was not completely standardized. Since Gigabit Ethernet and IP have become commonplace, IP storage enables familiar network protocols to be used, and IP allows SANs to be extended throughout the world. Network management software and experienced professionals in IP networks are also widely available. iSCSI, iFCP and mFCP are protocols that turn SCSI and Fibre Channel Protocol commands into TCP/IP.
iSCSI: (Internet Small Computer System Interface) - A protocol that serializes SCSI commands and encapsulates them in TCP/IP packets.
IS-NIC: (Integrated Storage Network Interface Card) - A NIC that incorporates TCP/IP offload features and fully implements the TCP/IP stack.
LTO: (Linear Tape Open) - A family of open magnetic tape standards developed by HP, IBM and Seagate that are licensed to third-party vendors.
NAS: (Network-Attached Storage) - A file-storage device that is accessed through a network.
NIC: (Network Interface Card) - A printed circuit board that connects a computer or other node to a network. Also known as a network adapter.
SAN: (Storage Area Network) - A network that transfers data between computer systems and storage devices via peripheral channels such as SCSI or Fibre Channel.
SCSI: (Small Computer System Interface) - A collection of ANSI standards and proposed standards that define I/O buses primarily for connecting storage subsystems or devices to hosts through host bus adapters.
Snapshot - A fully usable copy of a defined collection of data that contains an image of the data as it appeared at the point in time at which the copy was initiated. A snapshot may be either a duplicate or a replicate of the data it represents.
TOE: (TCP/IP Offload Engine) - A network interface card with onboard hardware or firmware designed to offload most TCP/IP processing from the host CPU. The intent is to speed up I/O by accelerating TCP/IP processing.
SAN (Storage Area Network) - A network of storage disks. In large enterprises, a SAN connects multiple servers to a centralized pool of disk storage. Compared to managing hundreds of servers, each with their own disks, SANs improve system administration. By treating all the company's storage as a single resource, disk maintenance and routine backups are easier to schedule and control. In some SANs, the disks themselves can copy data to other disks for backup without any processing overhead at the host computers.
The SAN network allows data transfers between computers and disks at the same high peripheral channel speeds as when they are directly attached. Fibre Channel is a driving force with SANs and is typically used to encapsulate SCSI commands. SSA and ESCON channels are also supported.
SANs can be centralized or distributed. A centralized SAN connects multiple servers to a collection of disks, whereas a distributed SAN typically uses one or more Fibre Channel or SCSI switches to connect nodes within buildings or campuses. For long distances, SAN traffic is transferred over ATM, SONET or dark fiber. To guarantee complete recovery in a disaster, dual, redundant SANs are deployed, one a mirror of the other and each in separate locations.
Another SAN option is IP storage, which enables data transfer via IP over fast Gigabit Ethernet locally or via the Internet to anywhere in the world.
Channel Attached Vs. Network Attached
A related storage device is the network attached storage (NAS) system. The NAS is a disk subsystem that attaches to the LAN like any server or workstation. But rather than containing a full-blown operating system, it typically uses a slim microkernel specialized for handling only file reads and writes (CIFS/SMB, NFS, NCP). Adding or removing a NAS box is like adding or removing any node in a network. In contrast, the channel-attached storage system of the SAN must be taken offline to reconfigure it. However, the NAS is subject to the variable behavior and overhead of a network that may contain thousands of users.
SAN-NAS terminology is confusing (storage area network vs. network attached storage). NASs are often contrasted with SANs, but are also included under the "storage network" umbrella. The major difference is that the channel-attached SAN extends the peripheral channel to long distances, whereas the NAS is another file server on the network.
NAS (Network Attached Storage) - A specialized file server that connects to the network. A NAS device contains a slimmed-down (microkernel) operating system and file system and processes only I/O requests by supporting popular file sharing protocols such as NFS (Unix) and SMB/CIFS (DOS/Windows). Using traditional LAN protocols such as Ethernet and TCP/IP, the NAS enables additional storage to be quickly added by plugging it into a network hub or switch. As network transmission rates have increased from Ethernet to Fast Ethernet to Gigabit Ethernet, NAS devices have come up to speed parity with direct attached storage devices.
General-purpose computers with a full-blown operating system such as Windows or Unix are sometimes labeled as NAS products, but the true NAS is built from scratch as a dedicated file I/O device.