Figure 1 - Data Integrity Field Appended To 512-Byte Standard Block
T10/03-224r0
1. Introduction
The purpose of this document is to provide justification for the End-to-End Data Protection proposal described in document T10/03-111. Data integrity concerns are not new, and protection mechanisms abound. Packet-based storage transport protocols have CRC protection on command and data payloads. Interconnect buses have parity protection. Memory systems have parity detection/correction schemes. I/O protocol controllers at the transport/interconnect boundaries have internal data path protection. Data availability in storage systems is frequently measured simply in terms of the reliability of the hardware components and the effects of redundant hardware. But the reliability of the software, its ability to detect errors, and its ability to correctly report or apply corrective actions to a failure have a significant bearing on the overall storage system availability. The primary objective of the proposed End-to-End Data Protection mechanism is as follows:
Define a standardized data integrity protection mechanism that spans transport and device boundaries to protect data as it is transferred from application space to storage media, then back.
Proprietary mechanisms that provide end-to-end data protection already exist. The primary motivations for defining a standardized mechanism based on SCSI command and architecture standards are as follows:
Improve fault isolation by allowing each device in the data path to understand the protection scheme.
Increase interoperability between standardized initiator and target devices.
The proposed data protection mechanism is intended to complement, not replace, end-to-end protection schemes defined at the application level. SCSI architecture primarily defines transport-level operations. The proposed mechanism spans transport boundaries by embedding the protection into the data stream itself. This allows target devices to receive data with standardized data integrity protection, store the data and associated data integrity fields in a proprietary fashion, then return the data and associated data integrity fields in a standardized fashion. Intermediate storage devices that bridge or connect SCSI domains can check the standardized data integrity as data is transferred between SCSI domains.
As shown below in Figure 2, the Data Integrity Field is composed of three sub-fields, the Reference Tag, the Meta Tag and the Guard. The Reference Tag nominally contains information associated with a specific data block within some context, such as the lower 4 bytes of the Logical Block Address. During a multi-block data transfer, this field is incremented by one for each successive block. Since the Reference Tag is not required to be LBA based, the proposal defines additional SCSI read and write block commands that allow the Reference Tag base to be specified on a per I/O basis. The Meta Tag contains additional context information that is nominally held fixed within the context of a given I/O operation. The Guard field is computed from the data in the standard data block. The tag values in the DIF are excluded from the computation.
[Figure 2: Data Integrity Field sub-fields (Reference Tag, Meta Tag, Guard); the Guard is 2 bytes]
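The per-block tagging described above can be sketched roughly as follows. This is a minimal illustration, not the proposal's definition: the 4-byte Reference Tag and 2-byte Meta Tag widths, the field order on the wire, the helper names `guard` and `append_difs`, and the use of Python's `binascii.crc_hqx` as a stand-in 16-bit Guard function are all assumptions made for the example. Only the 2-byte Guard size, the per-block increment of the Reference Tag, and the exclusion of the tags from the Guard computation come from the text.

```python
import binascii
import struct
from typing import List

BLOCK_SIZE = 512
# Assumed layout: 4-byte Reference Tag, 2-byte Meta Tag, 2-byte Guard.
# Only the 2-byte Guard size is stated in the document.
DIF_FORMAT = ">IHH"


def guard(data: bytes) -> int:
    """Stand-in 16-bit check over the data block only (tags are excluded).

    crc_hqx is a standard-library CRC used purely for illustration; it is
    not claimed to be the Guard algorithm of the proposal.
    """
    return binascii.crc_hqx(data, 0)


def append_difs(payload: bytes, ref_tag_base: int, meta_tag: int) -> List[bytes]:
    """Split payload into 512-byte blocks and append a DIF to each one."""
    if len(payload) % BLOCK_SIZE:
        raise ValueError("payload must be a whole number of 512-byte blocks")
    blocks = []
    for offset in range(0, len(payload), BLOCK_SIZE):
        data = payload[offset:offset + BLOCK_SIZE]
        # Reference Tag increments by one for each successive block.
        ref_tag = (ref_tag_base + offset // BLOCK_SIZE) & 0xFFFFFFFF
        # Meta Tag is held fixed for the whole I/O operation.
        dif = struct.pack(DIF_FORMAT, ref_tag, meta_tag, guard(data))
        blocks.append(data + dif)
    return blocks
```

For example, `append_difs(data, ref_tag_base=319, meta_tag=0)` on a two-block payload would yield two 520-byte blocks carrying Reference Tags 319 and 320.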
4. System Example
Target devices commonly have a buffered architecture in which data is staged in an intermediate buffer after it is received from the initiator, or prior to transfer to the initiator. Figure 3 illustrates an intermediate storage device that appears as a SCSI target to host systems in one SCSI domain, designated A, and as a SCSI initiator to the disk drives, where the data is actually stored, in a second domain, designated B. This system will be used to illustrate the benefits of end-to-end data protection.
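[Figure 3: intermediate storage device containing a Data Buffer/Cache and two I/O Interfaces. In SCSI Domain "A", Hosts with HBAs act as SCSI Initiators and the intermediate device acts as a SCSI Target. In SCSI Domain "B", the intermediate device acts as a SCSI Initiator and the Drives act as SCSI Targets.]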
destination for disk read operations, and is addressed via a scatter/gather list. Buffer memory in the disk drive is the destination for disk write operations, and the source for disk read operations. From the perspective of the intermediate storage device, the drive is addressed as a SCSI target device using SCSI domain B address elements.
[Figure 4 labels recovered from the illustration]
Mapping: Host LBA 319 -> Drive 2, LBA 127; Host LBA 320 -> Drive 3, LBA 64
Block data ... 55 7C 8B ... with DIF 319/00/5C, held in Buffer 452, Mem Addr 0x1C4000
Block data ... 18 4A 1C ... with DIF 320/00/14, held in Buffer 823, Mem Addr 0x337000
Host I/O I/F Structure: Op = Receive; Ref Tag Base = 319; Scatter/Gather List: Addr 0x1C4000, Lth 512; Addr 0x337000, Lth 512
Drive I/O I/F Structure: Op = Transmit; Ref Tag Base = 319/320; Scatter/Gather List: Addr xxxx, Lth 512
Roles along the data path: Initiator (host), Target and Initiator (intermediate storage device), Target (drive)
Figure 4 - Host to Drive Data Flow Illustration with Intermediate Storage Device
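The receive-side check at the host I/O interface in Figure 4 might look something like the sketch below. It reuses the figure's values (Ref Tag Base 319 and scatter/gather addresses 0x1C4000 and 0x337000) purely as sample inputs. The DIF field widths, the `IntegrityError` and `receive_blocks` names, the dictionary used to model buffer memory, and `binascii.crc_hqx` as a stand-in Guard function are assumptions for illustration and do not come from the proposal.

```python
import binascii
import struct
from typing import Dict, List, Tuple

BLOCK_SIZE = 512
DIF_FORMAT = ">IHH"  # assumed: 4-byte Reference Tag, 2-byte Meta Tag, 2-byte Guard
DIF_SIZE = struct.calcsize(DIF_FORMAT)


class IntegrityError(Exception):
    """Raised when a received block fails its DIF check (hypothetical name)."""


def guard(data: bytes) -> int:
    # Stand-in 16-bit CRC; not claimed to be the proposal's Guard algorithm.
    return binascii.crc_hqx(data, 0)


def receive_blocks(stream: bytes,
                   ref_tag_base: int,
                   sgl: List[Tuple[int, int]],
                   memory: Dict[int, Tuple[bytes, bytes]]) -> None:
    """Check each 512 + DIF block, then place it at the next scatter/gather entry.

    memory maps a buffer address to (data, dif), so the DIF stays with
    the data while it is staged in the buffer.
    """
    stride = BLOCK_SIZE + DIF_SIZE
    if len(stream) != stride * len(sgl):
        raise ValueError("stream length does not match scatter/gather list")
    for n, (addr, length) in enumerate(sgl):
        if length != BLOCK_SIZE:
            raise ValueError("this sketch only handles 512-byte entries")
        chunk = stream[n * stride:(n + 1) * stride]
        data, dif = chunk[:BLOCK_SIZE], chunk[BLOCK_SIZE:]
        ref_tag, _meta_tag, received_guard = struct.unpack(DIF_FORMAT, dif)
        # The n-th block must carry the base Reference Tag plus n.
        if ref_tag != (ref_tag_base + n) & 0xFFFFFFFF:
            raise IntegrityError(f"block {n}: unexpected reference tag {ref_tag}")
        # The Guard covers the data only, never the tags.
        if received_guard != guard(data):
            raise IntegrityError(f"block {n}: guard mismatch")
        memory[addr] = (data, dif)
```

Because the check runs before the block is stored, a misdirected or corrupted block is caught at the interface where it arrives, which is the fault-isolation benefit the proposal is after.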
5. Interoperability
The proposal allows the initiator to arbitrarily specify Reference Tags on a per I/O basis using algorithms that are unknown to the target device. This can be done only in a homogeneous system where the algorithm is controlled by a single entity such as the file system or HBA vendor. Realistically, heterogeneous systems require that the same algorithm be used by all initiators in a given SCSI domain. In the system example shown in Figure 3, if both host systems wish to access a common set of logical units in SCSI domain A, then they must use the same algorithm for Reference Tags. This can be enforced at a system level by the intermediate storage device by forcing the Reference Tags to be LBA based via the Data Integrity Mode Page, hence the expression "LBA locked" for that SCSI domain.
Note that this does not mean that the Reference Tags must be LBA locked in SCSI domain B. If the intermediate storage device is a mapping device that stripes data across multiple drives, then this is likely a homogeneous environment, where all initiators in that domain understand the data layouts. In this case, the Reference Tags applied in SCSI domain A can be passed through to SCSI domain B, even if the tags do not match the drive LBAs.
If the Reference Tags are LBA locked in SCSI domain A, then the intermediate storage device can allow legacy and DIF capable systems to share access to a volume as illustrated in Figure 5.
[Figure 5: legacy (512-byte) and DIF capable (512 + DIF) systems sharing access to a volume through the intermediate storage device. Recovered labels: Memory (512, DIF); I/O (512 + DIF); Check and Forward; Generate and Insert.]
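A rough sketch of how an intermediate device could serve both kinds of host when Reference Tags are LBA locked is given below. It assumes the device distinguishes legacy from DIF capable writes by payload length, which is a simplification for the example rather than anything the proposal states. The function names, the DIF field widths, and `binascii.crc_hqx` as a stand-in Guard function are likewise hypothetical.

```python
import binascii
import struct

BLOCK_SIZE = 512
DIF_FORMAT = ">IHH"  # assumed: 4-byte Reference Tag, 2-byte Meta Tag, 2-byte Guard
DIF_SIZE = struct.calcsize(DIF_FORMAT)


def guard(data: bytes) -> int:
    # Stand-in 16-bit CRC; not claimed to be the proposal's Guard algorithm.
    return binascii.crc_hqx(data, 0)


def lba_locked_ref_tag(lba: int) -> int:
    """LBA locked Reference Tag: the lower 4 bytes of the Logical Block Address."""
    return lba & 0xFFFFFFFF


def handle_host_write(lba: int, payload: bytes, meta_tag: int = 0) -> bytes:
    """Return the 512 + DIF block to pass on toward the drives.

    Legacy hosts send 512 bytes: generate and insert a DIF.
    DIF capable hosts send 512 + DIF: check it, then forward it unchanged.
    """
    expected = lba_locked_ref_tag(lba)
    if len(payload) == BLOCK_SIZE:
        # Generate and Insert
        return payload + struct.pack(DIF_FORMAT, expected, meta_tag, guard(payload))
    if len(payload) == BLOCK_SIZE + DIF_SIZE:
        # Check and Forward
        data, dif = payload[:BLOCK_SIZE], payload[BLOCK_SIZE:]
        ref_tag, _meta_tag, received_guard = struct.unpack(DIF_FORMAT, dif)
        if ref_tag != expected:
            raise ValueError("reference tag is not LBA locked")
        if received_guard != guard(data):
            raise ValueError("guard mismatch")
        return payload
    raise ValueError("unexpected payload length")
```

Since both paths produce the same LBA-based Reference Tag for a given block, a block written by a legacy host can later be read and verified by a DIF capable host, and vice versa.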