File - $LogFile (2)
Overview
Author's note: The information presented here is intended to complement the information presented in the "NTFS Recovery Support" subchapter in Windows Internals, by Mark E. Russinovich and David Solomon.
Layout of the File
Unnamed Data Stream
At the beginning of the log file are two LFS_RESTART_PAGE pages, 4KB each. CurrentLsn should be used to determine which of the two is more recent.
There are two restart pages because one of them may become corrupt if there is a power failure during a write. (Note: the NTFS v5.1 driver will write two identical copies when the volume is cleanly dismounted).
The rest of the log file consists of a series of LFS_RECORD_PAGE pages. The size of each page is specified in the LogPageSize field of the LFS_RESTART_PAGE structure.
As of LFS v1.1 (the latest version as of this writing, in use from Windows NT 4.0 onwards), the first two of these pages serve a special purpose and are called tail pages or tail copies. The remaining pages are sometimes called regular pages.
The tail pages are used for backup purposes, and in some circumstances may contain information that hasn't yet been written to the regular log area.
LFS_RESTART_PAGE
Offset |
Size |
Name |
Description |
0x00 |
8 |
MULTI_SECTOR_HEADER |
Signature: 'RSTR' |
0x08 |
8 |
ChkDskLsn |
|
0x10 |
4 |
SystemPageSize |
The size of each LFS_RESTART_PAGE |
0x14 |
4 |
LogPageSize |
The size of each LFS_RECORD_PAGE |
0x18 |
2 |
RestartOffset |
The offset of the LFS_RESTART_AREA relative to the start of this structure |
0x1A |
2 |
MinorVersion |
Signed integer, LFS minor version number |
0x1C |
2 |
MajorVersion |
Signed integer, LFS major version number |
0x1E |
Variable |
UpdateSequenceArray |
|
RestartOffset |
Variable |
LFS_RESTART_AREA |
|
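The fixed-size fields in the table above can be decoded with a few lines of Python. This is a minimal sketch, not a reference implementation: it assumes little-endian byte order and that the signature occupies the first 4 bytes of MULTI_SECTOR_HEADER, and the names `RestartPageHeader` and `parse_restart_page_header` are illustrative only.

```python
import struct
from collections import namedtuple

# Illustrative container; the field names mirror the table above.
RestartPageHeader = namedtuple(
    "RestartPageHeader",
    "signature chkdsk_lsn system_page_size log_page_size "
    "restart_offset minor_version major_version",
)

def parse_restart_page_header(page: bytes) -> RestartPageHeader:
    """Decode the fields at offsets 0x00-0x1D of an LFS_RESTART_PAGE."""
    signature = page[0:4]  # assumption: signature is the first 4 header bytes
    if signature != b"RSTR":
        raise ValueError("not a restart page")
    # 0x08 ChkDskLsn (8), 0x10 SystemPageSize (4), 0x14 LogPageSize (4),
    # 0x18 RestartOffset (2), 0x1A MinorVersion (signed 2),
    # 0x1C MajorVersion (signed 2)
    fields = struct.unpack_from("<QIIHhh", page, 0x08)
    return RestartPageHeader(signature, *fields)
```

The UpdateSequenceArray is not applied here; a real reader would have to undo the multi-sector fixups before trusting data near sector boundaries.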
LFS_RESTART_AREA
Offset |
Size |
Name |
Description |
0x00 |
8 |
CurrentLsn |
The LSN of the latest record at the time this structure was written |
0x08 |
2 |
LogClients |
Number of LFS_CLIENT_RECORD entries |
0x0A |
2 |
ClientFreeList |
The index of the first free log client record in LogClientArray (NoClient = 0xFFFF) |
0x0C |
2 |
ClientInUseList |
The index of the first in-use log client record in LogClientArray (NoClient = 0xFFFF) |
0x0E |
2 |
Flags |
SinglePageIO = 0x0001, CleanDismount = 0x0002 |
0x10 |
4 |
SeqNumberBits |
SeqNumberBits = 64 - (FileSizeBits - 3), where FileSizeBits is the number of bits chosen to represent the log file size (greater than or equal to the number of bits needed) |
0x14 |
2 |
RestartAreaLength |
Length of this structure including LogClientArray |
0x16 |
2 |
ClientArrayOffset |
The offset of the first LFS_CLIENT_RECORD relative to the start of this structure |
0x18 |
8 |
FileSize |
The size of the log file |
0x20 |
4 |
LastLsnDataLength |
The length of the data belonging to the record associated with CurrentLsn (not including the LFS_RECORD_HEADER) |
0x24 |
2 |
RecordHeaderLength |
|
0x26 |
2 |
LogPageDataOffset |
|
0x28 |
4 |
RevisionNumber |
This value is incremented by 1 every time the LFS_RESTART_AREA is written (the initial value is chosen at random) |
ClientArrayOffset |
Variable |
LogClientArray |
LFS_CLIENT_RECORD array of clients who are using this log file (currently the only client is NTFS itself) |
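As a hedged illustration of the layout above, the following Python sketch decodes an LFS_RESTART_AREA located at `RestartOffset` within a restart page, then picks the more recent of two decoded areas by comparing CurrentLsn. Little-endian byte order is assumed, and the function names are hypothetical.

```python
import struct

# Field names follow the LFS_RESTART_AREA table, in on-disk order.
RESTART_AREA_FIELDS = (
    "current_lsn", "log_clients", "client_free_list", "client_in_use_list",
    "flags", "seq_number_bits", "restart_area_length", "client_array_offset",
    "file_size", "last_lsn_data_length", "record_header_length",
    "log_page_data_offset", "revision_number",
)

def parse_restart_area(page: bytes, restart_offset: int) -> dict:
    """Decode offsets 0x00-0x2B of the restart area inside a restart page."""
    values = struct.unpack_from("<QHHHHIHHQIHHI", page, restart_offset)
    return dict(zip(RESTART_AREA_FIELDS, values))

def most_recent(first: dict, second: dict) -> dict:
    """Return whichever restart area carries the higher CurrentLsn."""
    return first if first["current_lsn"] >= second["current_lsn"] else second
```

When both copies carry the same CurrentLsn (for example after a clean dismount), the sketch simply returns the first one.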
LFS_CLIENT_RECORD
Offset |
Size |
Name |
Description |
0x00 |
8 |
OldestLsn |
|
0x08 |
8 |
ClientRestartLsn |
The LSN of the latest restart record at the time this structure was written |
0x10 |
2 |
PrevClient |
The index of the previous client in LogClientArray (NoClient = 0xFFFF) |
0x12 |
2 |
NextClient |
The index of the next client in LogClientArray (NoClient = 0xFFFF) |
0x14 |
2 |
SeqNumber |
Client sequence number |
0x16 |
6 |
Padding |
|
0x1C |
4 |
ClientNameLength |
Number of bytes |
0x20 |
128 |
ClientName |
|
LFS_RECORD_PAGE
Offset |
Size |
Name |
Description |
0x00 |
8 |
MULTI_SECTOR_HEADER |
Signature: 'RCRD' |
0x08 |
8 |
LastLsnOrFileOffset |
Last LSN that starts on this page for regular log pages, FileOffset for tail copies (indicates the location in the file where the page should be placed) |
0x10 |
4 |
Flags |
RecordEnd = 0x00000001 (Indicates that a log record ends on this page) |
0x14 |
2 |
PageCount |
Number of pages written as part of the IO transfer. A MultiPage record is likely to be written as part of two separate IO transfers (since the last page may have room for more records that will be written in a later transfer) |
0x16 |
2 |
PagePosition |
One-based |
0x18 |
2 |
NextRecordOffset |
The offset of the free space in the page. If the last record does not end on this page, this value is not incremented and points to the start of that record. |
0x1A |
6 |
Padding |
|
0x20 |
8 |
LastEndLsn |
|
0x28 |
Variable |
UpdateSequenceArray |
|
|
Variable |
Data |
|
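The page header above could be read as follows. This is a sketch under the same assumptions as before (little-endian fields, 4-byte signature at the start of MULTI_SECTOR_HEADER); `parse_record_page_header` is an illustrative name, and the update sequence fixups are not applied.

```python
import struct

RECORD_END = 0x00000001  # Flags bit: a log record ends on this page

def parse_record_page_header(page: bytes) -> dict:
    """Decode offsets 0x08-0x27 of an LFS_RECORD_PAGE."""
    if page[0:4] != b"RCRD":
        raise ValueError("not a record page")
    # 0x08 LastLsnOrFileOffset, 0x10 Flags, 0x14 PageCount,
    # 0x16 PagePosition, 0x18 NextRecordOffset, 0x1A padding, 0x20 LastEndLsn
    (last_lsn_or_file_offset, flags, page_count, page_position,
     next_record_offset, last_end_lsn) = struct.unpack_from("<QIHHH6xQ", page, 0x08)
    return {
        "last_lsn_or_file_offset": last_lsn_or_file_offset,
        "record_end": bool(flags & RECORD_END),
        "page_count": page_count,
        "page_position": page_position,  # one-based
        "next_record_offset": next_record_offset,
        "last_end_lsn": last_end_lsn,
    }
```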
LFS_RECORD
Offset |
Size |
Name |
Description |
0x00 |
8 |
ThisLsn |
|
0x08 |
8 |
ClientPreviousLsn |
|
0x10 |
8 |
ClientUndoNextLsn |
|
0x18 |
4 |
ClientDataLength |
|
0x1C |
2 |
ClientSeqNumber |
|
0x1E |
2 |
ClientIndex |
|
0x20 |
4 |
RecordType |
ClientRecord = 1, ClientRestart = 2 |
0x24 |
4 |
TransactionId |
|
0x28 |
2 |
Flags |
MultiPage = 0x0001 |
0x2A |
6 |
Padding |
|
0x30 |
ClientDataLength |
Data |
|
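A minimal Python sketch of decoding one LFS_RECORD according to the table above. It assumes little-endian fields and that the record's data is already contiguous in the buffer; for a MultiPage record the data continues on the following pages, which this sketch does not reassemble. The function name is hypothetical.

```python
import struct

LFS_RECORD_HEADER_LENGTH = 0x30  # Data begins at offset 0x30
MULTI_PAGE = 0x0001              # Flags bit from the table above

def parse_lfs_record(buf: bytes, offset: int = 0) -> dict:
    """Decode the LFS_RECORD header at `offset` and slice out its data."""
    (this_lsn, client_previous_lsn, client_undo_next_lsn, client_data_length,
     client_seq_number, client_index, record_type, transaction_id,
     flags) = struct.unpack_from("<QQQIHHIIH", buf, offset)
    data_start = offset + LFS_RECORD_HEADER_LENGTH
    return {
        "this_lsn": this_lsn,
        "client_previous_lsn": client_previous_lsn,
        "client_undo_next_lsn": client_undo_next_lsn,
        "client_data_length": client_data_length,
        "client_seq_number": client_seq_number,
        "client_index": client_index,
        "record_type": record_type,  # ClientRecord = 1, ClientRestart = 2
        "transaction_id": transaction_id,
        "multi_page": bool(flags & MULTI_PAGE),
        # May be shorter than client_data_length if the record spans pages.
        "data": buf[data_start:data_start + client_data_length],
    }
```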
Structures specific to the NTFS client
RESTART_AREA
Offset |
Size |
Name |
Description |
0x00 |
4 |
MajorVersion |
NTFS log client major version |
0x04 |
4 |
MinorVersion |
NTFS log client minor version |
0x08 |
8 |
StartOfCheckpointLsn |
|
0x10 |
8 |
OpenAttributeTableLsn |
|
0x18 |
8 |
AttributeNamesLsn |
|
0x20 |
8 |
DirtyPageTableLsn |
|
0x28 |
8 |
TransactionTableLsn |
|
0x30 |
4 |
OpenAttributeTableLength |
|
0x34 |
4 |
AttributeNamesLength |
|
0x38 |
4 |
DirtyPageTableLength |
|
0x3C |
4 |
TransactionTableLength |
|
0x40 |
8 |
Unknown1 |
$USN Journal related length |
0x48 |
8 |
PreviousRestartRecordLsn |
The value of CurrentLsn on previous mount? |
0x50 |
4 |
BytesPerCluster |
|
0x54 |
4 |
Padding |
|
0x58 |
8 |
UsnJournal |
MFT_SEGMENT_REFERENCE of the $USN journal |
0x60 |
8 |
Unknown2 |
$USN Journal related |
0x68 |
8 |
UnknownLsn |
This field is present starting in Windows Vista and later |
Windows NT 4.0 writes RESTART_AREA records that are 64 bytes long.
Windows 2000 / XP / 2003 write RESTART_AREA records that are 104 bytes long.
Windows Vista / 7 / 8 / 10 write RESTART_AREA records that are 112 bytes long.
When x64 versions of Windows format the volume they set the RESTART_AREA version to 1.0 (x86 will set the version to 0.0).
The record version will be maintained when moving disks between operating systems using the same record length.
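The record lengths listed above give a simple way to guess which generation of Windows wrote a RESTART_AREA. The Python sketch below decodes the first 64 bytes, which are common to all three lengths, and reads BytesPerCluster only when the record is long enough. Little-endian order is assumed and all names are illustrative.

```python
import struct

# Record lengths quoted in the text above.
LENGTH_TO_WRITER = {
    64: "Windows NT 4.0",
    104: "Windows 2000 / XP / 2003",
    112: "Windows Vista / 7 / 8 / 10",
}

COMMON_FIELDS = (
    "major_version", "minor_version", "start_of_checkpoint_lsn",
    "open_attribute_table_lsn", "attribute_names_lsn", "dirty_page_table_lsn",
    "transaction_table_lsn", "open_attribute_table_length",
    "attribute_names_length", "dirty_page_table_length",
    "transaction_table_length",
)

def parse_client_restart_area(data: bytes) -> dict:
    """Decode offsets 0x00-0x3F, plus BytesPerCluster when present."""
    result = dict(zip(COMMON_FIELDS, struct.unpack_from("<IIQQQQQIIII", data, 0)))
    result["written_by"] = LENGTH_TO_WRITER.get(len(data), "unknown")
    if len(data) >= 104:
        result["bytes_per_cluster"] = struct.unpack_from("<I", data, 0x50)[0]
    return result
```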
NTFS_LOG_RECORD
Offset |
Size |
Name |
Description |
0x00 |
2 |
RedoOperation |
NTFS_LOG_OPERATION |
0x02 |
2 |
UndoOperation |
NTFS_LOG_OPERATION |
0x04 |
2 |
RedoOffset |
Offset MUST be aligned to 8 byte boundary |
0x06 |
2 |
RedoLength |
|
0x08 |
2 |
UndoOffset |
Offset MUST be aligned to 8 byte boundary |
0x0A |
2 |
UndoLength |
|
0x0C |
2 |
TargetAttributeOffset |
Offset of the attribute in the open attribute table, 0 is a valid value for operations that do not require TargetAttribute |
0x0E |
2 |
LCNsToFollow |
|
0x10 |
2 |
RecordOffset |
|
0x12 |
2 |
AttributeOffset |
|
0x14 |
2 |
ClusterBlockOffset |
|
0x16 |
2 |
Reserved |
|
0x18 |
8 |
TargetVCN |
|
0x20 |
Variable |
LCNsForPage |
|
RedoOffset |
RedoLength |
RedoData |
|
UndoOffset |
UndoLength |
UndoData |
|
The NTFS_LOG_RECORD_HEADER structure definition reserves 8 bytes for the first LCN, so RedoOffset / UndoOffset MUST be greater than or equal to 0x28.
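To show how the offsets and lengths fit together, here is a hedged Python sketch that splits an NTFS_LOG_RECORD (the data portion of an LFS client record) into its header fields, LCN array, redo data, and undo data. It assumes little-endian fields and 8-byte LCNs, and does not validate the alignment rules stated above; the function name is hypothetical.

```python
import struct

def parse_ntfs_log_record(data: bytes) -> dict:
    """Decode the 0x20-byte header, then slice LCNs and redo/undo data."""
    (redo_op, undo_op, redo_offset, redo_length, undo_offset, undo_length,
     target_attribute_offset, lcns_to_follow, record_offset, attribute_offset,
     cluster_block_offset, _reserved, target_vcn) = struct.unpack_from("<12HQ", data, 0)
    # Assumption: each LCN in LCNsForPage is an 8-byte integer starting at 0x20.
    lcns = struct.unpack_from("<%dQ" % lcns_to_follow, data, 0x20)
    return {
        "redo_operation": redo_op,
        "undo_operation": undo_op,
        "target_attribute_offset": target_attribute_offset,
        "record_offset": record_offset,
        "attribute_offset": attribute_offset,
        "cluster_block_offset": cluster_block_offset,
        "target_vcn": target_vcn,
        "lcns_for_page": lcns,
        # Offsets are relative to the start of the NTFS_LOG_RECORD.
        "redo_data": data[redo_offset:redo_offset + redo_length],
        "undo_data": data[undo_offset:undo_offset + undo_length],
    }
```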
NTFS_LOG_OPERATION
Value |
Name |
Data |
0x00 |
Noop |
|
0x01 |
CompensationLogRecord |
|
0x02 |
InitializeFileRecordSegment |
FILE_RECORD_SEGMENT |
0x03 |
DeallocateFileRecordSegment |
|
0x04 |
WriteEndOfFileRecordSegment |
ATTRIBUTE_RECORD_HEADER |
0x05 |
CreateAttribute |
ATTRIBUTE_RECORD_HEADER |
0x06 |
DeleteAttribute |
|
0x07 |
UpdateResidentAttributeValue |
(Attribute Value) |
0x08 |
UpdateNonResidentAttributeValue |
(Attribute Value) |
0x09 |
UpdateMappingPairs |
(Mapping Pair bytes) |
0x0A |
DeleteDirtyClusters |
array of LCN_RANGE |
0x0B |
SetNewAttributeSizes |
NEW_ATTRIBUTE_SIZES |
0x0C |
AddIndexEntryToRoot |
INDEX_ENTRY |
0x0D |
DeleteIndexEntryFromRoot |
INDEX_ENTRY |
0x0E |
AddIndexEntryToAllocationBuffer |
INDEX_ENTRY |
0x0F |
DeleteIndexEntryFromAllocationBuffer |
INDEX_ENTRY |
0x10 |
WriteEndOfIndexBuffer |
INDEX_ENTRY |
0x11 |
SetIndexEntryVcnInRoot |
VCN |
0x12 |
SetIndexEntryVcnInAllocationBuffer |
VCN |
0x13 |
UpdateFileNameInRoot |
DUPLICATED_INFORMATION |
0x14 |
UpdateFileNameInAllocationBuffer |
DUPLICATED_INFORMATION |
0x15 |
SetBitsInNonResidentBitMap |
BITMAP_RANGE |
0x16 |
ClearBitsInNonResidentBitMap |
BITMAP_RANGE |
0x17 |
HotFix |
|
0x18 |
EndTopLevelAction |
|
0x19 |
PrepareTransaction |
|
0x1A |
CommitTransaction |
|
0x1B |
ForgetTransaction |
|
0x1C |
OpenNonResidentAttribute |
OPEN_ATTRIBUTE_ENTRY (The attribute name is stored in the UndoData field) |
0x1D |
OpenAttributeTableDump |
OPEN_ATTRIBUTE_ENTRY restart table |
0x1E |
AttributeNamesDump |
ATTRIBUTE_NAME_ENTRY array |
0x1F |
DirtyPageTableDump |
DIRTY_PAGE_ENTRY restart table |
0x20 |
TransactionTableDump |
TRANSACTION_ENTRY restart table |
0x21 |
UpdateRecordDataInRoot |
(value) |
0x22 |
UpdateRecordDataInAllocationBuffer |
(value) |
RESTART_TABLE header
Offset |
Size |
Name |
Description |
0x00 |
2 |
EntrySize |
|
0x02 |
2 |
NumberEntries |
|
0x04 |
2 |
NumberAllocated |
|
0x06 |
6 |
Padding |
|
0x0C |
4 |
FreeGoal |
|
0x10 |
4 |
FirstFree |
|
0x14 |
4 |
LastFree |
|
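The restart tables that follow (open attributes, dirty pages, transactions) all start with this header. The Python sketch below walks a table and yields the allocated entries. It makes two assumptions that the table above does not state: that the first entry begins immediately after the 0x18-byte header, and that NumberEntries counts all slots, allocated or free. The names are illustrative.

```python
import struct

RESTART_TABLE_HEADER_LENGTH = 0x18  # assumption: entries start right after the header
ENTRY_ALLOCATED = 0xFFFFFFFF        # AllocatedOrNextFree value of an allocated entry

def allocated_entries(table: bytes):
    """Yield (offset_in_table, entry_bytes) for every allocated entry."""
    entry_size, number_entries, _number_allocated = struct.unpack_from("<HHH", table, 0)
    for index in range(number_entries):
        start = RESTART_TABLE_HEADER_LENGTH + index * entry_size
        entry = table[start:start + entry_size]
        if len(entry) < 4:
            break  # truncated table, stop rather than guess
        if struct.unpack_from("<I", entry, 0)[0] == ENTRY_ALLOCATED:
            yield start, entry
```

Each yielded entry can then be decoded with the layout that matches the table type and client version.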
OPEN_ATTRIBUTE_ENTRY, NTFS v1.2, client v0.0
Offset |
Size |
Name |
Description |
0x00 |
4 |
AllocatedOrNextFree |
0xFFFFFFFF if the entry is allocated |
0x04 |
4 |
PointerToAttributeName |
This value is used by the live system and should be ignored when read from disk |
0x08 |
8 |
FileReference |
MFT_SEGMENT_REFERENCE |
0x10 |
8 |
LsnOfOpenRecord |
This is the LSN of the client record preceding the OpenNonResidentAttribute record |
0x18 |
1 |
DirtyPagesSeen |
Boolean |
0x19 |
1 |
AttributeNamePresent |
Boolean |
0x1A |
2 |
Padding |
|
0x1C |
4 |
AttributeTypeCode |
|
0x20 |
8 |
AttributeName |
UNICODE_STRING. This value is used by the live system and should be ignored when read from disk |
0x28 |
4 |
BytesPerIndexBuffer |
Meaningful if (AttributeTypeCode == IndexAllocation) |
OPEN_ATTRIBUTE_ENTRY, NTFS v3.0+, client v0.0
Offset |
Size |
Name |
Description |
0x00 |
4 |
AllocatedOrNextFree |
0xFFFFFFFF if the entry is allocated |
0x04 |
4 |
AttributeOffset |
Self-reference. The NTFS v5.1 driver calculates AttributeOffset using an entry length of 0x28; the reason is unclear |
0x08 |
8 |
FileReference |
MFT_SEGMENT_REFERENCE |
0x10 |
8 |
LsnOfOpenRecord |
|
0x18 |
4 |
Padding |
|
0x1C |
4 |
AttributeTypeCode |
|
0x20 |
8 |
PointerToAttributeName |
This value is used by the live system and should be ignored when read from disk |
0x28 |
4 |
BytesPerIndexBuffer |
Meaningful if (AttributeTypeCode == IndexAllocation) |
OPEN_ATTRIBUTE_ENTRY, NTFS v3.0+, client v1.0
Offset |
Size |
Name |
Description |
0x00 |
4 |
AllocatedOrNextFree |
0xFFFFFFFF if the entry is allocated |
0x04 |
4 |
BytesPerIndexBuffer |
Meaningful if (AttributeTypeCode == IndexAllocation) |
0x08 |
4 |
AttributeTypeCode |
|
0x0C |
1 |
DirtyPagesSeen |
Boolean |
0x0D |
3 |
Padding |
|
0x10 |
8 |
FileReference |
MFT_SEGMENT_REFERENCE |
0x18 |
8 |
LsnOfOpenRecord |
|
0x20 |
8 |
PointerToAttributeName |
This value is used by the live system and should be ignored when read from disk |
DIRTY_PAGE_ENTRY, client v0.0
Offset |
Size |
Name |
Description |
0x00 |
4 |
AllocatedOrNextFree |
0xFFFFFFFF if the entry is allocated |
0x04 |
4 |
TargetAttributeOffset |
Offset of the attribute in the open attribute table |
0x08 |
4 |
LengthOfTransfer |
|
0x0C |
4 |
LCNsToFollow |
|
0x10 |
4 |
Reserved |
|
0x14 |
8 |
VCN |
|
0x1C |
8 |
OldestLsn |
Oldest LSN of log record update that has not yet been written through to the disk |
0x24 |
8 * LCNsToFollow |
LCNsForPage |
Array of LCNs (logical cluster number) |
DIRTY_PAGE_ENTRY, client v1.0
Offset |
Size |
Name |
Description |
0x00 |
4 |
AllocatedOrNextFree |
0xFFFFFFFF if the entry is allocated |
0x04 |
4 |
TargetAttributeOffset |
Offset of the attribute in the open attribute table |
0x08 |
4 |
LengthOfTransfer |
|
0x0C |
4 |
LCNsToFollow |
|
0x10 |
8 |
VCN |
|
0x18 |
8 |
OldestLsn |
Oldest LSN of log record update that has not yet been written through to the disk |
0x20 |
8 * LCNsToFollow |
LCNsForPage |
Array of LCNs (logical cluster number) |
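The two DIRTY_PAGE_ENTRY layouts differ only in the 4 reserved bytes that precede VCN in client v0.0. A minimal Python sketch that handles both, assuming little-endian fields and that the caller already knows the client major version from the RESTART_AREA; the function name is hypothetical.

```python
import struct

def parse_dirty_page_entry(entry: bytes, client_major_version: int) -> dict:
    """Decode a DIRTY_PAGE_ENTRY for client v0.0 or v1.0."""
    if client_major_version == 0:
        # v0.0: 4 reserved bytes at 0x10, VCN at 0x14, LCN array at 0x24
        head_format, lcn_start = "<IIII4xQQ", 0x24
    else:
        # v1.0: VCN at 0x10, LCN array at 0x20
        head_format, lcn_start = "<IIIIQQ", 0x20
    (allocated_or_next_free, target_attribute_offset, length_of_transfer,
     lcns_to_follow, vcn, oldest_lsn) = struct.unpack_from(head_format, entry, 0)
    lcns = struct.unpack_from("<%dQ" % lcns_to_follow, entry, lcn_start)
    return {
        "allocated": allocated_or_next_free == 0xFFFFFFFF,
        "target_attribute_offset": target_attribute_offset,
        "length_of_transfer": length_of_transfer,
        "vcn": vcn,
        "oldest_lsn": oldest_lsn,
        "lcns_for_page": lcns,
    }
```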
TRANSACTION_ENTRY
Offset |
Size |
Name |
Description |
0x00 |
4 |
AllocatedOrNextFree |
0xFFFFFFFF if the entry is allocated |
0x04 |
4 |
TransactionState |
Uninitialized = 0, Active = 1, Prepared = 2, Committed = 3 |
0x08 |
8 |
FirstLsn |
|
0x10 |
8 |
PreviousLsn |
|
0x18 |
8 |
UndoNextLsn |
|
0x20 |
4 |
UndoRecords |
Number of undo log records pending abort |
0x24 |
4 |
UndoBytes |
Number of bytes in undo log records pending abort |
I am not sure of the circumstances requiring the transaction table to be written to disk. Under normal circumstances the transaction table can be inferred from the log records and is not explicitly written to disk.
ATTRIBUTE_NAME_ENTRY
Offset |
Size |
Name |
Description |
0x00 |
2 |
OpenAttributeOffset |
Offset of the attribute with this name in the open attribute table |
0x02 |
2 |
NameLength |
In bytes |
0x04 |
(NameLength + 1) * 2 |
Name |
Null terminated |
BITMAP_RANGE
Offset |
Size |
Name |
Description |
0x00 |
4 |
BitmapOffset |
|
0x04 |
4 |
NumberOfBits |
|
LSN (logical sequence number)
The LSN is both the ID of an LFS record and the location of the record in the log file.
To calculate the file offset of an LFS_RECORD given its LSN:
FileOffset = ((lsn << SeqNumberBits) & 0xFFFFFFFFFFFFFFFF) >> (SeqNumberBits - 3)
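The same calculation in Python, as a small sketch. Because Python integers are unbounded, the 64-bit mask is what discards the sequence-number bits. The second helper is an inference from the formula (the bits that the shift discards are the top SeqNumberBits bits of the LSN) rather than something stated above, and both function names are illustrative.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF

def lsn_to_file_offset(lsn: int, seq_number_bits: int) -> int:
    """Apply the FileOffset formula from the text to a 64-bit LSN."""
    return ((lsn << seq_number_bits) & MASK_64) >> (seq_number_bits - 3)

def lsn_to_sequence_number(lsn: int, seq_number_bits: int) -> int:
    """Assumption: the top SeqNumberBits bits of the LSN hold the sequence number."""
    return lsn >> (64 - seq_number_bits)
```

For example, with SeqNumberBits = 41 the low 23 bits of the LSN hold the offset in units of 8 bytes, so an LSN of `(5 << 23) | 0x1000` maps to file offset 0x8000.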
Tail pages
In LFS v1.1, the log pages are packed, meaning that if there is room on the last page written (in a previous IO transfer), records from the current IO transfer will be added to it.
To avoid losing both the records from the previous IO transfer and those from the current one if the page becomes corrupt due to a power failure, two special pages called tail copies are used.
Note that only the first page and the last page of an IO transfer may have tail copies (the pages in between never put previously written records at risk).
Two copies are not strictly needed but may be useful for some implementations.
NTFS Recovery
Analysis pass
• NTFS scans forward in log file from beginning of last checkpoint
• Updates transaction/dirty page tables it copied in memory
• NTFS scans tables for oldest update record of a non-committed transaction
Redo pass
• NTFS looks for "page update" records which contain volume modification that might not have been flushed to disk
• NTFS redoes these updates in the cache until it reaches end of log file
• Cache manager "lazy writer thread" begins to flush cache to disk
Undo pass
• Roll back any transactions that weren't committed when system failed
• After undo pass - volume is at consistent state
• Write empty LFS restart area; no recovery is needed if system fails now
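The three passes above can be illustrated with a toy model in Python. This is emphatically not NTFS's implementation: the log is a plain list of dictionaries, the "disk" is a dictionary, and checkpoints, the dirty page table, and LSN comparisons are all ignored. Every name in it is hypothetical; it only shows the analysis, redo, undo ordering.

```python
def recover(log, disk):
    """Replay a toy log against `disk` (a dict mutated in place).

    Each log entry is a dict with "tx" and "op"; update entries also
    carry "key", "redo" (new value) and "undo" (old value).
    Returns the set of transactions that were rolled back.
    """
    # Analysis pass: find transactions that never committed.
    committed = {r["tx"] for r in log if r["op"] == "commit"}
    losers = {r["tx"] for r in log if r["op"] == "update"} - committed

    # Redo pass: reapply every logged update in log order.
    for r in log:
        if r["op"] == "update":
            disk[r["key"]] = r["redo"]

    # Undo pass: roll back uncommitted updates, newest first.
    for r in reversed(log):
        if r["op"] == "update" and r["tx"] in losers:
            disk[r["key"]] = r["undo"]
    return losers
```

After the undo pass only the effects of committed transactions remain, which corresponds to the consistent state described in the last bullets.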