Disk Safe Best Practices

The all-new CDP Disk Safe is built with industrial-grade protection against crashes and power loss. It is highly reliable and robust, with automatic mechanisms, not found in any other disk-to-disk backup software, that protect terabytes of archived data from crashes and power failures.

This is accomplished using an atomic write journal. Before any changes are made to a Disk Safe file, a new on-disk journal file is created. Before any Disk Safe pages are altered, the original content of those pages is written to the journal file. This allows any transaction to be rolled back completely, even if it is interrupted by a crash or power failure. The Disk Safe assumes that the operating system buffers writes, that a write request may return before the data has actually been written to disk, and that the operating system may reorder write operations. For this reason, the Disk Safe calls FlushFileBuffers() on Windows or fsync() on Linux at key points.
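The journaling sequence described above can be illustrated with a minimal sketch. This is not the Disk Safe implementation; the page size, the journal record layout, and all function names are assumptions made for the example. It leaves out details a production journal would need, such as checksums and flushing the containing directory.

```python
import os

PAGE_SIZE = 4096  # hypothetical page size, chosen only for this sketch


def _journal(db_path):
    return db_path + ".journal"


def begin(db_path, page_numbers):
    """Save the original content of the pages before they are altered."""
    tmp = _journal(db_path) + ".tmp"
    with open(db_path, "rb") as db, open(tmp, "wb") as journal:
        # Remember the original file size so a rollback can undo an extension.
        journal.write(os.path.getsize(db_path).to_bytes(8, "big"))
        for page_no in sorted(page_numbers):
            db.seek(page_no * PAGE_SIZE)
            original = db.read(PAGE_SIZE)  # short or empty past end of file
            journal.write(page_no.to_bytes(8, "big"))
            journal.write(len(original).to_bytes(4, "big"))
            journal.write(original)
        journal.flush()
        os.fsync(journal.fileno())  # journal must reach disk before any page changes
    # Only a fully written journal ever appears under the final name.
    os.replace(tmp, _journal(db_path))


def apply_changes(db_path, changes):
    """Overwrite pages; changes maps page number -> new page bytes."""
    with open(db_path, "r+b") as db:
        for page_no, data in changes.items():
            db.seek(page_no * PAGE_SIZE)
            db.write(data)
        db.flush()
        os.fsync(db.fileno())


def commit(db_path):
    """Deleting the journal marks the transaction as complete."""
    os.remove(_journal(db_path))


def recover(db_path):
    """Roll back an interrupted transaction if a journal is present."""
    if not os.path.exists(_journal(db_path)):
        return False
    with open(_journal(db_path), "rb") as journal, open(db_path, "r+b") as db:
        original_size = int.from_bytes(journal.read(8), "big")
        while True:
            header = journal.read(12)
            if len(header) < 12:
                break
            page_no = int.from_bytes(header[:8], "big")
            length = int.from_bytes(header[8:], "big")
            db.seek(page_no * PAGE_SIZE)
            db.write(journal.read(length))
        db.truncate(original_size)
        db.flush()
        os.fsync(db.fileno())
    os.remove(_journal(db_path))
    return True
```

In this sketch, a crash at any point between begin() and commit() leaves a journal behind, and calling recover() at the next start restores the original pages. The ordering matters: the journal is flushed before the first page is altered, which is why the explicit flush calls are needed when the operating system buffers and reorders writes.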


Always Use Stable and Reliable Storage | Standard and Advanced Editions - Store Disk Safes on a Different Disk | Only use USB Drives to Transport Disk Safes | 32-Bit or 64-Bit for CDP Enterprise Servers? | Windows Best Practices | Linux Best Practices | Defragment File Systems | Avoid Disk Safe Vacuum | Corruption of Data Archived in the Disk Safe | Data Corruption in Your Environment

Always Use Stable and Reliable Storage

Tips for selecting stable and reliable storage:


Standard and Advanced Editions - Store Disk Safes on a Different Disk

When using Standard or Advanced Editions, we recommend storing your Disk Safes on a different physical disk from the disk or disks you are replicating. While CDP does support writing the Disk Safe to the same disk, and even to the same partition and file system you are replicating, it is NOT recommended. Replicating a drive to itself can cause severe disk saturation and performance issues.

Windows - Example Recommended Configurations
Linux - Example Recommended Configurations
Windows - Not Recommended Except for Testing purposes
Linux - Not Recommended Except for Testing purposes

Only use USB Drives to Transport Disk Safes

Use USB drives to transport Disk Safes. For example, a USB drive is excellent for moving a Disk Safe from your office to your data center, or between two CDP Servers connected by a slow network connection. When using a USB drive, follow the steps to safely remove the device from your computer.

R1Soft does not recommend using a USB drive as the everyday, primary storage for Disk Safes. USB drives tend to be unreliable, and it has been reported that many USB drives ignore the FlushFileBuffers() and fsync() requests that flush the drive's write cache.


32-Bit or 64-Bit for CDP Enterprise Servers?

Use 64-Bit CPUs and a 64-Bit operating system for CDP 3 Enterprise whenever possible. The main benefit of a 64-Bit environment is that the CDP Server process is not limited to a 32-bit address space.

If you are using a 32-Bit server, each process's address space is limited to less than 2 GB, even when Physical Address Extensions (PAE) are enabled. The CDP 3 Server is a single multi-threaded process. Each thread will readily execute on a different CPU or core at the same time; however, all the threads share the same virtual address space, which limits the total memory of a CDP Server on 32-bit to less than 2 GB. Why not 4 GB? Part of the 32-bit virtual address space is reserved by the operating system (on Windows, by default, half of it, which leaves 2 GB per process), and some portions of what remains are reserved as well. Typical usable virtual address space on 32-bit is somewhere between 1.5 and 1.8 GB, depending on the environment.
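The arithmetic behind the 4 GB and 2 GB figures is simple to check. The short sketch below is only an illustration; the helper names are made up for the example. It also shows one common way to ask whether a running process (here, the Python interpreter itself) was built with 32-bit or 64-bit pointers.

```python
import struct

GIB = 2 ** 30  # bytes in one gigabyte (binary)


def pointer_bits():
    """Pointer width of the running interpreter, in bits."""
    return struct.calcsize("P") * 8


def address_space_gib(bits):
    """Size of an address space with the given number of address bits."""
    return 2 ** bits / GIB


print("this process uses", pointer_bits(), "bit pointers")
print("32 address bits:", address_space_gib(32), "GB")
print("31 address bits:", address_space_gib(31), "GB")
```

A 64-bit process is not bound by either limit, which is the reason for the recommendation above.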


Windows Best Practices


Linux Best Practices


Defragment File Systems

The CDP Disk Safe is a storage system for long-term archiving of unique block-level deltas (small differences in data, tracked in units below the file system). CDP extends the Disk Safe files and writes to them in predictable 32 KB increments. This helps modern Windows and Linux file systems pre-allocate space for the Disk Safe files and naturally reduces file fragmentation. As the Disk Safe block stores (.db files) for each device you are protecting age, some data inside those files remains forever, while other data is marked deleted and its space is recycled for new deltas as old data is merged out over time. If there is no recycled free space inside the file for new deltas, the file is extended on disk. This can happen any time a new recovery point is created.
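The recycle-or-extend behavior described above can be modeled in a few lines. This is a toy in-memory model, not the Disk Safe's actual on-disk format: the class name, the 4 KB block size, and the free-list policy are assumptions for illustration. Only the 32 KB growth increment comes from the text.

```python
EXTENT = 32 * 1024  # growth increment named in the text
BLOCK = 4 * 1024    # hypothetical block size, assumed for this sketch


class BlockStoreModel:
    """Toy model of a block store file that recycles freed space."""

    def __init__(self):
        self.file_size = 0   # on-disk size of the modeled .db file
        self.free = []       # offsets marked deleted, ready for reuse
        self.used = set()    # offsets currently holding deltas

    def store_delta(self):
        """Place one delta; extend the file only when nothing is free."""
        if not self.free:
            start = self.file_size
            self.file_size += EXTENT
            self.free.extend(range(start, self.file_size, BLOCK))
        offset = self.free.pop(0)
        self.used.add(offset)
        return offset

    def merge_out(self, offset):
        """Mark a delta deleted so its space can be recycled later."""
        self.used.remove(offset)
        self.free.append(offset)

    def size_of_deltas(self):
        """Bytes actually occupied by stored deltas."""
        return len(self.used) * BLOCK
```

In this model the file size only ever grows, while the space occupied by deltas rises and falls as data is merged out. Long-lived files that keep being extended in this way are the ones most exposed to file system fragmentation.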

Generally, any files that remain on disk for a long period of time and continue to be written to and extended in size are subject to file system fragmentation. This is true for the R1Soft Disk Safe and for other long-term storage mechanisms, for example, relational databases such as MS SQL Server and MySQL.

For optimal performance, R1Soft recommends that you periodically defrag the file systems where Disk Safes are stored. If it is feasible in your environment, a weekly file system defrag would be optimal.

Windows

We recommend contig by MS Sysinternals to analyze the fragmentation of specific files and folders, and Auslogics Defrag for regularly scheduled file system defrags.

1. Determine how fragmented your Disk Safe files are.

contig -a -s D:\PATH\TO\YOUR\DISK\SAFES

This may take a while. 

2. Use Auslogics to schedule weekly file system defragment tasks on your CDP Server.

Linux

Unfortunately, the Linux operating system is lacking stable online file system defrag capability.

Here are your options:

1. XFS file system has defrag capability.

How to use XFS defrag: http://www.linux.com/archive/feature/141404

Tutorial for XFS on CentOS: http://blogwords.neologix.net/neils/?p=1

Note on Linux XFS
It is not possible to use Multi-Point Replication with XFS.

2. Ext4 promises Defrag in the future.

Ext4 has an experimental online defrag capability. Eventually, we all expect this to become stable for production servers.

Ext4 has extents which help reduce file fragmentation. Extents do not eliminate file system fragmentation and Ext4 can still benefit from defrag just like NTFS on Windows.

Be aware that the consensus is that ext4 online defrag is NOT ready for production systems.

Here is a thread in the Ubuntu bug tracker on the topic of ext4 online defrag: https://bugs.launchpad.net/ubuntu/+bug/321528 (look for I_KNOW_E4DEFRAG_MAY_DESTROY_MY_DATA_AND_WILL_DO_BACKUPS_FIRST).

3. Offline Defrag for Ext3 and Ext4.

It is possible to perform an offline defrag of any Linux file system. Because of the work involved, we recommend performing this task only once or twice a year.

  1. Shut down your CDP Server:

    /etc/init.d/cdp-server stop

  2. Make available an intermediate disk capable of holding all of your Disk Safes (this could be network storage).
  3. Copy all of your Disk Safes to the intermediate storage:

    cp -af /disk/safes/* /mnt/storage/

  4. Re-format the file system that holds the primary copy of your Disk Safes (unmount it first and mount it again afterwards):

    mkfs /dev/YOUR_DEVICE

  5. Copy all of the Disk Safe files back:

    cp -af /mnt/storage/* /disk/safes/

  6. Start the CDP Server:

    /etc/init.d/cdp-server start

    Note on Linux Defrag
    There appears to be a widespread fallacy that Linux file systems (ext2, ext3, and ext4) are somehow magically immune to fragmentation. This could not be further from the truth. It is true that the kernel does what is called pre-allocation as you are writing to a file: it attempts to notice that an application keeps extending a file and allocates contiguous blocks ahead of it. Windows also does this for NTFS. No matter how good pre-allocation or the file system may be, files get fragmented and you need a way to pack them. Linux file systems are no exception.

Avoid Disk Safe Vacuum

For best performance only vacuum your Disk Safes when you absolutely must reclaim unused storage space. For more about Disk Safe vacuum, see: Vacuuming Disk Safes.

If you are a service provider using CDP 3 Enterprise edition as a multi-tenant system, use Volume quotas based on "Size of Deltas in Disk Safe" instead of "On Disk Size". This way your customers pay only for what is stored in the Disk Safe, not for the on-disk footprint, which includes the unused parts of the Disk Safe file that are being recycled for future deltas.


Corruption of Data Archived in the Disk Safe

The CDP 3 Disk Safe is highly reliable and robust. Even with industrial grade protection, there are still ways for your data to become lost or damaged beyond repair. If any of the following events occur, you may corrupt your Disk Safe. If your Disk Safe becomes corrupted, you may lose all or some of your archived data.

If a CDP 3 Disk Safe is corrupted by any one of the events below, it cannot be repaired.


Data Corruption in Your Environment