An Introduction To Backups

An additional write-up for the Complete Guide To The Paranoid User has become necessary after re-reading some older work. I also missed the mark on some other key details, which will be addressed in subsequent write-ups. In this document I will walk through backups: why they are needed and how to do them.

Introduction

Probably the most important computer maintenance chore people neglect is keeping backups. In this section we’ll cover a few different types of backups and their purposes. One thing to keep in mind is that there are fundamentally two different types of backups: image backups and file-based backups.

In computer terminology an “image” is essentially an exact copy of all the bytes of a hard drive or partition. The image itself is a single file on the backup media; however, unlike a zip file, we cannot simply open it and pull out the individual files we want. The recovery procedure for an image is essentially to take the data from the image file and write it onto the new hard drive. What gets restored is an exact copy of the system as it was when the backup was taken. One thing to remember is that the new hard drive must be the same size as or larger than the original drive that was imaged. So if we imaged a 500 GB drive, the new drive must be at least 500 GB. We can recover onto a larger one, such as a 1 TB drive; however, the restored system will not automatically use the extra 500 GB, which is simply left unallocated. After recovering, we can expand the volume of the recovered system to claim the extra space on the drive.

The other type of backup is backing up files from our system.
This is more or less similar to copying files from our computer to a flash drive, and can include backing up everything in our home folder:

    (Windows) C:\Users\
    (Linux / Unix) /home/

I will be referring to files in these folders as “user data” and files outside of them as “system data”. The difference is that user data is the files you create and work with yourself as a user, whereas system data is where the operating system keeps its files and configuration data.

Versioning

The first type of “backup” (technically it is not a backup, but it is relevant for our purposes) is what I will call “versioning”, although this is not really a technical name for it either. On Windows this is provided by the Volume Shadow Copy service, which I will refer to as “Shadow Volumes”, and Mac has a similar feature called “Versions”. Linux does not really have an equivalent. On the surface Timeshift seems to do this; however, it works quite differently and so is only feasible for backing up system data, not user data. The reason is that Timeshift makes copies of the files (it is essentially a front end for rsync), whereas Windows takes an incremental image of the drive. Essentially it creates an image of the main C: drive when first installed, then periodically takes another image but only stores the blocks that differ from the previous image. This is a much more efficient method, so Windows Shadow Volumes can cover user data as well as system data. However, by default it keeps these backups on the same drive that it is imaging, so although it will not protect you against the drive dying or getting lost, it does provide a way of recovering older versions of modified or deleted files. Additionally, since it stores these incremental backups on the drive it is backing up, you may want to disable it so that it does not take up space on your drive.

Whether you wish to use it or not, first you should check to see if it is currently being used.
To do so, enter “System Protection” in the Windows search bar, and something like “Create a restore point” should show up as an option. Once it opens, it should be on the “System Protection” tab; if not, navigate there. From there click “Configure”. Here you have the option to turn it on or off, as well as to delete old restore points if you would like to turn it off. From the “System Protection” tab you can also restore your entire system by clicking “System Restore” and selecting from the available restore points.

However, if you just want to restore a single file, all you need to do is right-click the file in File Explorer and select the “Restore previous versions” option. You can also preview an older version to see if it is the one you want. For files that have been deleted and, for some reason or another, are not in the Recycle Bin, you can follow the same procedure on the folder that contained the file and view the contents of the folder from the available snapshots.

Image Backups

Probably the most important type of backup to do is an image backup, so that our entire system can be restored in the event of a hard drive failure, theft, etc. Ideally, we would have two backups of our system: one that we keep in our house, and another off site, either with a cloud provider like Dropbox or Google Drive (although of course we would want it encrypted) or on a physical device somewhere like a bank safety deposit box or storage unit. That way, in the event of something extreme such as a house fire or a burglary where our on-site backup is stolen, we still have something we can recover from.
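
Since anything stored off site should be encrypted first, here is a minimal sketch of one way to do that from a Linux terminal. It assumes tar and OpenSSL 1.1.1 or newer are installed (the -pbkdf2 option is not available in older versions); the folder name, file names, and passphrase are hypothetical placeholders, and other encryption tools would work just as well.

```shell
#!/bin/sh
# Sketch: bundle a few important files and encrypt the bundle before it
# leaves the house. All names and the passphrase are hypothetical.
set -e

work=$(mktemp -d)
mkdir "$work/important"
printf 'example password database\n' > "$work/important/passwords.kdbx"
printf 'example document scan\n'     > "$work/important/passport-scan.txt"

# In real use, let openssl prompt for the passphrase instead of
# writing it into a script.
passphrase='correct horse battery staple'

# 1. Archive and compress the folder, then encrypt the stream with
#    AES-256 using a key derived from the passphrase via PBKDF2.
tar -czf - -C "$work" important \
  | openssl enc -aes-256-cbc -pbkdf2 -salt -pass "pass:$passphrase" \
      -out "$work/important.tar.gz.enc"

# 2. Rehearse the recovery: decrypt and unpack into a separate folder.
mkdir "$work/restored"
openssl enc -d -aes-256-cbc -pbkdf2 -pass "pass:$passphrase" \
      -in "$work/important.tar.gz.enc" \
  | tar -xzf - -C "$work/restored"

# 3. Confirm the restored copy matches the original.
if diff -r "$work/important" "$work/restored/important" >/dev/null; then
  roundtrip=ok
else
  roundtrip=failed
fi
echo "round trip: $roundtrip"

rm -rf "$work"
```

Only the .enc file would be uploaded or copied to the off-site drive. Rehearsing the decryption step, as the sketch does, is worthwhile because an encrypted backup that cannot be decrypted is no backup at all.
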
However, I know the vast majority of people will not actually do this, so a more realistic option is to have the on-site image backup plus a small backup of important files (such as a password manager database, scans of personal documents, crypto wallets / seeds, phone backups, etc.) on a flash drive that we store off site or upload to a cloud provider. In either case we would again want these files encrypted.

To do an image backup, we will want an external hard drive that is the same size as or larger than the drive we are backing up. Typically external drives are at least 1 TB, which should be enough for most laptops; however, if you have a desktop with more storage, you may have to shop around for a bigger one. They definitely exist, but are of course a bit more expensive. Many external hard drives come with some sort of backup software preinstalled; however, these differ by manufacturer, and I personally have never used them, so I cannot comment on whether they should be used or not. If you buy one of these drives and wish to do a backup the “proper” way, the first thing you will need to do is format it to the file system your operating system uses. Not only will this get rid of the preinstalled backup software, but it is also a requirement on Mac and Windows for the backup drive to use the appropriate file system. On Mac that would be APFS (Apple File System), and on Windows NTFS (New Technology File System).

Windows - All you need to do is open File Explorer, right-click the connected drive you’d like to do the backup on, and click “Format”. Windows will display the default settings, and these are what you want; change the volume label to whatever name you like, but the rest is fine. However, if the drive does not show up in File Explorer when you plug it into your computer, you may need to format it manually.
To do this, type “disks” into the Windows search bar and select “Create and format hard disk partitions”. You will see a window displaying your disks. Disk 0 is usually the operating system’s disk; we can confirm that by checking that the C: drive is on a partition of Disk 0. From here, right-click the area representing the partition on Disk 1 (assuming that is your external drive) and select “Format” from the menu. We will be presented with a menu similar to the one seen earlier when formatting from File Explorer; just make sure it is set to NTFS, name the volume what you like, and format it. You can now use the built-in Windows backup tool to create image backups of your system.

Linux - Most distributions come with a graphical backup tool installed (if defaults are selected) that should not be hard to figure out if you installed Linux. For the command-line route, first get your external drive formatted and mounted. Presumably the external drive will be /dev/sdb and your operating system drive will be /dev/sda, and for this example we will say the external drive is mounted under /run/media/backup. The first thing you will want to do is zero out the unused space on the drive you are about to image. This way the unused space will compress easily, since it will all be zeros. To do this, run the following command from a directory on that drive:

    dd if=/dev/zero of=zero ; rm zero

This command pulls from an unending stream of zeros and writes them to a file called “zero” in the current directory. It will do this until the drive runs out of space and dd fails; the next command then deletes the file, freeing all that space.

Now we can create our image. First we will want to make sure we have the program “pv” installed so we can monitor our progress; it should be in your distro’s repository.
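
Before pointing these tools at a real disk, it can help to rehearse on a scratch file. The sketch below is only an illustration under stated assumptions: the 8 MiB "disk" files are hypothetical stand-ins built from random data, pv is left out because it only adds a progress bar, and nothing here touches /dev/sda. It shows why the zeroing step matters for compression and that a gzip image restores byte for byte.

```shell
#!/bin/sh
# Sketch: compare how a "dirty" and a "zeroed" stand-in disk compress,
# then check that the compressed image restores exactly.
set -e

work=$(mktemp -d)

# 1 MiB of real data, followed by 7 MiB of "free space".
head -c 1048576 /dev/urandom > "$work/data.bin"
head -c 7340032 /dev/urandom > "$work/junk.bin"   # leftover junk
head -c 7340032 /dev/zero    > "$work/zeros.bin"  # zeroed free space

cat "$work/data.bin" "$work/junk.bin"  > "$work/dirty.img"
cat "$work/data.bin" "$work/zeros.bin" > "$work/zeroed.img"

# Compress both stand-in disks the same way the imaging command does.
gzip -c "$work/dirty.img"  > "$work/dirty.img.gz"
gzip -c "$work/zeroed.img" > "$work/zeroed.img.gz"

dirty_size=$(( $(wc -c < "$work/dirty.img.gz") ))
zeroed_size=$(( $(wc -c < "$work/zeroed.img.gz") ))
echo "compressed, free space not zeroed: $dirty_size bytes"
echo "compressed, free space zeroed:     $zeroed_size bytes"

# Restore rehearsal: decompress the image and compare it to the source.
gunzip -c "$work/zeroed.img.gz" > "$work/restored.img"
if cmp -s "$work/zeroed.img" "$work/restored.img"; then
  restored=identical
else
  restored=different
fi
echo "restored image is $restored"

rm -rf "$work"
```

The zeroed image should compress to roughly the size of the real data alone, while the dirty one stays close to the full size of the stand-in disk.
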
Next, we will use the following command to create and compress our disk image onto our backup drive:

    sudo pv /dev/sda | gzip -c > /run/media/backup/computer.img.gz

To do the restoration, you’d boot Linux from a USB drive with the backup drive connected and the new drive installed, and the process would be much the same in reverse:

    pv /run/media/backup/computer.img.gz | gunzip -c | sudo dd of=/dev/sda

Multiple Hard Drives

I briefly want to cover configuring multiple hard drives. Even though this is not really a backup, it is relevant to RAID, which I will discuss later. Most of this applies to desktops rather than laptops, but I will also cover using an external hard drive as an extension of your main drive. The first and simplest option, for desktops with multiple drives, is to put multiple drives in the same volume. This way the combined size of two different drives acts as a single one.

A spanned volume is a dynamic volume consisting of disk space on more than one physical disk. By creating a spanned volume, you can merge unallocated space from multiple physical disks into one logical volume, making efficient use of the space on all of them. When you need to create a volume but no single disk has enough unallocated space for it, you can create a volume of the desired size by combining areas of unallocated space from multiple disks. The areas can be different sizes. When the space allocated to the volume on one disk fills up, data is stored on the next disk.

A spanned volume lets you store more data without using mount points. By combining the unallocated space of several physical disks into one spanned volume, you free up drive letters for other uses and get one large volume for the file system. Increasing the capacity of an existing spanned volume is called “extending”.
Existing spanned volumes formatted with the NTFS file system can be extended using any unallocated space on any disk. However, after extending a spanned volume, you cannot delete just a portion of it; you have to delete the entire spanned volume.

For something like a laptop with an external drive that is connected 90% of the time, but not all the time, it would be better to keep it as a separate volume. If you would like to, you could either mount the drive to a folder rather than to a drive letter, or create a link to the external drive from somewhere on your C: drive.

On Windows, when the hard drive starts to run out of space, you usually add another to extend the available storage. Although that is a quick solution, over time you could end up with a long list of drives on your computer, which may not be the best way to organize your data. As an alternative to solutions like Storage Spaces or a Redundant Array of Independent Disks (RAID) that combine drives into a logical volume, Windows also allows you to mount a hard drive to a folder rather than using a drive letter. This approach not only reduces the number of drive letters, but also helps you organize your drives better. It is also a solution when a folder you share on the network is running out of space: instead of creating a new network share, you can assign a mount-point folder path to a hard drive inside the folder that is already shared, making more storage available.

Open File Explorer and browse to the folder location where you want the mount point to appear

Click the New folder button from the "Home" tab

Confirm a name for the folder - for example, Storage Pool - and open the newly created folder

Click the New folder button from the "Home" tab to create a folder to mount the drive - for example, HardDrive1

Repeat the previous step to create additional folders, depending on the number of hard drives you want to mount as folders

Open Start, search for Disk Management, and click the top result to open the console

Right-click the drive you want to mount and select the Change Drive Letter and Paths option

Click the Add button and select the Mount in the following empty NTFS folder option

Click the Browse button and select the folder you created

Click the OK button

Click the OK button again

(Optional) Right-click the drive again and select the Change Drive Letter and Paths option

Select the current drive letter (not the mount point)

Click the Remove button

Click the Yes button

Once you complete the steps, the secondary hard drive will be accessible from the folder location you created.

Creating a soft link, or shortcut, is very simple. Go to the folder on your main drive where you want the link to be (such as your home folder under C:\Users\), right-click, and select “New” and then “Shortcut” on the next menu. You will then be prompted for where the shortcut should point, and you just select the external drive. In File Explorer, the shortcut you created will now behave much like a regular folder that stores its contents on the external drive.

RAID

RAID, or Redundant Array of Independent Disks (originally “Inexpensive Disks”), is a more advanced method of combining multiple drives into one logical one, typically with redundancy, as the name suggests. Most consumer motherboards only support RAID 0 and 1. However, I cannot really imagine a scenario where I would recommend RAID 0 for personal use. The purpose of RAID 0 is to improve read and write speeds by striping data across two or more drives. The problem with this, unlike an extended volume, is that if any disk in the array fails, you lose all the data on the array. With an extended volume, you would generally only lose what was on the drive that died, and you may be able to recover the data on the good ones. Additionally, for a personal computer, the performance benefit of RAID 0 will not make much of a difference, except maybe when loading video games; however, you would still probably be better off just using good-quality SSDs without RAID.

RAID 0 - This level provides striping without parity. It performs read and write operations on all disks simultaneously, so it offers faster speeds than the other levels. It requires at least two hard disks and fills all disks equally.

RAID 1 - This level provides mirroring, without striping or parity. It writes the same data to two disks simultaneously. If one disk fails, the other is used in its place. It requires double the number of hard disks: for example, if we want the capacity of two hard disks, we have to deploy four, and if we need one disk, we must deploy two.
The first hard disk stores the original data, while the other disk stores an exact copy of the first. Since it saves data in two places, writes are slower than with level 0.

RAID 5 - This level provides both parity and striping. It requires at least three disks. It distributes parity data evenly across all disks. If one disk fails, its data can be reconstructed from the parity data on the remaining disks. It provides a combination of integrity and performance.

RAID 6 - This level functions similarly to level 5, but stores parity data in two locations, allowing for the failure of two disks.

RAID 1 is the relevant backup choice for the majority of users. Essentially, RAID 1 configures two drives to be exact copies of each other. So if you put two 1 TB drives into RAID 1, the operating system would see one 1 TB drive. However, you really have two drives with exactly the same data on them, so if one drive dies, that data is still there and usable; you just replace the bad drive and rebuild the array.

Now you may have heard the maxim “RAID is not a backup”. For enterprise purposes that is true; however, I think it is fine for personal use. The reason is that the main concern of personal backups is hard drive failure. (We are talking strictly about desktops here, since laptops that can do RAID are very rare.) Plus, most people who do backups will use an external drive that they leave connected to, or close to, their computer, meaning that in the event of a home burglary, fire, flood, etc., the backup drive is likely to be lost if the computer is. Again, I encourage everyone to have some kind of off-site backup, whether it is a full backup or just your most important files. Additionally, personal computers have the Recycle Bin (even most Linux distros have an equivalent now) and Shadow Volumes to recover from accidental deletion, unlike most servers.
Point being, for personal use, I believe RAID 1 qualifies as an on-site backup.

For setting up a RAID configuration in Linux, we will use the program mdadm. For the sake of this example I will work on a Debian system with two disks that will be part of the RAID 1 setup. These disks are recognized as sdb and sdc; the lsblk command will list your current disks and display something similar to this:

    sda      254:0    0    7G  0 disk
    ├─sda1   254:1    0    6G  0 part /
    ├─sda2   254:2    0    1K  0 part
    └─sda5   254:5    0 1021M  0 part [SWAP]
    sdb      254:16   0    1G  0 disk
    sdc      254:32   0    1G  0 disk

Although it is possible to create the RAID directly on raw disks, it is a good idea to avoid that and instead create one partition on each of the two disks. To do this we will use parted. The first thing we want to do is create a partition table. For the sake of this example we will use MBR partition tables, but GPT ones are required in real-world scenarios with disks of 2 TB or larger. To initialize a disk, we can run the following command:

    sudo parted -s /dev/sdb mklabel msdos

Now we can create a partition that takes up all the available space:

    sudo parted -s /dev/sdb mkpart primary 1MiB 100%

We can now put the RAID flag on the partition (this sets the partition type to fd, “Linux raid autodetect”):

    sudo parted -s /dev/sdb set 1 raid on

Here we worked on the /dev/sdb device; we should repeat the same operations on the /dev/sdc disk.

Once the disks are initialized and partitioned, we can use mdadm to create the actual array. All we have to do is run the following command:

    sudo mdadm \
      --verbose \
      --create /dev/md0 \
      --level=1 \
      --raid-devices=2 \
      /dev/sdb1 /dev/sdc1

Once we run the command, we should see the following output:

    mdadm: Note: this array has metadata at the start and may not be suitable as a boot device.
    If you plan to store '/boot' on this device please ensure that your boot-loader
    understands md/v1.x metadata, or use --metadata=0.90
    mdadm: size set to 1046528K
    Continue creating array? y

In this case we can answer affirmatively and continue creating the array:

    mdadm: Defaulting to version 1.2 metadata
    mdadm: array /dev/md0 started.

To see information about the state of the created RAID setup, we can run mdadm with the --detail option, passing the name of the device we want to check. In this case, the output is the following:

    sudo mdadm --detail /dev/md0
    /dev/md0:
    Version : 1.2
    Creation Time : Fri Apr 23 11:16:44 2021
    Raid Level : raid1
    Array Size : 1046528 (1022.00 MiB 1071.64 MB)
    Used Dev Size : 1046528 (1022.00 MiB 1071.64 MB)
    Raid Devices : 2
    Total Devices : 2
    Persistence : Superblock is persistent
    Update Time : Fri Apr 23 11:17:04 2021
    State : clean
    Active Devices : 2
    Working Devices : 2
    Failed Devices : 0
    Spare Devices : 0
    Consistency Policy : resync
    Name : debian:0 (local to host debian)
    UUID : 4721f921:bb82187c:487defb8:e960508a
    Events : 17

    Number   Major   Minor   RaidDevice   State
    0        254     17      0            active sync   /dev/sdb1
    1        254     33      1            active sync   /dev/sdc1

Before the array can hold files it needs a file system, for example ext4 created with sudo mkfs.ext4 /dev/md0. Once the file system is created, we should mount it somewhere and then proceed to use it just like a normal block device. To make the system auto-mount the device at boot, we should create an entry for it in the /etc/fstab file. When doing so, we should reference the RAID device by its UUID, since its path may change on reboot. To find the UUID of the device, we can once again use the lsblk command:

    lsblk -o UUID /dev/md0
    UUID
    58ff8624-e122-419e-8538-d948439a8c07

I cannot give too much info on configuring RAID 1 on Windows, since motherboard RAID is set up in the BIOS/UEFI of your computer. You will need to look up your computer’s motherboard model to see if it supports RAID and how to configure it.
Generally, to get into your BIOS you press a key like “Delete” or “F11” during the boot process. You typically have only a brief window to do it, so usually you just spam that key after turning your computer on until it loads the BIOS. Also, for Windows you will likely need to install RAID drivers specific to your platform (AMD or Intel). These are typically available from your motherboard manufacturer’s website, and if you’re installing Windows onto a RAID array, you will likely need the drivers on a flash drive so that they can be loaded during Windows setup; otherwise the Windows installer will not be able to see the RAID array.

Conclusion

While this has been a quick introduction to backups and why they matter for security, I would also like to link some GUI options below that I would recommend for users not familiar with terminal options. All programs listed are available in all major distributions and can be mixed and matched based on your current setup.