Dealing with failed hard drives

Hard drives contain balanced disks (platters) that spin at very high speed, a bit like a record on a record player. Just as with a record or CD, if the spin becomes unbalanced the disk can be damaged and stop working. Hard drives also have a software component, which can be damaged as well.

Physical failures can happen when a disk becomes unbalanced in its case and scrapes against the arm (the equivalent of the needle). Anything that affects the spin can cause scratches, including vibration, dust and excessive heat (which can warp the metal components).

Hard drives are divided into many small segments called sectors. When the drive can’t write to a sector, that sector becomes unusable (a bad sector). Software failures can cause bad sectors, which can corrupt data and lead to the loss of important data. This can happen when a write is interrupted by a power cut, a forced shutdown of the machine or the forced termination of a process (a program that is writing data to the disk). Bad sectors can also affect performance: when the disk tries to write and hits a bad sector, it has to find a good sector to write to instead.

Physical precautions to take:
o  Keep the machine cool to prevent overheating: use an air-conditioned room.
o  Keep it clean to prevent dust from overheating the drives, unbalancing the disks or creating short circuits on other computer components.
o  Keep the machine in a server room that has a floating floor to prevent vibrations from affecting the hard drives.
o  Keep the machine on a UPS (Uninterruptible Power Supply) so that power loss does not interrupt write processes and there is enough time for a safe shutdown.
o  Don’t press and hold the on/off switch. Shut the machine down safely from Windows.

Software precautions to take:
o  Don’t kill the DVRServer process.
o  Use tools such as Disk Defragmenter regularly.
o  Don’t force the machine to shut down; wait for it to end its processes safely.
o  Regularly install updates for the machine, including Windows and Mirasys. This prevents bugs from causing processes to unexpectedly fail.
o  Ensure you have the latest firmware for the machine. Check the motherboard manufacturer’s website for updates. There are many different SATA firmware versions, so make sure you download and install the correct one and follow the correct procedure to prevent hard drive corruption and loss of data.
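
As a read-only way to confirm that the recording process is still running (rather than ending it), the sketch below asks Windows for the process list with the built-in tasklist command. It assumes the process image is named "DVRServer.exe"; that name, and the function names, are illustrative assumptions rather than anything documented by Mirasys.

```python
import csv
import io
import subprocess

# Assumption: the recorder's process image is named "DVRServer.exe".
PROCESS_NAME = "DVRServer.exe"


def parse_tasklist_csv(output: str, image_name: str) -> list[int]:
    """Return the PIDs of rows in `tasklist /FO CSV /NH` output that match image_name."""
    pids = []
    for row in csv.reader(io.StringIO(output)):
        # Matching rows look like: "Image Name","PID","Session Name","Session#","Mem Usage".
        # The "INFO: No tasks are running..." message has a single column and is skipped.
        if len(row) >= 2 and row[0].lower() == image_name.lower() and row[1].isdigit():
            pids.append(int(row[1]))
    return pids


def running_pids(image_name: str = PROCESS_NAME) -> list[int]:
    """Ask Windows which processes with this image name are running. Nothing is stopped."""
    result = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_tasklist_csv(result.stdout, image_name)
```

An empty list from running_pids() would suggest the process is not running and is worth investigating before the machine is restarted or shut down.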

Mirasys writes to the hard drives 24/7 to record data, so the disks are heavily used and failures do happen. If the data is critical, make regular Archives to prevent loss. It is recommended to keep backups on a machine separate from the source.
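
One way to keep a second copy on a separate machine is to copy the exported archive folder to a network share and confirm that each copy matches the original. The sketch below is a minimal example under stated assumptions: the archives have already been exported to an ordinary folder, the backup folder is outside the source folder, and the function names are hypothetical. It does not read from the live recording storage.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large video files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source_dir: Path, backup_dir: Path) -> list[Path]:
    """Copy every file under source_dir into backup_dir.

    Returns the source files whose copy does not match the original,
    so an empty list means every file was copied intact.
    """
    mismatched = []
    for source in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        target = backup_dir / source.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves file timestamps
        if sha256_of(source) != sha256_of(target):
            mismatched.append(source)
    return mismatched
```

Any path returned by copy_and_verify should be copied again before the source is relied on as backed up.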

Windows can fix drives that soft-fail, but it takes time. During this time the drive is unavailable to the OS and to Mirasys, so it will show as a failed drive. In this situation there is nothing wrong with the drive or the recordings on it, and we can re-add the drive and retain the data.

Troubleshooting:

To check if a disk has failed go to the Storage Settings in System Manager.

[Image: FailedDrive1]

From this we can see whether a drive has failed. If the drive has failed before, watch it closely: it may fail again, and repeated failures could be a sign that it needs to be replaced.

If you suspect a drive is physically failing, the number one indicator is slow machine performance. To find out which drive it is, safely disconnect the drives from the machine one at a time and see whether performance improves.
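
Before disconnecting anything, a rough comparison of write speed across the drives may help narrow down the suspect. The sketch below times a small temporary file written to each drive and lists the slowest first. It is only an illustrative check, not a replacement for the procedure above: the function names are hypothetical, the numbers are affected by whatever else the machine is doing, and the test itself adds a little load to a drive that is recording.

```python
import os
import tempfile
import time
from pathlib import Path


def write_throughput_mb_s(directory: Path, size_mb: int = 64) -> float:
    """Time a sequential write of size_mb MiB into directory and return MiB per second."""
    block = os.urandom(1024 * 1024)
    fd, name = tempfile.mkstemp(dir=directory, suffix=".speedtest")
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(size_mb):
                handle.write(block)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the data has actually reached the disk
        elapsed = time.perf_counter() - start
    finally:
        os.remove(name)  # always clean up the temporary file
    return size_mb / elapsed


def slowest_first(directories: list[Path], size_mb: int = 64) -> list[tuple[Path, float]]:
    """Measure each directory (one per drive) and sort the results, slowest first."""
    results = [(d, write_throughput_mb_s(d, size_mb)) for d in directories]
    return sorted(results, key=lambda item: item[1])
```

For example, slowest_first([Path("D:/"), Path("E:/")]) would compare two drives; the drive letters here are placeholders for whichever drives the machine records to. A drive that is consistently much slower than the others is the first candidate to disconnect and test.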

It is possible for more than one drive to fail. Under normal operation the drives spin at the same speed, but if one drive is having problems it will spin at a different speed, and this can affect the other drives by causing the chassis of the machine to vibrate.

