However else they might differ, hard disks are cheaper than SSDs for any given capacity. Two main reasons often given for preferring SSDs whenever possible are that they’re faster and more reliable. This article considers the latter question.
Hard disks are elaborate electro-mechanical devices in which one or more platters coated with magnetic particles are spun at high speed. They’re written to and read by heads that move just above the surface of each platter. They are thus subject to failure in many ways, including degradation of the magnetic particles, physical contact between a head and a platter (literally a crash between them), and failure of the electric motors that position the heads and spin the platters. These all deteriorate with use, resulting in local errors or bad blocks, and eventually in complete failure of the disk.
Like those of most electro-mechanical devices, hard disk failure rates are generally thought to follow a U-shaped ‘bathtub’ curve, with a high rate of failure in the first few weeks of use, declining to a low rate that is sustained until the disks start to wear out.
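That bathtub curve can be sketched as a simple hazard function: the sum of a declining infant-mortality term, a constant working-life rate, and a wear-out term that grows after some onset age. All the rate constants below are illustrative assumptions, not measured values.

```python
import math

def bathtub_hazard(age_months, infant=0.05, decay=0.1,
                   base=0.002, wear_rate=0.0005, wear_onset=48):
    """Monthly failure rate at a given age: an illustrative bathtub curve.

    infant/decay - early-failure rate and how quickly it declines
    base         - low, constant rate during the working life
    wear_rate    - additional rate per month once wear-out begins
    wear_onset   - age (months) at which wear-out starts
    All constants here are invented for illustration only.
    """
    early = infant * math.exp(-decay * age_months)
    late = wear_rate * max(0.0, age_months - wear_onset)
    return early + base + late

# High at first, low in mid-life, rising again in old age:
for age in (0, 36, 84):
    print(age, round(bathtub_hazard(age), 4))
```

The exact shape varies by model and batch; only the qualitative pattern of high early, low middle and rising late failure rates is the point here.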
Solid-state disks contain no moving parts, and fail differently from hard disks. They can develop bad blocks during use, and some studies have claimed that their rate of development may be as high as or higher than in hard disks. However, the main cause of failure is thought to occur when the memory cells reach the limit of the number of times they can be erased and rewritten, that is, when their write endurance has been reached.
Unfortunately, different studies have reached conflicting conclusions. For example, it’s generally thought that each block has a finite write endurance, so the amount of data written to an SSD should be a critical determinant of when it will fail. However, other studies claim that device age is a better predictor of failure than the amount written, contradicting that.
SSD failure rates normally follow a form of U curve too, with high initial failure rates followed by a long period in which failure is uncommon. If write endurance is the main determinant of failure, those low rates should continue until write endurance is reached, at which point failure becomes inevitable.
Thus failure modes and lifetimes for hard disks and SSDs are completely different. Wear and tear, the cause of most failure in hard disks, occurs over time, relatively independently of the amount of data written to the disk, but is adversely affected by events such as spinning up the platters. For SSDs, write endurance can be consumed rapidly when large quantities of data are written repeatedly to small SSDs, or very slowly when they are almost entirely used for reading data.
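That difference can be made concrete with a little arithmetic. Given a drive’s rated endurance (TBW, terabytes written) and an average daily write volume, the time taken to consume that endurance follows directly. The rating and workloads below are assumed figures, chosen only to illustrate the range.

```python
def years_to_exhaust_endurance(rated_tbw, daily_writes_gb):
    """Years until a drive's rated write endurance (TB written)
    is consumed at a steady daily write rate (GB/day)."""
    days = rated_tbw * 1000 / daily_writes_gb  # treating 1 TB as 1000 GB
    return days / 365.25

# A hypothetical 600 TBW drive under two assumed workloads:
light = years_to_exhaust_endurance(600, 50)    # ~33 years at 50 GB/day
heavy = years_to_exhaust_endurance(600, 2000)  # under a year at 2 TB/day
print(round(light, 1), round(heavy, 2))
```

For most users writing modest amounts each day, rated endurance is unlikely to be the limiting factor; for small SSDs under sustained heavy writes, it can be consumed within months.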
For both types of storage, early failures appear to differ little. Failure rates during the bottom of the U curve, the working life, are normally low, and again differ little between the two. Most important is the turning point at which the failure rate climbs rapidly with advancing age. Users should aim to replace storage just before that occurs.
Conditions of use
Hard disks have been used in different situations. Until ten years ago, most desktop Macs came fitted with an internal hard disk. Most users spun up that hard disk when they powered their Mac on each day, and many also put the hard disk to sleep at various times before shutting down. Apple switched internal storage in MacBook Pro notebooks to SSDs at about the same time. Prior to that, their hard disks were often spun up several times each day.
RAID and NAS systems are more likely to be left running for prolonged periods, spinning hard disks up far less frequently, but commonly reading and writing greater quantities of data. Unlike data centres, though, most working environments are less controlled in temperature, long considered to be a significant determinant of hard disk life.
Storage manufacturers have some of the most extensive data on hard disk and SSD failure rates, but as a rule don’t publish it. However, they do publish claims for their products, often as a ‘mean time between failures’ (MTBF). Typical figures given by major manufacturers are of the order of a million hours, which could be misread as indicating that on average their products should last over a century. Despite those figures, most storage media are given only 2-3 year warranties, and ‘enterprise’ quality storage with longer warranties is now unusual and costly.
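The gulf between MTBF figures and warranties is easy to show. MTBF is a population statistic, not a lifetime: a million hours implies roughly a 0.9% chance of any one drive failing in a year of continuous service, not that a single drive lasts a century. A rough sketch of the arithmetic:

```python
import math

HOURS_PER_YEAR = 8766  # 24 * 365.25

def naive_lifetime_years(mtbf_hours):
    """The misleading reading of MTBF as a single drive's lifespan."""
    return mtbf_hours / HOURS_PER_YEAR

def annualised_failure_rate(mtbf_hours):
    """Probability a drive fails within one year of continuous use,
    assuming a constant failure rate (a simple exponential model)."""
    return 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)

mtbf = 1_000_000
print(round(naive_lifetime_years(mtbf)))       # ~114 'years'
print(f"{annualised_failure_rate(mtbf):.2%}")  # under 1% per year
```

Read this way, a million-hour MTBF is entirely compatible with a 2-3 year warranty: it says something about a fleet’s failure rate during the working life, and nothing about when wear-out begins.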
Among the more useful studies are those produced by Backblaze, the cloud service provider, over the last seven years. Although those provide valuable insights into data centre use of specific makes and models of hard disks, the disks used and conditions of use are very different from the great majority of home and office users. While they say a great deal about the reliability and working life of the specific models they use, they can’t be used to draw more general conclusions about failure rates or expected working life for other applications or models.
Comparable large-scale studies of SSDs are generally older, smaller, and for different conditions of use. For example, the last survey from Backblaze covered only three years’ experience with SSDs, almost all of 500 GB or less and used to boot and run their storage servers. Most of the SSDs used are from Seagate, and the number of failures reported is too low to provide any useful comparison with their data from hard disks.
Given the size and value of the storage market, it’s shocking how little information on failure rates is available.
Given the importance of write endurance in determining SSD working life, one factor which appears unexplored is the use of SLC write cache in most modern SSDs. Essentially, this uses some of the normal SSD blocks in SLC mode, which writes data at high speed and low density. Once that has been filled, the SSD then copies the written data in slower time to other blocks in the SSD at normal density. This effectively turns each write into a sequence of two writes: write amplification.
I’m not aware of any studies assessing the effects of write amplification on SSD working life. However, SMART reports of the total data written to an SSD should account for it, so monitoring the wear of your SSD using that figure should remain accurate.
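As a sketch of how such monitoring works: if a SMART report gives the total data written, the fraction of rated endurance consumed is a simple ratio. The optional write-amplification factor below is a hypothetical adjustment for the case where the reported figure covers host writes only; the drive rating and totals are assumed values.

```python
def endurance_used(writes_tb, rated_tbw, waf=1.0):
    """Fraction of rated write endurance consumed.

    writes_tb - total data written as reported by SMART (TB)
    rated_tbw - manufacturer's endurance rating (TB written)
    waf       - hypothetical write-amplification factor, applied
                only if the report counts host writes alone
    """
    return writes_tb * waf / rated_tbw

# Assumed 600 TBW drive with 150 TB written by the host:
print(f"{endurance_used(150, 600):.0%}")           # if amplification
                                                   # is already included
print(f"{endurance_used(150, 600, waf=2.0):.0%}")  # if each host write
                                                   # is written twice
```

If the reported total already includes amplified writes, as suggested above, the simple ratio with the default factor of 1.0 is the right one to watch.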
Although good estimates don’t exist, the failure rates of hard disks and SSDs during their expected working life appear low, and not significantly different. Greater differences are seen between different manufacturers, models and batches.
Hard disks will become significantly more likely to fail after three years, but some will retain low failure rates for seven years or more. It’s impossible to predict which will last reliably into old age.
In theory at least, SSDs should have a longer working life before failure unless they’re subjected to abnormally high write rates consuming their write endurance rapidly. Working life could extend as long as ten years, but that will vary according to model and use.