RAID-0 SSD failure: I/O error, err_mask=0x4












Set up



I have an ASUS UX301LA-DE022H. It contains two SanDisk SD6SP1M-256G-1102 SSDs, 256 GB each, configured as an Intel firmware RAID 0 (a.k.a. fake RAID).
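
As a rough illustration of why a single healthy member holds only part of every large file, here is a minimal sketch of how a two-disk RAID 0 maps volume offsets onto its members. The 128 KiB strip size and the helper names `locate` and `members_touched` are assumptions made for the example; the real strip size of this array is not stated here, and the Intel metadata layout is ignored.

```python
STRIP_BYTES = 128 * 1024  # assumed strip size; the real value is not known here
MEMBERS = 2               # two SSDs in the array


def locate(volume_offset, strip_bytes=STRIP_BYTES, members=MEMBERS):
    """Map a byte offset in the RAID 0 volume to (member index, byte offset on that member)."""
    strip_index, within_strip = divmod(volume_offset, strip_bytes)
    member = strip_index % members
    member_offset = (strip_index // members) * strip_bytes + within_strip
    return member, member_offset


def members_touched(start, length, strip_bytes=STRIP_BYTES, members=MEMBERS):
    """Return the set of member disks holding any byte of [start, start + length)."""
    if length <= 0:
        return set()
    first_strip = start // strip_bytes
    last_strip = (start + length - 1) // strip_bytes
    return {strip % members for strip in range(first_strip, last_strip + 1)}


if __name__ == "__main__":
    print(locate(0))
    print(locate(STRIP_BYTES))
    print(members_touched(0, 1024 * 1024))
```

Under these assumptions, any contiguous range of the volume longer than one strip spans both disks, so the surviving SSD alone cannot yield complete files larger than a strip.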



What happened



I was using Windows normally. I went away for a few minutes, and when I came back the PC was showing a black screen. Since then it only boots to the UEFI configuration screen, with no boot options.



The PC has not endured any shock or physical damage. At this point I suspect a botched Windows update or a drive failure, either in software or in hardware.



In a nutshell



One of the SSDs is no longer detected, which makes the whole RAID 0 volume invalid. The most relevant error from dmesg is failed to IDENTIFY (I/O error, err_mask=0x4).
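
For reference, the err_mask value can be decoded bit by bit. The sketch below uses the bit positions of `enum ata_completion_errors` from the Linux kernel header `include/linux/libata.h`; the helper names `decode_err_mask` and `err_masks_in` are made up for this example. The `cmd 0xec` in the accompanying "qc timeout" lines is the ATA opcode for IDENTIFY DEVICE.

```python
import re

# Bit positions as in enum ata_completion_errors (include/linux/libata.h).
AC_ERR_NAMES = [
    "AC_ERR_DEV",         # bit 0: device reported error
    "AC_ERR_HSM",         # bit 1: host state machine violation
    "AC_ERR_TIMEOUT",     # bit 2: timeout
    "AC_ERR_MEDIA",       # bit 3: media error
    "AC_ERR_ATA_BUS",     # bit 4: ATA bus error
    "AC_ERR_HOST_BUS",    # bit 5: host bus error
    "AC_ERR_SYSTEM",      # bit 6: system error
    "AC_ERR_INVALID",     # bit 7: invalid argument
    "AC_ERR_OTHER",       # bit 8: unknown
    "AC_ERR_NODEV_HINT",  # bit 9: polling device detection hint
    "AC_ERR_NCQ",         # bit 10: marker for offending NCQ command
]


def decode_err_mask(mask):
    """Return the names of all error bits set in a libata err_mask value."""
    return [name for bit, name in enumerate(AC_ERR_NAMES) if mask & (1 << bit)]


def err_masks_in(line):
    """Extract every err_mask=0x... value from one dmesg line as integers."""
    return [int(text, 16) for text in re.findall(r"err_mask=0x([0-9a-fA-F]+)", line)]


if __name__ == "__main__":
    line = "[ 9.013356] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)"
    for mask in err_masks_in(line):
        print(hex(mask), decode_err_mask(mask))
```

Read this way, 0x4 only says that the drive did not answer IDENTIFY within the timeout; by itself it does not say which component on the SSD failed.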



What is the problem? Is it a physical failure? What is the most likely component to fail? I would be curious to know which electronic component failed in that case.



How would a data recovery company proceed to recover the data? Would they replace the SSD controller? Would they look for a dead resistor?





All the details follow:



Investigation




  • the computer takes 120 seconds to display the UEFI configuration screen

  • there are no boot options available from the UEFI configuration screen


  • one SSD is functional (but it's only half of the RAID 0!):





    • it is detected when booting from a Linux USB stick



      > dmesg|grep ata2
      [ 3.590698] ata2: SATA max UDMA/133 abar m2048@0xf7d22000 port 0xf7d22180 irq 43
      [ 51.454606] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      [ 51.455389] ata2.00: ACPI cmd ef/10:09:00:00:00:b0 (SET FEATURES) succeeded
      [ 51.456504] ata2.00: ATA-8: SanDisk SD6SP1M256G1102, X231302, max UDMA/133
      [ 51.456510] ata2.00: 500118192 sectors, multi 1: LBA48 NCQ (depth 31/32), AA
      [ 51.457752] ata2.00: ACPI cmd ef/10:09:00:00:00:b0 (SET FEATURES) succeeded
      [ 51.459283] ata2.00: configured for UDMA/133


    • when the SSD is by itself, the PC starts immediately without any problem


    • when the SSD is by itself, it is correctly detected by the UEFI configuration




[image: ssd-working-uefi]





  • one SSD is not functional:





    • it is NOT detected when booting from a Linux USB stick



      > dmesg|grep ata1
      [ 3.590697] ata1: SATA max UDMA/133 abar m2048@0xf7d22000 port 0xf7d22100 irq 43
      [ 3.904513] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      [ 9.013343] ata1.00: qc timeout (cmd 0xec)
      [ 9.013356] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
      [ 9.327983] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
      [ 19.466671] ata1.00: qc timeout (cmd 0xec)
      [ 19.466683] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
      [ 19.466690] ata1: limiting SATA link speed to 3.0 Gbps
      [ 19.781305] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
      [ 50.826666] ata1.00: qc timeout (cmd 0xec)
      [ 50.826678] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
      [ 51.141298] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)


    • when the SSD is by itself, the PC starts slowly


    • when the SSD is by itself, it is incorrectly detected by the UEFI configuration




[image: ssd-not-working-uefi]




  • both SATA ports are OK: I tried the functional SSD on each port and it was correctly and quickly detected.

  • when both SSDs are present, the UEFI configuration screen shows both disks. This last point puzzles me: the PC seems to know that there are two SSDs, but it times out when trying to reach one of them.


[image: both-ssd]




  • neither SSD shows any visible damage


[image: ssd-1] [image: ssd-2]
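
One possible reading of the "both disks are shown, but one times out" observation comes from the SStatus and SControl values that the kernel prints in hexadecimal on the "SATA link up" lines. The sketch below decodes them using the field layout of the SATA status and control registers (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8); the function names are made up for this example.

```python
DET = {
    0x0: "no device detected",
    0x1: "device present, PHY communication not established",
    0x3: "device present, PHY communication established",
    0x4: "PHY offline",
}
SPD = {0x0: "no speed negotiated", 0x1: "Gen1 (1.5 Gbps)", 0x2: "Gen2 (3.0 Gbps)", 0x3: "Gen3 (6.0 Gbps)"}
IPM = {0x0: "not present", 0x1: "active", 0x2: "partial", 0x6: "slumber"}
SPD_LIMIT = {0x0: "no limit", 0x1: "limit to Gen1", 0x2: "limit to Gen2", 0x3: "limit to Gen3"}


def decode_sstatus(hex_text):
    """Split an SStatus value, given as the hex text printed by the kernel, into its fields."""
    value = int(hex_text, 16)
    return {
        "det": DET.get(value & 0xF, "reserved"),
        "spd": SPD.get((value >> 4) & 0xF, "reserved"),
        "ipm": IPM.get((value >> 8) & 0xF, "reserved"),
    }


def decode_scontrol_speed(hex_text):
    """Return the speed limit requested by the host in an SControl value (hex text)."""
    value = int(hex_text, 16)
    return SPD_LIMIT.get((value >> 4) & 0xF, "reserved")


if __name__ == "__main__":
    for sstatus, scontrol in (("133", "300"), ("123", "320")):
        print(sstatus, decode_sstatus(sstatus), "| host:", decode_scontrol_speed(scontrol))
```

In both states the failing drive still reports an established link, which would be consistent with the firmware listing it: the link comes up, but the IDENTIFY command sent over that link is never answered.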



Additional info (only showing relevant part):



> blkid
/dev/sdb: TYPE="isw_raid_member"

> lsscsi -L
[1:0:0:0] disk ATA SanDisk SD6SP1M2 302 /dev/sdb
device_blocked=0
iocounterbits=32
iodone_cnt=0x6d
ioerr_cnt=0x2
iorequest_cnt=0x6d
queue_depth=31
queue_type=simple
scsi_level=6
state=running
timeout=30
type=0

> smartctl -iA /dev/sdb
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.14.15-1-ARCH] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: SanDisk SD6SP1M256G1102
Serial Number: 141196400698
LU WWN Device Id: 5 001b44 beb8b143a
Firmware Version: X231302
User Capacity: 256,060,514,304 bytes [256 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: Unknown (0x0010)
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS T13/1699-D revision 6
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Jul 22 03:01:37 2018 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 4
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0032 100 100 --- Old_age Always - 0
9 Power_On_Hours 0x0032 253 100 --- Old_age Always - 3184
12 Power_Cycle_Count 0x0032 100 100 --- Old_age Always - 16004
166 Unknown_Attribute 0x0032 100 100 --- Old_age Always - 1
167 Unknown_Attribute 0x0032 100 100 --- Old_age Always - 19
168 Unknown_Attribute 0x0032 100 100 --- Old_age Always - 117
169 Unknown_Attribute 0x0032 100 100 --- Old_age Always - 379
171 Unknown_Attribute 0x0032 100 100 --- Old_age Always - 0
172 Unknown_Attribute 0x0032 100 100 --- Old_age Always - 0
173 Unknown_Attribute 0x0032 100 100 --- Old_age Always - 27
174 Unknown_Attribute 0x0032 100 100 --- Old_age Always - 39
187 Reported_Uncorrect 0x0032 100 100 --- Old_age Always - 0
194 Temperature_Celsius 0x0022 058 047 --- Old_age Always - 42 (Min/Max 18/47)
212 Unknown_Attribute 0x0032 100 100 --- Old_age Always - 0
230 Unknown_SSD_Attribute 0x0032 100 100 --- Old_age Always - 90
232 Available_Reservd_Space 0x0033 100 100 004 Pre-fail Always - 100
233 Media_Wearout_Indicator 0x0032 100 100 --- Old_age Always - 7187
241 Total_LBAs_Written 0x0030 253 253 --- Old_age Offline - 1266
242 Total_LBAs_Read 0x0030 253 253 --- Old_age Offline - 1203
243 Unknown_Attribute 0x0032 100 100 --- Old_age Always - 0
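
As a quick cross-check of the output above, here is a small sketch that parses a few of the attribute rows and verifies that the sector count from dmesg matches the capacity reported by smartctl. The function names are made up for this example, and because the drive is not in the smartctl database, the attribute names are only smartctl's generic guesses.

```python
# A few rows copied from the smartctl -A output above.
SMART_ROWS = """\
5 Reallocated_Sector_Ct 0x0032 100 100 --- Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 --- Old_age Always - 0
194 Temperature_Celsius 0x0022 058 047 --- Old_age Always - 42 (Min/Max 18/47)
232 Available_Reservd_Space 0x0033 100 100 004 Pre-fail Always - 100
"""


def parse_attributes(text):
    """Parse smartctl -A attribute rows into a dict keyed by attribute ID."""
    rows = {}
    for line in text.splitlines():
        # Ten columns; the last one (RAW_VALUE) may itself contain spaces.
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        rows[int(fields[0])] = {
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "raw": fields[9],
        }
    return rows


def capacity_bytes(sectors, sector_size):
    """Capacity implied by a sector count and a logical sector size."""
    return sectors * sector_size


if __name__ == "__main__":
    attrs = parse_attributes(SMART_ROWS)
    print(attrs[5]["raw"], attrs[187]["raw"])
    # 500118192 sectors (dmesg, ata2.00) at 512 bytes per sector (smartctl).
    print(capacity_bytes(500118192, 512) == 256060514304)
```

The surviving drive reports no reallocated or uncorrectable sectors, and its sector count agrees with the stated 256,060,514,304 bytes.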


Other Linux commands such as dmidecode, fdisk, lsblk and lspci did not provide more relevant information.



NB: I found some related questions, such as "Failure of 1 SSD in Raid-0 that was bootdrive stopping computer from booting" and "How to fix missing RAID1 drive". However, I was not able to access the RAID configuration screen at startup.



If possible, I would like to retrieve the data from those disks. At this point, I'm not interested in wiping the data and turning the remaining disk into a single disk. I will eventually contact a data recovery company, but I would like to know what the problem is and whether there is anything I can do.



Please refer to the "In a nutshell" section for the question.










  • One of your SSDs has failed catastrophically (what you describe is a typical SSD failure). You recover from backup or pay $lots$ to a specialist recovery firm. Poking around with data recovery tools on your PC won't get the data off.

    – davidgo
    Jul 22 '18 at 9:29











  • Ok I'm fine with that conclusion, but how would a data recovery company proceed to recover the data? What is the most likely component to fail? Would they replace the SSD controller? Would they look for a dead resistor? (I've updated the question)

    – JBE
    Jul 22 '18 at 19:55
















hard-drive boot ssd raid






edited Jul 22 '18 at 19:56

asked Jul 22 '18 at 3:50
JBE












