Monday 15 April 2024

Either Excel or Time Machine or Apple Desktop Broke my Spreadsheets and then fixed one of them



This is scary - where's my data ?

*** This is a descriptive story of a data corruption event; as of April 2024 it does not contain a definitive root cause analysis of the problem described. There is no fix here for the "data corruption" that appears. If you are having a similar problem, I suggest you get on to Apple support. ***

Background

The storage on modern computers allows us to save many thousands of files adding up to terabytes of data. However, it has been my experience, working in both IT support and generally on computers over the years, that only a few files are really, really important to people and they're often not that big. I certainly fit within that category, as two of my most important files are Excel spreadsheets that together weigh just over a megabyte. I've had these two spreadsheets for at least seven or eight years, working across a variety of machines. I have updated the machine from a Mac PowerBook to a new Mac mini to a new Mac Studio, and I've also updated Microsoft Office when requested, from 2007 to the latest versions.

These spreadsheets have stood the test of time quite well, although I have occasionally had to recover them when they became corrupted - I kind of assumed that the corruption was caused by Excel, but now I see there may have been another cause. Such corruption has not occurred in quite a while, and prior to this year the Mac mini and these spreadsheets had been stable for at least 18 months.

On about the 2nd of April I moved from a Mac mini M1 to a Mac Studio M2. Using the Time Machine facility to restore my backups, the entire environment moved smoothly from the Mac mini to the Mac Studio. There seemed to be very few glitches or errors, though I did notice that Excel was not able to open files that it had recorded in its recent history list, even though, according to the file names alongside the entries, the files did exist. Double clicking to open files from the Finder worked as expected. The suggested remedy for this was to delete the cache file in ~/library/blah blah blah.

Technically how's the data stored ?

These files are important to me and they have a couple of layers of password encryption.

There are two Excel documents in the format .XLSX. One is called DoshNSav_piv and the other DoshNsav_Trackers. They are both multi sheet workbooks and have been saved in the Microsoft Excel password format. I have worked on these spreadsheets over many years and they contain pivot tables but no macro code.

Those two Excel files are held within a re-mountable .sparseimage disc image, called FinanceDocs.sparseimage, that also has a password.
The disc image is held in the normal Mac file system. I double click it when I want to use the files, and it is then mounted as though it were on external media. When finished, I unmount the disc image. This is known as encryption at rest, but it's fairly basic, created using built-in utilities as provided by the standard Apple application Disk Utility.


Time Machine 

Data needs to be backed up, and as a long time user of Time Machine I rely on it for backup services. It's not the only backup I do, but these are automatic and regularly saved to an external NAS storage box. Time Machine does a synthetic backup, creating a new disc image that contains the entire contents of the drive based on the previous entire contents of the drive. In essence it just saves the data that belongs to new or changed files but retains pointers to the data for older files. When restoring data the backups can be accessed either through a GUI mechanism, where you look back through previous folder listings, or by browsing the historical disk images. The crucial mechanism is that pulling a file from one of the backups should return the entire contents of that file and its attributes exactly as they were when the backup was run.

Here is a listing from the time machine backups showing each backup with date_timestamp
 

Opening one of the backup images shows the files backed up. The disk image file containing the spreadsheets is listed.




So what went wrong  ?

I decided to open the spreadsheets to do a couple of updates, in the same way that I have done over many, many years. After mounting the disk image, the spreadsheets would not open, giving the error message shown at the top of this page. This was ugly. According to the Finder listings they both showed the type Microsoft Excel workbook (.XLSX). I tried changing the file type (on a copy) to something else and back again, but to no avail; if anything it just broke the file contents even more. I tried opening them on another machine with a similar configuration, but that also failed with the same error message. This suggests that the spreadsheets themselves are corrupted. The fact that they would have been closed correctly (no machine crashes to speak of) points towards data corruption.


What should be seen when opening a password encrypted Excel document. Top version when open in Excel, bottom version when using finder "Space bar to sample" view.



What's in the backups ?

As the contents of the Mac Studio were entirely ported over from the previous Mac mini using the Time Machine protocol for starting a new Apple machine, I thought maybe there had been some trouble during that conversion. I had a collection of backups of the previous machine as well as backups from this machine. Restoring the disk images from previous backups worked reliably, but when extracting the Excel spreadsheets from the disc image, they failed to open. There were two sets of backups: one from the previous Mac mini that ran up to the end of March this year and the other from the new Mac Studio. I extracted a set of the disc images from various backups, rolling back over time. I then extracted the Excel sheets from those disc images to get the listing as follows.



Clicking on each of the Excel spreadsheets quickly showed which ones worked and which ones didn't. It was entirely unclear on first glance as to why some of them would open and some didn't. For some reason some old skills kicked back in and I thought I'd try the UNIX command "file" to see if there was any difference in the file typing. 


clive@BBComp Special measures % file */*

2024-0105-225736/DoshNsav_Trackers_piv.xlsx:                                 CDFV2 Encrypted
2024-0105-225736/DoshNsav_pivZ.xlsx:                                         Composite Document File V2 Document, Cannot read short stream
2024-0105-225736/FinanceDocs_2024-0105-225736.sparseimage:                   data

2024-0201-071633/DoshNsav_Trackers_piv.xlsx:                                 CDFV2 Encrypted
2024-0201-071633/DoshNsav_pivZ.xlsx:                                         CDFV2 Encrypted
2024-0201-071633/FinanceDocs_2024-0201.sparseimage:                          data

2024-0303-085151/DoshNsav_Trackers_piv-2024-0303-08515_OK.xlsx:              Microsoft Excel 2007+
2024-0303-085151/DoshNsav_Trackers_piv.xlsx:                                 CDFV2 Encrypted
2024-0303-085151/DoshNsav_pivZ.xlsx:                                         CDFV2 Encrypted
2024-0303-085151/FinanceDocs_2024-0303-08515.sparseimage:                    data

2024-0314-153641/DoshNsav_Trackers_piv.xlsx:                                 CDFV2 Encrypted
2024-0314-153641/DoshNsav_pivZ.xlsx:                                         Apple Desktop Services Store
2024-0314-153641/FinanceDocs-2024-0314-153641.sparseimage:                   data

2024-0320-065313/DoshNsav_Trackers_piv.xlsx:                                 CDFV2 Encrypted
2024-0320-065313/DoshNsav_pivZ.xlsx:                                         Apple Desktop Services Store
2024-0320-065313/FinanceDocs_2024-0320-065313.sparseimage:                   data

2024-0324-011945/DoshNsav_Trackers_piv.xlsx:                                 Apple Desktop Services Store
2024-0324-011945/DoshNsav_pivZ.xlsx:                                         Apple Desktop Services Store
2024-0324-011945/FinanceDocs_2024-0324-011945.sparseimage:                   data

2024-0327-134121/DoshNsav_Trackers_piv.xlsx:                                 Apple Desktop Services Store
2024-0327-134121/DoshNsav_pivZ.xlsx:                                         CDFV2 Encrypted
2024-0327-134121/FinanceDocs-2024-0327-134121.sparseimage:                   data

2024-0331-214023/DoshNsav_Trackers_piv.xlsx:                                 Apple Desktop Services Store
2024-0331-214023/DoshNsav_pivZ.xlsx:                                         CDFV2 Encrypted
2024-0331-214023/FinanceDocs_2024-0331-214023.sparseimage:                   data

2024-0401-211118_MMlast/DoshNsav_Trackers_piv.xlsx:                          Apple Desktop Services Store
2024-0401-211118_MMlast/DoshNsav_pivZ.xlsx:                                  CDFV2 Encrypted
2024-0401-211118_MMlast/FinanceDocs-2024-0401-211118.sparseimage:            data

2024-0402-082024_BBCFirst/DoshNsav_Trackers_piv.xlsx:                        Apple Desktop Services Store
2024-0402-082024_BBCFirst/DoshNsav_pivZ.xlsx:                                CDFV2 Encrypted
2024-0402-082024_BBCFirst/FinanceDocs_2404-0402-082024_BBCFIRST.sparseimage: data

2024-0415-000021/DoshNsav_Trackers_piv.xlsx:                                 Apple Desktop Services Store
2024-0415-000021/DoshNsav_pivZ.xlsx:                                         CDFV2 Encrypted
2024-0415-000021/FinanceDocs_2024-0415-000021.sparseimage:                   data



These results were very useful. It turns out that where the file is of the type "CDFV2 Encrypted" the file opens correctly, and where it is marked "Apple Desktop Services Store" opening fails. During the entire historical usage of the spreadsheets I have never knowingly changed their type. Each had always been an encrypted Excel spreadsheet, saved with a password.

What is very peculiar, and highlighted above, is how the files change type and then back again. See the lines marked in yellow and red. In the yellow lines, the file DoshNsav_pivZ changes type between backups and then, in the red lines, back again. Meanwhile the _Trackers file changes to Apple Desktop Services Store but not back again, and remains inoperable. The file with type "Microsoft Excel 2007+" is a version saved without a password.
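
The file command reaches its verdict by looking at the first bytes of each file rather than at the .xlsx extension. As a rough sketch of the same idea in Python (this is not how file itself is implemented), the snippet below checks a file's leading bytes against three signatures that I am assuming from the commonly documented formats: the OLE compound file header that a password-protected workbook is wrapped in, the ZIP header of a plain .xlsx, and the .DS_Store header, which carries Bud1 at byte offset 4. The names classify_header and classify_file are made up for illustration.

```python
# Leading-byte signatures, assumed from the commonly documented file formats.
SIGNATURES = [
    (bytes.fromhex("D0CF11E0A1B11AE1"), "OLE compound file (CDFV2)"),
    (b"PK\x03\x04", "ZIP container (plain .xlsx)"),
    (b"\x00\x00\x00\x01Bud1", "Apple Desktop Services Store"),
]


def classify_header(data: bytes) -> str:
    """Return a label for the first signature that matches the leading bytes."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def classify_file(path) -> str:
    """Read only the first 16 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return classify_header(handle.read(16))
```

Calling classify_file on each extracted workbook should give the same split as the listing above, with the caveat that this sketch only knows about three file types.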




I first thought that the Excel files had lost the marker that indicates they are encrypted. Excel would then try to open them as plain files and obviously fail some kind of structural integrity test. I could not think of any explanation as to why this marker would change, and/or change back, during the process of doing a backup. This is a data integrity issue, as files and their associated attributes should not be changed through the process of doing a backup and restore. There may have been a macOS or Microsoft update in that time span.

Looking at the first readable data in each of the files shows as follows.
For a readable Excel file :

BBComp  % strings 2024-0402-082024_BBCFirst/DoshNsav_pivZ.xlsx| head

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<encryption xmlns="http://schemas.microsoft.com/office/2006/encryption" xmlns:p="http://schemas.microsoft.com/office/2006/keyEncryptor/password" xmlns:c="http://schemas.microsoft.com/office/2006/keyEncryptor/certificate"><keyData saltSize="16" blockSize="16" keyBits="128" hashSize="20" cipherAlgorithm="AES" cipherChaining="ChainingModeCBC" hashAlgorithm="SHA1" saltValue="cfDF3XFob

T&,9

"&Z#

ae+B

'd/#

[)ps

Yjx5<

Y!

  b

1Bi2


For one of the broken files:


BBComp % strings 2024-0402-082024_BBCFirst/DoshNsav_Trackers_piv.xlsx| head 

Bud1

nIlocblob

gIlocblob

smodDdutc

sdsclbool

slsvCblob

Bbplist00

WXWY

    [XiconSize_

showIconPreview_

calculateAllSizesWcolumns_

BBComp % 



Further investigations

Running "Disk First Aid" (known as fsck in Unix circles) on the disk image did give some indication that there may be a problem with the disk image container.


Disk First Aid showing errors within the directory structure that were repaired 

Disk First aid - a clean run


But this has not fixed the broken files.

 % file /Volumes/PoundsDocsClive/Dosh*
/Volumes/PoundsDocsClive/DoshNsav_GameOver.xlsx:      CDFV2 Encrypted
/Volumes/PoundsDocsClive/DoshNsav_May_2017_Copy.xlsx: CDFV2 Encrypted
/Volumes/PoundsDocsClive/DoshNsav_Trackers_piv.xlsx:  Apple Desktop Services Store
/Volumes/PoundsDocsClive/DoshNsav_old.xlsx:           CDFV2 Encrypted
/Volumes/PoundsDocsClive/DoshNsav_pivZ.xlsx:          Apple Desktop Services Store
/Volumes/PoundsDocsClive/Dosh_Nasdaq_Pivot.xlsx:      Microsoft Excel 2007+



But the damage may have been done by then. This is a very specific use case: opening an extendable volume, changing one file and closing it again.
Such a working practice would involve extending the volume, but not by very much, and redoing the encryption. It's my understanding that Time Machine then has to back up the whole of the disc image because part of it has changed. Time Machine works at the file level, not at the block level.

Anyway, it's all rather annoying, but I think I might have been ignoring this for a while, as you can see from the names of the files. Occasionally I've had to rebuild these two spreadsheets, probably because of some minor corruption noted.

A quick look to see if I could spot any corruption within the Excel spreadsheet proved fruitful. Using the side-by-side graphical compare program provided with the Xcode kit, with the spreadsheet on the left side and the .DS_Store on the right hand side, we can see that the spreadsheet has been overwritten with the first part of the store. The magic number for a .DS_Store is the string Bud1pp, which can be seen at the start of both files. This would also explain why the corrupted files are described as "Apple Desktop Services Store". This is a classic case of file corruption where the contents of one file appear within the contents of another.
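
The side-by-side comparison can also be put into numbers. As a minimal sketch, assuming you have a corrupted workbook and the .DS_Store from the same volume to hand, the Python below counts how many leading bytes the two files share, which gives a rough idea of how much of the spreadsheet was overwritten. The function names are hypothetical, and only the first megabyte of each file is compared by default.

```python
def common_prefix_length(a: bytes, b: bytes) -> int:
    """Count how many leading bytes are identical in a and b."""
    limit = min(len(a), len(b))
    for index in range(limit):
        if a[index] != b[index]:
            return index
    return limit


def overwritten_prefix(suspect_path, ds_store_path, chunk=1 << 20) -> int:
    """Compare the first `chunk` bytes of two files on disk.

    The result is capped at `chunk`, so a return value equal to `chunk`
    means "at least this many bytes match", not an exact length.
    """
    with open(suspect_path, "rb") as suspect, open(ds_store_path, "rb") as store:
        return common_prefix_length(suspect.read(chunk), store.read(chunk))
```

A result of zero would mean the two files differ from the very first byte, which would not fit the overwrite theory.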

The mechanism of how this occurred remains to be determined.



Who needs to fix this ?


My original conclusion, before I discovered that the corruption was data from the .DS_Store file, was that the fault fell somewhere between Apple and Microsoft. I have now concluded that this is a problem with the handling of disk images in the Apple infrastructure.

That's an interesting question, and during my career in tech support I have been caught between vendors, each of which says the problem belongs to the other one. For me there needs to be some kind of mechanism to see and/or change the type category of the file, to ensure that Excel will open them as an encrypted/password file if they are an encrypted/password file. I understand that Microsoft has used a number of different encryption schemes for their spreadsheets. Somehow coordination has been lost with the file type handling in Time Machine.

Personally I would love to help, but I bet if I phoned up either tech support team they would say please send over your files and the passwords used and, to be honest, because this is personal financial information, that's not gonna happen. The other way is to re-create the problem, but I'm sure it would take quite a while to find the edge condition in historical backups that is causing this problem.

If you think you see this same problem, feel free to use the strings | head -10 command in the Terminal app to see if you have the same symptoms: Bud1pp showing in the corrupted file. Then put in a support case with Apple. Let me know in the comments below if you've seen this problem, or if you have a fix or more information about it.
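
If you have a lot of spreadsheets to check, the same test can be automated. The following is only a sketch under my own assumptions: it walks a folder tree, reads the first eight bytes of every .xlsx file, and reports those that begin with the .DS_Store header (four bytes 00 00 00 01 followed by Bud1, as I understand the format). The function names are hypothetical, and unreadable files are simply skipped.

```python
import os

# Assumed .DS_Store header: 00 00 00 01 followed by the ASCII text "Bud1".
DS_STORE_MAGIC = b"\x00\x00\x00\x01Bud1"


def looks_like_ds_store(header: bytes) -> bool:
    """True when the leading bytes match the assumed .DS_Store header."""
    return header.startswith(DS_STORE_MAGIC)


def find_overwritten_workbooks(root) -> list:
    """Return sorted paths of .xlsx files under root that start like a .DS_Store."""
    hits = []
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            if not name.lower().endswith(".xlsx"):
                continue
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as handle:
                    header = handle.read(len(DS_STORE_MAGIC))
            except OSError:
                continue  # unreadable file: skip rather than stop the scan
            if looks_like_ds_store(header):
                hits.append(path)
    return sorted(hits)
```

Pointing find_overwritten_workbooks at a mounted disk image, or at a folder of files pulled from backups, should list any workbook showing this particular symptom. It says nothing about other kinds of damage.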

In Summary
  • This is a classic case of data corruption at the file level. The data at the start of the Excel file has been overwritten by data that should be in the .DS_Store file.
  • It's very hard to spot when such data corruption has occurred as there is no externally visible marker (except in this case the Unix file type). The problem only becomes apparent when the Excel file is opened.
  • I suspect, but cannot prove, that the file did not self repair or become unbroken, but that I recovered it from backup after sensing some corruption within the file.
  • Some file types are identified by the dot and three or four letters on the end; others are identified by data markers within the file. Excel appears to use a combination of both to identify whether a file has password encryption or not. In this case, the corrupted contents of a spreadsheet prevented it from being opened.
  • What's needed now is a script that reliably recreates the issue.
  • I logged this as a support case with Apple, but unless I can recreate the issue easily there is not much chance it will progress, given that it occurred on the previous machine.
  • AAAARGH lucky I have backups and know how to use them.
  • Read more about .DS_Store files and some of the problems they cause here.  
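
On the point above about there being no visible marker when corruption occurs: one generic way to notice a silent change is to record a checksum of each important file and compare it later. The sketch below is not something that was run during this investigation; it is a minimal illustration using Python's standard hashlib, with hypothetical function names. It can only tell you that a file changed, not why, so a file you edited on purpose will be flagged too.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def changed_files(previous: dict, current: dict) -> list:
    """Return paths present in both {path: digest} snapshots whose digests differ."""
    return sorted(
        path
        for path in previous
        if path in current and previous[path] != current[path]
    )
```

Taking a snapshot of digests just after closing the spreadsheets, and another just before the next edit, would show whether anything changed while the files were supposedly untouched.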




















Sunday 7 April 2024

And the house top computer performance badge is passed

The house badge of top computer performance passed to a new machine today.

Back in 2007 I put together a PC that would match the performance of a Cray T90 supercomputer. It needed to be able to do more than 30 gigaflops and have over 20GB of memory. It turned out to be a great machine with a water cooled 12 processor CPU, 3 * 9TB HDD for lots of storage and an Nvidia graphics card for evaluating the CUDA programming environment.

The previous hand built machine was used for Folding@home, CUDA programming research, some gaming and other computational heavy lifting (more notes here).

 

Now upgraded to ....



The new machine is an M2 Studio with 2TB storage and 96GB memory, and it now proudly wears the house top computer badge.

   

Using a simple single thread benchmark the Studio is about 40 times faster than the previous machine, taking just 9s to deliver results that the older machine chewed on for over 6 minutes. The Studio is also silent and draws very little power. The computational test is ...

% time echo "6^7^8" | bc -l | cksum
2486905450 4617825


The top spec Studio


It's not just the computational performance that makes this machine outstanding. Disc speed to the internal SSD measures at a whopping 6.5 GB a second, compared with 300 MB a second to a directly attached SSD and 100 MB a second over gigabit to a NAS box of storage. As the Mac Studio has an absurdly large amount of RAM, it's probably buffering the internal IO to achieve these phenomenal speeds.


Mac Studio
USB 3 Attached SSD

Network attached storage



Monday 22 May 2023

Puzzle solving entries have moved to a new blog.

The puzzle solving entries of this blog have moved to https://puzzlesolvingsmtstyle.blogspot.com/

Check out how to reframe common puzzles into the language of an SMT solver.

For example:






 




Wednesday 26 April 2023

Digital security for your classic or performance car

Digital security for your car and/or motorbike

We all love our classic and performance cars and would be heartbroken if one was lost or stolen. As well as the standard arrangements of immobilisers and car alarms, which can be tricky to fit to classic wiring, there are a couple of new techniques for protecting your classic car or motorbike. Some initial technology is required, as both these techniques work in combination with an Android phone or Apple iPhone.

Active CCTV

CCTV for your house and garage has come a long way in the last 10 years. The cameras are much more intelligent about recognising when cars or people move within the area of view. If your garage is powered from the house, you can almost certainly fit CCTV cameras that will constantly monitor the garage inside and out. Cameras can be configured to alert your phone when there is movement within your garage. Unlike traditional alarm systems, an ongoing subscription is not required, as monitoring and alerting is done directly to your mobile phone.
For garages beyond the reach of power and Wi-Fi, self sufficient cameras are available that will send the pictures using a mobile phone signal. Some charges would apply in this case. Separate boxes for recording are also redundant, as cameras can store recordings of interest on an included memory chip. The modern camera types are no longer just passive recording or scanning devices and are available with two way audio and siren trigger. Typically, the price for a powered network enabled camera is about £70.

Reolink battery powered camera with two way audio and siren trigger

Tracking Tags

The second interesting development to keep your car safe and located is the small tracking tag, such as the Apple AirTag. A small tag about the size of a £2 coin is secreted somewhere on the car. When the car moves beyond defined limits, alerts are sent to your phone. The tags provide occasional updates as to the location of the car, which can then be tracked and located. These AirTags have a replaceable battery that lasts about a year. The tags work by using the same technology that is used to find lost phones. Whilst they do not have specific GPS receivers, they send their location back to base using any compatible phone that comes into range. As there are over 1 billion mobile phones on the planet these days, you're never that far away from a mobile phone.
These tags are great for knowing the last known time and location of a car. There are a couple of little wrinkles when using AirTags that are separated from the phone for longer periods of time; however, this type of device can provide an extra layer of security and a means of recovery for a classic car. The cost for an Apple AirTag is about £35 each or £120 for a four pack, with no ongoing subscription apart from what's needed to keep your mobile phone going.
The danger of thieves and stalkers planting these tracking tags has been mitigated within the “Find My” tracking system, but everyone should be aware of their existence.

Apple AirTag, about 1 inch / 25mm in Diameter

  

Other more active trackers with live GPS locations are available but come with an ongoing service subscription cost. 





SuperToy waits for the summer driving fun to begin.

Monday 10 April 2023

Cray Research Supercomputer mini videos and other exhibits

During a visit to The Computer Museum of America (TCMoA), some time was spent generating mini videos of the Cray Research supercomputers on display. Check out the collection here.

A couple of other exhibits from the Cray-History.net website follow....

Cray Research Supercomputer family tree