Re: 350+ scans of original upholstery-carpet-top cards

Posted by BH On 2011/2/27 13:14:19
A little update on my progress - boring stuff, but you can't build a great house without a solid foundation.

After skimming titles from the Search results to begin filling in a spreadsheet for analysis, I initially came up with only 362 entries (not counting one non-applicable photo that happened to contain the string 'set'). Oddly, that figure falls short of the 375 pages in the original PDF collection.

Yet, while transcribing the gallery ID numbers from the search results, page-by-page, I noticed some gaps in their sequence. So, I made a point of noting the skipped ID numbers, and after splicing them into the browser URL, one-by-one, I picked up another 30 titles. Turns out that each was missing the term "Set". However, this drove the total over the top to 392, which was puzzling.
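
For anyone who would rather script the gap check than eyeball it, here is a minimal sketch. The ID values are made up for illustration, and the function name `missing_ids` is my own; it only assumes that gallery IDs are whole numbers that should run consecutively within a batch.

```python
def missing_ids(seen_ids):
    """Return the ID numbers skipped between the lowest and highest IDs seen."""
    seen = set(seen_ids)
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


# Hypothetical gallery IDs, for illustration only (not the real ones)
ids = [101, 102, 104, 105, 108]
print(missing_ids(ids))  # [103, 106, 107]
```

Each number it returns is a candidate to splice into the browser URL and check by hand.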

So, I copied the titles in my spreadsheet to a second column, stripped each one down to just the part number (P/N), and then re-sorted the table in P/N order. That revealed over two dozen genuinely duplicated entries. I don't know whether that was the result of overlapping work from two people on one project or the reported problem with an FTP upload session, but we'll work that out later. I left those entries in place in my spreadsheet, but deducting the count of duplicate entries dropped the total below 375.
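
The same strip-and-sort check could be done in a few lines of code. This is only a sketch under assumptions: the titles below are invented, and I am assuming a P/N shows up as a run of six digits somewhere in the title, which may not hold for every card. The names `part_number` and `duplicated_pns` are hypothetical helpers, not anything from the Photo Archive.

```python
import re
from collections import Counter

# Assumption: a P/N is a run of six digits; adjust the pattern to the real format
PN_PATTERN = re.compile(r"\d{6}")


def part_number(title):
    """Pull the first six-digit run out of a title, or None if there is none."""
    match = PN_PATTERN.search(title)
    return match.group(0) if match else None


def duplicated_pns(titles):
    """Map each P/N that appears more than once to its number of occurrences."""
    counts = Counter(pn for pn in map(part_number, titles) if pn)
    return {pn: n for pn, n in sorted(counts.items()) if n > 1}


# Invented titles, for illustration only
titles = [
    "Carpet Set 123456",
    "Upholstery 234567",
    "Carpet 123456",
    "Top Set 345678",
]
print(duplicated_pns(titles))  # {'123456': 2}
```

Subtracting one for each extra occurrence gives the count of unique entries, which is the figure to compare against the page count of the PDF collection.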

Next, I compared my spreadsheet to a list that I had composed a few weeks ago, identifying P/Ns by page number in the PDF collection, which turned up another 11 missing P/Ns. I went back to the Photo Archive and searched by part number to pick up those titles. Again, each was missing the word "Set", but the gaps in those gallery ID numbers had not been apparent from my original Search results. Understand that while the numbering would be continuous within any given batch upload, the Search results were in alphabetical order, which chopped things up. I might have found these additional gaps if I had sorted my spreadsheet by gallery ID number, but hindsight is always 20/20.
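
Cross-checking the two lists boils down to a set difference. A minimal sketch, with invented P/Ns and a hypothetical function name; it assumes both lists hold P/Ns written the same way (same digits, no stray spaces):

```python
def missing_from_spreadsheet(pdf_pns, sheet_pns):
    """Return P/Ns that appear in the PDF page list but not in the spreadsheet."""
    return sorted(set(pdf_pns) - set(sheet_pns))


# Invented P/Ns, for illustration only
pdf_list = ["123456", "234567", "345678", "456789"]
spreadsheet = ["123456", "345678", "345678"]
print(missing_from_spreadsheet(pdf_list, spreadsheet))  # ['234567', '456789']
```

Swapping the two arguments shows the reverse case: entries in the spreadsheet that have no page in the PDF list.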

Further inspection revealed a few more duplicates and even one set of triplicates, and I found a couple of digits transposed in two P/Ns. I also found a few missing images, but two of them appear to be redundant in the original PDF set. I still can't get the bottom line to balance out exactly, but it looks like we're within +/-1.

This week, I'll begin auditing the titles for any typos or inconsistencies and look into the parts books for expanded model coverage. However, working line-by-line, that's gonna take some time.

This Post was from: https://packardinfo.com/xoops/html/modules/newbb/viewtopic.php?post_id=71582