Open Data Dive 2019 follow-up: fault categories and fault types

When and why

In March 2019, the Restart Project’s first Open Data Dive was held in London on Open Data Day. Volunteers got stuck into the task of analysing all of the Fixometer data in order to determine ‘fault types’ and ‘fault categories’.

The aim of the event was to produce data that will:

  1. Assist the Restart Project’s contribution to the European Commission’s Ecodesign Public Consultation
  2. Help future repair attempts
  3. Inform further enhancements to repair party data logging experience and data ingestion

What happened next

You may be wondering what happened to that data. It went through an internal review and further revision, concentrating on the records where fault_type and/or fault_category was either left empty or set to Unknown.

Some of these records had simply been overlooked on the day; many others were probably the result of unclear guidelines. The pick lists were not ideal, but the exercise gave us a chance to find potential improvements, which is what this post is all about… read on!

Fault categories and fault types

What is Firmware and what is a Peripheral?

The first thing to mention is that the review showed very few records categorised as either Firmware or Peripheral, so the plan is to roll them up…

Firmware => Software
Peripheral => Hardware

The fault_category pick list will now include only:

  • Hardware
  • Software
  • Unknown
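
For anyone wanting to apply the roll-up to an export of the data, here is a minimal sketch. It assumes each fault_category arrives as a plain string; the function name and the handling of empty or unexpected values (both become Unknown) are assumptions for illustration, not how the Fixometer does it.

```python
# Roll-up described above: the two rarely used categories fold into broader ones.
ROLL_UP = {"Firmware": "Software", "Peripheral": "Hardware"}

# The reduced fault_category pick list.
ALLOWED = {"Hardware", "Software", "Unknown"}


def roll_up_category(value):
    """Return the rolled-up fault_category for a raw pick-list value."""
    # Assumption: an empty or missing value is treated as Unknown.
    cleaned = (value or "").strip()
    if not cleaned:
        return "Unknown"
    rolled = ROLL_UP.get(cleaned, cleaned)
    # Assumption: anything outside the reduced pick list also becomes Unknown.
    return rolled if rolled in ALLOWED else "Unknown"


for raw in ["Firmware", "Peripheral", "Hardware", "", None]:
    print(repr(raw), "=>", roll_up_category(raw))
```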

Not enough info in the Problem field

Of the 2,272 records, 1,033 had an Unknown or empty fault_category and/or fault_type. Of these, 254 have an empty Problem value, so Unknown is a valid choice for them.
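
If you want to reproduce counts like these from your own export, a rough sketch follows. The lowercase `problem` key and the sample rows are made up for illustration; only the fault_category and fault_type field names come from this post.

```python
def is_unknown(value):
    """True when a pick-list value is empty, missing or 'Unknown'."""
    return (value or "").strip().lower() in ("", "unknown")


def summarise(records):
    """Count records needing review and, among those, records with no Problem text."""
    needs_review = [
        r for r in records
        if is_unknown(r.get("fault_category")) or is_unknown(r.get("fault_type"))
    ]
    no_problem_text = [
        r for r in needs_review if not (r.get("problem") or "").strip()
    ]
    return {
        "total": len(records),
        "needs_review": len(needs_review),
        "no_problem_text": len(no_problem_text),
    }


# Made-up rows for illustration only, not real Fixometer data.
sample = [
    {"problem": "Won't turn on", "fault_category": "Hardware", "fault_type": "Power/battery"},
    {"problem": "", "fault_category": "Unknown", "fault_type": ""},
    {"problem": "Very slow", "fault_category": "", "fault_type": "Performance"},
]
print(summarise(sample))
```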

Info but Unknown and empty fault categories and fault types

You can see the various combinations in this spreadsheet

One pick list to rule them all

The fault_type pick lists on the day differed according to device category, dividing into…

  • Desktop
  • Laptop
  • Tablet

… three different pick lists with most of the options being hardware-related.

Analysis and review of the results from the day produced a common set of values for a single pick list, shared across device categories, with options that we hope are more descriptive and useful. See the tab “Map Fault Type”.

The ideal pick list should have options that are

  • comprehensive
  • concise
  • unambiguous

New fault types

There are roughly 700 records where fault_type is empty or Unknown but a new_fault_type has been specified. Many of these can be mapped to a proposed map_fault_type value.
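
As a rough illustration of that mapping step, here is a small sketch. The dictionary keys are hypothetical examples of what volunteers might have typed as new_fault_type; the real mapping is the one in the spreadsheet's “Map Fault Type” tab, and the function name is an assumption too.

```python
# Hypothetical free-text values mapped to proposed pick-list options.
# The actual mapping lives in the spreadsheet, not here.
NEW_TO_MAPPED = {
    "keyboard": "Integrated keyboard",
    "battery": "Power/battery",
    "virus": "Virus/malware",
}


def propose_mapping(record):
    """Return a proposed map_fault_type, or None when there is nothing to propose."""
    fault_type = (record.get("fault_type") or "").strip().lower()
    # Only records with an empty or Unknown fault_type are in scope for this sketch.
    if fault_type not in ("", "unknown"):
        return None
    key = (record.get("new_fault_type") or "").strip().lower()
    # Unrecognised free text stays unmapped so a person can review it.
    return NEW_TO_MAPPED.get(key)


print(propose_mapping({"fault_type": "Unknown", "new_fault_type": "Keyboard"}))
```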

Finally

So… what do you think about this list and its relevance to the data in that spreadsheet?

Seeking feedback and suggestions!

Stay tuned for further analysis and visualisation of the post-review results from the ODD.

Assuming the user will pick from two lists, depending on the fault_category (hardware / software)… So…

Software

  • Boot
  • Configuration
  • Multiple
  • Operating system
  • Other
  • Virus/malware

Hardware

  • Boot
  • Chassis
  • Integrated keyboard
  • Integrated pointing device
  • Integrated media component
  • Integrated screen
  • Internal damage
  • Internal storage drive
  • Multiple
  • Optical drive
  • Other
  • Overheating
  • Performance
  • Ports/slots/connectors
  • Power/battery
  • System board
  • Unknown

Note that some options, like “Boot”, “Other” and “Multiple”, would have to be repeated in each fault category, since, for example, a boot problem is sometimes hardware and sometimes software.

Please correct me if I’m wrong. Copying @Panda here because I think he might be interested.

Currently there is no relationship between fault_category and fault_type. What do you see as the benefit of relating them?

Would picking a fault_category be mandatory in order to pick a fault_type?

Would you have to pick a fault_category first and then a fault_type?

Don’t forget that “Unknown” is a valid fault_category for fault_types such as “Performance” and “Boot”.

Some issues, such as ‘boot’ and ‘performance’ (the latter for some reason placed in the Hardware section), can be software, hardware, or both, and the Software list is short. Wouldn’t it be more efficient and effective to have only one list to choose from when categorising the issue? If you want to separate mostly-software from mostly-hardware issues, the more generic category could be derived automatically from the issue category entered in the Fixometer (e.g. I enter a keyboard problem, and the system automatically puts it in a hardware super-category, if useful).
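
To make the “automatically derived” idea concrete, here is a minimal sketch. The grouping follows the Software and Hardware lists earlier in the thread, except that the options that can go either way (Boot, Performance, Multiple, Other, Unknown) fall through to Unknown. That fallback and the function name are assumptions for illustration, not an existing Fixometer feature.

```python
# Fault types that only appear under Software earlier in the thread.
SOFTWARE_TYPES = {"Configuration", "Operating system", "Virus/malware"}

# Fault types that only appear under Hardware, minus the ambiguous ones.
HARDWARE_TYPES = {
    "Chassis",
    "Integrated keyboard",
    "Integrated pointing device",
    "Integrated media component",
    "Integrated screen",
    "Internal damage",
    "Internal storage drive",
    "Optical drive",
    "Overheating",
    "Ports/slots/connectors",
    "Power/battery",
    "System board",
}


def derive_category(fault_type):
    """Derive a generic super-category from a single-list fault_type."""
    if fault_type in SOFTWARE_TYPES:
        return "Software"
    if fault_type in HARDWARE_TYPES:
        return "Hardware"
    # Assumption: Boot, Performance, Multiple, Other and anything
    # unrecognised cannot be decided from the fault type alone.
    return "Unknown"


for fault_type in ["Integrated keyboard", "Virus/malware", "Boot"]:
    print(fault_type, "=>", derive_category(fault_type))
```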

The spreadsheet shows the single list. The fault_type values are not related to any fault_category. (See tab “Map Fault Type”)

In the short term I think it best to not over-validate data input in order to see what users select themselves. There could then be a review with data correction if necessary and input validation can be implemented if deemed useful.

On the ODD day there were a couple of faults categorised as “Software” with fault_type “Keyboard”. Personally I would consider an issue with key assignment, drivers etc. to be a “Configuration” fault_type.

The full list of device faults, showing the values input on the day of the ODD (fault_type_odd) and the fault_type mapped from the new list, can be viewed in this sheet.

The data has now been imported into the Fixometer and is available in Metabase, along with some charts on the Repairs Overview dashboard: https://data.therestartproject.org/dashboard/23
