No David, you are missing the point.... or you are trying to disguise it!
In each of your numbered points, you are skating over the glaring facts. The key to that is in your comment:-
"Again, this is not because of some inherent problem processing and analysing the data, it's because of cost/storage limitations"
Contained in there is the key.....
So.... it's the cost/storage limitations which decide if you'll collect the data, is it? (Actually, it's NOT.... it's the cost/benefit ratio. Anyway.......).
....It therefore follows, inexorably, that the data which you DIDN'T collect and/or DIDN'T store was rejected because you don't 'want' it.... That data fits into the 'set' called: 'too much data'!
Your use of the phrase 'storage limitations' is the 'dead-give-away'! If you CANNOT have 'too much', how can storage BE a problem?
Ok.... you might want to call it POTENTIAL data or just numbers.... but it's STILL that data which you categorised as 'not worth keeping'.
If I am wrong, you'll be able to explain EXACTLY how it can be 'not worth keeping' and NOT fit the set described as 'too much data'!
Let's try it another way.
We both know that there is a HUGE amount of data 'embedded' within 'the exhaust'. We could find out about combustion temp, gas analysis, unburned fuel, exhaust 'echo', scavenging, etc., etc. If someone came up with a way to analyse (for example) the exhaust, cheaply and efficiently, which could tell you the PERFECT exhaust length and the cheap analysis of that data would give you 1bhp gain for a £50 investment.... we'd ALL buy one and we'd store THAT data for analysis..... wouldn't we?
However, we also know that we could analyse the unburned fuel for (e.g.) £100,000 and a small super-computer at £500,000 to analyse that data and it might give us 0.01bhp improvement. Would we bother with THAT?
What's the difference between the two sets of data? They are both JUST data! We'd choose to collect/store one but not the other.
If the gas analyser's price dropped to £10 and the data storage for an entire meeting would fit on a single memory stick...... but you STILL needed the £500,000 super computer to ANALYSE it meaningfully..... NOW..... would you collect and store that data ....HOWEVER CHEAPLY....KNOWING that you could NEVER afford to analyse it??????
We both know the answer! No you wouldn't! Not even McLaren would store data that they had no reasonable hope of analysing within a reasonable time or cost! They store what they MIGHT use! They do not store data which they K*N*O*W they CANNOT use or which will never produce a good cost/benefit ratio!!!!
How do those two sets of data differ? The first set of data was cost effective to collect, store and analyse and produced a valuable power improvement. The second set of data was NOT and CAN never be. We WERE able to collect and store the data from the analyser cheaply and effectively, we just could not DO anything useful WITH it.
The second set of data thus fits into the class...... " too " (f***ing) "much data". We were ABLE to collect and store it but we KNEW it wasn't cost effective.... so we didn't! And THAT is an example of 'too much data'!
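To put rough numbers on the exhaust examples above, here is a minimal sketch in Python. The costs and bhp gains are the figures quoted in this post; the function name `cost_per_bhp` and the £1,000-per-bhp cut-off are purely illustrative assumptions, not anybody's real budget rule.

```python
def cost_per_bhp(collect_cost, analyse_cost, gain_bhp):
    """Total spend in pounds divided by the power gained (hypothetical helper)."""
    return (collect_cost + analyse_cost) / gain_bhp

# Figures taken from the examples in the post.
cases = {
    "exhaust length (50 pounds all in, 1 bhp)": cost_per_bhp(50, 0, 1.0),
    "unburned fuel (100,000 analyser + 500,000 computer, 0.01 bhp)": cost_per_bhp(100_000, 500_000, 0.01),
    "unburned fuel (10 pound analyser + 500,000 computer, 0.01 bhp)": cost_per_bhp(10, 500_000, 0.01),
}

# ASSUMPTION: an arbitrary cut-off, only there to make the comparison concrete.
MAX_POUNDS_PER_BHP = 1_000

def worth_keeping(pounds_per_bhp, limit=MAX_POUNDS_PER_BHP):
    return pounds_per_bhp <= limit

for name, value in cases.items():
    verdict = "collect and store" if worth_keeping(value) else "too much data"
    print(f"{name}: about {value:,.0f} pounds per bhp -> {verdict}")
```

Dropping the analyser from £100,000 to £10 barely moves the third figure, because the £500,000 analysis cost dominates: cheap collection and storage do not rescue data you cannot afford to analyse.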
Even if we did NOT need the super computer to analyse it but the storage of the data required a Petabyte or Exabyte of storage per second....... and we needed a whole MEETING’s worth of data to gain valid information.....
That’s possibly the tidiest example of ‘too much data’.......
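As a rough illustration of that storage arithmetic, a minimal sketch in Python. The petabyte-per-second rate comes from the example above; the one hour of running is an assumption picked only to show the scale (the post does not say how long a meeting's worth of running is), and `total_bytes` is a hypothetical helper.

```python
PETABYTE = 10**15  # bytes, decimal definition
EXABYTE = 10**18   # bytes, decimal definition

def total_bytes(rate_bytes_per_second, seconds):
    """Storage needed to keep everything logged at a constant rate."""
    return rate_bytes_per_second * seconds

# ASSUMPTION: one hour of running, chosen only for illustration.
one_hour = total_bytes(PETABYTE, 60 * 60)
exabytes_per_hour = one_hour / EXABYTE

print(f"1 PB/s for one hour = {exabytes_per_hour} EB")
```

At the petabyte rate, a single hour already needs 3.6 exabytes, and a whole meeting multiplies that by however many hours the car runs.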
None of this is new, or 'my invention': this is standard data handling stuff.
In my work with the 'International Company', we know we could store STUPENDOUS amounts of data, e.g., the sound of each footstep of customers in the store. We KNOW that it could assist us in sales BUT..... to collect, store and then analyse it would exceed the storage capacity of the company. From the cost/benefit ratio analysis, the board would say two things to me if I suggested collecting that data:-
1) "You are suggesting we collect and store too much USELESS data"
2) "Your consultancy contract is terminated!"