Friday, 3 June 2016

Expert Witness Water Process Engineering: Data Analysis

Things are progressing with my latest expert witness report. I'm producing a Part 35 compliant report alongside analysing a mass of data about the operation of a tertiary sewage treatment plant.

Expert witness work of the kind I do often involves process troubleshooting. There can be no proper process troubleshooting without stats. I have never had to carry out a three-way ANOVA with interactions or use Duncan's multiple range test professionally, but if you have tens of thousands of data points, plotting them on a scatter diagram isn't going to tell you much.

At a minimum you are going to need to use summary statistics to make some sense of the data. Using MS Excel to show you the mean, minimum, maximum, and the upper limits of the 95% confidence interval of your dataset is a good place to start. You might also generate correlation coefficients for the relationships between parameters.
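For readers who would rather script this than use a spreadsheet, here is a minimal sketch of the same summary statistics in Python, using only the standard library. It is an illustration, not the method used for the report: the function names `summarise` and `pearson_r` are hypothetical, and the confidence interval uses a normal approximation, which is an assumption that is only reasonable for large samples.

```python
import math
from statistics import NormalDist, mean, stdev


def summarise(values, confidence=0.95):
    """Return n, mean, min, max and a confidence interval for the mean.

    Assumption: the interval uses a normal approximation, which is only
    reasonable for large samples. It is an interval for the MEAN, not a
    range expected to contain 95% of the individual readings.
    """
    n = len(values)
    m = mean(values)
    # Standard error of the mean, from the sample standard deviation.
    sem = stdev(values) / math.sqrt(n)
    # Two-sided critical value (about 1.96 for 95% confidence).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return {
        "n": n,
        "mean": m,
        "min": min(values),
        "max": max(values),
        "ci_lower": m - z * sem,
        "ci_upper": m + z * sem,
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

Calling `summarise(readings)` on a list of measurements for one parameter returns a dictionary of the figures above, and `pearson_r(a, b)` gives the correlation between two parameters sampled at the same times.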

In some posts here and elsewhere I have criticised the kind of mathematics taught on chemical engineering courses for being too "pure", and insufficiently relevant to modern practice. Some have interpreted this as me saying no maths should be taught, and have refuted that quite different idea instead, which is what is known in philosophy as a "straw man" fallacy.

I find people who cannot find a valid argument to counter what I am saying frequently wish to discuss what I am almost implying, what they feel about what I am saying, or something I am not saying at all. So, in the interests of clarity, I am saying that maths is good, necessary and useful to the practising engineer. Laplace transforms may not be of much use, but stats are essential. It is a pity that university curricula view things the opposite way around.

I could teach my twelve-year-old to get Excel to produce summary statistics in fifteen minutes. What an expert offers is not the knowledge of how to do this, but an understanding of what the results mean and what they do not mean. It is commonplace even amongst supposed experts to misunderstand what we are 95% confident about within the 95% confidence interval, to never check whether the assumptions underlying the valid use of parametric statistics are true, and to be unaware of when and how to apply non-parametric statistics.
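As a rough illustration of that last point, the sketch below shows one crude check on a parametric assumption and one non-parametric alternative, again in standard-library Python. It is an example under stated assumptions rather than a recommended procedure: the function names are hypothetical, sample skewness is only one informal indicator of departure from a symmetric distribution, and Spearman's rank correlation is swapped in here as a familiar rank-based counterpart to Pearson's coefficient.

```python
import math
from statistics import mean


def sample_skewness(values):
    """Moment-based skewness; near 0 suggests a roughly symmetric sample.

    A crude indicator only: it does not by itself establish normality.
    """
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)
```

Because it works on ranks, `spearman_rho` measures whether two parameters rise and fall together without assuming the relationship is linear or the data normally distributed. Deciding whether that is the right question to ask of a given plant is where the engineering judgement comes in.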

If you were to hire a statistician, they would hopefully understand all of the above better than I do. They would not, however, understand what the outputs mean, because these are not just numbers. These are clues as to the state of a process. Chemical/process engineers understand processes. Maths is just a tool to help them see more clearly, and inform their answer to the questions they are being asked. These questions usually add up to the same thing: "How well does this process work?"

The answer to this question is often "It doesn't", and then we sometimes get back to those discussions of what I am almost implying, what they feel about what I am saying, or something I am not saying at all. I'm pretty sure these arguments do not impress courts any more than they impress me.
