Over the past couple of weeks, I have noticed several issues related to plots, so I am putting down my thoughts on plotting in this blog post. Of course, some of what I say could be wrong; for example, it might be possible to avoid all the pitfalls and errors I mention here even when using LibreOffice Calc. If so, please leave your tips, tricks, suggestions, and links in the comments below. With that disclaimer, here you go.

(1) Do not cut and paste data into LibreOffice Calc to generate plots. Even though it seems easy and the GUI lets you pick the type of plot and so on, it is very easy to mess up the data itself. In addition, if you revisit the plot after some time, it is not easy to retrace your path through the worksheet.

(2) So, use a scripting language or program to generate plots. There is a wide variety of them out there; my recommendations are R, MayaVi and Paraview, and in some cases Ovito is also useful. GNU Octave is useful for quick plots but, in my experience, it is not ideal for importing data (saved in CSV format, for example) or for dealing with large data sets. I tend to see GNU Octave as the equivalent of a hand calculator: if you are working through a bisection algorithm by hand, a hand calculator is useful, but if you want a generic code, you will go to a computer. In a similar fashion, for quick plots and smaller data sets you can use Octave; for the rest, R, MayaVi, Paraview and Ovito work better.
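To make the idea of a scripted plot concrete, here is a minimal sketch. It uses Python with matplotlib purely for illustration (neither is among the tools recommended above, and the same idea carries over to R or Octave). The function names, the file name and the column names are all hypothetical; the only assumption about the data is that the CSV file has a header row.

```python
import csv


def read_columns(lines, x_name, y_name):
    """Parse two numeric columns from CSV lines; the first line is the header.

    `lines` can be an open file or any iterable of strings.
    """
    xs, ys = [], []
    for row in csv.DictReader(lines):
        xs.append(float(row[x_name]))
        ys.append(float(row[y_name]))
    return xs, ys


def plot_columns(path, x_name, y_name, out_file):
    """Read a CSV file and save a labelled line plot of y against x."""
    # Imported here so that read_columns can be used without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # draw to a file; no display is needed
    import matplotlib.pyplot as plt

    with open(path, newline="") as handle:
        xs, ys = read_columns(handle, x_name, y_name)

    fig, ax = plt.subplots()
    ax.plot(xs, ys, marker="o")
    ax.set_xlabel(x_name)  # label the axes from the column headers
    ax.set_ylabel(y_name)
    fig.savefig(out_file, dpi=200)
    plt.close(fig)
```

A call such as plot_columns("data.csv", "time", "stress", "stress_vs_time.png") would then regenerate the figure from the raw file every time, and the script itself is the record of how the plot was made.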

(3) It is a good idea to have some template codes. As they say, laziness is one of the virtues of great programmers (along with impatience and hubris), so it makes sense to put in the effort once, generate a good template, and then work with it or tinker with it in all subsequent instances. Of course, this can at times lead to issues. For example, a couple of weeks ago, a couple of us spent a lot of time trying to debug a code because it was giving us ellipses when we should have been getting circles. Finally, we realised that this was because we had set the size ratio of the plot to 1:1 while the data had a ratio of 1:16 or something like that. This leads to another tip: even if you are using a tried and tested template, read through the plotting script once in a while, especially when you run into difficulty.
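The circle-versus-ellipse problem can be checked with a little arithmetic before any debugging starts. The sketch below is only an illustration in Python with matplotlib, not the script we were actually debugging; the function names and the 16:1 numbers are assumptions chosen to mimic the anecdote. The first function computes how much longer one data unit is drawn along y than along x; the second draws a circle with or without forcing equal scaling.

```python
import math


def vertical_stretch(x_span, y_span, box_width, box_height):
    """Return how much longer one data unit is drawn along y than along x.

    A value of 1.0 means a circle in the data is drawn as a circle.
    """
    length_per_x_unit = box_width / x_span
    length_per_y_unit = box_height / y_span
    return length_per_y_unit / length_per_x_unit


def plot_circle(out_file, radius=1.0, equal_aspect=True):
    """Draw a circle on axes whose x range is 16 times the y range."""
    # Imported here so that vertical_stretch can be used without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # draw to a file; no display is needed
    import matplotlib.pyplot as plt

    angles = [2.0 * math.pi * i / 200 for i in range(201)]
    xs = [radius * math.cos(a) for a in angles]
    ys = [radius * math.sin(a) for a in angles]

    fig, ax = plt.subplots(figsize=(4, 4))
    ax.plot(xs, ys)
    ax.set_xlim(-16.0, 16.0)  # x spans 32 data units
    ax.set_ylim(-1.0, 1.0)    # y spans 2 data units
    if equal_aspect:
        # One data unit gets the same drawn length along both axes.
        ax.set_aspect("equal")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    fig.savefig(out_file, dpi=200)
    plt.close(fig)
```

For a square plotting box and these limits, vertical_stretch(32.0, 2.0, 4.0, 4.0) gives 16.0, which says that the circle will be drawn as an ellipse sixteen times taller than it is wide unless the aspect is set to equal.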

(4) The resources available today for plotting and generating meaningful graphics are enormous. Just as TeX changed typesetting forever, the current software tools have changed visualisation in remarkable ways, for the better and to a point of no return. However, many practitioners have not yet taken full advantage of these tools. Here are a couple of tips based on what I notice. First, the plots you make should answer the questions you have in mind. So, if you want to compare Scenario A and Scenario B, do not plot A with C and B with D and then try to compare A and B. I know it sounds silly, but it is not uncommon. Second, once you plot A and B together, the other rookie mistake is to let the full range of these quantities, which could be very large, mask the region where they are supposed to differ. There are options to generate insets or to introduce gaps in the plot axes; learn to make use of these features. This leads to my next tip.
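As an illustration of the inset idea, here is a minimal sketch in Python with matplotlib (my choice for the example only; other plotting tools have comparable features). It plots Scenario A and Scenario B on common axes and adds a zoomed inset over a window chosen by the user. The function names, the window bounds and the inset position are hypothetical.

```python
def window_limits(xs, series, x_lo, x_hi, pad=0.05):
    """Return (low, high) y limits covering all series inside [x_lo, x_hi]."""
    values = [y for ys in series
              for x, y in zip(xs, ys) if x_lo <= x <= x_hi]
    low, high = min(values), max(values)
    margin = pad * (high - low)  # a little breathing room around the data
    return low - margin, high + margin


def plot_with_inset(xs, a, b, x_lo, x_hi, out_file):
    """Plot A and B together, with an inset zoomed on [x_lo, x_hi]."""
    # Imported here so that window_limits can be used without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # draw to a file; no display is needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(xs, a, label="Scenario A")
    ax.plot(xs, b, label="Scenario B")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend(loc="lower right")

    # Inset position is given as [x0, y0, width, height] in axes fractions.
    inset = ax.inset_axes([0.1, 0.5, 0.4, 0.4])
    inset.plot(xs, a)
    inset.plot(xs, b)
    inset.set_xlim(x_lo, x_hi)
    inset.set_ylim(*window_limits(xs, [a, b], x_lo, x_hi))
    ax.indicate_inset_zoom(inset)  # mark the zoomed region on the main axes

    fig.savefig(out_file, dpi=200)
    plt.close(fig)
```

The inset gets its own y limits from the data inside the window, so a difference of a few units between A and B stays visible even when the full curves span several orders of magnitude.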

(5) Our high school physics teacher insisted that we write down our guesses for the answers to a physics problem before solving it. In plotting too, this is a good exercise: in addition to helping us develop an intuition for the data, it also helps avoid silly mistakes. I have seen weeks wasted because we were looking at some trend without paying attention to the scale. This problem is exacerbated by the tendency to make animations or movies. The plotting software keeps rescaling the range of the data so that the extreme values get the extreme colours, say blue and red, and after a while we tend to read the plots by colour without paying attention to the numbers. The numbers might now be changing only in the fourth decimal place, but the plot still shows blue and red, giving a false impression of inhomogeneity when, for all practical purposes, homogeneity has been achieved. So, before plotting, set your expectation; plot and see if it is met; and pay special attention to the numbers. This leads to another important tip: even a rough plot means nothing without axes, tick marks, numbers and labels. Again, your high school teachers were correct: label the axes first, set the scale bar and then plot.
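The colour-rescaling trap is easy to reproduce with plain numbers. The sketch below is an illustration in plain Python with made-up values and hypothetical function names; it mimics what a colour map does, namely mapping values to the interval from 0 (say blue) to 1 (say red), first with limits recomputed for every frame and then with one set of limits shared by all frames.

```python
def normalise(frame, vmin, vmax):
    """Map values to [0, 1], the interval a colour map works on."""
    span = vmax - vmin
    if span == 0:
        return [0.5 for _ in frame]  # a perfectly flat field: one colour
    return [(v - vmin) / span for v in frame]


def global_limits(frames):
    """Return one (min, max) pair that covers every frame of the movie."""
    low = min(min(frame) for frame in frames)
    high = max(max(frame) for frame in frames)
    return low, high


# Two made-up frames: an early inhomogeneous one and a late, nearly flat one.
frames = [[0.0, 1.0], [0.5, 0.5001]]
late = frames[1]

# Limits recomputed per frame: the two late values land at opposite ends
# of the colour map, although they differ only in the fourth decimal place.
per_frame = normalise(late, min(late), max(late))

# Limits shared by all frames: the two late values get almost the same colour.
vmin, vmax = global_limits(frames)
shared = normalise(late, vmin, vmax)

print("per-frame limits:", per_frame)
print("shared limits:   ", shared)
```

In matplotlib, for instance, the shared limits can be passed as the vmin and vmax arguments of imshow or pcolormesh for every frame, so that a given colour means the same number throughout the animation.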

(6) Plots are not just great tools of communication; they also bring clarity to those who make them. So, once you plot, try to interpret: write down the conclusions that you derive from the plot and check if they are consistent with your assumptions and expectations; if they differ, explore why. You should do this till it becomes second nature, so that when you see a plot, irrespective of whether you or somebody else made it, you immediately draw conclusions from it and critically analyse those conclusions. This skill is probably the most coveted and the most scarce, simply because it takes years of involved practice to acquire! But the good news is that it is an acquired skill; it is not a question of if you will acquire it, but when!!

Visualization has always been a great tool for science and engineering: think of Florence Nightingale, Alexander von Humboldt, and Richard Feynman. A thoughtful plot or diagram, done with integrity and meaningful data, has the potential to change our understanding forever and can help us scale greater heights and explore greater depths!!

So, happy plotting!!

Here is a paper, the title of which is a short summary of what the paper does:

Aggressively optimizing validation statistics can degrade interpretability of data-driven materials models

Thanks to a colleague and friend, I learnt about another Hall-Petch and data analysis paper:

A new exponential function to represent the effect of grain size on the strength of pure iron over multiple length scales

For more papers along similar lines, see this and this post of mine!