[CONTACT]

[ABOUT]

[POLICY]

[ADVERTISE]

Automate CSV typesetting for plot

Found at: republic.circumlunar.space:70/~johngodlee/posts/2020-01-20-latex-csv.txt

TITLE: Automate CSV typesetting for plot data sheets
DATE: 2020-01-20
AUTHOR: John L. Godlee
====================================================================

		

		
I am preparing for fieldwork right now. I'll be measuring trees in 
a number of one hectare plots. The trees in these plots have 
already been measured in the past so there is a large dataset in 
.csv format that I can use to help with the remeasurements.

		
In the past I would have printed this dataset using Excel, which 
means playing around with the annoying "Set Print Area" options and 
then battling with the Print dialog to get the options I need. This 
time I decided to use a combination of R and LaTeX tied together 
with a shell script to accomplish the same thing. Although it took 
me longer this time, I'll have the same workflow for next time 
which should make things a lot quicker in the long term.

		
I used R to format the data with the columns I wanted and then 
export the original dataset as a list of .csv files, one for each 
plot.

		
Then I wrote a template LaTeX file which can be fed a .csv file to 
render it as a table which spans multiple pages:

		
    \documentclass[8pt,a4paper]{article}

		
    \usepackage{pgfplotstable}
    \usepackage{booktabs}
    \usepackage{longtable}

		
    \usepackage{geometry}
    \geometry{left=1cm,
        right=1cm,
        top=1cm,
        bottom=1.6cm}

		
    \input{filename_var.tex}

		
    \begin{document}

		
    \centering{\Large{\textbf{\name}}}

		
    \pgfplotstabletypeset[
    begin table=\begin{longtable},
    end table=\end{longtable},
    col sep=comma,
    ignore chars={"},
    every head row/.style={before row=\toprule,after 
row=\endhead\bottomrule},
    every last row/.style={after row=\bottomrule},
    display columns/0/.style={string type, column 
name={\textbf{Plotcode}}},
    display columns/1/.style={string type, column 
name={\textbf{Plot ID}}},
    display columns/2/.style={string type, column 
name={\textbf{Stem}}},
    display columns/3/.style={string type, column 
name={\textbf{Tree}}},
    display columns/4/.style={string type, column 
name={\textbf{Species}}},
    display columns/5/.style={string type, column 
name={\textbf{DBH}}},
    display columns/6/.style={string type, column 
name={\textbf{POM}}},
    display columns/7/.style={string type, column 
name={\textbf{Alive}}}
    ]{plot_data_sheets/\file}

		
    \end{document}

		
The pgfplotstable package allows me to pull in a csv file and 
render it as a table. In this case a longtable which can span 
multiple pages. I set the format and column name for each column 
with the display columns... lines and ensure that the header row 
appears at the start of every new page using the every head... and 
every last... lines. \input{filename_var.tex} sources a .tex file 
which provides the variables \file and \name, which give the full 
file name, and the file name without extension for the given .csv 
file.

		
I used a shell script to generate filename_var.tex and run pdflatex 
to render a .pdf for each of the .csv files:

		
    #!/bin/sh

		
    for i in plot_data_sheets/*.csv ; do
        file=$(basename -- "$i")
        name="${file%.*}"

		
        printf '%s\n' "\\newcommand{\\name}{$name}" 
"\\newcommand{\\file}{$file}" > filename_var.tex

		
        pdflatex --jobname="plot_data_sheets/$name" table_ex
        pdflatex --jobname="plot_data_sheets/$name" table_ex

		
        rm plot_data_sheets/*.aux
        rm plot_data_sheets/*.log
    done

		
The .pdf files created at the end look like this:

		
  ![Plot data sheet .pdf 
output](https://johngodlee.xyz/img_full/latex_csv/render.png)

		
A .pdf version can be downloaded here.

		
  [downloaded 
here](https://johngodlee.xyz/files/latex_csv/render.pdf)

NEW PAGES:

[ODDNUGGET]

[GOPHER]