\enableregime % probably not needed here, but it's a strong habit
  [utf-8]
\usemodule % the module we're talking about
  [database]
\usemodule % My Way
  [mag-01]
\usemodule
  [int-load]
\def\loadsetups{}
\usemodule % interface defined in XML
  [mod-00]
\complexloadsetups % syntax definition
  [m-database.xml]

% remove this: why doesn't \defaultencoding switch to EC automatically?
\usetypescript
  [palatino][ec] % \defaultencoding
\setupbodyfont
  [palatino,10pt]
\setuptyping % all the \type-ing will be done in TeX
  [file]
  [option=TEX]
\setuptype
  [option=TEX]
\setupitemize
  [packed]
\setupframedtexts
  [frame=off,
   background=screen,
   width=\textwidth]

% TODO (the way I imagine that the My Way module should work):
% - put title= into PDF title
% - put author= into PDF author
% - put My Way into PDF subject
\setvariables
  [magazine]
  [title={Creating tables using CSV\\(comma||separated values)},
   author={Mojca Miklavec},
   date=\currentdate,
  ]
% well, since the above is not implemented yet, we'll do it manually
\setupinteraction
  [author=Mojca Miklavec,
   title={Creating tables using CSV (comma-separated values)},
   subject={My Way},
   keyword={ConTeXt, TeX, table, tables, CSV, comma-separated values},
  ]

% Other TODOs
% - fix colors in the \type+\bug+

% FIXME: temporarily turn the shading off, otherwise it takes too long for GV to display the file
\startuseMPgraphic{paper}
%  sh := define_circular_shade(a,a,0,bbheight(OverlayBox),
%    \MPcolor{InnerColor},\MPcolor{OuterColor}) ;
%  fill OverlayBox withshade sh ;
  fill OverlayBox withcolor .5white ;
\stopuseMPgraphic

\setupinmargin
  [style=italic]

% Bug in \type needs some workaround to show the content at least partially OK

\startbuffer[abstract]
\CONTEXT\ offers good support for creating complex tables (Natural Tables, tabulate, table, tables, linetable), see \useURL[wikitable][http://wiki.contextgarden.net/Tables_Overview]\from[wikitable], but creating simple tables is still cumbersome in the \TEX\ world.
The \type{database} module should simplify the input of tables (where no row|| or column||merging is required): instead of writing lengthy and
%???(opposite of clear/clean/terse/visible/easy to scan)
cumbersome\type{ \bTR\bTD-}s or\type{ \NC\NR-}s you can now separate the rows with newlines and columns with commas, spaces, tabs or other character(s) of your choice.
\stopbuffer

\starttext

\setups[titlepage]
\setups[title]

%\setupheadertexts[Simple Alignment]

\section{Motivation}

Writing and manipulating tables in applications such as Excel\copyright\ is child's play. But to write a table in a \TeX\ environment you first need to study ten pages of a manual, and even if you already have the table in some plain text file, office application or on a web page, you still need to add dozens of commands to tell \TeX\ how to typeset it.

I asked Hans if there was no simpler way to do it. And the answer was:

% XXX: should become obsolete if the features are implemented
\defineseparatedlist
  [NaturalTable]
  [separator=comma,
   before=\bTABLE,after=\eTABLE,
   first=\bTR,last=\eTR,
   left=\bTD,right=\eTD]

\startbuffer
\startseparatedlist[NaturalTable]
Of,course
,it is!
\stopseparatedlist
\stopbuffer

\typebuffer

\placefigure[force]{none}{\getbuffer}

\section{Defining a new \quotation{data parser}: \textbackslash defineseparatedlist}

%\inmargin{Put into the module!!!}
In order to turn the above code into a table and to save you some typing, the following definition was provided in the module:

\startbuffer
\defineseparatedlist
  [NaturalTable]
  [separator=comma,
   before=\bTABLE,after=\eTABLE,
   first=\bTR,last=\eTR,
   left=\bTD,right=\eTD]
\stopbuffer

\typebuffer

\def\mycommand#1{{\tt\sl#1}}

\type{NaturalTable} is the name of the list, \type{comma} (which could also be written as \type{{,}}) means that each comma will start a new column, while the other six parameters define the rules for typesetting the data.
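
To make the roles of the six typesetting parameters a bit more concrete, here is a sketch of what I would expect such a list to boil down to. It assumes that \type{before}/\type{after} wrap the whole block, \type{first}/\type{last} wrap each row and \type{left}/\type{right} wrap each cell; the two-row input is made up for illustration and the expansion is written by hand, not taken from the module:

\startbuffer
% hypothetical input
\startseparatedlist[NaturalTable]
a,b
c,d
\stopseparatedlist

% roughly equivalent natural table (hand-written, for illustration only)
\bTABLE
  \bTR \bTD a \eTD \bTD b \eTD \eTR
  \bTR \bTD c \eTD \bTD d \eTD \eTR
\eTABLE
\stopbuffer

\typebuffer
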
%: arguments of \type{before} and \type{after} will be put before and after the block/table, arguments of \type{first} and \type{last} will be used as the first and the last thing put into a row, while \type{left} and \type{right} will be used to surround each cell of the table.

%A new set of rules can be defined with a command \type{\defineseparatedlist}
The syntax of \type{\defineseparatedlist} is as follows:

\showsetup{defineseparatedlist}

\subsubject{separator}

Character(s) separating the data cells. There are currently three predefined values: \type{comma} (the default one), \type{space}\footnote{It's there just to justify the effort put into the introduction of a new keyword ;)\crlf\type+separator={ }+ works just as well} and \type{tab}, which is a bit special\footnote{\TeX\ usually doesn't distinguish between space and tab unless it's explicitly instructed to do so}, but you can use \quote{any} other character(s) as long as they don't have some special meaning. \type{separator=X} will thus start a new column each time the character \quote{\type{X}} is encountered.

\def\visiblespace{\leavevmode\hbox{\tt\char`\ }}

%\starttable[|l|p(10cm)|]
%\HL
%\NC\bf separator \NC\bf meaning \NC\SR
%\HL
%\NC comma \NC Each comma (,) starts a new column. If you need to write a comma in your fields, use braces around it (TODO: example) \NC\SR\HL
%\NC space \NC Whitespace (including tabs) starts a new column. Several spaces in a row count as one (usual behaviour in \TeX). If you need spaces in your cells, use braces around them, \textbackslash space or \textbackslash\visiblespace \NC\SR\HL
%\NC tab \NC Very suitable for \quote{copy||pasted} data from other applications or from the web. In some editors it is difficult to distinguish tab from space and some even convert tabs to spaces automatically: if you use one of those editors you probably won't want to use tab as a separator.
%\NC\SR\HL
%\NC\sl TEXT \NC \quote{any} character (sequence) \NC\SR\HL
%\stoptable

\subsubject{quotechar\footnote{Taco's favourite!}}

Triggers literal handling of the cell content; usually it is the double quote ("). It is mostly meant to be used for parsing proper CSV\footnote{comma-separated values, as already noted in the title} data.

\startbuffer
\defineseparatedlist
  [CSV]
  [separator={,},
   quotechar={"},
   before={\starttabulate[|r|c|l|]},after=\stoptabulate,
   first=\NC,last=\NR,
   left=,right=\NC]

\startCSV
some data,&,"a comma, hidden inside a quote"
quoted quotes,"""","need lots of ""quotes"""
\TeX\ commands,are $\lnot$,processed
UTF-8,should ¬,be a problem
\stopCSV
\stopbuffer

\typebuffer

\placetable[force]{none}{\getbuffer}

{\it Do I see a space after TeX? Well, forget it. It's not that important.}

{\sl Note: According to the CSV specification, the content of a single cell may span multiple lines (preserving newlines) if quoted properly. This won't work here, at least not until pdflua\TeX\ is out.}

\subsubject{before/after, first/last, left/right}

\starttable[|l|l|l|]
\HL
\NC\bf keyword \NC\bf used for \NC\bf examples of possible arguments \NC\SR
\HL
\NC before \NC beginning of table \NC \type{\bTABLE \starttable \starttabulate} \NC\FR
\NC after \NC end of table \NC \type{\eTABLE \stoptable \stoptabulate} \NC\LR
\HL
\NC first \NC beginning of row \NC \type{\bTR \NC } or \type{\VL} \NC\FR
\NC last \NC end of row \NC \type{\eTR \NR} \NC\LR
\HL
\NC left \NC beginning of cell \NC \type{\bTD} \NC\FR
\NC right \NC end of cell \NC \type{\eTD \NC} \NC\LR
\HL
\stoptable

Note that for \type+\starttable +and \type+\starttabulate +you also need to specify the pattern, such as \type{before=\starttable[|l|l|l|]} for three left-aligned columns. In contrast to natural tables, where the number of columns adapts itself to the data, here you have to make sure that every row provides exactly the number of columns given in the pattern, otherwise you may run into trouble.
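
As an illustration of the point about patterns, a list built on \type{\starttable} could be sketched as follows. The name \type{ThreeColumns} is made up, the definition is modelled on the \type{CSV} example above and has not been tested, so treat it as a starting point rather than a recipe:

\startbuffer
\defineseparatedlist
  [ThreeColumns] % hypothetical name
  [separator=comma,
   before={\starttable[|l|l|l|]},after=\stoptable,
   first=\NC,last=\NR,
   left=,right=\NC]

\startThreeColumns
one,two,three
four,five,six
\stopThreeColumns
\stopbuffer

\typebuffer

Every data line here has exactly three fields, matching the three columns of the pattern.
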
\subsubject{command}

Instead of creating a table, you can also provide your own command accepting the same number of parameters as the number of columns in the data. If \type{command} is non-empty, the module will ignore any settings for \type{before}/\type{after}, \type{first}/\type{last} and \type{left}/\type{right} and use the supplied command instead.

Suppose you wanted to print addresses on envelopes to send your magazine to some \TEX\ user groups. You would first define a command to print the envelope:

\startbuffer
\def\SendMe#1#2#3#4{\framed
  [align={flushleft,lohi},
   width=4cm,
   height=2.5cm]{#1\crlf#2\crlf\crlf\uppercase{#3\crlf#4}}}
\stopbuffer

\typebuffer
\getbuffer

An alternative to using \type+\SendMe{name}{address}{post office}{country}+ for each entry is now to \type+\defineseparatedlist +for the whole list:

\startbuffer
\defineseparatedlist[Address][separator={;},command=\SendMe]

\startAddress
NTG;Maasstraat 2;NL-5836 BB Sambeek;The Netherlands
Dante~e.V.;Postfach 101840;D-69008 Heidelberg;Germany
\stopAddress
\stopbuffer

\typebuffer

\placefigure[force]{none}{\getbuffer}

%{\it I would like to place them into the same row.}

\subsubject{setups}

Until I figure out how to explain it, I hope that the example below will be descriptive enough to give you an idea of how to use it.

Some files come with comments (usually lines starting with \#). To ignore such lines, the following recipe might help you:

\startbuffer
\unprotect
\startsetups CSV:unix
  \catcode`\#=\@@comment
\stopsetups
\protect

\defineseparatedlist[CSV][setups=unix,...]
\stopbuffer

\typebuffer

\section{Recycling: \textbackslash setupseparatedlist}

If you want to use space instead of comma as a separator in a list that is already defined, all you have to do is:

\startbuffer
\setupseparatedlist[NaturalTable][separator=space]

\startseparatedlist[NaturalTable]
setup an\ existing {separated list}
and watch for\space the\space spaces.
\stopseparatedlist
\stopbuffer

\typebuffer

\placetable[force]{none}{\getbuffer}

\showsetup{setupseparatedlist}

\section{Using: \textbackslash startseparatedlist}

Once you have successfully defined a separated list called {\tt\sl NAME}, there are basically three ways to use it:

\startitemize
\tt\sl
\item \type+\startseparatedlist[+NAME\type+] ... \stopseparatedlist +
\item \type+\start+NAME\type+ ... \stop+NAME\type+ +
\item \type+\processseparatedfile[+NAME\type+][+filename\type+]+
\stopitemize

Sadly enough this doesn't work (it must be my mistake somewhere):

\startbuffer
\showsetup{startseparatedlistname}
\showsetup{processseparatedfile}
\stopbuffer

\typebuffer
\getbuffer

Some time ago Willi sent me some data about the decreasing number of cows in Holland\footnote{one unit meaning 1000 cows} in an Excel table. I copy-pasted the content into a simple text editor (so that tabs were placed between single cells) and commented out the first two lines\footnote{I wanted to plot the data with another program which didn't know what to do with words when it should plot numbers}. The arrows are there just to visualize the tabs.
\startbuffer[TSV-example]
# Number of cows in Holland
# Year	Total	Milking	Pregnant
1995	1709	1449	260
1997	1606	1387	219
1999	1520	1307	212
2001	1496	1345	151
2003	1492	1324	169
2005	1421	1263	158
\stopbuffer

\placefigure[force]{none}{
\startframedtext
%\typefile{\jobname-TSV-example.tmp}
\def\visualtab{\color[darkgray]{$\rightarrow$}}
\bgroup
\obeylines\tt
\# Number of cows in Holland
\# Year \visualtab\ Total \visualtab\ Milking\ \visualtab\ Pregnant
1995 \visualtab\ 1709 \visualtab\ 1449 \visualtab\ 260
1997 \visualtab\ 1606 \visualtab\ 1387 \visualtab\ 219
1999 \visualtab\ 1520 \visualtab\ 1307 \visualtab\ 212
2001 \visualtab\ 1496 \visualtab\ 1345 \visualtab\ 151
2003 \visualtab\ 1492 \visualtab\ 1324 \visualtab\ 169
2005 \visualtab\ 1421 \visualtab\ 1263 \visualtab\ 158
\egroup
\stopframedtext
}

Let's first define the appropriate {\sl separatedlist}:

\startbuffer
\defineseparatedlist
  [TSV] % tab-separated values
  [separator=tab,
   before=,after=, % we'll place them explicitly
   first=\bTR,last=\eTR,
   left=\bTD,right=\eTD,
   setups=unix]
\stopbuffer

\typebuffer
\getbuffer

We might want to use boldface and a background color for the first row. We also have to begin and end the table explicitly because we didn't set any command to start and stop the table\footnote{If we did, we couldn't join the data from two different sources: we provide the header line explicitly and use a file as the source of the data.}.

\startbuffer
\setupTABLE[r][1][style=bold,background=color,backgroundcolor=gray]

\bTABLE
% Header
\startTSV
Year	Total	Milking	Pregnant
\stopTSV
% Content
\processseparatedfile[TSV][\jobname-TSV-example.tmp]
\eTABLE
\stopbuffer

\typebuffer

\placefigure{none}{\getbuffer}

\section{Known bugs}

\startitemize
% seems it was fixed (or I didn't know how to use it)
%\item comments behave strangely (probably in the \TeX-ish way, but it's currently \quotation{impossible} to insert comments since then there's no newline at the end and subsequent rows are merged together).
% Consider the following example:
\startbuffer
\startseparatedlist[NaturalTable]
a,b,c % some more
d,e,f
\stopseparatedlist
\stopbuffer
%\typebuffer
%\placefigure[force]{none}{\getbuffer}

\item Recent versions of the module introduced some problems with UTF-8 character handling in normal mode (with \type{quotechar} it works OK). Example:

\startbuffer
\startseparatedlist[NaturalTable]
č,š,ž
\stopseparatedlist
\stopbuffer

\typebuffer

Other 8-bit regimes work OK.

\item Blank cells cause problems at the end of a line:

\startbuffer
\startseparatedlist[NaturalTable]
a,b
c,
\stopseparatedlist
\stopbuffer

\typebuffer
%\placefigure[force]{none}{\getbuffer}
\stopitemize

\section{Wishlist / TODO}

\startitemize
\item \type+\defineseparatedlist[name][nameofotherlist]+ to inherit properties
%\item I always forget the difference between \type+before/after+, \type+first/last+. Would \type+beforetable/aftertable+, \type+beforerow/afterrow+, \type+beforecell/aftercell+ (\type+tablestart/tablestop+, \type+rowstart/rowstop+, \type+cellstart/cellstop+) be more descriptive and easier to remember perhaps?
\item selecting columns (and rows?)

A handy feature would be something like \type+usecolumns={1-3,5}+, which would select only the columns 1, 2, 3 and 5 and:
\startitemize
\item ignore redundant information (unneeded columns/too long lines),
\item \quotation{add} empty cells if the data line is too short.
\stopitemize

An example of a valid definition would thus be:

\startbuffer
\defineseparatedlist
  [Address]
  [separator={;},
   command=\SendMe,
   usecolumns={1-4}]

\startAddress
NTG;Maasstraat 2;NL-5836 BB Sambeek;The Netherlands;ignored data
Dante~e.V.;Postfach 101840;D-69008 Heidelberg
\stopAddress
\stopbuffer

\typebuffer

The extra field at the end of the first row would be ignored, and though leaving fields out doesn't really belong to good (programming) habits, the second line, which has one field missing, would be treated as if the Country field were present but blank.
Without \type+usecolumns={1-4}+ an error would be raised in such a case.
\item special treatment of header lines (I'm not sure yet how exactly this should work.)
\stopitemize

%This is only meant as a distraction and it's not worth investing too much time in it. I would use it for clarification in this document, but it's really really low on the priority list:
%\startitemize
%\item setting tab width in verbatim (might be complex since one needs to keep track of current column)
%\item visualizing tab in verbatim (for example with $\rightarrow$)
%\item how to convert dots into commas in certain columns in tables?
%\stopitemize

%\setups[listing]
\setups[lastpage]

\stoptext