Ideas tagged with data-manipulation
A large scale facility can be described as an object producing, as an output, datasets `D_i`, that scientists analyse to obtain results `R_i`. The ideal data analysis trajectory for an experiment is thus `D_i-->R-i-->P_i`, where `P_i` denotes the desired output: a publication. Most of the tim...
File extensions for data sharing sometimes lie about their contents. Here is an algorithm to infer the actual delimiter of a CSV, TSV or any related format: - Assume that alpha-numeric characters (A-Z, a-z, 0-9) and the period/full stop (.) are cannot be delimiters. - Begin with input te...
By Tim McNamara