Ideas tagged with text processing

Detecting the delimiter in CSVs that lie

File extensions for data sharing sometimes lie about their contents. Here is an algorithm to infer the actual delimiter of a CSV, TSV or any related format:
- Assume that alphanumeric characters (A-Z, a-z, 0-9) and the period/full stop (.) cannot be delimiters.
- Begin with input te...
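The excerpt is cut off after the first step, so the rest of the algorithm is not available here. As a rough illustration only, the sketch below applies the one stated assumption (letters, digits and the period are never delimiters) and then uses a heuristic that is an assumption of this sketch, not something taken from the post: choose the remaining character that occurs the same non-zero number of times on every non-empty line. The names `NON_DELIMITERS`, `candidate_counts` and `guess_delimiter` are hypothetical.

```python
import string
from collections import Counter

# Stated assumption from the post: alphanumerics and "." cannot be delimiters.
NON_DELIMITERS = set(string.ascii_letters + string.digits + ".")


def candidate_counts(line):
    """Count every character in `line` that is still eligible as a delimiter."""
    return Counter(ch for ch in line if ch not in NON_DELIMITERS)


def guess_delimiter(text):
    """Guess the delimiter of `text`, or return None if nothing qualifies.

    Heuristic (an assumption of this sketch, not the post's method): a
    delimiter should appear the same non-zero number of times on every
    non-empty line. Among such characters, prefer the most frequent one.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return None
    counts = [candidate_counts(ln) for ln in lines]
    best, best_count = None, 0
    for ch, n in counts[0].items():
        # Counter returns 0 for missing keys, so absent characters fail here.
        if all(c[ch] == n for c in counts[1:]) and n > best_count:
            best, best_count = ch, n
    return best


if __name__ == "__main__":
    sample = "name;age;city\nAnn;31;Oslo\nBob;45;Rome\n"
    print(repr(guess_delimiter(sample)))
```

This simple consistency check does not handle quoted fields that contain the delimiter. Python's standard library also ships `csv.Sniffer`, which may be a better starting point for real files.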

By Tim McNamara