In mid-2023, a paper gained significant attention by claiming that a simple combination of Gzip compression and k-Nearest Neighbors (k-NN) could outperform complex BERT (Bidirectional Encoder Representations from Transformers) models on text classification tasks.
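The approach pairs a compression-based similarity measure with nearest-neighbor voting: texts that compress well together are treated as similar. A minimal sketch in Python, using the Normalized Compression Distance (NCD) that such methods typically rely on (the function names and the tiny training set here are illustrative, not taken from the paper):

```python
import gzip
from collections import Counter

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: how much extra space the pair
    x+y needs beyond the more compressible of the two alone."""
    cx = len(gzip.compress(x))
    cy = len(gzip.compress(y))
    cxy = len(gzip.compress(x + b" " + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

def classify(test_text: str, train: list[tuple[str, str]], k: int = 3) -> str:
    """k-NN over NCD: rank training texts by distance to the test text,
    then take a majority vote over the k nearest labels."""
    dists = sorted(
        (ncd(test_text.encode(), text.encode()), label)
        for text, label in train
    )
    top_labels = [label for _, label in dists[:k]]
    return Counter(top_labels).most_common(1)[0][0]
```

No training step is involved: every prediction just compresses the test text against each labeled example, which is what made the claimed parity with BERT so striking.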
Critical "peer review" on platforms like Twitter and Hacker News, however, revealed "bad numbers" in the original paper, showing that the Gzip-based method only appeared superior because of specific data handling errors in the evaluation.
The researchers had suggested that the basic mathematical principles of compression (identifying repeated patterns) were more efficient for certain NLP tasks than deep learning.