The article explains compressibility, an SEO-related concept that search engines could use as a quality signal. Compressibility refers to how much a file’s size can be reduced by replacing repeated words and phrases with shorter references, which helps search engines save storage space, reduce bandwidth, and improve retrieval speed.
Beyond compression’s basic function, the article highlights a 2006 research paper by researchers Marc Najork and Dennis Fetterly. The paper suggests that compressibility can help detect spam by identifying duplicate pages, doorway pages, and pages stuffed with repetitive keywords. It notes that higher compression ratios often correlate with low-quality or spammy content. Search engines can use compression to detect redundant content, measuring it by the compression ratio (uncompressed size divided by compressed size).
The study found that a compression ratio of 4.0 or higher often indicated spammy content, although the results became less consistent at higher ratios because fewer pages fell into those ranges.
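To make the metric concrete, here is a minimal sketch of how a compression ratio could be computed. It assumes zlib as the compressor and uses the 4.0 threshold from the study purely for illustration; the original researchers’ exact tooling and preprocessing are not described in the article.

```python
import zlib

def compression_ratio(html: str) -> float:
    """Compression ratio = uncompressed size / compressed size.

    Higher values mean more redundancy (repeated words and phrases)."""
    raw = html.encode("utf-8")
    compressed = zlib.compress(raw)
    return len(raw) / len(compressed)

# Illustrative threshold based on the 2006 finding that ratios of
# 4.0 or higher often indicated spammy content.
SPAM_THRESHOLD = 4.0

# A highly repetitive page compresses very well and scores a high ratio.
page = "<p>best cheap shoes best cheap shoes best cheap shoes</p>" * 50
ratio = compression_ratio(page)
print(f"ratio={ratio:.2f}, flagged={ratio >= SPAM_THRESHOLD}")
```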
The concept remains relevant today, as some SEOs still try to rank with duplicate content or keyword stuffing, tactics that raise a page’s compressibility and can signal spam.