absorb.md

Cukereuse

Unknown speaker
paper · 2026-04-23

We fill the gap with cukereuse, an open-source Python CLI combining exact hashing, Levenshtein ratio, and sentence-transformer embeddings in a layered pipeline, released alongside an empirical corpus of 347 public GitHub repositories... The tool, corpus, labelled pairs, rubric, and pipeline are released under permissive licences.
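The layered idea in the quote (cheap exact match first, then edit distance, then semantic similarity) can be illustrated with a minimal sketch. This is not cukereuse's code or API: the function names, the keyword-stripping normalization, the `1 - distance / max_len` form of the ratio, and both thresholds are assumptions made for illustration. The final layer uses a bag-of-words cosine as a dependency-free stand-in for sentence-transformer embeddings; a real embedding-based similarity could be passed in through the `similarity` parameter.

```python
import hashlib
import math
import re
from collections import Counter
from typing import Callable, Optional


def normalize(step: str) -> str:
    """Drop a leading Gherkin keyword, lowercase, collapse whitespace (assumed rule)."""
    text = re.sub(r"^\s*(given|when|then|and|but)\b", "", step, flags=re.IGNORECASE)
    return " ".join(text.lower().split())


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance with unit costs."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def lev_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; one common normalization, assumed here."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bow_cosine(a: str, b: str) -> float:
    """Stand-in for embedding similarity: cosine over token counts."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def classify_pair(
    a: str,
    b: str,
    lev_threshold: float = 0.85,  # hypothetical threshold
    sem_threshold: float = 0.80,  # hypothetical threshold
    similarity: Callable[[str, str], float] = bow_cosine,
) -> Optional[str]:
    """Return the first layer that flags the pair, or None if no layer does."""
    na, nb = normalize(a), normalize(b)
    # Layer 1: exact match on a hash of the normalized text.
    if hashlib.sha256(na.encode()).hexdigest() == hashlib.sha256(nb.encode()).hexdigest():
        return "exact"
    # Layer 2: near-duplicate by edit-distance ratio.
    if lev_ratio(na, nb) >= lev_threshold:
        return "near"
    # Layer 3: semantic similarity (embedding model in the real tool).
    if similarity(na, nb) >= sem_threshold:
        return "semantic"
    return None
```

Ordering the layers from cheapest to most expensive means the costlier similarity function only runs on pairs the earlier layers did not already catch, which is presumably the motivation for a layered design at corpus scale.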

cukereuse Detects 80% Duplicate Steps Across 1.1M Gherkin Instances in BDD Repos