Hi all,
I came across an interesting article [1] on patterns of code evolution. I got
curious, ran their analysis on the Lucene repo, and produced a breakdown of
lines of code per year and the chance of a line of code still existing after
5 years (see attached images).
I think the big drop in the first plot corresponds to Solr moving to its own
project. Not sure about the other jumps - maybe someone else has insight
there.
The second plot shows that there is relatively little churn in Lucene; 45% of
the code written 5 years ago is still around. For many modern projects, this
curve is a steeper exponential. In Lucene, it's closer to linear. The article
argues that this could point to better design and more modularity, so that
code doesn't need to be rewritten as often.
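For a rough sense of scale, here is a back-of-the-envelope sketch that turns
the 45%-after-5-years figure into an implied half-life. It is not the
article's tooling: the two function names are made up for illustration, and
both decay models (pure exponential, and a straight line through the same
point) are simplifying assumptions rather than fits to the actual curve.

```python
import math


def half_life_exponential(surviving_fraction: float, years: float) -> float:
    """Half-life assuming survival follows exp(-rate * t) through the given point."""
    rate = -math.log(surviving_fraction) / years
    return math.log(2) / rate


def half_life_linear(surviving_fraction: float, years: float) -> float:
    """Time to drop to 50% assuming a constant fraction of code is lost per year."""
    loss_per_year = (1.0 - surviving_fraction) / years
    return 0.5 / loss_per_year


if __name__ == "__main__":
    # 45% of the code surviving after 5 years, as read off the second plot.
    print(f"exponential model: {half_life_exponential(0.45, 5):.2f} years")
    print(f"linear model:      {half_life_linear(0.45, 5):.2f} years")
```

Under these assumptions both models put the half-life somewhere between 4 and
5 years; the difference between them matters more further out, where an
exponential curve flattens and a linear one keeps dropping.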
Just a fun thing to share!
Stefan
[1] https://erikbern.com/2016/12/05/the-half-life-of-code.html