Hacker News new | past | comments | ask | show | jobs | submit login

Efficiency is not the issue at this point. My prototype diffing algorithm was linear and there have been improvements on it already (I think something called "truediff" is linear but an order of magnitude better! I could be misremembering the name, don't quote me :) ).

The real difficult part is in how you represent AST-level changes, which will limit what your merging algorithm can do. In particular, working around moving "the same subtree" into different places is difficult. Imagine the following conflict:

([1,3], [4,2,5]) <-- q -- ([1,2,3], [4,5]) -- p --> ([1,3], [2,4,5])

Both p and q move the same thing to different places so they need a human to make a decision about what's the correct merge. Depending on your choice of "what is a change", even detecting this type of conflict will be difficult. And that's because we didn't add insertions nor deletions. Because now, say p was:

([1,2,3], [4,5]) -- p --> ([1,3], [2,5])

One could argue that we can now merge, because '4' was also deleted hence the position in which we insert '2' in the second list is irrelevant.

If we extrapolate from lists of integers to arbitrary ASTs the difficulties become even worse :)




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: