Saturday, May 10, 2014

Copying code != Copying implementation

A little over a week ago, I wrote an article about some common porting mistakes for a new library - LibreSSL. I've since received multiple questions about copying code from other projects. Mainly, if the original project uses certain code or another popular library is using some compatibility code, what is wrong with copying one of those directly?

The most obvious problem with copying from a compatibility library is that it may not be written properly. The primary concern behind most compatibility libraries is that things should compile. Just because things compile doesn't mean that errors and all scenarios are handled properly, or that the implementation is in any way secure. The popular libbsd and many GNU library shims may seem to get the job done, but only superficially.

However, the real problem is that copying some code does not mean you copied the implementation. There's a lot more to an implementation than just some code. For example, perhaps the source in question is being compiled with certain parameters or with special compilers, and the code is written specifically for that situation. If you copied the code but not the build environment, you only nabbed part of the implementation.
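One partial defense is for code to state its own build assumptions, so that a copy dropped into a different environment fails to compile instead of silently misbehaving. The following is only a sketch of that idea: the helper name load_be32 and the assumption being checked are made up for illustration and are not taken from any project mentioned here. Many build-time dependencies (optimization flags, special compilers) cannot be detected this way at all, which is exactly why they get lost.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint32_t */

/* Assumption this file was written under, checked at compile time (C11). */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

/* Hypothetical helper: read a big-endian 32-bit value from a byte buffer. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```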

Another point to consider is whether the code you copied is even used the way you think it is. Let's look at a function from NetBSD:

int
consttime_memequal(const void *b1, const void *b2, size_t len)
{
    const char *c1 = b1, *c2 = b2;
    int res = 0;

    while (len--)
        res |= *c1++ ^ *c2++;

    /*
     * If the compiler for your favourite architecture generates a
     * conditional branch for `!res', it will be a data-dependent
     * branch, in which case this should be replaced by
     *
     *     return (1 - (1 & ((res - 1) >> 8)));
     *
     * or rewritten in assembly.
     */
    return !res;
}

Note how the comment here says that this code may not function correctly on all architectures. In fact, some architectures may have an implementation in assembly which works correctly, whereas the C version does not.
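To make the branch-free idea in that comment concrete, here is a minimal sketch of my own, not NetBSD's code. The name consttime_memequal_sketch is hypothetical, and I use unsigned types so the wraparound in res - 1 is well defined. Note that the expression as quoted in the comment, 1 - (1 & ((res - 1) >> 8)), works out to 0 when res is 0, which is the opposite of !res, so the sketch drops the leading "1 -".

```c
#include <stddef.h>

/*
 * Sketch: compare two buffers without an early exit and without `!res`.
 * Returns 1 if the buffers are equal, 0 otherwise.
 */
int consttime_memequal_sketch(const void *b1, const void *b2, size_t len)
{
    const unsigned char *c1 = b1, *c2 = b2;
    unsigned int res = 0;

    /* Accumulate every difference; the loop never stops early. */
    while (len--)
        res |= *c1++ ^ *c2++;

    /*
     * res is in [0, 255]. If res == 0, res - 1 wraps to UINT_MAX, so
     * bit 8 is set. Otherwise res - 1 is in [0, 254] and bit 8 is clear.
     */
    return (int)(1 & ((res - 1) >> 8));
}
```

Even with this form, whether the compiler emits a data-dependent branch can only be confirmed by inspecting the generated code for the architecture and compiler in question.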

NetBSD, like most C libraries, offers assembly variants for certain functions on certain architectures. If you copy a C file from it without the assembly files for the architecture in question, then you did not copy the implementation, and what you did copy may even produce incorrect results. You can't always depend on scanning comments either, as previous versions of this C file didn't contain the comment.

Now, let's say you copied the files and their build environment, including possible assembly variants. Does that mean the implementation was definitively copied? No! Some functions used in the copied code may actually depend on certain characteristics within this particular implementation. Let me provide an example. Let's say consttime_memequal() was implemented as follows:

int consttime_memequal(const void *s1, const void *s2, size_t n)
{
    return !memcmp(s1, s2, n);
}

consttime_memequal() must run in a constant amount of time regardless of whether its two parameters are equal or not. Yet this version here wraps directly to memcmp(), which does not make such guarantees. Assuming this version was correct on the platform in question, it is correct because of an external factor, not because of the C code presented here.
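To see why a plain memcmp() offers no timing guarantee, consider a simple early-exit comparison. This is an illustration only: early_exit_equal and its examined parameter are hypothetical, and real memcmp() implementations vary widely and are often more elaborate. The sketch reports how many bytes it looked at, so the data dependence is visible without needing a timer.

```c
#include <stddef.h>

/*
 * Hypothetical early-exit comparison, written the way a simple memcmp()
 * might be. Stores the number of bytes examined in *examined.
 * Returns 1 if the buffers are equal, 0 otherwise.
 */
int early_exit_equal(const void *s1, const void *s2, size_t n, size_t *examined)
{
    const unsigned char *p1 = s1, *p2 = s2;
    size_t i;

    for (i = 0; i < n; i++) {
        if (p1[i] != p2[i]) {
            *examined = i + 1;  /* stopped at the first mismatch */
            return 0;
        }
    }
    *examined = n;              /* walked the whole buffer */
    return 1;
}
```

A mismatch in the first byte ends the loop after one iteration, while a mismatch in the last byte costs a full pass. That difference in work is what an attacker measures.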

Some compilers, such as GCC, may provide their own version of memcmp() which they'll use instead of the C library's. Perhaps some compiler provides memcmp(), along with a compile option to make all memcmp() operations constant-time. The file containing this function can then be compiled with constant-time-memcmp enabled, and now everything else can call consttime_memequal() without any special compilers or build options.

Now while the above is possible, it's not the only possibility. Perhaps memcmp() on the platform in question just happens to be constant-time for some reason. In such a scenario, even if you used the exact same compiler with the exact same build options, and the exact same code, your implementation is still incorrect, because your underlying functions behave differently.

Bottom line, just copying code is incorrect. If some function is supposed to be doing something with certain properties beyond just fulfilling some activity (perhaps constant time, or defying compiler optimizations), then you must review the entire implementation and determine what aspect of it gives it this property. Without doing so, you're only fooling yourself into believing you copied an implementation.

Implementation copying checklist:
  • Understand what you're attempting to copy, and what properties it carries.
  • Ensure what you're copying actually performs its objectives.
  • Ensure you copy the entire implementation.
  • Ensure there's nothing about the compiler, compile options, or other aspects of the build environment which you forgot to copy.
  • Ensure you copy any needed alternatives for certain architectures.
  • Ensure the code does not depend on different implementations of functions you already have.
  • Test everything for correctness, including matching output for error situations, and extra properties.
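For the last item, a small harness along these lines can catch a copy that gets the basic results wrong. It is a sketch under my own assumptions: copied_memequal is a stand-in for whatever function you copied, and count_mismatches is a hypothetical helper that compares its answers against memcmp() on a few edge cases. It checks correctness only. It says nothing about extra properties such as constant time, which still require looking at the generated code or measuring.

```c
#include <stddef.h>
#include <string.h>

/* Signature of the function under test. */
typedef int (*memequal_fn)(const void *, const void *, size_t);

/* Stand-in for the copied function; swap in the real one. */
static int copied_memequal(const void *b1, const void *b2, size_t len)
{
    const unsigned char *c1 = b1, *c2 = b2;
    unsigned int res = 0;

    while (len--)
        res |= *c1++ ^ *c2++;
    return res == 0;
}

/* Returns how many cases fn gets wrong, using memcmp() as the reference. */
static int count_mismatches(memequal_fn fn)
{
    static const unsigned char a[4] = { 0x00, 0x7f, 0x80, 0xff };
    unsigned char b[4];
    int bad = 0;
    size_t i;

    /* Zero length must compare equal. */
    bad += fn(a, a, 0) != 1;

    /* Identical contents must compare equal. */
    memcpy(b, a, sizeof b);
    bad += fn(a, b, sizeof a) != 1;

    /* Flip one bit at each position, including the first and last byte. */
    for (i = 0; i < sizeof a; i++) {
        memcpy(b, a, sizeof b);
        b[i] ^= 0x01;
        bad += fn(a, b, sizeof a) != (memcmp(a, b, sizeof a) == 0);
    }
    return bad;
}
```

Calling count_mismatches(copied_memequal) should return 0 for a correct copy.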


insane coder said...

On this topic, The Hidden Cost of Software Reuse may also be of interest.

umphy said...

Just out of curiosity, how applicable do you think this is to higher level programming languages?

insane coder said...

Extremely applicable. All the time I'm running into some JavaScript snippet which works great with the built-in JS standard library implementation on one browser, but not on another.

You'll run into these issues wherever you have different compilers, compiler options, or different standard library implementations for the language in question.

However, for many of the higher level programming languages, there's often only a single implementation, and little to no options, so you won't find these issues there at first. They begin to pop up once a second implementation is created and becomes popular.
