The Arm-js emulator you mention is probably faster: Unicorn is based on QEMU, which internally translates the guest machine code to an IR. After optimization, this IR usually gets compiled back to host machine code, but thanks to TCI [1] the IR can instead be interpreted directly anywhere, without needing a compiler backend. All these steps incur additional overhead that wouldn't be there if one interpreted ARM code directly.
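To make the contrast concrete, a direct interpreter is essentially just a fetch/decode/dispatch loop over the guest encoding, roughly like the toy sketch below (this is not arm-js's actual code; it handles only a couple of unconditional data-processing immediates and ignores condition codes and flags):

    // Fetch a 32-bit ARM word, decode a few fields, dispatch -- no intermediate IR.
    function stepArm(regs, mem, pc) {
      var insn = mem[pc >>> 2];                    // fetch (word-indexed memory)
      var rd   = (insn >>> 12) & 0xf;              // destination register
      var rn   = (insn >>> 16) & 0xf;              // first operand register
      var rot  = ((insn >>> 8) & 0xf) * 2;         // immediate rotation amount
      var imm8 = insn & 0xff;
      var imm  = ((imm8 >>> rot) | (imm8 << (32 - rot))) >>> 0;  // rotate right
      var op   = (insn >>> 21) & 0xf;              // data-processing opcode
      switch (op) {
        case 0xd: regs[rd] = imm; break;                      // MOV rd, #imm
        case 0x4: regs[rd] = (regs[rn] + imm) >>> 0; break;   // ADD rd, rn, #imm
        default: throw new Error("unhandled opcode " + op);
      }
      return pc + 4;
    }

    var regs = new Int32Array(16);
    var mem  = new Uint32Array([0xe3a0002a,   // mov r0, #42
                                0xe2801001]); // add r1, r0, #1
    var pc = 0;
    pc = stepArm(regs, mem, pc);
    pc = stepArm(regs, mem, pc);
    // regs[0] === 42, regs[1] === 43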
You say "you" multiple times, but I did not implement this other emulator: I simply found it yesterday while evaluating options for emulating armv7 on top of JavaScriptCore.
On the other hand: this CPU emulator is written in high-level JavaScript, and I think some of its control flow is implemented using exception handling, whereas Unicorn compiled to asm.js will get reconstituted by modern browsers into reasonable native code (and QEMU is designed to be fast). I don't think it is clear-cut which is faster.
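For context, asm.js is the statically-typed subset of JavaScript that Emscripten targets: every value is coerced to a known type and the only memory is a typed-array heap, which is why engines can compile it ahead of time to near-native code. A hand-written toy in that style (not Emscripten output, and it may not pass the strict validator, but it shows the shape):

    function ToyAsm(stdlib, foreign, heap) {
      "use asm";
      var HEAP32 = new stdlib.Int32Array(heap);  // the only memory: a typed-array view
      function sum(ptr, n) {
        ptr = ptr | 0;                           // coerce parameters to int32
        n = n | 0;
        var acc = 0;
        var i = 0;
        for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
          acc = (acc + (HEAP32[(ptr + (i << 2)) >> 2] | 0)) | 0;
        }
        return acc | 0;
      }
      return { sum: sum };
    }

    // Example use (plain JS fallback works too):
    //   var heap = new ArrayBuffer(0x10000);
    //   new Int32Array(heap).set([1, 2, 3]);
    //   ToyAsm(globalThis, null, heap).sum(0, 3)  // === 6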
(I would run a benchmark, but my use case actually involves interpreted JS, which changes the picture massively, and I had already concluded that Unicorn.js wasn't flexible enough for my purposes, so this is more of a curiosity than a question I would spend time answering myself.)
My bad, I realized my mistake and was editing my message right before your answer.
Indeed, the performance question is not clear-cut and probably depends a lot on the browser's JavaScript engine. I will try to find out at some point with a benchmark.
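A rough shape such a micro-benchmark could take, where runWithArmJs and runWithUnicornJs are hypothetical placeholders (neither emulator's API is shown here) that each emulate the same ARM block once:

    function bench(label, run, iterations) {
      run();                                   // warm up the JIT first
      var t0 = performance.now();
      for (var i = 0; i < iterations; i++) run();
      var t1 = performance.now();
      console.log(label + ": " + ((t1 - t0) / iterations).toFixed(3) + " ms/iteration");
    }

    // bench("arm-js", runWithArmJs, 100);
    // bench("unicorn.js", runWithUnicornJs, 100);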
[1] http://wiki.qemu.org/Features/TCI