Finally figured out that ntfs-3g is faster than ntfsmount for only
3 reasons:
1) the noatime option is turned on by default
2) ntfs-3g builds without debug output by default
3) the only real optimization: attributes are almost always added as resident.
However, the patch in ntfs-3g for 3) accidentally breaks several code paths
(why am I not surprised?), so I rewrote the whole ntfs_attr_add() logic.