Pid locking needs a different lockfile-version: MDB_env's with and
without pid locking must not coexist, they can sabotage each other.
Store MDB_LOCK_FORMAT = (version | "use locking" flag) instead.
Treat unexpected errors as "don't know". Invert Pidcheck return
value, so nonzero including error codes = "the process may exist".
On Windows: Catch exited but still existing processes. Handle
undefined PROCESS_QUERY_LIMITED_INFORMATION.
On Unix: don't trust F_GETLK error to leave the input alone,
the fcntl() doc seems unclear.
Whenever we enter cursor_set() the sub-cursor's flag must be
cleared. If the new cursor position has valid subdata it will
be initialized again, if not then the sub-cursor has nothing
to point to.
(Restructuring for upcoming mdb_page_spill work.)
mdb_freelist_save() can't just Get() the destination, since
mdb_page_spill() may have put the destination in the read-only map.
TODO: Can this new put() modify the freelist, which would break it? The
final iteration's put() can shorten the node, the rest uses MDB_CURRENT.
We could set P_KEEP on dirty freeDB leaves and ovpages, since they are
all about to be modified. But the code in this commit must stay anyway,
if mdb should support dropping a 256G DB. I.e. too big for dirty_list.
When collapsing root, must also move cursor index down,
not just the page pointer.
Also in mtest, break from NEXT loops on error, otherwise it just
prints the previous key/data again, which looks confusing.
If mdb_page_touch() sees a page in txn's dirty_list, that
is the page version txn's cursors should have. Fail if
the user may be seeing and depending on another version.
Restore mc_flags and xcursors, they were tracked but not merged.
Simplify: Track parent txn's original cursors after backing them
up, instead of tracking copies and merging them back at commit.
Page leak, mdb_page_alloc(). On error, don't shorten me_pghead.
Memleak, mdb_ovpage_free(). Free page or keep it in dirty_list.
Bad MIDL, mdb_midl_need(). Fix midl[-1] (allocated size).
Catch I/O errors. Do nothing between OS call failure and ErrCode().
Do not use errno after non-OS-errors like write() >= 0, which could
give a failure return of success (errno 0) or some irrelevant error
code. Drop seek calls, use pwrite/pread/Windows OVERLAPPED offset.