BEGIN{ open F,"Makefile"; our $top_dir=pop @{[split /\s/,(grep /^top_srcdir/,<F>)[0]]}; eval "use lib '$top_dir'"; close F; }
use My::Web;
require "CGI";
+use project::captive::doc::Macros;
My::Web->init(
"__PACKAGE__"=>__PACKAGE__,
- "title"=>'Captive NTFS doc',
- "head_css"=>"
-.productname { font-family: cursive; }
-.fname { font-family: monospace; }
-.constant { font-family: monospace; }
-.author { font-family: cursive; }
-.stuff { font-style: italic; font-size: larger; margin-left: 20%; margin-right: 10%; }
-.function { font-family: monospace; }
-.type { font-family: monospace; }
-.command { font-family: monospace; }
-.instruction { font-style: italic; }
-",
+ "title"=>'Captive NTFS Developer Documentation',
+ "head_css"=>$doc_Macros_head_css,
);
My::Web->heading();
-sub doc_img ($$)
-{
-my($img_base,$caption)=@_;
-
- my $r="";
- $r.='<table border="0" align="center">'."\n";
- $r.="\t<tr><td>".img($img_base,$caption)."</td></tr>\n";
- $r.="\t<caption>$caption</caption>\n";
- $r.='</table>'."\n";
- $r.=vskip "2ex";
- return $r;
-}
-
-sub captive_srcfile ($)
-{
-my($filename)=@_;
-
- a_href 'http://cvs.jankratochvil.net/viewcvs/*checkout*/priv/captive/'.$filename.'?rev=HEAD',$filename;
-}
-
-my $freespeech=a_href 'http://www.gnu.org/philosophy/free-sw.html','Free';
-my $freebeer=a_href 'http://www.gnu.org/philosophy/free-sw.html','free (as in beer)';
-
-sub productname
-{
-my($url,$name)=@_;
-
- return '<span class="productname">'.a_href($url,CGI::escapeHTML($name)).'</span>';
-}
-my $Wine=productname 'http://www.winehq.com/','Wine';
-my $ReactOS=productname 'http://www.reactos.com/','ReactOS';
-my $LinuxNTFS=productname 'http://linux-ntfs.sourceforge.net/','Linux NTFS';
-my $GnomeVFS=productname 'http://developer.gnome.org/doc/API/gnome-vfs/','Gnome-VFS';
-my $GnomeVFSmodule=productname 'http://developer.gnome.org/doc/API/gnome-vfs/modules.html','Gnome-VFS-module';
-my $gnulinux='GNU/Linux';
-
-
-# FIXME:
-# Compatibility with NT4 etc. - just legal reasons.
-
-# FIXME:
-# name="cache_manager" Cache Manager
-
-# FIXME:
-# name="sandbox"
-
-
print <<"HERE";
-<h1>Reasons for the Implementation</h1>
-
- <p>Currently none of the available $freespeech
- ($freespeech used in the following text in the meaning of
- "@{[ a_href 'http://www.gnu.org/philosophy/free-sw.html','free as in speech' ]}")
- operating systems can reliably write to the most common disk partition
- filesystem type – <span class="productname">Microsoft NTFS</span>. It would
- have been supported long ago, but no proper documentation of the
- <span class="productname">NTFS</span> filesystem data structures is available.
- Since <span class="productname">Microsoft</span> corporation continues in its
- propagation of <span class="productname">Microsoft Windows NT</span>
- (<span class="productname">NT</span> identifier used in the following text
- applies to all the products of <span class="productname">Microsoft</span>
- <span class="productname">NT</span> series such as
- <span class="productname">NT 4.0</span>,
- <span class="productname">2000</span> as NT-5.0
- and
- <span class="productname">XP</span> as NT-5.1.)
- based operating systems <span class="productname">NTFS</span> is the default
- disk file system type for vendor preinstalled <span class="productname">Microsoft Windows</span>.
-
- <p>Unfortunately the <span class="productname">NTFS</span> filesystem data
- structures are too complex to allow complete reverse engineering in a
- reasonable time. Currently available $freespeech solutions such as the $LinuxNTFS
- filesystem have already implemented reliable reverse
- engineered read-only access. However <a name="reliability">reliable</a>
- read-write access would require much better
- knowledge of the <span class="productname">NTFS</span> data structures.
- Currently only rewriting of already existing file data blocks is supported
- by $LinuxNTFS — no file creation, no file deletion, no directory operations etc.
- Also any future versions of <span class="productname">NTFS</span> filesystem
- would require another major reverse engineering effort.</p>
-
-
-<h1>Challenges of the Project</h1>
-
- <p>The <a name="NTFSgoal">ultimate goal</a> of this project is definitely the
- free implementation of @{[ a_href '#reliability','reliable' ]} read-write <span
- class="productname">NTFS</span> filesystem driver. This project chose to
- solve this problem in the style of $Wine project by using the original binary
- <span class="fname">ntfs.sys</span> and emulating all the required layers of
- <span class="productname">Microsoft Windows NT</span> for it.</p>
-
- <p>Unfortunately this effort is tainted by only partial and generally
- insufficient documentation of API between filesystem driver
- (<span class="fname">ntfs.sys</span>) and the
- <span class="productname">Microsoft Windows NT</span>
- ("@{[ a_href 'http://mail.gnu.org/archive/html/libtool/2000-09/msg00000.html','W32' ]}"
- in the following text) kernel <span class="fname">ntoskrnl.exe</span>. Note
- that this API is different from the one used in the $Wine project
- since <span class="productname">Wine</span> implements only the user space
- part of W32.</p>
-
-
-<h1>Architecture</h1>
-
- <p>The principle of the
- project lies in the glue between
- <span class="productname">Microsoft Windows NT</span> kernel space
- environment and $gnulinux user space process environment:</p>
-
- @{[ doc_img 'arch-W32','Microsoft Windows Subsystems Architecture' ]}
- @{[ doc_img 'arch-captive','Captive Subsystems Architecture' ]}
-
- <a name="existing_emulation"><h2>Existing Emulation Projects</h2></a>
-
- <p>There were two well-known $freespeech projects emulating W32 subsystems
- to reach the compatibility with various W32 components:
- $Wine and $ReactOS. Unfortunately the goals of this project do not fit
- well into either of them. Therefore this project went
- its own way of emulation:</p>
-
- <table align="center" border="1">
- <tr>
- <th>@{[ a_href '#guestosnote','Guest-OS' ]}</th>
- <th>@{[ a_href '#hostosnote' ,'Host-OS' ]}</th>
- <th>Implements</th>
- <th>W32 kernel library</th>
- </tr>
- <tr>
- <td>$Wine</td>
- <td>$gnulinux</td>
- <td>W32 user space</td>
- <td><span class="fname">ntdll.dll</span></td>
- </tr>
- <tr>
- <td>$ReactOS</td>
- <td><span class="constant">i386</span> hardware</td>
- <td>W32 kernel and user space</td>
- <td><span class="fname">ntoskrnl.exe</span></td>
- </tr>
- <tr style="height: 1ex;"></tr>
- <tr>
- <td>this project</td>
- <td>$gnulinux</td>
- <td>W32 kernel</td>
- <td><span class="fname">ntoskrnl.exe</span></td>
- </tr>
- <caption>Emulation Projects Characteristics</caption>
- </table>
-
- <dl>
- <a name="guestosnote"><dt>Guest-OS</dt></a>
- <dd>@{[ a_href 'http://www.vmware.com/support/reference/common/glossary/#guestos','Guest OS' ]}:
- An operating system that runs inside a virtual machine.</dd>
- <a name="hostosnote" ><dt>Host OS</dt></a>
- <dd>@{[ a_href 'http://www.vmware.com/support/reference/common/glossary/#hostos' ,'Host OS' ]}:
- An operating system that runs on the host machine.</dd>
- </dl>
-
- <p>While $ReactOS provides the necessary W32 kernel subsystem emulation
- code we also need to run such @{[ a_href '#guestosnote','Guest-OS' ]} in the
- @{[ a_href '#hostosnote','Host-OS' ]} $gnulinux. Initially it was planned to
- extend $Wine with the W32 kernel space emulation functionality but
- fortunately <span class="author">Steven Edwards</span> pointed to the $ReactOS
- which better suits the needs of this project by its already implemented W32
- kernel space emulation.</p>
-
- <p>The <a name="reactos_nocare">original reasons</a> for developing
- $ReactOS still make no sense to the author of this project. Free
- implementation of W32 platform standalone running on the machine hardware
- is no longer free as most of the W32 applications are usually closed source
- and the user still loses their freedom on the application level anyway. Even
- in the case of available free applications there still remains the
- disadvantage of losing the Host-OS platform availability if implemented in
- the $Wine style. Because of these ideological incompatibilities not much effort was
- made toward acceptance of the fixes and improvements of $ReactOS by this project.
- Moreover new functionality is not implemented in the $ReactOS part;
- it is coded in Gnome style in the project-specific source files
- instead.</p>
-
- <p>The most serious problem of $ReactOS is its dependence on the direct
- <span class="constant">i386</span> hardware instead of some
- @{[ a_href '#hostosnote','Host-OS' ]} as required by the goals of this project.
- W32 is designed to be hardware-independent using its
- <span class="fname">hal.dll</span>. Unfortunately $ReactOS does not follow
- this design and thus various patches and replacements of its parts
- and its hardware-dependent code are needed. Despite this the $ReactOS code
- base is still a big asset for this project.</p>
-
- <p class="stuff">... and @{[ a_href 'http://www.reactos.com/','ReactOS' ]} cannot run on Linux!<br />
-
-
-
- <p>Some API functions are provided both by
- <span class="fname">ntdll.dll</span> and
- <span class="fname">ntoskrnl.exe</span> in W32.
- <span class="author">Casper Hornstrup</span> pointed out that the calling
- conventions of such functions have to be differentiated as
- <span class="fname">ntdll.dll</span> lives in the user space (low address
- space – below <span class="constant">0x80000000</span>) and
- <span class="fname">ntoskrnl.exe</span> in the kernel space (high address
- space – above <span class="constant">0x80000000</span>). Although they
- contain slightly different sets of symbols (functions)
- <span class="fname">ntdll.dll</span> still can be considered as a user
- space interface to the kernel space implementation by
- <span class="fname">ntoskrnl.exe</span>.</p>
-
- <p>Currently there are
- no plans to ever extend the project's portability beyond the
- <span class="constant">i386</span> processor
- (<span class="constant">i386</span> used here as
- @{[ a_href 'http://www.intel.com/','Intel' ]} architecture covering 32-bit
- processors compatible with <span class="constant">i386</span>,
- <span class="constant">i486</span>, ...).</p>
-
- <h2>API Function Implementation Choices</h2>
-
- <p>At the start of the project development all the API
- functions were defined as unimplemented, of course. Any call of such an
- unimplemented function is fatal and results in program termination. When we
- need to implement a required API function we have multiple choices for doing
- so:
- @{[ a_href '#functype_pass','Direct pass to original <span class="fname">ntoskrnl.exe</span>' ]},
- @{[ a_href '#functype_wrap','Wrap of the original <span class="fname">ntoskrnl.exe</span> function' ]},
- @{[ a_href '#functype_native_reactos','Native implementation – $ReactOS' ]},
- @{[ a_href '#functype_native_wine','Native implementation – $Wine' ]}
- or
- @{[ a_href '#functype_native_libcaptive','Native implementation – project specific' ]}.
- <!-- a_href '#functype_undef','Undefined function' -->
-
- <h2>"patched" vs. "unpatched" Libraries</h2>
+<h1>Captive NTFS Developer Documentation</h1>
- <p>A library is called <span class="constant">patched</span> if we require
- loading its original binary code file. The project needs to patch it to be able
- to trap all the function entry points. The only currently
- <span class="constant">patched</span> library of this project is
- <span class="fname">ntoskrnl.exe</span>.</p>
- <p>A library is called <span class="constant">unpatched</span> if no original
- binary code is needed since all of its functions are completely emulated by
- @{[ a_href '#functype_native','the native implementations' ]} of this project.
- The typical <span class="constant">unpatched</span> representative is
- <span class="fname">hal.dll</span> as it specializes in the hardware
- dependent code and therefore it must be completely replaced by this project
- running in the $gnulinux operating system environment. Early versions of
- this project also had a full <span class="constant">unpatched</span>
- <a href="#native_ntoskrnl">native implementation of
- <span class="fname">ntoskrnl.exe</span></a> but this no longer applies.</p>
+<ul>
- <h2>Memory Management</h2>
-
- <p>Original <span class="productname">Microsoft Windows NT</span>
- architecture uses two address space areas – user space and kernel space.
- User space is mapped in the range <span class="constant">0x00000000</span>
- to <span class="constant">0x7FFFFFFF</span>, kernel space is mapped in the
- range <span class="constant">0x80000000</span>
- (<span class="constant">KERNEL_BASE</span> in $ReactOS sources) to
- <span class="constant">0xFFFFFFFF</span>. All these virtual memory ranges
- represent addresses after their MMU (Memory Management Unit) mapping, of
- course. More discussion can be found in the
- <a href="http://www.microsoft.com/hwdev/platform/server/PAE/PAEmem.asp">description
- by <span class="productname">Microsoft</span></a>.</p>
-
- <p>This project runs in the virtual address space used both for the UNIX
- user space process part and for the W32 kernel space. Therefore this
- project defines that W32 kernel runs in the whole range
- <span class="constant">0x00000000</span> to
- <span class="constant">0xFFFFFFFF</span> since there are no special mapping
- assumptions about the UNIX user space process mapping. No W32 user space
- exists in this project. Such approach also nullifies any special memory
- moving operations between W32 kernel space and W32 user space memory areas
- (such as <span class="function">MmSafeCopyToUser()</span>).</p>
-
- <h2>Unicode Strings and Characters</h2>
-
- <p>The W32 platform uses a 16-bit <span class="type">wchar_t</span> type while $gnulinux uses a
- 32-bit one. This can be a problem when GCC (GNU C Compiler)
- compiles a combination of native UNIX C sources (assuming 32-bit
- GCC with 32-bit <span class="type">wchar_t</span>) and
- $ReactOS C sources (assuming W32 compiler with 16-bit
- <span class="type">wchar_t</span>) for literal wide strings
- (C source file syntax: <span class="command">L"wstring"</span>).
- Possible solutions to this issue:</p>
+<li><a href="About.html.pl">About</a>
+ <ul>
+ <li><a href="About.html.pl#reasons">Reasons for the Implementation</a></li>
+ <li><a href="About.html.pl#challenges">Challenges of the Project</a></li>
+ <li><a href="About.html.pl#versions">Microsoft Windows Versions Compatibility</a></li>
+ </ul></li>
+<li><a href="Architecture.html.pl">Architecture</a>
+ <ul>
+ <li><a href="Architecture.html.pl#existing_emulation">Existing Emulation Projects</a></li>
+ <li><a href="Architecture.html.pl#law">Laws and Licensing Conditions</a>
<ul>
- <li>
- <p>Use the <span class="constant">-fshort-wchar</span> GCC option and
- strictly differentiate between compilation of
- <span class="productname">ReactOS</span> code and UNIX code.</p>
-
- <p>pros: No source modifications needed, no runtime performance hit.</p>
-
- <p>cons: No type checking if some part of code has bad compilation
- flags, complicated way to completely split
- <span class="productname">ReactOS</span> and UNIX code.</p>
- </li>
- <li>
- <p>Wrap all <span class="productname">ReactOS</span> literal constants
- in a conversion function call (implemented as the macro
- <span class="function">REACTOS_UCS2()</span> by this project).</p>
-
- <p>pros: Any forgotten/mistaken conversions are type-checked and warned
- during the compilation by GCC.</p>
-
- <p>cons: All compiled <span class="productname">ReactOS</span> source
- files containing literal wide strings have to be wrapped/modified,
- performance hit by runtime string conversions.</p>
-
- <p>This solution was chosen to get the internal sanity checking
- benefit.</p>
- </li>
- </ul>
-
- <h2>Supported Binary Formats</h2>
-
- <p>The native W32 binary format is identified as
- <span class="constant">PE-32</span> (Portable Executable 32-bit), such
- files have all the usual extensions such as
- <span class="fname">.sys</span>, <span class="fname">.exe</span>,
- <span class="fname">.dll</span> etc. <span class="constant">PE-32</span>
- loading support was already implemented by $ReactOS, its memory mapping
- specifics just had to be ported to $gnulinux environment by this project.
- This loading support does not (yet) cover importing of debug symbols from
- W32 <span class="fname">.PDB</span> (Program DataBase) files in $gnulinux
- ABI (Application Binary Interface) compatible way.</p>
-
- <p>This project also supports transparent loading of UNIX
- <span class="fname">.so</span> (Shared Object file) binary format. If you
- have W32 source files for some W32 library you can try to compile it by GCC
- to get the shared library with $gnulinux ABI compatible debug information
- (GCC option <span class="constant">-ggdb3</span> recommended). Beware of
- possible compilation problems as <span class="productname">Microsoft</span>
- C code expects <span class="constant">exception</span> handling to be
- supported by the compiler (definitely not the case of the plain C compiler
- of GCC) — all the exception catching code should be discarded as any
- @{[ a_href '#exception_fatal','generated exceptions are always fatal' ]} when
- such driver is running in the scope of this project. You can use the
- following script of this project to compile W32 filesystem source files as
- UNIX <span class="fname">.so</span>:
- @{[ captive_srcfile 'src/w32-mod/ext2fsd.so-build.sh' ]}</p>
-
- <p>Be aware of some differences between a
- <span class="constant">PE-32</span> binary format file and a
- <span class="fname">.so</span> format file.
- <span class="constant">PE-32</span> uses the appropriate W32 specific
- @{[ a_href '#calltype','cdecl/stdcall/fastcall call types' ]},
- <span class="fname">.so</span> must be completely compiled in the standard
- UNIX @{[ a_href '#calltype_cdecl','cdecl call type semantics' ]}.
- @{[ a_href '#functype_native','Native function implementations' ]} do not need
- to be explicitly exported by <span class="fname">captivesym</span> as they
- are resolved automatically by the UNIX dynamic system linker. It may be
- surprising that you will have to fix all such missing symbol exports when you
- advance during the development from the debugging
- <span class="fname">.so</span> file to the production version of the
- original <span class="constant">PE-32</span> binary file.</p>
-
- <h2>Reverse Engineering</h2>
-
- <p>This project has no intention to reverse engineer and document the
- filesystem data structures themselves since they are encapsulated by
- the filesystem driver. For this reason the resources available in
- projects such as $LinuxNTFS are of no use to this project. The goal of this project
- is to provide a fully compatible API interface to the rest of the W32 system
- to persuade the filesystem driver it is running in the native
- <span class="productname">Microsoft Windows XP</span> environment.</p>
-
- <p>All the W32 filesystem drivers run in the W32 kernel address
- space and this area of the W32 API is not much documented by
- <span class="productname">Microsoft</span>. Some API functions are not
- documented at all and the others are documented insufficiently for their
- possibly needed reimplementation from scratch. The documentation
- consulted primarily consists of
- <span class="productname">@{[ a_href 'http://msdn.microsoft.com/library/default.asp?url=/library/en-us/kmarch/hh/kmarch/kmhdr_6enb.asp','MSDN (Microsoft Developer Network) Kernel-Mode Driver Architecture: Windows DDK' ]}</span>
- documentation and also various other 3rd party documentation resources such as
- <span class="productname">@{[ a_href 'http://www.osr.com/ntinsider/1996/cacheman.htm',
- 'The NT Cache Manager Description' ]}</span>,
- <span class="productname">@{[ a_href 'http://www.winntmag.com/Articles/Print.cfm?ArticleID=3864',
- 'Learn About NT'."'".'s File-system Cache' ]}</span>,
- <span class="productname">@{[ a_href 'http://www.ntfsd.org/archive/',
- 'NT File System Developers mailing list archives' ]}</span>
- including various
- @{[ a_href 'http://www.google.com/search?q=site%3Amicrosoft.com','fulltext searches' ]}
- through Internet from case to case.</p>
-
- <p>Sometimes no sufficient documentation was found and some code behaviour
- had to be reverse engineered directly from the binaries of
- <span class="fname">ntoskrnl.exe</span>,
- <span class="fname">cdfs.sys</span>,
- <span class="fname">fastfat.sys</span>
- and primarily
- <span class="fname">ntfs.sys</span>.
- Up to now the code was disassembled by
- <span class="productname">@{[ a_href 'http://www.simtel.net/pub/pd/29498.html','IDA Freeware' ]}</span>
- and by
- <span class="productname">dumpbin.exe</span> of
- <span class="productname">Microsoft Visual Studio</span>.
- <span class="productname">dumpbin.exe</span> is fortunately able to
- interpret debug symbols from W32 <span class="fname">.PDB</span>
- (Program DataBase) debug information files.</p>
-
- <h3><span class="productname">dumpbin.exe</span>:</h3>
-
- <p>You should use the following options for
- <span class="productname">dumpbin.exe</span>:</p>
-
- <blockquote class="command">
- <p>dumpbin.exe /all /rawdata:none /disasm /pdbpath:verbose FILENAME.SYS</p>
- </blockquote>
-
- <p>You should see the following line in the output:</p>
-
- <blockquote class="command">
- <p>PDB file found at '.\\FILENAME.pdb'</p>
- </blockquote>
-
- <h3><span class="productname">WinDbg</span> Windows NT kernel debugging</h3>
-
- <p><span class="productname">WinDbg</span> is downloadable from:
- @{[ a_href 'http://www.microsoft.com/whdc/ddk/debugging/installx86.mspx' ]}</p>
-
- <p>This is (the only?) tool able to debug filesystem drivers incl.
- <span class="fname">ntfs.sys</span>. You will need two computers running
- <span class="productname">Microsoft Windows</span> — one computer will run
- <span class="productname">WinDbg</span> while the other one will be
- frozen in remote Windows NT kernel debug mode. It does not matter which
- <span class="productname">Microsoft Windows</span> version will be run
- on the <span class="productname">WinDbg</span> side.</p>
-
- <p>The easiest way to set up two computers is to use the commercial
- <span class="productname">@{[ a_href 'http://www.vmware.com/download/workstation.html','VMware Workstation' ]}</span>
- where you can run two virtual machines simultaneously on a single PC
- and connect them by a virtual serial port provided by
- <span class="productname">VMware</span>.</p>
-
- <h4><span class="productname">WinDbg</span> side setup</h4>
-
- @{[ doc_img 'ntdebug-vmware-windbg',
- '<span class="productname">VMware</span> virtual serial port'
- .' of <span class="productname">WinDbg</span> side' ]}
-
- <p>You should set up <span class="productname">WinDbg</span> according
- to:</p>
-
- @{[ doc_img 'ntdebug-windbg-port','Port settings of <span class="productname">WinDbg</span>' ]}
- @{[ doc_img 'ntdebug-windbg-sym','Symbols files location of <span class="productname">WinDbg</span>' ]}
-
- <p><span class="constant">Symbols</span> should point to the directory containing
- the files extracted from the symbol archive for your version of
- <span class="productname">Microsoft Windows</span>. In the case of the
- recommended <span class="productname">Microsoft Windows XP Service Pack 1 Checked Build</span>
- you should use:
- @{[ a_href 'http://msdl.microsoft.com/download/symbols/packages/windowsxp/xpsp1sym_x86_chk.exe' ]}</p>
-
- <blockquote class="command">
- <p># Rename xpsp1sym_x86_chk.exe contents .pdb files for WinDbg<br />
- @{[ CGI::escapeHTML(q{for i in *.pdb*;do ext="`echo $i|sed 's/^.*\.pdb\.\(.*\)$/\1/'`";if [ "$i" = "$ext" ];then echo "BAD:$i";break;fi;base="`echo $i|sed 's/\(\.pdb\)\..*$/\1/'`";echo "md $ext";echo "move /-y $i $ext\\$base";done|sort -u|sed 's/$/'`echo -ne '\r'`'/g' >/tmp/rename.bat}) ]}</p>
- </blockquote>
-
- <p>The resulting <span class="command">rename.bat</span> for
- <span class="command">xpsp1sym_x86_chk.exe</span> can be found at:
- @{[ a_href 'xpsp1sym_x86_chk-rename.bat.zip' ]}</p>
-
- <p>The resulting directory should contain at least
- <span class="command">sys\\ntfs.pdb</span>
- and
- <span class="command">exe\\ntoskrnl.pdb</span>.</p>
-
- <p>Your successfully connected target (after the steps described
- below) should look like:</p>
-
- @{[ doc_img 'ntdebug-windbg-boot','Successfully connected <span class="productname">WinDbg</span>' ]}
-
- <h4>Setup of the side being kernel-debugged</h4>
-
- @{[ doc_img 'ntdebug-vmware-xpdebug',
- '<span class="productname">VMware</span> virtual serial port'
- .' of the side being kernel-debugged' ]}
-
- <p>You must use the following options in your
- <span class="command">c:\\boot.ini</span> command-line:</p>
-
- <blockquote class="command">
- <p>/debug /debugport=COM1 /baudrate=115200</p>
- </blockquote>
-
- <p>After booting this <span class="command">boot.ini</span>-entry
- should freeze at this point
- (if no <span class="productname">WinDbg</span> is waiting in the other
- virtual machine):</p>
-
- @{[ doc_img 'ntdebug-wait','Side being kernel-debugged waiting for <span class="productname">WinDbg</span>' ]}
-
-
- <a name="law"><h2>Laws and Licensing Conditions</h2></a>
-
- <p>If you are an <span class="productname">authorized user</span> of
- <span class="productname">Microsoft Windows NT</span> the laws in some
- countries give you the right to fully handle the product in any way you
- want. Therefore you can disassemble the product even in the case you had
- to agree with the product license forbidding such disassembly as the
- country laws override any such license agreement.</p>
-
- <h3>Microsoft Service Pack</h3>
-
- <p>Sometimes you may have the legal license for
- <span class="productname">Microsoft Windows NT</span>
- but for various technical reasons you do not have the media and/or
- installation ready at the place of intended use of this project.</p>
-
- <p>Fortunately <span class="productname">Microsoft</span> provides
- $freebeer update packages for its
- <span class="productname">Microsoft Windows</span> products called
- <span class="productname">Service Packs</span>; the latest one is
- <span class="productname">@{[ a_href 'http://www.microsoft.com/WindowsXP/pro/downloads/servicepacks/sp1/checkedbuild.asp','Microsoft Windows XP Service Pack 1a' ]}</span>.</p>
-
- <p>This downloadable file contains the full versions of the essential
- files needed for the current stage of this product:
- <span class="fname">ntfs.sys</span>
- and
- <span class="fname">ntoskrnl.exe</span>.
- It even contains
- <span class="fname">cdfs.sys</span> and
- <span class="fname">fastfat.sys</span> for testing purposes.</p>
-
- <p><span class="productname">Service Pack</span> also contains
- EULA (End User License Agreement) paper disallowing any use of
- <span class="productname">Service Pack</span> outside its original
- intentions. According to the laws of some countries you need to be an
- <span class="productname">authorized user</span> of the
- <span class="productname">Microsoft Windows XP</span> product to be
- allowed to use the files contained in such
- <span class="productname">Service Pack</span> without the bindings of its
- EULA. Even the interpretation of such laws may vary.</p>
-
- <p>It would be a breach of the law by the project author to provide
- automatic (=hidden) functionality to download and extract the
- <span class="productname">Service Pack</span> files. On the other hand it
- is perfectly legal to ask the user for his/her confirmation whether he/she is
- really an <span class="productname">authorized user</span> of the
- <span class="productname">Microsoft Windows XP</span> product and
- download/extract the <span class="productname">Service Pack</span> files
- accordingly.</p>
-
- @{[ doc_img 'captive-install-acquire-ask','Microsoft Windows Drivers Acquire Affirmation' ]}
-
- <h2>Project Architecture</h2>
-
- @{[ doc_img 'dia/arch-all','Project Components Architecture' ]}
-
- <p>Most of the work of this project is located in the single box called
- "<span class="constant">libcaptive</span>" located in the center
- of the scheme. This component implements the core W32 kernel API by
- various methods described in this document.
- The "<span class="constant">libcaptive</span>" box cannot be
- further dissected as it is just an implementation of a set of
- @{[ a_href 'http://cvs.jankratochvil.net/viewcvs/*checkout*/priv/captive/src/libcaptive/ke/exports.captivesym?rev=HEAD',
- 'API functions' ]}.
- It could be separated to several subsystems such as the
- @{[ a_href '#cache_manager','Cache Manager' ]},
- Memory Manager, Object Manager, Runtime Library, I/O Manager
- etc. but they have no interesting referencing structure.</p>
-
- <p>As this project is in fact just a filesystem implementation every
- story must begin at the device file and end at the filesystem operations
- interface. The unified supported interfaces are the
- <span class="productname">@{[ a_href 'http://developer.gnome.org/doc/API/2.0/glib/','GLib' ]}</span>
- (the lowest-level portability, data-types and utility library for Gnome)
- <span class="type">GIOChannel</span> (for the device access) and the custom
- <span class="constant">libcaptive</span> filesystem API. Each of these ends
- can be connected either to some direct interface (such as the
- <span class="constant">captive-cmdline</span> client),
- @{[ a_href 'http://lufs.sourceforge.net/lufs/','Linux Userland File System (LUFS)' ]}
- or as a general $GnomeVFS filter.
- @{[ a_href 'http://lufs.sourceforge.net/lufs/','LUFS' ]} will be used in
- most cases as it offers the standard filesystem interface of the Linux kernel.
-
- You can also use $GnomeVFS as it offers a nice filter interface on
- the UNIX user-privileges level for transparent operation with archives and
- network protocols. This filter interface was used by this project to turn
- a device reference such as <span class="fname">/dev/hda3</span> or <span
- class="fname">/dev/discs/disc0/part3</span> into a fully accessible
- filesystem (pretending to be an "archive" in the device
- reference). This device access can be specified by $GnomeVFS URLs such as:
- <span
- class="fname">file:///dev/hda3#captive-fastfat:/autoexec.bat</span></p>
-
- <span class="constant">captive-bug-replay</span> serves just for debugging
- purposes — you can 'replay' existing
- <span class="fname">file.captivebug.xml.gz</span> which is automatically
- generated during a W32 filesystem failure. This bug report file contains
- all the touched data blocks of the device used at the moment of the
- failure. <span class="constant">captive-bug-replay</span> therefore
- emulates an internal virtual writable device from this bug report data.
-
- <p>If the user requests that the passed device reference be accessed
- either in <span class="dashdash">--ro</span> (read-only) mode or in
- <span class="dashdash">--rw</span> (full read-write) mode, no
- further device layers are needed. Only in the case of <span
- class="dashdash">--blind</span> mode is another layer involved to emulate
- a read-write device on top of the real read-only device by
- non-persistent memory buffering of all the write requests.</p>
-
- <span class="constant">sandbox commit buffer</span> is involved only if
- the @{[ a_href '#sandbox','sandboxing feature' ]} is active. It
- buffers any writes to the device during the sandbox run to prevent
- filesystem damage should the driver fail in the meantime. Once the
- filesystem is successfully unmounted this sandbox buffer can be
- <a name="safe_flush">safely flushed</a>
- to its underlying physical media. The buffer will be dropped
- in the case of filesystem failure, of course. The filesystem should be
- unmounted from time to time — it can be transparently unmounted and mounted
- by <span class="command">commit</span> of
- <span class="constant">captive-cmdline</span> custom client. Currently you
- cannot force remounting when using
- @{[ a_href 'http://lufs.sourceforge.net/lufs/','LUFS' ]} interface client
- but it will be remounted after approx each 1MB data written automatically
- due to @{[ a_href '#log_file_full','NTFS log file full' ]}.
-
- Now we need to transparently
- @{[ a_href 'http://cvs.jankratochvil.net/viewcvs/*checkout*/priv/captive/src/libcaptive/sandbox/sandbox.idl?rev=HEAD',
- 'connect' ]}
- the device interface of <span class="type">GIOChannel</span> type through
- @{[ a_href '#sandbox','CORBA/ORBit' ]} to the sandboxed slave.
-
- <p>Such a device is still only a UNIX-style GLib <span
- class="type">GIOChannel</span> at this point. As we need to supply it
- to the W32 filesystem driver, we must convert it to a W32 I/O Device
- capable of handling <span class="type">IRP</span>
- (<span class="constant">I/O Request Packet</span>; a structure holding the
- request and result data for any W32 filesystem or W32 block device
- operation)
- requests from its upper W32 filesystem driver. Such a W32 I/O Device can
- represent either the <span class="type">CD-ROM</span> or the
- <span class="type">disk</span> device type, as different W32 filesystem
- drivers require different media types; currently only
- <span class="fname">cdfs.sys</span> requires the
- <span class="type">CD-ROM</span> type.</p>
-
- <p>W32 media I/O Device is accessed from the W32 filesystem driver.
- The filesystem driver itself always creates volume object by
- <span class="function">IoCreateStreamFileObject()</span> representing the
- underlying W32 media I/O Device as the object handled by the
- filesystem driver itself. All the client application filesystem requests
- must be first resolved at the filesystem structures level, passed to the
- volume stream object of the same filesystem and then finally passed to the
- W32 media I/O Device (already implemented by this project as an
- interface to <span class="type">GIOChannel</span> noted above).</p>
-
- <p>The filesystem driver is called by the core W32 kernel implementation of
- <span class="constant">libcaptive</span> in a
- @{[ a_href '#synchronous','synchronous way' ]}, in a single-shot manner, instead of
- being re-entered several times while waiting for disk I/O completions as
- happens in the original
- <span class="productname">Microsoft Windows NT</span>.
- This single-shot synchronous behaviour is possible since all the needed
- resources (disk blocks etc.) can always be presented as instantly ready:
- their acquisition is handled by the @{[ a_href 'hostosnote','Host-OS' ]} outside of
- the W32 emulated @{[ a_href 'guestosnote','Guest-OS' ]} environment.
- Asynchronous access had to be supported for several cases needed only by
- <span class="fname">ntfs.sys</span>; parallel execution
- is emulated by GLib <span class="function">g_idle_add_full()</span>
- with <span class="function">g_main_context_iteration()</span> called during
- <span class="function">KeWaitForSingleObject()</span>.</p>
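
<p>The emulation of parallel execution during a wait can be pictured with
the following single-threaded C sketch. It deliberately does not use GLib:
the small work queue stands in for idle callbacks registered with
<span class="function">g_idle_add_full()</span>, and
<span class="function">queue_iterate()</span> stands in for one
<span class="function">g_main_context_iteration()</span> call. All the names
in the sketch are hypothetical and the real
<span class="constant">libcaptive</span> code may be organized quite
differently.</p>

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16  /* arbitrary capacity for this sketch */

typedef void (*work_fn)(void *arg);

struct work_item { work_fn fn; void *arg; };

/* Simple FIFO of deferred work (stand-in for registered idle callbacks). */
struct work_queue {
    struct work_item items[MAX_PENDING];
    size_t head, tail;  /* no wrap-around, enough for an illustration */
};

/* Stand-in for a W32 event object the driver waits on. */
struct event { int signaled; };

/* Defer a piece of "asynchronous" work; returns 0 on success. */
static int queue_add(struct work_queue *q, work_fn fn, void *arg)
{
    assert(fn != NULL);
    if (q->tail == MAX_PENDING)
        return -1;
    q->items[q->tail].fn = fn;
    q->items[q->tail].arg = arg;
    q->tail++;
    return 0;
}

/* Run one deferred item; returns 1 if something ran, 0 if queue is empty. */
static int queue_iterate(struct work_queue *q)
{
    if (q->head == q->tail)
        return 0;
    struct work_item item = q->items[q->head++];
    item.fn(item.arg);
    return 1;
}

/* Emulated wait: instead of blocking, keep running deferred work until
 * the event becomes signaled. Returns -1 if nothing is left to run,
 * because a single thread could then never be woken up. */
static int wait_for_event(struct work_queue *q, struct event *ev)
{
    while (!ev->signaled) {
        if (!queue_iterate(q))
            return -1;
    }
    return 0;
}

/* Example deferred work: completes the "I/O" by signaling the event. */
static void signal_event(void *arg)
{
    ((struct event *)arg)->signaled = 1;
}
```

<p>The point of the design is that the waiting code itself drives the
pending work, so no second thread is required to complete the request
being waited for.</p>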
-
- <p><span class="constant">libcaptive</span> offers the W32 kernel
- filesystem API to the upper layers. This is still not the API that common
- W32 applications are used to, as they use W32 libraries which in turn pass
- the call to the W32 kernel. For example
- <span class="function">CreateFileA()</span> is implemented by
- W32 libraries such as <span class="fname">kernel32.dll</span> as a relay
- interface for the kernel function
- <span class="function">IoCreateFile()</span> implemented by this
- project's <span class="constant">libcaptive</span> W32 kernel
- emulation component.</p>
-
- <p>As it would be very inconvenient to use the legacy, bloated and UNIX
- style unfriendly W32 kernel filesystem API this project offers its own
- @{[ a_href '#client_interface','custom filesystem API interface' ]} inspired by
- the $GnomeVFS client interface adapted to the specifics of W32 kernel API.
- This interface is supposed to be easily utilized by
- <a href="#client_interface_customapp">a custom application accessing
- the W32 filesystem driver</a>.</p>
-
- <p>@{[ a_href '#sandbox','CORBA/ORBit' ]} hits us again – we need to
- @{[ a_href 'http://cvs.jankratochvil.net/viewcvs/*checkout*/priv/captive/src/libcaptive/sandbox/sandbox.idl?rev=HEAD',
- 'translate' ]}
- the @{[ a_href '#client_interface','custom filesystem API interface' ]}
- out of the sandboxed slave to the UNIX space.</p>
-
- <p><span class="constant">captive sandbox master</span> covers
- any restarts of the sandboxed slave and the
- communication with it. It is also capable of transparently
- <a name="demultiplexing_master">demultiplexing single API operations</a>
- to its multiple connected sandbox slaves,
- as each of them handles
- @{[ a_href '#mounted_one','just one filesystem device' ]}.</p>
-
- <p>The rest of the story is not specific to this project, since it is
- a common UNIX problem: how to offer a UNIX filesystem implemented in user space
- as a generic system filesystem (as those are usually implemented only as
- components of the UNIX kernel).</p>
-
- <p>The filesystem service can be offered in several ways:</p>
-
- <dl>
- <dt>Custom client</dt>
- <dd>
- <p>One possibility would be to write
- <a name="client_interface_customapp">a custom client application</a>
- for this project, such as a file manager or a shell. Although it
- would implement the most appropriate user interface to the set of
- functions offered by this project (and the W32 filesystem API), it has the
- disadvantage of requiring special client software. An appropriate client is provided
- by this project as:
- <span class="fname">src/client/cmdline/cmdline-captive</span></p>
- </dd>
-
- <dt>@{[ a_href 'http://lufs.sourceforge.net/lufs/','Linux Userland File System (LUFS)' ]}</dt>
- <dd>
- <p>The most usable interface is the
- @{[ a_href 'http://lufs.sourceforge.net/lufs/','LUFS' ]} client
- provided by <span class="constant">liblufs-captivefs</span>.
- As @{[ a_href 'http://lufs.sourceforge.net/lufs/','LUFS' ]}
- already assigns a separate process to each filesystem mount, the
- @{[ a_href '#demultiplexing_master','demultiplexing feature' ]}
- is not utilized in this case.</p>
-
- <p>@{[ a_href 'http://lufs.sourceforge.net/lufs/','LUFS' ]}
- needs multiple operating threads (each UNIX kernel operation needs
- one free lufsd slot/thread, otherwise it fails immediately).
- As <span class="constant">libcaptive</span> is
- @{[ a_href '#synchronous','single-threaded' ]}, all the operations
- are always synchronized by
- <span class="constant">liblufs-captivefs</span>
- before they are passed over to <span class="constant">libcaptive</span>.</p>
- </dd>
-
- <dt>@{[ a_href '#offered_gnomevfs','Gnome-VFS' ]}</dt>
- <dd>
- <p>This client allows filesystem access without any
- involvement of the UNIX kernel from any $GnomeVFS aware client application
- (such as <span class="fname">gnome-vfs/tests/test-shell</span>).
- This @{[ a_href '#offered_gnomevfs','Gnome-VFS interface' ]} connects to the
- data flow of this project at two points: both as the lowest layer
- device image source and as the upper layer for the filesystem
- operation requests.</p>
- </dd>
- </dl>
-
- <p>Unimplemented and deprecated methods for providing filesystem
- service:</p>
-
- <dl>
- <dt>W32 filesystem in UNIX OS kernel</dt>
- <dd>
- <p>The real UNIX OS filesystem implementation must be completely
- implemented inside the hosting OS kernel. This requires special coding
- methods with limited availability of coding features and libraries.
- Also it would give the full system control to the untrusted W32
- filesystem driver code with possibly fatal consequences of yet
- unhandled W32 emulation code paths. It would benefit from the best
- execution performance but this solution was never considered a real
- possibility.</p>
- </dd>
-
- <dt>Custom NFS server</dt>
- <dd>
- <p>The common approach
- <a name="offered_NFS">of filesystem implementations</a>
- outside the UNIX OS kernel used to be a custom NFS server, usually running on the
- same machine as the NFS-connected client, as such an NFS server is usually
- an ordinary UNIX user space process. It would be possible to implement
- this project as a custom NFS server, but the NFS protocol itself
- has a lot of fundamental flaws and requires complicated code for backward
- compatibility.</p>
- </dd>
- </dl>
-
-
- <a name="mounted_one"><h2>At Most One Mounted Filesystem</h2></a>
-
- <p>The project technically supports only one (exactly one...) mounted
- filesystem device and only one filesystem driver. It would not be
- complicated to support multiple disks and multiple loaded filesystem
- modules, but as they would share the address space it would only bring
- possible complications to bug reports and to the bug solving
- itself. It was considered saner to support multiple W32
- mounted disks by completely separate project instances running in
- different UNIX processes and communicating from their sandboxes via the
- @{[ a_href '#sandbox','CORBA sandbox interface' ]}. This sandboxing
- feature is not yet deployed although its code is already prepared.</p>
-
- <p>The project also does not support any state cleanup that would make it possible to load
- filesystem <span class="constant">A</span>,
- clean up <span class="constant">A</span> and load a different
- filesystem <span class="constant">B</span> in the same process address
- space. This follows the same goal of avoiding the possible debugging
- complications noted above. Despite this you still must call the function
- <span class="function">captive_shutdown()</span> to flush all the pending
- filesystem buffers to the disk. After calling
- <span class="function">captive_shutdown()</span> the process address space is
- no longer usable for any further project operations and the process is
- expected to be terminated in a manner compatible with its driving
- @{[ a_href '#sandbox','CORBA sandbox interface' ]} control master.</p>
-
- <p>Each sandbox executing the untrusted W32 binary filesystem driver code
- is connected through its
- @{[ a_href '#sandbox','CORBA sandbox interface' ]} at the point of upper
- layer <span class="constant">libcaptive</span>-specific filesystem API, at
- the point of the bottom layer of <span class="type">GIOChannel</span>
- device access and also for transfers of GLib logging
- messages/warnings/errors out of the sandbox to the user.</p>
-
-
-<h1>Choice of the Emulation Methods</h1>
-
- <p>The intent of the project was to get reliable read-write access to
- <span class="productname">NTFS</span> partition. There are several possible
- ways to achieve that:</p>
-
- <h2>Virtual Machine Running the Original W32 Subsystem</h2>
-
- <p>Creating a virtual-hardware PC and running the original W32 binaries
- including their boot-loader etc. Disk device access would be passed as
- a virtual IDE disk (=hard disk drive). The file access API would be implemented
- either by special escaping out of the virtual machine by some trapped
- instruction while using the W32 file access API, or by using the standard W32
- SMB (Server Message Block) network access through some virtual network
- card. The latter network access solution is close to the currently available
- possibility of running a full-blown disk-sharing real
- <span class="productname">Microsoft Windows NT</span> inside a virtual
- machine emulator such as <span class="productname">VMware</span>.</p>
-
- <p>pros: Full compatibility due to fully native codebase.</p>
-
- <p>cons: Hard to debug, missing documentation of NT booting internals,
- possible problems by different PC virtual-hardware than expected by NT,
- requirement of fully installed
- <span class="productname">Microsoft Windows NT</span> product.</p>
-
- <a name="method_ntoskrnl"><h2>"ntoskrnl.exe" Inside Virtual Address Space</h2></a>
-
- <p>This solution was chosen by the project. The binary filesystem driver and
- also the <span class="fname">ntoskrnl.exe</span> binary file are required.
- Unfortunately <span class="fname">ntoskrnl.exe</span> expects native
- PC hardware, which is missing during regular UNIX user space process
- emulation; therefore such instructions must be trapped and emulated/ignored
- case by case.</p>
-
- <p>Also the <a name="init_ntoskrnl">initialization code of <span
- class="fname">ntoskrnl.exe</span></a> is not executed by this project since
- it expects to get full PC hardware access privileges, and thus some
- data structures do not get initialized by it (this needs to be trapped later at
- the runtime stage). Some of the missing initializations are solved by
- @{[ a_href '#functype_wrap','API functions wrapping' ]}.</p>
-
- <p>pros: Lightweight, easier to debug.</p>
-
- <p>cons: Possible incompatible emulation of
- <span class="fname">ntoskrnl.exe</span> parts, missing documentation needed
- for the implementation.</p>
-
- <h2>Filesystem Driver Inside Virtual Address Space</h2>
-
- <p>Unlike the @{[ a_href '#method_ntoskrnl','previous method' ]}, here we do not use
- even <span class="fname">ntoskrnl.exe</span>, as the complete kernel part of
- W32 is <a name="native_ntoskrnl">emulated from the project source
- files</a>. The <span class="fname">cdfs.sys</span> driver was successfully run
- in this manner in former versions of this project, but the possibility
- to run without <span class="fname">ntoskrnl.exe</span> was dropped since it
- had no licensing gains (you need the original
- <span class="productname">Microsoft Windows NT</span> files at least for
- the filesystem driver itself) and the emulation of undocumented parts
- reusable from the <span class="fname">ntoskrnl.exe</span> binary was
- a pain.</p>
-
- <p>pros: Lightweight, easier to debug.</p>
-
- <p>cons: Possible incompatible emulation of the whole
- <span class="fname">ntoskrnl.exe</span>, its missing documentation.</p>
-
-
-<h1>Implementation Details</h1>
-
- <a name="functype"><h2>API Function Implementation Choices</h2></a>
-
- <p>For each function exported by W32
- <span class="fname">ntoskrnl.exe</span> and imported and called by the
- filesystem driver a decision needs to be made to properly implement its
- functionality. Currently implemented functionality statistics are provided
- below:</p>
-
- <table border="1" align="center">
- <tr><th>Function type </th><th>Items</th><th>Portion</th></tr>
- <tr><td>@{[ a_href '#functype_pass','pass' ]} </td><td> 81</td><td> 26%</td></tr>
- <tr><td>@{[ a_href '#functype_wrap','wrap' ]} </td><td> 2</td><td> 0%</td></tr>
- <tr><td>@{[ a_href '#functype_native_reactos','native-ReactOS' ]}</td><td> 113</td><td> 36%</td></tr>
- <tr><td>@{[ a_href '#functype_native_libcaptive','native-own' ]} </td><td> 116</td><td> 38%</td></tr>
- <caption>Function Implementation Types Statistics</caption>
- </table>
-
- <p>As there are several choices to implement each function the usual
- attempts/investigations ordering is listed in the sections below.</p>
-
- <p>Special care must be taken for data-type symbols since they are
- referenced without the possibility of catching the code flow by some
- breakpoints (it would be possible only in some special access cases). Data
- export symbols of <span class="constant">unpatched</span> libraries must
- contain already prepared content at runtime. There is a problem
- with <span class="constant">patched</span> libraries, where it is necessary
- to also fully implement the data symbol as a
- @{[ a_href '#functype_native','native implementation' ]} since there is no
- possibility to @{[ a_href '#functype_pass','pass' ]} the data symbol instead of
- the original W32 data location, and therefore there will be two instances of
- such a data variable. As there will also be uncaught references to
- such a W32 data location from the <span class="constant">patched</span>
- library itself, such symbols should usually be only constants (such as
- <span class="constant">KeNumberProcessors</span>).</p>
-
- <p>W32 platform symbols export/import can be based either on the symbol
- name itself or it can be also exported and imported just by its
- identification number called <span class="constant">Ordinal</span>.
- Although it saves some jumptables file binary size it is currently no
- longer used by W32 binaries and this project also does not support such
- <span class="constant">Ordinal</span> symbol reference type at all.</p>
-
- <p>All the exporting magic is handled by custom script
- <span class="fname">captivesym</span> processing the definition file
- <span class="fname">@{[ a_href
- 'http://cvs.jankratochvil.net/viewcvs/*checkout*/priv/captive/src/libcaptive/ke/exports.captivesym?rev=HEAD',
- 'src/libcaptive/ke/exports.captivesym' ]}</span>
- to produce the intermediate relaying code
- <span class="fname">src/libcaptive/ke/exports.c</span>. For details of the
- <span class="fname">captivesym</span>-specific source file syntax please
- see its documentation:
- <span class="fname">@{[ a_href
- $W->{"top_dir"}.'/project/Pod2Html.html.pl?cvs=priv/captive/src/libcaptive/ke/captivesym.pl',
- 'src/libcaptive/ke/captivesym.pl' ]}</span>.</p>
-
- <a name="functype_pass"><h3>Direct Pass to Original "ntoskrnl.exe"</h3></a>
+ <li><a href="Architecture.html.pl#law_servicepack">Microsoft Service Pack</a></li>
+ </ul></li>
- <p>Simple (standalone) functions such as
- <span class="function">RtlTimeToSecondsSince1970()</span> can be simply
- passed to the original implementation in
- <span class="fname">ntoskrnl.exe</span> as they make no hardware access
- and they do not expect any special internal data structures to be set up
- in advance by an earlier library initialization. A common case are all
- the data structures utility functions such as
- <span class="constant">GenericTable</span> subsystem or
- <span class="constant">LargeMcb</span> handling.</p>
+ <li><a href="Components.html.pl">Project Components</a></li>
- <a name="functype_pass_fromunix"><h4>Pass from UNIX Code</h4></a>
-
- <p>Control flow begins in some standard UNIX code. Such code is always
- using @{[ a_href '#calltype_cdecl','cdecl call type' ]} for all its
- intracalls. <a href="#functype_native_reactos">Native functions
- compiled from <span class="productname">ReactOS</span> sources</a> use
- their own @{[ a_href '#calltype','cdecl/stdcall/fastcall' ]} declarations
- but these call type modifications are discarded during compilation for
- this project by the <span class="constant">LIBCAPTIVE</span>
- symbol.</p>
-
- <p>UNIX code calls <span class="function">FUNCTIONNAME()</span> relay
- from the generated UNIX jump table. Such relay will debug dump the
- passed arguments and finally pass the control to the original W32
- function code in the proper call type
- @{[ a_href '#calltype','cdecl/stdcall/fastcall' ]} for a given
- function.</p>
-
- <p>The original W32 code entry point is always trapped by a breakpoint,
- although the breakpoint is not needed during this specific direct pass from
- UNIX code to the original W32 implementation. The breakpoint still has
- to be there to catch other possible calls (such as intra-W32 calls)
- described later. There are several ways to place a breakpoint in
- the code. One way is to use the processor's hardware breakpoint support, but
- the number of such breakpoints is limited. Another way is to patch in the
- <span class="instruction">@{[ 'int $3' ]}</span> instruction, but it invokes
- the <span class="constant">SIGTRAP</span> signal handler, which conflicts with
- a possible debugger (<span class="productname">gdb(1)</span>)
- control. This project uses the <span class="instruction">hlt</span>
- instruction, which has a single-byte opcode just like
- <span class="instruction">@{[ 'int $3' ]}</span> and is a privileged
- instruction forbidden in UNIX user space code.
- <span class="instruction">hlt</span> raises the
- <span class="constant">SIGSEGV</span> signal, which can be resolved by
- a custom signal handler without any conflict with a possible
- debugger control; <span class="productname">gdb(1)</span> needs the
- following command to pass through such a
- <span class="constant">SIGSEGV</span> signal:</p>
-
- <blockquote class="command">
- <p>handle SIGSEGV nostop noprint pass</p>
- </blockquote>
-
- <p>When a breakpoint gets caught, we usually need to return to the
- running code. Unfortunately this is not directly possible because of the patched
- breakpoint opcode. The breakpoint cannot simply be removed upon return,
- as that would permanently lose control over the point of entry. Even if
- the return included faking the return address in the bottom
- stack frame to patch the breakpoint back during the later function exit, it
- still would not catch the inner calls of recursive
- functions. One working possibility would be to patch the
- original instruction back and perform a single step provided by the
- <span class="function">ptrace(2)</span> syscall. However, such a
- single step needs another controlling UNIX process and it would again
- conflict with debuggers such as
- <span class="productname">gdb(1)</span>. This project implements the
- single-step functionality by two consecutive breakpoints
- (<span class="instruction">hlt</span> instructions to be specific).
- The first two instruction addresses of a W32 function are called
- <span class="productname">slot #1</span> and
- <span class="productname">slot #2</span>; the length of the first
- function instruction has to be analyzed to get the right address of
- <span class="productname">slot #2</span>. When the first breakpoint is
- caught, it is necessary to patch the original instruction back and also
- patch another breakpoint in place of
- <span class="productname">slot #2</span>.
- During the <span class="productname">slot #2</span> breakpoint
- invocation the operation is reverted: the breakpoint is put
- back to <span class="productname">slot #1</span> and the instruction
- of <span class="productname">slot #2</span> is restored so that
- the execution of the function can continue.</p>
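
<p>The slot swapping can be followed on a plain byte buffer. The C sketch
below only simulates the patching: nothing is executed, no signal handler
is installed, and the length of the first instruction is supplied by the
caller instead of being analyzed. The structure and function names are
hypothetical and are not taken from the project sources; the only fixed
fact used is that <span class="instruction">hlt</span> is the single-byte
opcode 0xF4 on i386.</p>

```c
#include <assert.h>
#include <stddef.h>

#define OPCODE_HLT 0xF4  /* single-byte i386 hlt opcode */

/* Hypothetical bookkeeping for one trapped W32 function. */
struct patched_func {
    unsigned char *code;    /* function entry point, i.e. slot #1 */
    size_t first_insn_len;  /* offset of slot #2; given by the caller here */
    unsigned char saved1;   /* original byte at slot #1 */
    unsigned char saved2;   /* original byte at slot #2 */
};

/* Initial state: the entry point is trapped by hlt in slot #1. */
static void bp_install(struct patched_func *f)
{
    assert(f->first_insn_len > 0);
    f->saved1 = f->code[0];
    f->code[0] = OPCODE_HLT;
}

/* Slot #1 was hit: restore the first instruction so it can run,
 * and move the trap to slot #2. */
static void bp_hit_slot1(struct patched_func *f)
{
    assert(f->code[0] == OPCODE_HLT);
    f->code[0] = f->saved1;
    f->saved2 = f->code[f->first_insn_len];
    f->code[f->first_insn_len] = OPCODE_HLT;
}

/* Slot #2 was hit: the first instruction has already executed, so
 * re-arm slot #1 and restore slot #2 to let the function continue. */
static void bp_hit_slot2(struct patched_func *f)
{
    assert(f->code[f->first_insn_len] == OPCODE_HLT);
    f->code[f->first_insn_len] = f->saved2;
    f->code[0] = OPCODE_HLT;
}
```

<p>After both hits the buffer is back in its initial trapped state, so
a later call of the same function, including a recursive one, is caught
at slot #1 again.</p>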
-
- <p>W32 function will finish in its specific
- @{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]}, the control
- will return to the UNIX jump table relay which will debug dump the
- return value and it will finally pass the control back to the UNIX
- caller in the standard UNIX
- @{[ a_href '#calltype_cdecl','cdecl call type' ]}.</p>
-
- @{[ doc_img 'fig/functype_patched_pass_fromunix',
- 'Function Type: <span class="constant">pass</span> from UNIX Code' ]}
-
- <a name="functype_pass_fromw32"><h4>Pass from W32 Code</h4></a>
-
- <p>This function type is similar to the
- @{[ a_href '#functype_pass_fromunix','previous one' ]} with the exception
- of a more complicated entry point. Unfortunately W32 libraries call their
- own functions directly, using <span class="instruction">call</span>
- instructions without any patchable jump table. Even the
- <span class="instruction">call</span> argument itself cannot be patched
- according to the relocation table record, as such a library intra-call
- instruction has no relocation due to its relative argument offset on
- <span class="constant">i386</span>. This time the double-breakpoint
- mechanism @{[ a_href '#functype_pass_fromunix','described above' ]} comes in
- handy since it catches the entry point when the function gets
- called. The <span class="constant">SIGSEGV</span> handler gets invoked by
- the <span class="instruction">hlt</span> instruction and it
- redirects the control to the jump table relay function to debug dump the
- function entry arguments (it has no other uses in this call type).</p>
-
- <p>When the relay needs to call the original function it will reach
- exactly the same breakpoint instruction as during the recent
- <span class="constant">SIGSEGV</span> handling that redirected to this
- calling relay. But this time the
- <span class="constant">through_w32_func</span> field of this function
- record will be set to prevent repeated redirection and to pass the
- control through the breakpoint mangle instead.</p>
-
- <p>Returning is not very interesting, as the first
- <span class="constant">SIGSEGV</span> handler did a straight jump
- for the redirection purposes without needing any subsequent
- handling.</p>
-
- <p>The jump table relay used for the callers from W32 code is
- a different one than the relay being used for the callers
- @{[ a_href '#functype_pass_fromunix','from UNIX code' ]}. UNIX code always
- uses relay with external @{[ a_href '#calltype_cdecl','cdecl call type' ]}
- but in this case a relay with the appropriate
- @{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]} is used.</p>
-
- @{[ doc_img 'fig/functype_patched_pass_fromw32',
- 'Function Type: <span class="constant">pass</span> from W32 Code' ]}
-
- @{[ vskip() ]}
-
- <table border="1" align="center">
- <tr><td><span class="fname">captivesym</span> keyword</td><td>pass</td></tr>
- <tr><td>Native code function name </td><td>(no implementation)</td></tr>
- <tr><td>W32 traced code from UNIX function name </td><td>FUNCNAME</td></tr>
- <tr><td>W32 traced code from W32 function name </td><td>FUNCNAME_cdecl/_stdcall/_fastcall</td></tr>
- <tr><td>Entry/exit debug tracing from UNIX code </td><td>yes</td></tr>
- <tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
- <caption>Function Type <span class="constant">pass</span> Characteristics</caption>
- </table>
-
- <a name="functype_wrap"><h3>Wrap of the Original "ntoskrnl.exe" Function</h3></a>
-
- <a name="functype_wrap_fromunix"><h4>Wrapping of Call from UNIX Code</h4></a>
-
- <p>The code control flow has no special features since it is
- very similar to <a href="#functype_pass_fromunix">the direct pass to
- W32 function from UNIX code</a>. All the wrapping is done in the
- standard UNIX @{[ a_href '#calltype_cdecl','cdecl call type' ]} manner.
- Jump table debug dumping relays are provided twice — the
- "outer" one to trace the parameters from the function caller
- and the "inner" one to trace the call from the wrapper to the
- original W32 code. The "inner" relay also calls the W32 code
- with the appropriate <a href="#calltype">cdecl/stdcall/fastcall call
- type</a>.</p>
-
- @{[ doc_img 'fig/functype_patched_wrap_fromunix',
- 'Function Type: <span class="constant">wrap</span> from UNIX Code' ]}
-
- <a name="functype_wrap_fromw32"><h4>Wrapping of Call from W32 Code</h4></a>
-
- <p>This scheme is a combination of the
- <a href="#functype_wrap_fromunix">previous wrap of a call from
- UNIX code</a> and the <a href="#functype_pass_fromw32">direct pass from
- the W32 code</a>. The control is caught and redirected by
- <span class="constant">SIGSEGV</span> handler from the breakpoint
- placed at the entry to the original W32 function code. The second entry
- to the original W32 function with the
- <span class="constant">through_w32_func</span> field of this function
- description already set is done from the "inner" jump table
- relay with the appropriate
- @{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]}.</p>
-
- @{[ doc_img 'fig/functype_patched_wrap_fromw32',
- 'Function Type: <span class="constant">wrap</span> from W32 Code' ]}
-
- @{[ vskip() ]}
-
- <p>Some functions can be <a href="#functype_pass">passed to the original
- code</a> but they need their parameters to be checked/prepared.
- Currently, such wrapping is only needed for the
- <span class="function">ExAllocateFromPagedLookasideList()</span> function,
- where it is required because <a href="#init_ntoskrnl">the
- <span class="fname">ntoskrnl.exe</span> initialization code is not executed</a>;
- that code would otherwise properly initialize some internal data structures.
- In this case the wrapping code detects the passing of an uninitialized
- parameter and searches through the whole
- <span class="fname">ntoskrnl.exe</span> code body at runtime to find the
- proper initialization routine containing the correct initialization
- parameters. The passed addresses of static structures must be differentiated,
- as each of them usually has different initialization parameters. It is
- safer not to keep a fixed array of parameters, as these parameters may
- differ across different <span class="fname">ntoskrnl.exe</span>
- versions.</p>
-
- <table border="1" align="center">
- <tr><td><span class="fname">captivesym</span> keyword</td><td>wrap</td></tr>
- <tr><td>Native UNIX wrapping code function name </td><td>FUNCNAME_wrap</td></tr>
- <tr><td>W32 traced wrapping code from UNIX func. name </td><td>FUNCNAME</td></tr>
- <tr><td>W32 traced wrapping code from W32 func. name </td><td>FUNCNAME_cdecl/_stdcall/...</td></tr>
- <tr><td>W32 traced original code function name </td><td>FUNCNAME_orig</td></tr>
- <tr><td>Entry/exit debug tracing from UNIX code </td><td>yes</td></tr>
- <tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
- <caption>Function Type <span class="constant">wrap</span> Characteristics</caption>
- </table>
-
- <a name="functype_native"><h3>Native Implementation</h3></a>
-
- <h4>Native Implementation Called from UNIX Code</h4>
-
- <p>This is the simplest case of a function call as it is fully
- handled only by the compiler and/or linker.</p>
-
- <p>In this case, though, no debug dumping call relay is provided. Such a
- relay would need to rename the implementations of native functions to
- prevent their automatic linking with the caller code. This renaming could
- not be done by a simple <span class="constant">#define</span>,
- since it would also rename any calls of such a function in
- the same C sources. One possible solution would be to
- use the <span class="dashdash">--redefine-sym</span> feature of the
- <span class="productname">objcopy(1)</span> utility. On the other hand,
- there is not much need to catch/debug such calls, as both the caller and
- the callee are provided with full source file debug information for the
- debugger. Also, the callee usually dumps its entry/exit parameters
- by custom debug dumps in the
- <a href="#functype_native_reactos"><span class="productname">ReactOS</span> implementations</a>.</p>
-
- @{[ doc_img 'fig/functype_native_fromunix',
- 'Function Type: <span class="constant">native</span> from UNIX Code' ]}
-
- <a name="functype_native_fromw32"><h4>Native Implementation of
- "unpatched" Library Function Called from W32 Code</h4></a>
-
- @{[ doc_img 'fig/functype_unpatched_native_fromw32',
- 'Function Type: <span class="constant">native</span> of <span class="constant">unpatched</span> from W32 Code' ]}
-
- <p>Here it matters whether the project deals with
- a <span class="constant">patched</span> or an
- <span class="constant">unpatched</span> version of the library
- (a <span class="constant">patched</span> library is a loaded W32 binary
- library, while an <span class="constant">unpatched</span> library is
- provided completely by this project with no use of the library's
- original W32 binary file). As the project adjusts the exported symbol
- address during the patching operation, in some cases a call into a
- <span class="constant">patched</span> library may be handled
- simply as an <span class="constant">unpatched</span> library call.
- Fortunately the
- distinction is not very important, as the project is prepared to
- properly handle both cases.</p>
-
- <p>The W32 caller which imported the symbol will be pointed right to
- the relaying function. The debug dumping relay will be called from W32
- code with the appropriate
- @{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]} while the
- relay will call the implementation of the native function in the
- standard UNIX @{[ a_href '#calltype_cdecl','cdecl call type' ]} manner.</p>
-
- <h4>Native Implementation of "patched" Library Function Called from W32 Code</h4>
-
- @{[ doc_img 'fig/functype_patched_native_fromw32',
- 'Function Type: <span class="constant">native</span> of <span class="constant">patched</span> from W32 Code' ]}
-
- <p>The calling scheme is similar to the
- <a href="#functype_native_fromw32">previous call of
- <span class="constant">unpatched</span> library function from W32
- code</a> but the call control is redirected from the entry point of the
- original W32 binary implementation by the breakpoint and its
- <span class="constant">SIGSEGV</span> handler as in
- <a href="#functype_pass_fromw32">the case of passing control from W32
- call</a>.</p>
-
- <p>The original W32 function implementation located in the original
- loaded binary file is never executed but its entry point needs to be
- trapped by the breakpoint to be able to catch the function calls within
- the library.</p>
-
- @{[ vskip() ]}
-
- <p>In all cases the final function implementation is standard UNIX
- code compiled from C sources with full debug information available
- to the debugger. Fortunately such functions do not all need to be coded
- from scratch for this project since the $freespeech
- $ReactOS and $Wine projects already exist and their code can be reused.</p>
-
- <p>The $Wine project is listed mostly for completeness: almost none of
- its code was suitable for reuse as it implements the W32 user space while
- this project runs a pure W32 kernel space environment (in $gnulinux user
- space!).</p>
-
- <a name="functype_native_reactos"><h4>Native Implementation
- - <span class="productname">ReactOS</span></h4></a>
-
- <p>Some functions are already implemented in the $ReactOS
- project and they can be used as they are. Although it would be
- possible to <a href="#functype_pass">pass some function calls to the
- original code</a> it is more handy to provide native implementation as
- there is better control of the data handling during debugging sessions
- due to the provided debugging symbols.</p>
-
- <p>Such functions can be found in
- <span class="fname">src/libcaptive/reactos/</span> subdirectory.
- Some functions had to be adjusted for this project
- - these modifications are compiled conditionally, depending on the
- <span class="constant">LIBCAPTIVE</span> symbol existence.</p>
-
- <p>Later stages of this project reached a level where
- $ReactOS is still too immature and the needed functions usually
- have just the sad body:</p>
-
- <blockquote class="command">
- <p>UNIMPLEMENTED;</p>
- </blockquote>
-
- <p>Functions that were not possible to
- @{[ a_href '#functype_pass','pass' ]} were reimplemented by this project
- and placed in the project's implementation directories
- @{[ a_href '#reactos_nocare','instead of extending' ]} $ReactOS code.</p>
-
- <a name="functype_native_wine"><h4>Native Implementation – <span class="productname">Wine</span></h4></a>
-
- <p>Even though $Wine only implements the
- <span class="productname">Microsoft Windows NT</span> user space, there
- still are some common functions which could be copied from the $Wine
- project.</p>
-
- <a name="functype_native_libcaptive"><h4>Native Implementation – Project Specific</h4></a>
-
- <p>As a last resort it was necessary to provide an entirely
- project-specific implementation of some API functions, such as the PC
- hardware dependent parts or the memory management functions.</p>
-
- @{[ vskip() ]}
-
- <table border="1" align="center">
- <tr><td><span class="fname">captivesym</span> keyword</td><td>(none; just the symbol name)</td></tr>
- <tr><td>Native code function name </td><td>FUNCTIONNAME</td></tr>
- <tr><td>Native traced code from W32 code func. name </td><td>FUNCTIONNAME_cdecl/_std...</td></tr>
- <tr><td>Entry/exit debug tracing from UNIX code </td><td>no</td></tr>
- <tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
- <caption>Function Type <span class="constant">native</span> Characteristics</caption>
- </table>
-
- <a name="functype_undef"><h3>Undefined Function</h3></a>
-
- <p>Functions not defined by any of the previous function types cannot be
- called by any W32 code including the code of the library implementing
- such function. All functions of <span class="constant">patch</span>ed
- libraries not listed in the <span class="fname">captivesym</span> exports
- file are automatically set to be trapped as fatal program execution
- errors.</p>
-
- <p>It is not necessary to list the symbols as
- <span class="constant">undef</span> as long as you are just loading the
- W32 <span class="constant">PE-32</span> code and the symbols belong to
- <span class="constant">patch</span>ed library. On the other hand if you
- are loading W32 <span class="fname">.so</span> code or if such symbol is
- a part of <span class="constant">unpatched</span> library (and thus
- being completely provided by the project) you need to list such symbol as
- <span class="constant">undef</span> type to prevent unresolved symbol
- reference.</p>
-
- <table border="1" align="center">
- <tr><td><span class="fname">captivesym</span> keyword</td><td>undef</td></tr>
- <tr><td>Native code function name </td><td>(no implementation)</td></tr>
- <tr><td>Native traced code function name </td><td>FUNCTIONNAME_cdecl/_stdcall/_fastcall</td></tr>
- <tr><td>Debug tracing message from UNIX code </td><td>yes</td></tr>
- <tr><td>Debug tracing message from W32 code </td><td>yes</td></tr>
- <caption>Function Type <span class="constant">undef</span> Characteristics</caption>
- </table>
-
-
- <a name="calltype"><h2>API Function Calling Conventions</h2></a>
-
- <p>Standard UNIX code compiled by GCC (GNU C Compiler) running on host
- $gnulinux always uses @{[ a_href '#calltype_cdecl','cdecl' ]} ABI (Application
- Binary Interface) calling convention. This calling convention is also the
- default declaration type of UNIX functions.</p>
-
- <p>W32 uses three different calling conventions in its ABI. They are all
- described in the
- <a href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang/html/_core_argument_passing_and_naming_conventions.asp"><span class="productname">Microsoft</span> documentation</a>.
- It is always necessary to have the proper function declaration
- (prototype) in the caller scope to prevent all sorts of unexpected
- crashes.</p>
-
- <p>Unfortunately some non-matching combinations of calling conventions
- result in hard to debug bugs: the caller gets back an unexpected stack
- pointer from the callee and upon return it will restore registers from the
- wrong stack pointer place. Since the caller will finally reclaim its stack
- frame from its (uncorrupted) <span class="constant">EBP</span> stack frame
- pointer the caller will return to the caller of the caller correctly. Just
- the registers remain corrupted causing crashes of completely unrelated code
- executed far, far away...</p>
-
- <p><span class="constant">EDI</span>, <span class="constant">ESI</span> and
- <span class="constant">EBX</span> registers are always saved on the stack.
- They are stored on the stack in this particular order from bottom to top
- addresses (using the <span class="instruction">push EBX</span>,
- <span class="instruction">push ESI</span>,
- <span class="instruction">push EDI</span> sequence). Fortunately $gnulinux
- GCC has the same register saving behaviour. If some register corruption
- occurs, the calling convention declared for the call between the caller
- and the callee should be checked.</p>
-
- <a name="calltype_cdecl"><h3>W32 Calling Convention "cdecl"</h3></a>
-
- <p>The only calling convention in the UNIX world and the default one for
- all the compilers. All the arguments are passed on the stack and no
- arguments are cleaned by the callee. Possible inconsistencies between the
- number of function arguments and the function prototype used by the caller
- are harmless. Variable argument lists can be passed by this convention.</p>
-
- @{[ doc_img 'fig/calltype_cdecl',
- 'W32 Calling Convention <span class="constant">cdecl</span> Scheme' ]}
-
- <table border="1" align="center">
- <tr><td>Arguments freed by </td><td>caller</td></tr>
- <tr><td>Arguments on the stack </td><td>#0 ... #(n-1)</td></tr>
- <tr><td>Arguments in the registers </td><td>none</td></tr>
- <tr><td>GCC attribute </td><td><span class="command">__attribute__((__cdecl__))</span> (default)</td></tr>
- <caption>Calling Convention <span class="constant">cdecl</span> Characteristics</caption>
- </table>
-
- <h3>W32 Calling Convention "stdcall"</h3>
-
- @{[ doc_img 'fig/calltype_stdcall',
- 'W32 Calling Convention <span class="constant">stdcall</span> Scheme' ]}
-
- <p>A convention never used in the UNIX world. It needs to be specified for
- W32 compilers. All the arguments are passed on the stack and all the
- arguments are cleaned by the callee. Possible inconsistencies between the
- number of function arguments and the function prototype used by the
- caller will result in a fatal crash. Variable argument lists cannot be
- passed by this convention – use @{[ a_href '#calltype_cdecl','cdecl' ]}
- instead.</p>
-
- <table border="1" align="center">
- <tr><td>Arguments freed by </td><td>callee</td></tr>
- <tr><td>Arguments on the stack </td><td>#0 ... #(n-1)</td></tr>
- <tr><td>Arguments in the registers </td><td>none</td></tr>
- <tr><td>GCC attribute </td><td><span class="command">__attribute__((__stdcall__))</span></td></tr>
- <caption>Calling Convention <span class="constant">stdcall</span> Characteristics</caption>
- </table>
-
- <h3>W32 Calling Convention "fastcall"</h3>
-
- <p>A convention never used in the UNIX world. It needs to be specified for
- W32 compilers. It is used in the W32 world for its low calling
- overhead. All but the first two arguments are passed on the stack and such
- arguments are cleaned by the callee. The first two arguments are passed in
- the registers <span class="constant">ECX</span> and
- <span class="constant">EDX</span> respectively. Possible inconsistencies
- between the number of function arguments and the function prototype used by
- the caller will result in a fatal crash. Variable argument lists cannot be
- passed by this convention – use @{[ a_href '#calltype_cdecl','cdecl' ]}
- instead.</p>
-
- <p>GCC (GNU C Compiler) native support for this calling convention
- is pretty fresh: it is currently present only in the recent CVS
- versions since 21st December of 2002, which should get released as GCC
- version 3.4. This project worked around the unsupported calling convention
- by declaring the arguments passed in registers with
- <span class="command">__attribute__((__regparm__(3)))</span>.
- W32 passes the arguments in registers in the order
- <span class="constant">ECX</span>, <span class="constant">EDX</span> but
- GCC passes them in registers <span class="constant">EAX</span>,
- <span class="constant">EDX</span>, <span class="constant">ECX</span>.
- This incompatibility is compensated at C source level in the
- @{[ a_href '#functype','relaying code' ]} generated by
- <span class="fname">captivesym</span> relay generator.</p>
-
- @{[ doc_img 'fig/calltype_fastcall',
- 'W32 Calling Convention <span class="constant">fastcall</span> Scheme' ]}
-
- <table border="1" align="center">
- <tr><td>Arguments freed by </td><td>callee</td></tr>
- <tr><td>Arguments on the stack </td><td>#2 ... #(n-1)</td></tr>
- <tr><td>Arguments in the registers </td><td><span class="constant">ECX</span>=#0,
- <span class="constant">EDX</span>=#1</td></tr>
- <tr><td>GCC ≥3.4 attribute </td><td><span class="command">__attribute__((__fastcall__))</span></td></tr>
- <tr><td>GCC &lt;3.4 attr. emulation</td><td><span class="command">__attribute__((__stdcall__))</span></td></tr>
- <tr><td> </td><td><span class="command">__attribute__((__regparm__(3) /* EAX,EDX,ECX */))</span></td></tr>
- <caption>Calling Convention <span class="constant">fastcall</span> Characteristics</caption>
- </table>
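-
- <p>The three conventions above can be sketched in C with the GCC attributes
- listed in the tables. This is only an illustration, not the code generated by
- <span class="fname">captivesym</span>: the <span class="constant">SKETCH_*</span>
- macros and the <span class="function">sketch_*</span> functions are hypothetical
- names, and the attributes are assumed to matter only on 32-bit x86, so they are
- defined away on other targets. The fastcall relay uses the emulation from the
- table: its C parameters follow the GCC register order
- <span class="constant">EAX</span>, <span class="constant">EDX</span>,
- <span class="constant">ECX</span>, so the W32 arguments #1 and #0 appear
- swapped behind an ignored first parameter.</p>

```c
#include <assert.h>

/* The W32 convention attributes only make sense on 32-bit x86;
 * on other targets they are defined away so the sketch still compiles. */
#if defined(__GNUC__) && defined(__i386__)
# define SKETCH_CDECL         __attribute__((__cdecl__))
# define SKETCH_STDCALL       __attribute__((__stdcall__))
# define SKETCH_FASTCALL_EMUL __attribute__((__stdcall__, __regparm__(3)))
#else
# define SKETCH_CDECL
# define SKETCH_STDCALL
# define SKETCH_FASTCALL_EMUL
#endif

/* Hypothetical native implementation, called in the plain UNIX cdecl manner. */
int sketch_native(int arg0, int arg1, int arg2)
{
	return arg0 * 100 + arg1 * 10 + arg2;
}

/* cdecl relay: the caller frees the stack arguments. */
int SKETCH_CDECL sketch_relay_cdecl(int arg0, int arg1, int arg2)
{
	return sketch_native(arg0, arg1, arg2);
}

/* stdcall relay: the callee frees the stack arguments. */
int SKETCH_STDCALL sketch_relay_stdcall(int arg0, int arg1, int arg2)
{
	return sketch_native(arg0, arg1, arg2);
}

/* fastcall emulation: regparm(3) fills EAX, EDX, ECX in this order while
 * W32 fastcall puts #0 into ECX and #1 into EDX.  The C parameter list is
 * therefore (EAX: ignored, EDX: #1, ECX: #0, stack: #2 ...). */
int SKETCH_FASTCALL_EMUL sketch_relay_fastcall(int eax_unused, int arg1_edx,
		int arg0_ecx, int arg2_stack)
{
	(void)eax_unused;
	return sketch_native(arg0_ecx, arg1_edx, arg2_stack);
}
```

- <p>A C caller of the emulated relay has to pass the arguments in the swapped
- order, for example <span class="command">sketch_relay_fastcall(0, 2, 1, 3)</span>
- for the W32 arguments 1, 2, 3.</p>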
-
- <a name="synchronous"><h2>Multithreading and Multiple Processors</h2></a>
-
- <p>The W32 platform is built around a thoroughly parallel architecture. It
- must lock all its objects to maintain coherence in the presence of
- multithreading and multiple processors. Since the author of this project
- considers any parallel execution a serious obstacle to debugging, the whole
- project architecture was designed to prevent any nondeterministic behaviour.
- Therefore this project always emulates a uniprocessor
- <span class="productname">Microsoft Windows NT</span> kernel
- (<span class="constant">KeNumberProcessors</span> symbol is always 1),
- everything runs in the single initial thread/process and all the filesystem
- operations are performed as synchronous
- ("synchronous" by flags
- <span class="constant">FILE_SYNCHRONOUS_IO_ALERT</span>,
- <span class="constant">FO_SYNCHRONOUS_IO</span>,
- <span class="constant">IRP_SYNCHRONOUS_API</span>,
- <span class="constant">IRP_SYNCHRONOUS_PAGING_IO</span>,
- forced <span class="constant">TRUE</span> result of
- <span class="function">IoIsOperationSynchronous()</span>
- etc.).
- For several cases needed only by <span class="fname">ntfs.sys</span>
- asynchronous access had to be supported
- (<span class="constant">STATUS_PENDING</span> return code) – parallel
- execution is emulated by GLib
- <span class="function">g_idle_add_full()</span> with
- <span class="function">g_main_context_iteration()</span> called during
- <span class="function">KeWaitForSingleObject()</span>.
- Since real W32 parallel threading may still be needed in the future,
- all the code that would be affected by W32
- multithreading capability is marked by a
- <span class="constant">TODO:thread</span> comment.</p>
-
- <p>Multiple processors (SMP) support will never need to be implemented
- since uniprocessor W32 kernels apparently run the filesystem driver modules
- fine. As this project implements only the uniprocessor W32 kernel all the
- processor locking functions and structures such as
- <span class="constant">KSPIN_LOCK</span> etc. can be safely implemented as
- no-operations.</p>
-
- <p>Asynchronous callbacks registered for
- <span class="constant">IO_WORKITEM</span>s are passed as GLib idle
- functions by <span class="function">g_idle_add_full()</span>. Although they
- will probably never be executed during non-interactive project's batch
- executions it is the responsibility of W32 driver implementation to
- complete all the pending tasks before its W32 shutdown. Such W32 shutdown
- is done during cleanup of the project's execution by
- <span class="function">captive_shutdown()</span>.</p>
-
- <a name="paranoia"><h2>Paranoia Checks</h2></a>
-
- <p>A common approach in software development is to implement
- many internal sanity checks during the development stage but to build the
- final release as optimized as possible, without those debugging checks.</p>
-
- <p>Facilities for these practices can be seen in the standard
- C include files for example as function
- <span class="function">assert()</span> which gets disabled by the
- <span class="constant">NDEBUG</span> symbol used during the final optimized
- executable compilation. This project uses Gnome GLib messaging subsystem
- offering sanity checks discarded by symbols
- <span class="constant">G_DISABLE_ASSERT</span> and
- <span class="constant">G_DISABLE_CHECKS</span>.
- <span class="productname">Microsoft</span> also produces two versions of
- its products – regular customers use the "free build" (also
- called "retail") while the programmers should develop their code
- on the "checked build" product releases.</p>
-
- <p>As this project will always run unknown binary code of proprietary W32
- filesystem drivers, that code can never be trusted. Such code even runs in
- the same unprotected address space as its controlling UNIX code. Since
- there is not enough documentation for the W32 components of the system, and
- the existing documentation is often misleading, the emulation can never be
- considered 100% faithful. Even in the final releases all the sanity checks
- implemented in this project should remain active as all the project's code
- always interacts with unknown and untrusted W32 binaries.</p>
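-
- <p>The difference between a check that disappears in release builds and one
- that stays active can be sketched as follows. This is a minimal illustration
- under stated assumptions, not the project's code:
- <span class="constant">SKETCH_RETURN_VAL_IF_FAIL</span> is a hypothetical macro
- written in the spirit of the GLib checks mentioned above, and
- <span class="type">struct sketch_counted_string</span> is a made-up counted
- string standing in for data handed over by an untrusted W32 binary.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical always-on check: unlike assert() it is not compiled out by
 * NDEBUG.  It logs the failed expression and makes the enclosing function
 * return `val`. */
#define SKETCH_RETURN_VAL_IF_FAIL(expr, val) \
	do { \
		if (!(expr)) { \
			fprintf(stderr, "%s:%d: check failed: %s\n", \
					__FILE__, __LINE__, #expr); \
			return (val); \
		} \
	} while (0)

/* Made-up counted string of 16-bit characters coming from untrusted code. */
struct sketch_counted_string {
	unsigned short length_bytes;   /* bytes in use */
	unsigned short maximum_bytes;  /* bytes allocated */
	const unsigned short *buffer;
};

/* Returns the number of 16-bit characters, or -1 if any check fails. */
long sketch_string_chars(const struct sketch_counted_string *s)
{
	long chars;

	SKETCH_RETURN_VAL_IF_FAIL(s != NULL, -1);
	SKETCH_RETURN_VAL_IF_FAIL(s->length_bytes % 2 == 0, -1);
	SKETCH_RETURN_VAL_IF_FAIL(s->length_bytes <= s->maximum_bytes, -1);
	SKETCH_RETURN_VAL_IF_FAIL(s->length_bytes == 0 || s->buffer != NULL, -1);

	chars = s->length_bytes / 2;
	assert(chars >= 0);  /* internal invariant only; vanishes under NDEBUG */
	return chars;
}
```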
-
- <p><span class="productname">Microsoft Windows NT</span> code is written in
- a foolproof style: it accepts even invalid input values, which
- it usually corrects. This makes long-term debugging a pain as it hides
- the sources of problems. "Checked build" releases were probably
- designed to fix this flaw by strict consistency checks but they did not
- reach this goal as such checks are usually missing in the code.</p>
-
- <p>This project has strict consistency checks across all the code to make
- the debugging phase easy enough. A failed sanity check is not always
- a bug – sometimes it just means the real W32 binary code is more
- lenient than could be expected according to the documentation, and
- such a sanity check gets removed for the next version build. In other cases
- a failed sanity check means the execution path for some unexpected
- combination of arguments was not yet implemented by this project. It may
- also mean a bug, of course...</p>
-
- <p>Last but not least – never skip a possible sanity check as its
- later removal is an order of magnitude cheaper than an uncaught
- invalid assumption. A failed assertion is not always a bug although it
- has to be fixed, of course.</p>
-
-
- <h2>STATUS_LOG_FILE_FULL</h2>
-
- <p>After approx. 1MB of data has been written to the NTFS test partition,
- the NTFS driver returns the
- <span class="constant">STATUS_LOG_FILE_FULL</span> error code for any
- further write requests.
- Apparently this is caused by the fact that this project is
- @{[ a_href '#synchronous','single-threaded' ]} and it ignores the spawn
- of a parallel journalling thread during <span class="fname">ntfs.sys</span>
- initialization.</p>
-
- <p>Fortunately <span class="fname">ntfs.sys</span> will clear its
- journalling log file during filesystem unmount. This project will therefore
- remount the volume if <span class="constant">STATUS_LOG_FILE_FULL</span>
- is detected, to work around the missing journalling thread.</p>
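-
- <p>The remount workaround can be pictured with a small simulation. It is only
- a sketch of the idea under simplifying assumptions: all the
- <span class="function">sketch_*</span> names and the status enumeration are
- hypothetical, the journalling log is modelled as a plain byte counter and the
- remount as a reset of that counter. A write that still fails after one remount
- is reported to the caller instead of being retried forever.</p>

```c
#include <assert.h>

/* Hypothetical status codes; the real driver returns NTSTATUS values. */
enum sketch_status { SKETCH_SUCCESS, SKETCH_LOG_FILE_FULL };

/* Simulated volume: the log only grows because no journalling thread
 * runs in the single-threaded emulation. */
struct sketch_volume {
	unsigned long log_used;      /* bytes currently held in the log */
	unsigned long log_capacity;  /* log size limit */
	unsigned remounts;           /* number of remounts performed */
};

/* Raw write as the driver would perform it. */
enum sketch_status sketch_raw_write(struct sketch_volume *vol, unsigned long len)
{
	if (vol->log_used + len > vol->log_capacity)
		return SKETCH_LOG_FILE_FULL;
	vol->log_used += len;
	return SKETCH_SUCCESS;
}

/* Unmount clears the log; mount the volume again afterwards. */
void sketch_remount(struct sketch_volume *vol)
{
	vol->log_used = 0;
	vol->remounts++;
}

/* Write with the workaround: remount once on a full log, then retry once. */
enum sketch_status sketch_write(struct sketch_volume *vol, unsigned long len)
{
	enum sketch_status status = sketch_raw_write(vol, len);

	if (status != SKETCH_LOG_FILE_FULL)
		return status;
	sketch_remount(vol);
	return sketch_raw_write(vol, len);
}
```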
-
- <p>Similar behaviour can be seen when writing compressed files:
- the file gets written uncompressed and its compression will proceed only
- during the final filesystem unmount.</p>
-
- <p>For these reasons it was mandatory to support
- @{[ a_href '#parent_connector','transparent volume remounting' ]}.</p>
-
-
- <a name="parent_connector"><h2><span class="constant">ParentConnector</span> volume remounter</h2></a>
-
- <p>The sandbox master component of this project has control of restarting
- its sandbox slaves containing the W32 filesystem. Target goal of
- <span class="constant">ParentConnector</span> component is to transparently
- provide persistent view of files and directories over the sandboxed slaves
- being restarted.</p>
-
- <p>In the case of read-only operations it would be simple as we would only
- need to save our state of the currently opened filesystem objects with their
- read file/directory offsets. Write operations can be handled like the
- read-only ones as long as all the operations are successful. In the case of
- a W32 filesystem crash we lose all the past write operations. If we redid
- all the write operations we could very easily invoke the same crash.
- Therefore we write:</p>
-
- <blockquote class="command">
- <p>Filesystem crash broke dirty object: FILE/PATH/NAME</p>
- </blockquote>
-
- <p>message to syslog and refuse any further operations with this
- object.</p>
-
- @{[ doc_img 'dia/parent-connector','Parent Connector' ]}
-
- <p><span class="constant">HANDLE</span> represents a W32 object open in
- an existing W32 filesystem. A <span class="constant">HANDLE</span> is created
- on demand according to the saved state of the object (such as its
- pathname). Even the whole <span class="constant">VFS</span> sandbox slave
- is spawned on demand if some object operation requests it.</p>
-
- <p>W32 filesystem crash can obviously occur at any moment - it generates
- @{[ a_href 'http://developer.gnome.org/doc/API/2.0/gobject/','GObject' ]}
- @{[ a_href 'http://developer.gnome.org/doc/API/2.0/gobject/gobject-Signals.html','signal' ]}
- <span class="constant">abort</span>. A successful filesystem unmount
- (even as a part of a remount operation) must be preceded by the
- <span class="constant">detach</span> signal to close all existing
- W32 <span class="constant">HANDLE</span>s. After they are closed the
- filesystem gets the unmount request. Only if all the close operations
- succeeded, including the final filesystem unmount, can the signal
- <span class="constant">cease</span> be activated to notify all the
- dirty (written) objects they are now clean. During this
- <span class="constant">cease</span> signal the project will also
- @{[ a_href '#safe_flush','flush' ]} the sandbox commit buffer to its
- underlying media.</p>
-
- <p>Objects never written remain in <span class="constant">clean</span>
- state and they can be transparently reopened even if W32 filesystem crash
- occurs.</p>
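-
- <p>The object states described above can be summarized as a small state
- machine. The following sketch is an assumption-laden illustration, not the
- <span class="constant">ParentConnector</span> implementation: the
- <span class="function">sketch_*</span> names are hypothetical, the three
- GObject signals are reduced to an enumeration, and the
- <span class="constant">broken</span> state is simply the name chosen here for
- a dirty object hit by a filesystem crash.</p>

```c
#include <assert.h>

enum sketch_state  { SKETCH_CLEAN, SKETCH_DIRTY, SKETCH_BROKEN };
enum sketch_signal { SKETCH_SIG_ABORT, SKETCH_SIG_DETACH, SKETCH_SIG_CEASE };

struct sketch_object {
	enum sketch_state state;
	int handle_open;  /* nonzero while a W32 HANDLE exists for the object */
};

/* Create the HANDLE on demand; broken objects refuse any operation. */
static int sketch_object_open(struct sketch_object *obj)
{
	if (obj->state == SKETCH_BROKEN)
		return -1;
	obj->handle_open = 1;
	return 0;
}

/* Reads never change the clean/dirty state. */
int sketch_object_read(struct sketch_object *obj)
{
	return sketch_object_open(obj);
}

/* A successful write makes the object dirty until the next `cease`. */
int sketch_object_write(struct sketch_object *obj)
{
	if (sketch_object_open(obj) != 0)
		return -1;
	obj->state = SKETCH_DIRTY;
	return 0;
}

void sketch_object_signal(struct sketch_object *obj, enum sketch_signal sig)
{
	switch (sig) {
	case SKETCH_SIG_ABORT:   /* filesystem crash: unflushed writes are lost */
		obj->handle_open = 0;
		if (obj->state == SKETCH_DIRTY)
			obj->state = SKETCH_BROKEN;
		break;
	case SKETCH_SIG_DETACH:  /* close the HANDLE before the unmount */
		obj->handle_open = 0;
		break;
	case SKETCH_SIG_CEASE:   /* unmount succeeded: dirty objects are clean */
		if (obj->state == SKETCH_DIRTY)
			obj->state = SKETCH_CLEAN;
		break;
	}
}
```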
-
-
-<h1>TODO: Fsck of NTFS</h1>
-
- <p>Currently this project does not support checking of data structures
- of NTFS volume as being provided by <span class="command">chkdsk.exe</span>
- in W32 environment and <span class="command">fsck</span> in UNIX OS.</p>
-
- <p>W32 has its disk checking functionality split off into the
- <span class="fname">untfs.dll</span> W32 userland library,
- according to
- @{[ a_href 'http://www.sysinternals.com/ntw2k/source/fmifs.shtml',
- 'Chkdskx and Formatx' ]}
- by @{[ a_href 'http://www.sysinternals.com/aboutus.shtml',
- 'Mark Russinovich' ]}.</p>
-
- <p>I assume its execution falls completely
- @{[ a_href '#existing_emulation','out of scope' ]}
- of this project as it is W32 userland.</p>
-
- <p>This possibility was not yet investigated in any way.</p>
-
-
-<h1>TODO: NTFS Support for
- <span class="productname">@{[ a_href 'http://surprise.sourceforge.net/','Partition Surprise' ]}</span></h1>
-
- <p>Although
- <span class="productname">@{[ a_href 'http://mlf.linux.rulez.org/mlf/ezaz/ntfsresize.html','ntfsresize' ]}</span>
- currently exists, I am not sure whether it is really reliable for all NTFS filesystems.
- <span class="productname">@{[ a_href 'http://surprise.sourceforge.net/','Partition Surprise' ]}</span>
- is the only partition manager capable of safely resizing the disk
- by using just the original W32 filesystem driver, through a full rebuild of
- the filesystem metadata.
- Almost no file data blocks would be moved even on these generic filesystems
- as W32 supports the <span class="constant">FSCTL_MOVE_FILE</span> request
- according to
- @{[ a_href 'http://www.sysinternals.com/ntw2k/info/defrag.shtml',
- 'Inside Windows NT Disk Defragmenting' ]}
- by @{[ a_href 'http://www.sysinternals.com/aboutus.shtml',
- 'Mark Russinovich' ]}.</p>
-
-
-<h1>Related Projects</h1>
-
- <p>The usual solution for file exchange between $freespeech operating systems
- and <span class="productname">Microsoft Windows NT</span> is to use
- a <span class="productname">FAT32</span> partition (called
- <span class="productname">vfat</span> in $gnulinux) and swap the files over it.
- This method is not very comfortable as you never have access to all the
- files of the other operating system.</p>
-
- <a name="LinuxNTFScompet"><h2>$LinuxNTFS</h2></a>
-
- <p>Although $LinuxNTFS takes a completely different approach and has
- a different architecture, its final goal is the same as that of this
- project – reliable read-write <span class="productname">NTFS</span>
- filesystem support. $LinuxNTFS goes the way of reverse engineering
- filesystem data structures (and possibly
- <span class="fname">ntfs.sys</span> itself). Unfortunately after many years
- of development it has not yet reached the state of reliable read-write
- access although its read-only part is considered trustworthy.</p>
-
- <p>Using $LinuxNTFS for read-only access to existing partition with
- <span class="productname">Microsoft Windows NT</span> installation is
- planned to be able to acquire existing <span class="fname">ntfs.sys</span>,
- <span class="fname">ntoskrnl.exe</span> and possibly
- <span class="fname">ksecdd.sys</span> (imported by
- <span class="fname">ntfs.sys</span>) files from the user's
- <span class="productname">NTFS</span> partition.</p>
-
- <h2><span class="productname">@{[ a_href 'http://www.cgsecurity.org/ntfs.html','NTPwd NTFS Driver' ]}</span></h2>
-
- <p>DOS based @{[ a_href 'http://www.gnu.org/licenses/gpl.html','GPL-2.0' ]}
- read-write NTFS driver. Filesystem structures are reverse engineered in the
- same way as in the @{[ a_href '#LinuxNTFScompet','Linux-NTFS Project' ]}. As it
- is not very actively maintained it reaches a lower level of
- <span class="productname">NTFS</span> compatibility.</p>
-
- <h2>The only real competition: Closed-source read/write @{[ '$299' ]} equivalent</h2>
-
- <p>@{[ a_href 'http://www.vmware.com/download/workstation.html',
- 'VMware Workstation' ]}</p>
-
- <p>Original Microsoft Windows operating system can be run inside a virtual
- machine running under GNU/Linux and share the read-write NTFS disk by using
- a network file sharing through a VMware virtual network card.</p>
-
- <p>You need @{[ '$299' ]} for this product and you need to
- give up your system security by running un@{[ a_href '#sandbox','sandbox' ]}ed
- closed-source program in your GNU/Linux.</p>
-
- <h2>@{[ a_href 'http://www.winehq.com/','Wine Project' ]}</h2>
-
- <p>No code could be shared – Wine emulates only Microsoft Windows userland.
- Filesystem drivers completely belong to Microsoft Windows kernelland.</p>
-
- <h2>@{[ a_href 'http://www.sysinternals.com/ntw2k/freeware/ntfswin98.shtml','NTFS for Windows 98' ]}</h2>
-
- <p>Closed-source read-only-crippled @{[ '$0' ]} equivalent for Microsoft Windows.</p>
-
- <p>There is a @{[ a_href 'http://www.sysinternals.com/images/screenshots/ntfs98ap.gif',
- 'diagram' ]} showing exactly the principle of the Captive NTFS project.
- Read/write functionality is apparently disabled in <i>NTFS for
- Windows 98</i> as the same company also sells the following product sharing
- the same codebase:</p>
-
- <h2>@{[ a_href 'http://www.winternals.com/products/repairandrecovery/ntfsdospro.asp','NTFSDOS Professional' ]}</h2>
-
- <p>Closed-source read/write @{[ '$299' ]} equivalent for MS-DOS.</p>
-
- <p>This product is the closest equivalent to Captive NTFS but it is
- a commercial, closed-source product and it has a filesystem interface only
- for MS-DOS.</p>
-
-
-<h1>Re: @{[ a_href 'http://linux-ntfs.sourceforge.net/info/ntfs.html#7.7',
- "7.7 Can't we write a wrapper for Windows' driver?" ]}</h1>
-
- <p class="re">> It sounds like a great idea, to start with, but there are numerous
- problems.</p>
-
- <p><span class="re">> The largest technical problem is joining the Windows
- system DLL to the Linux VFS. It could be done, but it wouldn't be pretty.</span><br />
- Yep. :-)</p>
-
- <p><span class="re">> It would have to run as part of the kernel which would mean
- that if it went wrong it could crash the machine. With no source, we might not
- be able to work around the problem.</span><br />
- @{[ a_href '#sandbox','Nope' ]},
- @{[ a_href 'http://lufs.sourceforge.net/lufs/','Linux Userland File System (LUFS)' ]}
- moves the filesystem implementation to UNIX userland where the Microsoft
- Windows filesystem is completely disarmed by the Captive jail of chroot(2),
- setuid(2) and setrlimit(2). Only one narrow connection to the rest of the
- system remains (by CORBA/ORBit). The filesystem's life environment gets kill(2)ed when
- UNIX is no longer satisfied with it. The safety is similar to the
- @{[ a_href 'http://www.vmware.com/solutions/security.html','VMware sandbox' ]}.</p>
-
- <p><span class="re">> The next major problem is compati<!--orig. text typo-->bility.
- Which version of the Windows system file would we use? Picking one would limit
- its use, making the wrapper versatile for all of them would be a programming
- nightmare.</span><br />
- The Microsoft Windows NTFS filesystem driver is capable of accessing older formats
- of the filesystem. This project currently runs the Microsoft Windows XP version;
- porting to Microsoft Windows 2003 Server is expected. (Microsoft Windows upgrades
- the NTFS on-disk filesystem to its own version during a complete CD-ROM Microsoft
- Windows system installation – such an operation is not a threat when using this project.)</p>
+ <li><a href="Reverse.html.pl">Reverse Engineering</a>
+ <ul>
+ <li><a href="Reverse.html.pl#dumpbin">dumpbin.exe</a></li>
+ <li><a href="Reverse.html.pl#WinDbg">WinDbg Windows NT kernel debugging</a>
+ <ul>
+ <li><a href="Reverse.html.pl#WinDbg_WinDbg">WinDbg side setup</a></li>
+ <li><a href="Reverse.html.pl#WinDbg_kern">Setup of the side being kernel-debugged</a></li>
+ </ul></li>
+ </ul></li>
+ </ul></li>
+
+<li><a href="Details.html.pl">Implementation Details</a>
+
+ <ul>
+ <li><a href="CacheManager.html.pl">NT Cache Manager</a>
+ <ul>
+ <li><a href="CacheManager.html.pl#TraceFS">TraceFS NT Cache Manager Tracer</a>
+ <ul>
+ <li><a href="CacheManager.html.pl#TraceFS_general">TraceFS for general API tracing</a></li>
+ </ul></li>
+ </ul></li>
- <p><span class="re">> And it gets worse. The legal implications of
- distributing Windows systems files would cause problems.</span><br />
- The user must be careful to obey all licensing restrictions according to
- the laws of their country.<br />
- <span class="re">> Also the proprietary nature of the driver would mean that
- the other kernel coders would not investigate any problems if someone had used
- the NTFS wrapper.</span><br />
- It does not apply to this project due to the implemented
- @{[ a_href '#sandbox','filesystem separation' ]}.</p>
+ <li><a href="Details.html.pl#emulmeth">Choice of the Emulation Methods</a>
+ <ul>
+ <li><a href="Details.html.pl#emulmeth_vm">Virtualmachine Running the Original W32 Subsystem</a></li>
+ <li><a href="Details.html.pl#method_ntoskrnl">"ntoskrnl.exe" Inside Virtual Address Space</a></li>
+ <li><a href="Details.html.pl#emulmeth_fs">Filesystem Driver Inside Virtual Address Space</a></li>
+ </ul></li>
+ <li><a href="Details.html.pl#apichoice">API Function Implementation Choices</a></li>
+ <li><a href="Details.html.pl#sandbox">Sandboxing of W32 filesystem</a></li>
+ <li><a href="Details.html.pl#patched">"patched" vs. "unpatched" Libraries</a></li>
+ <li><a href="Details.html.pl#mman">Memory Management</a></li>
+ <li><a href="Details.html.pl#unicode">Unicode Strings and Characters</a></li>
+ <li><a href="Details.html.pl#binfmt">Supported Binary Formats</a></li>
+ <li><a href="Details.html.pl#mounted_one">At Most One Mounted Filesystem</a></li>
+ <li><a href="Details.html.pl#synchronous">Multithreading and Multiple Processors</a></li>
+ <li><a href="Details.html.pl#paranoia">Paranoia Checks</a></li>
+ <li><a href="Details.html.pl#logfile">STATUS_LOG_FILE_FULL</a></li>
+ <li><a href="Details.html.pl#parent_connector">ParentConnector volume remounter</a></li>
+
+ <li><a href="APITypes.html.pl">API Function Implementation Choices</a>
+ <ul>
+ <li><a href="APITypes.html.pl#functype_pass">Direct Pass to Original "ntoskrnl.exe"</a>
+ <ul>
+ <li><a href="APITypes.html.pl#functype_pass_fromunix">Pass from UNIX Code</a></li>
+ <li><a href="APITypes.html.pl#functype_pass_fromw32">Pass from W32 Code</a></li>
+ </ul></li>
+ <li><a href="APITypes.html.pl#functype_wrap">Wrap of the Original "ntoskrnl.exe" Function</a>
+ <ul>
+ <li><a href="APITypes.html.pl#functype_wrap_fromunix">Wrapping of Call from UNIX Code</a></li>
+ <li><a href="APITypes.html.pl#functype_wrap_fromw32">Wrapping of Call from W32 Code</a></li>
+ </ul></li>
+ <li><a href="APITypes.html.pl#functype_native">Native Implementation</a>
+ <ul>
+ <li><a href="APITypes.html.pl#functype_native_fromunix">Native Implementation Called from UNIX Code</a></li>
+ <li><a href="APITypes.html.pl#functype_native_fromw32">Native Implementation of "unpatched"
+ Library Function Called from W32 Code</a></li>
+ <li><a href="APITypes.html.pl#functype_native_fromw32_patched">Native Implementation of "patched"
+ Library Function Called from W32 Code</a></li>
+ <li><a href="APITypes.html.pl#functype_native_reactos">Native Implementation - ReactOS</a></li>
+ <li><a href="APITypes.html.pl#functype_native_wine">Native Implementation – Wine</a></li>
+ <li><a href="APITypes.html.pl#functype_native_libcaptive">Native Implementation – Project Specific</a></li>
+ </ul></li>
+ <li><a href="APITypes.html.pl#functype_undef">Undefined Function</a></li>
+ </ul></li>
+
+ <li><a href="CallType.html.pl">API Function Calling Conventions</a>
+ <ul>
+ <li><a href="CallType.html.pl#calltype_cdecl">W32 Calling Convention "cdecl"</a></li>
+ <li><a href="CallType.html.pl#calltype_stdcall">W32 Calling Convention "stdcall"</a></li>
+ <li><a href="CallType.html.pl#calltype_fastcall">W32 Calling Convention "fastcall"</a></li>
+ </ul></li>
+ </ul></li>
+
+<li><a href="TODO.html.pl#todo_fsck">TODO: Fsck of NTFS</a></li>
+<li><a href="TODO.html.pl#todo_surprise">TODO: NTFS Support for Partition Surprise</a></li>
+
+<li><a href="Related.html.pl">Related Projects</a>
+ <ul>
+ <li><a href="Related.html.pl#LinuxNTFScompet">Linux NTFS</a></li>
+ <li><a href="Related.html.pl#NTPwd">NTPwd NTFS Driver</a></li>
+ <li><a href="Related.html.pl#vmware">VMware Workstation</a></li>
+ <li><a href="Related.html.pl#wine">Wine Project</a></li>
+ <li><a href="Related.html.pl#ntfs98">NTFS for Windows 98</a></li>
+ <li><a href="Related.html.pl#ntfsdos">NTFSDOS Professional</a></li>
+ </ul></li>
+
+<li><a href="LinuxNTFS.html.pl">Re: 7.7 Can't we write a wrapper for Windows' driver?</a></li>
+
+</ul>
HERE