# Captive project doc APITypes page Perl template.
# Copyright (C) 2003 Jan Kratochvil <project-www.jankratochvil.net@jankratochvil.net>
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; exactly version 2 of June 1991 is required
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
package project::captive::doc::APITypes;
require 5.6.0; # at least 'use warnings;' but we need some 5.6.0+ modules anyway
our $VERSION=do { my @r=(q$Revision$=~/\d+/g); sprintf "%d.".("%03d"x$#r),@r; };
BEGIN{ open F,"Makefile"; our $top_dir=pop @{[split /\s/,(grep /^top_srcdir/,<F>)[0]]}; eval "use lib '$top_dir'"; close F; }
use project::captive::doc::Macros;
"__PACKAGE__"=>__PACKAGE__,
"title"=>'Captive NTFS Developer Documentation: API Functions',
"head_css"=>$doc_Macros_head_css,
<a name="functype"><h1>API Function Implementation Choices</h1></a>
<p>For each function exported by the W32
<span class="fname">ntoskrnl.exe</span> and imported and called by the
filesystem driver, a decision must be made on how to implement its
functionality. Statistics of the currently implemented functionality are provided
<table border="1" align="center">
<tr><th>Function type </th><th>Items</th><th>Portion</th></tr>
<tr><td>@{[ a_href 'APITypes.html.pl#functype_pass','pass' ]} </td><td> 81</td><td> 26%</td></tr>
<tr><td>@{[ a_href 'APITypes.html.pl#functype_wrap','wrap' ]} </td><td> 2</td><td> 1%</td></tr>
<tr><td>@{[ a_href 'APITypes.html.pl#functype_native_reactos','native-ReactOS' ]}</td><td> 113</td><td> 36%</td></tr>
<tr><td>@{[ a_href 'APITypes.html.pl#functype_native_libcaptive','native-own' ]} </td><td> 116</td><td> 37%</td></tr>
<caption>Function Implementation Types Statistics</caption>
</table>
<p>As there are several ways to implement each function, the sections below
list them in the order in which they are usually attempted and investigated.</p>
<p>Special care must be taken with data-type symbols, since they are
referenced without any possibility of catching the code flow by
breakpoints (that would be possible only in some special access cases). Data
export symbols of <span class="constant">unpatched</span> libraries must
already contain prepared content at runtime. There is a problem
with <span class="constant">patched</span> libraries, where the data symbol
must also be fully implemented as a
@{[ a_href 'APITypes.html.pl#functype_native','native implementation' ]}, since it is not
possible to @{[ a_href 'APITypes.html.pl#functype_pass','pass' ]} the data symbol in place of
the original W32 data location; therefore two instances of
such a data variable will exist. As the <span class="constant">patched</span>
library itself will still make uncaught references to the original W32 data
location, such symbols should usually be only constants (such as
<span class="constant">KeNumberProcessors</span>).</p>
<p>On the W32 platform a symbol can be exported and imported either by its
name or just by its identification number, called the
<span class="constant">Ordinal</span>.
Although ordinals save some binary size in the jump tables, they are currently no
longer used by W32 binaries, and this project does not support the
<span class="constant">Ordinal</span> symbol reference type at all.</p>
<p>All the exporting magic is handled by the custom script
<span class="fname">captivesym</span>, which processes the definition file
<span class="fname">@{[ captive_srcfile 'src/libcaptive/ke/exports.captivesym' ]}</span>
to produce the intermediate relaying code
<span class="fname">src/libcaptive/ke/exports.c</span>. For details of the
<span class="fname">captivesym</span>-specific source file syntax please
see its documentation:
<span class="fname">@{[ a_href
$W->{"top_dir"}.'/project/Pod2Html.html.pl?cvs=priv/captive/src/libcaptive/ke/captivesym.pl',
'src/libcaptive/ke/captivesym.pl' ]}</span>
<a name="functype_pass"><h2>Direct Pass to Original "ntoskrnl.exe"</h2></a>
<p>Simple (standalone) functions such as
<span class="function">RtlTimeToSecondsSince1970()</span> can simply be
passed to the original implementation in
<span class="fname">ntoskrnl.exe</span>, as they make no hardware access
and do not expect any special internal data structures to be set up
in advance by an earlier library initialization. Common cases are
the data-structure utility functions such as the
<span class="constant">GenericTable</span> subsystem or
<span class="constant">LargeMcb</span> handling.</p>
<a name="functype_pass_fromunix"><h3>Pass from UNIX Code</h3></a>
<p>Control flow begins in some standard UNIX code. Such code always
uses the @{[ a_href 'CallType.html.pl#calltype_cdecl','cdecl call type' ]} for all its
internal calls. <a href="APITypes.html.pl#functype_native_reactos">Native functions
compiled from <span class="productname">ReactOS</span> sources</a> use
their own @{[ a_href '#calltype','cdecl/stdcall/fastcall' ]} declarations,
but these call type modifications are discarded during compilation for
this project by the <span class="constant">LIBCAPTIVE</span>
<p>UNIX code calls the <span class="function">FUNCTIONNAME()</span> relay
from the generated UNIX jump table. The relay debug dumps the
passed arguments and finally passes control to the original W32
function code in the proper call type
@{[ a_href '#calltype','cdecl/stdcall/fastcall' ]} for a given
<p>The original W32 code entry point is always trapped by a breakpoint,
although it would not be needed for this specific direct pass from
UNIX code to the original W32 implementation. The breakpoint still has
to be there to catch other possible calls (such as intra-W32 ones)
described later. There are several ways to place a breakpoint in
the code. One way is to use the processor's hardware breakpoint support, but
the number of such breakpoints is limited. Another way is to patch in the
<span class="instruction">@{[ 'int $3' ]}</span> instruction, but it invokes
the <span class="constant">SIGTRAP</span> signal handler, which conflicts with
control by a possible debugger (<span class="productname">gdb(1)</span>).
This project uses the <span class="instruction">hlt</span>
instruction, which has a single-byte opcode just like
<span class="instruction">@{[ 'int $3' ]}</span> and is a privileged
instruction forbidden in UNIX user space code.
<span class="instruction">hlt</span> raises the
<span class="constant">SIGSEGV</span> signal, which can be resolved by
a custom signal handler without any conflict with a possible
debugger; <span class="productname">gdb(1)</span> needs the
following command to pass through such a
<span class="constant">SIGSEGV</span> signal:</p>
<blockquote class="command">
<p>handle SIGSEGV nostop noprint pass</p>
</blockquote>
<p>When a breakpoint is caught, we usually need to return to the
running code. Unfortunately this is not directly possible because of the
patched breakpoint opcode. The breakpoint cannot simply be removed upon return,
as that would permanently lose control over the point of entry. Even if
the return faked the return address in the bottom
stack frame to patch the breakpoint back during the later function exit, it
still would not catch the inner calls of recursive
functions. One working possibility would be to patch the
original instruction back and perform a single step provided by the
<span class="function">ptrace(2)</span> syscall. However, such a
single step needs another controlling UNIX process, and it would again
conflict with debuggers such as
<span class="productname">gdb(1)</span>. This project implements the
single-step functionality by two consecutive breakpoints
(<span class="instruction">hlt</span> instructions, to be specific).
The first two instruction addresses of a W32 function are called
<span class="productname">slot #1</span> and
<span class="productname">slot #2</span>; the length of the first
function instruction has to be analyzed to get the right address of
<span class="productname">slot #2</span>. When the first breakpoint is
caught, the original instruction is patched back and another breakpoint
is patched in place of
<span class="productname">slot #2</span>.
When the <span class="productname">slot #2</span> breakpoint
is hit, the operation is reverted: the breakpoint is put back
to <span class="productname">slot #1</span> and the instruction
of <span class="productname">slot #2</span> is restored so that
execution of the function can continue.</p>
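<p>The slot #1/slot #2 exchange described above can be sketched as a small
state machine over a byte buffer. This is a minimal simulation under stated
assumptions, not the project's actual code: the struct trap type and the
trap_arm(), trap_hit_slot1() and trap_hit_slot2() names are hypothetical, the
length of the first instruction is supplied by the caller instead of being
decoded, and no real signal handler or memory protection change is involved.
The only fixed fact used is that hlt is the single-byte opcode 0xF4 on i386.</p>

```c
#include <assert.h>
#include <stddef.h>

#define OPCODE_HLT 0xF4u /* single-byte i386 hlt opcode */

/* Hypothetical bookkeeping for one trapped function (illustrative names). */
struct trap {
    unsigned char *code;  /* start of the function body, i.e. slot #1 */
    size_t first_len;     /* length of the first instruction; slot #2 = code + first_len */
    unsigned char saved1; /* original byte at slot #1 */
    unsigned char saved2; /* original byte at slot #2 */
};

/* Arm the trap: remember both original bytes and put hlt into slot #1. */
void trap_arm(struct trap *t, unsigned char *code, size_t first_len)
{
    assert(t != NULL && code != NULL && first_len > 0);
    t->code = code;
    t->first_len = first_len;
    t->saved1 = code[0];
    t->saved2 = code[first_len];
    code[0] = OPCODE_HLT;
}

/* hlt at slot #1 faulted: restore slot #1 and move the breakpoint to slot #2. */
void trap_hit_slot1(struct trap *t)
{
    t->code[0] = t->saved1;
    t->code[t->first_len] = OPCODE_HLT;
}

/* hlt at slot #2 faulted: restore slot #2 and re-arm slot #1. */
void trap_hit_slot2(struct trap *t)
{
    t->code[t->first_len] = t->saved2;
    t->code[0] = OPCODE_HLT;
}
```

<p>For a body that starts with the one-byte push ebp (0x55), arming with a
first_len of 1 puts hlt at offset 0; the first hit moves it to offset 1 and
the second hit moves it back, so the entry point stays trapped for the next
call.</p>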
<p>The W32 function finishes in its specific
@{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]}; control
returns to the UNIX jump table relay, which debug dumps the
return value and finally passes control back to the UNIX
caller in the standard UNIX
@{[ a_href 'CallType.html.pl#calltype_cdecl','cdecl call type' ]}.</p>
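<p>The shape of such a relay can be sketched in C as follows. This is only an
illustration under stated assumptions, not the code generated by
captivesym: the names w32_entry, fake_w32_double, relay_calls and the
W32_STDCALL macro are hypothetical, the "original W32 function" is a local
stand-in, and the call type attribute is applied only when compiling for
i386, where it has a meaning.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__i386__)
#define W32_STDCALL __attribute__((__stdcall__))
#else
#define W32_STDCALL /* call type attributes only matter on i386 */
#endif

/* Pointer type of the W32 code entry, in its own call type. */
typedef unsigned long (W32_STDCALL *w32_func_t)(unsigned long arg);

/* Hypothetical stand-in for the original W32 function code. */
static unsigned long W32_STDCALL fake_w32_double(unsigned long arg)
{
    return arg * 2;
}

w32_func_t w32_entry = fake_w32_double;
unsigned relay_calls; /* number of completed relayed calls */

/* cdecl relay: dump the argument, call the W32 code, dump the result. */
unsigned long FUNCTIONNAME(unsigned long arg)
{
    char line[64];
    unsigned long r;

    assert(w32_entry != NULL);
    snprintf(line, sizeof line, "enter: FUNCTIONNAME(arg=%lu)", arg);
    puts(line);
    r = w32_entry(arg); /* the compiler emits the proper call type here */
    snprintf(line, sizeof line, "leave: FUNCTIONNAME()=%lu", r);
    puts(line);
    relay_calls++;
    return r;
}
```

<p>The UNIX caller sees an ordinary cdecl function, while the call type
conversion is left to the compiler through the function pointer type.</p>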
@{[ doc_img 'fig/functype_patched_pass_fromunix',
'Function Type: <span class="constant">pass</span> from UNIX Code' ]}
<a name="functype_pass_fromw32"><h3>Pass from W32 Code</h3></a>
<p>This function type is similar to the
@{[ a_href 'APITypes.html.pl#functype_pass_fromunix','previous one' ]}, except for
a more complicated entry point. Unfortunately W32 libraries call their
own functions directly, using <span class="instruction">call</span>
instructions without any patchable jump table. Even the
<span class="instruction">call</span> argument itself cannot be patched
according to a relocation table record, as such a library intra-call
instruction has no relocation due to its relative argument offset on
<span class="constant">i386</span>. This time the double-breakpoint
mechanism @{[ a_href 'APITypes.html.pl#functype_pass_fromunix','described above' ]} comes in
handy, since it catches the entry point when the function gets
called. The <span class="constant">SIGSEGV</span> handler is invoked by
the <span class="instruction">hlt</span> instruction and
redirects control to the jump table relay function to debug dump the
function entry arguments (it has no other uses in this call type).</p>
<p>When the relay needs to call the original function, it reaches
exactly the same breakpoint instruction as during the recent
<span class="constant">SIGSEGV</span> handling that redirected to this
calling relay. But this time the
<span class="constant">through_w32_func</span> field of this function
record is set, to prevent repeated redirection and to let
control pass through the breakpoint instead.</p>
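<p>The decision taken at the entry breakpoint can be sketched as below. This
is a hedged illustration only: through_w32_func is the one name taken from
the text, while struct func_record, on_entry_breakpoint(),
relay_prepare_call() and the counters are hypothetical. Clearing the flag
once it has been consumed is also an assumption of this sketch, not something
the text states.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum trap_action { TRAP_REDIRECT_TO_RELAY, TRAP_PASS_THROUGH };

/* Hypothetical per-function record; only through_w32_func comes from the text. */
struct func_record {
    bool through_w32_func; /* set by the relay before it calls the W32 entry */
    unsigned redirects;    /* how many times control was sent to the relay */
    unsigned passes;       /* how many times control was let through */
};

/* Decision of the SIGSEGV handler when the entry breakpoint is hit. */
enum trap_action on_entry_breakpoint(struct func_record *f)
{
    assert(f != NULL);
    if (f->through_w32_func) {
        /* The relay itself is calling: consume the flag (assumption)
         * and let the call proceed into the original W32 code. */
        f->through_w32_func = false;
        f->passes++;
        return TRAP_PASS_THROUGH;
    }
    /* Any other caller: send control to the debug dumping relay. */
    f->redirects++;
    return TRAP_REDIRECT_TO_RELAY;
}

/* The relay marks the record right before calling the original entry point. */
void relay_prepare_call(struct func_record *f)
{
    assert(f != NULL);
    f->through_w32_func = true;
}
```

<p>A call arriving from W32 code is redirected first; the relay then marks
the record, hits the same breakpoint, and is let through, after which the next
outside call is redirected again.</p>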
<p>Returning is not very interesting, as the first
<span class="constant">SIGSEGV</span> handler did a straight jump
for the redirection purposes without any needed consequent
<p>The jump table relay used for callers from W32 code is
different from the relay used for callers
@{[ a_href 'APITypes.html.pl#functype_pass_fromunix','from UNIX code' ]}. UNIX code always
uses a relay with the external @{[ a_href 'CallType.html.pl#calltype_cdecl','cdecl call type' ]},
but in this case a relay with the appropriate
@{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]} is used.</p>
@{[ doc_img 'fig/functype_patched_pass_fromw32',
'Function Type: <span class="constant">pass</span> from W32 Code' ]}
<table border="1" align="center">
<tr><td><span class="fname">captivesym</span> keyword</td><td>pass</td></tr>
<tr><td>Native code function name </td><td>(no implementation)</td></tr>
<tr><td>W32 traced code from UNIX function name </td><td>FUNCNAME</td></tr>
<tr><td>W32 traced code from W32 function name </td><td>FUNCNAME_cdecl/_stdcall/_fastcall</td></tr>
<tr><td>Entry/exit debug tracing from UNIX code </td><td>yes</td></tr>
<tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
<caption>Function Type <span class="constant">pass</span> Characteristics</caption>
</table>
<a name="functype_wrap"><h2>Wrap of the Original "ntoskrnl.exe" Function</h2></a>
<a name="functype_wrap_fromunix"><h3>Wrapping of Call from UNIX Code</h3></a>
<p>The control flow has no special tricky features, since it is
very similar to <a href="APITypes.html.pl#functype_pass_fromunix">the direct pass to
W32 function from UNIX code</a>. All the wrapping is done in the
standard UNIX @{[ a_href 'CallType.html.pl#calltype_cdecl','cdecl call type' ]} manner.
Jump table debug dumping relays are provided twice: the
"outer" one to trace the parameters from the function caller
and the "inner" one to trace the call from the wrapper to the
original W32 code. The "inner" relay also calls the W32 code
with the appropriate <a href="#calltype">cdecl/stdcall/fastcall call
@{[ doc_img 'fig/functype_patched_wrap_fromunix',
'Function Type: <span class="constant">wrap</span> from UNIX Code' ]}
<a name="functype_wrap_fromw32"><h3>Wrapping of Call from W32 Code</h3></a>
<p>This scheme is a combination of the
<a href="APITypes.html.pl#functype_wrap_fromunix">previous wrap of a call from
UNIX code</a> and the <a href="APITypes.html.pl#functype_pass_fromw32">direct pass from
the W32 code</a>. Control is caught and redirected by the
<span class="constant">SIGSEGV</span> handler from the breakpoint
placed at the entry to the original W32 function code. The second entry
to the original W32 function, with the
<span class="constant">through_w32_func</span> field of this function
description already set, is made from the "inner" jump table
relay with the appropriate
@{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]}.</p>
@{[ doc_img 'fig/functype_patched_wrap_fromw32',
'Function Type: <span class="constant">wrap</span> from W32 Code' ]}
<p>Some functions can be <a href="APITypes.html.pl#functype_pass">passed to the original
code</a> but need their parameters to be checked or prepared first.
Currently, such wrapping is needed only for the
<span class="function">ExAllocateFromPagedLookasideList()</span> function,
where it is required due to the <a href="#init_ntoskrnl">missing execution of
<span class="fname">ntoskrnl.exe</span> initialization</a>,
which would otherwise properly initialize some internal data structures.
In this case the wrapping code detects that an uninitialized
parameter was passed and searches through the whole
<span class="fname">ntoskrnl.exe</span> code body at runtime to find the
proper initialization routine containing the correct initialization
parameters. Passed addresses of static structures must be differentiated,
as each of them usually has different initialization parameters. It is
safer not to have a fixed parameters array, as these parameters may
differ across different <span class="fname">ntoskrnl.exe</span>
<table border="1" align="center">
<tr><td><span class="fname">captivesym</span> keyword</td><td>wrap</td></tr>
<tr><td>Native UNIX wrapping code function name </td><td>FUNCNAME_wrap</td></tr>
<tr><td>W32 traced wrapping code from UNIX func. name </td><td>FUNCNAME</td></tr>
<tr><td>W32 traced wrapping code from W32 func. name </td><td>FUNCNAME_cdecl/_stdcall/...</td></tr>
<tr><td>W32 traced original code function name </td><td>FUNCNAME_orig</td></tr>
<tr><td>Entry/exit debug tracing from UNIX code </td><td>yes</td></tr>
<tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
<caption>Function Type <span class="constant">wrap</span> Characteristics</caption>
</table>
<a name="functype_native"><h2>Native Implementation</h2></a>
<a name="functype_native_fromunix"><h3>Native Implementation Called from UNIX Code</h3></a>
<p>This is the simplest case of a function call as it is fully
handled only by the compiler and/or linker.</p>
<p>In this case, though, no debug dumping call relay is provided. Such a
relay would need the implementations of native functions to be renamed to
prevent their automatic linking with the caller code. This renaming could
not be done by a simple <span class="constant">#define</span>,
since it would also rename any calls of such a function in
the same C sources. One possible solution would be to
use the <span class="dashdash">--redefine-sym</span> feature of the
<span class="productname">objcopy(1)</span> utility. On the other hand,
there is not much need to catch or debug such calls, as both the caller and
the callee are provided with full source file debug information for the
debugger. Also, the callee usually debug dumps its entry/exit parameters
by custom debug dumps in the
<a href="APITypes.html.pl#functype_native_reactos"><span class="productname">ReactOS</span> implementations</a>.
@{[ doc_img 'fig/functype_native_fromunix',
'Function Type: <span class="constant">native</span> from UNIX Code' ]}
<a name="functype_native_fromw32"><h3>Native Implementation of
"unpatched" Library Function Called from W32 Code</h3></a>
@{[ doc_img 'fig/functype_unpatched_native_fromw32',
'Function Type: <span class="constant">native</span> of <span class="constant">unpatched</span> from W32 Code' ]}
<p>Here it matters whether the project deals with
a <span class="constant">patched</span> or an
<span class="constant">unpatched</span> version of the library
(a <span class="constant">patched</span> library is a loaded W32 binary
library, while an <span class="constant">unpatched</span> library is
completely provided by this project with no use of the library's
original W32 binary file). As the project adjusts the exported symbol
address during the patching operation, in some cases a
<span class="constant">patched</span> library call may be handled
simply as an <span class="constant">unpatched</span> library call even for
the <span class="constant">patched</span> libraries. Fortunately the
distinction is not very important, as the project is prepared to
properly handle both cases.</p>
<p>The W32 caller which imported the symbol is pointed right to
the relaying function. The debug dumping relay is called from W32
code with the appropriate
@{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]}, while the
relay calls the implementation of the native function in the
standard UNIX @{[ a_href 'CallType.html.pl#calltype_cdecl','cdecl call type' ]} manner.</p>
<a name="functype_native_fromw32_patched"><h3>Native Implementation of "patched" Library Function Called from W32 Code</h3></a>
@{[ doc_img 'fig/functype_patched_native_fromw32',
'Function Type: <span class="constant">native</span> of <span class="constant">patched</span> from W32 Code' ]}
<p>The calling scheme is similar to the
<a href="APITypes.html.pl#functype_native_fromw32">previous call of
<span class="constant">unpatched</span> library function from W32
code</a>, but the call control is redirected from the entry point of the
original W32 binary implementation by the breakpoint and its
<span class="constant">SIGSEGV</span> handler as in
<a href="APITypes.html.pl#functype_pass_fromw32">the case of passing control from W32
<p>The original W32 function implementation located in the original
loaded binary file is never executed but its entry point needs to be
trapped by the breakpoint to be able to catch the function calls within
<p>In all cases the final function implementation is standard UNIX
code compiled from C sources with full debug information available
for the debugger. Fortunately, not all such functions need to be coded
from scratch for this project, since the $freespeech
$ReactOS and $Wine projects already exist and their code can be used instead.</p>
<p>The $Wine project is listed mostly for completeness, as almost none of its
code was suitable for reuse: it implements the W32 user space while this
project runs a pure W32 kernel space environment (in $gnulinux user
<a name="functype_native_reactos"><h3>Native Implementation
– <span class="productname">ReactOS</span></h3></a>
<p>Some functions are already implemented in the $ReactOS
project and can be used as they are. Although it would be
possible to <a href="APITypes.html.pl#functype_pass">pass some function calls to the
original code</a>, it is handier to provide a native implementation, as
the provided debugging symbols give better control of the data handling
during debugging sessions.</p>
<p>Such functions can be found in the
<span class="fname">src/libcaptive/reactos/</span> subdirectory.
Some functions had to be adjusted for this project;
these modifications are compiled conditionally, depending on whether the
<span class="constant">LIBCAPTIVE</span> symbol is defined.</p>
<p>Later stages of this project reached a level where
$ReactOS is still too immature and the needed functions are usually
written with just the sad body:</p>
<blockquote class="command">
<p>UNIMPLEMENTED;</p>
</blockquote>
<p>Functions that were impossible to
@{[ a_href 'APITypes.html.pl#functype_pass','pass' ]} were reimplemented by this project
and placed in the project's implementation directories
@{[ a_href '#reactos_nocare','instead of extending' ]} the $ReactOS code.</p>
<a name="functype_native_wine"><h3>Native Implementation – <span class="productname">Wine</span></h3></a>
<p>Even though $Wine only implements the
<span class="productname">Microsoft Windows NT</span> user space, there
are still some common functions which could be copied from the $Wine
<a name="functype_native_libcaptive"><h3>Native Implementation – Project Specific</h3></a>
<p>As a last resort, it was necessary to provide this project's own complete
implementation of some API functions, such as the PC hardware dependent
parts or memory management functions.</p>
<table border="1" align="center">
<tr><td><span class="fname">captivesym</span> keyword</td><td>(none; just the symbol name)</td></tr>
<tr><td>Native code function name </td><td>FUNCTIONNAME</td></tr>
<tr><td>Native traced code from W32 code func. name </td><td>FUNCTIONNAME_cdecl/_std...</td></tr>
<tr><td>Entry/exit debug tracing from UNIX code </td><td>no</td></tr>
<tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
<caption>Function Type <span class="constant">native</span> Characteristics</caption>
</table>
<a name="functype_undef"><h2>Undefined Function</h2></a>
<p>Functions not defined by any of the previous function types cannot be
called by any W32 code, including the code of the library implementing
such a function. All functions of <span class="constant">patch</span>ed
libraries not listed in the <span class="fname">captivesym</span> exports
file are automatically set to be trapped as fatal program execution
<p>It is not necessary to list the symbols as
<span class="constant">undef</span> as long as you are just loading
W32 <span class="constant">PE-32</span> code and the symbols belong to a
<span class="constant">patch</span>ed library. On the other hand, if you
are loading W32 <span class="fname">.so</span> code or if such a symbol is
part of an <span class="constant">unpatched</span> library (and is thus
completely provided by the project), you need to list the symbol as
<span class="constant">undef</span> type to prevent unresolved symbol
<table border="1" align="center">
<tr><td><span class="fname">captivesym</span> keyword</td><td>undef</td></tr>
<tr><td>Native code function name </td><td>(no implementation)</td></tr>
<tr><td>Native traced code function name </td><td>FUNCTIONNAME_cdecl/_stdcall/_fastcall</td></tr>
<tr><td>Debug tracing message from UNIX code </td><td>yes</td></tr>
<tr><td>Debug tracing message from W32 code </td><td>yes</td></tr>
<caption>Function Type <span class="constant">undef</span> Characteristics</caption>
</table>