# Captive project doc APITypes page Perl template.
# Copyright (C) 2003 Jan Kratochvil <project-www.jankratochvil.net@jankratochvil.net>
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; exactly version 2 of June 1991 is required
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
package project::captive::doc::APITypes;
require 5.6.0; # at least 'use warnings;' but we need some 5.6.0+ modules anyway
our $VERSION=do { my @r=(q$Revision$=~/\d+/g); sprintf "%d.".("%03d"x$#r),@r; };
BEGIN{ open F,"Makefile"; our $top_dir=pop @{[split /\s/,(grep /^top_srcdir/,<F>)[0]]}; eval "use lib '$top_dir'"; close F; }
BEGIN { Wuse 'project::captive::doc::Macros'; }
project::captive::doc::Macros->init(
"__PACKAGE__"=>__PACKAGE__,
"title"=>'Captive NTFS Developer Documentation: API Functions',
"rel_prev"=>'Details.html.pl',
"rel_next"=>'CallType.html.pl',
45 <a name="functype"><h1>API Function Implementation Choices</h1></a>
<p>For each function exported by W32
<span class="fname">ntoskrnl.exe</span> and imported and called by the
filesystem driver, a decision has to be made on how to implement its
functionality. Statistics of the currently implemented functionality are provided
53 <table border="1" align="center">
54 <tr><th>Function type </th><th>Items</th><th>Portion</th></tr>
55 <tr><td>@{[ a_href 'APITypes.html.pl#functype_pass','pass' ]} </td><td> 81</td><td> 26%</td></tr>
56 <tr><td>@{[ a_href 'APITypes.html.pl#functype_wrap','wrap' ]} </td><td> 2</td><td> 0%</td></tr>
57 <tr><td>@{[ a_href 'APITypes.html.pl#functype_native_reactos','native-ReactOS' ]}</td><td> 113</td><td> 36%</td></tr>
58 <tr><td>@{[ a_href 'APITypes.html.pl#functype_native_libcaptive','native-own' ]} </td><td> 116</td><td> 38%</td></tr>
59 <caption>Function Implementation Types Statistics</caption>
62 @{[ doc_img 'ratio','Functions Reusal Ratio' ]}
<p>As there are several ways to implement each function, the sections
below list them in the order in which they are usually investigated and attempted.</p>
<p>Special care must be taken with data-type symbols since they are
referenced without any possibility of catching the code flow by
breakpoints (that would be possible only in some special access cases). Data
export symbols of <span class="constant">unpatched</span> libraries must
already contain prepared content at runtime. There is a problem
with <span class="constant">patched</span> libraries, where it is necessary
to also fully implement the data symbol as a
@{[ a_href 'APITypes.html.pl#functype_native','native implementation' ]} since it is not
possible to @{[ a_href 'APITypes.html.pl#functype_pass','pass' ]} the data symbol in place of
the original W32 data location; there will therefore be two instances of
such a data variable. As the <span class="constant">patched</span>
library itself will also keep making uncaught references to the original W32 data
location, such symbols should usually be only constants (such as
<span class="constant">KeNumberProcessors</span>).</p>
<p>W32 platform symbol export/import can be based either on the symbol
name itself or on an
identification number called the <span class="constant">Ordinal</span>.
Although the latter saves some binary file size in the jump tables, it is
no longer used by current W32 binaries, and this project does not support the
<span class="constant">Ordinal</span> symbol reference type at all.</p>
<p>All the exporting magic is handled by the custom script
<span class="fname">captivesym</span>, which processes the definition file
91 <span class="fname">@{[ captive_srcfile 'src/libcaptive/ke/exports.captivesym' ]}</span>
92 to produce the intermediate relaying code
93 <span class="fname">src/libcaptive/ke/exports.c</span>. For details of the
94 <span class="fname">captivesym</span>-specific source file syntax please
95 see its documentation:
96 <span class="fname">@{[ a_href
97 '/project/Pod2Html.html.pl?cvs=priv/captive/src/libcaptive/ke/captivesym.pl',
98 'src/libcaptive/ke/captivesym.pl' ]}</span>
100 <a name="functype_pass"><h2>Direct Pass to Original "ntoskrnl.exe"</h2></a>
102 <p>Simple (standalone) functions such as
103 <span class="function">RtlTimeToSecondsSince1970()</span> can be simply
104 passed to the original implementation in
105 <span class="fname">ntoskrnl.exe</span> as they make no hardware access
106 and they do not expect any special internal data structures to be set up
in advance by an earlier library initialization. Common cases are
the data structure utility functions such as the
<span class="constant">GenericTable</span> subsystem or
<span class="constant">LargeMcb</span> handling.</p>
112 <a name="functype_pass_fromunix"><h3>Pass from UNIX Code</h3></a>
<p>Control flow begins in some standard UNIX code. Such code always
uses the @{[ a_href 'CallType.html.pl#calltype_cdecl','cdecl call type' ]} for all its
internal calls. <a href="APITypes.html.pl#functype_native_reactos">Native functions
117 compiled from <span class="productname">ReactOS</span> sources</a> use
118 their own @{[ a_href '#calltype','cdecl/stdcall/fastcall' ]} declarations
119 but these call type modifications are discarded during compilation for
120 this project by the <span class="constant">LIBCAPTIVE</span>
<p>The UNIX code calls the <span class="function">FUNCTIONNAME()</span> relay
from the generated UNIX jump table. This relay debug dumps the
passed arguments and finally passes control to the original W32
function code using the proper call type
127 @{[ a_href '#calltype','cdecl/stdcall/fastcall' ]} for a given
<p>The original W32 code entry point is always trapped by a breakpoint
although it would not be needed during this specific direct pass from
UNIX code to the original W32 implementation. The breakpoint still has
to be there to catch other possible calls (such as intra-W32 ones)
described later. There are several ways to place a breakpoint in
the code. One way is to use the processor hardware breakpoint support, but
the number of such breakpoints is limited. Another way is to patch in the
<span class="instruction">@{[ 'int $3' ]}</span> instruction, but it invokes
the <span class="constant">SIGTRAP</span> signal handler, conflicting with
the control of a possible debugger (<span class="productname">gdb(1)</span>).
This project uses the <span class="instruction">hlt</span>
instruction which, like
<span class="instruction">@{[ 'int $3' ]}</span>, has a single-byte opcode and is a privileged
instruction forbidden in UNIX user space code.
<span class="instruction">hlt</span> raises the
<span class="constant">SIGSEGV</span> signal, which can be handled by
a custom signal handler without any conflict with the control of a possible
debugger; <span class="productname">gdb(1)</span> needs the
following command to pass such a
<span class="constant">SIGSEGV</span> signal through:</p>
151 <blockquote class="command">
152 <p>handle SIGSEGV nostop noprint pass</p>
<p>When a breakpoint gets caught, execution usually needs to return to the
running code. Unfortunately this is not directly possible because the patched
breakpoint opcode has overwritten the original instruction. The breakpoint
cannot simply be removed upon return,
as that would permanently lose control over the point of entry. Even if
the return faked the return address in the bottom
stack frame in order to patch the breakpoint back during the later function exit, it
still would not catch the inner calls of recursive
functions. One working possibility would be to patch the
original instruction back and perform a singlestep provided by the
<span class="function">ptrace(2)</span> syscall. However, such a
singlestep needs another controlling UNIX process and it would again
conflict with debuggers such as
<span class="productname">gdb(1)</span>. This project implements the
singlestep functionality by two consecutive breakpoints
(<span class="instruction">hlt</span> instructions to be specific).
The addresses of the first two instructions of a W32 function are called
<span class="productname">slot #1</span> and
<span class="productname">slot #2</span>; the length of the first
instruction of the function has to be analyzed to get the right address of
<span class="productname">slot #2</span>. When the first breakpoint is
caught, the original instruction is patched back and
another breakpoint is patched in at
<span class="productname">slot #2</span>.
When the <span class="productname">slot #2</span> breakpoint
is hit, the operation is reverted: the breakpoint is put
at <span class="productname">slot #1</span> again and the instruction
at <span class="productname">slot #2</span> is restored so that
execution of the function can continue.</p>
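<p>The slot hand-over described above can be modelled in a few lines. The
following is only a toy simulation under stated assumptions, not the project's
signal handler: the class name, the byte array standing in for the function
body and the sample x86 bytes are all hypothetical; only the
<span class="instruction">hlt</span> opcode value and the two-slot scheme come
from the text.</p>

```python
HLT = 0xF4  # single-byte x86 opcode of the privileged hlt instruction


class TwoSlotBreakpoint:
    """Toy model of the slot #1 / slot #2 breakpoint hand-over."""

    def __init__(self, code, first_insn_len):
        self.code = bytearray(code)      # function body, patched in place
        self.slot1 = 0                   # entry point of the function
        self.slot2 = first_insn_len      # address of the second instruction
        self.orig1 = self.code[self.slot1]
        self.orig2 = self.code[self.slot2]
        self.code[self.slot1] = HLT      # arm the entry breakpoint

    def on_trap(self, address):
        """Handle a trap at `address`; return the address to resume at."""
        if address == self.slot1:
            # Entry caught: restore instruction 1 and arm slot #2.
            self.code[self.slot1] = self.orig1
            self.code[self.slot2] = HLT
        elif address == self.slot2:
            # Instruction 1 has run: restore instruction 2, re-arm slot #1.
            self.code[self.slot2] = self.orig2
            self.code[self.slot1] = HLT
        else:
            raise ValueError("unexpected trap address")
        return address


# Hypothetical prologue: push ebp (1 byte); mov ebp, esp (2 bytes); ret.
bp = TwoSlotBreakpoint(bytes([0x55, 0x89, 0xE5, 0xC3]), first_insn_len=1)
bp.on_trap(0)   # slot #1 hit: the entry is now executable, slot #2 is armed
bp.on_trap(1)   # slot #2 hit: the entry is armed again for the next call
```

<p>After both traps the entry point carries the breakpoint again, which is
what keeps recursive and repeated calls catchable.</p>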
<p>The W32 function finishes in its specific
@{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]} and control
returns to the UNIX jump table relay, which debug dumps the
return value and finally passes control back to the UNIX
caller in the standard UNIX
@{[ a_href 'CallType.html.pl#calltype_cdecl','cdecl call type' ]}.</p>
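<p>The role of the relay on this path (dump the arguments, call the original,
dump the return value) can be sketched as follows. This is a hedged
illustration only: <span class="function">make_relay()</span>, the log list and
the message format are hypothetical, and the call type switch performed by the
real generated relay is not modelled.</p>

```python
def make_relay(name, original, log):
    """Build a relay that traces the arguments and the result of `original`."""
    def relay(*args):
        log.append("enter %s%r" % (name, args))
        result = original(*args)   # the real relay switches the call type here
        log.append("leave %s -> %r" % (name, result))
        return result
    return relay


trace = []
add_relay = make_relay("Add", lambda a, b: a + b, trace)
add_relay(2, 3)   # trace now holds one "enter" and one "leave" record
```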
191 @{[ doc_img 'fig/functype_patched_pass_fromunix',
192 'Function Type: <span class="constant">pass</span> from UNIX Code' ]}
194 <a name="functype_pass_fromw32"><h3>Pass from W32 Code</h3></a>
<p>This function type is similar to the
@{[ a_href 'APITypes.html.pl#functype_pass_fromunix','previous one' ]} except for
a more complicated entry point. Unfortunately W32 libraries call their
own functions directly, using <span class="instruction">call</span>
instructions without any patchable jump table. Even the
<span class="instruction">call</span> argument itself cannot be patched
according to a relocation table record, as such a library intra-call
instruction has no relocation because its argument is a relative offset on
<span class="constant">i386</span>. This time the double-breakpoint
mechanism @{[ a_href 'APITypes.html.pl#functype_pass_fromunix','described above' ]} comes in
handy since it catches the entry point when the function gets
called. The <span class="constant">SIGSEGV</span> handler gets invoked by
the <span class="instruction">hlt</span> instruction and
redirects control to the jump table relay function to debug dump the
function entry arguments (it has no other use in this call type).</p>
<p>When the relay needs to call the original function, it reaches
exactly the same breakpoint instruction whose
<span class="constant">SIGSEGV</span> handling has just redirected control to this
relay. This time, however, the
<span class="constant">through_w32_func</span> field of this function
record is set, to prevent a repeated redirection and to pass
control through the breakpoint instead.</p>
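<p>The decision taken at the entry breakpoint can be sketched as a small
state check. This is a minimal model under assumptions: only the
<span class="constant">through_w32_func</span> field name comes from the text;
the record class, the function names and the clearing of the flag after one
pass-through are hypothetical.</p>

```python
class FunctionRecord:
    """Hypothetical per-function record; only the flag name is from the text."""

    def __init__(self, name):
        self.name = name
        self.through_w32_func = False


def on_entry_breakpoint(record):
    """Decide what the SIGSEGV handler does at the function entry point."""
    if record.through_w32_func:
        # The relay itself is calling the original: let this call through.
        # Clearing the flag afterwards is an assumption of this sketch.
        record.through_w32_func = False
        return "run original"
    # A W32 caller arrived: divert it to the debug dumping relay.
    return "redirect to relay"


def relay_calls_original(record):
    """Model the relay marking the record before entering the original code."""
    record.through_w32_func = True
    return on_entry_breakpoint(record)
```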
<p>Returning is not very interesting, as the first
221 <span class="constant">SIGSEGV</span> handler did a straight jump
222 for the redirection purposes without any needed consequent
<p>The jump table relay used for callers from W32 code
differs from the relay used for callers
@{[ a_href 'APITypes.html.pl#functype_pass_fromunix','from UNIX code' ]}. UNIX code always
uses a relay with the external @{[ a_href 'CallType.html.pl#calltype_cdecl','cdecl call type' ]},
but in this case a relay with the appropriate
@{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]} is used.</p>
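<p>The naming of the two relay kinds, as listed in the characteristics table
of this function type, can be expressed by a tiny helper. It is purely
illustrative: <span class="function">relay_symbol()</span> and its parameters
are hypothetical and are not part of <span class="fname">captivesym</span>.</p>

```python
CALL_TYPES = ("cdecl", "stdcall", "fastcall")


def relay_symbol(funcname, caller, call_type="cdecl"):
    """Return the relay symbol name for a caller from "unix" or "w32"."""
    if call_type not in CALL_TYPES:
        raise ValueError("unknown call type: " + call_type)
    if caller == "unix":
        return funcname                    # UNIX callers always use cdecl
    if caller == "w32":
        return funcname + "_" + call_type  # W32 callers get a typed relay
    raise ValueError("unknown caller: " + caller)
```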
232 @{[ doc_img 'fig/functype_patched_pass_fromw32',
233 'Function Type: <span class="constant">pass</span> from W32 Code' ]}
237 <table border="1" align="center">
238 <tr><td><span class="fname">captivesym</span> keyword</td><td>pass</td></tr>
239 <tr><td>Native code function name </td><td>(no implementation)</td></tr>
240 <tr><td>W32 traced code from UNIX function name </td><td>FUNCNAME</td></tr>
241 <tr><td>W32 traced code from W32 function name </td><td>FUNCNAME_cdecl/_stdcall/_fastcall</td></tr>
242 <tr><td>Entry/exit debug tracing from UNIX code </td><td>yes</td></tr>
243 <tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
244 <caption>Function Type <span class="constant">pass</span> Characteristics</caption>
247 <a name="functype_wrap"><h2>Wrap of the Original "ntoskrnl.exe" Function</h2></a>
249 <a name="functype_wrap_fromunix"><h3>Wrapping of Call from UNIX Code</h3></a>
<p>The control flow has no special features since it is
very similar to <a href="APITypes.html.pl#functype_pass_fromunix">the direct pass to
W32 function from UNIX code</a>. All the wrapping is done in the
standard UNIX @{[ a_href 'CallType.html.pl#calltype_cdecl','cdecl call type' ]} manner.
Jump table debug dumping relays are provided twice: the
"outer" one to trace the parameters from the function caller
and the "inner" one to trace the call from the wrapper to the
original W32 code. The "inner" relay also calls the W32 code
with the appropriate <a href="#calltype">cdecl/stdcall/fastcall call
type</a>.</p>
262 @{[ doc_img 'fig/functype_patched_wrap_fromunix',
263 'Function Type: <span class="constant">wrap</span> from UNIX Code' ]}
265 <a name="functype_wrap_fromw32"><h3>Wrapping of Call from W32 Code</h3></a>
267 <p>This scheme is a combination of the
268 <a href="APITypes.html.pl#functype_wrap_fromunix">previous wrap of a call from
269 UNIX code</a> and the <a href="APITypes.html.pl#functype_pass_fromw32">direct pass from
270 the W32 code</a>. The control is caught and redirected by
271 <span class="constant">SIGSEGV</span> handler from the breakpoint
272 placed at the entry to the original W32 function code. The second entry
273 to the original W32 function with the
274 <span class="constant">through_w32_func</span> field of this function
275 description already set is done from the "inner" jump table
276 relay with the appropriate
277 @{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]}.</p>
279 @{[ doc_img 'fig/functype_patched_wrap_fromw32',
280 'Function Type: <span class="constant">wrap</span> from W32 Code' ]}
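<p>The "outer" relay, wrapper, "inner" relay and original code form a simple
chain, which the following sketch illustrates. All names here are hypothetical
and the wrapper is an arbitrary stand-in for the parameter checking; the
breakpoint redirection and call type handling are left out.</p>

```python
def traced(label, func, log):
    """Wrap `func` so that each call is recorded in `log` under `label`."""
    def relay(*args):
        log.append(label)
        return func(*args)
    return relay


def build_wrapped(original, wrapper, log):
    """Chain: outer relay -> wrapper -> inner relay -> original."""
    inner = traced("inner relay", original, log)
    return traced("outer relay", lambda *args: wrapper(inner, *args), log)


def double(value):
    """Stand-in for the original W32 function."""
    return value * 2


def double_wrap(call_original, value):
    """Stand-in wrapper: prepare a missing parameter, then call the original."""
    return call_original(0 if value is None else value)


calls = []
wrapped = build_wrapped(double, double_wrap, calls)
wrapped(4)   # recorded order: outer relay first, then inner relay
```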
<p>Some functions can be <a href="APITypes.html.pl#functype_pass">passed to the original
code</a> but they need their parameters to be checked/prepared.
Currently, such wrapping is only needed for the
<span class="function">ExAllocateFromPagedLookasideList()</span> function,
where it is required due to the <a href="#init_ntoskrnl">missing execution of the
<span class="fname">ntoskrnl.exe</span> initialization</a>,
which would otherwise properly initialize some internal data structures.
In this case the wrapping code detects that an uninitialized
parameter has been passed and searches through the whole
<span class="fname">ntoskrnl.exe</span> code body at runtime to find the
proper initialization routine containing the correct initialization
parameters. The passed addresses of static structures must be told apart,
as each of them usually has different initialization parameters. Not
having a fixed parameter array is a proactive measure, as these parameters may
differ across different <span class="fname">ntoskrnl.exe</span>
301 <table border="1" align="center">
302 <tr><td><span class="fname">captivesym</span> keyword</td><td>wrap</td></tr>
303 <tr><td>Native UNIX wrapping code function name </td><td>FUNCNAME_wrap</td></tr>
<tr><td>W32 traced wrapping code from UNIX func. name </td><td>FUNCNAME</td></tr>
305 <tr><td>W32 traced wrapping code from W32 func. name </td><td>FUNCNAME_cdecl/_stdcall/...</td></tr>
306 <tr><td>W32 traced original code function name </td><td>FUNCNAME_orig</td></tr>
307 <tr><td>Entry/exit debug tracing from UNIX code </td><td>yes</td></tr>
308 <tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
309 <caption>Function Type <span class="constant">wrap</span> Characteristics</caption>
312 <a name="functype_native"><h2>Native Implementation</h2></a>
314 <a name="functype_native_fromunix"><h3>Native Implementation Called from UNIX Code</h3></a>
316 <p>This is the simplest case of a function call as it is fully
317 handled only by the compiler and/or linker.</p>
<p>In this case though, no debug dumping call relay is provided. Such a
relay would require renaming the implementations of the native functions to
prevent their automatic linking with the caller code. This renaming
cannot be done by a simple <span class="constant">#define</span>
since it would also rename any calls of such a function in
the same C sources. One possible solution would be to
use the <span class="dashdash">--redefine-sym</span> feature of the
<span class="productname">objcopy(1)</span> utility. On the other hand
there is not much need to catch/debug such calls as both the caller and
the callee are provided with full source file debug information for the
debugger. Also the callee usually debug dumps its entry/exit parameters
by custom debug dumps in the
<a href="APITypes.html.pl#functype_native_reactos"><span class="productname">ReactOS</span> implementations</a>.
333 @{[ doc_img 'fig/functype_native_fromunix',
334 'Function Type: <span class="constant">native</span> from UNIX Code' ]}
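<p>Why a simple <span class="constant">#define</span> cannot do the renaming
can be seen from a toy model of textual macro substitution. This is a
deliberate simplification and not a C preprocessor (string literals and
comments get no special treatment); <span class="function">apply_define()</span>
and the sample identifiers are hypothetical.</p>

```python
def apply_define(source, old, new):
    """Mimic '#define old new' on whole identifiers of `source`."""
    out, word = [], ""
    for ch in source + " ":          # trailing sentinel flushes the last word
        if ch.isalnum() or ch == "_":
            word += ch
            continue
        out.append(new if word == old else word)
        out.append(ch)
        word = ""
    return "".join(out)[:-1]         # drop the sentinel again


renamed = apply_define("int f(void); int g(void) { return f(); }", "f", "f_impl")
# Both the declaration and the call site inside g() now say f_impl, so no
# relay named f could be interposed between the caller and the callee.
```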
336 <a name="functype_native_fromw32"><h3>Native Implementation of
337 "unpatched" Library Function Called from W32 Code</h3></a>
339 @{[ doc_img 'fig/functype_unpatched_native_fromw32',
340 'Function Type: <span class="constant">native</span> of <span class="constant">unpatched</span> from W32 Code' ]}
<p>Here it matters whether the project deals with
a <span class="constant">patched</span> or an
<span class="constant">unpatched</span> version of the library
(a <span class="constant">patched</span> library is a loaded W32 binary
library while an <span class="constant">unpatched</span> library is
completely provided by this project with no use of the library's
original W32 binary file). As the project adjusts the exported symbol
address during the patching operation, in some cases a call into a
<span class="constant">patched</span> library may be handled
simply as an <span class="constant">unpatched</span> library call.
Fortunately the
distinction is not very important as the project is prepared to
properly handle both cases.</p>
356 <p>The W32 caller which imported the symbol will be pointed right to
357 the relaying function. The debug dumping relay will be called from W32
358 code with the appropriate
359 @{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]} while the
360 relay will call the implementation of the native function in the
361 standard UNIX @{[ a_href 'CallType.html.pl#calltype_cdecl','cdecl call type' ]} manner.</p>
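<p>Pointing the imports of a W32 caller at the relays amounts to filling its
import table with relay addresses. The sketch below models this with plain
dictionaries; <span class="function">bind_imports()</span> and the sample
symbol are hypothetical, and the call type conversion done by a real relay is
not modelled.</p>

```python
def bind_imports(import_names, relays):
    """Point every imported symbol of a W32 module at its relay function."""
    table = {}
    for name in import_names:
        if name not in relays:
            # Nothing provides this symbol: report it instead of guessing.
            raise KeyError("unresolved symbol: " + name)
        table[name] = relays[name]
    return table


def native_impl(value):
    """Stand-in for a native function implementation."""
    return value + 1


def relay(value):
    """Stand-in relay: would debug dump, then call the native code."""
    return native_impl(value)


imports = bind_imports(["FUNCTIONNAME"], {"FUNCTIONNAME": relay})
imports["FUNCTIONNAME"](41)   # the W32 caller reaches native_impl via relay
```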
363 <a name="functype_native_fromw32_patched"><h3>Native Implementation of "patched" Library Function Called from W32 Code</h3></a>
365 @{[ doc_img 'fig/functype_patched_native_fromw32',
366 'Function Type: <span class="constant">native</span> of <span class="constant">patched</span> from W32 Code' ]}
<p>The calling scheme is similar to the
<a href="APITypes.html.pl#functype_native_fromw32">previous call of
<span class="constant">unpatched</span> library function from W32
code</a> but the call control is redirected from the entry point of the
original W32 binary implementation by the breakpoint and its
<span class="constant">SIGSEGV</span> handler as in
<a href="APITypes.html.pl#functype_pass_fromw32">the case of passing control from W32
code</a>.</p>
377 <p>The original W32 function implementation located in the original
378 loaded binary file is never executed but its entry point needs to be
379 trapped by the breakpoint to be able to catch the function calls within
<p>In all cases the final function implementation is standard UNIX
code compiled from C sources with full debug information available
for the debugger. Fortunately, not all such functions need to be coded
from scratch for this project, since the $freespeech
$ReactOS and $Wine projects already exist and their code can be used instead.</p>
<p>The $Wine project is listed mostly for completeness, as almost none of its
code was suitable for reuse: it implements the W32 user space while this
project runs a pure W32 kernel space environment (in $gnulinux user
space).</p>
395 <a name="functype_native_reactos"><h3>Native Implementation
396 - <span class="productname">ReactOS</span></h3></a>
<p>Some functions are already implemented in the $ReactOS
project and they can be used as they are. Although it would be
possible to <a href="APITypes.html.pl#functype_pass">pass some function calls to the
original code</a>, it is more convenient to provide a native implementation, as
the debugging symbols it provides give better control of the data handling
during debugging sessions.</p>
<p>Such functions can be found in the
<span class="fname">src/libcaptive/reactos/</span> subdirectory.
Some functions had to be adjusted for this project;
these modifications are compiled conditionally, depending on whether the
<span class="constant">LIBCAPTIVE</span> symbol is defined.</p>
<p>Later stages of this project reached a level where
$ReactOS is still too immature and the needed functions usually
consist of just the sad body:</p>
415 <blockquote class="command">
416 <p>UNIMPLEMENTED;</p>
419 <p>Functions that were not possible to
420 @{[ a_href 'APITypes.html.pl#functype_pass','pass' ]} were reimplemented by this project
421 and placed in the project's implementation directories
422 @{[ a_href '#reactos_nocare','instead of extending' ]} $ReactOS code.</p>
424 <a name="functype_native_wine"><h3>Native Implementation – <span class="productname">Wine</span></h3></a>
426 <p>Even though $Wine only implements the
427 <span class="productname">Microsoft Windows NT</span> user space, there
428 still are some common functions which could be copied from the $Wine
431 <a name="functype_native_libcaptive"><h3>Native Implementation – Project Specific</h3></a>
<p>As a last resort it was necessary to provide the project's own
implementation of some API functions, such as the PC hardware dependent
parts or the memory management functions.</p>
439 <table border="1" align="center">
440 <tr><td><span class="fname">captivesym</span> keyword</td><td>(none; just the symbol name)</td></tr>
441 <tr><td>Native code function name </td><td>FUNCTIONNAME</td></tr>
442 <tr><td>Native traced code from W32 code func. name </td><td>FUNCTIONNAME_cdecl/_std...</td></tr>
443 <tr><td>Entry/exit debug tracing from UNIX code </td><td>no</td></tr>
444 <tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
445 <caption>Function Type <span class="constant">native</span> Characteristics</caption>
448 <a name="functype_undef"><h2>Undefined Function</h2></a>
450 <p>Functions not defined by any of the previous function types cannot be
451 called by any W32 code including the code of the library implementing
452 such function. All functions of <span class="constant">patch</span>ed
453 libraries not listed in the <span class="fname">captivesym</span> exports
454 file are automatically set to be trapped as fatal program execution
457 <p>It is not necessary to list the symbols as
458 <span class="constant">undef</span> as long as you are just loading the
459 W32 <span class="constant">PE-32</span> code and the symbols belong to
460 <span class="constant">patch</span>ed library. On the other hand if you
461 are loading W32 <span class="fname">.so</span> code or if such symbol is
462 a part of <span class="constant">unpatched</span> library (and thus
463 being completely provided by the project) you need to list such symbol as
464 <span class="constant">undef</span> type to prevent unresolved symbol
467 <table border="1" align="center">
468 <tr><td><span class="fname">captivesym</span> keyword</td><td>undef</td></tr>
469 <tr><td>Native code function name </td><td>(no implementation)</td></tr>
470 <tr><td>Native traced code function name </td><td>FUNCTIONNAME_cdecl/_stdcall/_fastcall</td></tr>
471 <tr><td>Debug tracing message from UNIX code </td><td>yes</td></tr>
472 <tr><td>Debug tracing message from W32 code </td><td>yes</td></tr>
473 <caption>Function Type <span class="constant">undef</span> Characteristics</caption>
project::captive::doc::Macros->footer();