2 # Captive project doc APITypes page Perl template.
3 # Copyright (C) 2003-2005 Jan Kratochvil <project-www.jankratochvil.net@jankratochvil.net>
5 # This program is free software; you can redistribute it and/or modify
6 # it under the terms of the GNU General Public License as published by
7 # the Free Software Foundation; exactly version 2 of June 1991 is required
9 # This program is distributed in the hope that it will be useful,
10 # but WITHOUT ANY WARRANTY; without even the implied warranty of
11 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
12 # GNU General Public License for more details.
14 # You should have received a copy of the GNU General Public License
15 # along with this program; if not, write to the Free Software
16 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
19 package project::captive::doc::APITypes;
20 require 5.6.0; # at least 'use warnings;' but we need some 5.6.0+ modules anyway
21 our $VERSION=do { my @r=(q$Revision$=~/\d+/g); sprintf "%d.".("%03d"x$#r),@r; };
28 BEGIN { Wuse 'project::captive::doc::Macros'; }
33 project::captive::doc::Macros->init(
34 "title"=>'Captive NTFS Developer Documentation: API Functions',
35 "rel_prev"=>'Details.pm',
36 "rel_next"=>'CallType.pm',
43 <h1 id="functype">API Function Implementation Choices</h1>
<p>For each function exported by W32
<span class="fname">ntoskrnl.exe</span> and imported and called by the
filesystem driver, a decision has to be made on how to implement its
functionality. Statistics of the currently implemented functionality are provided
below.</p>
51 <table border="0" width="100%"><tr><td align="center"><table border="1">
52 <caption>Function Implementation Types Statistics</caption>
53 <tr><th>Function type </th><th>Items</th><th>Portion</th></tr>
<tr><td>@{[ a_href 'APITypes.pm#functype_pass','pass' ]}</td><td>81</td><td>26%</td></tr>
<tr><td>@{[ a_href 'APITypes.pm#functype_wrap','wrap' ]}</td><td>2</td><td>1%</td></tr>
<tr><td>@{[ a_href 'APITypes.pm#functype_native_reactos','native-ReactOS' ]}</td><td>113</td><td>36%</td></tr>
<tr><td>@{[ a_href 'APITypes.pm#functype_native_libcaptive','native-own' ]}</td><td>116</td><td>37%</td></tr>
58 </table></td></tr></table>
60 @{[ doc_img 'ratio','Functions Reusal Ratio' ]}
<p>Since each function can be implemented in several ways, the sections
below list the choices in the order in which they are usually
investigated.</p>
<p>Special care must be taken with data-type symbols since they are
referenced without any possibility of catching the code flow by
breakpoints (it would be possible only in some special access cases). Data
export symbols of <span class="constant">unpatched</span> libraries must
already contain prepared content at runtime. There is a problem
with <span class="constant">patched</span> libraries, where the data symbol
must also be fully implemented as a
@{[ a_href 'APITypes.pm#functype_native','native implementation' ]} since there is no
way to @{[ a_href 'APITypes.pm#functype_pass','pass' ]} the data symbol to
the original W32 data location, and therefore two instances of
such a data variable will exist. Because references to the
W32 data location from within the <span class="constant">patched</span>
library itself also remain uncaught, such symbols should usually be only constants (such as
<span class="constant">KeNumberProcessors</span>).</p>
<p>W32 platform symbol export/import can be based either on the symbol
name itself or on an identification number called the
<span class="constant">Ordinal</span>.
Although ordinals save some binary file size in the jump tables, they are
currently no longer used by W32 binaries and this project does not support the
<span class="constant">Ordinal</span> symbol reference type at all.</p>
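<p>The distinction between the two import styles can be sketched in C as
follows. This is only an illustration, not the project's loader code: the
<span class="constant">export_entry</span> structure and both function
names are hypothetical. The only fact taken from the PE-32 format is that
bit 31 of a 32-bit import thunk marks an import by ordinal.</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit 31 of a PE-32 import thunk marks an import by ordinal. */
#define ORDINAL_FLAG32 UINT32_C(0x80000000)

/* Hypothetical export record: symbol name and address of its implementation. */
struct export_entry {
    const char *name;
    void *addr;
};

/* Nonzero if the thunk value requests an import by ordinal number. */
int import_is_ordinal(uint32_t thunk)
{
    return (thunk & ORDINAL_FLAG32) != 0;
}

/* Resolve an import by name; NULL means the symbol is not exported. */
void *resolve_by_name(const struct export_entry *table, size_t count,
                      const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].addr;
    }
    return NULL;
}
```

<p>A loader following the policy described above would reject every thunk
for which <span class="function">import_is_ordinal()</span> is nonzero and
resolve all remaining imports by name only.</p>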
<p>All the exporting magic is handled by the custom script
<span class="fname">captivesym</span>, which processes the definition file
89 <span class="fname">@{[ captive_srcfile 'src/libcaptive/ke/exports.captivesym' ]}</span>
90 to produce the intermediate relaying code
91 <span class="fname">src/libcaptive/ke/exports.c</span>. For details of the
92 <span class="fname">captivesym</span>-specific source file syntax please
93 see its documentation:
94 <span class="fname">@{[ a_href
95 '/project/Pod2Html.pm?cvs=captive/src/libcaptive/ke/captivesym.pl',
'src/libcaptive/ke/captivesym.pl' ]}</span></p>
99 <h2 id="functype_pass">Direct Pass to Original "ntoskrnl.exe"</h2>
<p>Simple (standalone) functions such as
<span class="function">RtlTimeToSecondsSince1970()</span> can simply be
passed to the original implementation in
<span class="fname">ntoskrnl.exe</span> as they make no hardware access
and do not expect any special internal data structures to be set up
in advance by an earlier library initialization. Common cases are
the data structure utility functions such as the
<span class="constant">GenericTable</span> subsystem or
<span class="constant">LargeMcb</span> handling.</p>
111 <h3 id="functype_pass_fromunix">Pass from UNIX Code</h3>
<p>Control flow begins in some standard UNIX code. Such code always
uses the @{[ a_href 'CallType.pm#calltype_cdecl','cdecl call type' ]} for all its
internal calls. <a href="APITypes.pm#functype_native_reactos">Native functions
compiled from <span class="productname">ReactOS</span> sources</a> use
their own @{[ a_href '#calltype','cdecl/stdcall/fastcall' ]} declarations,
but these call type modifications are discarded during compilation for
this project by the <span class="constant">LIBCAPTIVE</span>
symbol.</p>
<p>UNIX code calls the <span class="function">FUNCTIONNAME()</span> relay
from the generated UNIX jump table. The relay debug dumps the
passed arguments and finally passes control to the original W32
function code using the proper call type
@{[ a_href '#calltype','cdecl/stdcall/fastcall' ]} for the given
function.</p>
<p>The original W32 code entry point is always trapped by a breakpoint
although it would not be needed for this specific direct pass from
UNIX code to the original W32 implementation. The breakpoint still has
to be there to catch other possible calls (such as intra-W32 ones)
described later. There are several ways to set a breakpoint in
the code. One way is to use the processor's hardware breakpoint support, but
the number of such breakpoints is limited. Another way is to patch in the
<span class="instruction">@{[ 'int $3' ]}</span> instruction, but it raises the
<span class="constant">SIGTRAP</span> signal, which conflicts with
the control of a possibly attached debugger (<span class="productname">gdb(1)</span>).
This project uses the <span class="instruction">hlt</span>
instruction, which has a single-byte opcode just like
<span class="instruction">@{[ 'int $3' ]}</span> and is a privileged
instruction that UNIX user space code is forbidden to execute.
<span class="instruction">hlt</span> raises the
<span class="constant">SIGSEGV</span> signal, which can be handled by
a custom signal handler without any conflict with the
debugger control; <span class="productname">gdb(1)</span> needs the
following command to pass such a
<span class="constant">SIGSEGV</span> signal through:</p>
<blockquote class="command">
<p>handle SIGSEGV nostop noprint pass</p>
</blockquote>
<p>When a breakpoint gets caught, execution usually needs to return to the
running code. Unfortunately this is not directly possible because the
breakpoint opcode has overwritten the original instruction. The breakpoint
cannot simply be removed upon return
as that would permanently lose control over the point of entry. Even if
the return faked the return address in the bottom
stack frame to patch the breakpoint back during the later function exit, it
still would not catch the inner calls of recursive
functions. One working possibility would be to patch the
original instruction back and perform a single step provided by the
<span class="function">ptrace(2)</span> syscall. However, such a
single step needs another controlling UNIX process and would again
conflict with debuggers such as
<span class="productname">gdb(1)</span>. This project implements the
single-step functionality by two consecutive breakpoints
(<span class="instruction">hlt</span> instructions to be specific).
The addresses of the first two instructions of a W32 function are called
<span class="productname">slot #1</span> and
<span class="productname">slot #2</span>; the length of the first
instruction has to be analyzed to get the right address of
<span class="productname">slot #2</span>. When the first breakpoint is
caught, the original instruction is patched back and
another breakpoint is patched in at
<span class="productname">slot #2</span>.
When the <span class="productname">slot #2</span> breakpoint
is hit, the operation is reverted: the breakpoint is put
at <span class="productname">slot #1</span> again and the instruction
of <span class="productname">slot #2</span> is restored so that
the execution of the function can continue.</p>
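<p>The two-slot toggling can be illustrated by a small C model that
operates on a plain byte buffer instead of live code. It is a sketch under
stated assumptions, not the project's implementation: the
<span class="constant">bp_func</span> structure and both function names
are hypothetical, the first instruction length is supplied by the caller
instead of being decoded, and no real signal handling is involved. The
only hardware fact used is that <span class="instruction">hlt</span> is
encoded as the single byte 0xF4.</p>

```c
#include <stddef.h>
#include <stdint.h>

enum { OPCODE_HLT = 0xF4 };  /* single-byte encoding of the hlt instruction */

/* Hypothetical bookkeeping for one trapped function. */
struct bp_func {
    uint8_t *code;       /* entry point; this is also slot #1 */
    size_t   slot2_off;  /* length of the first instruction */
    uint8_t  orig1;      /* original byte at slot #1 */
    uint8_t  orig2;      /* original byte at slot #2 */
};

/* Remember the original bytes and arm the entry breakpoint at slot #1. */
void bp_arm(struct bp_func *f, uint8_t *code, size_t first_insn_len)
{
    f->code = code;
    f->slot2_off = first_insn_len;
    f->orig1 = code[0];
    f->orig2 = code[first_insn_len];
    code[0] = OPCODE_HLT;
}

/*
 * Model of what a fault handler would do with the faulting address.
 * Returns 0 if the address was one of the two slots, -1 otherwise.
 */
int bp_on_fault(struct bp_func *f, const uint8_t *fault_addr)
{
    if (fault_addr == f->code) {
        /* Slot #1 hit: restore the first instruction, trap the second. */
        f->code[0] = f->orig1;
        f->code[f->slot2_off] = OPCODE_HLT;
        return 0;
    }
    if (fault_addr == f->code + f->slot2_off) {
        /* Slot #2 hit: restore the second instruction, re-arm the entry. */
        f->code[f->slot2_off] = f->orig2;
        f->code[0] = OPCODE_HLT;
        return 0;
    }
    return -1;
}
```

<p>After a slot #1 fault followed by a slot #2 fault the buffer is back in
its armed state, so the entry point stays trapped for the next call,
including the inner calls of recursive functions.</p>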
<p>The W32 function finishes in its specific
@{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]}; control
returns to the UNIX jump table relay, which debug dumps the
return value and finally passes control back to the UNIX
caller in the standard UNIX
@{[ a_href 'CallType.pm#calltype_cdecl','cdecl call type' ]}.</p>
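<p>A relay of this kind might look roughly like the following C sketch. All
names here (<span class="constant">W32_STDCALL</span>,
<span class="constant">relay_trace</span>,
<span class="function">demo_w32_add()</span>,
<span class="function">demo_relay()</span>) are hypothetical, and the
trace structure merely stands in for the real debug dump. The stdcall
attribute is applied only when compiling with a GNU-compatible compiler
for i386; on other targets the sketch assumes the default call type.</p>

```c
/* Use the stdcall call type only where the compiler and target support it. */
#if defined(__GNUC__) && defined(__i386__)
# define W32_STDCALL __attribute__((stdcall))
#else
# define W32_STDCALL /* assumption: default call type on other targets */
#endif

/* Pointer to an original function that takes two arguments. */
typedef int (W32_STDCALL *w32_func2)(int, int);

/* Stand-in for the debug dump: records what a real relay would print. */
struct relay_trace {
    int entries;
    int arg1, arg2;
    int exits;
    int result;
};

/* Stand-in for an original W32 function with its own call type. */
int W32_STDCALL demo_w32_add(int a, int b)
{
    return a + b;
}

/* Relay callable from UNIX code in the default (cdecl) call type. */
int demo_relay(struct relay_trace *trace, w32_func2 orig, int a, int b)
{
    int result;

    /* Entry dump of the passed arguments. */
    trace->entries++;
    trace->arg1 = a;
    trace->arg2 = b;

    /* The compiler emits the call in the call type of the pointer type. */
    result = orig(a, b);

    /* Exit dump of the return value. */
    trace->exits++;
    trace->result = result;
    return result;
}
```

<p>The conversion between call types is left entirely to the compiler
here: the relay itself is an ordinary function, and only the type of the
function pointer carries the W32 call type.</p>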
190 @{[ doc_img 'fig/functype_patched_pass_fromunix',
191 'Function Type: <span class="constant">pass</span> from UNIX Code' ]}
193 <h3 id="functype_pass_fromw32">Pass from W32 Code</h3>
<p>This function type is similar to the
@{[ a_href 'APITypes.pm#functype_pass_fromunix','previous one' ]} with the exception
of a more complicated entry point. Unfortunately W32 libraries call their
own functions directly, using <span class="instruction">call</span>
instructions without any patchable jump table. Even the
<span class="instruction">call</span> argument itself cannot be patched
according to a relocation table record, as such a library intra-call
instruction has no relocation due to its relative argument offset on
<span class="constant">i386</span>. This time the double-breakpoint
mechanism @{[ a_href 'APITypes.pm#functype_pass_fromunix','described above' ]} comes in
handy since it catches the entry point when the function gets
called. The <span class="constant">SIGSEGV</span> handler gets invoked by
the <span class="instruction">hlt</span> instruction and
redirects control to the jump table relay function to debug dump the
function entry arguments (the relay has no other use in this call type).</p>
<p>When the relay needs to call the original function it will reach
exactly the same breakpoint instruction that triggered the recent
<span class="constant">SIGSEGV</span> handling and the redirection to this
relay. This time, however, the
<span class="constant">through_w32_func</span> field of the function
record will be set to prevent repeated redirection and to pass
control through the breakpoint instead.</p>
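<p>The decision made at the entry breakpoint can be modelled in a few lines
of C. Only the <span class="constant">through_w32_func</span> field name
comes from the text; the rest of the record, the function names and the
assumption that the handler clears the flag once it has let a call through
are hypothetical. The sketch simulates the two breakpoint hits with plain
function calls instead of real signals.</p>

```c
/* Hypothetical per-function record; only through_w32_func is named in the text. */
struct func_rec {
    int through_w32_func;  /* set by the relay before it calls the original */
    int relay_entries;     /* how many times the relay was entered */
    int orig_entries;      /* how many times the original code was entered */
};

enum trap_action { TRAP_REDIRECT_TO_RELAY, TRAP_PASS_THROUGH };

/* What the fault handler would decide when the entry breakpoint is hit. */
enum trap_action trap_decide(struct func_rec *rec)
{
    if (rec->through_w32_func) {
        rec->through_w32_func = 0;  /* assumption: the handler consumes the flag */
        return TRAP_PASS_THROUGH;
    }
    return TRAP_REDIRECT_TO_RELAY;
}

/* Simulate one call coming from W32 code. */
void simulate_w32_call(struct func_rec *rec)
{
    /* First hit: the W32 caller reaches the entry breakpoint. */
    if (trap_decide(rec) == TRAP_REDIRECT_TO_RELAY) {
        rec->relay_entries++;       /* the relay would dump the arguments here */
        rec->through_w32_func = 1;  /* announce the intentional second entry */

        /* Second hit: the relay calls the same entry point again. */
        if (trap_decide(rec) == TRAP_PASS_THROUGH)
            rec->orig_entries++;    /* the original W32 code runs */
    }
}
```

<p>Each simulated call enters the relay exactly once and the original code
exactly once, which is the property the flag is meant to guarantee.</p>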
<p>Returning is not very interesting as the first
220 <span class="constant">SIGSEGV</span> handler did a straight jump
221 for the redirection purposes without any needed consequent
<p>The jump table relay used for callers from W32 code differs
from the relay used for callers
@{[ a_href 'APITypes.pm#functype_pass_fromunix','from UNIX code' ]}. UNIX code always
uses a relay with the external @{[ a_href 'CallType.pm#calltype_cdecl','cdecl call type' ]},
but in this case a relay with the appropriate
@{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]} is used.</p>
231 @{[ doc_img 'fig/functype_patched_pass_fromw32',
232 'Function Type: <span class="constant">pass</span> from W32 Code' ]}
236 <table border="0" width="100%"><tr><td align="center"><table border="1">
237 <caption>Function Type <span class="constant">pass</span> Characteristics</caption>
238 <tr><td><span class="fname">captivesym</span> keyword</td><td>pass</td></tr>
239 <tr><td>Native code function name </td><td>(no implementation)</td></tr>
240 <tr><td>W32 traced code from UNIX function name </td><td>FUNCNAME</td></tr>
241 <tr><td>W32 traced code from W32 function name </td><td>FUNCNAME_cdecl/_stdcall/_fastcall</td></tr>
242 <tr><td>Entry/exit debug tracing from UNIX code </td><td>yes</td></tr>
243 <tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
244 </table></td></tr></table>
246 <h2 id="functype_wrap">Wrap of the Original "ntoskrnl.exe" Function</h2>
248 <h3 id="functype_wrap_fromunix">Wrapping of Call from UNIX Code</h3>
<p>The code control flow has no special features since it is
very similar to <a href="APITypes.pm#functype_pass_fromunix">the direct pass to
W32 function from UNIX code</a>. All the wrapping is done in the
standard UNIX @{[ a_href 'CallType.pm#calltype_cdecl','cdecl call type' ]} manner.
Jump table debug dumping relays are provided twice: the
"outer" one to trace the parameters from the function caller
and the "inner" one to trace the call from the wrapper to the
original W32 code. The "inner" relay also calls the W32 code
with the appropriate <a href="#calltype">cdecl/stdcall/fastcall call
type</a>.</p>
261 @{[ doc_img 'fig/functype_patched_wrap_fromunix',
262 'Function Type: <span class="constant">wrap</span> from UNIX Code' ]}
264 <h3 id="functype_wrap_fromw32">Wrapping of Call from W32 Code</h3>
266 <p>This scheme is a combination of the
267 <a href="APITypes.pm#functype_wrap_fromunix">previous wrap of a call from
268 UNIX code</a> and the <a href="APITypes.pm#functype_pass_fromw32">direct pass from
269 the W32 code</a>. The control is caught and redirected by
270 <span class="constant">SIGSEGV</span> handler from the breakpoint
271 placed at the entry to the original W32 function code. The second entry
272 to the original W32 function with the
273 <span class="constant">through_w32_func</span> field of this function
274 description already set is done from the "inner" jump table
275 relay with the appropriate
276 @{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]}.</p>
278 @{[ doc_img 'fig/functype_patched_wrap_fromw32',
279 'Function Type: <span class="constant">wrap</span> from W32 Code' ]}
<p>Some functions can be <a href="APITypes.pm#functype_pass">passed to the original
code</a> but they need their parameters to be checked/prepared.
Currently, such wrapping is only needed for the
<span class="function">ExAllocateFromPagedLookasideList()</span> function,
where it is required because <a href="#init_ntoskrnl">the
<span class="fname">ntoskrnl.exe</span> initialization is never executed</a>;
that initialization would otherwise properly set up some internal data structures.
In this case the wrapping code detects that an uninitialized
parameter is being passed and searches through the whole
<span class="fname">ntoskrnl.exe</span> code body at runtime to find the
proper initialization routine containing the correct initialization
parameters. Passed addresses of static structures must be differentiated
as each of them usually has different initialization parameters. Avoiding
a fixed parameter array is a precaution, as these parameters may
differ across different <span class="fname">ntoskrnl.exe</span>
versions.</p>
300 <table border="0" width="100%"><tr><td align="center"><table border="1">
301 <caption>Function Type <span class="constant">wrap</span> Characteristics</caption>
302 <tr><td><span class="fname">captivesym</span> keyword</td><td>wrap</td></tr>
303 <tr><td>Native UNIX wrapping code function name </td><td>FUNCNAME_wrap</td></tr>
<tr><td>W32 traced wrapping code from UNIX func. name</td><td>FUNCNAME</td></tr>
305 <tr><td>W32 traced wrapping code from W32 func. name </td><td>FUNCNAME_cdecl/_stdcall/...</td></tr>
306 <tr><td>W32 traced original code function name </td><td>FUNCNAME_orig</td></tr>
307 <tr><td>Entry/exit debug tracing from UNIX code </td><td>yes</td></tr>
308 <tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
309 </table></td></tr></table>
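<p>The general shape of such a wrapper can be sketched in C as follows.
This is a simplified model and not the project's code: the
<span class="constant">demo_lookaside</span> structure, the convention
that a zero size means "never initialized", and all function names are
hypothetical. The callback only stands in for the runtime search for the
proper initialization parameters; the stub used here returns a fixed
value purely for demonstration.</p>

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list header; size == 0 means it was never initialized. */
struct demo_lookaside {
    size_t size;
};

/* Stand-in for the runtime search for the initialization parameters. */
typedef size_t (*demo_find_size_fn)(const struct demo_lookaside *list);

/* Demonstration stub: a real search would derive the value from the binary. */
size_t demo_find_size_stub(const struct demo_lookaside *list)
{
    (void)list;
    return 64;
}

/* Stand-in for the original function; it relies on an initialized list. */
void *demo_alloc_orig(struct demo_lookaside *list)
{
    return malloc(list->size);
}

/* The wrapper: prepare the parameter if needed, then pass to the original. */
void *demo_alloc_wrap(struct demo_lookaside *list, demo_find_size_fn find_size)
{
    if (list->size == 0) {
        size_t size = find_size(list);

        if (size == 0)
            return NULL;    /* no initialization parameters were found */
        list->size = size;  /* late initialization done by the wrapper */
    }
    return demo_alloc_orig(list);
}
```

<p>The wrapper changes nothing for an already initialized list, so it can
sit in front of every call without affecting the normal path.</p>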
311 <h2 id="functype_native">Native Implementation</h2>
313 <h3 id="functype_native_fromunix">Native Implementation Called from UNIX Code</h3>
315 <p>This is the simplest case of a function call as it is fully
316 handled only by the compiler and/or linker.</p>
<p>In this case, though, no debug dumping call relay is provided. Such a
relay would need to rename the implementations of native functions to
prevent their automatic linking with the caller code. This renaming could
not be done by a simple <span class="constant">#define</span>
since it would also rename any calls of such a function in
the same C sources. One possible solution would be to
use the <span class="dashdash">--redefine-sym</span> feature of the
<span class="productname">objcopy(1)</span> utility. On the other hand
there is not much need to catch/debug such calls as both the caller and
the callee are provided with full source file debug information for the
debugger. Also, the callee usually debug dumps its entry/exit parameters
by custom debug dumps in the
<a href="APITypes.pm#functype_native_reactos"><span class="productname">ReactOS</span> implementations</a>.</p>
333 @{[ doc_img 'fig/functype_native_fromunix',
334 'Function Type: <span class="constant">native</span> from UNIX Code' ]}
336 <h3 id="functype_native_fromw32">Native Implementation of
337 "unpatched" Library Function Called from W32 Code</h3>
339 @{[ doc_img 'fig/functype_unpatched_native_fromw32',
340 'Function Type: <span class="constant">native</span> of <span class="constant">unpatched</span> from W32 Code' ]}
<p>Here the handling differs depending on whether the project deals with
a <span class="constant">patched</span> or an
<span class="constant">unpatched</span> version of the library
(a <span class="constant">patched</span> library is a loaded W32 binary
library while an <span class="constant">unpatched</span> library is
completely provided by this project with no use of the library's
original W32 binary file). As the project adjusts the exported symbol
address during the patching operation, in some cases a
<span class="constant">patched</span> library call may be handled
simply as an <span class="constant">unpatched</span> library call even for
the <span class="constant">patched</span> libraries. Fortunately the
distinction is not very important as the project is prepared to
handle both cases properly.</p>
356 <p>The W32 caller which imported the symbol will be pointed right to
357 the relaying function. The debug dumping relay will be called from W32
358 code with the appropriate
359 @{[ a_href '#calltype','cdecl/stdcall/fastcall call type' ]} while the
360 relay will call the implementation of the native function in the
361 standard UNIX @{[ a_href 'CallType.pm#calltype_cdecl','cdecl call type' ]} manner.</p>
363 <h3 id="functype_native_fromw32_patched">Native Implementation of "patched" Library Function Called from W32 Code</h3>
365 @{[ doc_img 'fig/functype_patched_native_fromw32',
366 'Function Type: <span class="constant">native</span> of <span class="constant">patched</span> from W32 Code' ]}
<p>The calling scheme is similar to the
<a href="APITypes.pm#functype_native_fromw32">previous call of
<span class="constant">unpatched</span> library function from W32
code</a> but the call control is redirected from the entry point of the
original W32 binary implementation by the breakpoint and its
<span class="constant">SIGSEGV</span> handler as in
<a href="APITypes.pm#functype_pass_fromw32">the case of passing control from W32
code</a>.</p>
377 <p>The original W32 function implementation located in the original
378 loaded binary file is never executed but its entry point needs to be
379 trapped by the breakpoint to be able to catch the function calls within
<p>In all cases the final function implementation is standard UNIX
code compiled from C sources with full debug information available
for the debugger. Fortunately not all such functions need to be coded
from scratch for this project since the $freespeech
$ReactOS and $Wine projects already exist and their code can be used instead.</p>
<p>The $Wine project is listed mostly for completeness as almost none of its
code was suitable for reuse: it implements the W32 user space while this
project runs a pure W32 kernel space environment (in $gnulinux user
space).</p>
<h3 id="functype_native_reactos">Native Implementation
– <span class="productname">ReactOS</span></h3>
<p>Some functions are already implemented in the $ReactOS
project and can be used as they are. Although it would be
possible to <a href="APITypes.pm#functype_pass">pass some function calls to the
original code</a>, it is more convenient to provide a native implementation as
the provided debugging symbols give better control of the data handling
during debugging sessions.</p>
405 <p>Such functions can be found in
406 <span class="fname">src/libcaptive/reactos/</span> subdirectory.
407 Some functions had to be adjusted for this project
408 - these modifications are compiled conditionally, depending on the
409 <span class="constant">LIBCAPTIVE</span> symbol existence.</p>
<p>Later stages of this project reached a level where
$ReactOS is still too immature and the needed functions usually
consist of just the sad body:</p>
<blockquote class="command">
<p>UNIMPLEMENTED;</p>
</blockquote>
419 <p>Functions that were not possible to
420 @{[ a_href 'APITypes.pm#functype_pass','pass' ]} were reimplemented by this project
421 and placed in the project's implementation directories
422 @{[ a_href '#reactos_nocare','instead of extending' ]} $ReactOS code.</p>
424 <h3 id="functype_native_wine">Native Implementation – <span class="productname">Wine</span></h3>
<p>Even though $Wine only implements the
<span class="productname">Microsoft Windows NT</span> user space, there
are still some common functions which could be copied from the $Wine
project.</p>
431 <h3 id="functype_native_libcaptive">Native Implementation – Project Specific</h3>
<p>As the last resort it was necessary to provide the project's own
implementation of some API functions, such as the PC hardware dependent
parts or the memory management functions.</p>
439 <table border="0" width="100%"><tr><td align="center"><table border="1">
440 <caption>Function Type <span class="constant">native</span> Characteristics</caption>
441 <tr><td><span class="fname">captivesym</span> keyword</td><td>(none; just the symbol name)</td></tr>
442 <tr><td>Native code function name </td><td>FUNCTIONNAME</td></tr>
443 <tr><td>Native traced code from W32 code func. name </td><td>FUNCTIONNAME_cdecl/_std...</td></tr>
444 <tr><td>Entry/exit debug tracing from UNIX code </td><td>no</td></tr>
445 <tr><td>Entry/exit debug tracing from W32 code </td><td>yes</td></tr>
446 </table></td></tr></table>
448 <h2 id="functype_undef">Undefined Function</h2>
450 <p>Functions not defined by any of the previous function types cannot be
451 called by any W32 code including the code of the library implementing
452 such function. All functions of <span class="constant">patch</span>ed
453 libraries not listed in the <span class="fname">captivesym</span> exports
454 file are automatically set to be trapped as fatal program execution
457 <p>It is not necessary to list the symbols as
458 <span class="constant">undef</span> as long as you are just loading the
459 W32 <span class="constant">PE-32</span> code and the symbols belong to
460 <span class="constant">patch</span>ed library. On the other hand if you
461 are loading W32 <span class="fname">.so</span> code or if such symbol is
462 a part of <span class="constant">unpatched</span> library (and thus
463 being completely provided by the project) you need to list such symbol as
464 <span class="constant">undef</span> type to prevent unresolved symbol
467 <table border="0" width="100%"><tr><td align="center"><table border="1">
468 <caption>Function Type <span class="constant">undef</span> Characteristics</caption>
469 <tr><td><span class="fname">captivesym</span> keyword</td><td>undef</td></tr>
470 <tr><td>Native code function name </td><td>(no implementation)</td></tr>
471 <tr><td>Native traced code function name </td><td>FUNCTIONNAME_cdecl/_stdcall/_fastcall</td></tr>
472 <tr><td>Debug tracing message from UNIX code </td><td>yes</td></tr>
473 <tr><td>Debug tracing message from W32 code </td><td>yes</td></tr>
474 </table></td></tr></table>
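<p>The fallback to the <span class="constant">undef</span> type for
unlisted symbols of a patched library can be pictured with the following C
sketch. It does not reproduce the <span class="fname">captivesym</span>
file syntax or the generated code: the enumeration, the table and the
function name are hypothetical. The three entries only reuse symbols that
serve as examples elsewhere on this page, and treating
<span class="constant">KeNumberProcessors</span> as a native symbol is an
assumption based on the data symbol discussion above.</p>

```c
#include <stddef.h>
#include <string.h>

/* The function types described on this page. */
enum functype {
    FUNCTYPE_PASS,
    FUNCTYPE_WRAP,
    FUNCTYPE_NATIVE,
    FUNCTYPE_UNDEF
};

/* Hypothetical in-memory form of one exports definition. */
struct sym_def {
    const char *name;
    enum functype type;
};

/* Illustrative entries only; not the contents of the real definition file. */
const struct sym_def demo_exports[] = {
    { "RtlTimeToSecondsSince1970",        FUNCTYPE_PASS   },
    { "ExAllocateFromPagedLookasideList", FUNCTYPE_WRAP   },
    { "KeNumberProcessors",               FUNCTYPE_NATIVE },
};

/* Symbols that are not listed fall back to the trapped undef type. */
enum functype classify_symbol(const char *name)
{
    size_t i;
    size_t count = sizeof demo_exports / sizeof demo_exports[0];

    for (i = 0; i < count; i++) {
        if (strcmp(demo_exports[i].name, name) == 0)
            return demo_exports[i].type;
    }
    return FUNCTYPE_UNDEF;
}
```

<p>This default covers only the patched library case; as described above,
symbols of an unpatched library still have to be listed explicitly as
<span class="constant">undef</span>.</p>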
480 project::captive::doc::Macros->footer();