Correct the documentation

- Fix some style issue, typos, and examples
- Follow the variable naming conventions
- Fix tables both in the project and on the webpage

JerryScript-DCO-1.0-Signed-off-by: Zsolt Borbély zsborbely.u-szeged@partner.samsung.com
2016-08-24 12:52:02 +02:00
parent 48d5eee920
commit e93e32635f
7 changed files with 180 additions and 153 deletions
@@ -13,7 +13,7 @@ The lexer splits input string (ECMAScript program) into sequence of tokens. It i

 ## Scanner

-Scanner (`./jerry-core/parser/js/js-parser-scanner.h`) pre-scans the input string to find certain tokens. For example, scanner determines whether the keyword `for` defines a general for or a for-in loop. Reading tokens in a while loop is not enough because a slash (`/`) can indicate the start of a regular expression or can be a division operator.
+Scanner (`./jerry-core/parser/js/js-parser-scanner.c`) pre-scans the input string to find certain tokens. For example, scanner determines whether the keyword `for` defines a general for or a for-in loop. Reading tokens in a while loop is not enough because a slash (`/`) can indicate the start of a regular expression or can be a division operator.

 ## Expression Parser

@@ -21,9 +21,9 @@ Expression parser is responsible for parsing JavaScript expressions. It is imple

 ## Statement Parser

-JavaScript statements are parsed by this component. It uses the [Expression parser](#expression parser) to parse the constituent expressions. The implementation of Statement parser is located in `./jerry-core/parser/js/js-parser-statm.c`.
+JavaScript statements are parsed by this component. It uses the [Expression parser](#expression-parser) to parse the constituent expressions. The implementation of Statement parser is located in `./jerry-core/parser/js/js-parser-statm.c`.

-Function `parser_parse_source` carries out the parsing and compiling of the input EcmaScript source code. When a function appears in the source `parser_parse_source` calls `parser_parse_function` which is responsible for processing the source code of functions recursively including argument parsing and context handling. After the parsing, function `parser_post_processing` dumps the created opcodes and returns an ecma_compiled_code_t* that points to the compiled bytecode sequence.
+Function `parser_parse_source` carries out the parsing and compiling of the input EcmaScript source code. When a function appears in the source `parser_parse_source` calls `parser_parse_function` which is responsible for processing the source code of functions recursively including argument parsing and context handling. After the parsing, function `parser_post_processing` dumps the created opcodes and returns an `ecma_compiled_code_t*` that points to the compiled bytecode sequence.

 The interactions between the major components shown on the following figure.

@@ -61,9 +61,9 @@ There are three types of bytecode arguments in CBC:

 * __byte argument__: A value between 0 and 255, which often represents the argument count of call like opcodes (function call, new, eval, etc.).

- * __literal argument__: An integer index which is greater or equal than zero and less than the `literal_end` field of the header. For further information see next section Literals (next). 
+ * __literal argument__: An integer index which is greater or equal than zero and less than the `literal_end` field of the header. For further information see next section Literals (next).

- * __relative branch__: An 1-3 byte long offset. The branch argument might also represent the end of an instruction range. For example the branch argument of `CBC_EXT_WITH_CREATE_CONTEXT` shows the end of a with statement. More precisely the position after the last instruction.
+ * __relative branch__: An 1-3 byte long offset. The branch argument might also represent the end of an instruction range. For example the branch argument of `CBC_EXT_WITH_CREATE_CONTEXT` shows the end of a `with` statement. More precisely the position after the last instruction.
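
As a rough illustration of the relative branch argument, the sketch below reads a 1-3 byte unsigned offset from a byte stream. The helper name `read_branch_offset` is hypothetical, and the byte order (most significant byte first) is an assumption that the text above does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the JerryScript sources.
 * Reads an unsigned branch offset stored in `length` bytes (1 to 3).
 * Assumption: the most significant byte comes first. */
static uint32_t
read_branch_offset (const uint8_t *bytes_p, size_t length)
{
  assert (length >= 1 && length <= 3);

  uint32_t offset = 0;

  for (size_t i = 0; i < length; i++)
  {
    /* Shift the bytes read so far and append the next one. */
    offset = (offset << 8) | bytes_p[i];
  }

  return offset;
}
```

Under this assumption, a three byte argument can express offsets up to 0xffffff, while the one byte form is limited to 255.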

 Argument combinations are limited to the following seven forms:

@@ -92,30 +92,30 @@ There are two other sub-groups of identifiers. *Registers* are those identifiers
 There are two types of literal encoding in CBC. Both are variable length, where the length is one or two byte long.

  * __small__: maximum 511 literals can be encoded.
-    
+
 One byte encoding for literals 0 - 254.
-      
+
 ```c
 byte[0] = literal_index
 ```

 Two byte encoding for literals 255 - 510.
-      
+
 ```c
 byte[0] = 0xff
 byte[1] = literal_index - 0xff
 ```
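
The two cases of the small encoding can be sketched in C as follows. This is only an illustration of the byte layout described above. The function names are hypothetical and do not come from the JerryScript sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the small encoding of literal_index
 * (0 - 510) into out_p and returns the number of bytes written. */
static size_t
encode_small_literal (uint16_t literal_index, uint8_t *out_p)
{
  assert (literal_index <= 510);

  if (literal_index < 0xff)
  {
    /* One byte encoding for literals 0 - 254. */
    out_p[0] = (uint8_t) literal_index;
    return 1;
  }

  /* Two byte encoding for literals 255 - 510. */
  out_p[0] = 0xff;
  out_p[1] = (uint8_t) (literal_index - 0xff);
  return 2;
}

/* Hypothetical helper: reads a small encoded literal index from in_p
 * and returns the number of bytes consumed. */
static size_t
decode_small_literal (const uint8_t *in_p, uint16_t *literal_index_p)
{
  if (in_p[0] != 0xff)
  {
    *literal_index_p = in_p[0];
    return 1;
  }

  *literal_index_p = (uint16_t) (0xff + in_p[1]);
  return 2;
}
```

The value `0xff` in the first byte acts as an escape marker, which is why only 255 indices fit into the one byte form.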

  * __full__: maximum 32767 literal can be encoded.
-    
+
 One byte encoding for literals 0 - 127.
-      
+
 ```c
 byte[0] = literal_index
 ```

 Two byte encoding for literals 128 - 32767.
-      
+
 ```c
 byte[0] = (literal_index >> 8) | 0x80
 byte[1] = (literal_index & 0xff)
@@ -135,66 +135,70 @@ Byte-codes can be placed into four main categories.

 Byte-codes of this category serve for placing objects onto the stack. As there are many instructions representing multiple atomic tasks in CBC, there are also many instructions for pushing objects onto the stack according to the number and the type of the arguments. The following table list a few of these opcodes with a brief description.

-<div class="CSSTableGenerator" markdown="block">
+<span class="CSSTableGenerator" markdown="block">

-| byte-code                            | description                                       |
-| CBC_PUSH_LITERAL                     | Pushes the value of the given literal argument.     |
-| CBC_PUSH_TWO_LITERALS                | Pushes the value of the given two literal arguments. |
-| CBC_PUSH_UNDEFINED                   | Pushes an undefined value.                          |
-| CBC_PUSH_TRUE                        | Pushes a logical true.                              |
-| CBC_PUSH_PROP_LITERAL                | Pushes a property whose base object is popped from the stack, and the property name is passed as a literal argument. |
+| byte-code             | description                                          |
+| --------------------- | ---------------------------------------------------- |
+| CBC_PUSH_LITERAL      | Pushes the value of the given literal argument.      |
+| CBC_PUSH_TWO_LITERALS | Pushes the value of the given two literal arguments. |
+| CBC_PUSH_UNDEFINED    | Pushes an undefined value.                           |
+| CBC_PUSH_TRUE         | Pushes a logical true.                               |
+| CBC_PUSH_PROP_LITERAL | Pushes a property whose base object is popped from the stack, and the property name is passed as a literal argument. |

-</div>
+</span>

 ### Call Byte-codes

 The byte-codes of this category perform calls in different ways.

-<div class="CSSTableGenerator" markdown="block">
+<span class="CSSTableGenerator" markdown="block">

-| byte-code                            | description |
-| CBC_CALL0                            | Calls a function without arguments. The return value won't be pushed onto the stack. |
-| CBC_CALL1                            | Calls a function with one argument. The return value won't be pushed onto the stack. |
-| CBC_CALL                             | Calls a function with n arguments. n is passed as a byte argument. The return value won't be pushed onto the stack. |
-| CBC_CALL0_PUSH_RESULT                | Calls a function without arguments. The return value will be pushed onto the stack. |
-| CBC_CALL1_PUSH_RESULT                | Calls a function with one argument. The return value will be pushed onto the stack. |
-| CBC_CALL2_PROP                       | Calls a property function with two arguments. The base object, the property name, and the two arguments are on the stack. |
+| byte-code             | description                                                                          |
+| --------------------- | ------------------------------------------------------------------------------------ |
+| CBC_CALL0             | Calls a function without arguments. The return value won't be pushed onto the stack. |
+| CBC_CALL1             | Calls a function with one argument. The return value won't be pushed onto the stack. |
+| CBC_CALL              | Calls a function with n arguments. n is passed as a byte argument. The return value won't be pushed onto the stack. |
+| CBC_CALL0_PUSH_RESULT | Calls a function without arguments. The return value will be pushed onto the stack.  |
+| CBC_CALL1_PUSH_RESULT | Calls a function with one argument. The return value will be pushed onto the stack.  |
+| CBC_CALL2_PROP        | Calls a property function with two arguments. The base object, the property name, and the two arguments are on the stack. |

-</div>
+</span>

 ### Arithmetic, Logical, Bitwise and Assignment Byte-codes

-The opcodes of this category perform arithmetic, logical, bitwise and assignment operations according to the different 
+The opcodes of this category perform arithmetic, logical, bitwise and assignment operations.

-<div class="CSSTableGenerator" markdown="block">
+<span class="CSSTableGenerator" markdown="block">

-| byte-code                            | description |
-| CBC_LOGICAL_NOT                      | Negates the logical value that popped from the stack. The result is pushed onto the stack. |
-| CBC_LOGICAL_NOT_LITERAL              | Negates the logical value that given in literal argument. The result is pushed onto the stack. |
-| CBC_ADD                              | Adds two values that are poped from the stack. The result is pushed onto the stack.  |
-| CBC_ADD_RIGHT_LITERAL                | Adds two values. The left one popped from the stack, the right one is given as literal argument. |
-| CBC_ADD_TWO_LITERALS                 | Adds two values. Both are given as literal arguments. |
-| CBC_ASSIGN                           | Assigns a value to a property. It has three arguments: base object, property name, value to assign. |
-| CBC_ASSIGN_PUSH_RESULT               | Assigns a value to a property. It has three arguments: base object, property name, value to assign. The result will be pushed onto the stack. |
+| byte-code               | description                                                                                         |
+| ----------------------- | --------------------------------------------------------------------------------------------------- |
+| CBC_LOGICAL_NOT         | Negates the logical value that popped from the stack. The result is pushed onto the stack.          |
+| CBC_LOGICAL_NOT_LITERAL | Negates the logical value that given in literal argument. The result is pushed onto the stack.      |
+| CBC_ADD                 | Adds two values that are popped from the stack. The result is pushed onto the stack.                |
+| CBC_ADD_RIGHT_LITERAL   | Adds two values. The left one popped from the stack, the right one is given as literal argument.    |
+| CBC_ADD_TWO_LITERALS    | Adds two values. Both are given as literal arguments.                                               |
+| CBC_ASSIGN              | Assigns a value to a property. It has three arguments: base object, property name, value to assign. |
+| CBC_ASSIGN_PUSH_RESULT  | Assigns a value to a property. It has three arguments: base object, property name, value to assign. The result will be pushed onto the stack. |

-</div>
+</span>

 ### Branch Byte-codes

 Branch byte-codes are used to perform conditional and unconditional jumps in the byte-code. The arguments of these instructions are 1-3 byte long relative offsets. The number of bytes is part of the opcode, so each byte-code with a branch argument has three forms. The direction (forward, backward) is also defined by the opcode since the offset is an unsigned value. Thus, certain branch instructions has six forms. Some examples can be found in the following table.

-<div class="CSSTableGenerator" markdown="block">
+<span class="CSSTableGenerator" markdown="block">

-| byte-code                            | description |
-| CBC_JUMP_FORWARD                     | Jumps forward by the 1 byte long relative offset argument. |
-| CBC_JUMP_FORWARD_2                   | Jumps forward by the 2 byte long relative offset argument. |
-| CBC_JUMP_FORWARD_3                   | Jumps forward by the 3 byte long relative offset argument. |
-| CBC_JUMP_BACKWARD                    | Jumps backward by the 1 byte long relative offset argument. |
-| CBC_JUMP_BACKWARD_2                  | Jumps backward by the 2 byte long relative offset argument. |
-| CBC_JUMP_BACKWARD_3                  | Jumps backward by the 3 byte long relative offset argument. |
-| CBC_BRANCH_IF_TRUE_FORWARD           | Jumps if the value on the top of the stack is true by the 1 byte long relative offset argument. |
+| byte-code                  | description                                                 |
+| -------------------------- | ----------------------------------------------------------- |
+| CBC_JUMP_FORWARD           | Jumps forward by the 1 byte long relative offset argument.  |
+| CBC_JUMP_FORWARD_2         | Jumps forward by the 2 byte long relative offset argument.  |
+| CBC_JUMP_FORWARD_3         | Jumps forward by the 3 byte long relative offset argument.  |
+| CBC_JUMP_BACKWARD          | Jumps backward by the 1 byte long relative offset argument. |
+| CBC_JUMP_BACKWARD_2        | Jumps backward by the 2 byte long relative offset argument. |
+| CBC_JUMP_BACKWARD_3        | Jumps backward by the 3 byte long relative offset argument. |
+| CBC_BRANCH_IF_TRUE_FORWARD | Jumps if the value on the top of the stack is true by the 1 byte long relative offset argument. |

-</div>
+</span>

 ## Snapshot

@@ -239,7 +243,7 @@ Compressed pointers were introduced to save heap space.

 ![Compressed Pointer](img/ecma_compressed.png)

-These pointers are 8 byte alligned 16 bit long pointers which can address 512 Kb of memory which is also the maximum size of the JerryScript heap.
+These pointers are 8 byte aligned 16 bit long pointers which can address 512 Kb of memory which is also the maximum size of the JerryScript heap.
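
The 512 Kb figure follows from the arithmetic: 2<sup>16</sup> distinct pointer values times 8 bytes of alignment gives 524288 bytes. The sketch below shows one way such a compression could work on heap offsets. The names are hypothetical, and the use of plain offsets from the heap start is an assumption rather than a description of the actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALIGNMENT_LOG 3u /* 8 byte alignment: the low 3 bits are always zero */
#define MAX_HEAP_SIZE ((size_t) 1 << (16 + ALIGNMENT_LOG)) /* 512 Kb */

/* Hypothetical helper: compresses an 8 byte aligned heap offset
 * into 16 bits by dropping the three zero bits. */
static uint16_t
compress_offset (size_t heap_offset)
{
  assert (heap_offset % 8 == 0);
  assert (heap_offset < MAX_HEAP_SIZE);

  return (uint16_t) (heap_offset >> ALIGNMENT_LOG);
}

/* Hypothetical helper: restores the heap offset from its compressed form. */
static size_t
decompress_offset (uint16_t compressed)
{
  return (size_t) compressed << ALIGNMENT_LOG;
}
```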

 ECMA data elements are allocated in pools (pools are allocated on heap)
 Chunk size of the pool is 8 bytes (reduces fragmentation).
@@ -262,7 +266,7 @@ Strings in JerryScript are not just character sequences, but can hold numbers an

 An object can be a conventional data object or a lexical environment object. Unlike other data types, object can have references (called properties) to other data types. Because of circular references, reference counting is not always enough to determine dead objects. Hence a chain list is formed from all existing objects, which can be used to find unreferenced objects during garbage collection. The `gc-next` pointer of each object shows the next allocated object in the chain list.

-Lexical environments (link) are implemented as objects in JerryScript, since lexical environments contains key-value pairs (called bindings) like objects. This simplifies the implementation and reduces code size.
+[Lexical environments](http://www.ecma-international.org/ecma-262/5.1/#sec-10.2) are implemented as objects in JerryScript, since lexical environments contains key-value pairs (called bindings) like objects. This simplifies the implementation and reduces code size.

 ![Object/Lexicat environment structures](img/ecma_object.png)

@@ -282,13 +286,13 @@ A property is 7 bit long and its type field is 2 bit long which consumes 9 bit w

 #### Property Hashmap

-If the number of property pairs reach a limit (currently this limit is defined to 16), a hash map (called [Property Hashmap](#Property Hashmap)) is inserted at the first position of the property pair list, in order to find a property using it, instead of finding it by iterating linearly over the property pairs.
+If the number of property pairs reach a limit (currently this limit is defined to 16), a hash map (called [Property Hashmap](#property-hashmap)) is inserted at the first position of the property pair list, in order to find a property using it, instead of finding it by iterating linearly over the property pairs.

 Property hashmap contains 2<sup>n</sup> elements, where 2<sup>n</sup> is larger than the number of properties of the object. Each element can have tree types of value:

 * null, indicating an empty element
 * deleted, indicating a deleted property, or
-* reference to the existing property 
+* reference to the existing property

 This hashmap is a must-return type cache, meaning that every property that the object have, can be found using it.
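
A minimal sketch of such a lookup is shown below. The size rule and the three element states come from the description above. The slot representation, the linear probing strategy, and all names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_NULL    (-1) /* empty element */
#define SLOT_DELETED (-2) /* deleted property */

/* Smallest power of two that is larger than the number of properties. */
static size_t
hashmap_size_for (size_t property_count)
{
  size_t size = 1;

  while (size <= property_count)
  {
    size <<= 1;
  }

  return size;
}

/* Hypothetical lookup. Each slot holds SLOT_NULL, SLOT_DELETED, or an
 * index into names_p. Assumption: collisions use linear probing.
 * Returns the property index, or -1 if the object has no such property. */
static int
hashmap_find (const int *slots_p, size_t size,
              const char *const *names_p, const char *name_p, uint32_t hash)
{
  size_t mask = size - 1;

  for (size_t step = 0; step < size; step++)
  {
    int slot = slots_p[(hash + step) & mask];

    if (slot == SLOT_NULL)
    {
      /* An empty element ends the probe: the property does not exist. */
      return -1;
    }

    if (slot != SLOT_DELETED && strcmp (names_p[slot], name_p) == 0)
    {
      return slot;
    }
  }

  return -1;
}
```

Because the table always has more elements than the object has properties, a probe sequence in this sketch eventually reaches an empty element unless deleted markers have filled every free slot, in which case the loop bound ends the search.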

@@ -308,7 +312,7 @@ LCache is a hashmap for finding a property specified by an object and by a prope

 ![LCache](img/ecma_lcache.png)

-When a property access occurs, a hash value is extracted form the demanded property name and than this hash is used to index the LCache. After that, in the indexed row the specified object and property name will be searched.
+When a property access occurs, a hash value is extracted from the demanded property name and than this hash is used to index the LCache. After that, in the indexed row the specified object and property name will be searched.

 It is important to note, that if the specified property is not found in the LCache, it does not mean that it does not exist (i.e. LCache is a may-return cache). If the property is not found, it will be searched in the property-list of the object, and if it is found there, the property will be placed into the LCache.