as2js: as2js::Lexer Class Reference

#include <lexer.h>

Public Types
  typedef std::shared_ptr< Lexer > pointer_t

Public Member Functions
  Lexer (Input::pointer_t input, Options::pointer_t options)
    Initialize the lexer object.
  Input::pointer_t get_input () const
    Retrieve the input stream pointer.
  Node::pointer_t get_new_node (Node::node_t type)
    Create a new node of the specified type.
  Node::pointer_t get_next_token ()
    Get the next token from the input stream.

Private Types
  typedef std::vector< Input::char_t > char_buffer_t
  typedef int char_type_t

Private Member Functions
  char_type_t char_type (Input::char_t c)
    Determine the type of a character.
  Input::char_t escape_sequence (bool accept_continuation)
    Read characters representing an escape sequence.
  void get_token ()
    Read one token into the f_result_... variables.
  Input::char_t getc ()
    Retrieve the next character of input.
  bool has_option_set (Options::option_t option) const
    Check whether a given option is set.
  Input::char_t read (Input::char_t c, char_type_t flags, String &str)
    Read a set of characters as defined by flags.
  int64_t read_binary (unsigned long max)
    Read a binary number.
  int64_t read_hex (unsigned long max)
    Read a hexadecimal number.
  void read_identifier (Input::char_t c)
    Read an identifier.
  void read_number (Input::char_t c)
    Read one number from the input stream.
  int64_t read_octal (Input::char_t c, unsigned long max)
    Read an octal number.
  void read_string (Input::char_t quote)
    Read one string.
  void ungetc (Input::char_t c)
    Unget a character.

Private Attributes
  char_type_t f_char_type = CHAR_NO_FLAGS
  Input::pointer_t f_input = Input::pointer_t()
  Options::pointer_t f_options = Options::pointer_t()
  Position f_position = Position()
  Float64 f_result_float64 = Float64()
  Int64 f_result_int64 = Int64()
  String f_result_string = String()
  Node::node_t f_result_type = Node::node_t::NODE_UNKNOWN
  char_buffer_t f_unget = char_buffer_t()

Static Private Attributes
  static char_type_t const CHAR_DIGIT = 0x0002
  static char_type_t const CHAR_HEXDIGIT = 0x0020
  static char_type_t const CHAR_INVALID = 0x8000
  static char_type_t const CHAR_LETTER = 0x0001
  static char_type_t const CHAR_LINE_TERMINATOR = 0x0010
  static char_type_t const CHAR_NO_FLAGS = 0x0000
  static char_type_t const CHAR_PUNCTUATION = 0x0004
  static char_type_t const CHAR_WHITE_SPACE = 0x0008

Detailed Description

Definition at line 45 of file lexer.h.

Member Typedef Documentation
typedef std::vector<Input::char_t> as2js::Lexer::char_buffer_t private

Definition at line 59 of file lexer.h.

typedef int as2js::Lexer::char_type_t private

Definition at line 58 of file lexer.h.

typedef std::shared_ptr<Lexer> as2js::Lexer::pointer_t

Definition at line 48 of file lexer.h.

Constructor & Destructor Documentation
as2js::Lexer::Lexer ( Input::pointer_t  input, Options::pointer_t  options  )

The constructor of the Lexer expects a valid pointer to an Input stream.

It optionally accepts an Options pointer. If the pointer is null, all the options are assumed to be set to zero (0), which means that all extensions are turned off.

Parameters
[in] input: The input stream.
[in] options: A set of options, may be null.

Definition at line 695 of file lexer.cpp.

References f_input, and f_options.

Member Function Documentation
Lexer::char_type_t as2js::Lexer::char_type ( Input::char_t  c) private

This function determines the type of a character.

The function first uses a switch because most of the characters used in JavaScript are ASCII characters, which are well defined and can have their type determined in a snap.

Unicode characters make use of a table to convert the character to a type. Unicode characters are viewed either as letters (CHAR_LETTER) or as punctuation (CHAR_PUNCTUATION).

The exceptions are the characters viewed as either line terminators or white space characters. Those are captured by the switch.

Attention
Each character type is a flag that can be used to check whether the character is of a certain category, or of a set of categories all at once (e.g. (CHAR_LETTER | CHAR_DIGIT) means any character which represents a letter or a digit).
Parameters
[in] c: The character of which the type is to be determined.
Returns
The character type (one of the CHAR_...)

Definition at line 883 of file lexer.cpp.

References CHAR_DIGIT, CHAR_HEXDIGIT, CHAR_INVALID, CHAR_LETTER, CHAR_LINE_TERMINATOR, CHAR_PUNCTUATION, CHAR_WHITE_SPACE, as2js::anonymous_namespace{lexer.cpp}::identifier_characters_t::f_min, as2js::anonymous_namespace{lexer.cpp}::g_identifier_characters_size, and as2js::String::STRING_CONTINUATION.

Referenced by getc(), read_identifier(), and read_number().
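
The flag test described in the Attention note of char_type() can be illustrated with a minimal sketch. The flag values are the ones listed for the static private attributes; classify_ascii() and is_any_of() are hypothetical helpers written for this example, and the ASCII classification they perform is an assumption, not the table used by the real char_type().

```cpp
// Flag values as listed for the Lexer's static private attributes.
using char_type_t = int;
constexpr char_type_t CHAR_LETTER          = 0x0001;
constexpr char_type_t CHAR_DIGIT           = 0x0002;
constexpr char_type_t CHAR_PUNCTUATION     = 0x0004;
constexpr char_type_t CHAR_WHITE_SPACE     = 0x0008;
constexpr char_type_t CHAR_LINE_TERMINATOR = 0x0010;
constexpr char_type_t CHAR_HEXDIGIT        = 0x0020;

// Hypothetical, ASCII-only stand-in for Lexer::char_type().
// The exact classification is an assumption made for this sketch.
char_type_t classify_ascii(char32_t c)
{
    if(c >= U'0' && c <= U'9')
    {
        return CHAR_DIGIT | CHAR_HEXDIGIT;
    }
    if((c >= U'a' && c <= U'f') || (c >= U'A' && c <= U'F'))
    {
        return CHAR_LETTER | CHAR_HEXDIGIT;
    }
    if((c >= U'g' && c <= U'z') || (c >= U'G' && c <= U'Z')
    || c == U'_' || c == U'$')
    {
        return CHAR_LETTER;
    }
    if(c == U' ' || c == U'\t')
    {
        return CHAR_WHITE_SPACE;
    }
    if(c == U'\n' || c == U'\r')
    {
        return CHAR_LINE_TERMINATOR;
    }
    return CHAR_PUNCTUATION;
}

// True when the type has at least one of the flags found in mask.
bool is_any_of(char_type_t type, char_type_t mask)
{
    return (type & mask) != 0;
}
```

With these helpers, is_any_of(classify_ascii(c), CHAR_LETTER | CHAR_DIGIT) answers "is c a letter or a digit?" with a single bitwise test.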

Input::char_t as2js::Lexer::escape_sequence ( bool  accept_continuation) private

This function reads the next few characters and transforms them into the one character represented by the escape sequence.

Some characters are extensions and require the extended escape sequences to be turned on in order to be accepted. These are marked as an extension in the list below.

The function supports:

  • \u#### – the 4 digit Unicode character
  • \U######## – the 8 digit Unicode character, this is an extension
  • \x## or \X## – the 2 digit ISO-8859-1 character
  • \' – escape the single quote (') character
  • \" – escape the double quote (") character
  • \\ – escape the backslash (\) character
  • \b – the backspace character
  • \e – the escape character, this is an extension
  • \f – the formfeed character
  • \n – the newline character
  • \r – the carriage return character
  • \t – the tab character
  • \v – the vertical tab character
  • \<newline> or \<#x2028> or \<#x2029> – continuation characters
  • \### – 1 to 3 octal digit ISO-8859-1 character, this is an extension
  • \0 – the NUL character

Any other character generates an error message if appearing after a backslash (\).

Parameters
[in] accept_continuation: Whether the backslash + newline combination is acceptable in this token.
Returns
The escape character if valid, '?' otherwise.

Definition at line 1187 of file lexer.cpp.

References as2js::AS_ERR_UNKNOWN_ESCAPE_SEQUENCE, f_input, getc(), has_option_set(), as2js::MESSAGE_LEVEL_ERROR, as2js::Options::OPTION_EXTENDED_ESCAPE_SEQUENCES, read_hex(), read_octal(), as2js::String::STRING_CONTINUATION, and ungetc().

Referenced by read_identifier(), and read_string().
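
The escape sequences listed for escape_sequence() can be sketched as a small decoder. This is an illustration only: decode_escape() and read_fixed_hex() are hypothetical names, the input is a plain std::u32string instead of the lexer's Input stream, no error message is emitted, and the extensions (\U, \e, octal) and the continuation characters are left out.

```cpp
#include <cstddef>
#include <string>

// Read exactly `count` hexadecimal digits starting at s[pos].
// Returns -1 when a digit is missing; pos is left on the offending character.
long read_fixed_hex(std::u32string const & s, std::size_t & pos, int count)
{
    long value = 0;
    for(int i = 0; i < count; ++i)
    {
        if(pos >= s.size())
        {
            return -1;
        }
        char32_t const c = s[pos];
        int digit = 0;
        if(c >= U'0' && c <= U'9')      digit = static_cast<int>(c - U'0');
        else if(c >= U'a' && c <= U'f') digit = static_cast<int>(c - U'a') + 10;
        else if(c >= U'A' && c <= U'F') digit = static_cast<int>(c - U'A') + 10;
        else return -1;
        value = value * 16 + digit;
        ++pos;
    }
    return value;
}

// Decode one escape sequence. The backslash was already consumed and
// s[pos] is the character that follows it. Invalid sequences yield '?'.
char32_t decode_escape(std::u32string const & s, std::size_t & pos)
{
    if(pos >= s.size())
    {
        return U'?';
    }
    char32_t const c = s[pos++];
    switch(c)
    {
    case U'u':
    {
        long const v = read_fixed_hex(s, pos, 4);
        return v < 0 ? U'?' : static_cast<char32_t>(v);
    }
    case U'x':
    case U'X':
    {
        long const v = read_fixed_hex(s, pos, 2);
        return v < 0 ? U'?' : static_cast<char32_t>(v);
    }
    case U'\'': return U'\'';
    case U'"':  return U'"';
    case U'\\': return U'\\';
    case U'b':  return U'\b';
    case U'f':  return U'\f';
    case U'n':  return U'\n';
    case U'r':  return U'\r';
    case U't':  return U'\t';
    case U'v':  return U'\v';
    case U'0':  return U'\0';
    default:    return U'?';
    }
}
```

Returning '?' for an unknown sequence mirrors the documented return value; the real function additionally reports AS_ERR_UNKNOWN_ESCAPE_SEQUENCE.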

Input::pointer_t as2js::Lexer::get_input ( ) const

This function returns the input stream pointer of the Lexer object.

Returns
The input pointer as specified when creating the Lexer object.

Definition at line 723 of file lexer.cpp.

References f_input.

Node::pointer_t as2js::Lexer::get_new_node ( Node::node_t  type)

This helper function creates a new node at the current position. It is used internally and by the parser when creating nodes to build the input tree, so that each new node gets the correct position according to the current lexer position.

Parameters
[in]typeThe type of the new node.
Returns
A pointer to the new node.

Definition at line 2226 of file lexer.cpp.

References f_position.

Node::pointer_t as2js::Lexer::get_next_token ( )

This function reads one token from the input stream and transforms it into a Node. The Node is automatically assigned the position after the token was read.

Returns
The node representing the next token, or a NODE_EOF if the end of the stream was found.

Definition at line 2244 of file lexer.cpp.

References as2js::AS_ERR_INVALID_NUMBER, CHAR_LETTER, f_char_type, f_input, f_position, f_result_float64, f_result_int64, f_result_string, f_result_type, get_token(), as2js::MESSAGE_LEVEL_ERROR, as2js::Node::NODE_FLOAT64, as2js::Node::NODE_IDENTIFIER, as2js::Node::NODE_INT64, as2js::Node::NODE_REGULAR_EXPRESSION, and as2js::Node::NODE_STRING.

void as2js::Lexer::get_token ( ) private

This function reads one token from the input stream. It reads one character, determines the type of token (identifier, string, number, etc.), and then reads the whole token.

The main purpose of the function is to read characters from the stream and determine what token it represents. It uses many sub-functions to read more complex tokens such as identifiers and numbers.

If the end of the input stream is reached, the function returns with a NODE_EOF. The function can be called any number of times after the end of the input is reached.

Only useful tokens are returned. Comments and white spaces (space, tab, new line, line feed, etc.) are all skipped silently.

The function detects invalid characters. It first emits an error and then ignores them.

This is the function that handles the case of a regular expression written between slashes (/.../). One can also use backward quotes (`...`) for a regular expression to avoid potential confusion with the divide character.

Note
Most extended operators, such as the power operator (**), are silently returned by this function. If the extended operators are not allowed, the parser will emit an error as required. However, a few operators (<> and :=) are returned just like the standard operators (NODE_NOT_EQUAL and NODE_ASSIGNMENT) and thus the error has to be emitted here, and it is.

Definition at line 2327 of file lexer.cpp.

References as2js::AS_ERR_NOT_ALLOWED, as2js::AS_ERR_UNEXPECTED_PUNCTUATION, CHAR_DIGIT, CHAR_INVALID, CHAR_LETTER, CHAR_LINE_TERMINATOR, CHAR_WHITE_SPACE, f_char_type, f_input, f_options, f_position, f_result_float64, f_result_string, f_result_type, getc(), has_option_set(), as2js::MESSAGE_LEVEL_ERROR, as2js::Node::NODE_ADD, as2js::Node::NODE_ASSIGNMENT, as2js::Node::NODE_ASSIGNMENT_ADD, as2js::Node::NODE_ASSIGNMENT_BITWISE_AND, as2js::Node::NODE_ASSIGNMENT_BITWISE_OR, as2js::Node::NODE_ASSIGNMENT_BITWISE_XOR, as2js::Node::NODE_ASSIGNMENT_DIVIDE, as2js::Node::NODE_ASSIGNMENT_LOGICAL_AND, as2js::Node::NODE_ASSIGNMENT_LOGICAL_OR, as2js::Node::NODE_ASSIGNMENT_LOGICAL_XOR, as2js::Node::NODE_ASSIGNMENT_MAXIMUM, as2js::Node::NODE_ASSIGNMENT_MINIMUM, as2js::Node::NODE_ASSIGNMENT_MODULO, as2js::Node::NODE_ASSIGNMENT_MULTIPLY, as2js::Node::NODE_ASSIGNMENT_POWER, as2js::Node::NODE_ASSIGNMENT_ROTATE_LEFT, as2js::Node::NODE_ASSIGNMENT_ROTATE_RIGHT, as2js::Node::NODE_ASSIGNMENT_SHIFT_LEFT, as2js::Node::NODE_ASSIGNMENT_SHIFT_RIGHT, as2js::Node::NODE_ASSIGNMENT_SHIFT_RIGHT_UNSIGNED, as2js::Node::NODE_ASSIGNMENT_SUBTRACT, as2js::Node::NODE_BITWISE_AND, as2js::Node::NODE_BITWISE_NOT, as2js::Node::NODE_BITWISE_OR, as2js::Node::NODE_BITWISE_XOR, as2js::Node::NODE_CLOSE_CURVLY_BRACKET, as2js::Node::NODE_CLOSE_PARENTHESIS, as2js::Node::NODE_CLOSE_SQUARE_BRACKET, as2js::Node::NODE_COLON, as2js::Node::NODE_COMMA, as2js::Node::NODE_COMPARE, as2js::Node::NODE_CONDITIONAL, as2js::Node::NODE_DECREMENT, as2js::Node::NODE_DIVIDE, as2js::Node::NODE_EOF, as2js::Node::NODE_EQUAL, as2js::Node::NODE_FLOAT64, as2js::Node::NODE_GREATER, as2js::Node::NODE_GREATER_EQUAL, as2js::Node::NODE_INCREMENT, as2js::Node::NODE_LESS, as2js::Node::NODE_LESS_EQUAL, as2js::Node::NODE_LOGICAL_AND, as2js::Node::NODE_LOGICAL_NOT, as2js::Node::NODE_LOGICAL_OR, as2js::Node::NODE_LOGICAL_XOR, as2js::Node::NODE_MATCH, as2js::Node::NODE_MAXIMUM, as2js::Node::NODE_MEMBER, as2js::Node::NODE_MINIMUM, as2js::Node::NODE_MODULO, 
as2js::Node::NODE_MULTIPLY, as2js::Node::NODE_NOT_EQUAL, as2js::Node::NODE_NOT_MATCH, as2js::Node::NODE_OPEN_CURVLY_BRACKET, as2js::Node::NODE_OPEN_PARENTHESIS, as2js::Node::NODE_OPEN_SQUARE_BRACKET, as2js::Node::NODE_POWER, as2js::Node::NODE_RANGE, as2js::Node::NODE_REGULAR_EXPRESSION, as2js::Node::NODE_REST, as2js::Node::NODE_ROTATE_LEFT, as2js::Node::NODE_ROTATE_RIGHT, as2js::Node::NODE_SCOPE, as2js::Node::NODE_SEMICOLON, as2js::Node::NODE_SHIFT_LEFT, as2js::Node::NODE_SHIFT_RIGHT, as2js::Node::NODE_SHIFT_RIGHT_UNSIGNED, as2js::Node::NODE_SMART_MATCH, as2js::Node::NODE_STRICTLY_EQUAL, as2js::Node::NODE_STRICTLY_NOT_EQUAL, as2js::Node::NODE_SUBTRACT, as2js::Node::NODE_UNKNOWN, as2js::Options::OPTION_EXTENDED_OPERATORS, read(), read_identifier(), read_number(), read_string(), as2js::Float64::set_infinity(), as2js::Float64::set_NaN(), and ungetc().

Referenced by get_next_token().

Input::char_t as2js::Lexer::getc ( ) private

This function reads one character of input and returns it.

If the character is a newline, linefeed, etc. it affects the current line number, page number, etc. as required. The following characters have such an effect:

  • '\n' – the newline character adds a new line
  • '\r' – the carriage return character adds a new line; if followed by a '\n', remove it too; always return '\n' and not '\r'
  • '\f' – the formfeed adds a new page
  • LINE SEPARATOR (0x2028) – add a new line
  • PARAGRAPH SEPARATOR (0x2029) – add a new paragraph

If the ungetc() function was called before a call to getc(), then that last character is returned instead of a new character from the input stream. In that case, the character has no effect on the line number, page number, etc.

Returns
The next Unicode character.

Definition at line 753 of file lexer.cpp.

References CHAR_LINE_TERMINATOR, char_type(), CHAR_WHITE_SPACE, f_char_type, f_input, f_unget, and ungetc().

Referenced by escape_sequence(), get_token(), read(), read_binary(), read_hex(), read_identifier(), read_number(), read_octal(), and read_string().
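
The line terminator handling listed for getc() can be sketched as follows. This is a minimal illustration under stated assumptions: char_reader, counters_t, and END_OF_INPUT are hypothetical names, the input is a std::u32string instead of the lexer's Input stream, the counters are plain integers that are only incremented, and the unget buffer is left out.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical end of input marker used by this sketch only.
constexpr char32_t END_OF_INPUT = 0xFFFFFFFF;

// Hypothetical counters standing in for the lexer's position information.
struct counters_t
{
    int line = 1;
    int page = 1;
    int paragraph = 1;
};

class char_reader
{
public:
    explicit char_reader(std::u32string input)
        : f_input(std::move(input))
    {
    }

    // Return the next character and update the counters as required.
    char32_t getc()
    {
        if(f_pos >= f_input.size())
        {
            return END_OF_INPUT;
        }
        char32_t c = f_input[f_pos++];
        switch(c)
        {
        case U'\r':
            // skip the '\n' of a "\r\n" pair and always report '\n'
            if(f_pos < f_input.size() && f_input[f_pos] == U'\n')
            {
                ++f_pos;
            }
            c = U'\n';
            ++f_counters.line;
            break;

        case U'\n':
        case 0x2028: // LINE SEPARATOR
            ++f_counters.line;
            break;

        case U'\f':
            ++f_counters.page;
            break;

        case 0x2029: // PARAGRAPH SEPARATOR
            ++f_counters.paragraph;
            break;

        default:
            break;
        }
        return c;
    }

    counters_t const & counters() const
    {
        return f_counters;
    }

private:
    std::u32string f_input;
    std::size_t f_pos = 0;
    counters_t f_counters;
};
```

Normalizing "\r\n" and a lone '\r' to '\n' at this level means the rest of the lexer only ever needs to test for one newline character.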

bool as2js::Lexer::has_option_set ( Options::option_t  option) const private

Because the lexer checks options in many places, it makes use of this helper function to simplify the many tests in the rest of the code.

This function checks whether the specified option is set. If so, then it returns true, otherwise it returns false.

Note
Some options may be set to values other than 0 and 1. In that case this function cannot be used. Right now, this function returns true if the option is set, meaning that the option value is not zero. For example, the OPTION_EXTENDED_OPERATORS option may be set to 0, 1, 2, or 3.
Parameters
[in]optionThe option to check.
Returns
true if the option was set, false otherwise.

Definition at line 2964 of file lexer.cpp.

References f_options.

Referenced by escape_sequence(), get_token(), and read_number().
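
The "set means not zero" rule from the note of has_option_set() can be sketched like this. The options class below is a hypothetical stand-in written for the example (its set_option() and get_option() members are assumptions, not the documented Options interface), and treating a null pointer as "nothing set" follows the constructor description rather than the actual implementation.

```cpp
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical subset of the option identifiers.
enum class option_t
{
    OPTION_BINARY,
    OPTION_EXTENDED_OPERATORS,
    OPTION_OCTAL
};

// Hypothetical stand-in for the Options class; unset options read as 0.
class options
{
public:
    using pointer_t = std::shared_ptr<options>;

    void set_option(option_t option, std::int64_t value)
    {
        f_values[option] = value;
    }

    std::int64_t get_option(option_t option) const
    {
        auto const it = f_values.find(option);
        return it == f_values.end() ? 0 : it->second;
    }

private:
    std::map<option_t, std::int64_t> f_values;
};

// An option counts as set when its value is not zero; a null options
// pointer is handled as "all options are zero".
bool has_option_set(options::pointer_t const & opts, option_t option)
{
    return opts != nullptr && opts->get_option(option) != 0;
}
```

An option holding 2 or 3 is reported as set exactly like an option holding 1, which is why the note warns against using this helper when the specific value matters.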

Input::char_t as2js::Lexer::read ( Input::char_t  c, char_type_t  flags, String &  str  ) private

This function reads characters for as long as their type matches the specified flags. The result is saved in the str parameter.

At the time the function is called, c is expected to be the first character to be added to str.

The first character that does not satisfy the flags is pushed back in the input stream so one can call getc() again to retrieve it.

Parameters
[in] c: The character that prompted this call and which ends up first in str.
[in] flags: The flags that each character type must match, including the type of c.
[in,out] str: The resulting string. It is expected to be empty on call but does not need to be (it does not get cleared).
Returns
The next character, although it was also ungotten.

Definition at line 1309 of file lexer.cpp.

References CHAR_INVALID, f_char_type, getc(), and ungetc().

Referenced by get_token(), and read_number().
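
The behavior of read() (append c, keep reading while the character type matches, push back the first character that does not) can be sketched as follows. mini_scanner and its members are hypothetical and reduced to the strict minimum; the character classification is a simplified assumption and the CHAR_INVALID handling of the real function is not reproduced.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using char_type_t = int;
constexpr char_type_t CHAR_LETTER      = 0x0001;
constexpr char_type_t CHAR_DIGIT       = 0x0002;
constexpr char_type_t CHAR_PUNCTUATION = 0x0004;

// Hypothetical end of input marker used by this sketch only.
constexpr char32_t END_OF_INPUT = 0xFFFFFFFF;

class mini_scanner
{
public:
    explicit mini_scanner(std::u32string input)
        : f_input(std::move(input))
    {
    }

    // Simplified classification (an assumption made for this sketch).
    static char_type_t type_of(char32_t c)
    {
        if((c >= U'a' && c <= U'z') || (c >= U'A' && c <= U'Z') || c == U'_')
        {
            return CHAR_LETTER;
        }
        if(c >= U'0' && c <= U'9')
        {
            return CHAR_DIGIT;
        }
        return CHAR_PUNCTUATION;
    }

    // Ungotten characters are returned first, then the input is read.
    char32_t getc()
    {
        if(!f_unget.empty())
        {
            char32_t const c = f_unget.back();
            f_unget.pop_back();
            return c;
        }
        return f_pos < f_input.size() ? f_input[f_pos++] : END_OF_INPUT;
    }

    void ungetc(char32_t c)
    {
        f_unget.push_back(c);
    }

    // c is appended to str first; reading continues while the type of the
    // next character matches flags. The first non-matching character is
    // pushed back and also returned.
    char32_t read(char32_t c, char_type_t flags, std::u32string & str)
    {
        do
        {
            str += c;
            c = getc();
        }
        while(c != END_OF_INPUT && (type_of(c) & flags) != 0);
        ungetc(c);
        return c;
    }

private:
    std::u32string f_input;
    std::size_t f_pos = 0;
    std::vector<char32_t> f_unget;
};
```

Because the terminating character is ungotten, the caller can inspect the returned value and still obtain the same character from the next getc() call.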

int64_t as2js::Lexer::read_binary ( unsigned long  max) private

This function reads 0's and 1's up until another character is found or max digits were read. That other character is ungotten so the next call to getc() will return that non-binary character.

Since the function is called without an introducing digit, the number could end up being empty. If that happens, an error is generated and the function returns -1 (although -1 is a valid number assuming you accept all 64 bits.)

Parameters
[in] max: The maximum number of digits to read.
Returns
The number just read as an integer (64 bit).

Definition at line 1098 of file lexer.cpp.

References as2js::AS_ERR_INVALID_NUMBER, f_input, getc(), as2js::MESSAGE_LEVEL_ERROR, and ungetc().

Referenced by read_number().
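
The digit loop described for read_binary() can be sketched as a free function. This is an assumption-laden illustration: it reads from a std::string with an explicit position instead of calling getc() and ungetc(), and it returns -1 on an empty number without emitting the AS_ERR_INVALID_NUMBER message.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Read up to `max` binary digits from s starting at pos.
// pos is left on the first character that was not consumed.
// Returns -1 when no digit was found at all.
std::int64_t read_binary(std::string const & s, std::size_t & pos, unsigned long max)
{
    // accumulate as unsigned so that 64 digits cannot overflow
    std::uint64_t result = 0;
    unsigned long count = 0;
    while(count < max
       && pos < s.size()
       && (s[pos] == '0' || s[pos] == '1'))
    {
        result = result * 2 + static_cast<std::uint64_t>(s[pos] - '0');
        ++pos;
        ++count;
    }
    if(count == 0)
    {
        // empty number, the real function also generates an error here
        return -1;
    }
    return static_cast<std::int64_t>(result);
}
```

The same loop shape applies to the hexadecimal and octal readers, with a different digit test and a different base.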

int64_t as2js::Lexer::read_hex ( unsigned long  max) private

This function reads hexadecimal digits up until another character is found or max digits were read. That other character is ungotten so the next call to getc() will return that non-hexadecimal character.

Since the function is called without an introducing digit, the number could end up being empty. If that happens, an error is generated and the function returns -1 (although -1 is a valid number assuming you accept all 64 bits.)

Parameters
[in] max: The maximum number of digits to read.
Returns
The number just read as an integer (64 bit).

Definition at line 1044 of file lexer.cpp.

References as2js::AS_ERR_INVALID_NUMBER, CHAR_HEXDIGIT, f_char_type, f_input, getc(), as2js::MESSAGE_LEVEL_ERROR, and ungetc().

Referenced by escape_sequence(), and read_number().

void as2js::Lexer::read_identifier ( Input::char_t  c) private

This function reads an identifier and checks whether that identifier is a keyword.

The list of reserved keywords as defined in ECMAScript is shown below. Note that it includes all versions (1 through 5). We mark all of these identifiers as keywords and we are NOT flexible at all with those (i.e. JavaScript allows for keywords to be used as object field names as in 'myObj.break = 123;' and we do not).

  • abstract
  • boolean
  • break
  • byte
  • case
  • catch
  • char
  • class
  • const
  • continue
  • debugger
  • default
  • delete
  • do
  • double
  • else
  • enum
  • export
  • extends
  • false
  • final
  • finally
  • float
  • for
  • function
  • goto
  • if
  • implements
  • import
  • in
  • int
  • instanceof
  • interface
  • let
  • long
  • native
  • new
  • null
  • package
  • private
  • protected
  • public
  • return
  • short
  • static
  • super
  • switch
  • synchronized
  • this
  • throw
  • throws
  • transient
  • true
  • try
  • typeof
  • var
  • void
  • volatile
  • while
  • with
  • yield

The function sets the f_result_type and f_result_string as required.

We also understand additional keywords as defined here:

  • as – from ActionScript, to do a cast
  • is – from ActionScript, to check a value type
  • namespace – to encompass many declarations in a namespace
  • use – to avoid having to declare certain namespaces, declare number types, change pragma (options) value

We also support the special names:

  • Infinity, which is supposed to be a global variable
  • NaN, which is supposed to be a global variable
  • undefined, which is supposed to never be defined
  • __FILE__, which gets transformed to the filename of the input stream
  • __LINE__, which gets transformed to the current line number
Parameters
[in] c: The current character representing the first identifier character.

Definition at line 1423 of file lexer.cpp.

References CHAR_DIGIT, CHAR_INVALID, CHAR_LETTER, char_type(), escape_sequence(), f_char_type, f_input, f_result_float64, f_result_int64, f_result_string, f_result_type, getc(), as2js::Node::NODE_ABSTRACT, as2js::Node::NODE_AS, as2js::Node::NODE_BOOLEAN, as2js::Node::NODE_BREAK, as2js::Node::NODE_BYTE, as2js::Node::NODE_CASE, as2js::Node::NODE_CATCH, as2js::Node::NODE_CHAR, as2js::Node::NODE_CLASS, as2js::Node::NODE_CONST, as2js::Node::NODE_CONTINUE, as2js::Node::NODE_DEBUGGER, as2js::Node::NODE_DEFAULT, as2js::Node::NODE_DELETE, as2js::Node::NODE_DO, as2js::Node::NODE_DOUBLE, as2js::Node::NODE_ELSE, as2js::Node::NODE_ENSURE, as2js::Node::NODE_ENUM, as2js::Node::NODE_EXPORT, as2js::Node::NODE_EXTENDS, as2js::Node::NODE_FALSE, as2js::Node::NODE_FINAL, as2js::Node::NODE_FINALLY, as2js::Node::NODE_FLOAT, as2js::Node::NODE_FLOAT64, as2js::Node::NODE_FOR, as2js::Node::NODE_FUNCTION, as2js::Node::NODE_GOTO, as2js::Node::NODE_IDENTIFIER, as2js::Node::NODE_IF, as2js::Node::NODE_IMPLEMENTS, as2js::Node::NODE_IMPORT, as2js::Node::NODE_IN, as2js::Node::NODE_INLINE, as2js::Node::NODE_INSTANCEOF, as2js::Node::NODE_INT64, as2js::Node::NODE_INTERFACE, as2js::Node::NODE_INVARIANT, as2js::Node::NODE_IS, as2js::Node::NODE_LONG, as2js::Node::NODE_NAMESPACE, as2js::Node::NODE_NATIVE, as2js::Node::NODE_NEW, as2js::Node::NODE_NULL, as2js::Node::NODE_PACKAGE, as2js::Node::NODE_PRIVATE, as2js::Node::NODE_PROTECTED, as2js::Node::NODE_PUBLIC, as2js::Node::NODE_REQUIRE, as2js::Node::NODE_RETURN, as2js::Node::NODE_SHORT, as2js::Node::NODE_STATIC, as2js::Node::NODE_STRING, as2js::Node::NODE_SUPER, as2js::Node::NODE_SWITCH, as2js::Node::NODE_SYNCHRONIZED, as2js::Node::NODE_THEN, as2js::Node::NODE_THIS, as2js::Node::NODE_THROW, as2js::Node::NODE_THROWS, as2js::Node::NODE_TRANSIENT, as2js::Node::NODE_TRUE, as2js::Node::NODE_TRY, as2js::Node::NODE_TYPEOF, as2js::Node::NODE_UNDEFINED, as2js::Node::NODE_UNKNOWN, as2js::Node::NODE_USE, as2js::Node::NODE_VAR, as2js::Node::NODE_VOID, 
as2js::Node::NODE_VOLATILE, as2js::Node::NODE_WHILE, as2js::Node::NODE_WITH, as2js::Node::NODE_YIELD, as2js::Float64::set_infinity(), as2js::Float64::set_NaN(), and ungetc().

Referenced by get_token().
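
The keyword check performed by read_identifier() can be sketched with a lookup table. This is only an illustration: identifier_type() is a hypothetical helper, the node_t enumeration is reduced to a handful of entries, and the real function may well use a different lookup strategy.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical subset of the node types returned for identifiers.
enum class node_t
{
    NODE_IDENTIFIER,
    NODE_AS,
    NODE_BREAK,
    NODE_IF,
    NODE_TRUE,
    NODE_USE
};

// Return the keyword type of name, or NODE_IDENTIFIER when name is not
// a keyword. The comparison is case sensitive.
node_t identifier_type(std::string const & name)
{
    static std::unordered_map<std::string, node_t> const keywords =
    {
        { "as",    node_t::NODE_AS    },
        { "break", node_t::NODE_BREAK },
        { "if",    node_t::NODE_IF    },
        { "true",  node_t::NODE_TRUE  },
        { "use",   node_t::NODE_USE   }
    };

    auto const it = keywords.find(name);
    return it == keywords.end() ? node_t::NODE_IDENTIFIER : it->second;
}
```

Since keywords are never accepted as plain identifiers here, a name such as "break" always maps to its keyword type, whatever the context.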

void as2js::Lexer::read_number ( Input::char_t  c) private

This function is called whenever a digit is found in the input stream. It may also be called if a period was read (the rules are a little more complicated for the period.)

The function checks the following character, if it is:

  • 'x' or 'X' – it reads a hexadecimal number, see read_hex()
  • 'b' or 'B' – it reads a binary number, see read_binary()
  • '0' – if the number starts with a zero, it reads an octal, see read_octal()
  • '.' – it reads a floating point number
  • otherwise it reads an integer, although if the integer is followed by '.', 'e', or 'E', it ends up reading the number as a floating point

The result is directly saved in the necessary f_result_... variables.

Parameters
[in] c: The digit or period that triggered this call.

Definition at line 1987 of file lexer.cpp.

References CHAR_DIGIT, char_type(), f_char_type, f_result_float64, f_result_int64, f_result_type, getc(), has_option_set(), as2js::Node::NODE_FLOAT64, as2js::Node::NODE_INT64, as2js::Options::OPTION_BINARY, as2js::Options::OPTION_OCTAL, read(), read_binary(), read_hex(), read_octal(), as2js::String::to_float64(), as2js::String::to_utf8(), and ungetc().

Referenced by get_token().
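
The dispatch described for read_number() can be sketched as a classifier working on the text of a literal. This is a simplification under stated assumptions: classify_number() and number_kind are hypothetical names, the whole literal is available as a std::string, and the OPTION_BINARY and OPTION_OCTAL checks referenced by the real function are ignored.

```cpp
#include <cstddef>
#include <string>

enum class number_kind
{
    HEXADECIMAL,
    BINARY,
    OCTAL,
    INTEGER,
    FLOATING_POINT
};

// Decide how a literal starting with a digit or a period would be read.
number_kind classify_number(std::string const & text)
{
    if(text.size() >= 2 && text[0] == '0')
    {
        char const next = text[1];
        if(next == 'x' || next == 'X')
        {
            return number_kind::HEXADECIMAL;
        }
        if(next == 'b' || next == 'B')
        {
            return number_kind::BINARY;
        }
        if(next >= '0' && next <= '7')
        {
            return number_kind::OCTAL;
        }
    }

    // skip the leading decimal digits, if any
    std::size_t i = 0;
    while(i < text.size() && text[i] >= '0' && text[i] <= '9')
    {
        ++i;
    }

    // a period or an exponent turns the number into a floating point
    if(i < text.size() && (text[i] == '.' || text[i] == 'e' || text[i] == 'E'))
    {
        return number_kind::FLOATING_POINT;
    }
    return number_kind::INTEGER;
}
```

The prefix tests come first because "0x" and "0b" would otherwise be read as the integer 0 followed by an identifier.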

int64_t as2js::Lexer::read_octal ( Input::char_t  c, unsigned long  max  ) private

This function reads octal digits up until a character other than a valid octal digit is found or max digits were read. That character is ungotten so the next call to getc() will return that non-octal character.

Parameters
[in] c: The character that triggered a call to read_octal().
[in] max: The maximum number of digits to read.
Returns
The number just read as an integer (64 bit).

Definition at line 1135 of file lexer.cpp.

References getc(), and ungetc().

Referenced by escape_sequence(), and read_number().

void as2js::Lexer::read_string ( Input::char_t  quote) private

This function reads one string from the input stream.

The function expects quote as an input parameter representing the opening quote. It will read the input stream up to the next line terminator (unless escaped) or the closing quote.

Note that we support backquoted "strings" which actually represent regular expressions. These cannot be continued on the following line.

This function sets the result type to NODE_STRING. It is changed by the caller when a regular expression was found instead.

Parameters
[in] quote: The opening quote, which will match the closing quote.

Definition at line 2181 of file lexer.cpp.

References as2js::AS_ERR_UNTERMINATED_STRING, CHAR_LINE_TERMINATOR, escape_sequence(), f_char_type, f_input, f_result_string, f_result_type, getc(), as2js::MESSAGE_LEVEL_ERROR, as2js::Node::NODE_STRING, and as2js::String::STRING_CONTINUATION.

Referenced by get_token().
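
The termination rules of read_string() (stop at the closing quote, fail on a line terminator) can be sketched as follows. read_quoted() is a hypothetical helper working on a std::string; as a loudly stated simplification, a backslash copies the next character verbatim where the real function calls escape_sequence(), and no AS_ERR_UNTERMINATED_STRING message is emitted.

```cpp
#include <cstddef>
#include <string>

// Read the content of a string whose opening quote was already consumed.
// Returns true when the closing quote was found, false when a newline or
// the end of the input was reached first (unterminated string).
bool read_quoted(std::string const & s, std::size_t & pos, char quote, std::string & out)
{
    while(pos < s.size())
    {
        char const c = s[pos++];
        if(c == quote)
        {
            return true;
        }
        if(c == '\n')
        {
            return false;
        }
        if(c == '\\' && pos < s.size())
        {
            // simplification: keep the escaped character as is
            out += s[pos++];
            continue;
        }
        out += c;
    }
    return false;
}
```

Passing the opening quote as a parameter lets the same loop serve single quoted, double quoted, and backquoted input.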

void as2js::Lexer::ungetc ( Input::char_t  c) private

When reading a token, the end of the token is most often discovered by reading one character too many. This function is used to push that character back in the input stream.

Although the stream implementation also includes an unget, we do not use that unget. The reason is that the getc() function needs to know whether the character is a brand new character from that input stream or the last ungotten character. The difference is important to know whether the character has to have an effect on the line number, page number, etc.

The getc() function first returns the last character sent via ungetc() (i.e. LIFO).

Parameters
[in] c: The input character to "push back in the stream".

Definition at line 843 of file lexer.cpp.

References f_unget.

Referenced by escape_sequence(), get_token(), getc(), read(), read_binary(), read_hex(), read_identifier(), read_number(), and read_octal().
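
The LIFO order mentioned for ungetc() can be sketched with a vector used as a stack, which matches the char_buffer_t typedef (a std::vector of characters). The unget_buffer class and its pop() and empty() members are hypothetical names introduced for this example only.

```cpp
#include <vector>

// Minimal unget buffer: the last character pushed is the first one returned.
class unget_buffer
{
public:
    void ungetc(char32_t c)
    {
        f_unget.push_back(c);
    }

    bool empty() const
    {
        return f_unget.empty();
    }

    // Precondition: the buffer is not empty.
    char32_t pop()
    {
        char32_t const c = f_unget.back();
        f_unget.pop_back();
        return c;
    }

private:
    std::vector<char32_t> f_unget;
};
```

A reader built on top of this would test empty() first and only fall back to the input stream, and to the line counting, when nothing is left to pop.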

Member Data Documentation
char_type_t const as2js::Lexer::CHAR_DIGIT = 0x0002 static private

Definition at line 63 of file lexer.h.

Referenced by char_type(), get_token(), read_identifier(), and read_number().

char_type_t const as2js::Lexer::CHAR_HEXDIGIT = 0x0020 static private

Definition at line 67 of file lexer.h.

Referenced by char_type(), and read_hex().

char_type_t const as2js::Lexer::CHAR_INVALID = 0x8000 static private

Definition at line 68 of file lexer.h.

Referenced by char_type(), get_token(), read(), and read_identifier().

char_type_t const as2js::Lexer::CHAR_LETTER = 0x0001 static private

Definition at line 62 of file lexer.h.

Referenced by char_type(), get_next_token(), get_token(), and read_identifier().

char_type_t const as2js::Lexer::CHAR_LINE_TERMINATOR = 0x0010 static private

Definition at line 66 of file lexer.h.

Referenced by char_type(), get_token(), getc(), and read_string().

char_type_t const as2js::Lexer::CHAR_NO_FLAGS = 0x0000 static private

Definition at line 61 of file lexer.h.

char_type_t const as2js::Lexer::CHAR_PUNCTUATION = 0x0004 static private

Definition at line 64 of file lexer.h.

Referenced by char_type().

char_type_t const as2js::Lexer::CHAR_WHITE_SPACE = 0x0008 static private

Definition at line 65 of file lexer.h.

Referenced by char_type(), get_token(), and getc().

char_type_t as2js::Lexer::f_char_type = CHAR_NO_FLAGS private
Options::pointer_t as2js::Lexer::f_options = Options::pointer_t() private

Definition at line 86 of file lexer.h.

Referenced by get_token(), has_option_set(), and Lexer().

Position as2js::Lexer::f_position = Position() private

Definition at line 88 of file lexer.h.

Referenced by get_new_node(), get_next_token(), and get_token().

Float64 as2js::Lexer::f_result_float64 = Float64() private

Definition at line 93 of file lexer.h.

Referenced by get_next_token(), get_token(), read_identifier(), and read_number().

Int64 as2js::Lexer::f_result_int64 = Int64() private

Definition at line 92 of file lexer.h.

Referenced by get_next_token(), read_identifier(), and read_number().

String as2js::Lexer::f_result_string = String() private

Definition at line 91 of file lexer.h.

Referenced by get_next_token(), get_token(), read_identifier(), and read_string().

Node::node_t as2js::Lexer::f_result_type = Node::node_t::NODE_UNKNOWN private

Definition at line 90 of file lexer.h.

Referenced by get_next_token(), get_token(), read_identifier(), read_number(), and read_string().

char_buffer_t as2js::Lexer::f_unget = char_buffer_t() private

Definition at line 84 of file lexer.h.

Referenced by getc(), and ungetc().


The documentation for this class was generated from the following files:
  • /home/snapwebsites/snapcpp/contrib/as2js/include/as2js/lexer.h
  • /home/snapwebsites/snapcpp/contrib/as2js/lib/lexer.cpp

This document is part of the Snap! Websites Project.

Copyright by Made to Order Software Corp.
